Commit Graph

174 Commits

Author SHA1 Message Date
Scott Wood f1e89028f0 kvm/ppc/booke: Hold srcu lock when calling gfn functions
KVM core expects arch code to acquire the srcu lock when calling
gfn_to_memslot and similar functions.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-11 11:10:59 +03:00
Scott Wood eb1e4f43e0 kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC
Enabling this capability connects the vcpu to the designated in-kernel
MPIC.  Using explicit connections between vcpus and irqchips allows
for flexibility, but the main benefit at the moment is that it
simplifies the code -- KVM doesn't need vm-global state to remember
which MPIC object is associated with this vm, and it doesn't need to
care about ordering between irqchip creation and vcpu creation.

Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub functions for kvmppc_mpic_{dis,}connect_vcpu]
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:24 +02:00
Scott Wood 5df554ad5b kvm/ppc/mpic: in-kernel MPIC emulation
Hook the MPIC code up to the KVM interfaces, add locking, etc.

Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub function for kvmppc_mpic_set_epr, non-booke, 64bit]
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:23 +02:00
Mihai Caraman 35b299e279 KVM: PPC: Book3E: Refactor ONE_REG ioctl implementation
Refactor Book3E ONE_REG ioctl implementation to use kvmppc_get_one_reg/
kvmppc_set_one_reg delegation interface introduced by Book3S. This is
necessary for MMU SPRs which are platform specifics.

Get rid of useless case braces in the process.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:05 +02:00
Bharat Bhushan 9b4f530807 booke: exit to user space if emulator request
This allows the exit to user space if emulator request by returning
EMULATE_EXIT_USER. This will be used in subsequent patches in list

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:04 +02:00
Bharat Bhushan 092d62ee93 KVM: PPC: debug stub interface parameter defined
This patch defines the interface parameter for KVM_SET_GUEST_DEBUG
ioctl support. Follow up patches will use this for setting up
hardware breakpoints, watchpoints and software breakpoints.

Also kvm_arch_vcpu_ioctl_set_guest_debug() is brought one level below.
This is because I am not sure what is required for book3s. So this ioctl
behaviour will not change for book3s.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:02 +02:00
Bharat Bhushan 8c32a2ea65 Added ONE_REG interface for debug instruction
This patch adds the one_reg interface to get the special instruction
to be used for setting software breakpoint from userspace.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-17 15:21:14 +02:00
Paul Mackerras 4fe27d2add KVM: PPC: Remove unused argument to kvmppc_core_dequeue_external
Currently kvmppc_core_dequeue_external() takes a struct kvm_interrupt *
argument and does nothing with it, in any of its implementations.
This removes it in order to make things easier for forthcoming
in-kernel interrupt controller emulation code.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-03-22 01:21:17 +01:00
Bharat Bhushan 78accda4f8 KVM: PPC: Added one_reg interface for timer registers
If userspace wants to change some specific bits of TSR
(timer status register) then it uses GET/SET_SREGS ioctl interface.
So the steps will be:
      i)   user-space will make get ioctl,
      ii)  change TSR in userspace
      iii) then make set ioctl.
It can happen that TSR gets changed by kernel after step i) and
before step iii).

To avoid this we have added below one_reg ioctls for oring and clearing
specific bits in TSR. This patch adds one registerface for:
     1) setting specific bit in TSR (timer status register)
     2) clearing specific bit in TSR (timer status register)
     3) setting/getting the TCR register. There are cases where we want to only
        change TCR and not TSR. Although we can uses SREGS without
        KVM_SREGS_E_UPDATE_TSR flag but I think one reg is better. I am open
        if someone feels we should use SREGS only here.
     4) getting/setting TSR register

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-03-22 01:21:06 +01:00
Bharat Bhushan d26f22c9cd KVM: PPC: move tsr update in a separate function
This is done so that same function can be called from SREGS and
ONE_REG interface (follow up patch).

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-03-22 01:21:05 +01:00
Takuya Yoshikawa 8482644aea KVM: set_memory_region: Refactor commit_memory_region()
This patch makes the parameter old a const pointer to the old memory
slot and adds a new parameter named change to know the change being
requested: the former is for removing extra copying and the latter is
for cleaning up the code.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-04 20:21:08 -03:00
Alexander Graf 011da89962 KVM: PPC: BookE: Handle alignment interrupts
When the guest triggers an alignment interrupt, we don't handle it properly
today and instead BUG_ON(). This really shouldn't happen.

Instead, we should just pass the interrupt back into the guest so it can deal
with it.

Reported-by: Gao Guanhua-B22826 <B22826@freescale.com>
Tested-by: Gao Guanhua-B22826 <B22826@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-02-13 12:56:45 +01:00
Bharat Bhushan 1d542d9c2b KVM: PPC: booke: Allow multiple exception types
Current kvmppc_booke_handlers uses the same macro (KVM_HANDLER) and
all handlers are considered to be the same size. This will not be
the case if we want to use different macros for different handlers.

This patch improves the kvmppc_booke_handler so that it can
support different macros for different handlers.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
[bharat.bhushan@freescale.com: Substantial changes]
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-02-13 12:56:40 +01:00
Alexander Graf 324b3e6316 KVM: PPC: BookE: Add EPR ONE_REG sync
We need to be able to read and write the contents of the EPR register
from user space.

This patch implements that logic through the ONE_REG API and declares
its (never implemented) SREGS counterpart as deprecated.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-10 13:42:33 +01:00
Alexander Graf 1c81063655 KVM: PPC: BookE: Implement EPR exit
The External Proxy Facility in FSL BookE chips allows the interrupt
controller to automatically acknowledge an interrupt as soon as a
core gets its pending external interrupt delivered.

Today, user space implements the interrupt controller, so we need to
check on it during such a cycle.

This patch implements logic for user space to enable EPR exiting,
disable EPR exiting and EPR exiting itself, so that user space can
acknowledge an interrupt when an external interrupt has successfully
been delivered into the guest vcpu.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-10 13:42:31 +01:00
Alexander Graf b8c649a99d KVM: PPC: BookE: Allow irq deliveries to inject requests
When injecting an interrupt into guest context, we usually don't need
to check for requests anymore. At least not until today.

With the introduction of EPR, we will have to create a request when the
guest has successfully accepted an external interrupt though.

So we need to prepare the interrupt delivery to abort guest entry
gracefully. Otherwise we'd delay the EPR request.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-10 13:42:21 +01:00
Mihai Caraman 352df1deb2 KVM: PPC: booke: Get/set guest EPCR register using ONE_REG interface
Implement ONE_REG interface for EPCR register adding KVM_REG_PPC_EPCR to
the list of ONE_REG PPC supported registers.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
[agraf: remove HV dependency, use get/put_user]
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-12-06 01:34:20 +01:00
Mihai Caraman 38f988240c KVM: PPC: bookehv: Add EPCR support in mtspr/mfspr emulation
Add EPCR support in booke mtspr/mfspr emulation. EPCR register is defined only
for 64-bit and HV categories, we will expose it at this point only to 64-bit
virtual processors running on 64-bit HV hosts.
Define a reusable setter function for vcpu's EPCR.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
[agraf: move HV dependency in the code]
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-12-06 01:34:19 +01:00
Mihai Caraman 95e90b43c9 KVM: PPC: bookehv: Add guest computation mode for irq delivery
When delivering guest IRQs, update MSR computation mode according to guest
interrupt computation mode found in EPCR.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
[agraf: remove HV dependency in the code]
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-12-06 01:34:18 +01:00
Mihai Caraman b50df19ccc KVM: PPC: booke: Fix get_tb() compile error on 64-bit
Include header file for get_tb() declaration.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-12-06 01:34:09 +01:00
Scott Wood 5bd1cf1185 KVM: PPC: set IN_GUEST_MODE before checking requests
Avoid a race as described in the code comment.

Also remove a related smp_wmb() from booke's kvmppc_prepare_to_enter().
I can't see any reason for it, and the book3s_pr version doesn't have it.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:54 +02:00
Paul Mackerras a47d72f361 KVM: PPC: Book3S HV: Fix updates of vcpu->cpu
This removes the powerpc "generic" updates of vcpu->cpu in load and
put, and moves them to the various backends.

The reason is that "HV" KVM does its own sauce with that field
and the generic updates might corrupt it. The field contains the
CPU# of the -first- HW CPU of the core always for all the VCPU
threads of a core (the one that's online from a host Linux
perspective).

However, the preempt notifiers are going to be called on the
threads VCPUs when they are running (due to them sleeping on our
private waitqueue) causing unload to be called, potentially
clobbering the value.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:52 +02:00
Paul Mackerras dfe49dbd1f KVM: PPC: Book3S HV: Handle memory slot deletion and modification correctly
This adds an implementation of kvm_arch_flush_shadow_memslot for
Book3S HV, and arranges for kvmppc_core_commit_memory_region to
flush the dirty log when modifying an existing slot.  With this,
we can handle deletion and modification of memory slots.

kvm_arch_flush_shadow_memslot calls kvmppc_core_flush_memslot, which
on Book3S HV now traverses the reverse map chains to remove any HPT
(hashed page table) entries referring to pages in the memslot.  This
gets called by generic code whenever deleting a memslot or changing
the guest physical address for a memslot.

We flush the dirty log in kvmppc_core_commit_memory_region for
consistency with what x86 does.  We only need to flush when an
existing memslot is being modified, because for a new memslot the
rmap array (which stores the dirty bits) is all zero, meaning that
every page is considered clean already, and when deleting a memslot
we obviously don't care about the dirty bits any more.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:51 +02:00
Paul Mackerras a66b48c3a3 KVM: PPC: Move kvm->arch.slot_phys into memslot.arch
Now that we have an architecture-specific field in the kvm_memory_slot
structure, we can use it to store the array of page physical addresses
that we need for Book3S HV KVM on PPC970 processors.  This reduces the
size of struct kvm_arch for Book3S HV, and also reduces the size of
struct kvm_arch_memory_slot for other PPC KVM variants since the fields
in it are now only compiled in for Book3S HV.

This necessitates making the kvm_arch_create_memslot and
kvm_arch_free_memslot operations specific to each PPC KVM variant.
That in turn means that we now don't allocate the rmap arrays on
Book3S PR and Book E.

Since we now unpin pages and free the slot_phys array in
kvmppc_core_free_memslot, we no longer need to do it in
kvmppc_core_destroy_vm, since the generic code takes care to free
all the memslots when destroying a VM.

We now need the new memslot to be passed in to
kvmppc_core_prepare_memory_region, since we need to initialize its
arch.slot_phys member on Book3S HV.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:51 +02:00
Alexander Graf 7a08c2740f KVM: PPC: BookE: Support FPU on non-hv systems
When running on HV aware hosts, we can not trap when the guest sets the FP
bit, so we just let it do so when it wants to, because it has full access to
MSR.

For non-HV aware hosts with an FPU (like 440), we need to also adjust the
shadow MSR though. Otherwise the guest gets an FP unavailable trap even when
it really enabled the FP bit in MSR.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:50 +02:00
Bharat Bhushan 6df8d3fc58 booke: Added ONE_REG interface for IAC/DAC debug registers
IAC/DAC are defined as 32 bit while they are 64 bit wide. So ONE_REG
interface is added to set/get them.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:47 +02:00
Bharat Bhushan f61c94bb99 KVM: PPC: booke: Add watchdog emulation
This patch adds the watchdog emulation in KVM. The watchdog
emulation is enabled by KVM_ENABLE_CAP(KVM_CAP_PPC_BOOKE_WATCHDOG) ioctl.
The kernel timer are used for watchdog emulation and emulates
h/w watchdog state machine. On watchdog timer expiry, it exit to QEMU
if TCR.WRC is non ZERO. QEMU can reset/shutdown etc depending upon how
it is configured.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
[bharat.bhushan@freescale.com: reworked patch]
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
[agraf: adjust to new request framework]
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:47 +02:00
Alexander Graf 7c973a2ebb KVM: PPC: Add return value to core_check_requests
Requests may want to tell us that we need to go back into host state,
so add a return value for the checks.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:46 +02:00
Alexander Graf 7ee788556b KVM: PPC: Add return value in prepare_to_enter
Our prepare_to_enter helper wants to be able to return in more circumstances
to the host than only when an interrupt is pending. Broaden the interface a
bit and move even more generic code to the generic helper.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:46 +02:00
Alexander Graf 3766a4c693 KVM: PPC: Move kvm_guest_enter call into generic code
We need to call kvm_guest_enter in booke and book3s, so move its
call to generic code.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:45 +02:00
Alexander Graf bd2be6836e KVM: PPC: Book3S: PR: Rework irq disabling
Today, we disable preemption while inside guest context, because we need
to expose to the world that we are not in a preemptible context. However,
during that time we already have interrupts disabled, which would indicate
that we are in a non-preemptible context.

The reason the checks for irqs_disabled() fail for us though is that we
manually control hard IRQs and ignore all the lazy EE framework. Let's
stop doing that. Instead, let's always use lazy EE to indicate when we
want to disable IRQs, but do a special final switch that gets us into
EE disabled, but soft enabled state. That way when we get back out of
guest state, we are immediately ready to process interrupts.

This simplifies the code drastically and reduces the time that we appear
as preempt disabled.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:45 +02:00
Alexander Graf 24afa37b9c KVM: PPC: Consistentify vcpu exit path
When getting out of __vcpu_run, let's be consistent about the state we
return in. We want to always

  * have IRQs enabled
  * have called kvm_guest_exit before

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:45 +02:00
Alexander Graf 706fb730cb KVM: PPC: Exit guest context while handling exit
The x86 implementation of KVM accounts for host time while processing
guest exits. Do the same for us.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:43 +02:00
Alexander Graf e85ad380c6 KVM: PPC: BookE: Drop redundant vcpu->mode set
We only need to set vcpu->mode to outside once.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:43 +02:00
Alexander Graf 03d25c5bd5 KVM: PPC: Use same kvmppc_prepare_to_enter code for booke and book3s_pr
We need to do the same things when preparing to enter a guest for booke and
book3s_pr cores. Fold the generic code into a generic function that both call.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:42 +02:00
Alexander Graf 2d8185d4ee KVM: PPC: BookE: No duplicate request != 0 check
We only call kvmppc_check_requests() when vcpu->requests != 0, so drop
the redundant check in the function itself

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:42 +02:00
Alexander Graf 6346046c3a KVM: PPC: BookE: Add some more trace points
Without trace points, debugging what exactly is going on inside guest
code can be very tricky. Add a few more trace points at places that
hopefully tell us more when things go wrong.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:42 +02:00
Alexander Graf 862d31f788 KVM: PPC: E500: Implement MMU notifiers
The e500 target has lived without mmu notifiers ever since it got
introduced, but fails for the user space check on them with hugetlbfs.

So in order to get that one working, implement mmu notifiers in a
reasonably dumb fashion and be happy. On embedded hardware, we almost
never end up with mmu notifier calls, since most people don't overcommit.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:41 +02:00
Alexander Graf d69c643644 KVM: PPC: BookE: Add support for vcpu->mode
Generic KVM code might want to know whether we are inside guest context
or outside. It also wants to be able to push us out of guest context.

Add support to the BookE code for the generic vcpu->mode field that describes
the above states.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:41 +02:00
Alexander Graf 4ffc6356ec KVM: PPC: BookE: Add check_requests helper function
We need a central place to check for pending requests in. Add one that
only does the timer check we already do in a different place.

Later, this central function can be extended by more checks.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:41 +02:00
Alexander Graf cf1c5ca473 KVM: PPC: BookE: Expose remote TLB flushes in debugfs
We're already counting remote TLB flushes in a variable, but don't export
it to user space yet. Do so, so we know what's going on.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:39 +02:00
Alexander Graf 97c9505984 KVM: PPC: PR: Use generic tracepoint for guest exit
We want to have tracing information on guest exits for booke as well
as book3s. Since most information is identical, use a common trace point.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:39 +02:00
Bharat Bhushan 6328e593c3 booke/bookehv: Add host crit-watchdog exception support
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-07-11 17:39:36 +02:00
Bharat Bhushan 21bd000abf KVM: PPC: booke: Added DECAR support
Added the decrementer auto-reload support. DECAR is readable
on e500v2/e500mc and later cpus.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-05-30 11:43:11 +02:00
Alexander Graf 966cd0f3bd KVM: PPC: Ignore unhalt request from kvm_vcpu_block
When running kvm_vcpu_block and it realizes that the CPU is actually good
to run, we get a request bit set for KVM_REQ_UNHALT. Right now, there's
nothing we can do with that bit, so let's unset it right after the call
again so we don't get confused in our later checks for pending work.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 14:02:38 +03:00
Alexander Graf 6020c0f6e7 KVM: PPC: Pass EA to updating emulation ops
When emulating updating load/store instructions (lwzu, stwu, ...) we need to
write the effective address of the load/store into a register.

Currently, we write the physical address in there, which is very wrong. So
instead let's save off where the virtual fault was on MMIO and use that
information as value to put into the register.

While at it, also move the XOP variants of the above instructions to the new
scheme of using the already known vaddr instead of calculating it themselves.

Reported-by: Jörg Sommer <joerg@alea.gnuu.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 14:01:37 +03:00
Alexander Graf 03660ba270 KVM: PPC: Booke: only prepare to enter when we enter
So far, we've always called prepare_to_enter even when all we did was return
to the host. This patch changes that semantic to only call prepare_to_enter
when we actually want to get back into the guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:55:29 +03:00
Alexander Graf 7cc1e8ee78 KVM: PPC: booke: Reinject performance monitor interrupts
When we get a performance monitor interrupt, we need to make sure that
the host receives it. So reinject it like we reinject the other host
destined interrupts.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:55:28 +03:00
Alexander Graf 4e642ccbd6 KVM: PPC: booke: expose good state on irq reinject
When reinjecting an interrupt into the host interrupt handler after we're
back in host kernel land, we need to tell the kernel where the interrupt
happened. We can't tell it that we were in guest state, because that might
lead to random code walking host addresses. So instead, we tell it that
we came from the interrupt reinject code.

This helps getting reasonable numbers out of perf.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:55:26 +03:00
Alexander Graf 95f2e92144 KVM: PPC: booke: Support perfmon interrupts
When during guest context we get a performance monitor interrupt, we
currently bail out and oops. Let's route it to its correct handler
instead.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:55:24 +03:00
Alexander Graf 0268597c81 KVM: PPC: booke: add GS documentation for program interrupt
The comment for program interrupts triggered when using bookehv was
misleading. Update it to mention why MSR_GS indicates that we have
to inject an interrupt into the guest again, not emulate it.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:55:19 +03:00
Alexander Graf c35c9d84cf KVM: PPC: booke: Readd debug abort code for machine check
When during guest execution we get a machine check interrupt, we don't
know how to handle it yet. So let's add the error printing code back
again that we dropped accidently earlier and tell user space that something
went really wrong.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:55:17 +03:00
Alexander Graf 8b3a00fcd3 KVM: PPC: booke: BOOKE_IRQPRIO_MAX is n+1
The semantics of BOOKE_IRQPRIO_MAX changed to denote the highest available
irqprio + 1, so let's reflect that in the code too.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:55:06 +03:00
Alexander Graf a8e4ef8414 KVM: PPC: booke: rework rescheduling checks
Instead of checking whether we should reschedule only when we exited
due to an interrupt, let's always check before entering the guest back
again. This gets the target more in line with the other archs.

Also while at it, generalize the whole thing so that eventually we could
have a single kvmppc_prepare_to_enter function for all ppc targets that
does signal and reschedule checking for us.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:55:05 +03:00
Alexander Graf d1ff54992d KVM: PPC: booke: deliver program int on emulation failure
When we fail to emulate an instruction for the guest, we better go in and
tell it that we failed to emulate it, by throwing an illegal instruction
exception.

Please beware that we basically never get around to telling the guest that
we failed thanks to the debugging code right above it. If user space however
decides that it wants to ignore the debug, we would at least do "the right
thing" afterwards.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:55:03 +03:00
Alexander Graf acab052906 KVM: PPC: booke: remove leftover debugging
The e500mc patches left some debug code in that we don't need. Remove it.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:55:01 +03:00
Alexander Graf bf7ca4bdcb KVM: PPC: rename CONFIG_KVM_E500 -> CONFIG_KVM_E500V2
The CONFIG_KVM_E500 option really indicates that we're running on a V2 machine,
not on a machine of the generic E500 class. So indicate that properly and
change the config name accordingly.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:54:57 +03:00
Alexander Graf 79300f8cb9 KVM: PPC: e500mc: implicitly set MSR_GS
When setting MSR for an e500mc guest, we implicitly always set MSR_GS
to make sure the guest is in guest state. Since we have this implicit
rule there, we don't need to explicitly pass MSR_GS to set_msr().

Remove all explicit setters of MSR_GS.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:54:52 +03:00
Alexander Graf 4ab969199e KVM: PPC: e500mc: Add doorbell emulation support
When one vcpu wants to kick another, it can issue a special IPI instruction
called msgsnd. This patch emulates this instruction, its clearing counterpart
and the infrastructure required to actually trigger that interrupt inside
a guest vcpu.

With this patch, SMP guests on e500mc work.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:54:50 +03:00
Scott Wood 8fae845f49 KVM: PPC: booke: standard PPC floating point support
e500mc has a normal PPC FPU, rather than SPE which is found
on e500v1/v2.

Based on code from Liu Yu <yu.liu@freescale.com>.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:54:15 +03:00
Scott Wood d30f6e4800 KVM: PPC: booke: category E.HV (GS-mode) support
Chips such as e500mc that implement category E.HV in Power ISA 2.06
provide hardware virtualization features, including a new MSR mode for
guest state.  The guest OS can perform many operations without trapping
into the hypervisor, including transitions to and from guest userspace.

Since we can use SRR1[GS] to reliably tell whether an exception came from
guest state, instead of messing around with IVPR, we use DO_KVM similarly
to book3s.

Current issues include:
 - Machine checks from guest state are not routed to the host handler.
 - The guest can cause a host oops by executing an emulated instruction
   in a page that lacks read permission.  Existing e500/4xx support has
   the same problem.

Includes work by Ashish Kalra <Ashish.Kalra@freescale.com>,
Varun Sethi <Varun.Sethi@freescale.com>, and
Liu Yu <yu.liu@freescale.com>.

Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: remove pt_regs usage]
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:51:19 +03:00
Scott Wood fafd683278 KVM: PPC: booke: Move vm core init/destroy out of booke.c
e500mc will want to do lpid allocation/deallocation here.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:51:05 +03:00
Scott Wood 94fa9d9927 KVM: PPC: booke: add booke-level vcpu load/put
This gives us a place to put load/put actions that correspond to
code that is booke-specific but not specific to a particular core.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:51:04 +03:00
Paul Mackerras 31f3438eca KVM: PPC: Move kvm_vcpu_ioctl_[gs]et_one_reg down to platform-specific code
This moves the get/set_one_reg implementation down from powerpc.c into
booke.c, book3s_pr.c and book3s_hv.c.  This avoids #ifdefs in C code,
but more importantly, it fixes a bug on Book3s HV where we were
accessing beyond the end of the kvm_vcpu struct (via the to_book3s()
macro) and corrupting memory, causing random crashes and file corruption.

On Book3s HV we only accept setting the HIOR to zero, since the guest
runs in supervisor mode and its vectors are never offset from zero.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
[agraf update to apply on top of changed ONE_REG patches]
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-05 14:52:41 +02:00
Scott Wood dfd4d47e9a KVM: PPC: booke: Improve timer register emulation
Decrementers are now properly driven by TCR/TSR, and the guest
has full read/write access to these registers.

The decrementer keeps ticking (and setting the TSR bit) regardless of
whether the interrupts are enabled with TCR.

The decrementer stops at zero, rather than going negative.

Decrementers (and FITs, once implemented) are delivered as
level-triggered interrupts -- dequeued when the TSR bit is cleared, not
on delivery.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
[scottwood@freescale.com: significant changes]
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-05 14:52:27 +02:00
Scott Wood b59049720d KVM: PPC: Paravirtualize SPRG4-7, ESR, PIR, MASn
This allows additional registers to be accessed by the guest
in PR-mode KVM without trapping.

SPRG4-7 are readable from userspace.  On booke, KVM will sync
these registers when it enters the guest, so that accesses from
guest userspace will work.  The guest kernel, OTOH, must consistently
use either the real registers or the shared area between exits.  This
also applies to the already-paravirted SPRG3.

On non-booke, it's not clear to what extent SPRG4-7 are supported
(they're not architected for book3s, but exist on at least some classic
chips).  They are copied in the get/set regs ioctls, but I do not see any
non-booke emulation.  I also do not see any syncing with real registers
(in PR-mode) including the user-readable SPRG3.  This patch should not
make that situation any worse.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-05 14:52:26 +02:00
Scott Wood 29ac26efbd KVM: PPC: booke: Fix int_pending calculation for MSR[EE] paravirt
int_pending was only being lowered if a bit in pending_exceptions
was cleared during exception delivery -- but for interrupts, we clear
it during IACK/TSR emulation.  This caused paravirt for enabling
MSR[EE] to be ineffective.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-05 14:52:26 +02:00
Scott Wood c59a6a3e4e KVM: PPC: booke: Check for MSR[WE] in prepare_to_enter
This prevents us from inappropriately blocking in a KVM_SET_REGS
ioctl -- the MSR[WE] will take effect when the guest is next entered.

It also causes SRR1[WE] to be set when we enter the guest's interrupt
handler, which is what e500 hardware is documented to do.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-05 14:52:26 +02:00
Scott Wood 25051b5a5a KVM: PPC: Move prepare_to_enter call site into subarch code
This function should be called with interrupts disabled, to avoid
a race where an exception is delivered after we check, but the
resched kick is received before we disable interrupts (and thus doesn't
actually trigger the exit code that would recheck exceptions).

booke already does this properly in the lightweight exit case, but
not on initial entry.

For now, move the call of prepare_to_enter into subarch-specific code so
that booke can do the right thing here.  Ideally book3s would do the same
thing, but I'm having a hard time seeing where it does any interrupt
disabling of this sort (plus it has several additional call sites), so
I'm deferring the book3s fix to someone more familiar with that code.
book3s behavior should be unchanged by this patch.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-05 14:52:26 +02:00
Scott Wood 7e28e60ef9 KVM: PPC: Rename deliver_interrupts to prepare_to_enter
This function also updates paravirt int_pending, so rename it
to be more obvious that this is a collection of checks run prior
to (re)entering a guest.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-05 14:52:25 +02:00
Scott Wood 1d1ef22208 KVM: PPC: booke: check for signals in kvmppc_vcpu_run
Currently we check prior to returning from a lightweight exit,
but not prior to initial entry.

book3s already does a similar test.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-05 14:52:25 +02:00
Scott Wood 841741f23b KVM: PPC: e500: Don't hardcode PIR=0
The hardcoded behavior prevents proper SMP support.

user space shall specify the vcpu's PIR as the vcpu id.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-05 14:52:24 +02:00
Alexander Graf af8f38b349 KVM: PPC: Add sanity checking to vcpu_run
There are multiple features in PowerPC KVM that can now be enabled
depending on the user's wishes. Some of the combinations don't make
sense or don't work though.

So this patch adds a way to check if the executing environment would
actually be able to run the guest properly. It also adds sanity
checks if PVR is set (should always be true given the current code
flow), if PAPR is only used with book3s_64 where it works and that
HV KVM is only used in PAPR mode.

Signed-off-by: Alexander Graf <agraf@suse.de>
2011-09-25 19:52:27 +03:00
Paul Mackerras df6909e5d5 KVM: PPC: Move guest enter/exit down into subarch-specific code
Instead of doing the kvm_guest_enter/exit() and local_irq_dis/enable()
calls in powerpc.c, this moves them down into the subarch-specific
book3s_pr.c and booke.c.  This eliminates an extra local_irq_enable()
call in book3s_pr.c, and will be needed for when we do SMT4 guest
support in the book3s hypervisor mode code.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:51 +03:00
Paul Mackerras f9e0554dec KVM: PPC: Pass init/destroy vm and prepare/commit memory region ops down
This arranges for the top-level arch/powerpc/kvm/powerpc.c file to
pass down some of the calls it gets to the lower-level subarchitecture
specific code.  The lower-level implementations (in booke.c and book3s.c)
are no-ops.  The coming book3s_hv.c will need this.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:50 +03:00
Liu Yu dd9ebf1f94 KVM: PPC: e500: Add shadow PID support
Dynamically assign host PIDs to guest PIDs, splitting each guest PID into
multiple host (shadow) PIDs based on kernel/user and MSR[IS/DS].  Use
both PID0 and PID1 so that the shadow PIDs for the right mode can be
selected, that correspond both to guest TID = zero and guest TID = guest
PID.

This allows us to significantly reduce the frequency of needing to
invalidate the entire TLB.  When the guest mode or PID changes, we just
update the host PID0/PID1.  And since the allocation of shadow PIDs is
global, multiple guests can share the TLB without conflict.

Note that KVM does not yet support the guest setting PID1 or PID2 to
a value other than zero.  This will need to be fixed for nested KVM
to work.  Until then, we enforce the requirement for guest PID1/PID2
to stay zero by failing the emulation if the guest tries to set them
to something else.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:39 +03:00
Scott Wood a4cd8b23ac KVM: PPC: e500: enable magic page
This is a shared page used for paravirtualization.  It is always present
in the guest kernel's effective address space at the address indicated
by the hypercall that enables it.

The physical address specified by the hypercall is not used, as
e500 does not have real mode.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:37 +03:00
Scott Wood 4cd35f675b KVM: PPC: e500: Save/restore SPE state
This is done lazily.  The SPE save will be done only if the guest has
used SPE since the last preemption or heavyweight exit.  Restore will be
done only on demand, when enabling MSR_SPE in the shadow MSR, in response
to an SPE fault or mtmsr emulation.

For SPEFSCR, Linux already switches it on context switch (non-lazily), so
the only remaining bit is to save it between qemu and the guest.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:32 +03:00
Scott Wood ecee273fc4 KVM: PPC: booke: use shadow_msr
Keep the guest MSR and the guest-mode true MSR separate, rather than
modifying the guest MSR on each guest entry to produce a true MSR.

Any bits which should be modified based on guest MSR must be explicitly
propagated from vcpu->arch.shared->msr to vcpu->arch.shadow_msr in
kvmppc_set_msr().

While we're modifying the guest entry code, reorder a few instructions
to bury some load latencies.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:32 +03:00
Scott Wood 5ce941ee42 KVM: PPC: booke: add sregs support
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-05-22 08:47:53 -04:00
Peter Tyser bc9c1933d9 KVM: PPC: Fix SPRG get/set for Book3S and BookE
Previously SPRGs 4-7 were improperly read and written in
kvm_arch_vcpu_ioctl_get_regs() and kvm_arch_vcpu_ioctl_set_regs();

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Tyser <ptyser@xes-inc.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-17 13:08:25 -03:00
Alexander Graf c5335f1765 KVM: PPC: Implement level interrupts for BookE
BookE also wants to support level based interrupts, so let's implement
all the necessary logic there. We need to trick a bit here because the
irqprios are 1:1 assigned to architecture defined values. But since there
is some space left there, we can just pick a random one and move it later
on - it's internal anyways.

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:20 +02:00
Hollis Blanchard 082decf29a KVM: PPC: initialize IVORs in addition to IVPR
Developers can now tell at a glace the exact type of the premature interrupt,
instead of just knowing that there was some premature interrupt.

Signed-off-by: Hollis Blanchard <hollis_blanchard@mentor.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:17 +02:00
Alexander Graf 90bba35887 KVM: PPC: Tell guest about pending interrupts
When the guest turns on interrupts again, it needs to know if we have an
interrupt pending for it. Because if so, it should rather get out of guest
context and get the interrupt.

So we introduce a new field in the shared page that we use to tell the guest
that there's a pending interrupt lying around.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:46 +02:00
Alexander Graf 5c6cedf488 KVM: PPC: Add PV guest critical sections
When running in hooked code we need a way to disable interrupts without
clobbering any interrupts or exiting out to the hypervisor.

To achieve this, we have an additional critical field in the shared page. If
that field is equal to the r1 register of the guest, it tells the hypervisor
that we're in such a critical section and thus may not receive any interrupts.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:46 +02:00
Alexander Graf 2a342ed577 KVM: PPC: Implement hypervisor interface
To communicate with KVM directly we need to plumb some sort of interface
between the guest and KVM. Usually those interfaces use hypercalls.

This hypercall implementation is described in the last patch of the series
in a special documentation file. Please read that for further information.

This patch implements stubs to handle KVM PPC hypercalls on the host and
guest side alike.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:45 +02:00
Alexander Graf a73a9599e0 KVM: PPC: Convert SPRG[0-4] to shared page
When in kernel mode there are 4 additional registers available that are
simple data storage. Instead of exiting to the hypervisor to read and
write those, we can just share them with the guest using the page.

This patch converts all users of the current field to the shared page.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:45 +02:00
Alexander Graf de7906c36c KVM: PPC: Convert SRR0 and SRR1 to shared page
The SRR0 and SRR1 registers contain cached values of the PC and MSR
respectively. They get written to by the hypervisor when an interrupt
occurs or directly by the kernel. They are also used to tell the rfi(d)
instruction where to jump to.

Because it only gets touched on defined events that, it's very simple to
share with the guest. Hypervisor and guest both have full r/w access.

This patch converts all users of the current field to the shared page.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:45 +02:00
Alexander Graf 5e030186df KVM: PPC: Convert DAR to shared page.
The DAR register contains the address a data page fault occured at. This
register behaves pretty much like a simple data storage register that gets
written to on data faults. There is no hypervisor interaction required on
read or write.

This patch converts all users of the current field to the shared page.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:45 +02:00
Alexander Graf 666e7252a1 KVM: PPC: Convert MSR to shared page
One of the most obvious registers to share with the guest directly is the
MSR. The MSR contains the "interrupts enabled" flag which the guest has to
toggle in critical sections.

So in order to bring the overhead of interrupt en- and disabling down, let's
put msr into the shared page. Keep in mind that even though you can fully read
its contents, writing to it doesn't always update all state. There are a few
safe fields that don't require hypervisor interaction. See the documentation
for a list of MSR bits that are safe to be set from inside the guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:43 +02:00
Asias He 6045be5dea KVM: PPC: fix uninitialized variable warning in kvm_ppc_core_deliver_interrupts
Fixes:
arch/powerpc/kvm/booke.c: In function 'kvmppc_core_deliver_interrupts':
arch/powerpc/kvm/booke.c:147: warning: 'msr_mask' may be used uninitialized in this function

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:53 +03:00
Avi Kivity 2122ff5eab KVM: move vcpu locking to dispatcher for generic vcpu ioctls
All vcpu ioctls need to be locked, so instead of locking each one specifically
we lock at the generic dispatcher.

This patch only updates generic ioctls and leaves arch specific ioctls alone.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:35:47 +03:00
Avi Kivity 98001d8d01 KVM: PPC: Add missing vcpu_load()/vcpu_put() in vcpu ioctls
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-19 11:41:10 +03:00
Alexander Graf 4496f97482 KVM: PPC: Add dequeue for external on BookE
Commit a0abee86af2d1f048dbe99d2bcc4a2cefe685617 introduced unsetting of the
IRQ line from userspace. This added a new core specific callback that I
apparently forgot to add for BookE.

So let's add the callback for BookE as well, making it build again.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17 12:17:32 +03:00
Tejun Heo 5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Liu Yu daf5e27109 KVM: ppc/booke: Set ESR and DEAR when inject interrupt to guest
Old method prematurely sets ESR and DEAR.
Move this part after we decide to inject interrupt,
which is more like hardware behave.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Acked-by: Hollis Blanchard <hollis@penguinppc.org>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:36:10 -03:00
Alexander Graf 25a8a02d26 KVM: PPC: Emulate trap SRR1 flags properly
Book3S needs some flags in SRR1 to get to know details about an interrupt.

One such example is the trap instruction. It tells the guest kernel that
a program interrupt is due to a trap using a bit in SRR1.

This patch implements above behavior, making WARN_ON behave like WARN_ON.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:49 -03:00
Alexander Graf 992b5b29b5 KVM: PPC: Add helpers for CR, XER
We now have helpers for the GPRs, so let's also add some for CR and XER.

Having them in the PACA simplifies code a lot, as we don't need to care
about where to store CC or not to overflow any integers.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:47 -03:00
Alexander Graf 8e5b26b55a KVM: PPC: Use accessor functions for GPR access
All code in PPC KVM currently accesses gprs in the vcpu struct directly.

While there's nothing wrong with that wrt the current way gprs are stored
and loaded, it doesn't suffice for the PACA acceleration that will follow
in this patchset.

So let's just create little wrapper inline functions that we call whenever
a GPR needs to be read from or written to. The compiled code shouldn't really
change at all for now.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:47 -03:00
Alexander Graf 7706664d39 KVM: powerpc: Improve DEC handling
We treated the DEC interrupt like an edge based one. This is not true for
Book3s. The DEC keeps firing until mtdec is issued again and thus clears
the interrupt line.

So let's implement this logic in KVM too. This patch moves the line clearing
from the firing of the interrupt to the mtdec emulation.

This makes PPC64 guests work without AGGRESSIVE_DEC defined.

Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Acked-by: Hollis Blanchard <hollis@penguinppc.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:42 -03:00