Commit Graph

544 Commits

Author SHA1 Message Date
Christoffer Dall 00dafa0fcf KVM: arm/arm64: vgic: Get rid of live_lrs
There is no need to calculate and maintain live_lrs when we always
populate the lowest numbered LRs first on every entry and clear all LRs
on every exit.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-04-09 07:45:31 -07:00
Shih-Wei Li f6769581e9 KVM: arm/arm64: vgic: Avoid flushing vgic state when there's no pending IRQ
We do not need to flush vgic states in each world switch unless
there is pending IRQ queued to the vgic's ap list. We can thus reduce
the overhead by not grabbing the spinlock and not making the extra
function call to vgic_flush_lr_state.

Note: list_empty is a single atomic read (uses READ_ONCE) and can
therefore check if a list is empty or not without the need to take the
spinlock protecting the list.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Shih-Wei Li <shihwei@cs.columbia.edu>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-09 07:45:31 -07:00
Christoffer Dall 328e566479 KVM: arm/arm64: vgic: Defer touching GICH_VMCR to vcpu_load/put
We don't have to save/restore the VMCR on every entry to/from the guest,
since on GICv2 we can access the control interface from EL1 and on VHE
systems with GICv3 we can access the control interface from KVM running
in EL2.

GICv3 systems without VHE becomes the rare case, which has to
save/restore the register on each round trip.

Note that userspace accesses may see out-of-date values if the VCPU is
running while accessing the VGIC state via the KVM device API, but this
is already the case and it is up to userspace to quiesce the CPUs before
reading the CPU registers from the GIC for an up-to-date view.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-09 07:45:22 -07:00
Christoffer Dall 6d56111c92 KVM: arm/arm64: vgic: Fix GICC_PMR uaccess on GICv3 and clarify ABI
As an oversight, for GICv2, we accidentally export the GICC_PMR register
in the format of the GICH_VMCR.VMPriMask field in the lower 5 bits of a
word, meaning that userspace must always use the lower 5 bits to
communicate with the KVM device and must shift the value left by 3
places to obtain the actual priority mask level.

Since GICv3 supports the full 8 bits of priority masking in the ICH_VMCR,
we have to fix the value we export when emulating a GICv2 on top of a
hardware GICv3 and exporting the emulated GICv2 state to userspace.

Take the chance to clarify this aspect of the ABI.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-04 14:33:59 +02:00
Christoffer Dall 5b0d2cc280 KVM: arm64: Ensure LRs are clear when they should be
We currently have some code to clear the list registers on GICv3, but we
never call this code, because the caller got nuked when removing the old
vgic.  We also used to have a similar GICv2 part, but that got lost in
the process too.

Let's reintroduce the logic for GICv2 and call the logic when we
initialize the use of hypervisors on the CPU, for example when first
loading KVM or when exiting a low power state.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-04-04 14:33:58 +02:00
Ard Biesheuvel 63d7c6afc5 arm: kvm: move kvm_vgic_global_state out of .text section
The kvm_vgic_global_state struct contains a static key which is
written to by jump_label_init() at boot time. So in preparation of
making .text regions truly (well, almost truly) read-only, mark
kvm_vgic_global_state __ro_after_init so it moves to the .rodata
section instead.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-03-23 13:53:46 +00:00
Andre Przywara a5e1e6ca94 KVM: arm/arm64: VGIC: Fix command handling while ITS being disabled
The ITS spec says that ITS commands are only processed when the ITS
is enabled (section 8.19.4, Enabled, bit[0]). Our emulation was not taking
this into account.
Fix this by checking the enabled state before handling CWRITER writes.

On the other hand that means that CWRITER could advance while the ITS
is disabled, and enabling it would need those commands to be processed.
Fix this case as well by refactoring actual command processing and
calling this from both the GITS_CWRITER and GITS_CTLR handlers.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-03-07 15:44:08 +00:00
Jintack Lim 370a0ec181 KVM: arm/arm64: Let vcpu thread modify its own active state
Currently, if a vcpu thread tries to change the active state of an
interrupt which is already on the same vcpu's AP list, it will loop
forever. Since the VGIC mmio handler is called after a vcpu has
already synced back the LR state to the struct vgic_irq, we can just
let it proceed safely.

Cc: stable@vger.kernel.org
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Jintack Lim <jintack@cs.columbia.edu>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-03-07 14:48:16 +00:00
Marc Zyngier 4dfc050571 KVM: arm/arm64: vgic-v3: Don't pretend to support IRQ/FIQ bypass
Our GICv3 emulation always presents ICC_SRE_EL1 with DIB/DFB set to
zero, which implies that there is a way to bypass the GIC and
inject raw IRQ/FIQ by driving the CPU pins.

Of course, we don't allow that when the GIC is configured, but
we fail to indicate that to the guest. The obvious fix is to
set these bits (and never let them being changed again).

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-03-06 10:30:57 +00:00
Jintack Lim 7b6b46311a KVM: arm/arm64: Emulate the EL1 phys timer registers
Emulate read and write operations to CNTP_TVAL, CNTP_CVAL and CNTP_CTL.
Now VMs are able to use the EL1 physical timer.

Signed-off-by: Jintack Lim <jintack@cs.columbia.edu>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-02-08 15:13:37 +00:00
Jintack Lim f242adaf0c KVM: arm/arm64: Set up a background timer for the physical timer emulation
Set a background timer for the EL1 physical timer emulation while VMs
are running, so that VMs get the physical timer interrupts in a timely
manner.

Schedule the background timer on entry to the VM and cancel it on exit.
This would not have any performance impact to the guest OSes that
currently use the virtual timer since the physical timer is always not
enabled.

Signed-off-by: Jintack Lim <jintack@cs.columbia.edu>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-02-08 15:13:36 +00:00
Jintack Lim fb280e9757 KVM: arm/arm64: Set a background timer to the earliest timer expiration
When scheduling a background timer, consider both of the virtual and
physical timer and pick the earliest expiration time.

Signed-off-by: Jintack Lim <jintack@cs.columbia.edu>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-02-08 15:13:35 +00:00
Jintack Lim 58e0c9732a KVM: arm/arm64: Update the physical timer interrupt level
Now that we maintain the EL1 physical timer register states of VMs,
update the physical timer interrupt level along with the virtual one.

Signed-off-by: Jintack Lim <jintack@cs.columbia.edu>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-02-08 15:13:35 +00:00
Jintack Lim a91d18551e KVM: arm/arm64: Initialize the emulated EL1 physical timer
Initialize the emulated EL1 physical timer with the default irq number.

Signed-off-by: Jintack Lim <jintack@cs.columbia.edu>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-02-08 15:13:34 +00:00
Jintack Lim 9171fa2e09 KVM: arm/arm64: Decouple kvm timer functions from virtual timer
Now that we have a separate structure for timer context, make functions
generic so that they can work with any timer context, not just the
virtual timer context.  This does not change the virtual timer
functionality.

Signed-off-by: Jintack Lim <jintack@cs.columbia.edu>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-02-08 15:13:33 +00:00
Jintack Lim 90de943a43 KVM: arm/arm64: Move cntvoff to each timer context
Make cntvoff per each timer context. This is helpful to abstract kvm
timer functions to work with timer context without considering timer
types (e.g. physical timer or virtual timer).

This also would pave the way for ever doing adjustments of the cntvoff
on a per-CPU basis if that should ever make sense.

Signed-off-by: Jintack Lim <jintack@cs.columbia.edu>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-02-08 15:13:33 +00:00
Jintack Lim fbb4aeec5f KVM: arm/arm64: Abstract virtual timer context into separate structure
Abstract virtual timer context into a separate structure and change all
callers referring to timer registers, irq state and so on. No change in
functionality.

This is about to become very handy when adding the EL1 physical timer.

Signed-off-by: Jintack Lim <jintack@cs.columbia.edu>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-02-08 15:13:32 +00:00
Shanker Donthineni 0bdbf3b071 KVM: arm/arm64: vgic: Stop injecting the MSI occurrence twice
The IRQFD framework calls the architecture dependent function
twice if the corresponding GSI type is edge triggered. For ARM,
the function kvm_set_msi() is getting called twice whenever the
IRQFD receives the event signal. The rest of the code path is
trying to inject the MSI without any validation checks. No need
to call the function vgic_its_inject_msi() second time to avoid
an unnecessary overhead in IRQ queue logic. It also avoids the
possibility of VM seeing the MSI twice.

Simple fix, return -1 if the argument 'level' value is zero.

Cc: stable@vger.kernel.org
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-02-08 15:13:14 +00:00
Christoffer Dall 11710dec8a KVM: arm/arm64: Remove kvm_vgic_inject_mapped_irq
The only benefit of having kvm_vgic_inject_mapped_irq separate from
kvm_vgic_inject_irq is that we pass a boolean that we use for error
checking on the injection path.

While this could potentially help in some aspect of robustness, it's
also a little bit of a defensive move, and arguably callers into the
vgic should have make sure they have marked their virtual IRQs as mapped
if required.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-02-01 11:56:35 +01:00
Vijaya Kumar K e96a006cb0 KVM: arm/arm64: vgic: Implement KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO ioctl
Userspace requires to store and restore of line_level for
level triggered interrupts using ioctl KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-01-30 13:47:29 +00:00
Vijaya Kumar K d017d7b0bd KVM: arm/arm64: vgic: Implement VGICv3 CPU interface access
VGICv3 CPU interface registers are accessed using
KVM_DEV_ARM_VGIC_CPU_SYSREGS ioctl. These registers are accessed
as 64-bit. The cpu MPIDR value is passed along with register id.
It is used to identify the cpu for registers access.

The VM that supports SEIs expect it on destination machine to handle
guest aborts and hence checked for ICC_CTLR_EL1.SEIS compatibility.
Similarly, VM that supports Affinity Level 3 that is required for AArch64
mode, is required to be supported on destination machine. Hence checked
for ICC_CTLR_EL1.A3V compatibility.

The arch/arm64/kvm/vgic-sys-reg-v3.c handles read and write of VGIC
CPU registers for AArch64.

For AArch32 mode, arch/arm/kvm/vgic-v3-coproc.c file is created but
APIs are not implemented.

Updated arch/arm/include/uapi/asm/kvm.h with new definitions
required to compile for AArch32.

The version of VGIC v3 specification is defined here
Documentation/virtual/kvm/devices/arm-vgic-v3.txt

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-01-30 13:47:25 +00:00
Vijaya Kumar K 5fb247d79c KVM: arm/arm64: vgic: Introduce VENG0 and VENG1 fields to vmcr struct
ICC_VMCR_EL2 supports virtual access to ICC_IGRPEN1_EL1.Enable
and ICC_IGRPEN0_EL1.Enable fields. Add grpen0 and grpen1 member
variables to struct vmcr to support read and write of these fields.

Also refactor vgic_set_vmcr and vgic_get_vmcr() code.
Drop ICH_VMCR_CTLR_SHIFT and ICH_VMCR_CTLR_MASK macros and instead
use ICH_VMCR_EOI* and ICH_VMCR_CBPR* macros.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-01-30 13:47:21 +00:00
Vijaya Kumar K 94574c9488 KVM: arm/arm64: vgic: Add distributor and redistributor access
VGICv3 Distributor and Redistributor registers are accessed using
KVM_DEV_ARM_VGIC_GRP_DIST_REGS and KVM_DEV_ARM_VGIC_GRP_REDIST_REGS
with KVM_SET_DEVICE_ATTR and KVM_GET_DEVICE_ATTR ioctls.
These registers are accessed as 32-bit and cpu mpidr
value passed along with register offset is used to identify the
cpu for redistributor registers access.

The version of VGIC v3 specification is defined here
Documentation/virtual/kvm/devices/arm-vgic-v3.txt

Also update arch/arm/include/uapi/asm/kvm.h to compile for
AArch32 mode.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-01-30 13:47:07 +00:00
Vijaya Kumar K 2df903a89a KVM: arm/arm64: vgic: Implement support for userspace access
Read and write of some registers like ISPENDR and ICPENDR
from userspace requires special handling when compared to
guest access for these registers.

Refer to Documentation/virtual/kvm/devices/arm-vgic-v3.txt
for handling of ISPENDR, ICPENDR registers handling.

Add infrastructure to support guest and userspace read
and write for the required registers
Also moved vgic_uaccess from vgic-mmio-v2.c to vgic-mmio.c

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-01-30 13:47:02 +00:00
Christoffer Dall 10f92c4c53 KVM: arm/arm64: vgic: Add debugfs vgic-state file
Add a file to debugfs to read the in-kernel state of the vgic.  We don't
do any locking of the entire VGIC state while traversing all the IRQs,
so if the VM is running the user/developer may not see a quiesced state,
but should take care to pause the VM using facilities in user space for
that purpose.

We also don't support LPIs yet, but they can be added easily if needed.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-01-25 13:50:03 +01:00
Christoffer Dall 8694e4da66 KVM: arm/arm64: Remove struct vgic_irq pending field
One of the goals behind the VGIC redesign was to get rid of cached or
intermediate state in the data structures, but we decided to allow
ourselves to precompute the pending value of an IRQ based on the line
level and pending latch state.  However, this has now become difficult
to base proper GICv3 save/restore on, because there is a potential to
modify the pending state without knowing if an interrupt is edge or
level configured.

See the following post and related message for more background:
https://lists.cs.columbia.edu/pipermail/kvmarm/2017-January/023195.html

This commit gets rid of the precomputed pending field in favor of a
function that calculates the value when needed, irq_is_pending().

The soft_pending field is renamed to pending_latch to represent that
this latch is the equivalent hardware latch which gets manipulated by
the input signal for edge-triggered interrupts and when writing to the
SPENDR/CPENDR registers.

After this commit save/restore code should be able to simply restore the
pending_latch state, line_level state, and config state in any order and
get the desired result.

Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-01-25 13:26:13 +01:00
Marc Zyngier 1193e6aeec KVM: arm/arm64: vgic: Fix deadlock on error handling
Dmitry Vyukov reported that the syzkaller fuzzer triggered a
deadlock in the vgic setup code when an error was detected, as
the cleanup code tries to take a lock that is already held by
the setup code.

The fix is to avoid retaking the lock when cleaning up, by
telling the cleanup function that we already hold it.

Cc: stable@vger.kernel.org
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-01-13 11:19:35 +00:00
Jintack Lim 488f94d721 KVM: arm64: Access CNTHCTL_EL2 bit fields correctly on VHE systems
Current KVM world switch code is unintentionally setting wrong bits to
CNTHCTL_EL2 when E2H == 1, which may allow guest OS to access physical
timer.  Bit positions of CNTHCTL_EL2 are changing depending on
HCR_EL2.E2H bit.  EL1PCEN and EL1PCTEN are 1st and 0th bits when E2H is
not set, but they are 11th and 10th bits respectively when E2H is set.

In fact, on VHE we only need to set those bits once, not for every world
switch. This is because the host kernel runs in EL2 with HCR_EL2.TGE ==
1, which makes those bits have no effect for the host kernel execution.
So we just set those bits once for guests, and that's it.

Signed-off-by: Jintack Lim <jintack@cs.columbia.edu>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-01-13 11:19:25 +00:00
Christoffer Dall 63e41226af KVM: arm/arm64: Fix occasional warning from the timer work function
When a VCPU blocks (WFI) and has programmed the vtimer, we program a
soft timer to expire in the future to wake up the vcpu thread when
appropriate.  Because such as wake up involves a vcpu kick, and the
timer expire function can get called from interrupt context, and the
kick may sleep, we have to schedule the kick in the work function.

The work function currently has a warning that gets raised if it turns
out that the timer shouldn't fire when it's run, which was added because
the idea was that in that case the work should never have been cancelled.

However, it turns out that this whole thing is racy and we can get
spurious warnings.  The problem is that we clear the armed flag in the
work function, which may run in parallel with the
kvm_timer_unschedule->timer_disarm() call.  This results in a possible
situation where the timer_disarm() call does not call
cancel_work_sync(), which effectively synchronizes the completion of the
work function with running the VCPU.  As a result, the VCPU thread
proceeds before the work function completees, causing changes to the
timer state such that kvm_timer_should_fire(vcpu) returns false in the
work function.

All we do in the work function is to kick the VCPU, and an occasional
rare extra kick never harmed anyone.  Since the race above is extremely
rare, we don't bother checking if the race happens but simply remove the
check and the clearing of the armed flag from the work function.

Reported-by: Matthias Brugger <mbrugger@suse.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-01-13 11:15:30 +00:00
Linus Torvalds 3ddc76dfc7 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer type cleanups from Thomas Gleixner:
 "This series does a tree wide cleanup of types related to
  timers/timekeeping.

   - Get rid of cycles_t and use a plain u64. The type is not really
     helpful and caused more confusion than clarity

   - Get rid of the ktime union. The union has become useless as we use
     the scalar nanoseconds storage unconditionally now. The 32bit
     timespec alike storage got removed due to the Y2038 limitations
     some time ago.

     That leaves the odd union access around for no reason. Clean it up.

  Both changes have been done with coccinelle and a small amount of
  manual mopping up"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ktime: Get rid of ktime_equal()
  ktime: Cleanup ktime_set() usage
  ktime: Get rid of the union
  clocksource: Use a plain u64 instead of cycle_t
2016-12-25 14:30:04 -08:00
Thomas Gleixner a5a1d1c291 clocksource: Use a plain u64 instead of cycle_t
There is no point in having an extra type for extra confusion. u64 is
unambiguous.

Conversion was done with the following coccinelle script:

@rem@
@@
-typedef u64 cycle_t;

@fix@
typedef cycle_t;
@@
-cycle_t
+u64

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
2016-12-25 11:04:12 +01:00
Thomas Gleixner 73c1b41e63 cpu/hotplug: Cleanup state names
When the state names got added a script was used to add the extra argument
to the calls. The script basically converted the state constant to a
string, but the cleanup to convert these strings into meaningful ones did
not happen.

Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
are used in all the other places already.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-25 10:47:44 +01:00
Linus Torvalds 93173b5bf2 Small release, the most interesting stuff is x86 nested virt improvements.
x86: userspace can now hide nested VMX features from guests; nested
 VMX can now run Hyper-V in a guest; support for AVX512_4VNNIW and
 AVX512_FMAPS in KVM; infrastructure support for virtual Intel GPUs.
 
 PPC: support for KVM guests on POWER9; improved support for interrupt
 polling; optimizations and cleanups.
 
 s390: two small optimizations, more stuff is in flight and will be
 in 4.11.
 
 ARM: support for the GICv3 ITS on 32bit platforms.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQExBAABCAAbBQJYTkP0FBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
 lZIH/iT1n9OQXcuTpYYnQhuCenzI3GZZOIMTbCvK2i5bo0FIJKxVn0EiAAqZSXvO
 nO185FqjOgLuJ1AD1kJuxzye5suuQp4HIPWWgNHcexLuy43WXWKZe0IQlJ4zM2Xf
 u31HakpFmVDD+Cd1qN3yDXtDrRQ79/xQn2kw7CWb8olp+pVqwbceN3IVie9QYU+3
 gCz0qU6As0aQIwq2PyalOe03sO10PZlm4XhsoXgWPG7P18BMRhNLTDqhLhu7A/ry
 qElVMANT7LSNLzlwNdpzdK8rVuKxETwjlc1UP8vSuhrwad4zM2JJ1Exk26nC2NaG
 D0j4tRSyGFIdx6lukZm7HmiSHZ0=
 =mkoB
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "Small release, the most interesting stuff is x86 nested virt
  improvements.

  x86:
   - userspace can now hide nested VMX features from guests
   - nested VMX can now run Hyper-V in a guest
   - support for AVX512_4VNNIW and AVX512_FMAPS in KVM
   - infrastructure support for virtual Intel GPUs.

  PPC:
   - support for KVM guests on POWER9
   - improved support for interrupt polling
   - optimizations and cleanups.

  s390:
   - two small optimizations, more stuff is in flight and will be in
     4.11.

  ARM:
   - support for the GICv3 ITS on 32bit platforms"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (94 commits)
  arm64: KVM: pmu: Reset PMSELR_EL0.SEL to a sane value before entering the guest
  KVM: arm/arm64: timer: Check for properly initialized timer on init
  KVM: arm/arm64: vgic-v2: Limit ITARGETSR bits to number of VCPUs
  KVM: x86: Handle the kthread worker using the new API
  KVM: nVMX: invvpid handling improvements
  KVM: nVMX: check host CR3 on vmentry and vmexit
  KVM: nVMX: introduce nested_vmx_load_cr3 and call it on vmentry
  KVM: nVMX: propagate errors from prepare_vmcs02
  KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT
  KVM: nVMX: load GUEST_EFER after GUEST_CR0 during emulated VM-entry
  KVM: nVMX: generate MSR_IA32_CR{0,4}_FIXED1 from guest CPUID
  KVM: nVMX: fix checks on CR{0,4} during virtual VMX operation
  KVM: nVMX: support restore of VMX capability MSRs
  KVM: nVMX: generate non-true VMX MSRs based on true versions
  KVM: x86: Do not clear RFLAGS.TF when a singlestep trap occurs.
  KVM: x86: Add kvm_skip_emulated_instruction and use it.
  KVM: VMX: Move skip_emulated_instruction out of nested_vmx_check_vmcs12
  KVM: VMX: Reorder some skip_emulated_instruction calls
  KVM: x86: Add a return value to kvm_emulate_cpuid
  KVM: PPC: Book3S: Move prototypes for KVM functions into kvm_ppc.h
  ...
2016-12-13 15:47:02 -08:00
Christoffer Dall 8e1a0476f8 KVM: arm/arm64: timer: Check for properly initialized timer on init
When the arch timer code fails to initialize (for example because the
memory mapped timer doesn't work, which is currently seen with the AEM
model), then KVM just continues happily with a final result that KVM
eventually does a NULL pointer dereference of the uninitialized cycle
counter.

Check directly for this in the init path and give the user a reasonable
error in this case.

Cc: Shih-Wei Li <shihwei@cs.columbia.edu>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-12-09 15:47:00 +00:00
Andre Przywara 266068eabb KVM: arm/arm64: vgic-v2: Limit ITARGETSR bits to number of VCPUs
The GICv2 spec says in section 4.3.12 that a "CPU targets field bit that
corresponds to an unimplemented CPU interface is RAZ/WI."
Currently we allow the guest to write any value in there and it can
read that back.
Mask the written value with the proper CPU mask to be spec compliant.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-12-09 15:46:59 +00:00
Marc Zyngier 8ca18eec2b KVM: arm/arm64: vgic: Don't notify EOI for non-SPIs
When we inject a level triggerered interrupt (and unless it
is backed by the physical distributor - timer style), we request
a maintenance interrupt. Part of the processing for that interrupt
is to feed to the rest of KVM (and to the eventfd subsystem) the
information that the interrupt has been EOIed.

But that notification only makes sense for SPIs, and not PPIs
(such as the PMU interrupt). Skip over the notification if
the interrupt is not an SPI.

Cc: stable@vger.kernel.org # 4.7+
Fixes: 140b086dd1 ("KVM: arm/arm64: vgic-new: Add GICv2 world switch backend")
Fixes: 59529f69f5 ("KVM: arm/arm64: vgic-new: Add GICv3 world switch backend")
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-11-24 13:12:07 +00:00
Wei Huang b112c84a6f KVM: arm64: Fix the issues when guest PMCCFILTR is configured
KVM calls kvm_pmu_set_counter_event_type() when PMCCFILTR is configured.
But this function can't deals with PMCCFILTR correctly because the evtCount
bits of PMCCFILTR, which is reserved 0, conflits with the SW_INCR event
type of other PMXEVTYPER<n> registers. To fix it, when eventsel == 0, this
function shouldn't return immediately; instead it needs to check further
if select_idx is ARMV8_PMU_CYCLE_IDX.

Another issue is that KVM shouldn't copy the eventsel bits of PMCCFILTER
blindly to attr.config. Instead it ought to convert the request to the
"cpu cycle" event type (i.e. 0x11).

To support this patch and to prevent duplicated definitions, a limited
set of ARMv8 perf event types were relocated from perf_event.c to
asm/perf_event.h.

Cc: stable@vger.kernel.org # 4.6+
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Wei Huang <wei@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-11-18 09:06:58 +00:00
Longpeng(Mike) fd5ebf99f8 arm/arm64: KVM: Clean up useless code in kvm_timer_enable
1) Since commit:41a54482 changed timer enabled variable to per-vcpu,
   the correlative comment in kvm_timer_enable is useless now.

2) After the kvm module init successfully, the timecounter is always
   non-null, so we can remove the checking of timercounter.

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-11-15 11:54:16 +00:00
Vladimir Murzin 2988509dd8 ARM: KVM: Support vGICv3 ITS
This patch allows to build and use vGICv3 ITS in 32-bit mode.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-11-14 10:32:54 +00:00
Vladimir Murzin e29bd6f267 KVM: arm64: vgic-its: Fix compatibility with 32-bit
Evaluate GITS_BASER_ENTRY_SIZE once as an int data (GITS_BASER<n>'s
Entry Size is 5-bit wide only), so when used as divider no reference
to __aeabi_uldivmod is generated when build for AArch32.

Use unsigned long long for GITS_BASER_PAGE_SIZE_* since they are
used in conjunction with 64-bit data.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-11-14 10:32:24 +00:00
Shih-Wei Li d42c79701a KVM: arm/arm64: vgic: Kick VCPUs when queueing already pending IRQs
In cases like IPI, we could be queueing an interrupt for a VCPU
that is already running and is not about to exit, because the
VCPU has entered the VM with the interrupt pending and would
not trap on EOI'ing that interrupt. This could result to delays
in interrupt deliveries or even loss of interrupts.
To guarantee prompt interrupt injection, here we have to try to
kick the VCPU.

Signed-off-by: Shih-Wei Li <shihwei@cs.columbia.edu>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-11-04 17:56:56 +00:00
Andre Przywara 112b0b8f8f KVM: arm/arm64: vgic: Prevent access to invalid SPIs
In our VGIC implementation we limit the number of SPIs to a number
that the userland application told us. Accordingly we limit the
allocation of memory for virtual IRQs to that number.
However in our MMIO dispatcher we didn't check if we ever access an
IRQ beyond that limit, leading to out-of-bound accesses.
Add a test against the number of allocated SPIs in check_region().
Adjust the VGIC_ADDR_TO_INT macro to avoid an actual division, which
is not implemented on ARM(32).

[maz: cleaned-up original patch]

Cc: stable@vger.kernel.org
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-11-04 17:56:54 +00:00
Christoffer Dall 0099b7701f KVM: arm/arm64: vgic: Don't flush/sync without a working vgic
If the vgic hasn't been created and initialized, we shouldn't attempt to
look at its data structures or flush/sync anything to the GIC hardware.

This fixes an issue reported by Alexander Graf when using a userspace
irqchip.

Fixes: 0919e84c0f ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
Cc: stable@vger.kernel.org
Reported-by: Alexander Graf <agraf@suse.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-27 18:57:35 +02:00
Christoffer Dall 6fe407f2d1 KVM: arm64: Require in-kernel irqchip for PMU support
If userspace creates a PMU for the VCPU, but doesn't create an in-kernel
irqchip, then we end up in a nasty path where we try to take an
uninitialized spinlock, which can lead to all sorts of breakages.

Luckily, QEMU always creates the VGIC before the PMU, so we can
establish this as ABI and check for the VGIC in the PMU init stage.
This can be relaxed at a later time if we want to support PMU with a
userspace irqchip.

Cc: stable@vger.kernel.org
Cc: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-27 18:57:07 +02:00
Vladimir Murzin acda5430be ARM: KVM: Support vgic-v3
This patch allows to build and use vgic-v3 in 32-bit mode.

Unfortunately, it can not be split in several steps without extra
stubs to keep patches independent and bisectable.  For instance,
virt/kvm/arm/vgic/vgic-v3.c uses function from vgic-v3-sr.c, handling
access to GICv3 cpu interface from the guest requires vgic_v3.vgic_sre
to be already defined.

It is how support has been done:

* handle SGI requests from the guest

* report configured SRE on access to GICv3 cpu interface from the guest

* required vgic-v3 macros are provided via uapi.h

* static keys are used to select GIC backend

* to make vgic-v3 build KVM_ARM_VGIC_V3 guard is removed along with
  the static inlines

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-22 13:22:21 +02:00
Vladimir Murzin d7d0a11e44 KVM: arm: vgic: Support 64-bit data manipulation on 32-bit host systems
We have couple of 64-bit registers defined in GICv3 architecture, so
unsigned long accesses to these registers will only access a single
32-bit part of that regitser. On the other hand these registers can't
be accessed as 64-bit with a single instruction like ldrd/strd or
ldmia/stmia if we run a 32-bit host because KVM does not support
access to MMIO space done by these instructions.

It means that a 32-bit guest accesses these registers in 32-bit
chunks, so the only thing we need to do is to ensure that
extract_bytes() always takes 64-bit data.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-22 13:21:59 +02:00
Vladimir Murzin e533a37f7b KVM: arm: vgic: Fix compiler warnings when built for 32-bit
Well, this patch is looking ahead of time, but we'll get following
compiler warnings as soon as we introduce vgic-v3 to 32-bit world

  CC      arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.o
arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.c: In function 'vgic_mmio_read_v3r_typer':
arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.c:184:35: warning: left shift count >= width of type [-Wshift-count-overflow]
  value = (mpidr & GENMASK(23, 0)) << 32;
                                   ^
In file included from ./include/linux/kernel.h:10:0,
                 from ./include/asm-generic/bug.h:13,
                 from ./arch/arm/include/asm/bug.h:59,
                 from ./include/linux/bug.h:4,
                 from ./include/linux/io.h:23,
                 from ./arch/arm/include/asm/arch_gicv3.h:23,
                 from ./include/linux/irqchip/arm-gic-v3.h:411,
                 from arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.c:14:
arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.c: In function 'vgic_v3_dispatch_sgi':
./include/linux/bitops.h:6:24: warning: left shift count >= width of type [-Wshift-count-overflow]
 #define BIT(nr)   (1UL << (nr))
                        ^
arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.c:614:20: note: in expansion of macro 'BIT'
  broadcast = reg & BIT(ICC_SGI1R_IRQ_ROUTING_MODE_BIT);
                    ^
Let's fix them now.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-22 13:21:48 +02:00
Vladimir Murzin 7a1ff70828 KVM: arm64: vgic-its: Introduce config option to guard ITS specific code
By now ITS code guarded with KVM_ARM_VGIC_V3 config option which was
introduced to hide everything specific to vgic-v3 from 32-bit world.
We are going to support vgic-v3 in 32-bit world and KVM_ARM_VGIC_V3
will gone, but we don't have support for ITS there yet and we need to
continue keeping ITS away.
Introduce the new config option to prevent ITS code being build in
32-bit mode when support for vgic-v3 is done.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-22 13:21:47 +02:00
Vladimir Murzin 19f0ece439 arm64: KVM: Move vgic-v3 save/restore to virt/kvm/arm/hyp
So we can reuse the code under arch/arm

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-22 13:21:46 +02:00
Vladimir Murzin 5a7a8426b2 arm64: KVM: Use static keys for selecting the GIC backend
Currently GIC backend is selected via alternative framework and this
is fine. We are going to introduce vgic-v3 to 32-bit world and there
we don't have patching framework in hand, so we can either check
support for GICv3 every time we need to choose which backend to use or
try to optimise it by using static keys. The later looks quite
promising because we can share logic involved in selecting GIC backend
between architectures if both uses static keys.

This patch moves arm64 from alternative to static keys framework for
selecting GIC backend. For that we embed static key into vgic_global
and enable the key during vgic initialisation based on what has
already been exposed by the host GIC driver.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-22 13:21:35 +02:00
Paolo Bonzini 5d947a1447 KVM: ARM: cleanup kvm_timer_hyp_init
Remove two unnecessary labels now that kvm_timer_hyp_init is not
creating its own workqueue anymore.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-08 12:54:00 +02:00
Marc Zyngier 3272f0d08e arm64: KVM: Inject a vSerror if detecting a bad GICV access at EL2
If, when proxying a GICV access at EL2, we detect that the guest is
doing something silly, report an EL1 SError instead ofgnoring the
access.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-08 12:53:00 +02:00
Marc Zyngier a07d3b07a8 arm64: KVM: vgic-v2: Enable GICV access from HYP if access from guest is unsafe
So far, we've been disabling KVM on systems where the GICV region couldn't
be safely given to a guest. Now that we're able to handle this access
safely by emulating it in HYP, we can enable this feature when we detect
an unsafe configuration.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-08 12:53:00 +02:00
Marc Zyngier bf8feb3964 arm64: KVM: vgic-v2: Add GICV access from HYP
Now that we have the necessary infrastructure to handle MMIO accesses
in HYP, perform the GICV access on behalf of the guest. This requires
checking that the access is strictly 32bit, properly aligned, and
falls within the expected range.

When all condition are satisfied, we perform the access and tell
the rest of the HYP code that the instruction has been correctly
emulated.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-08 12:53:00 +02:00
Marc Zyngier fb5ee369cc arm64: KVM: vgic-v2: Add the GICV emulation infrastructure
In order to efficiently perform the GICV access on behalf of the
guest, we need to be able to avoid going back all the way to
the host kernel.

For this, we introduce a new hook in the world switch code,
conveniently placed just after populating the fault info.
At that point, we only have saved/restored the GP registers,
and we can quickly perform all the required checks (data abort,
translation fault, valid faulting syndrome, not an external
abort, not a PTW).

Coming back from the emulation code, we need to skip the emulated
instruction. This involves an additional bit of save/restore in
order to be able to access the guest's PC (and possibly CPSR if
this is a 32bit guest).

At this stage, no emulation code is provided.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-08 12:53:00 +02:00
Marc Zyngier 8cebe750c4 arm64: KVM: Make kvm_skip_instr32 available to HYP
As we plan to do some emulation at HYP, let's make kvm_skip_instr32
as part of the hyp_text section. This doesn't preclude the kernel
from using it.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-08 12:53:00 +02:00
Marc Zyngier 3aedd5c49e arm: KVM: Use common AArch32 conditional execution code
Add the bit of glue and const-ification that is required to use
the code inherited from the arm64 port, and move over to it.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-08 12:53:00 +02:00
Marc Zyngier 427d7cacf9 arm64: KVM: Move the AArch32 conditional execution to common code
It would make some sense to share the conditional execution code
between 32 and 64bit. In order to achieve this, let's move that
code to virt/kvm/arm/aarch32.c. While we're at it, drop a
superfluous BUG_ON() that wasn't that useful.

Following patches will migrate the 32bit port to that code base.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-08 12:53:00 +02:00
Marc Zyngier 55d7cad6a9 KVM: arm: vgic: Drop build compatibility hack for older kernel versions
As kvm_set_routing_entry() was changing prototype between 4.7 and 4.8,
an ugly hack was put in place in order to survive both building in
-next and the merge window.

Now that everything has been merged, let's dump the compatibility
hack for good.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-08 12:53:00 +02:00
Christoffer Dall 1fe0009833 KVM: arm/arm64: Rename vgic_attr_regs_access to vgic_attr_regs_access_v2
Just a rename so we can implement a v3-specific function later.

We take the chance to get rid of the V2/V3 ops comments as well.

No functional change.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
2016-09-08 12:53:00 +02:00
Christoffer Dall ba7b9169b5 KVM: arm/arm64: Factor out vgic_attr_regs_access functionality
As we are about to deal with multiple data types and situations where
the vgic should not be initialized when doing userspace accesses on the
register attributes, factor out the functionality of
vgic_attr_regs_access into smaller bits which can be reused by a new
function later.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
2016-09-08 12:53:00 +02:00
Bhaktipriya Shridhar 3706feacd0 KVM: Remove deprecated create_singlethread_workqueue
The workqueue "irqfd_cleanup_wq" queues a single work item
&irqfd->shutdown and hence doesn't require ordering. It is a host-wide
workqueue for issuing deferred shutdown requests aggregated from all
vm* instances. It is not being used on a memory reclaim path.
Hence, it has been converted to use system_wq.
The work item has been flushed in kvm_irqfd_release().

The workqueue "wqueue" queues a single work item &timer->expired
and hence doesn't require ordering. Also, it is not being used on
a memory reclaim path. Hence, it has been converted to use system_wq.

System workqueues have been able to handle high level of concurrency
for a long time now and hence it's not required to have a singlethreaded
workqueue just to gain concurrency. Unlike a dedicated per-cpu workqueue
created with create_singlethread_workqueue(), system_wq allows multiple
work items to overlap executions even on the same CPU; however, a
per-cpu workqueue doesn't have any CPU locality or global ordering
guarantee unless the target CPU is explicitly specified and thus the
increase of local concurrency shouldn't make any difference.

Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-07 19:34:28 +02:00
Paolo Bonzini 2eeb321fd2 KVM/ARM Fixes for v4.8-rc3
This tag contains the following fixes on top of v4.8-rc1:
  - ITS init issues
  - ITS error handling issues
  - ITS IRQ leakage fix
  - Plug a couple of ITS race conditions
  - An erratum workaround for timers
  - Some removal of misleading use of errors and comments
  - A fix for GICv3 on 32-bit guests
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXtLicAAoJEEtpOizt6ddyEC4H/16IngntN6Gz1WPwtBBelgyj
 ZfU970uzGOyDtDOeOX1NT+gJpkDvUMhsNlngWnMrMwqqqPVKdE4XBShPiW2v53E7
 JquDTd2kKl+OO9e9XnkLw9yUcARmJFKIjHdISlg+E78t2kcNHn+XB2jrfTLKQVl8
 tk1ztDALb4LXSGYPZQ/uHTYp9U0qei+2SbbQufRcdQ3ggyxLDwPP2aO25amctzEP
 0Y42tlnNoZj7yBBp0X9BWRrHF2AZuOp+qBJnpFiQdsgLL6G1P3DcU/t9+KDjVBVr
 LYKN8jId2r5eyGGg8aKb4I3trevayToWhDw/jzarrTNAovB1cp8G5J7ozfmeS3g=
 =4PCW
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-v4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM Fixes for v4.8-rc3

This tag contains the following fixes on top of v4.8-rc1:
 - ITS init issues
 - ITS error handling issues
 - ITS IRQ leakage fix
 - Plug a couple of ITS race conditions
 - An erratum workaround for timers
 - Some removal of misleading use of errors and comments
 - A fix for GICv3 on 32-bit guests
2016-08-18 12:19:19 +02:00
Marc Zyngier cabdc5c59a KVM: arm/arm64: timer: Workaround misconfigured timer interrupt
Similarily to f005bd7e3b ("clocksource/arm_arch_timer: Force
per-CPU interrupt to be level-triggered"), make sure we can
survive an interrupt that has been misconfigured as edge-triggered
by forcing it to be level-triggered (active low is assumed, but
the GIC doesn't really care whether this is high or low).

Hopefully, the amount of shouting in the kernel log will convince
the user to do something about their firmware.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-08-17 12:23:47 +02:00
Andre Przywara 286054a7a8 KVM: arm64: ITS: avoid re-mapping LPIs
When a guest wants to map a device-ID/event-ID combination that is
already mapped, we may end up in a situation where an LPI is never
"put", thus never being freed.
Since the GICv3 spec says that mapping an already mapped LPI is
UNPREDICTABLE, lets just bail out early in this situation to avoid
any potential leaks.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-08-16 19:27:22 +02:00
Andre Przywara 505a19eec4 KVM: arm64: check for ITS device on MSI injection
When userspace provides the doorbell address for an MSI to be
injected into the guest, we find a KVM device which feels responsible.
Lets check that this device is really an emulated ITS before we make
real use of the container_of-ed pointer.

  [ Moved NULL-pointer check to caller of static function
    - Christoffer ]

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-08-15 23:00:22 +02:00
Andre Przywara c7735769d5 KVM: arm64: ITS: move ITS registration into first VCPU run
Currently we register an ITS device upon userland issuing the CTLR_INIT
ioctl to mark initialization of the ITS as done.
This deviates from the initialization sequence of the existing GIC
devices and does not play well with the way QEMU handles things.
To be more in line with what we are used to, register the ITS(es) just
before the first VCPU is about to run, so in the map_resources() call.
This involves iterating through the list of KVM devices and map each
ITS that we find.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-08-15 23:00:21 +02:00
Christoffer Dall d9ae449b3d KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic
There are two problems with the current implementation of the MMIO
handlers for the propbaser and pendbaser:

First, the write to the value itself is not guaranteed to be an atomic
64-bit write so two concurrent writes to the structure field could be
intermixed.

Second, because we do a read-modify-update operation without any
synchronization, if we have two 32-bit accesses to separate parts of the
register, we can loose one of them.

By using the atomic cmpxchg64 we should cover both issues above.

Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-08-15 23:00:20 +02:00
Christoffer Dall a28ebea2ad KVM: Protect device ops->create and list_add with kvm->lock
KVM devices were manipulating list data structures without any form of
synchronization, and some implementations of the create operations also
suffered from a lack of synchronization.

Now when we've split the xics create operation into create and init, we
can hold the kvm->lock mutex while calling the create operation and when
manipulating the devices list.

The error path in the generic code gets slightly ugly because we have to
take the mutex again and delete the device from the list, but holding
the mutex during anon_inode_getfd or releasing/locking the mutex in the
common non-error path seemed wrong.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-08-12 12:01:27 +02:00
Christoffer Dall 2cccbb368a KVM: arm64: vgic-its: Plug race in vgic_put_irq
Right now the following sequence of events can happen:

  1. Thread X calls vgic_put_irq
  2. Thread Y calls vgic_add_lpi
  3. Thread Y gets lpi_list_lock
  4. Thread X drops the ref count to 0 and blocks on lpi_list_lock
  5. Thread Y finds the irq via the lpi_list_lock, raises the ref
     count to 1, and release the lpi_list_lock.
  6. Thread X proceeds and frees the irq.

Avoid this by holding the spinlock around the kref_put.

Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-08-10 11:41:54 +02:00
Christoffer Dall 99e5e886a0 KVM: arm64: vgic-its: Handle errors from vgic_add_lpi
During low memory conditions, we could be dereferencing a NULL pointer
when vgic_add_lpi fails to allocate memory.

Consider for example this call sequence:

  vgic_its_cmd_handle_mapi
      itte->irq = vgic_add_lpi(kvm, lpi_nr);
          update_lpi_config(kvm, itte->irq, NULL);
              ret = kvm_read_guest(kvm, propbase + irq->intid
	                                             ^^^^
						     kaboom?

Instead, return an error pointer from vgic_add_lpi and check the return
value from its single caller.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-08-10 11:41:35 +02:00
Andre Przywara fd837b08d9 KVM: arm64: ITS: return 1 on successful MSI injection
According to the KVM API documentation a successful MSI injection
should return a value > 0 on success.
Return possible errors in vgic_its_trigger_msi() and report a
successful injection back to userland, while also reporting the
case where the MSI could not be delivered due to the guest not
having the LPI mapped, for instance.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-08-09 16:43:23 +02:00
Paolo Bonzini 6f49b2f341 KVM/ARM Changes for v4.8 - Take 2
Includes GSI routing support to go along with the new VGIC and a small fix that
 has been cooking in -next for a while.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXoydqAAoJEEtpOizt6ddyM3oH/1A4VeG/J9q4fBPXqY2tVWXs
 c3P7UgNcrEgUNs/F9ykQY/lb31deecUzaBt1OyTf+RlsNbihq3dQdYcBhxtUODw/
 Faok582ya3UFgLW+IRHcID0EbkVOpIzMhOStYsnU/Dz7HG1JL9HdPzwkid7iu9LT
 fI6yrrBnJFjdWAAQ4BkcEKBENRsY8NTs7jX5vnFA92MkUBby7BmariPDD3FtrB+f
 Ob9B7CxM30pNqsN7OA/QvFOHMJHxf3s1TBKwmPHe5TLIfSzV1YxcEGiMc0lWqF4v
 BT8ZeMGCtjDw94tND1DskfQQRPaMqPmGuRTrAW/IuE2n92bFtbqIqs7Cbw0fzLE=
 =Vm6Q
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-4.8-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM Changes for v4.8 - Take 2

Includes GSI routing support to go along with the new VGIC and a small fix that
has been cooking in -next for a while.
2016-08-04 13:59:56 +02:00
Linus Torvalds 221bb8a46e - ARM: GICv3 ITS emulation and various fixes. Removal of the old
VGIC implementation.
 
 - s390: support for trapping software breakpoints, nested virtualization
 (vSIE), the STHYI opcode, initial extensions for CPU model support.
 
 - MIPS: support for MIPS64 hosts (32-bit guests only) and lots of cleanups,
 preliminary to this and the upcoming support for hardware virtualization
 extensions.
 
 - x86: support for execute-only mappings in nested EPT; reduced vmexit
 latency for TSC deadline timer (by about 30%) on Intel hosts; support for
 more than 255 vCPUs.
 
 - PPC: bugfixes.
 
 The ugly bit is the conflicts.  A couple of them are simple conflicts due
 to 4.7 fixes, but most of them are with other trees. There was definitely
 too much reliance on Acked-by here.  Some conflicts are for KVM patches
 where _I_ gave my Acked-by, but the worst are for this pull request's
 patches that touch files outside arch/*/kvm.  KVM submaintainers should
 probably learn to synchronize better with arch maintainers, with the
 latter providing topic branches whenever possible instead of Acked-by.
 This is what we do with arch/x86.  And I should learn to refuse pull
 requests when linux-next sends scary signals, even if that means that
 submaintainers have to rebase their branches.
 
 Anyhow, here's the list:
 
 - arch/x86/kvm/vmx.c: handle_pcommit and EXIT_REASON_PCOMMIT was removed
 by the nvdimm tree.  This tree adds handle_preemption_timer and
 EXIT_REASON_PREEMPTION_TIMER at the same place.  In general all mentions
 of pcommit have to go.
 
 There is also a conflict between a stable fix and this patch, where the
 stable fix removed the vmx_create_pml_buffer function and its call.
 
 - virt/kvm/kvm_main.c: kvm_cpu_notifier was removed by the hotplug tree.
 This tree adds kvm_io_bus_get_dev at the same place.
 
 - virt/kvm/arm/vgic.c: a few final bugfixes went into 4.7 before the
 file was completely removed for 4.8.
 
 - include/linux/irqchip/arm-gic-v3.h: this one is entirely our fault;
 this is a change that should have gone in through the irqchip tree and
 pulled by kvm-arm.  I think I would have rejected this kvm-arm pull
 request.  The KVM version is the right one, except that it lacks
 GITS_BASER_PAGES_SHIFT.
 
 - arch/powerpc: what a mess.  For the idle_book3s.S conflict, the KVM
 tree is the right one; everything else is trivial.  In this case I am
 not quite sure what went wrong.  The commit that is causing the mess
 (fd7bacbca4, "KVM: PPC: Book3S HV: Fix TB corruption in guest exit
 path on HMI interrupt", 2016-05-15) touches both arch/powerpc/kernel/
 and arch/powerpc/kvm/.  It's large, but at 396 insertions/5 deletions
 I guessed that it wasn't really possible to split it and that the 5
 deletions wouldn't conflict.  That wasn't the case.
 
 - arch/s390: also messy.  First is hypfs_diag.c where the KVM tree
 moved some code and the s390 tree patched it.  You have to reapply the
 relevant part of commits 6c22c98637, plus all of e030c1125e, to
 arch/s390/kernel/diag.c.  Or pick the linux-next conflict
 resolution from http://marc.info/?l=kvm&m=146717549531603&w=2.
 Second, there is a conflict in gmap.c between a stable fix and 4.8.
 The KVM version here is the correct one.
 
 I have pushed my resolution at refs/heads/merge-20160802 (commit
 3d1f53419842) at git://git.kernel.org/pub/scm/virt/kvm/kvm.git.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJXoGm7AAoJEL/70l94x66DugQIAIj703ePAFepB/fCrKHkZZia
 SGrsBdvAtNsOhr7FQ5qvvjLxiv/cv7CymeuJivX8H+4kuUHUllDzey+RPHYHD9X7
 U6n1PdCH9F15a3IXc8tDjlDdOMNIKJixYuq1UyNZMU6NFwl00+TZf9JF8A2US65b
 x/41W98ilL6nNBAsoDVmCLtPNWAqQ3lajaZELGfcqRQ9ZGKcAYOaLFXHv2YHf2XC
 qIDMf+slBGSQ66UoATnYV2gAopNlWbZ7n0vO6tE2KyvhHZ1m399aBX1+k8la/0JI
 69r+Tz7ZHUSFtmlmyByi5IAB87myy2WQHyAPwj+4vwJkDGPcl0TrupzbG7+T05Y=
 =42ti
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:

 - ARM: GICv3 ITS emulation and various fixes.  Removal of the
   old VGIC implementation.

 - s390: support for trapping software breakpoints, nested
   virtualization (vSIE), the STHYI opcode, initial extensions
   for CPU model support.

 - MIPS: support for MIPS64 hosts (32-bit guests only) and lots
   of cleanups, preliminary to this and the upcoming support for
   hardware virtualization extensions.

 - x86: support for execute-only mappings in nested EPT; reduced
   vmexit latency for TSC deadline timer (by about 30%) on Intel
   hosts; support for more than 255 vCPUs.

 - PPC: bugfixes.

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
  KVM: PPC: Introduce KVM_CAP_PPC_HTM
  MIPS: Select HAVE_KVM for MIPS64_R{2,6}
  MIPS: KVM: Reset CP0_PageMask during host TLB flush
  MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
  MIPS: KVM: Sign extend MFC0/RDHWR results
  MIPS: KVM: Fix 64-bit big endian dynamic translation
  MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
  MIPS: KVM: Use 64-bit CP0_EBase when appropriate
  MIPS: KVM: Set CP0_Status.KX on MIPS64
  MIPS: KVM: Make entry code MIPS64 friendly
  MIPS: KVM: Use kmap instead of CKSEG0ADDR()
  MIPS: KVM: Use virt_to_phys() to get commpage PFN
  MIPS: Fix definition of KSEGX() for 64-bit
  KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
  kvm: x86: nVMX: maintain internal copy of current VMCS
  KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
  KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
  KVM: arm64: vgic-its: Simplify MAPI error handling
  KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
  KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
  ...
2016-08-02 16:11:27 -04:00
Marc Zyngier 3f312db6b6 KVM: arm: vgic-irqfd: Workaround changing kvm_set_routing_entry prototype
kvm_set_routing_entry is changing in -next, and causes things to
explode. Add a temporary workaround that should be dropped when
we hit 4.8-rc1

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-24 10:37:38 +01:00
Eric Auger 995a0ee980 KVM: arm/arm64: Enable MSI routing
Up to now, only irqchip routing entries could be set. This patch
adds the capability to insert MSI routing entries.

For ARM64, let's also increase KVM_MAX_IRQ_ROUTES to 4096: this
include SPI irqchip routes plus MSI routes. In the future this
might be extended.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-22 18:52:03 +01:00
Eric Auger 180ae7b118 KVM: arm/arm64: Enable irqchip routing
This patch adds compilation and link against irqchip.

Main motivation behind using irqchip code is to enable MSI
routing code. In the future irqchip routing may also be useful
when targeting multiple irqchips.

Routing standard callbacks now are implemented in vgic-irqfd:
- kvm_set_routing_entry
- kvm_set_irq
- kvm_set_msi

They only are supported with new_vgic code.

Both HAVE_KVM_IRQCHIP and HAVE_KVM_IRQ_ROUTING are defined.
KVM_CAP_IRQ_ROUTING is advertised and KVM_SET_GSI_ROUTING is allowed.

So from now on IRQCHIP routing is enabled and a routing table entry
must exist for irqfd injection to succeed for a given SPI. This patch
builds a default flat irqchip routing table (gsi=irqchip.pin) covering
all the VGIC SPI indexes. This routing table is overwritten by the
first first user-space call to KVM_SET_GSI_ROUTING ioctl.

MSI routing setup is not yet allowed.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-22 18:52:01 +01:00
Marc Zyngier 3a88bded20 KVM: arm64: vgic-its: Simplify MAPI error handling
If we care to move all the checks that do not involve any memory
allocation, we can simplify the MAPI error handling. Let's do that,
it cannot hurt.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:15:20 +01:00
Marc Zyngier a3e7aa271e KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
vgic_its_cmd_handle_mapi has an extra "subcmd" argument, which is
already contained in the command buffer that all command handlers
obtain from the command queue. Let's drop it, as it is not that
useful.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:15:19 +01:00
Marc Zyngier 6d03a68f80 KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
There is no need to have separate functions to validate devices
and collections, as the architecture doesn't really distinguish the
two, and they are supposed to be managed the same way.

Let's turn the DevID checker into a generic one.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:15:19 +01:00
Marc Zyngier bb7176449f KVM: arm64: vgic-its: Add pointer to corresponding kvm_device
Going from the ITS structure to the corresponding KVM structure
would be quite handy at times. The kvm_device pointer that is
passed at create time is quite convenient for this, so let's
keep a copy of it in the vgic_its structure.

This will be put to a good use in subsequent patches.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:15:18 +01:00
Marc Zyngier 17a21f58ff KVM: arm64: vgic-its: Add collection allocator/destructor
Instead of spreading random allocations all over the place,
consolidate allocation/init/freeing of collections in a pair
of constructor/destructor.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:15:18 +01:00
Marc Zyngier d6c7f865f0 KVM: arm64: vgic-its: Fix L2 entry validation for indirect tables
When checking that the storage address of a device entry is valid,
it is critical to compute the actual address of the entry, rather
than relying on the beginning of the page to match a CPU page of
the same size: for example, if the guest places the table at the
last 64kB boundary of RAM, but RAM size isn't a multiple of 64kB...

Fix this by computing the actual offset of the device ID in the
L2 page, and check the corresponding GFN.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:15:18 +01:00
Marc Zyngier 333a53ff7f KVM: arm64: vgic-its: Validate the device table L1 entry
Checking that the device_id fits if the table, and we must make
sure that the associated memory is also accessible.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:15:17 +01:00
Marc Zyngier b90338b7cb KVM: arm64: vgic-its: Fix misleading nr_entries in vgic_its_check_device_id
The nr_entries variable in vgic_its_check_device_id actually
describe the size of the L1 table, and not the number of
entries in this table.

Rename it to l1_tbl_size, so that we can now change the code
with a better understanding of what is what.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:15:17 +01:00
Marc Zyngier 7e3963a515 KVM: arm64: vgic-its: Fix vgic_its_check_device_id BE handling
The ITS tables are stored in LE format. If the host is reading
a L1 table entry to check its validity, it must convert it to
the CPU endianness.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:15:16 +01:00
Marc Zyngier c0091073dd KVM: arm64: vgic-its: Fix handling of indirect tables
The current code will fail on valid indirect tables, and happily
use the ones that are pointing out of the guest RAM. Funny what a
small "!" can do for you...

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:15:16 +01:00
Marc Zyngier d97594e6bc KVM: arm64: vgic-its: Generalize use of vgic_get_irq_kref
Instead of sprinkling raw kref_get() calls everytime we cannot
do a normal vgic_get_irq(), use the existing vgic_get_irq_kref(),
which does the same thing and is paired with a vgic_put_irq().

vgic_get_irq_kref is moved to vgic.h in order to be easily shared.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:15:16 +01:00
Eric Auger 9d5fcb9dd7 KVM: arm/arm64: Fix vGICv2 KVM_DEV_ARM_VGIC_GRP_CPU/DIST_REGS
For VGICv2 save and restore the CPU interface registers
are accessed. Restore the modality which has been altered.
Also explicitly set the iodev_type for both the DIST and CPU
interface.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:15:15 +01:00
Andre Przywara 0e4e82f154 KVM: arm64: vgic-its: Enable ITS emulation as a virtual MSI controller
Now that all ITS emulation functionality is in place, we advertise
MSI functionality to userland and also the ITS device to the guest - if
userland has configured that.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:14:38 +01:00
Andre Przywara 2891a7dfb6 KVM: arm64: vgic-its: Implement MSI injection in ITS emulation
When userland wants to inject an MSI into the guest, it uses the
KVM_SIGNAL_MSI ioctl, which carries the doorbell address along with
the payload and the device ID.
With the help of the KVM IO bus framework we learn the corresponding
ITS from the doorbell address. We then use our wrapper functions to
iterate the linked lists and find the proper Interrupt Translation Table
Entry (ITTE) and thus the corresponding struct vgic_irq to finally set
the pending bit.
We also provide the handler for the ITS "INT" command, which allows a
guest to trigger an MSI via the ITS command queue. Since this one knows
about the right ITS already, we directly call the MMIO handler function
without using the kvm_io_bus framework.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:14:38 +01:00
Andre Przywara df9f58fbea KVM: arm64: vgic-its: Implement ITS command queue command handlers
The connection between a device, an event ID, the LPI number and the
associated CPU is stored in in-memory tables in a GICv3, but their
format is not specified by the spec. Instead software uses a command
queue in a ring buffer to let an ITS implementation use its own
format.
Implement handlers for the various ITS commands and let them store
the requested relation into our own data structures. Those data
structures are protected by the its_lock mutex.
Our internal ring buffer read and write pointers are protected by the
its_cmd mutex, so that only one VCPU per ITS can handle commands at
any given time.
Error handling is very basic at the moment, as we don't have a good
way of communicating errors to the guest (usually an SError).
The INT command handler is missing from this patch, as we gain the
capability of actually injecting MSIs into the guest only later on.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:14:38 +01:00
Andre Przywara f9f77af9e2 KVM: arm64: vgic-its: Allow updates of LPI configuration table
The (system-wide) LPI configuration table is held in a table in
(guest) memory. To achieve reasonable performance, we cache this data
in our struct vgic_irq. If the guest updates the configuration data
(which consists of the enable bit and the priority value), it issues
an INV or INVALL command to allow us to update our information.
Provide functions that update that information for one LPI or all LPIs
mapped to a specific collection.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:14:37 +01:00
Andre Przywara 33d3bc9556 KVM: arm64: vgic-its: Read initial LPI pending table
The LPI pending status for a GICv3 redistributor is held in a table
in (guest) memory. To achieve reasonable performance, we cache the
pending bit in our struct vgic_irq. The initial pending state must be
read from guest memory upon enabling LPIs for this redistributor.
As we can't access the guest memory while we hold the lpi_list spinlock,
we create a snapshot of the LPI list and iterate over that.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:14:37 +01:00
Andre Przywara 3802411d01 KVM: arm64: vgic-its: Connect LPIs to the VGIC emulation
LPIs are dynamically created (mapped) at guest runtime and their
actual number can be quite high, but is mostly assigned using a very
sparse allocation scheme. So arrays are not an ideal data structure
to hold the information.
We use a spin-lock protected linked list to hold all mapped LPIs,
represented by their struct vgic_irq. This lock is grouped between the
ap_list_lock and the vgic_irq lock in our locking order.
Also we store a pointer to that struct vgic_irq in our struct its_itte,
so we can easily access it.
Eventually we call our new vgic_get_lpi() from vgic_get_irq(), so
the VGIC code gets transparently access to LPIs.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:14:36 +01:00
Andre Przywara 424c33830f KVM: arm64: vgic-its: Implement basic ITS register handlers
Add emulation for some basic MMIO registers used in the ITS emulation.
This includes:
- GITS_{CTLR,TYPER,IIDR}
- ID registers
- GITS_{CBASER,CREADR,CWRITER}
  (which implement the ITS command buffer handling)
- GITS_BASER<n>

Most of the handlers are pretty straight forward, only the CWRITER
handler is a bit more involved by taking the new its_cmd mutex and
then iterating over the command buffer.
The registers holding base addresses and attributes are sanitised before
storing them.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:14:36 +01:00
Andre Przywara 1085fdc68c KVM: arm64: vgic-its: Introduce new KVM ITS device
Introduce a new KVM device that represents an ARM Interrupt Translation
Service (ITS) controller. Since there can be multiple of this per guest,
we can't piggy back on the existing GICv3 distributor device, but create
a new type of KVM device.
On the KVM_CREATE_DEVICE ioctl we allocate and initialize the ITS data
structure and store the pointer in the kvm_device data.
Upon an explicit init ioctl from userland (after having setup the MMIO
address) we register the handlers with the kvm_io_bus framework.
Any reference to an ITS thus has to go via this interface.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:14:35 +01:00
Andre Przywara 59c5ab4098 KVM: arm64: vgic-its: Introduce ITS emulation file with MMIO framework
The ARM GICv3 ITS emulation code goes into a separate file, but needs
to be connected to the GICv3 emulation, of which it is an option.
The ITS MMIO handlers require the respective ITS pointer to be passed in,
so we amend the existing VGIC MMIO framework to let it cope with that.
Also we introduce the basic ITS data structure and initialize it, but
don't return any success yet, as we are not yet ready for the show.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:14:35 +01:00
Andre Przywara 0aa1de5731 KVM: arm64: vgic: Handle ITS related GICv3 redistributor registers
In the GICv3 redistributor there are the PENDBASER and PROPBASER
registers which we did not emulate so far, as they only make sense
when having an ITS. In preparation for that emulate those MMIO
accesses by storing the 64-bit data written into it into a variable
which we later read in the ITS emulation.
We also sanitise the registers, making sure RES0 regions are respected
and checking for valid memory attributes.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:14:35 +01:00
Andre Przywara 5dd4b924e3 KVM: arm/arm64: vgic: Add refcounting for IRQs
In the moment our struct vgic_irq's are statically allocated at guest
creation time. So getting a pointer to an IRQ structure is trivial and
safe. LPIs are more dynamic, they can be mapped and unmapped at any time
during the guest's _runtime_.
In preparation for supporting LPIs we introduce reference counting for
those structures using the kernel's kref infrastructure.
Since private IRQs and SPIs are statically allocated, we avoid actually
refcounting them, since they would never be released anyway.
But we take provisions to increase the refcount when an IRQ gets onto a
VCPU list and decrease it when it gets removed. Also this introduces
vgic_put_irq(), which wraps kref_put and hides the release function from
the callers.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:10:48 +01:00
Andre Przywara 42c8870f90 KVM: arm/arm64: vgic: Check return value for kvm_register_vgic_device
kvm_register_device_ops() can return an error, so lets check its return
value and propagate this up the call chain.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:10:08 +01:00
Andre Przywara 8f6cdc1c2e KVM: arm/arm64: vgic: Move redistributor kvm_io_devices
Logically a GICv3 redistributor is assigned to a (v)CPU, so we should
aim to keep redistributor related variables out of our struct vgic_dist.

Let's start by replacing the redistributor related kvm_io_device array
with two members in our existing struct vgic_cpu, which are naturally
per-VCPU and thus don't require any allocation / freeing.
So apart from the better fit with the redistributor design this saves
some code as well.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-18 18:09:40 +01:00
Anna-Maria Gleixner 15d7e3d349 KVM/arm/arm64/vgic-new: Convert to hotplug state machine
Install the callbacks via the state machine and let the core invoke
the callbacks on the already online CPUs.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Eric Auger <eric.auger@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-arm-kernel@lists.infradead.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153337.900484868@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15 10:41:43 +02:00
Richard Cochran b3c9950a5c arm/kvm/arch_timer: Convert to hotplug state machine
Install the callbacks via the state machine and let the core invoke
the callbacks on the already online CPUs.

Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-arm-kernel@lists.infradead.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153336.634155707@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15 10:40:27 +02:00
Richard Cochran 42ec50b5f2 arm/kvm/vgic: Convert to hotplug state machine
Install the callbacks via the state machine and let the core invoke
the callbacks on the already online CPUs.
The VGIC callback is run after KVM's main callback since it reflects the
makefile order.

Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-arm-kernel@lists.infradead.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153336.546953286@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15 10:40:26 +02:00
Marc Zyngier 50926d82fa KVM: arm/arm64: The GIC is dead, long live the GIC
I don't think any single piece of the KVM/ARM code ever generated
as much hatred as the GIC emulation.

It was written by someone who had zero experience in modeling
hardware (me), was riddled with design flaws, should have been
scrapped and rewritten from scratch long before having a remote
chance of reaching mainline, and yet we supported it for a good
three years. No need to mention the names of those who suffered,
the git log is singing their praises.

Thankfully, we now have a much more maintainable implementation,
and we can safely put the grumpy old GIC to rest.

Fellow hackers, please raise your glass in memory of the GIC:

	The GIC is dead, long live the GIC!

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-07-03 23:09:37 +02:00
Marc Zyngier 05fb05a6ca KVM: arm/arm64: vgic-new: Removel harmful BUG_ON
When changing the active bit from an MMIO trap, we decide to
explode if the intid is that of a private interrupt.

This flawed logic comes from the fact that we were assuming that
kvm_vcpu_kick() as called by kvm_arm_halt_vcpu() would not return before
the called vcpu responded, but this is not the case, so we need to
perform this wait even for private interrupts.

Dropping the BUG_ON seems like the right thing to do.

 [ Commit message tweaked by Christoffer ]

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-06-02 11:52:21 +02:00
Marc Zyngier 637d122baa KVM: arm/arm64: vgic-v3: Always resample level interrupts
When reading back from the list registers, we need to perform
two actions for level interrupts:
1) clear the soft-pending bit if the interrupt is not pending
   anymore *in the list register*
2) resample the line level and propagate it to the pending state

But these two actions shouldn't be linked, and we should *always*
resample the line level, no matter what state is in the list
register. Otherwise, we may end-up injecting spurious interrupts
that have been already retired.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-31 16:12:16 +02:00
Marc Zyngier df7942d17e KVM: arm/arm64: vgic-v2: Always resample level interrupts
When reading back from the list registers, we need to perform
two actions for level interrupts:
1) clear the soft-pending bit if the interrupt is not pending
   anymore *in the list register*
2) resample the line level and propagate it to the pending state

But these two actions shouldn't be linked, and we should *always*
resample the line level, no matter what state is in the list
register. Otherwise, we may end-up injecting spurious interrupts
that have been already retired.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-31 16:12:15 +02:00
Christoffer Dall 4d3afc9bad KVM: arm/arm64: vgic-v2: Clear all dirty LRs
When saving the state of the list registers, it is critical to
reset them zero, as we could otherwise leave unexpected EOI
interrupts pending for virtual level interrupts.

Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-05-31 16:09:28 +02:00
Christoffer Dall 35a2d58588 KVM: arm/arm64: vgic-new: Synchronize changes to active state
When modifying the active state of an interrupt via the MMIO interface,
we should ensure that the write has the intended effect.

If a guest sets an interrupt to active, but that interrupt is already
flushed into a list register on a running VCPU, then that VCPU will
write the active state back into the struct vgic_irq upon returning from
the guest and syncing its state.  This is a non-benign race, because the
guest can observe that an interrupt is not active, and it can have a
reasonable expectations that other VCPUs will not ack any IRQs, and then
set the state to active, and expect it to stay that way.  Currently we
are not honoring this case.

Thefore, change both the SACTIVE and CACTIVE mmio handlers to stop the
world, change the irq state, potentially queue the irq if we're setting
it to active, and then continue.

We take this chance to slightly optimize these functions by not stopping
the world when touching private interrupts where there is inherently no
possible race.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 16:26:38 +02:00
Andre Przywara efffe55af5 KVM: arm/arm64: vgic-new: enable build
Now that the new VGIC implementation has reached feature parity with
the old one, add the new files to the build system and add a Kconfig
option to switch between the two versions.
We set the default to the new version to get maximum test coverage,
in case people experience problems they can switch back to the old
behaviour if needed.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:40:09 +02:00
Andre Przywara 568e8c901e KVM: arm/arm64: vgic-new: implement mapped IRQ handling
We now store the mapped hardware IRQ number in our struct, so we
don't need the irq_phys_map for the new VGIC.
Implement the hardware IRQ mapping on top of the reworked arch
timer interface.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:40:09 +02:00
Andre Przywara 03f0c94c73 KVM: arm/arm64: vgic-new: Wire up irqfd injection
Connect to the new VGIC to the irqfd framework, so that we can
inject IRQs.
GSI routing and MSI routing is not yet implemented.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:40:08 +02:00
Eric Auger f7b6985cc3 KVM: arm/arm64: vgic-new: Add vgic_v2/v3_enable
Enable the VGIC operation by properly initialising the registers
in the hypervisor GIC interface.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:40:07 +02:00
Eric Auger b0442ee227 KVM: arm/arm64: vgic-new: vgic_init: implement map_resources
map_resources is the last initialization step. It is executed on
first VCPU run. At that stage the code checks that userspace has provided
the base addresses for the relevant VGIC regions, which depend on the
type of VGIC that is exposed to the guest.  Also we check if the two
regions overlap.
If the checks succeeded, we register the respective register frames with
the kvm_io_bus framework.

If we emulate a GICv2, the function also forces vgic_init execution if
it has not been executed yet. Also we map the virtual GIC CPU interface
onto the guest's CPU interface.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:40:07 +02:00
Eric Auger ad275b8bb1 KVM: arm/arm64: vgic-new: vgic_init: implement vgic_init
This patch allocates and initializes the data structures used
to model the vgic distributor and virtual cpu interfaces. At that
stage the number of IRQs and number of virtual CPUs is frozen.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:40:06 +02:00
Eric Auger 5e6431da8f KVM: arm/arm64: vgic-new: vgic_init: implement vgic_create
This patch implements the vgic_creation function which is
called on CREATE_IRQCHIP VM IOCTL (v2 only) or KVM_CREATE_DEVICE

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:40:06 +02:00
Eric Auger 9097773245 KVM: arm/arm64: vgic-new: vgic_init: implement kvm_vgic_hyp_init
Implements kvm_vgic_hyp_init and vgic_probe function.
This uses the new firmware independent VGIC probing to support both ACPI
and DT based systems (code from Marc Zyngier).

The vgic_global struct is enriched with new fields populated
by those functions.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:40:05 +02:00
Andre Przywara 878c569e45 KVM: arm/arm64: vgic-new: Add userland GIC CPU interface access
Using the VMCR accessors we provide access to GIC CPU interface state
to userland by wiring it up to the existing userland interface.
[Marc: move and make VMCR accessors static, streamline MMIO handlers]

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:40:05 +02:00
Andre Przywara e4823a7a1b KVM: arm/arm64: vgic-new: Add GICH_VMCR accessors
Since the GIC CPU interface is always virtualized by the hardware,
we don't have CPU interface state information readily available in our
emulation if userland wants to save or restore it.
Fortunately the GIC hypervisor interface provides the VMCR register to
access the required virtual CPU interface bits.
Provide wrappers for GICv2 and GICv3 hosts to have access to this
register.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:40:04 +02:00
Andre Przywara 7d450e2821 KVM: arm/arm64: vgic-new: Add userland access to VGIC dist registers
Userland may want to save and restore the state of the in-kernel VGIC,
so we provide the code which takes a userland request and translate
that into calls to our MMIO framework.

From Christoffer:
When accessing the VGIC state from userspace we really don't want a VCPU
to be messing with the state at the same time, and the API specifies
that we should return -EBUSY if any VCPUs are running.
Check and prevent VCPUs from running by grabbing their mutexes, one by
one, and error out if we fail.
(Note: This could potentially be simplified to just do a simple check
and see if any VCPUs are running, and return -EBUSY then, without
enforcing the locking throughout the duration of the uaccess, if we
think that taking/releasing all these mutexes for every single GIC
register access is too heavyweight.)

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:40:03 +02:00
Christoffer Dall c3199f28e0 KVM: arm/arm64: vgic-new: Export register access interface
Userland can access the emulated GIC to save and restore its state
for initialization or migration purposes.
The kvm_io_bus API requires an absolute gpa, which does not fit the
KVM_DEV_ARM_VGIC_GRP_DIST_REGS user API, that only provides relative
offsets. So we provide a wrapper to plug into our MMIO framework and
find the respective register handler.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
2016-05-20 15:40:03 +02:00
Eric Auger f94591e2e6 KVM: arm/arm64: vgic-new: vgic_kvm_device: access to VGIC registers
This patch implements the switches for KVM_DEV_ARM_VGIC_GRP_DIST_REGS
and KVM_DEV_ARM_VGIC_GRP_CPU_REGS API which allows the userspace to
access VGIC registers.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:40:02 +02:00
Eric Auger e5c3029467 KVM: arm/arm64: vgic-new: vgic_kvm_device: KVM_DEV_ARM_VGIC_GRP_ADDR
This patch implements the KVM_DEV_ARM_VGIC_GRP_ADDR group which
enables to set the base address of GIC regions as seen by the guest.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:40:02 +02:00
Eric Auger e2c1f9abff KVM: arm/arm64: vgic-new: vgic_kvm_device: implement kvm_vgic_addr
kvm_vgic_addr is used by the userspace to set the base address of
the following register regions, as seen by the guest:
- distributor(v2 and v3),
- re-distributors (v3),
- CPU interface (v2).

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:40:01 +02:00
Eric Auger afcc7c50ce KVM: arm/arm64: vgic-new: vgic_kvm_device: KVM_DEV_ARM_VGIC_GRP_CTRL
This patch implements the KVM_DEV_ARM_VGIC_GRP_CTRL group API
featuring KVM_DEV_ARM_VGIC_CTRL_INIT attribute. The vgic_init
function is not yet implemented though.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:40:00 +02:00
Eric Auger fca256026b KVM: arm/arm64: vgic-new: vgic_kvm_device: KVM_DEV_ARM_VGIC_GRP_NR_IRQS
This patch implements the KVM_DEV_ARM_VGIC_GRP_NR_IRQS group. This
modality is supported by both VGIC V2 and V3 KVM device as will be
other groups, hence the introduction of common helpers.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:40:00 +02:00
Eric Auger c86c772191 KVM: arm/arm64: vgic-new: vgic_kvm_device: KVM device ops registration
This patch introduces the skeleton for the KVM device operations
associated to KVM_DEV_TYPE_ARM_VGIC_V2 and KVM_DEV_TYPE_ARM_VGIC_V3.

At that stage kvm_vgic_create is stubbed.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:59 +02:00
Andre Przywara 621ecd8d21 KVM: arm/arm64: vgic-new: Add GICv3 SGI system register trap handler
In contrast to GICv2 SGIs in a GICv3 implementation are not triggered
by a MMIO write, but with a system register write. KVM knows about
that register already, we just need to implement the handler and wire
it up to the core KVM/ARM code.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:59 +02:00
Andre Przywara 78a714aba0 KVM: arm/arm64: vgic-new: Add GICv3 IROUTER register handlers
Since GICv3 supports much more than the 8 CPUs the GICv2 ITARGETSR
register can handle, the new IROUTER register covers the whole range
of possible target (V)CPUs by using the same MPIDR that the cores
report themselves.
In addition to translating this MPIDR into a vcpu pointer we store
the originally written value as well. The architecture allows to
write any values into the register, which must be read back as written.

Since we don't support affinity level 3, we don't need to take care
about the upper word of this 64-bit register, which simplifies the
handling a bit.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:58 +02:00
Andre Przywara 54f59d2b3a KVM: arm/arm64: vgic-new: Add GICv3 IDREGS register handler
We implement the only one ID register that is required by the
architecture, also this is the one that Linux actually checks.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:57 +02:00
Andre Przywara 741972d8a6 KVM: arm/arm64: vgic-new: Add GICv3 redistributor IIDR and TYPER handler
The redistributor TYPER tells the OS about the associated MPIDR,
also the LAST bit is crucial to determine the number of redistributors.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:57 +02:00
Andre Przywara fd59ed3be1 KVM: arm/arm64: vgic-new: Add GICv3 CTLR, IIDR, TYPER handlers
As in the GICv2 emulation we handle those three registers in one
function.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:56 +02:00
Andre Przywara ed9b8cefa9 KVM: arm/arm64: vgic-new: Add GICv3 MMIO handling framework
Create a new file called vgic-mmio-v3.c and describe the GICv3
distributor and redistributor registers there.
This adds a special macro to deal with the split of SGI/PPI in the
redistributor and SPIs in the distributor, which allows us to reuse
the existing GICv2 handlers for those registers which are compatible.
Also we provide a function to deal with the registration of the two
separate redistributor frames per VCPU.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:56 +02:00
Andre Przywara ed40213ef9 KVM: arm/arm64: vgic-new: Add SGIPENDR register handlers
As this register is v2 specific, its implementation lives entirely
in vgic-mmio-v2.c.
This register allows setting the source mask of an IPI.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:55 +02:00
Andre Przywara 55cc01fb90 KVM: arm/arm64: vgic-new: Add SGIR register handler
Triggering an IPI via this register is v2 specific, so the
implementation lives entirely in vgic-mmio-v2.c.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:55 +02:00
Andre Przywara 2c234d6f18 KVM: arm/arm64: vgic-new: Add TARGET registers handlers
The target register handlers are v2 emulation specific, so their
implementation lives entirely in vgic-mmio-v2.c.
We copy the old VGIC behaviour of assigning an IRQ to the first VCPU
set in the target mask instead of making it possibly pending on
multiple VCPUs.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:54 +02:00
Andre Przywara 79717e4ac0 KVM: arm/arm64: vgic-new: Add CONFIG registers handlers
The config register handlers are shared between the v2 and v3
emulation, so their implementation goes into vgic-mmio.c, to be
easily referenced from the v3 emulation as well later.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
2016-05-20 15:39:53 +02:00
Andre Przywara 055658bf48 KVM: arm/arm64: vgic-new: Add PRIORITY registers handlers
The priority register handlers are shared between the v2 and v3
emulation, so their implementation goes into vgic-mmio.c, to be
easily referenced from the v3 emulation as well later.
There is a corner case when we change the priority of a pending
interrupt which we don't handle at the moment.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:53 +02:00
Andre Przywara 69b6fe0c6e KVM: arm/arm64: vgic-new: Add ACTIVE registers handlers
The active register handlers are shared between the v2 and v3
emulation, so their implementation goes into vgic-mmio.c, to be
easily referenced from the v3 emulation as well later.
Since activation/deactivation of an interrupt may happen entirely
in the guest without it ever exiting, we need some extra logic to
properly track the active state.
For clearing the active state, we basically have to halt the guest to
make sure this is properly propagated into the respective VCPUs.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
2016-05-20 15:39:52 +02:00
Andre Przywara 96b298000d KVM: arm/arm64: vgic-new: Add PENDING registers handlers
The pending register handlers are shared between the v2 and v3
emulation, so their implementation goes into vgic-mmio.c, to be easily
referenced from the v3 emulation as well later.
For level triggered interrupts the real line level is unaffected by
this write, so we keep this state separate and combine it with the
device's level to get the actual pending state.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:52 +02:00
Andre Przywara fd122e6209 KVM: arm/arm64: vgic-new: Add ENABLE registers handlers
As the enable register handlers are shared between the v2 and v3
emulation, their implementation goes into vgic-mmio.c, to be easily
referenced from the v3 emulation as well later.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:51 +02:00
Marc Zyngier 2b0cda8789 KVM: arm/arm64: vgic-new: Add CTLR, TYPER and IIDR handlers
Those three registers are v2 emulation specific, so their implementation
lives entirely in vgic-mmio-v2.c. Also they are handled in one function,
as their implementation is pretty simple.
When the guest enables the distributor, we kick all VCPUs to get
potentially pending interrupts serviced.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:50 +02:00
Andre Przywara fb848db396 KVM: arm/arm64: vgic-new: Add GICv2 MMIO handling framework
Create vgic-mmio-v2.c to describe GICv2 emulation specific handlers
using the initializer macros provided by the VGIC MMIO framework.
Provide a function to register the GICv2 distributor registers to
the kvm_io_bus framework.
The actual handler functions are still stubs in this patch.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:50 +02:00
Marc Zyngier 4493b1c486 KVM: arm/arm64: vgic-new: Add MMIO handling framework
Add an MMIO handling framework to the VGIC emulation:
Each register is described by its offset, size (or number of bits per
IRQ, if applicable) and the read/write handler functions. We provide
initialization macros to describe each GIC register later easily.

Separate dispatch functions for read and write accesses are connected
to the kvm_io_bus framework and binary-search for the responsible
register handler based on the offset address within the region.
We convert the incoming data (referenced by a pointer) to the host's
endianess and use pass-by-value to hand the data over to the actual
handler functions.

The register handler prototype and the endianess conversion are
courtesy of Christoffer Dall.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:49 +02:00
Eric Auger 90eee56c5f KVM: arm/arm64: vgic-new: Implement kvm_vgic_vcpu_pending_irq
Tell KVM whether a particular VCPU has an IRQ that needs handling
in the guest. This is used to decide whether a VCPU is runnable.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
2016-05-20 15:39:49 +02:00
Marc Zyngier 59529f69f5 KVM: arm/arm64: vgic-new: Add GICv3 world switch backend
As the GICv3 virtual interface registers differ from their GICv2
siblings, we need different handlers for processing maintenance
interrupts and reading/writing to the LRs.
Implement the respective handler functions and connect them to
existing code to be called if the host is using a GICv3.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:48 +02:00
Marc Zyngier 140b086dd1 KVM: arm/arm64: vgic-new: Add GICv2 world switch backend
Processing maintenance interrupts and accessing the list registers
are dependent on the host's GIC version.
Introduce vgic-v2.c to contain GICv2 specific functions.
Implement the GICv2 specific code for syncing the emulation state
into the VGIC registers.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:48 +02:00
Marc Zyngier 0919e84c0f KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework
Implement the framework for syncing IRQs between our emulation and
the list registers, which represent the guest's view of IRQs.
This is done in kvm_vgic_flush_hwstate and kvm_vgic_sync_hwstate,
which gets called on guest entry and exit.
The code talking to the actual GICv2/v3 hardware is added in the
following patches.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:47 +02:00
Christoffer Dall 8e44474579 KVM: arm/arm64: vgic-new: Add IRQ sorting
Adds the sorting function to cover the case where you have more IRQs
to consider than you have LRs. We now consider priorities.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
2016-05-20 15:39:46 +02:00
Christoffer Dall 81eeb95ddb KVM: arm/arm64: vgic-new: Implement virtual IRQ injection
Provide a vgic_queue_irq_unlock() function which decides whether a
given IRQ needs to be queued to a VCPU's ap_list.
This should be called whenever an IRQ becomes pending or enabled,
either as a result of userspace injection, from in-kernel emulated
devices like the architected timer or from MMIO accesses to the
distributor emulation.
Also provides the necessary functions to allow userland to inject an
IRQ to a guest.
Since this is the first code that starts using our locking mechanism, we
add some (hopefully) clear documentation of our locking strategy and
requirements along with this patch.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
2016-05-20 15:39:46 +02:00
Christoffer Dall 64a959d66e KVM: arm/arm64: vgic-new: Add acccessor to new struct vgic_irq instance
The new VGIC implementation centers around a struct vgic_irq instance
per virtual IRQ.
Provide a function to retrieve the right instance for a given IRQ
number and (in case of private interrupts) the right VCPU.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
2016-05-20 15:39:45 +02:00
Andre Przywara 44bfc42e94 KVM: arm/arm64: move GICv2 emulation defines into arm-gic-v3.h
As (some) GICv3 hosts can emulate a GICv2, some GICv2 specific masks
for the list register definition also apply to GICv3 LRs.
At the moment we have those definitions in the KVM VGICv3
implementation, so let's move them into the GICv3 header file to
have them automatically defined.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
2016-05-20 15:39:44 +02:00
Andre Przywara 2defaff48a KVM: arm/arm64: pmu: abstract access to number of SPIs
Currently the PMU uses a member of the struct vgic_dist directly,
which not only breaks abstraction, but will fail with the new VGIC.
Abstract this access in the VGIC header file and refactor the validity
check in the PMU code.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
2016-05-20 15:39:43 +02:00
Christoffer Dall 83091db981 KVM: arm/arm64: Fix MMIO emulation data handling
When the kernel was handling a guest MMIO read access internally, we
need to copy the emulation result into the run->mmio structure in order
for the kvm_handle_mmio_return() function to pick it up and inject the
	result back into the guest.

Currently the only user of kvm_io_bus for ARM is the VGIC, which did
this copying itself, so this was not causing issues so far.

But with the upcoming new vgic implementation we need this done
properly.

Update the kvm_handle_mmio_return description and cleanup the code to
only perform a single copying when needed.

Code and commit message inspired by Andre Przywara.

Reported-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
2016-05-20 15:39:42 +02:00
Christoffer Dall 2db4c104fa KVM: arm/arm64: Get rid of vgic_cpu->nr_lr
The number of list registers is a property of the underlying system, not
of emulated VGIC CPU interface.

As we are about to move this variable to global state in the new vgic
for clarity, move it from the legacy implementation as well to make the
merge of the new code easier.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
2016-05-20 15:39:41 +02:00
Christoffer Dall 41a54482c0 KVM: arm/arm64: Move timer IRQ map to latest possible time
We are about to modify the VGIC to allocate all data structures
dynamically and store mapped IRQ information on a per-IRQ struct, which
is indeed allocated dynamically at init time.

Therefore, we cannot record the mapped IRQ info from the timer at timer
reset time like it's done now, because VCPU reset happens before timer
init.

A possible later time to do this is on the first run of a per VCPU, it
just requires us to move the enable state to be a per-VCPU state and do
the lookup of the physical IRQ number when we are about to run the VCPU.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
2016-05-20 15:39:41 +02:00
Andre Przywara c8eb3f6b9b KVM: arm/arm64: vgic: Remove irq_phys_map from interface
Now that the virtual arch timer does not care about the irq_phys_map
anymore, let's rework kvm_vgic_map_phys_irq() to return an error
value instead. Any reference to that mapping can later be done by
passing the correct combination of VCPU and virtual IRQ number.
This makes the irq_phys_map handling completely private to the
VGIC code.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:40 +02:00
Andre Przywara a7e33ad9b2 KVM: arm/arm64: arch_timer: Remove irq_phys_map
Now that the interface between the arch timer and the VGIC does not
require passing the irq_phys_map entry pointer anymore, let's remove
it from the virtual arch timer and use the virtual IRQ number instead
directly.
The remaining pointer returned by kvm_vgic_map_phys_irq() will be
removed in the following patch.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:39 +02:00
Christoffer Dall b452cb5207 KVM: arm/arm64: Remove the IRQ field from struct irq_phys_map
The communication of a Linux IRQ number from outside the VGIC to the
vgic was a leftover from the day when the vgic code cared about how a
particular device injects virtual interrupts mapped to a physical
interrupt.

We can safely remove this notion, leaving all physical IRQ handling to
be done in the device driver (the arch timer in this case), which makes
room for a saner API for the new VGIC.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
2016-05-20 15:39:39 +02:00
Andre Przywara 63306c28ac KVM: arm/arm64: vgic: avoid map in kvm_vgic_unmap_phys_irq()
kvm_vgic_unmap_phys_irq() only needs the virtual IRQ number, so let's
just pass that between the arch timer and the VGIC to get rid of
the irq_phys_map pointer.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:38 +02:00
Andre Przywara e262f41936 KVM: arm/arm64: vgic: avoid map in kvm_vgic_map_is_active()
For getting the active state of a mapped IRQ, we actually only need
the virtual IRQ number, not the pointer to the mapping entry.
Pass the virtual IRQ number from the arch timer to the VGIC directly.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:38 +02:00
Andre Przywara 4f551a3d96 KVM: arm/arm64: vgic: avoid map in kvm_vgic_inject_mapped_irq()
When we want to inject a hardware mapped IRQ into a guest, we actually
only need the virtual IRQ number from the irq_phys_map.
So let's pass this number directly from the arch timer to the VGIC
to avoid using the map as a parameter.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:37 +02:00
Andre Przywara 7cbc084dc2 KVM: arm/arm64: vgic: streamline vgic_update_irq_pending() interface
We actually don't use the irq_phys_map parameter in
vgic_update_irq_pending(), so let's just remove it.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20 15:39:37 +02:00
Julien Grall 503a62862e KVM: arm/arm64: vgic: Rely on the GIC driver to parse the firmware tables
Currently, the firmware tables are parsed 2 times: once in the GIC
drivers, the other time when initializing the vGIC. It means code
duplication and make more tedious to add the support for another
firmware table (like ACPI).

Use the recently introduced helper gic_get_kvm_info() to get
information about the virtual GIC.

With this change, the virtual GIC becomes agnostic to the firmware
table and KVM will be able to initialize the vGIC on ACPI.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-03 12:54:21 +02:00
Julien Grall 29c2d6ff4c KVM: arm/arm64: arch_timer: Rely on the arch timer to parse the firmware tables
The firmware table is currently parsed by the virtual timer code in
order to retrieve the virtual timer interrupt. However, this is already
done by the arch timer driver.

To avoid code duplication, use the newly function arch_timer_get_kvm_info()
which return all the information required by the virtual timer code.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-03 12:54:21 +02:00
Marc Zyngier 1c5631c73f KVM: arm/arm64: Handle forward time correction gracefully
On a host that runs NTP, corrections can have a direct impact on
the background timer that we program on the behalf of a vcpu.

In particular, NTP performing a forward correction will result in
a timer expiring sooner than expected from a guest point of view.
Not a big deal, we kick the vcpu anyway.

But on wake-up, the vcpu thread is going to perform a check to
find out whether or not it should block. And at that point, the
timer check is going to say "timer has not expired yet, go back
to sleep". This results in the timer event being lost forever.

There are multiple ways to handle this. One would be record that
the timer has expired and let kvm_cpu_has_pending_timer return
true in that case, but that would be fairly invasive. Another is
to check for the "short sleep" condition in the hrtimer callback,
and restart the timer for the remaining time when the condition
is detected.

This patch implements the latter, with a bit of refactoring in
order to avoid too much code duplication.

Cc: <stable@vger.kernel.org>
Reported-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-04-06 13:17:54 +02:00
Will Deacon 7d4bd1d281 arm64: KVM: Add braces to multi-line if statement in virtual PMU code
The kernel is written in C, not python, so we need braces around
multi-line if statements. GCC 6 actually warns about this, thanks to the
fantastic new "-Wmisleading-indentation" flag:

 | virt/kvm/arm/pmu.c: In function ‘kvm_pmu_overflow_status’:
 | virt/kvm/arm/pmu.c:198:3: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation]
 |    reg &= vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
 |    ^~~
 | arch/arm64/kvm/../../../virt/kvm/arm/pmu.c:196:2: note: ...this ‘if’ clause, but it is not
 |   if ((vcpu_sys_reg(vcpu, PMCR_EL0) & ARMV8_PMU_PMCR_E))
 |   ^~

As it turns out, this particular case is harmless (we just do some &=
operations with 0), but worth fixing nonetheless.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-04-01 13:30:55 +02:00
Marc Zyngier 0d98d00b8d arm64: KVM: vgic-v3: Reset LRs at boot time
In order to let the GICv3 code be more lazy in the way it
accesses the LRs, it is necessary to start with a clean slate.

Let's reset the LRs on each CPU when the vgic is probed (which
includes a round trip to EL2...).

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09 04:24:09 +00:00
Marc Zyngier 1b8e83c04e arm64: KVM: vgic-v3: Avoid accessing ICH registers
Just like on GICv2, we're a bit hammer-happy with GICv3, and access
them more often than we should.

Adopt a policy similar to what we do for GICv2, only save/restoring
the minimal set of registers. As we don't access the registers
linearly anymore (we may skip some), the convoluted accessors become
slightly simpler, and we can drop the ugly indexing macro that
tended to confuse the reviewers.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09 04:24:04 +00:00
Marc Zyngier 667a87a928 KVM: arm/arm64: vgic-v2: Make GICD_SGIR quicker to hit
The GICD_SGIR register lives a long way from the beginning of
the handler array, which is searched linearly. As this is hit
pretty often, let's move it up. This saves us some precious
cycles when the guest is generating IPIs.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09 04:24:03 +00:00
Marc Zyngier cc1daf0b82 KVM: arm/arm64: vgic-v2: Only wipe LRs on vcpu exit
So far, we're always writing all possible LRs, setting the empty
ones with a zero value. This is obvious doing a lot of work for
nothing, and we're better off clearing those we've actually
dirtied on the exit path (it is very rare to inject more than one
interrupt at a time anyway).

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09 04:23:56 +00:00
Marc Zyngier d6400d7746 KVM: arm/arm64: vgic-v2: Reset LRs at boot time
In order to let make the GICv2 code more lazy in the way it
accesses the LRs, it is necessary to start with a clean slate.

Let's reset the LRs on each CPU when the vgic is probed.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09 04:23:00 +00:00
Marc Zyngier f8cfbce1bb KVM: arm/arm64: vgic-v2: Do not save an LR known to be empty
On exit, any empty LR will be signaled in GICH_ELRSR*. Which
means that we do not have to save it, and we can just clear
its state in the in-memory copy.

Take this opportunity to move the LR saving code into its
own function.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09 04:22:24 +00:00
Marc Zyngier 2a1044f8b7 KVM: arm/arm64: vgic-v2: Move GICH_ELRSR saving to its own function
In order to make the saving path slightly more readable and
prepare for some more optimizations, let's move the GICH_ELRSR
saving to its own function.

No functional change.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09 04:22:23 +00:00
Marc Zyngier c813bb17f2 KVM: arm/arm64: vgic-v2: Save maintenance interrupt state only if required
Next on our list of useless accesses is the maintenance interrupt
status registers (GICH_MISR, GICH_EISR{0,1}).

It is pointless to save them if we haven't asked for a maintenance
interrupt the first place, which can only happen for two reasons:
- Underflow: GICH_HCR_UIE will be set,
- EOI: GICH_LR_EOI will be set.

These conditions can be checked on the in-memory copies of the regs.
Should any of these two condition be valid, we must read GICH_MISR.
We can then check for GICH_MISR_EOI, and only when set read
GICH_EISR*.

This means that in most case, we don't have to save them at all.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09 04:22:21 +00:00
Marc Zyngier 59f00ff9af KVM: arm/arm64: vgic-v2: Avoid accessing GICH registers
GICv2 registers are *slow*. As in "terrifyingly slow". Which is bad.
But we're equaly bad, as we make a point in accessing them even if
we don't have any interrupt in flight.

A good solution is to first find out if we have anything useful to
write into the GIC, and if we don't, to simply not do it. This
involves tracking which LRs actually have something valid there.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09 04:22:20 +00:00
Marc Zyngier 9b4a300443 KVM: arm/arm64: timer: Add active state caching
Programming the active state in the (re)distributor can be an
expensive operation so it makes some sense to try and reduce
the number of accesses as much as possible. So far, we
program the active state on each VM entry, but there is some
opportunity to do less.

An obvious solution is to cache the active state in memory,
and only program it in the HW when conditions change. But
because the HW can also change things under our feet (the active
state can transition from 1 to 0 when the guest does an EOI),
some precautions have to be taken, which amount to only caching
an "inactive" state, and always programing it otherwise.

With this in place, we observe a reduction of around 700 cycles
on a 2GHz GICv2 platform for a NULL hypercall.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29 18:34:22 +00:00
Shannon Zhao bb0c70bcca arm64: KVM: Add a new vcpu device control group for PMUv3
To configure the virtual PMUv3 overflow interrupt number, we use the
vcpu kvm_device ioctl, encapsulating the KVM_ARM_VCPU_PMU_V3_IRQ
attribute within the KVM_ARM_VCPU_PMU_V3_CTRL group.

After configuring the PMUv3, call the vcpu ioctl with attribute
KVM_ARM_VCPU_PMU_V3_INIT to initialize the PMUv3.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29 18:34:21 +00:00
Shannon Zhao 808e738142 arm64: KVM: Add a new feature bit for PMUv3
To support guest PMUv3, use one bit of the VCPU INIT feature array.
Initialize the PMU when initialzing the vcpu with that bit and PMU
overflow interrupt set.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29 18:34:21 +00:00
Shannon Zhao 5f0a714a2b arm64: KVM: Free perf event of PMU when destroying vcpu
When KVM frees VCPU, it needs to free the perf_event of PMU.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29 18:34:21 +00:00
Shannon Zhao 2aa36e9840 arm64: KVM: Reset PMU state when resetting vcpu
When resetting vcpu, it needs to reset the PMU state to initial status.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29 18:34:21 +00:00
Shannon Zhao b02386eb7d arm64: KVM: Add PMU overflow interrupt routing
When calling perf_event_create_kernel_counter to create perf_event,
assign a overflow handler. Then when the perf event overflows, set the
corresponding bit of guest PMOVSSET register. If this counter is enabled
and its interrupt is enabled as well, kick the vcpu to sync the
interrupt.

On VM entry, if there is counter overflowed and interrupt level is
changed, inject the interrupt with corresponding level. On VM exit, sync
the interrupt level as well if it has been changed.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29 18:34:21 +00:00
Shannon Zhao 76993739cd arm64: KVM: Add helper to handle PMCR register bits
According to ARMv8 spec, when writing 1 to PMCR.E, all counters are
enabled by PMCNTENSET, while writing 0 to PMCR.E, all counters are
disabled. When writing 1 to PMCR.P, reset all event counters, not
including PMCCNTR, to zero. When writing 1 to PMCR.C, reset PMCCNTR to
zero.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29 18:34:21 +00:00
Shannon Zhao 7a0adc7064 arm64: KVM: Add access handler for PMSWINC register
Add access handler which emulates writing and reading PMSWINC
register and add support for creating software increment event.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29 18:34:20 +00:00
Shannon Zhao 76d883c4e6 arm64: KVM: Add access handler for PMOVSSET and PMOVSCLR register
Since the reset value of PMOVSSET and PMOVSCLR is UNKNOWN, use
reset_unknown for its reset handler. Add a handler to emulate writing
PMOVSSET or PMOVSCLR register.

When writing non-zero value to PMOVSSET, the counter and its interrupt
is enabled, kick this vcpu to sync PMU interrupt.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29 18:34:20 +00:00
Shannon Zhao 7f76635871 arm64: KVM: PMU: Add perf event map and introduce perf event creating function
When we use tools like perf on host, perf passes the event type and the
id of this event type category to kernel, then kernel will map them to
hardware event number and write this number to PMU PMEVTYPER<n>_EL0
register. When getting the event number in KVM, directly use raw event
type to create a perf_event for it.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29 18:34:20 +00:00
Shannon Zhao 96b0eebcc6 arm64: KVM: Add access handler for PMCNTENSET and PMCNTENCLR register
Since the reset value of PMCNTENSET and PMCNTENCLR is UNKNOWN, use
reset_unknown for its reset handler. Add a handler to emulate writing
PMCNTENSET or PMCNTENCLR register.

When writing to PMCNTENSET, call perf_event_enable to enable the perf
event. When writing to PMCNTENCLR, call perf_event_disable to disable
the perf event.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29 18:34:20 +00:00
Shannon Zhao 051ff581ce arm64: KVM: Add access handler for event counter register
These kind of registers include PMEVCNTRn, PMCCNTR and PMXEVCNTR which
is mapped to PMEVCNTRn.

The access handler translates all aarch32 register offsets to aarch64
ones and uses vcpu_sys_reg() to access their values to avoid taking care
of big endian.

When reading these registers, return the sum of register value and the
value perf event counts.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29 18:34:20 +00:00
Marc Zyngier 6d50d54cd8 arm64: KVM: Move vgic-v2 and timer save/restore to virt/kvm/arm/hyp
We already have virt/kvm/arm/ containing timer and vgic stuff.
Add yet another subdirectory to contain the hyp-specific files
(timer and vgic again).

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29 18:34:18 +00:00
Mark Rutland 236cf17c25 KVM: arm/arm64: vgic: Ensure bitmaps are long enough
When we allocate bitmaps in vgic_vcpu_init_maps, we divide the number of
bits we need by 8 to figure out how many bytes to allocate. However,
bitmap elements are always accessed as unsigned longs, and if we didn't
happen to allocate a size such that size % sizeof(unsigned long) == 0,
bitmap accesses may go past the end of the allocation.

When using KASAN (which does byte-granular access checks), this results
in a continuous stream of BUGs whenever these bitmaps are accessed:

=============================================================================
BUG kmalloc-128 (Tainted: G    B          ): kasan: bad access detected
-----------------------------------------------------------------------------

INFO: Allocated in vgic_init.part.25+0x55c/0x990 age=7493 cpu=3 pid=1730
INFO: Slab 0xffffffbde6d5da40 objects=16 used=15 fp=0xffffffc935769700 flags=0x4000000000000080
INFO: Object 0xffffffc935769500 @offset=1280 fp=0x          (null)

Bytes b4 ffffffc9357694f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffffffc935769500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffffffc935769510: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffffffc935769520: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffffffc935769530: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffffffc935769540: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffffffc935769550: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffffffc935769560: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffffffc935769570: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Padding ffffffc9357695b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Padding ffffffc9357695c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Padding ffffffc9357695d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Padding ffffffc9357695e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Padding ffffffc9357695f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
CPU: 3 PID: 1740 Comm: kvm-vcpu-0 Tainted: G    B           4.4.0+ #17
Hardware name: ARM Juno development board (r1) (DT)
Call trace:
[<ffffffc00008e770>] dump_backtrace+0x0/0x280
[<ffffffc00008ea04>] show_stack+0x14/0x20
[<ffffffc000726360>] dump_stack+0x100/0x188
[<ffffffc00030d324>] print_trailer+0xfc/0x168
[<ffffffc000312294>] object_err+0x3c/0x50
[<ffffffc0003140fc>] kasan_report_error+0x244/0x558
[<ffffffc000314548>] __asan_report_load8_noabort+0x48/0x50
[<ffffffc000745688>] __bitmap_or+0xc0/0xc8
[<ffffffc0000d9e44>] kvm_vgic_flush_hwstate+0x1bc/0x650
[<ffffffc0000c514c>] kvm_arch_vcpu_ioctl_run+0x2ec/0xa60
[<ffffffc0000b9a6c>] kvm_vcpu_ioctl+0x474/0xa68
[<ffffffc00036b7b0>] do_vfs_ioctl+0x5b8/0xcb0
[<ffffffc00036bf34>] SyS_ioctl+0x8c/0xa0
[<ffffffc000086cb0>] el0_svc_naked+0x24/0x28
Memory state around the buggy address:
 ffffffc935769400: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffffffc935769480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffffffc935769500: 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                   ^
 ffffffc935769580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffffffc935769600: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================

Fix the issue by always allocating a multiple of sizeof(unsigned long),
as we do elsewhere in the vgic code.

Fixes: c1bfb577a ("arm/arm64: KVM: vgic: switch to dynamic allocation")
Cc: stable@vger.kernel.org
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-23 19:02:48 +00:00
Andre Przywara b3aff6ccbb KVM: arm/arm64: Fix reference to uninitialised VGIC
Commit 4b4b4512da ("arm/arm64: KVM: Rework the arch timer to use
level-triggered semantics") brought the virtual architected timer
closer to the VGIC. There is one occasion were we don't properly
check for the VGIC actually having been initialized before, but
instead go on to check the active state of some IRQ number.
If userland hasn't instantiated a virtual GIC, we end up with a
kernel NULL pointer dereference:
=========
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = ffffffc9745c5000
[00000000] *pgd=00000009f631e003, *pud=00000009f631e003, *pmd=0000000000000000
Internal error: Oops: 96000006 [#2] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 2144 Comm: kvm_simplest-ar Tainted: G      D 4.5.0-rc2+ #1300
Hardware name: ARM Juno development board (r1) (DT)
task: ffffffc976da8000 ti: ffffffc976e28000 task.ti: ffffffc976e28000
PC is at vgic_bitmap_get_irq_val+0x78/0x90
LR is at kvm_vgic_map_is_active+0xac/0xc8
pc : [<ffffffc0000b7e28>] lr : [<ffffffc0000b972c>] pstate: 20000145
....
=========

Fix this by bailing out early of kvm_timer_flush_hwstate() if we don't
have a VGIC at all.

Reported-by: Cosmin Gorgovan <cosmin@linux-geek.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: <stable@vger.kernel.org> # 4.4.x
2016-02-08 15:23:39 +00:00
Linus Torvalds 1baa5efbeb * s390: Support for runtime instrumentation within guests,
support of 248 VCPUs.
 
 * ARM: rewrite of the arm64 world switch in C, support for
 16-bit VM identifiers.  Performance counter virtualization
 missed the boat.
 
 * x86: Support for more Hyper-V features (synthetic interrupt
 controller), MMU cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJWlSKwAAoJEL/70l94x66DY0UIAK5vp4zfQoQOJC4KP4Xgxwdu
 kpnK2Boz3/74o1b0y5+eJZoUZCsXCVLtmP5uhmMxUYWDgByFG2X8ZDhPFwB5FYLT
 2dN+Lr4tsolgIfRdHZtrT6Svp9SDL039bWTdscnbR6l37/j9FRWvpKdhI3orloFD
 /i4CSW2dVIq1/9Xctwu/rtcOEesEx4Cad+6YV3/530eVAXFzE908nXfmqJNZTocY
 YCGcmrMVCOu0ng5QM4xSzmmYjKMLUcRs+QzZWkVBzdJtTgwZUr09yj7I2dZ1yj/i
 cxYrJy6shSwE74XkXsmvG+au3C5u3vX4tnXjBFErnPJ99oqzHatVnFWNRhj4dLQ=
 =PIj1
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "PPC changes will come next week.

   - s390: Support for runtime instrumentation within guests, support of
     248 VCPUs.

   - ARM: rewrite of the arm64 world switch in C, support for 16-bit VM
     identifiers.  Performance counter virtualization missed the boat.

   - x86: Support for more Hyper-V features (synthetic interrupt
     controller), MMU cleanups"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (115 commits)
  kvm: x86: Fix vmwrite to SECONDARY_VM_EXEC_CONTROL
  kvm/x86: Hyper-V SynIC timers tracepoints
  kvm/x86: Hyper-V SynIC tracepoints
  kvm/x86: Update SynIC timers on guest entry only
  kvm/x86: Skip SynIC vector check for QEMU side
  kvm/x86: Hyper-V fix SynIC timer disabling condition
  kvm/x86: Reorg stimer_expiration() to better control timer restart
  kvm/x86: Hyper-V unify stimer_start() and stimer_restart()
  kvm/x86: Drop stimer_stop() function
  kvm/x86: Hyper-V timers fix incorrect logical operation
  KVM: move architecture-dependent requests to arch/
  KVM: renumber vcpu->request bits
  KVM: document which architecture uses each request bit
  KVM: Remove unused KVM_REQ_KICK to save a bit in vcpu->requests
  kvm: x86: Check kvm_write_guest return value in kvm_write_wall_clock
  KVM: s390: implement the RI support of guest
  kvm/s390: drop unpaired smp_mb
  kvm: x86: fix comment about {mmu,nested_mmu}.gva_to_gpa
  KVM: x86: MMU: Use clear_page() instead of init_shadow_page_table()
  arm/arm64: KVM: Detect vGIC presence at runtime
  ...
2016-01-12 13:22:12 -08:00
Marc Zyngier 9d8415d6c1 arm64: KVM: Turn system register numbers to an enum
Having the system register numbers as #defines has been a pain
since day one, as the ordering is pretty fragile, and moving
things around leads to renumbering and epic conflict resolutions.

Now that we're mostly acessing the sysreg file in C, an enum is
a much better type to use, and we can clean things up a bit.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-12-14 11:30:43 +00:00
Marc Zyngier 3c13b8f435 KVM: arm/arm64: vgic-v3: Make the LR indexing macro public
We store GICv3 LRs in reverse order so that the CPU can save/restore
them in rever order as well (don't ask why, the design is crazy),
and yet generate memory traffic that doesn't completely suck.

We need this macro to be available to the C version of save/restore.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-12-14 11:30:38 +00:00
Jisheng Zhang 8cdb654abe KVM: arm/arm64: vgic: make vgic_io_ops static
vgic_io_ops is only referenced within vgic.c, so it can be declared
static.

Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-12-14 11:29:59 +00:00
Christoffer Dall fdec12c12e KVM: arm/arm64: vgic: Fix kvm_vgic_map_is_active's dist check
External inputs to the vgic from time to time need to poke into the
state of a virtual interrupt, the prime example is the architected timer
code.

Since the IRQ's active state can be represented in two places; the LR or
the distributor, we first loop over the LRs but if not active in the LRs
we just return if *any* IRQ is active on the VCPU in question.

This is of course bogus, as we should check if the specific IRQ in
quesiton is active on the distributor instead.

Reported-by: Eric Auger <eric.auger@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-12-11 16:33:31 +00:00
Christoffer Dall 9f958c11b7 KVM: arm/arm64: vgic: Trust the LR state for HW IRQs
We were probing the physial distributor state for the active state of a
HW virtual IRQ, because we had seen evidence that the LR state was not
cleared when the guest deactivated a virtual interrupted.

However, this issue turned out to be a software bug in the GIC, which
was solved by: 84aab5e68c2a5e1e18d81ae8308c3ce25d501b29
(KVM: arm/arm64: arch_timer: Preserve physical dist. active
state on LR.active, 2015-11-24)

Therefore, get rid of the complexities and just look at the LR.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-11-24 18:08:37 +01:00
Christoffer Dall 0e3dfda91d KVM: arm/arm64: arch_timer: Preserve physical dist. active state on LR.active
We were incorrectly removing the active state from the physical
distributor on the timer interrupt when the timer output level was
deasserted.  We shouldn't be doing this without considering the virtual
interrupt's active state, because the architecture requires that when an
LR has the HW bit set and the pending or active bits set, then the
physical interrupt must also have the corresponding bits set.

This addresses an issue where we have been observing an inconsistency
between the LR state and the physical distributor state where the LR
state was active and the physical distributor was not active, which
shouldn't happen.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-11-24 18:07:40 +01:00
Linus Torvalds 933425fb00 s390: A bunch of fixes and optimizations for interrupt and time
handling.
 
 PPC: Mostly bug fixes.
 
 ARM: No big features, but many small fixes and prerequisites including:
 - a number of fixes for the arch-timer
 - introducing proper level-triggered semantics for the arch-timers
 - a series of patches to synchronously halt a guest (prerequisite for
   IRQ forwarding)
 - some tracepoint improvements
 - a tweak for the EL2 panic handlers
 - some more VGIC cleanups getting rid of redundant state
 
 x86: quite a few changes:
 
 - support for VT-d posted interrupts (i.e. PCI devices can inject
 interrupts directly into vCPUs).  This introduces a new component (in
 virt/lib/) that connects VFIO and KVM together.  The same infrastructure
 will be used for ARM interrupt forwarding as well.
 
 - more Hyper-V features, though the main one Hyper-V synthetic interrupt
 controller will have to wait for 4.5.  These will let KVM expose Hyper-V
 devices.
 
 - nested virtualization now supports VPID (same as PCID but for vCPUs)
 which makes it quite a bit faster
 
 - for future hardware that supports NVDIMM, there is support for clflushopt,
 clwb, pcommit
 
 - support for "split irqchip", i.e. LAPIC in kernel + IOAPIC/PIC/PIT in
 userspace, which reduces the attack surface of the hypervisor
 
 - obligatory smattering of SMM fixes
 
 - on the guest side, stable scheduler clock support was rewritten to not
 require help from the hypervisor.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJWO2IQAAoJEL/70l94x66D/K0H/3AovAgYmJQToZlimsktMk6a
 f2xhdIqfU5lIQQh5uNBCfL3o9o8H9Py1ym7aEw3fmztPHHJYc91oTatt2UEKhmEw
 VtZHp/dFHt3hwaIdXmjRPEXiYctraKCyrhaUYdWmUYkoKi7lW5OL5h+S7frG2U6u
 p/hFKnHRZfXHr6NSgIqvYkKqtnc+C0FWY696IZMzgCksOO8jB1xrxoSN3tANW3oJ
 PDV+4og0fN/Fr1capJUFEc/fejREHneANvlKrLaa8ht0qJQutoczNADUiSFLcMPG
 iHljXeDsv5eyjMtUuIL8+MPzcrIt/y4rY41ZPiKggxULrXc6H+JJL/e/zThZpXc=
 =iv2z
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "First batch of KVM changes for 4.4.

  s390:
     A bunch of fixes and optimizations for interrupt and time handling.

  PPC:
     Mostly bug fixes.

  ARM:
     No big features, but many small fixes and prerequisites including:

      - a number of fixes for the arch-timer

      - introducing proper level-triggered semantics for the arch-timers

      - a series of patches to synchronously halt a guest (prerequisite
        for IRQ forwarding)

      - some tracepoint improvements

      - a tweak for the EL2 panic handlers

      - some more VGIC cleanups getting rid of redundant state

  x86:
     Quite a few changes:

      - support for VT-d posted interrupts (i.e. PCI devices can inject
        interrupts directly into vCPUs).  This introduces a new
        component (in virt/lib/) that connects VFIO and KVM together.
        The same infrastructure will be used for ARM interrupt
        forwarding as well.

      - more Hyper-V features, though the main one Hyper-V synthetic
        interrupt controller will have to wait for 4.5.  These will let
        KVM expose Hyper-V devices.

      - nested virtualization now supports VPID (same as PCID but for
        vCPUs) which makes it quite a bit faster

      - for future hardware that supports NVDIMM, there is support for
        clflushopt, clwb, pcommit

      - support for "split irqchip", i.e.  LAPIC in kernel +
        IOAPIC/PIC/PIT in userspace, which reduces the attack surface of
        the hypervisor

      - obligatory smattering of SMM fixes

      - on the guest side, stable scheduler clock support was rewritten
        to not require help from the hypervisor"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (123 commits)
  KVM: VMX: Fix commit which broke PML
  KVM: x86: obey KVM_X86_QUIRK_CD_NW_CLEARED in kvm_set_cr0()
  KVM: x86: allow RSM from 64-bit mode
  KVM: VMX: fix SMEP and SMAP without EPT
  KVM: x86: move kvm_set_irq_inatomic to legacy device assignment
  KVM: device assignment: remove pointless #ifdefs
  KVM: x86: merge kvm_arch_set_irq with kvm_set_msi_inatomic
  KVM: x86: zero apic_arb_prio on reset
  drivers/hv: share Hyper-V SynIC constants with userspace
  KVM: x86: handle SMBASE as physical address in RSM
  KVM: x86: add read_phys to x86_emulate_ops
  KVM: x86: removing unused variable
  KVM: don't pointlessly leave KVM_COMPAT=y in non-KVM configs
  KVM: arm/arm64: Merge vgic_set_lr() and vgic_sync_lr_elrsr()
  KVM: arm/arm64: Clean up vgic_retire_lr() and surroundings
  KVM: arm/arm64: Optimize away redundant LR tracking
  KVM: s390: use simple switch statement as multiplexer
  KVM: s390: drop useless newline in debugging data
  KVM: s390: SCA must not cross page boundaries
  KVM: arm: Do not indent the arguments of DECLARE_BITMAP
  ...
2015-11-05 16:26:26 -08:00
Pavel Fedin 26caea7693 KVM: arm/arm64: Merge vgic_set_lr() and vgic_sync_lr_elrsr()
Now we see that vgic_set_lr() and vgic_sync_lr_elrsr() are always used
together. Merge them into one function, saving from second vgic_ops
dereferencing every time.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-11-04 15:29:49 +01:00
Pavel Fedin 212c76545d KVM: arm/arm64: Clean up vgic_retire_lr() and surroundings
1. Remove unnecessary 'irq' argument, because irq number can be retrieved
   from the LR.
2. Since cff9211eb1
   ("arm/arm64: KVM: Fix arch timer behavior for disabled interrupts ")
   LR_STATE_PENDING is queued back by vgic_retire_lr() itself. Also, it
   clears vlr.state itself. Therefore, we remove the same, now duplicated,
   check with all accompanying bit manipulations from vgic_unqueue_irqs().
3. vgic_retire_lr() is always accompanied by vgic_irq_clear_queued(). Since
   it already does more than just clearing the LR, move
   vgic_irq_clear_queued() inside of it.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-11-04 15:29:49 +01:00
Pavel Fedin c4cd4c168b KVM: arm/arm64: Optimize away redundant LR tracking
Currently we use vgic_irq_lr_map in order to track which LRs hold which
IRQs, and lr_used bitmap in order to track which LRs are used or free.

vgic_irq_lr_map is actually used only for piggy-back optimization, and
can be easily replaced by iteration over lr_used. This is good because in
future, when LPI support is introduced, number of IRQs will grow up to at
least 16384, while numbers from 1024 to 8192 are never going to be used.
This would be a huge memory waste.

In its turn, lr_used is also completely redundant since
ae705930fc ("arm/arm64: KVM: Keep elrsr/aisr
in sync with software model"), because together with lr_used we also update
elrsr. This allows to easily replace lr_used with elrsr, inverting all
conditions (because in elrsr '1' means 'free').

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-11-04 15:29:49 +01:00
Linus Torvalds 6aa2fdb87c Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "The irq departement delivers:

   - Rework the irqdomain core infrastructure to accomodate ACPI based
     systems.  This is required to support ARM64 without creating
     artificial device tree nodes.

   - Sanitize the ACPI based ARM GIC initialization by making use of the
     new firmware independent irqdomain core

   - Further improvements to the generic MSI management

   - Generalize the irq migration on CPU hotplug

   - Improvements to the threaded interrupt infrastructure

   - Allow the migration of "chained" low level interrupt handlers

   - Allow optional force masking of interrupts in disable_irq[_nosysnc]

   - Support for two new interrupt chips - Sigh!

   - A larger set of errata fixes for ARM gicv3

   - The usual pile of fixes, updates, improvements and cleanups all
     over the place"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
  Document that IRQ_NONE should be returned when IRQ not actually handled
  PCI/MSI: Allow the MSI domain to be device-specific
  PCI: Add per-device MSI domain hook
  of/irq: Use the msi-map property to provide device-specific MSI domain
  of/irq: Split of_msi_map_rid to reuse msi-map lookup
  irqchip/gic-v3-its: Parse new version of msi-parent property
  PCI/MSI: Use of_msi_get_domain instead of open-coded "msi-parent" parsing
  of/irq: Use of_msi_get_domain instead of open-coded "msi-parent" parsing
  of/irq: Add support code for multi-parent version of "msi-parent"
  irqchip/gic-v3-its: Add handling of PCI requester id.
  PCI/MSI: Add helper function pci_msi_domain_get_msi_rid().
  of/irq: Add new function of_msi_map_rid()
  Docs: dt: Add PCI MSI map bindings
  irqchip/gic-v2m: Add support for multiple MSI frames
  irqchip/gic-v3: Fix translation of LPIs after conversion to irq_fwspec
  irqchip/mxs: Add Alphascale ASM9260 support
  irqchip/mxs: Prepare driver for hardware with different offsets
  irqchip/mxs: Panic if ioremap or domain creation fails
  irqdomain: Documentation updates
  irqdomain/msi: Use fwnode instead of of_node
  ...
2015-11-03 14:40:01 -08:00
Christoffer Dall e21f091087 arm/arm64: KVM: Add tracepoints for vgic and timer
The VGIC and timer code for KVM arm/arm64 doesn't have any tracepoints
or tracepoint infrastructure defined.  Rewriting some of the timer code
handling showed me how much we need this, so let's add these simple
trace points once and for all and we can easily expand with additional
trace points in these files as we go along.

Cc: Wei Huang <wei@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-22 23:01:48 +02:00
Christoffer Dall 8fe2f19e6e arm/arm64: KVM: Support edge-triggered forwarded interrupts
We mark edge-triggered interrupts with the HW bit set as queued to
prevent the VGIC code from injecting LRs with both the Active and
Pending bits set at the same time while also setting the HW bit,
because the hardware does not support this.

However, this means that we must also clear the queued flag when we sync
back a LR where the state on the physical distributor went from active
to inactive because the guest deactivated the interrupt.  At this point
we must also check if the interrupt is pending on the distributor, and
tell the VGIC to queue it again if it is.

Since these actions on the sync path are extremely close to those for
level-triggered interrupts, rename process_level_irq to
process_queued_irq, allowing it to cater for both cases.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-22 23:01:44 +02:00
Christoffer Dall 4b4b4512da arm/arm64: KVM: Rework the arch timer to use level-triggered semantics
The arch timer currently uses edge-triggered semantics in the sense that
the line is never sampled by the vgic and lowering the line from the
timer to the vgic doesn't have any effect on the pending state of
virtual interrupts in the vgic.  This means that we do not support a
guest with the otherwise valid behavior of (1) disable interrupts (2)
enable the timer (3) disable the timer (4) enable interrupts.  Such a
guest would validly not expect to see any interrupts on real hardware,
but will see interrupts on KVM.

This patch fixes this shortcoming through the following series of
changes.

First, we change the flow of the timer/vgic sync/flush operations.  Now
the timer is always flushed/synced before the vgic, because the vgic
samples the state of the timer output.  This has the implication that we
move the timer operations in to non-preempible sections, but that is
fine after the previous commit getting rid of hrtimer schedules on every
entry/exit.

Second, we change the internal behavior of the timer, letting the timer
keep track of its previous output state, and only lower/raise the line
to the vgic when the state changes.  Note that in theory this could have
been accomplished more simply by signalling the vgic every time the
state *potentially* changed, but we don't want to be hitting the vgic
more often than necessary.

Third, we get rid of the use of the map->active field in the vgic and
instead simply set the interrupt as active on the physical distributor
whenever the input to the GIC is asserted and conversely clear the
physical active state when the input to the GIC is deasserted.

Fourth, and finally, we now initialize the timer PPIs (and all the other
unused PPIs for now), to be level-triggered, and modify the sync code to
sample the line state on HW sync and re-inject a new interrupt if it is
still pending at that time.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-22 23:01:44 +02:00
Christoffer Dall 54723bb37f arm/arm64: KVM: Use appropriate define in VGIC reset code
We currently initialize the SGIs to be enabled in the VGIC code, but we
use the VGIC_NR_PPIS define for this purpose, instead of the the more
natural VGIC_NR_SGIS.  Change this slightly confusing use of the
defines.

Note: This should have no functional change, as both names are defined
to the number 16.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-22 23:01:43 +02:00
Christoffer Dall 8bf9a701e1 arm/arm64: KVM: Implement GICD_ICFGR as RO for PPIs
The GICD_ICFGR allows the bits for the SGIs and PPIs to be read only.
We currently simulate this behavior by writing a hardcoded value to the
register for the SGIs and PPIs on every write of these bits to the
register (ignoring what the guest actually wrote), and by writing the
same value as the reset value to the register.

This is a bit counter-intuitive, as the register is RO for these bits,
and we can just implement it that way, allowing us to control the value
of the bits purely in the reset code.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-22 23:01:42 +02:00
Christoffer Dall 9103617df2 arm/arm64: KVM: vgic: Factor out level irq processing on guest exit
Currently vgic_process_maintenance() processes dealing with a completed
level-triggered interrupt directly, but we are soon going to reuse this
logic for level-triggered mapped interrupts with the HW bit set, so
move this logic into a separate static function.

Probably the most scary part of this commit is convincing yourself that
the current flow is safe compared to the old one.  In the following I
try to list the changes and why they are harmless:

  Move vgic_irq_clear_queued after kvm_notify_acked_irq:
    Harmless because the only potential effect of clearing the queued
    flag wrt.  kvm_set_irq is that vgic_update_irq_pending does not set
    the pending bit on the emulated CPU interface or in the
    pending_on_cpu bitmask if the function is called with level=1.
    However, the point of kvm_notify_acked_irq is to call kvm_set_irq
    with level=0, and we set the queued flag again in
    __kvm_vgic_sync_hwstate later on if the level is stil high.

  Move vgic_set_lr before kvm_notify_acked_irq:
    Also, harmless because the LR are cpu-local operations and
    kvm_notify_acked only affects the dist

  Move vgic_dist_irq_clear_soft_pend after kvm_notify_acked_irq:
    Also harmless, because now we check the level state in the
    clear_soft_pend function and lower the pending bits if the level is
    low.

Reviewed-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-22 23:01:42 +02:00
Christoffer Dall d35268da66 arm/arm64: KVM: arch_timer: Only schedule soft timer on vcpu_block
We currently schedule a soft timer every time we exit the guest if the
timer did not expire while running the guest.  This is really not
necessary, because the only work we do in the timer work function is to
kick the vcpu.

Kicking the vcpu does two things:
(1) If the vpcu thread is on a waitqueue, make it runnable and remove it
from the waitqueue.
(2) If the vcpu is running on a different physical CPU from the one
doing the kick, it sends a reschedule IPI.

The second case cannot happen, because the soft timer is only ever
scheduled when the vcpu is not running.  The first case is only relevant
when the vcpu thread is on a waitqueue, which is only the case when the
vcpu thread has called kvm_vcpu_block().

Therefore, we only need to make sure a timer is scheduled for
kvm_vcpu_block(), which we do by encapsulating all calls to
kvm_vcpu_block() with kvm_timer_{un}schedule calls.

Additionally, we only schedule a soft timer if the timer is enabled and
unmasked, since it is useless otherwise.

Note that theoretically userspace can use the SET_ONE_REG interface to
change registers that should cause the timer to fire, even if the vcpu
is blocked without a scheduled timer, but this case was not supported
before this patch and we leave it for future work for now.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-22 23:01:42 +02:00
Christoffer Dall 0d997491f8 arm/arm64: KVM: Fix disabled distributor operation
We currently do a single update of the vgic state when the distributor
enable/disable control register is accessed and then bypass updating the
state for as long as the distributor remains disabled.

This is incorrect, because updating the state does not consider the
distributor enable bit, and this you can end up in a situation where an
interrupt is marked as pending on the CPU interface, but not pending on
the distributor, which is an impossible state to be in, and triggers a
warning.  Consider for example the following sequence of events:

1. An interrupt is marked as pending on the distributor
   - the interrupt is also forwarded to the CPU interface
2. The guest turns off the distributor (it's about to do a reboot)
   - we stop updating the CPU interface state from now on
3. The guest disables the pending interrupt
   - we remove the pending state from the distributor, but don't touch
     the CPU interface, see point 2.

Since the distributor disable bit really means that no interrupts should
be forwarded to the CPU interface, we modify the code to keep updating
the internal VGIC state, but always set the CPU interface pending bits
to zero when the distributor is disabled.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-20 18:09:13 +02:00
Christoffer Dall 544c572e03 arm/arm64: KVM: Clear map->active on pend/active clear
When a guest reboots or offlines/onlines CPUs, it is not uncommon for it
to clear the pending and active states of an interrupt through the
emulated VGIC distributor.  However, since the architected timers are
defined by the architecture to be level triggered and the guest
rightfully expects them to be that, but we emulate them as
edge-triggered, we have to mimic level-triggered behavior for an
edge-triggered virtual implementation.

We currently do not signal the VGIC when the map->active field is true,
because it indicates that the guest has already been signalled of the
interrupt as required.  Normally this field is set to false when the
guest deactivates the virtual interrupt through the sync path.

We also need to catch the case where the guest deactivates the interrupt
through the emulated distributor, again allowing guests to boot even if
the original virtual timer signal hit before the guest's GIC
initialization sequence is run.

Reviewed-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-20 18:06:34 +02:00
Christoffer Dall cff9211eb1 arm/arm64: KVM: Fix arch timer behavior for disabled interrupts
We have an interesting issue when the guest disables the timer interrupt
on the VGIC, which happens when turning VCPUs off using PSCI, for
example.

The problem is that because the guest disables the virtual interrupt at
the VGIC level, we never inject interrupts to the guest and therefore
never mark the interrupt as active on the physical distributor.  The
host also never takes the timer interrupt (we only use the timer device
to trigger a guest exit and everything else is done in software), so the
interrupt does not become active through normal means.

The result is that we keep entering the guest with a programmed timer
that will always fire as soon as we context switch the hardware timer
state and run the guest, preventing forward progress for the VCPU.

Since the active state on the physical distributor is really part of the
timer logic, it is the job of our virtual arch timer driver to manage
this state.

The timer->map->active boolean field indicates whether we have signalled
this interrupt to the vgic and if that interrupt is still pending or
active.  As long as that is the case, the hardware doesn't have to
generate physical interrupts and therefore we mark the interrupt as
active on the physical distributor.

We also have to restore the pending state of an interrupt that was
queued to an LR but was retired from the LR for some reason, while
remaining pending in the LR.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-20 18:04:54 +02:00
Pavel Fedin 437f9963bc KVM: arm/arm64: Do not inject spurious interrupts
When lowering a level-triggered line from userspace, we forgot to lower
the pending bit on the emulated CPU interface and we also did not
re-compute the pending_on_cpu bitmap for the CPU affected by the change.

Update vgic_update_irq_pending() to fix the two issues above and also
raise a warning in vgic_quue_irq_to_lr if we encounter an interrupt
pending on a CPU which is neither marked active nor pending.

  [ Commit text reworked completely - Christoffer ]

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-20 18:04:43 +02:00
Jean-Philippe Brucker 4f64cb65bf arm/arm64: KVM: Only allow 64bit hosts to build VGICv3
Hardware virtualisation of GICv3 is only supported by 64bit hosts for
the moment. Some VGICv3 bits are missing from the 32bit side, and this
patch allows to still be able to build 32bit hosts when CONFIG_ARM_GIC_V3
is selected.

To this end, we introduce a new option, CONFIG_KVM_ARM_VGIC_V3, that is
only enabled on the 64bit side. The selection is done unconditionally
because CONFIG_ARM_GIC_V3 is always enabled on arm64.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-10-09 23:11:57 +01:00
Ming Lei ef748917b5 arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS'
This patch removes config option of KVM_ARM_MAX_VCPUS,
and like other ARCHs, just choose the maximum allowed
value from hardware, and follows the reasons:

1) from distribution view, the option has to be
defined as the max allowed value because it need to
meet all kinds of virtulization applications and
need to support most of SoCs;

2) using a bigger value doesn't introduce extra memory
consumption, and the help text in Kconfig isn't accurate
because kvm_vpu structure isn't allocated until request
of creating VCPU is sent from QEMU;

3) the main effect is that the field of vcpus[] in 'struct kvm'
becomes a bit bigger(sizeof(void *) per vcpu) and need more cache
lines to hold the structure, but 'struct kvm' is one generic struct,
and it has worked well on other ARCHs already in this way. Also,
the world switch frequecy is often low, for example, it is ~2000
when running kernel building load in VM from APM xgene KVM host,
so the effect is very small, and the difference can't be observed
in my test at all.

Cc: Dann Frazier <dann.frazier@canonical.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-09-17 13:13:27 +01:00
Christoffer Dall 4ad9e16af3 arm/arm64: KVM: arch timer: Reset CNTV_CTL to 0
Provide a better quality of implementation and be architecture compliant
on ARMv7 for the architected timer by resetting the CNTV_CTL to 0 on
reset of the timer.

This change alone fixes the UEFI reset issue reported by Laszlo back in
February.

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Drew Jones <drjones@redhat.com>
Cc: Wei Huang <wei@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-09-04 16:26:56 +01:00
Christoffer Dall 04bdfa8ab5 arm/arm64: KVM: vgic: Move active state handling to flush_hwstate
We currently set the physical active state only when we *inject* a new
pending virtual interrupt, but this is actually not correct, because we
could have been preempted and run something else on the system that
resets the active state to clear.  This causes us to run the VM with the
timer set to fire, but without setting the physical active state.

The solution is to always check the LR configurations, and we if have a
mapped interrupt in the LR in either the pending or active state
(virtual), then set the physical active state.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-09-04 16:26:52 +01:00
Marc Zyngier f120cd6533 KVM: arm/arm64: timer: Allow the timer to control the active state
In order to remove the crude hack where we sneak the masked bit
into the timer's control register, make use of the phys_irq_map
API control the active state of the interrupt.

This causes some limited changes to allow for potential error
propagation.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-08-12 11:28:26 +01:00
Marc Zyngier 773299a570 KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
Virtual interrupts mapped to a HW interrupt should only be triggered
from inside the kernel. Otherwise, you could end up confusing the
kernel (and the GIC's) state machine.

Rearrange the injection path so that kvm_vgic_inject_irq is
used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
used for mapped interrupts. The latter should only be called from
inside the kernel (timer, irqfd).

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-08-12 11:28:26 +01:00
Marc Zyngier 6e84e0e067 KVM: arm/arm64: vgic: Add vgic_{get,set}_phys_irq_active
In order to control the active state of an interrupt, introduce
a pair of accessors allowing the state to be set/queried.

This only affects the logical state, and the HW state will only be
applied at world-switch time.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-08-12 11:28:26 +01:00
Marc Zyngier 08fd6461e8 KVM: arm/arm64: vgic: Allow HW interrupts to be queued to a guest
To allow a HW interrupt to be injected into a guest, we lookup the
guest virtual interrupt in the irq_phys_map list, and if we have
a match, encode both interrupts in the LR.

We also mark the interrupt as "active" at the host distributor level.

On guest EOI on the virtual interrupt, the host interrupt will be
deactivated.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-08-12 11:28:25 +01:00
Marc Zyngier 6c3d63c9a2 KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts
In order to be able to feed physical interrupts to a guest, we need
to be able to establish the virtual-physical mapping between the two
worlds.

The mappings are kept in a set of RCU lists, indexed by virtual interrupts.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-08-12 11:28:25 +01:00
Marc Zyngier 7a67b4b7e0 KVM: arm/arm64: vgic: Relax vgic_can_sample_irq for edge IRQs
We only set the irq_queued flag for level interrupts, meaning
that "!vgic_irq_is_queued(vcpu, irq)" is a good enough predicate
for all interrupts.

This will allow us to inject edge HW interrupts, for which the
state ACTIVE+PENDING is not allowed.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-08-12 11:28:25 +01:00
Marc Zyngier fb182cf845 KVM: arm/arm64: vgic: Allow HW irq to be encoded in LR
Now that struct vgic_lr supports the LR_HW bit and carries a hwirq
field, we can encode that information into the list registers.

This patch provides implementations for both GICv2 and GICv3.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-08-12 11:28:24 +01:00
Linus Torvalds e3d8238d7f arm64 updates for 4.2, mostly refactoring/clean-up:
- CPU ops and PSCI (Power State Coordination Interface) refactoring
   following the merging of the arm64 ACPI support, together with
   handling of Trusted (secure) OS instances
 
 - Using fixmap for permanent FDT mapping, removing the initial dtb
   placement requirements (within 512MB from the start of the kernel
   image). This required moving the FDT self reservation out of the
   memreserve processing
 
 - Idmap (1:1 mapping used for MMU on/off) handling clean-up
 
 - Removing flush_cache_all() - not safe on ARM unless the MMU is off.
   Last stages of CPU power down/up are handled by firmware already
 
 - "Alternatives" (run-time code patching) refactoring and support for
   immediate branch patching, GICv3 CPU interface access
 
 - User faults handling clean-up
 
 And some fixes:
 
 - Fix for VDSO building with broken ELF toolchains
 
 - Fixing another case of init_mm.pgd usage for user mappings (during
   ASID roll-over broadcasting)
 
 - Fix for FPSIMD reloading after CPU hotplug
 
 - Fix for missing syscall trace exit
 
 - Workaround for .inst asm bug
 
 - Compat fix for switching the user tls tpidr_el0 register
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVitZgAAoJEGvWsS0AyF7x+ToP/0Yci5bNsYVwVay8N1rK6WHh
 SGzDMzyxcSBjQpz2IhhTJ28eTAEH+a+HWQms+0tBehjqxqkvjuzBN0okDkc/z8NB
 7Z0BV2aLkQcMwTbjgIh5jm25ZpGmvmvbWPD5oBwgmgQ4v4i1OLRKgx7+YQ+z9rWZ
 zC70d0UwyWjs2oxmjd2ZrAkps7v/qozEFhcRHxLzCn8Mbw+3FcTQsqMbfnoWGnH0
 YuGfHQQqBY4/HC7uAslMCy7tXeJXqb+NsgrnAovjfEbHGDjsg0KNl06K++LHwE37
 A5Noa3M0AQEPYqx/sg0Ec8RNUUEMB4RA2DCaibp7XlVGncXOwFfiyk/M5uVrYXIO
 ku5QF0ytUfZKzrMq3yQKbEDuCPOFTqjjdVpkeXKFdW66zYTohKVc3vUBV5xHZ5uO
 7Kr8H0ZnhAv3OxPyKdEwAuQ5sJdWwQSvZyGClxMUO4eC/UzD0Mwwf1Y8WYtiAXx+
 NSTeBKw/m33W3/KhNuNH1+qGEOKhuXuKX7AcYA84Rab8ytxYWcurHCG2bmhzpEse
 2DZtNMybrP/HMQPyJlYgGac8B3QbsAIAkkU1f+dJTAv9otuBDhscaDQyZ9Y6WVht
 /k8zJiZeMEuGAmwgTkzLmWs/8pQq42nW4J4eQdXPZAwp4ghCIypPWfaZASAwee6/
 p+es3v8P4k9wkv2TFZMh
 =YeGl
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "Mostly refactoring/clean-up:

   - CPU ops and PSCI (Power State Coordination Interface) refactoring
     following the merging of the arm64 ACPI support, together with
     handling of Trusted (secure) OS instances

   - Using fixmap for permanent FDT mapping, removing the initial dtb
     placement requirements (within 512MB from the start of the kernel
     image).  This required moving the FDT self reservation out of the
     memreserve processing

   - Idmap (1:1 mapping used for MMU on/off) handling clean-up

   - Removing flush_cache_all() - not safe on ARM unless the MMU is off.
     Last stages of CPU power down/up are handled by firmware already

   - "Alternatives" (run-time code patching) refactoring and support for
     immediate branch patching, GICv3 CPU interface access

   - User faults handling clean-up

  And some fixes:

   - Fix for VDSO building with broken ELF toolchains

   - Fix another case of init_mm.pgd usage for user mappings (during
     ASID roll-over broadcasting)

   - Fix for FPSIMD reloading after CPU hotplug

   - Fix for missing syscall trace exit

   - Workaround for .inst asm bug

   - Compat fix for switching the user tls tpidr_el0 register"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (42 commits)
  arm64: use private ratelimit state along with show_unhandled_signals
  arm64: show unhandled SP/PC alignment faults
  arm64: vdso: work-around broken ELF toolchains in Makefile
  arm64: kernel: rename __cpu_suspend to keep it aligned with arm
  arm64: compat: print compat_sp instead of sp
  arm64: mm: Fix freeing of the wrong memmap entries with !SPARSEMEM_VMEMMAP
  arm64: entry: fix context tracking for el0_sp_pc
  arm64: defconfig: enable memtest
  arm64: mm: remove reference to tlb.S from comment block
  arm64: Do not attempt to use init_mm in reset_context()
  arm64: KVM: Switch vgic save/restore to alternative_insn
  arm64: alternative: Introduce feature for GICv3 CPU interface
  arm64: psci: fix !CONFIG_HOTPLUG_CPU build warning
  arm64: fix bug for reloading FPSIMD state after CPU hotplug.
  arm64: kernel thread don't need to save fpsimd context.
  arm64: fix missing syscall trace exit
  arm64: alternative: Work around .inst assembler bugs
  arm64: alternative: Merge alternative-asm.h into alternative.h
  arm64: alternative: Allow immediate branch as alternative instruction
  arm64: Rework alternate sequence for ARM erratum 845719
  ...
2015-06-24 10:02:15 -07:00
Marc Zyngier c62e631d4a KVM: arm/arm64: vgic: Remove useless arm-gic.h #include
Back in the days, vgic.c used to have an intimate knowledge of
the actual GICv2. These days, this has been abstracted away into
hardware-specific backends.

Remove the now useless arm-gic.h #include directive, making it
clear that GICv2 specific code doesn't belong here.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-06-18 15:50:31 +01:00
Marc Zyngier 4839ddc27b KVM: arm/arm64: vgic: Avoid injecting reserved IRQ numbers
Commit fd1d0ddf2a (KVM: arm/arm64: check IRQ number on userland
injection) rightly limited the range of interrupts userspace can
inject in a guest, but failed to consider the (unlikely) case where
a guest is configured with 1024 interrupts.

In this case, interrupts ranging from 1020 to 1023 are unuseable,
as they have a special meaning for the GIC CPU interface.

Make sure that these number cannot be used as an IRQ. Also delete
a redundant (and similarily buggy) check in kvm_set_irq.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: <stable@vger.kernel.org> # 4.1, 4.0, 3.19, 3.18
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-06-17 15:18:59 +01:00
Marc Zyngier f5a202db12 KVM: arm: vgic: Drop useless Group0 warning
If a GICv3-enabled guest tries to configure Group0, we print a
warning on the console (because we don't support Group0 interrupts).

This is fairly pointless, and would allow a guest to spam the
console. Let's just drop the warning.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-06-17 09:58:12 +01:00
Marc Zyngier 8a14849b4a arm64: KVM: Switch vgic save/restore to alternative_insn
So far, we configured the world-switch by having a small array
of pointers to the save and restore functions, depending on the
GIC used on the platform.

Loading these values each time is a bit silly (they never change),
and it makes sense to rely on the instruction patching instead.

This leads to a nice cleanup of the code.

Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-06-12 15:12:08 +01:00
Andre Przywara c11b532910 KVM: arm64: add active register handling to GICv3 emulation as well
Commit 47a98b15ba ("arm/arm64: KVM: support for un-queuing active
IRQs") introduced handling of the GICD_I[SC]ACTIVER registers,
but only for the GICv2 emulation. For the sake of completeness and
as this is a pre-requisite for save/restore of the GICv3 distributor
state, we should also emulate their handling in the distributor and
redistributor frames of an emulated GICv3.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-06-09 18:05:17 +01:00
Andre Przywara fd1d0ddf2a KVM: arm/arm64: check IRQ number on userland injection
When userland injects a SPI via the KVM_IRQ_LINE ioctl we currently
only check it against a fixed limit, which historically is set
to 127. With the new dynamic IRQ allocation the effective limit may
actually be smaller (64).
So when now a malicious or buggy userland injects a SPI in that
range, we spill over on our VGIC bitmaps and bytemaps memory.
I could trigger a host kernel NULL pointer dereference with current
mainline by injecting some bogus IRQ number from a hacked kvmtool:
-----------------
....
DEBUG: kvm_vgic_inject_irq(kvm, cpu=0, irq=114, level=1)
DEBUG: vgic_update_irq_pending(kvm, cpu=0, irq=114, level=1)
DEBUG: IRQ #114 still in the game, writing to bytemap now...
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = ffffffc07652e000
[00000000] *pgd=00000000f658b003, *pud=00000000f658b003, *pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 1053 Comm: lkvm-msi-irqinj Not tainted 4.0.0-rc7+ #3027
Hardware name: FVP Base (DT)
task: ffffffc0774e9680 ti: ffffffc0765a8000 task.ti: ffffffc0765a8000
PC is at kvm_vgic_inject_irq+0x234/0x310
LR is at kvm_vgic_inject_irq+0x30c/0x310
pc : [<ffffffc0000ae0a8>] lr : [<ffffffc0000ae180>] pstate: 80000145
.....

So this patch fixes this by checking the SPI number against the
actual limit. Also we remove the former legacy hard limit of
127 in the ioctl code.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
CC: <stable@vger.kernel.org> # 4.0, 3.19, 3.18
[maz: wrap KVM_ARM_IRQ_GIC_MAX with #ifndef __KERNEL__,
as suggested by Christopher Covington]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-04-22 15:42:24 +01:00
Eric Auger 0b3289ebc2 KVM: arm: irqfd: fix value returned by kvm_irq_map_gsi
irqfd/arm curently does not support routing. kvm_irq_map_gsi is
supposed to return all the routing entries associated with the
provided gsi and return the number of those entries. We should
return 0 at this point.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-04-22 15:37:54 +01:00
Paolo Bonzini bf0fb67cf9 KVM/ARM changes for v4.1:
- fixes for live migration
 - irqfd support
 - kvm-io-bus & vgic rework to enable ioeventfd
 - page ageing for stage-2 translation
 - various cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVHQ0kAAoJECPQ0LrRPXpDHKQQALjw6STaZd7n20OFopNgHd4P
 qVeWYEKBxnsiSvL4p3IOSlZlEul+7x08aZqtyxWQRQcDT4ggTI+3FKKfc+8yeRpH
 WV6YJP0bGqz7039PyMLuIgs48xkSZtntePw69hPJfHZh4C1RBlP5T2SfE8mU8VZX
 fWToiU3W12QfKnmN7JFgxZopnGhrYCrG0EexdTDziAZu0GEMlDrO4wnyTR60WCvT
 4TEF73R0kpAz4yplKuhcDHuxIG7VFhQ4z7b09M1JtR0gQ3wUvfbD3Wqqi49SwHkv
 NQOStcyLsIlDosSRcLXNCwb3IxjObXTBcAxnzgm2Aoc1xMMZX1ZPQNNs6zFZzycb
 2c6QMiQ35zm7ellbvrG+bT+BP86JYWcAtHjWcaUFgqSJjb8MtqcMtsCea/DURhqx
 /kictqbPYBBwKW6SKbkNkisz59hPkuQnv35fuf992MRCbT9LAXLPRLbcirucCzkE
 p1MOotsWoO3ldJMZaVn0KYk3sQf6mCIfbYPEdOcw3fhJlvyy3NdjVkLOFbA5UUg1
 rQ7Ru2rTemBc0ExVrymngNTMpMB4XcEeJzXfhcgMl3DWbDj60Ku/O26sDtZ6bsFv
 JuDYn8FVDHz9gpEQHgiUi1YMsBKXLhnILa1ppaa6AflykU3BRfYjAk1SXmX84nQK
 mJUJEdFuxi6pHN0UKxUI
 =avA4
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into 'kvm-next'

KVM/ARM changes for v4.1:

- fixes for live migration
- irqfd support
- kvm-io-bus & vgic rework to enable ioeventfd
- page ageing for stage-2 translation
- various cleanups
2015-04-07 18:09:20 +02:00
Andre Przywara 950324ab81 KVM: arm/arm64: rework MMIO abort handling to use KVM MMIO bus
Currently we have struct kvm_exit_mmio for encapsulating MMIO abort
data to be passed on from syndrome decoding all the way down to the
VGIC register handlers. Now as we switch the MMIO handling to be
routed through the KVM MMIO bus, it does not make sense anymore to
use that structure already from the beginning. So we keep the data in
local variables until we put them into the kvm_io_bus framework.
Then we fill kvm_exit_mmio in the VGIC only, making it a VGIC private
structure. On that way we replace the data buffer in that structure
with a pointer pointing to a single location in a local variable, so
we get rid of some copying on the way.
With all of the virtual GIC emulation code now being registered with
the kvm_io_bus, we can remove all of the old MMIO handling code and
its dispatching functionality.

I didn't bother to rename kvm_exit_mmio (to vgic_mmio or something),
because that touches a lot of code lines without any good reason.

This is based on an original patch by Nikolay.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-30 17:07:19 +01:00
Andre Przywara fb8f61abab KVM: arm/arm64: prepare GICv3 emulation to use kvm_io_bus MMIO handling
Using the framework provided by the recent vgic.c changes, we
register a kvm_io_bus device on mapping the virtual GICv3 resources.
The distributor mapping is pretty straight forward, but the
redistributors need some more love, since they need to be tagged with
the respective redistributor (read: VCPU) they are connected with.
We use the kvm_io_bus framework to register one devices per VCPU.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-30 17:07:13 +01:00
Andre Przywara 0ba10d5392 KVM: arm/arm64: merge GICv3 RD_base and SGI_base register frames
Currently we handle the redistributor registers in two separate MMIO
regions, one for the overall behaviour and SPIs and one for the
SGIs/PPIs. That latter forces the creation of _two_ KVM I/O bus
devices for each redistributor.
Since the spec mandates those two pages to be contigious, we could as
well merge them and save the churn with the second KVM I/O bus device.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-30 17:07:08 +01:00
Andre Przywara a9cf86f62b KVM: arm/arm64: prepare GICv2 emulation to be handled by kvm_io_bus
Using the framework provided by the recent vgic.c changes we register
a kvm_io_bus device when initializing the virtual GICv2.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-26 21:43:15 +00:00
Andre Przywara 6777f77f0f KVM: arm/arm64: implement kvm_io_bus MMIO handling for the VGIC
Currently we use a lot of VGIC specific code to do the MMIO
dispatching.
Use the previous reworks to add kvm_io_bus style MMIO handlers.

Those are not yet called by the MMIO abort handler, also the actual
VGIC emulator function do not make use of it yet, but will be enabled
with the following patches.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-26 21:43:14 +00:00
Andre Przywara 9f199d0a0e KVM: arm/arm64: simplify vgic_find_range() and callers
The vgic_find_range() function in vgic.c takes a struct kvm_exit_mmio
argument, but actually only used the length field in there. Since we
need to get rid of that structure in that part of the code anyway,
let's rework the function (and it's callers) to pass the length
argument to the function directly.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-26 21:43:14 +00:00
Andre Przywara cf50a1eb43 KVM: arm/arm64: rename struct kvm_mmio_range to vgic_io_range
The name "kvm_mmio_range" is a bit bold, given that it only covers
the VGIC's MMIO ranges. To avoid confusion with kvm_io_range, rename
it to vgic_io_range.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-26 21:43:14 +00:00
Christoffer Dall 1a74847885 arm/arm64: KVM: Fix migration race in the arch timer
When a VCPU is no longer running, we currently check to see if it has a
timer scheduled in the future, and if it does, we schedule a host
hrtimer to notify is in case the timer expires while the VCPU is still
not running.  When the hrtimer fires, we mask the guest's timer and
inject the timer IRQ (still relying on the guest unmasking the time when
it receives the IRQ).

This is all good and fine, but when migration a VM (checkpoint/restore)
this introduces a race.  It is unlikely, but possible, for the following
sequence of events to happen:

 1. Userspace stops the VM
 2. Hrtimer for VCPU is scheduled
 3. Userspace checkpoints the VGIC state (no pending timer interrupts)
 4. The hrtimer fires, schedules work in a workqueue
 5. Workqueue function runs, masks the timer and injects timer interrupt
 6. Userspace checkpoints the timer state (timer masked)

At restore time, you end up with a masked timer without any timer
interrupts and your guest halts never receiving timer interrupts.

Fix this by only kicking the VCPU in the workqueue function, and sample
the expired state of the timer when entering the guest again and inject
the interrupt and mask the timer only then.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-14 13:48:00 +01:00
Christoffer Dall 47a98b15ba arm/arm64: KVM: support for un-queuing active IRQs
Migrating active interrupts causes the active state to be lost
completely. This implements some additional bitmaps to track the active
state on the distributor and export this to user space.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-14 13:46:44 +01:00
Alex Bennée 71760950bf arm/arm64: KVM: add a common vgic_queue_irq_to_lr fn
This helps re-factor away some of the repetitive code and makes the code
flow more nicely.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-14 13:44:52 +01:00
Christoffer Dall ae705930fc arm/arm64: KVM: Keep elrsr/aisr in sync with software model
There is an interesting bug in the vgic code, which manifests itself
when the KVM run loop has a signal pending or needs a vmid generation
rollover after having disabled interrupts but before actually switching
to the guest.

In this case, we flush the vgic as usual, but we sync back the vgic
state and exit to userspace before entering the guest.  The consequence
is that we will be syncing the list registers back to the software model
using the GICH_ELRSR and GICH_EISR from the last execution of the guest,
potentially overwriting a list register containing an interrupt.

This showed up during migration testing where we would capture a state
where the VM has masked the arch timer but there were no interrupts,
resulting in a hung test.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Reported-by: Alex Bennee <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-14 13:42:07 +01:00
Wei Yongjun b52104e509 arm/arm64: KVM: fix missing unlock on error in kvm_vgic_create()
Add the missing unlock before return from function kvm_vgic_create()
in the error handling case.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-13 11:40:57 +01:00
Eric Auger 174178fed3 KVM: arm/arm64: add irqfd support
This patch enables irqfd on arm/arm64.

Both irqfd and resamplefd are supported. Injection is implemented
in vgic.c without routing.

This patch enables CONFIG_HAVE_KVM_EVENTFD and CONFIG_HAVE_KVM_IRQFD.

KVM_CAP_IRQFD is now advertised. KVM_CAP_IRQFD_RESAMPLE capability
automatically is advertised as soon as CONFIG_HAVE_KVM_IRQFD is set.

Irqfd injection is restricted to SPI. The rationale behind not
supporting PPI irqfd injection is that any device using a PPI would
be a private-to-the-CPU device (timer for instance), so its state
would have to be context-switched along with the VCPU and would
require in-kernel wiring anyhow. It is not a relevant use case for
irqfds.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-12 15:15:34 +01:00
Eric Auger 649cf73994 KVM: arm/arm64: remove coarse grain dist locking at kvm_vgic_sync_hwstate
To prepare for irqfd addition, coarse grain locking is removed at
kvm_vgic_sync_hwstate level and finer grain locking is introduced in
vgic_process_maintenance only.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-12 15:15:33 +01:00