.. SPDX-License-Identifier: GPL-2.0

===================
Linux KVM Hypercall
===================

X86:
 KVM Hypercalls have a three-byte sequence of either the vmcall or the vmmcall
 instruction. The hypervisor can replace it with instructions that are
 guaranteed to be supported.

 Up to four arguments may be passed in rbx, rcx, rdx, and rsi respectively.
 The hypercall number should be placed in rax and the return value will be
 placed in rax. No other registers will be clobbered unless explicitly stated
 by the particular hypercall.
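
As a rough illustration of the x86 convention described above, the following C sketch places the hypercall number in rax and four arguments in rbx, rcx, rdx and rsi, then reads the result back from rax. The name ``example_hypercall4`` and the two byte arrays are made up for this example and are not kernel APIs. The wrapper hard-codes vmcall and is only meaningful inside a KVM guest.

```c
#include <stdint.h>

/* The two three-byte hypercall instruction encodings. */
static const uint8_t VMCALL_BYTES[3]  = { 0x0f, 0x01, 0xc1 };	/* Intel VMX */
static const uint8_t VMMCALL_BYTES[3] = { 0x0f, 0x01, 0xd9 };	/* AMD SVM */

#if defined(__x86_64__)
/*
 * Hypothetical wrapper, not a kernel API: number in rax, arguments in
 * rbx, rcx, rdx and rsi, return value in rax. Only call this inside a
 * KVM guest; elsewhere the instruction faults.
 */
static inline long example_hypercall4(unsigned long nr, unsigned long a0,
				      unsigned long a1, unsigned long a2,
				      unsigned long a3)
{
	long ret;

	__asm__ volatile("vmcall"
			 : "=a"(ret)
			 : "a"(nr), "b"(a0), "c"(a1), "d"(a2), "S"(a3)
			 : "memory");
	return ret;
}
#endif
```

The "memory" clobber is a conservative choice for this sketch, since some hypercalls make the host write to guest memory.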

S390:
  R2-R7 are used for parameters 1-6. In addition, R1 is used for the hypercall
  number. The return value is written to R2.

  S390 uses the diagnose instruction as hypercall (0x500) along with the
  hypercall number in R1.

  For further information on the S390 diagnose call as supported by KVM,
  refer to Documentation/virt/kvm/s390-diag.rst.

PowerPC:
  It uses R3-R10 for arguments and the hypercall number in R11. R4-R11 are
  used as output registers. The return value is placed in R3.

  KVM hypercalls use a 4-byte opcode that is patched with the
  'hypercall-instructions' property inside the device tree's /hypervisor node.
  For more information refer to Documentation/virt/kvm/ppc-pv.rst.

MIPS:
  KVM hypercalls use the HYPCALL instruction with code 0 and the hypercall
  number in $2 (v0). Up to four arguments may be placed in $4-$7 (a0-a3) and
  the return value is placed in $2 (v0).
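
For quick reference, the four register conventions above can be collected into a small lookup table. This is only a sketch: the struct, the table and ``hc_lookup`` are hypothetical names introduced here, and the ``max_args`` values are simply the number of argument registers each paragraph lists.

```c
#include <stddef.h>
#include <string.h>

/* One row per architecture, transcribed from the text above. */
struct hc_convention {
	const char *arch;
	const char *nr_reg;	/* register holding the hypercall number */
	const char *arg_regs;	/* registers holding the arguments */
	const char *ret_reg;	/* register holding the return value */
	int max_args;		/* count of argument registers listed */
};

static const struct hc_convention hc_conventions[] = {
	{ "x86",     "rax", "rbx, rcx, rdx, rsi", "rax", 4 },
	{ "s390",    "R1",  "R2-R7",              "R2",  6 },
	{ "powerpc", "R11", "R3-R10",             "R3",  8 },
	{ "mips",    "$2",  "$4-$7",              "$2",  4 },
};

/* Return the row for @arch, or NULL if the architecture is not listed. */
static const struct hc_convention *hc_lookup(const char *arch)
{
	size_t i;

	for (i = 0; i < sizeof(hc_conventions) / sizeof(hc_conventions[0]); i++)
		if (strcmp(hc_conventions[i].arch, arch) == 0)
			return &hc_conventions[i];
	return NULL;
}
```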

KVM Hypercalls Documentation
============================

The template for each hypercall is:

1. Hypercall name
2. Architecture(s)
3. Status (deprecated, obsolete, active)
4. Purpose

1. KVM_HC_VAPIC_POLL_IRQ
------------------------

:Architecture: x86
:Status: active
:Purpose: Trigger guest exit so that the host can check for pending
          interrupts on reentry.

2. KVM_HC_MMU_OP
----------------

:Architecture: x86
:Status: deprecated
:Purpose: Support MMU operations such as writing to PTE,
          flushing TLB, and releasing PT.

3. KVM_HC_FEATURES
------------------

:Architecture: PPC
:Status: active
:Purpose: Expose hypercall availability to the guest. On x86 platforms, cpuid
          is used to enumerate which hypercalls are available. On PPC, either
          a device tree based lookup (which is also what EPAPR dictates)
          or a KVM specific enumeration mechanism (which is this hypercall)
          can be used.

4. KVM_HC_PPC_MAP_MAGIC_PAGE
----------------------------

:Architecture: PPC
:Status: active
:Purpose: To enable communication between the hypervisor and guest there is a
          shared page that contains parts of supervisor visible register state.
          The guest can map this shared page to access its supervisor registers
          through memory using this hypercall.

5. KVM_HC_KICK_CPU
------------------

:Architecture: x86
:Status: active
:Purpose: Hypercall used to wake up a vcpu from HLT state
:Usage example:
  A vcpu of a paravirtualized guest that is busy-waiting in guest
  kernel mode for an event to occur (e.g. a spinlock to become available) can
  execute the HLT instruction once it has busy-waited for more than a threshold
  time interval. Execution of the HLT instruction causes the hypervisor to put
  the vcpu to sleep until an appropriate event occurs. Another vcpu of the
  same guest can wake up the sleeping vcpu by issuing the KVM_HC_KICK_CPU
  hypercall, specifying the APIC ID (a1) of the vcpu to be woken up. An
  additional argument (a0) is reserved in the hypercall for future use.
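
The argument layout of KVM_HC_KICK_CPU can be sketched as follows. The struct and helper are hypothetical and exist only to show which value goes where. Passing 0 in a0 is an assumption for an argument the text describes as being for future use, and the value 5 is the hypercall number as defined in the kernel's ``include/uapi/linux/kvm_para.h``, which real code should include instead of redefining.

```c
/* Hypercall number as defined in include/uapi/linux/kvm_para.h. */
#define KVM_HC_KICK_CPU 5

/* Hypothetical container for the hypercall number and its two arguments. */
struct kick_request {
	unsigned long nr;
	unsigned long a0;	/* for future use; assumed to be 0 here */
	unsigned long a1;	/* APIC ID of the vcpu to wake up */
};

/* Build the register values for waking the vcpu with @apic_id. */
static struct kick_request make_kick_request(unsigned long apic_id)
{
	struct kick_request req = {
		.nr = KVM_HC_KICK_CPU,
		.a0 = 0,
		.a1 = apic_id,
	};

	return req;
}
```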

6. KVM_HC_CLOCK_PAIRING
-----------------------

:Architecture: x86
:Status: active
:Purpose: Hypercall used to synchronize host and guest clocks.

Usage:

a0: guest physical address where the host copies the
"struct kvm_clock_pairing" structure.

a1: clock_type. At the moment only KVM_CLOCK_PAIRING_WALLCLOCK (0)
is supported (corresponding to the host's CLOCK_REALTIME clock).

::

    struct kvm_clock_pairing {
        __s64 sec;
        __s64 nsec;
        __u64 tsc;
        __u32 flags;
        __u32 pad[9];
    };

Where:

* sec: seconds from clock_type clock.
* nsec: nanoseconds from clock_type clock.
* tsc: guest TSC value used to calculate the sec/nsec pair.
* flags: flags, unused (0) at the moment.

The hypercall lets a guest compute a precise timestamp across
host and guest. The guest can use the returned TSC value to
compute the CLOCK_REALTIME for its clock, at the same instant.

Returns KVM_EOPNOTSUPP if the host does not use the TSC clocksource,
or if the clock type is different from KVM_CLOCK_PAIRING_WALLCLOCK.
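
One way a guest might use the returned triple is sketched below. The struct is a local mirror of the layout above using ``<stdint.h>`` types, and both helper names are hypothetical. The TSC frequency ``tsc_khz`` is an assumed input, since the hypercall itself does not report it.

```c
#include <stdint.h>

/* Local mirror of the layout shown above, using <stdint.h> types. */
struct clock_pairing_example {
	int64_t  sec;
	int64_t  nsec;
	uint64_t tsc;
	uint32_t flags;
	uint32_t pad[9];
};

#define NSEC_PER_SEC 1000000000LL

/* Host CLOCK_REALTIME, in nanoseconds, at the instant pair->tsc was read. */
static int64_t pairing_to_ns(const struct clock_pairing_example *pair)
{
	return pair->sec * NSEC_PER_SEC + pair->nsec;
}

/*
 * Extrapolate CLOCK_REALTIME to a later guest TSC reading @tsc_now.
 * @tsc_khz (guest TSC frequency in kHz) is an assumed input; how the
 * guest learns it is outside the scope of this hypercall.
 */
static int64_t realtime_at_tsc(const struct clock_pairing_example *pair,
			       uint64_t tsc_now, uint64_t tsc_khz)
{
	uint64_t delta = tsc_now - pair->tsc;
	/* cycles -> ns is delta * 1e6 / tsc_khz; split to avoid overflow. */
	uint64_t delta_ns = (delta / tsc_khz) * 1000000ULL +
			    (delta % tsc_khz) * 1000000ULL / tsc_khz;

	return pairing_to_ns(pair) + (int64_t)delta_ns;
}
```

With a 2.5 GHz TSC (``tsc_khz`` of 2500000), a delta of 2500000 cycles corresponds to one millisecond added to the paired timestamp.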

6. KVM_HC_SEND_IPI
------------------

:Architecture: x86
:Status: active
:Purpose: Send IPIs to multiple vCPUs.

- a0: lower part of the bitmap of destination APIC IDs
- a1: higher part of the bitmap of destination APIC IDs
- a2: the lowest APIC ID in bitmap
- a3: APIC ICR

The hypercall lets a guest send multicast IPIs, with at most
128 destinations per hypercall in 64-bit mode and 64 vCPUs per
hypercall in 32-bit mode. The destinations are represented by a
bitmap contained in the first two arguments (a0 and a1). Bit 0 of
a0 corresponds to the APIC ID in the third argument (a2), bit 1
corresponds to the APIC ID a2+1, and so on.

Returns the number of CPUs to which the IPIs were delivered successfully.
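
The bitmap encoding described above can be sketched in C for the 64-bit case. The struct and ``ipi_bitmap_build`` are hypothetical names for this example only. The sketch rejects ID lists that span more than 128 values rather than splitting them across several hypercalls.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the first three KVM_HC_SEND_IPI arguments. */
struct ipi_bitmap {
	uint64_t a0;	/* lower 64 bits of the destination bitmap */
	uint64_t a1;	/* upper 64 bits of the destination bitmap */
	uint32_t a2;	/* lowest APIC ID; bit 0 of a0 refers to it */
};

/*
 * Encode @n APIC IDs into @out. Returns 0 on success, or -1 if the list
 * is empty or the IDs span more than 128 values and so do not fit into
 * a single hypercall in 64-bit mode.
 */
static int ipi_bitmap_build(const uint32_t *ids, size_t n,
			    struct ipi_bitmap *out)
{
	uint32_t min;
	size_t i;

	if (n == 0)
		return -1;

	min = ids[0];
	for (i = 1; i < n; i++)
		if (ids[i] < min)
			min = ids[i];

	out->a0 = 0;
	out->a1 = 0;
	out->a2 = min;
	for (i = 0; i < n; i++) {
		uint32_t bit = ids[i] - min;

		if (bit >= 128)
			return -1;
		if (bit < 64)
			out->a0 |= (uint64_t)1 << bit;
		else
			out->a1 |= (uint64_t)1 << (bit - 64);
	}
	return 0;
}
```

For the APIC IDs 5, 3 and 70, a2 is 3 and the set bits are 2, 0 and 67, so a0 is 0x5 and a1 is 0x8.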

7. KVM_HC_SCHED_YIELD
---------------------

:Architecture: x86
:Status: active
:Purpose: Hypercall used to yield if the IPI target vCPU is preempted

a0: destination APIC ID

:Usage example: When sending a call-function IPI-many to vCPUs, yield if
                any of the IPI target vCPUs was preempted.
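
The selection step in the usage example might look like the sketch below. It models only the choice of target, not the hypercall itself. ``struct ipi_target`` and ``first_preempted_target`` are hypothetical, and how a guest learns that a vCPU is preempted is assumed to be provided elsewhere.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-target state; a real guest would query its own data. */
struct ipi_target {
	unsigned long apic_id;
	bool preempted;
};

/*
 * Return the index of the first preempted target, whose APIC ID would be
 * passed as a0 to KVM_HC_SCHED_YIELD, or -1 if no target is preempted
 * and no yield is needed.
 */
static long first_preempted_target(const struct ipi_target *targets, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (targets[i].preempted)
			return (long)i;
	return -1;
}
```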