Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:

   - Proper emulation of the OSLock feature of the debug architecture

   - Scalability improvements for the MMU lock when dirty logging is on

   - New VMID allocator, which will eventually help with SVA in VMs

   - Better support for PMUs in heterogeneous systems

   - PSCI 1.1 support, enabling support for SYSTEM_RESET2

   - Implement CONFIG_DEBUG_LIST at EL2

   - Make CONFIG_ARM64_ERRATUM_2077057 default y

   - Reduce the overhead of VM exit when no interrupt is pending

   - Remove traces of 32bit ARM host support from the documentation

   - Updated vgic selftests

   - Various cleanups, doc updates and spelling fixes

  RISC-V:

   - Prevent KVM_COMPAT from being selected

   - Optimize __kvm_riscv_switch_to() implementation

   - RISC-V SBI v0.3 support

  s390:

   - memop selftest

   - fix SCK locking

   - adapter interruptions virtualization for secure guests

   - add Claudio Imbrenda as maintainer

   - first step to do proper storage key checking

  x86:

   - Continue switching kvm_x86_ops to static_call(); introduce
     static_call_cond() and __static_call_ret0 when applicable.

   - Clean up unused arguments in several functions

   - Synthesize AMD 0x80000021 leaf

   - Fixes and optimization for Hyper-V sparse-bank hypercalls

   - Implement Hyper-V's enlightened MSR bitmap for nested SVM

   - Remove MMU auditing

   - Eager splitting of page tables (new aka "TDP" MMU only) when dirty
     page tracking is enabled

   - Clean up the implementation of the guest PGD cache

   - Preparation for the implementation of Intel IPI virtualization

   - Fix some segment descriptor checks in the emulator

   - Allow AMD AVIC support on systems with physical APIC ID above 255

   - Better API to disable virtualization quirks

   - Fixes and optimizations for the zapping of page tables:

      - Zap roots in two passes, avoiding RCU read-side critical
        sections that last too long for very large guests backed by
        4 KiB SPTEs.

      - Zap invalid and defunct roots asynchronously via a
        concurrency-managed work queue.

      - Allow yielding when zapping TDP MMU roots in response to the
        root's last reference being put.

      - Batch more TLB flushes with an RCU trick. Whoever frees the
        paging structure now holds RCU as a proxy for all vCPUs running
        in the guest, i.e. it prolongs the grace period on their behalf.
        It then kicks the vCPUs out of guest mode before doing
        rcu_read_unlock().

  Generic:

   - Introduce __vcalloc and use it for very large allocations that
     need memcg accounting"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (246 commits)
  KVM: use kvcalloc for array allocations
  KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2
  kvm: x86: Require const tsc for RT
  KVM: x86: synthesize CPUID leaf 0x80000021h if useful
  KVM: x86: add support for CPUID leaf 0x80000021
  KVM: x86: do not use KVM_X86_OP_OPTIONAL_RET0 for get_mt_mask
  Revert "KVM: x86/mmu: Zap only TDP MMU leafs in kvm_zap_gfn_range()"
  kvm: x86/mmu: Flush TLB before zap_gfn_range releases RCU
  KVM: arm64: fix typos in comments
  KVM: arm64: Generalise VM features into a set of flags
  KVM: s390: selftests: Add error memop tests
  KVM: s390: selftests: Add more copy memop tests
  KVM: s390: selftests: Add named stages for memop test
  KVM: s390: selftests: Add macro as abstraction for MEM_OP
  KVM: s390: selftests: Split memop tests
  KVM: s390x: fix SCK locking
  RISC-V: KVM: Implement SBI HSM suspend call
  RISC-V: KVM: Add common kvm_riscv_vcpu_wfi() function
  RISC-V: Add SBI HSM suspend related defines
  RISC-V: KVM: Implement SBI v0.3 SRST extension
  ...
commit 1ebdbeb03e
@@ -2366,13 +2366,35 @@
 	kvm.ignore_msrs=[KVM] Ignore guest accesses to unhandled MSRs.
 			Default is 0 (don't ignore, but inject #GP)
 
+	kvm.eager_page_split=
+			[KVM,X86] Controls whether or not KVM will try to
+			proactively split all huge pages during dirty logging.
+			Eager page splitting reduces interruptions to vCPU
+			execution by eliminating the write-protection faults
+			and MMU lock contention that would otherwise be
+			required to split huge pages lazily.
+
+			VM workloads that rarely perform writes or that write
+			only to a small region of VM memory may benefit from
+			disabling eager page splitting to allow huge pages to
+			still be used for reads.
+
+			The behavior of eager page splitting depends on whether
+			KVM_DIRTY_LOG_INITIALLY_SET is enabled or disabled. If
+			disabled, all huge pages in a memslot will be eagerly
+			split when dirty logging is enabled on that memslot. If
+			enabled, eager page splitting will be performed during
+			the KVM_CLEAR_DIRTY ioctl, and only for the pages being
+			cleared.
+
+			Eager page splitting currently only supports splitting
+			huge pages mapped by the TDP MMU.
+
+			Default is Y (on).
+
 	kvm.enable_vmware_backdoor=[KVM] Support VMware backdoor PV interface.
 			Default is false (don't support).
 
-	kvm.mmu_audit=	[KVM] This is a R/W parameter which allows audit
-			KVM MMU at runtime.
-			Default is 0 (off)
-
 	kvm.nx_huge_pages=
 			[KVM] Controls the software workaround for the
 			X86_BUG_ITLB_MULTIHIT bug.

@@ -417,7 +417,7 @@ kvm_run' (see below).
 -----------------
 
 :Capability: basic
-:Architectures: all except ARM, arm64
+:Architectures: all except arm64
 :Type: vcpu ioctl
 :Parameters: struct kvm_regs (out)
 :Returns: 0 on success, -1 on error

@@ -450,7 +450,7 @@ Reads the general purpose registers from the vcpu.
 -----------------
 
 :Capability: basic
-:Architectures: all except ARM, arm64
+:Architectures: all except arm64
 :Type: vcpu ioctl
 :Parameters: struct kvm_regs (in)
 :Returns: 0 on success, -1 on error

@@ -824,7 +824,7 @@ Writes the floating point state to the vcpu.
 -----------------------
 
 :Capability: KVM_CAP_IRQCHIP, KVM_CAP_S390_IRQCHIP (s390)
-:Architectures: x86, ARM, arm64, s390
+:Architectures: x86, arm64, s390
 :Type: vm ioctl
 :Parameters: none
 :Returns: 0 on success, -1 on error

@@ -833,7 +833,7 @@ Creates an interrupt controller model in the kernel.
 On x86, creates a virtual ioapic, a virtual PIC (two PICs, nested), and sets up
 future vcpus to have a local APIC. IRQ routing for GSIs 0-15 is set to both
 PIC and IOAPIC; GSI 16-23 only go to the IOAPIC.
-On ARM/arm64, a GICv2 is created. Any other GIC versions require the usage of
+On arm64, a GICv2 is created. Any other GIC versions require the usage of
 KVM_CREATE_DEVICE, which also supports creating a GICv2. Using
 KVM_CREATE_DEVICE is preferred over KVM_CREATE_IRQCHIP for GICv2.
 On s390, a dummy irq routing table is created.

@@ -846,7 +846,7 @@ before KVM_CREATE_IRQCHIP can be used.
 -----------------
 
 :Capability: KVM_CAP_IRQCHIP
-:Architectures: x86, arm, arm64
+:Architectures: x86, arm64
 :Type: vm ioctl
 :Parameters: struct kvm_irq_level
 :Returns: 0 on success, -1 on error

@@ -870,7 +870,7 @@ capability is present (or unless it is not using the in-kernel irqchip,
 of course).
 
 
-ARM/arm64 can signal an interrupt either at the CPU level, or at the
+arm64 can signal an interrupt either at the CPU level, or at the
 in-kernel irqchip (GIC), and for in-kernel irqchip can tell the GIC to
 use PPIs designated for specific cpus. The irq field is interpreted
 like this::

@@ -896,7 +896,7 @@ When KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 is supported, the target vcpu is
 identified as (256 * vcpu2_index + vcpu_index). Otherwise, vcpu2_index
 must be zero.
 
-Note that on arm/arm64, the KVM_CAP_IRQCHIP capability only conditions
+Note that on arm64, the KVM_CAP_IRQCHIP capability only conditions
 injection of interrupts for the in-kernel irqchip. KVM_IRQ_LINE can always
 be used for a userspace interrupt controller.
 

@@ -1087,7 +1087,7 @@ Other flags returned by ``KVM_GET_CLOCK`` are accepted but ignored.
 
 :Capability: KVM_CAP_VCPU_EVENTS
 :Extended by: KVM_CAP_INTR_SHADOW
-:Architectures: x86, arm, arm64
+:Architectures: x86, arm64
 :Type: vcpu ioctl
 :Parameters: struct kvm_vcpu_event (out)
 :Returns: 0 on success, -1 on error

@@ -1146,8 +1146,8 @@ The following bits are defined in the flags field:
   fields contain a valid state. This bit will be set whenever
   KVM_CAP_EXCEPTION_PAYLOAD is enabled.
 
-ARM/ARM64:
-^^^^^^^^^^
+ARM64:
+^^^^^^
 
 If the guest accesses a device that is being emulated by the host kernel in
 such a way that a real device would generate a physical SError, KVM may make

@@ -1206,7 +1206,7 @@ directly to the virtual CPU).
 
 :Capability: KVM_CAP_VCPU_EVENTS
 :Extended by: KVM_CAP_INTR_SHADOW
-:Architectures: x86, arm, arm64
+:Architectures: x86, arm64
 :Type: vcpu ioctl
 :Parameters: struct kvm_vcpu_event (in)
 :Returns: 0 on success, -1 on error

@@ -1241,8 +1241,8 @@ can be set in the flags field to signal that the
 exception_has_payload, exception_payload, and exception.pending fields
 contain a valid state and shall be written into the VCPU.
 
-ARM/ARM64:
-^^^^^^^^^^
+ARM64:
+^^^^^^
 
 User space may need to inject several types of events to the guest.
 

@@ -1449,7 +1449,7 @@ for vm-wide capabilities.
 ---------------------
 
 :Capability: KVM_CAP_MP_STATE
-:Architectures: x86, s390, arm, arm64, riscv
+:Architectures: x86, s390, arm64, riscv
 :Type: vcpu ioctl
 :Parameters: struct kvm_mp_state (out)
 :Returns: 0 on success; -1 on error

@@ -1467,7 +1467,7 @@ Possible values are:
 
    ==========================    ===============================================
    KVM_MP_STATE_RUNNABLE         the vcpu is currently running
-                                 [x86,arm/arm64,riscv]
+                                 [x86,arm64,riscv]
    KVM_MP_STATE_UNINITIALIZED    the vcpu is an application processor (AP)
                                  which has not yet received an INIT signal [x86]
    KVM_MP_STATE_INIT_RECEIVED    the vcpu has received an INIT signal, and is

@@ -1476,7 +1476,7 @@ Possible values are:
                                  is waiting for an interrupt [x86]
    KVM_MP_STATE_SIPI_RECEIVED    the vcpu has just received a SIPI (vector
                                  accessible via KVM_GET_VCPU_EVENTS) [x86]
-   KVM_MP_STATE_STOPPED          the vcpu is stopped [s390,arm/arm64,riscv]
+   KVM_MP_STATE_STOPPED          the vcpu is stopped [s390,arm64,riscv]
    KVM_MP_STATE_CHECK_STOP       the vcpu is in a special error state [s390]
    KVM_MP_STATE_OPERATING        the vcpu is operating (running or halted)
                                  [s390]

@@ -1488,8 +1488,8 @@ On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
 in-kernel irqchip, the multiprocessing state must be maintained by userspace on
 these architectures.
 
-For arm/arm64/riscv:
-^^^^^^^^^^^^^^^^^^^^
+For arm64/riscv:
+^^^^^^^^^^^^^^^^
 
 The only states that are valid are KVM_MP_STATE_STOPPED and
 KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.

@@ -1498,7 +1498,7 @@ KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
 ---------------------
 
 :Capability: KVM_CAP_MP_STATE
-:Architectures: x86, s390, arm, arm64, riscv
+:Architectures: x86, s390, arm64, riscv
 :Type: vcpu ioctl
 :Parameters: struct kvm_mp_state (in)
 :Returns: 0 on success; -1 on error

@@ -1510,8 +1510,8 @@ On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
 in-kernel irqchip, the multiprocessing state must be maintained by userspace on
 these architectures.
 
-For arm/arm64/riscv:
-^^^^^^^^^^^^^^^^^^^^
+For arm64/riscv:
+^^^^^^^^^^^^^^^^
 
 The only states that are valid are KVM_MP_STATE_STOPPED and
 KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.

@@ -1780,14 +1780,14 @@ The flags bitmap is defined as::
 ------------------------
 
 :Capability: KVM_CAP_IRQ_ROUTING
-:Architectures: x86 s390 arm arm64
+:Architectures: x86 s390 arm64
 :Type: vm ioctl
 :Parameters: struct kvm_irq_routing (in)
 :Returns: 0 on success, -1 on error
 
 Sets the GSI routing table entries, overwriting any previously set entries.
 
-On arm/arm64, GSI routing has the following limitation:
+On arm64, GSI routing has the following limitation:
 
 - GSI routing does not apply to KVM_IRQ_LINE but only to KVM_IRQFD.
 

@@ -2855,7 +2855,7 @@ after pausing the vcpu, but before it is resumed.
 -------------------
 
 :Capability: KVM_CAP_SIGNAL_MSI
-:Architectures: x86 arm arm64
+:Architectures: x86 arm64
 :Type: vm ioctl
 :Parameters: struct kvm_msi (in)
 :Returns: >0 on delivery, 0 if guest blocked the MSI, and -1 on error

@@ -3043,7 +3043,7 @@ into the hash PTE second double word).
 --------------
 
 :Capability: KVM_CAP_IRQFD
-:Architectures: x86 s390 arm arm64
+:Architectures: x86 s390 arm64
 :Type: vm ioctl
 :Parameters: struct kvm_irqfd (in)
 :Returns: 0 on success, -1 on error

@@ -3069,7 +3069,7 @@ Note that closing the resamplefd is not sufficient to disable the
 irqfd. The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
 and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.
 
-On arm/arm64, gsi routing being supported, the following can happen:
+On arm64, gsi routing being supported, the following can happen:
 
 - in case no routing entry is associated to this gsi, injection fails
 - in case the gsi is associated to an irqchip routing entry,

@@ -3325,7 +3325,7 @@ current state. "addr" is ignored.
 ----------------------
 
 :Capability: basic
-:Architectures: arm, arm64
+:Architectures: arm64
 :Type: vcpu ioctl
 :Parameters: struct kvm_vcpu_init (in)
 :Returns: 0 on success; -1 on error

@@ -3423,7 +3423,7 @@ Possible features:
 -----------------------------
 
 :Capability: basic
-:Architectures: arm, arm64
+:Architectures: arm64
 :Type: vm ioctl
 :Parameters: struct kvm_vcpu_init (out)
 :Returns: 0 on success; -1 on error

@@ -3452,7 +3452,7 @@ VCPU matching underlying host.
 ---------------------
 
 :Capability: basic
-:Architectures: arm, arm64, mips
+:Architectures: arm64, mips
 :Type: vcpu ioctl
 :Parameters: struct kvm_reg_list (in/out)
 :Returns: 0 on success; -1 on error

@@ -3479,7 +3479,7 @@ KVM_GET_ONE_REG/KVM_SET_ONE_REG calls.
 -----------------------------------------
 
 :Capability: KVM_CAP_ARM_SET_DEVICE_ADDR
-:Architectures: arm, arm64
+:Architectures: arm64
 :Type: vm ioctl
 :Parameters: struct kvm_arm_device_address (in)
 :Returns: 0 on success, -1 on error

@@ -3506,13 +3506,13 @@ can access emulated or directly exposed devices, which the host kernel needs
 to know about. The id field is an architecture specific identifier for a
 specific device.
 
-ARM/arm64 divides the id field into two parts, a device id and an
+arm64 divides the id field into two parts, a device id and an
 address type id specific to the individual device::
 
   bits:  | 63 ... 32 | 31 ... 16 | 15 ... 0 |
   field: | 0x00000000 | device id | addr type id |
 
-ARM/arm64 currently only require this when using the in-kernel GIC
+arm64 currently only require this when using the in-kernel GIC
 support for the hardware VGIC features, using KVM_ARM_DEVICE_VGIC_V2
 as the device id. When setting the base address for the guest's
 mapping of the VGIC virtual CPU and distributor interface, the ioctl

@@ -3683,15 +3683,17 @@ The fields in each entry are defined as follows:
 4.89 KVM_S390_MEM_OP
 --------------------
 
-:Capability: KVM_CAP_S390_MEM_OP
+:Capability: KVM_CAP_S390_MEM_OP, KVM_CAP_S390_PROTECTED, KVM_CAP_S390_MEM_OP_EXTENSION
 :Architectures: s390
-:Type: vcpu ioctl
+:Type: vm ioctl, vcpu ioctl
 :Parameters: struct kvm_s390_mem_op (in)
 :Returns: = 0 on success,
           < 0 on generic error (e.g. -EFAULT or -ENOMEM),
           > 0 if an exception occurred while walking the page tables
 
-Read or write data from/to the logical (virtual) memory of a VCPU.
+Read or write data from/to the VM's memory.
+The KVM_CAP_S390_MEM_OP_EXTENSION capability specifies what functionality is
+supported.
 
 Parameters are specified via the following structure::
 
@@ -3701,33 +3703,99 @@ Parameters are specified via the following structure::
     __u32 size;		/* amount of bytes */
     __u32 op;		/* type of operation */
     __u64 buf;		/* buffer in userspace */
-    __u8 ar;		/* the access register number */
-    __u8 reserved[31];	/* should be set to 0 */
+    union {
+      struct {
+        __u8 ar;	/* the access register number */
+        __u8 key;	/* access key, ignored if flag unset */
+      };
+      __u32 sida_offset; /* offset into the sida */
+      __u8 reserved[32]; /* ignored */
+    };
   };
 
-The type of operation is specified in the "op" field. It is either
-KVM_S390_MEMOP_LOGICAL_READ for reading from logical memory space or
-KVM_S390_MEMOP_LOGICAL_WRITE for writing to logical memory space. The
-KVM_S390_MEMOP_F_CHECK_ONLY flag can be set in the "flags" field to check
-whether the corresponding memory access would create an access exception
-(without touching the data in the memory at the destination). In case an
-access exception occurred while walking the MMU tables of the guest, the
-ioctl returns a positive error number to indicate the type of exception.
-This exception is also raised directly at the corresponding VCPU if the
-flag KVM_S390_MEMOP_F_INJECT_EXCEPTION is set in the "flags" field.
-
 The start address of the memory region has to be specified in the "gaddr"
 field, and the length of the region in the "size" field (which must not
 be 0). The maximum value for "size" can be obtained by checking the
 KVM_CAP_S390_MEM_OP capability. "buf" is the buffer supplied by the
 userspace application where the read data should be written to for
-KVM_S390_MEMOP_LOGICAL_READ, or where the data that should be written is
-stored for a KVM_S390_MEMOP_LOGICAL_WRITE. When KVM_S390_MEMOP_F_CHECK_ONLY
-is specified, "buf" is unused and can be NULL. "ar" designates the access
-register number to be used; the valid range is 0..15.
+a read access, or where the data that should be written is stored for
+a write access. The "reserved" field is meant for future extensions.
+Reserved and unused values are ignored. Future extension that add members must
+introduce new flags.
 
-The "reserved" field is meant for future extensions. It is not used by
-KVM with the currently defined set of flags.
+The type of operation is specified in the "op" field. Flags modifying
+their behavior can be set in the "flags" field. Undefined flag bits must
+be set to 0.
+
+Possible operations are:
+  * ``KVM_S390_MEMOP_LOGICAL_READ``
+  * ``KVM_S390_MEMOP_LOGICAL_WRITE``
+  * ``KVM_S390_MEMOP_ABSOLUTE_READ``
+  * ``KVM_S390_MEMOP_ABSOLUTE_WRITE``
+  * ``KVM_S390_MEMOP_SIDA_READ``
+  * ``KVM_S390_MEMOP_SIDA_WRITE``
+
+Logical read/write:
+^^^^^^^^^^^^^^^^^^^
+
+Access logical memory, i.e. translate the given guest address to an absolute
+address given the state of the VCPU and use the absolute address as target of
+the access. "ar" designates the access register number to be used; the valid
+range is 0..15.
+Logical accesses are permitted for the VCPU ioctl only.
+Logical accesses are permitted for non-protected guests only.
+
+Supported flags:
+  * ``KVM_S390_MEMOP_F_CHECK_ONLY``
+  * ``KVM_S390_MEMOP_F_INJECT_EXCEPTION``
+  * ``KVM_S390_MEMOP_F_SKEY_PROTECTION``
+
+The KVM_S390_MEMOP_F_CHECK_ONLY flag can be set to check whether the
+corresponding memory access would cause an access exception; however,
+no actual access to the data in memory at the destination is performed.
+In this case, "buf" is unused and can be NULL.
+
+In case an access exception occurred during the access (or would occur
+in case of KVM_S390_MEMOP_F_CHECK_ONLY), the ioctl returns a positive
+error number indicating the type of exception. This exception is also
+raised directly at the corresponding VCPU if the flag
+KVM_S390_MEMOP_F_INJECT_EXCEPTION is set.
+
+If the KVM_S390_MEMOP_F_SKEY_PROTECTION flag is set, storage key
+protection is also in effect and may cause exceptions if accesses are
+prohibited given the access key designated by "key"; the valid range is 0..15.
+KVM_S390_MEMOP_F_SKEY_PROTECTION is available if KVM_CAP_S390_MEM_OP_EXTENSION
+is > 0.
+
+Absolute read/write:
+^^^^^^^^^^^^^^^^^^^^
+
+Access absolute memory. This operation is intended to be used with the
+KVM_S390_MEMOP_F_SKEY_PROTECTION flag, to allow accessing memory and performing
+the checks required for storage key protection as one operation (as opposed to
+user space getting the storage keys, performing the checks, and accessing
+memory thereafter, which could lead to a delay between check and access).
+Absolute accesses are permitted for the VM ioctl if KVM_CAP_S390_MEM_OP_EXTENSION
+is > 0.
+Currently absolute accesses are not permitted for VCPU ioctls.
+Absolute accesses are permitted for non-protected guests only.
+
+Supported flags:
+  * ``KVM_S390_MEMOP_F_CHECK_ONLY``
+  * ``KVM_S390_MEMOP_F_SKEY_PROTECTION``
+
+The semantics of the flags are as for logical accesses.
+
+SIDA read/write:
+^^^^^^^^^^^^^^^^
+
+Access the secure instruction data area which contains memory operands necessary
+for instruction emulation for protected guests.
+SIDA accesses are available if the KVM_CAP_S390_PROTECTED capability is available.
+SIDA accesses are permitted for the VCPU ioctl only.
+SIDA accesses are permitted for protected guests only.
+
+No flags are supported.
 
 4.90 KVM_S390_GET_SKEYS
 -----------------------

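For readers unfamiliar with the ioctl, a minimal userspace sketch of an absolute write with storage-key checking follows. It is not part of the patch: it assumes a <linux/kvm.h> that already carries the new union layout and flags, an existing VM file descriptor vm_fd, and KVM_CAP_S390_MEM_OP_EXTENSION > 0.

/* Illustrative only: absolute write with storage-key checking via the
 * extended KVM_S390_MEM_OP. */
#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

static int memop_abs_write(int vm_fd, __u64 gaddr, void *buf, __u32 len, __u8 key)
{
	struct kvm_s390_mem_op op;

	memset(&op, 0, sizeof(op));
	op.gaddr = gaddr;                     /* guest absolute address */
	op.size  = len;
	op.buf   = (__u64)(unsigned long)buf; /* userspace buffer */
	op.op    = KVM_S390_MEMOP_ABSOLUTE_WRITE;
	op.flags = KVM_S390_MEMOP_F_SKEY_PROTECTION;
	op.key   = key;                       /* access key, 0..15 */

	/* > 0 means an access exception, < 0 a generic error, 0 success */
	return ioctl(vm_fd, KVM_S390_MEM_OP, &op);
}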
@@ -4726,7 +4794,7 @@ to I/O ports.
 ------------------------------------
 
 :Capability: KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
-:Architectures: x86, arm, arm64, mips
+:Architectures: x86, arm64, mips
 :Type: vm ioctl
 :Parameters: struct kvm_clear_dirty_log (in)
 :Returns: 0 on success, -1 on error

@@ -4838,7 +4906,7 @@ version has the following quirks:
 4.119 KVM_ARM_VCPU_FINALIZE
 ---------------------------
 
-:Architectures: arm, arm64
+:Architectures: arm64
 :Type: vcpu ioctl
 :Parameters: int feature (in)
 :Returns: 0 on success, -1 on error

@@ -5920,7 +5988,7 @@ should put the acknowledged interrupt vector into the 'epr' field.
 
 If exit_reason is KVM_EXIT_SYSTEM_EVENT then the vcpu has triggered
 a system-level event using some architecture specific mechanism (hypercall
-or some special instruction). In case of ARM/ARM64, this is triggered using
+or some special instruction). In case of ARM64, this is triggered using
 HVC instruction based PSCI call from the vcpu. The 'type' field describes
 the system-level event type. The 'flags' field describes architecture
 specific flags for the system-level event.

@@ -5939,6 +6007,11 @@ Valid values for 'type' are:
    to ignore the request, or to gather VM memory core dump and/or
    reset/shutdown of the VM.
 
+Valid flags are:
+
+ - KVM_SYSTEM_EVENT_RESET_FLAG_PSCI_RESET2 (arm64 only) -- the guest issued
+   a SYSTEM_RESET2 call according to v1.1 of the PSCI specification.
+
 ::
 
 		/* KVM_EXIT_IOAPIC_EOI */

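As an illustration of how a VMM might consume the new flag (again, not part of the patch itself), a run-loop fragment could look like the sketch below; the field names follow struct kvm_run's system_event member, and the flag name is assumed to be available from the updated <linux/kvm.h>.

/* Illustrative run-loop fragment for KVM_EXIT_SYSTEM_EVENT. */
#include <linux/kvm.h>
#include <stdio.h>

static void handle_system_event(struct kvm_run *run)
{
	switch (run->system_event.type) {
	case KVM_SYSTEM_EVENT_RESET:
		if (run->system_event.flags & KVM_SYSTEM_EVENT_RESET_FLAG_PSCI_RESET2)
			printf("guest requested PSCI v1.1 SYSTEM_RESET2\n");
		/* ...reset the VM... */
		break;
	case KVM_SYSTEM_EVENT_SHUTDOWN:
		/* ...power the VM off... */
		break;
	}
}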
@@ -6013,7 +6086,7 @@ in send_page or recv a buffer to recv_page).
 		__u64 fault_ipa;
 	} arm_nisv;
 
-Used on arm and arm64 systems. If a guest accesses memory not in a memslot,
+Used on arm64 systems. If a guest accesses memory not in a memslot,
 KVM will typically return to userspace and ask it to do MMIO emulation on its
 behalf. However, for certain classes of instructions, no instruction decode
 (direction, length of memory access) is provided, and fetching and decoding

@@ -6030,11 +6103,10 @@ did not fall within an I/O window.
 Userspace implementations can query for KVM_CAP_ARM_NISV_TO_USER, and enable
 this capability at VM creation. Once this is done, these types of errors will
 instead return to userspace with KVM_EXIT_ARM_NISV, with the valid bits from
-the HSR (arm) and ESR_EL2 (arm64) in the esr_iss field, and the faulting IPA
-in the fault_ipa field. Userspace can either fix up the access if it's
-actually an I/O access by decoding the instruction from guest memory (if it's
-very brave) and continue executing the guest, or it can decide to suspend,
-dump, or restart the guest.
+the ESR_EL2 in the esr_iss field, and the faulting IPA in the fault_ipa field.
+Userspace can either fix up the access if it's actually an I/O access by
+decoding the instruction from guest memory (if it's very brave) and continue
+executing the guest, or it can decide to suspend, dump, or restart the guest.
 
 Note that KVM does not skip the faulting instruction as it does for
 KVM_EXIT_MMIO, but userspace has to emulate any change to the processing state

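A hedged sketch of the userspace side of this mechanism is shown below: opting in to KVM_CAP_ARM_NISV_TO_USER at VM creation time and reporting the exit payload. It only assumes the struct kvm_run.arm_nisv fields described above; error handling and actual emulation are left out.

/* Illustrative only: enable NISV reporting and dump the exit payload. */
#include <linux/kvm.h>
#include <stdio.h>
#include <sys/ioctl.h>

static int enable_nisv_to_user(int vm_fd)
{
	struct kvm_enable_cap cap = { .cap = KVM_CAP_ARM_NISV_TO_USER };

	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}

static void report_nisv(struct kvm_run *run)
{
	if (run->exit_reason == KVM_EXIT_ARM_NISV)
		fprintf(stderr, "NISV abort: esr_iss=0x%llx fault_ipa=0x%llx\n",
			(unsigned long long)run->arm_nisv.esr_iss,
			(unsigned long long)run->arm_nisv.fault_ipa);
}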
@@ -6741,7 +6813,7 @@ and injected exceptions.
 
 7.18 KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
 
-:Architectures: x86, arm, arm64, mips
+:Architectures: x86, arm64, mips
 :Parameters: args[0] whether feature should be enabled or not
 
 Valid flags are::

@@ -7011,6 +7083,56 @@ resource that is controlled with the H_SET_MODE hypercall.
 This capability allows a guest kernel to use a better-performance mode for
 handling interrupts and system calls.
 
+7.31 KVM_CAP_DISABLE_QUIRKS2
+----------------------------
+
+:Capability: KVM_CAP_DISABLE_QUIRKS2
+:Parameters: args[0] - set of KVM quirks to disable
+:Architectures: x86
+:Type: vm
+
+This capability, if enabled, will cause KVM to disable some behavior
+quirks.
+
+Calling KVM_CHECK_EXTENSION for this capability returns a bitmask of
+quirks that can be disabled in KVM.
+
+The argument to KVM_ENABLE_CAP for this capability is a bitmask of
+quirks to disable, and must be a subset of the bitmask returned by
+KVM_CHECK_EXTENSION.
+
+The valid bits in cap.args[0] are:
+
+=================================== ============================================
+ KVM_X86_QUIRK_LINT0_REENABLED      By default, the reset value for the LVT
+                                    LINT0 register is 0x700 (APIC_MODE_EXTINT).
+                                    When this quirk is disabled, the reset value
+                                    is 0x10000 (APIC_LVT_MASKED).
+
+ KVM_X86_QUIRK_CD_NW_CLEARED        By default, KVM clears CR0.CD and CR0.NW.
+                                    When this quirk is disabled, KVM does not
+                                    change the value of CR0.CD and CR0.NW.
+
+ KVM_X86_QUIRK_LAPIC_MMIO_HOLE      By default, the MMIO LAPIC interface is
+                                    available even when configured for x2APIC
+                                    mode. When this quirk is disabled, KVM
+                                    disables the MMIO LAPIC interface if the
+                                    LAPIC is in x2APIC mode.
+
+ KVM_X86_QUIRK_OUT_7E_INC_RIP       By default, KVM pre-increments %rip before
+                                    exiting to userspace for an OUT instruction
+                                    to port 0x7e. When this quirk is disabled,
+                                    KVM does not pre-increment %rip before
+                                    exiting to userspace.
+
+ KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT When this quirk is disabled, KVM sets
+                                    CPUID.01H:ECX[bit 3] (MONITOR/MWAIT) if
+                                    IA32_MISC_ENABLE[bit 18] (MWAIT) is set.
+                                    Additionally, when this quirk is disabled,
+                                    KVM clears CPUID.01H:ECX[bit 3] if
+                                    IA32_MISC_ENABLE[bit 18] is cleared.
+=================================== ============================================
+
 8. Other capabilities.
 ======================
 

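To make the enable-cap flow concrete, here is a minimal sketch (not from the patch) of querying the disable-able quirk mask and then disabling one quirk on a VM; it assumes an x86 <linux/kvm.h> that exposes the KVM_X86_QUIRK_* bits and must run before any vCPU is created.

/* Illustrative only: disable the LAPIC MMIO hole quirk on a VM. */
#include <linux/kvm.h>
#include <sys/ioctl.h>

static int disable_lapic_mmio_hole_quirk(int vm_fd)
{
	struct kvm_enable_cap cap = { .cap = KVM_CAP_DISABLE_QUIRKS2 };
	int mask;

	mask = ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_DISABLE_QUIRKS2);
	if (mask <= 0 || !(mask & KVM_X86_QUIRK_LAPIC_MMIO_HOLE))
		return -1;	/* this quirk cannot be disabled here */

	cap.args[0] = KVM_X86_QUIRK_LAPIC_MMIO_HOLE;
	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}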
@@ -7138,7 +7260,7 @@ reserved.
 8.9 KVM_CAP_ARM_USER_IRQ
 ------------------------
 
-:Architectures: arm, arm64
+:Architectures: arm64
 
 This capability, if KVM_CHECK_EXTENSION indicates that it is available, means
 that if userspace creates a VM without an in-kernel interrupt controller, it

@@ -7265,7 +7387,7 @@ HvFlushVirtualAddressList, HvFlushVirtualAddressListEx.
 8.19 KVM_CAP_ARM_INJECT_SERROR_ESR
 ----------------------------------
 
-:Architectures: arm, arm64
+:Architectures: arm64
 
 This capability indicates that userspace can specify (via the
 KVM_SET_VCPU_EVENTS ioctl) the syndrome value reported to the guest when it

@@ -7575,3 +7697,25 @@ The argument to KVM_ENABLE_CAP is also a bitmask, and must be a subset
 of the result of KVM_CHECK_EXTENSION. KVM will forward to userspace
 the hypercalls whose corresponding bit is in the argument, and return
 ENOSYS for the others.
+
+8.35 KVM_CAP_PMU_CAPABILITY
+---------------------------
+
+:Capability KVM_CAP_PMU_CAPABILITY
+:Architectures: x86
+:Type: vm
+:Parameters: arg[0] is bitmask of PMU virtualization capabilities.
+:Returns 0 on success, -EINVAL when arg[0] contains invalid bits
+
+This capability alters PMU virtualization in KVM.
+
+Calling KVM_CHECK_EXTENSION for this capability returns a bitmask of
+PMU virtualization capabilities that can be adjusted on a VM.
+
+The argument to KVM_ENABLE_CAP is also a bitmask and selects specific
+PMU virtualization capabilities to be applied to the VM. This can
+only be invoked on a VM prior to the creation of VCPUs.
+
+At this time, KVM_PMU_CAP_DISABLE is the only capability. Setting
+this capability will disable PMU virtualization for that VM. Usermode
+should adjust CPUID leaf 0xA to reflect that the PMU is disabled.

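A hedged usage sketch of this capability (not part of the patch) is shown below: it checks that KVM_PMU_CAP_DISABLE is offered and then disables PMU virtualization on a freshly created VM, before the first KVM_CREATE_VCPU.

/* Illustrative only: create a VM without a virtual PMU. */
#include <linux/kvm.h>
#include <sys/ioctl.h>

static int disable_pmu_virtualization(int vm_fd)
{
	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_PMU_CAPABILITY,
		.args[0] = KVM_PMU_CAP_DISABLE,
	};
	int mask;

	mask = ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_PMU_CAPABILITY);
	if (mask <= 0 || !(mask & KVM_PMU_CAP_DISABLE))
		return -1;	/* capability not adjustable on this kernel */

	/* Must be called before the first KVM_CREATE_VCPU. */
	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}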
@@ -70,7 +70,7 @@ irqchip.
          -ENODEV    PMUv3 not supported or GIC not initialized
          -ENXIO     PMUv3 not properly configured or in-kernel irqchip not
                     configured as required prior to calling this attribute
-         -EBUSY     PMUv3 already initialized
+         -EBUSY     PMUv3 already initialized or a VCPU has already run
          -EINVAL    Invalid filter range
          =======    ======================================================
 

@@ -104,11 +104,43 @@ hardware event. Filtering event 0x1E (CHAIN) has no effect either, as it
 isn't strictly speaking an event. Filtering the cycle counter is possible
 using event 0x11 (CPU_CYCLES).
 
+1.4 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_SET_PMU
+------------------------------------------
+
+:Parameters: in kvm_device_attr.addr the address to an int representing the PMU
+             identifier.
+
+:Returns:
+
+	 =======  ====================================================
+	 -EBUSY   PMUv3 already initialized, a VCPU has already run or
+	          an event filter has already been set
+	 -EFAULT  Error accessing the PMU identifier
+	 -ENXIO   PMU not found
+	 -ENODEV  PMUv3 not supported or GIC not initialized
+	 -ENOMEM  Could not allocate memory
+	 =======  ====================================================
+
+Request that the VCPU uses the specified hardware PMU when creating guest events
+for the purpose of PMU emulation. The PMU identifier can be read from the "type"
+file for the desired PMU instance under /sys/devices (or, equivalent,
+/sys/bus/even_source). This attribute is particularly useful on heterogeneous
+systems where there are at least two CPU PMUs on the system. The PMU that is set
+for one VCPU will be used by all the other VCPUs. It isn't possible to set a PMU
+if a PMU event filter is already present.
+
+Note that KVM will not make any attempts to run the VCPU on the physical CPUs
+associated with the PMU specified by this attribute. This is entirely left to
+userspace. However, attempting to run the VCPU on a physical CPU not supported
+by the PMU will fail and KVM_RUN will return with
+exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct by setting
+hardare_entry_failure_reason field to KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED and
+the cpu field to the processor id.
+
 2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
 =================================
 
-:Architectures: ARM, ARM64
+:Architectures: ARM64
 
 2.1. ATTRIBUTES: KVM_ARM_VCPU_TIMER_IRQ_VTIMER, KVM_ARM_VCPU_TIMER_IRQ_PTIMER
 -----------------------------------------------------------------------------

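For orientation, a minimal sketch of setting this attribute from userspace follows; it is not from the patch and assumes the arm64 <linux/kvm.h> definitions above. Reading the PMU identifier (the "type" value of the chosen PMU's sysfs node) is left to the caller.

/* Illustrative only: bind a vCPU to a specific host PMU instance. */
#include <linux/kvm.h>
#include <sys/ioctl.h>

static int vcpu_set_pmu(int vcpu_fd, int pmu_id)
{
	struct kvm_device_attr attr = {
		.group = KVM_ARM_VCPU_PMU_V3_CTRL,
		.attr  = KVM_ARM_VCPU_PMU_V3_SET_PMU,
		.addr  = (__u64)(unsigned long)&pmu_id,
	};

	/* Must happen before the PMU is initialized and before KVM_RUN. */
	return ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);
}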
@@ -112,11 +112,10 @@ KVM_REQ_TLB_FLUSH
   choose to use the common kvm_flush_remote_tlbs() implementation will
   need to handle this VCPU request.
 
-KVM_REQ_MMU_RELOAD
+KVM_REQ_VM_DEAD
 
-  When shadow page tables are used and memory slots are removed it's
-  necessary to inform each VCPU to completely refresh the tables. This
-  request is used for that.
+  This request informs all VCPUs that the VM is dead and unusable, e.g. due to
+  fatal error or because the VM's state has been intentionally destroyed.
 
 KVM_REQ_UNBLOCK
 

@@ -10598,8 +10598,8 @@ F:	arch/riscv/kvm/
 KERNEL VIRTUAL MACHINE for s390 (KVM/s390)
 M:	Christian Borntraeger <borntraeger@linux.ibm.com>
 M:	Janosch Frank <frankja@linux.ibm.com>
+M:	Claudio Imbrenda <imbrenda@linux.ibm.com>
 R:	David Hildenbrand <david@redhat.com>
-R:	Claudio Imbrenda <imbrenda@linux.ibm.com>
 L:	kvm@vger.kernel.org
 S:	Supported
 W:	http://www.ibm.com/developerworks/linux/linux390/

@@ -686,6 +686,7 @@ config ARM64_ERRATUM_2051678
 
 config ARM64_ERRATUM_2077057
 	bool "Cortex-A510: 2077057: workaround software-step corrupting SPSR_EL2"
+	default y
 	help
 	  This option adds the workaround for ARM Cortex-A510 erratum 2077057.
 	  Affected Cortex-A510 may corrupt SPSR_EL2 when the a step exception is

@@ -50,6 +50,8 @@
 #define KVM_DIRTY_LOG_MANUAL_CAPS   (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \
 				     KVM_DIRTY_LOG_INITIALLY_SET)
 
+#define KVM_HAVE_MMU_RWLOCK
+
 /*
  * Mode of operation configurable with kvm-arm.mode early param.
  * See Documentation/admin-guide/kernel-parameters.txt for more information.

@@ -71,9 +73,7 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
 void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu);
 
 struct kvm_vmid {
-	/* The VMID generation used for the virt. memory system */
-	u64    vmid_gen;
-	u32    vmid;
+	atomic64_t id;
 };
 
 struct kvm_s2_mmu {

@@ -122,20 +122,24 @@ struct kvm_arch {
 	 * should) opt in to this feature if KVM_CAP_ARM_NISV_TO_USER is
 	 * supported.
 	 */
-	bool return_nisv_io_abort_to_user;
+#define KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER	0
+	/* Memory Tagging Extension enabled for the guest */
+#define KVM_ARCH_FLAG_MTE_ENABLED			1
+	/* At least one vCPU has ran in the VM */
+#define KVM_ARCH_FLAG_HAS_RAN_ONCE			2
+	unsigned long flags;
 
 	/*
 	 * VM-wide PMU filter, implemented as a bitmap and big enough for
 	 * up to 2^10 events (ARMv8.0) or 2^16 events (ARMv8.1+).
 	 */
 	unsigned long *pmu_filter;
-	unsigned int pmuver;
+	struct arm_pmu *arm_pmu;
+
+	cpumask_var_t supported_cpus;
 
 	u8 pfr0_csv2;
 	u8 pfr0_csv3;
-
-	/* Memory Tagging Extension enabled for the guest */
-	bool mte_enabled;
 };
 
 struct kvm_vcpu_fault_info {

@@ -171,6 +175,7 @@ enum vcpu_sysreg {
 	PAR_EL1,	/* Physical Address Register */
 	MDSCR_EL1,	/* Monitor Debug System Control Register */
 	MDCCINT_EL1,	/* Monitor Debug Comms Channel Interrupt Enable Reg */
+	OSLSR_EL1,	/* OS Lock Status Register */
 	DISR_EL1,	/* Deferred Interrupt Status Register */
 
 	/* Performance Monitors Registers */

@@ -435,6 +440,7 @@ struct kvm_vcpu_arch {
 #define KVM_ARM64_DEBUG_STATE_SAVE_SPE	(1 << 12) /* Save SPE context if active */
 #define KVM_ARM64_DEBUG_STATE_SAVE_TRBE	(1 << 13) /* Save TRBE context if active */
 #define KVM_ARM64_FP_FOREIGN_FPSTATE	(1 << 14)
+#define KVM_ARM64_ON_UNSUPPORTED_CPU	(1 << 15) /* Physical CPU not in supported_cpus */
 
 #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
 				 KVM_GUESTDBG_USE_SW_BP | \

@@ -453,6 +459,15 @@ struct kvm_vcpu_arch {
 #define vcpu_has_ptrauth(vcpu)		false
 #endif
 
+#define vcpu_on_unsupported_cpu(vcpu)					\
+	((vcpu)->arch.flags & KVM_ARM64_ON_UNSUPPORTED_CPU)
+
+#define vcpu_set_on_unsupported_cpu(vcpu)				\
+	((vcpu)->arch.flags |= KVM_ARM64_ON_UNSUPPORTED_CPU)
+
+#define vcpu_clear_on_unsupported_cpu(vcpu)				\
+	((vcpu)->arch.flags &= ~KVM_ARM64_ON_UNSUPPORTED_CPU)
+
 #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.regs)
 
 /*

@@ -692,6 +707,12 @@ int kvm_arm_pvtime_get_attr(struct kvm_vcpu *vcpu,
 int kvm_arm_pvtime_has_attr(struct kvm_vcpu *vcpu,
 			    struct kvm_device_attr *attr);
 
+extern unsigned int kvm_arm_vmid_bits;
+int kvm_arm_vmid_alloc_init(void);
+void kvm_arm_vmid_alloc_free(void);
+void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid);
+void kvm_arm_vmid_clear_active(void);
+
 static inline void kvm_arm_pvtime_vcpu_init(struct kvm_vcpu_arch *vcpu_arch)
 {
 	vcpu_arch->steal.base = GPA_INVALID;

@@ -730,6 +751,10 @@ void kvm_arm_vcpu_init_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
+
+#define kvm_vcpu_os_lock_enabled(vcpu)		\
+	(!!(__vcpu_sys_reg(vcpu, OSLSR_EL1) & SYS_OSLSR_OSLK))
+
 int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
 			       struct kvm_device_attr *attr);
 int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,

@@ -791,7 +816,9 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
 #define kvm_arm_vcpu_sve_finalized(vcpu) \
 	((vcpu)->arch.flags & KVM_ARM64_VCPU_SVE_FINALIZED)
 
-#define kvm_has_mte(kvm) (system_supports_mte() && (kvm)->arch.mte_enabled)
+#define kvm_has_mte(kvm)					\
+	(system_supports_mte() &&				\
+	 test_bit(KVM_ARCH_FLAG_MTE_ENABLED, &(kvm)->arch.flags))
 #define kvm_vcpu_has_pmu(vcpu)				\
 	(test_bit(KVM_ARM_VCPU_PMU_V3, (vcpu)->arch.features))
 

@@ -115,6 +115,7 @@ alternative_cb_end
 #include <asm/cache.h>
 #include <asm/cacheflush.h>
 #include <asm/mmu_context.h>
+#include <asm/kvm_host.h>

 void kvm_update_va_mask(struct alt_instr *alt,
 			__le32 *origptr, __le32 *updptr, int nr_inst);
@@ -266,7 +267,8 @@ static __always_inline u64 kvm_get_vttbr(struct kvm_s2_mmu *mmu)
 	u64 cnp = system_supports_cnp() ? VTTBR_CNP_BIT : 0;

 	baddr = mmu->pgd_phys;
-	vmid_field = (u64)READ_ONCE(vmid->vmid) << VTTBR_VMID_SHIFT;
+	vmid_field = atomic64_read(&vmid->id) << VTTBR_VMID_SHIFT;
+	vmid_field &= VTTBR_VMID_MASK(kvm_arm_vmid_bits);
 	return kvm_phys_to_vttbr(baddr) | vmid_field | cnp;
 }

@@ -128,8 +128,16 @@
 #define SYS_DBGWVRn_EL1(n)		sys_reg(2, 0, 0, n, 6)
 #define SYS_DBGWCRn_EL1(n)		sys_reg(2, 0, 0, n, 7)
 #define SYS_MDRAR_EL1			sys_reg(2, 0, 1, 0, 0)
+
 #define SYS_OSLAR_EL1			sys_reg(2, 0, 1, 0, 4)
+#define SYS_OSLAR_OSLK			BIT(0)
+
 #define SYS_OSLSR_EL1			sys_reg(2, 0, 1, 1, 4)
+#define SYS_OSLSR_OSLM_MASK		(BIT(3) | BIT(0))
+#define SYS_OSLSR_OSLM_NI		0
+#define SYS_OSLSR_OSLM_IMPLEMENTED	BIT(3)
+#define SYS_OSLSR_OSLK			BIT(1)
+
 #define SYS_OSDLR_EL1			sys_reg(2, 0, 1, 3, 4)
 #define SYS_DBGPRCR_EL1			sys_reg(2, 0, 1, 4, 4)
 #define SYS_DBGCLAIMSET_EL1		sys_reg(2, 0, 7, 8, 6)

@@ -367,6 +367,7 @@ struct kvm_arm_copy_mte_tags {
 #define KVM_ARM_VCPU_PMU_V3_IRQ		0
 #define KVM_ARM_VCPU_PMU_V3_INIT	1
 #define KVM_ARM_VCPU_PMU_V3_FILTER	2
+#define KVM_ARM_VCPU_PMU_V3_SET_PMU	3
 #define KVM_ARM_VCPU_TIMER_CTRL		1
 #define KVM_ARM_VCPU_TIMER_IRQ_VTIMER	0
 #define KVM_ARM_VCPU_TIMER_IRQ_PTIMER	1
@@ -418,6 +419,16 @@ struct kvm_arm_copy_mte_tags {
 #define KVM_PSCI_RET_INVAL		PSCI_RET_INVALID_PARAMS
 #define KVM_PSCI_RET_DENIED		PSCI_RET_DENIED

+/* arm64-specific kvm_run::system_event flags */
+/*
+ * Reset caused by a PSCI v1.1 SYSTEM_RESET2 call.
+ * Valid only when the system event has a type of KVM_SYSTEM_EVENT_RESET.
+ */
+#define KVM_SYSTEM_EVENT_RESET_FLAG_PSCI_RESET2	(1ULL << 0)
+
+/* run->fail_entry.hardware_entry_failure_reason codes. */
+#define KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED	(1ULL << 0)
+
 #endif

 #endif /* __ARM_KVM_H__ */

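The new KVM_ARM_VCPU_PMU_V3_SET_PMU attribute is driven through the existing vcpu device-attribute interface. As a rough, hypothetical userspace sketch (the helper name, the file-descriptor variables and the way pmu_type is obtained are illustrative only, not part of this series):

#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical VMM helper: pin a vCPU's emulated PMU to one host PMU on a
 * heterogeneous system. 'pmu_type' would typically be read from
 * /sys/bus/event_source/devices/<pmu>/type; 'vcpu_fd' is an open vCPU fd.
 */
static int vcpu_set_pmu(int vcpu_fd, int pmu_type)
{
	struct kvm_device_attr attr = {
		.group	= KVM_ARM_VCPU_PMU_V3_CTRL,
		.attr	= KVM_ARM_VCPU_PMU_V3_SET_PMU,
		.addr	= (__u64)(unsigned long)&pmu_type,
	};

	/*
	 * Per the kernel-side checks further down in this series, this has
	 * to happen before the vCPU has run and before a conflicting event
	 * filter is installed, otherwise the ioctl fails with -EBUSY.
	 */
	return ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);
}
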
@@ -348,7 +348,13 @@ static void task_fpsimd_load(void)

 /*
  * Ensure FPSIMD/SVE storage in memory for the loaded context is up to
- * date with respect to the CPU registers.
+ * date with respect to the CPU registers. Note carefully that the
+ * current context is the context last bound to the CPU stored in
+ * last, if KVM is involved this may be the guest VM context rather
+ * than the host thread for the VM pointed to by current. This means
+ * that we must always reference the state storage via last rather
+ * than via current, other than the TIF_ flags which KVM will
+ * carefully maintain for us.
  */
 static void fpsimd_save(void)
 {

@@ -83,6 +83,9 @@ KVM_NVHE_ALIAS(__hyp_stub_vectors);
 /* Kernel symbol used by icache_is_vpipt(). */
 KVM_NVHE_ALIAS(__icache_flags);

+/* VMID bits set by the KVM VMID allocator */
+KVM_NVHE_ALIAS(kvm_arm_vmid_bits);
+
 /* Kernel symbols needed for cpus_have_final/const_caps checks. */
 KVM_NVHE_ALIAS(arm64_const_caps_ready);
 KVM_NVHE_ALIAS(cpu_hwcap_keys);

@@ -14,7 +14,7 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
	 inject_fault.o va_layout.o handle_exit.o \
	 guest.o debug.o reset.o sys_regs.o \
	 vgic-sys-reg-v3.o fpsimd.o pmu.o pkvm.o \
-	 arch_timer.o trng.o\
+	 arch_timer.o trng.o vmid.o \
	 vgic/vgic.o vgic/vgic-init.o \
	 vgic/vgic-irqfd.o vgic/vgic-v2.o \
	 vgic/vgic-v3.o vgic/vgic-v4.o \

@@ -53,11 +53,6 @@ static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
 unsigned long kvm_arm_hyp_percpu_base[NR_CPUS];
 DECLARE_KVM_NVHE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);

-/* The VMID used in the VTTBR */
-static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
-static u32 kvm_next_vmid;
-static DEFINE_SPINLOCK(kvm_vmid_lock);
-
 static bool vgic_present;

 static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
@@ -89,7 +84,8 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
	switch (cap->cap) {
	case KVM_CAP_ARM_NISV_TO_USER:
		r = 0;
-		kvm->arch.return_nisv_io_abort_to_user = true;
+		set_bit(KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER,
+			&kvm->arch.flags);
		break;
	case KVM_CAP_ARM_MTE:
		mutex_lock(&kvm->lock);
@@ -97,7 +93,7 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
			r = -EINVAL;
		} else {
			r = 0;
-			kvm->arch.mte_enabled = true;
+			set_bit(KVM_ARCH_FLAG_MTE_ENABLED, &kvm->arch.flags);
		}
		mutex_unlock(&kvm->lock);
		break;
@@ -150,6 +146,10 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
	if (ret)
		goto out_free_stage2_pgd;

+	if (!zalloc_cpumask_var(&kvm->arch.supported_cpus, GFP_KERNEL))
+		goto out_free_stage2_pgd;
+	cpumask_copy(kvm->arch.supported_cpus, cpu_possible_mask);
+
	kvm_vgic_early_init(kvm);

	/* The maximum number of VCPUs is limited by the host's GIC model */
@@ -176,6 +176,7 @@ vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
	bitmap_free(kvm->arch.pmu_filter);
+	free_cpumask_var(kvm->arch.supported_cpus);

	kvm_vgic_destroy(kvm);

@@ -411,6 +412,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
	if (vcpu_has_ptrauth(vcpu))
		vcpu_ptrauth_disable(vcpu);
	kvm_arch_vcpu_load_debug_state_flags(vcpu);
+
+	if (!cpumask_test_cpu(smp_processor_id(), vcpu->kvm->arch.supported_cpus))
+		vcpu_set_on_unsupported_cpu(vcpu);
 }

 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
@@ -422,7 +426,9 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
	kvm_timer_vcpu_put(vcpu);
	kvm_vgic_put(vcpu);
	kvm_vcpu_pmu_restore_host(vcpu);
+	kvm_arm_vmid_clear_active();

+	vcpu_clear_on_unsupported_cpu(vcpu);
	vcpu->cpu = -1;
 }

@@ -489,87 +495,6 @@ unsigned long kvm_arch_vcpu_get_ip(struct kvm_vcpu *vcpu)
 }
 #endif

-/* Just ensure a guest exit from a particular CPU */
-static void exit_vm_noop(void *info)
-{
-}
-
-void force_vm_exit(const cpumask_t *mask)
-{
-	preempt_disable();
-	smp_call_function_many(mask, exit_vm_noop, NULL, true);
-	preempt_enable();
-}
-
-/**
- * need_new_vmid_gen - check that the VMID is still valid
- * @vmid: The VMID to check
- *
- * return true if there is a new generation of VMIDs being used
- *
- * The hardware supports a limited set of values with the value zero reserved
- * for the host, so we check if an assigned value belongs to a previous
- * generation, which requires us to assign a new value. If we're the first to
- * use a VMID for the new generation, we must flush necessary caches and TLBs
- * on all CPUs.
- */
-static bool need_new_vmid_gen(struct kvm_vmid *vmid)
-{
-	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
-	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
-	return unlikely(READ_ONCE(vmid->vmid_gen) != current_vmid_gen);
-}
-
-/**
- * update_vmid - Update the vmid with a valid VMID for the current generation
- * @vmid: The stage-2 VMID information struct
- */
-static void update_vmid(struct kvm_vmid *vmid)
-{
-	if (!need_new_vmid_gen(vmid))
-		return;
-
-	spin_lock(&kvm_vmid_lock);
-
-	/*
-	 * We need to re-check the vmid_gen here to ensure that if another vcpu
-	 * already allocated a valid vmid for this vm, then this vcpu should
-	 * use the same vmid.
-	 */
-	if (!need_new_vmid_gen(vmid)) {
-		spin_unlock(&kvm_vmid_lock);
-		return;
-	}
-
-	/* First user of a new VMID generation? */
-	if (unlikely(kvm_next_vmid == 0)) {
-		atomic64_inc(&kvm_vmid_gen);
-		kvm_next_vmid = 1;
-
-		/*
-		 * On SMP we know no other CPUs can use this CPU's or each
-		 * other's VMID after force_vm_exit returns since the
-		 * kvm_vmid_lock blocks them from reentry to the guest.
-		 */
-		force_vm_exit(cpu_all_mask);
-		/*
-		 * Now broadcast TLB + ICACHE invalidation over the inner
-		 * shareable domain to make sure all data structures are
-		 * clean.
-		 */
-		kvm_call_hyp(__kvm_flush_vm_context);
-	}
-
-	WRITE_ONCE(vmid->vmid, kvm_next_vmid);
-	kvm_next_vmid++;
-	kvm_next_vmid &= (1 << kvm_get_vmid_bits()) - 1;
-
-	smp_wmb();
-	WRITE_ONCE(vmid->vmid_gen, atomic64_read(&kvm_vmid_gen));
-
-	spin_unlock(&kvm_vmid_lock);
-}
-
 static int kvm_vcpu_initialized(struct kvm_vcpu *vcpu)
 {
	return vcpu->arch.target >= 0;
@@ -634,6 +559,10 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
	if (kvm_vm_is_protected(kvm))
		kvm_call_hyp_nvhe(__pkvm_vcpu_init_traps, vcpu);

+	mutex_lock(&kvm->lock);
+	set_bit(KVM_ARCH_FLAG_HAS_RAN_ONCE, &kvm->arch.flags);
+	mutex_unlock(&kvm->lock);
+
	return ret;
 }

@@ -792,8 +721,15 @@ static bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu, int *ret)
		}
	}

+	if (unlikely(vcpu_on_unsupported_cpu(vcpu))) {
+		run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+		run->fail_entry.hardware_entry_failure_reason = KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED;
+		run->fail_entry.cpu = smp_processor_id();
+		*ret = 0;
+		return true;
+	}
+
	return kvm_request_pending(vcpu) ||
-		need_new_vmid_gen(&vcpu->arch.hw_mmu->vmid) ||
		xfer_to_guest_mode_work_pending();
 }

@@ -855,8 +791,6 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
		if (!ret)
			ret = 1;

-		update_vmid(&vcpu->arch.hw_mmu->vmid);
-
		check_vcpu_requests(vcpu);

		/*
@@ -866,6 +800,15 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
		 */
		preempt_disable();

+		/*
+		 * The VMID allocator only tracks active VMIDs per
+		 * physical CPU, and therefore the VMID allocated may not be
+		 * preserved on VMID roll-over if the task was preempted,
+		 * making a thread's VMID inactive. So we need to call
+		 * kvm_arm_vmid_update() in non-premptible context.
+		 */
+		kvm_arm_vmid_update(&vcpu->arch.hw_mmu->vmid);
+
		kvm_pmu_flush_hwstate(vcpu);

		local_irq_disable();
@@ -945,9 +888,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
		 * context synchronization event) is necessary to ensure that
		 * pending interrupts are taken.
		 */
-		local_irq_enable();
-		isb();
-		local_irq_disable();
+		if (ARM_EXCEPTION_CODE(ret) == ARM_EXCEPTION_IRQ) {
+			local_irq_enable();
+			isb();
+			local_irq_disable();
+		}

		guest_timing_exit_irqoff();

@@ -1742,7 +1687,7 @@ static void init_cpu_logical_map(void)

	/*
	 * Copy the MPIDR <-> logical CPU ID mapping to hyp.
-	 * Only copy the set of online CPUs whose features have been chacked
+	 * Only copy the set of online CPUs whose features have been checked
	 * against the finalized system capabilities. The hypervisor will not
	 * allow any other CPUs from the `possible` set to boot.
	 */
@@ -2159,6 +2104,12 @@ int kvm_arch_init(void *opaque)
	if (err)
		return err;

+	err = kvm_arm_vmid_alloc_init();
+	if (err) {
+		kvm_err("Failed to initialize VMID allocator.\n");
+		return err;
+	}
+
	if (!in_hyp_mode) {
		err = init_hyp_mode();
		if (err)
@@ -2198,6 +2149,7 @@ int kvm_arch_init(void *opaque)
	if (!in_hyp_mode)
		teardown_hyp_mode();
 out_err:
+	kvm_arm_vmid_alloc_free();
	return err;
 }

@@ -105,9 +105,11 @@ static void kvm_arm_setup_mdcr_el2(struct kvm_vcpu *vcpu)
	 * - Userspace is using the hardware to debug the guest
	 *   (KVM_GUESTDBG_USE_HW is set).
	 * - The guest is not using debug (KVM_ARM64_DEBUG_DIRTY is clear).
+	 * - The guest has enabled the OS Lock (debug exceptions are blocked).
	 */
	if ((vcpu->guest_debug & KVM_GUESTDBG_USE_HW) ||
-	    !(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY))
+	    !(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY) ||
+	    kvm_vcpu_os_lock_enabled(vcpu))
		vcpu->arch.mdcr_el2 |= MDCR_EL2_TDA;

	trace_kvm_arm_set_dreg32("MDCR_EL2", vcpu->arch.mdcr_el2);
@@ -160,8 +162,8 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)

	kvm_arm_setup_mdcr_el2(vcpu);

-	/* Is Guest debugging in effect? */
-	if (vcpu->guest_debug) {
+	/* Check if we need to use the debug registers. */
+	if (vcpu->guest_debug || kvm_vcpu_os_lock_enabled(vcpu)) {
		/* Save guest debug state */
		save_guest_debug_regs(vcpu);

@@ -223,6 +225,19 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
			trace_kvm_arm_set_regset("WAPTS", get_num_wrps(),
						&vcpu->arch.debug_ptr->dbg_wcr[0],
						&vcpu->arch.debug_ptr->dbg_wvr[0]);
+
+	/*
+	 * The OS Lock blocks debug exceptions in all ELs when it is
+	 * enabled. If the guest has enabled the OS Lock, constrain its
+	 * effects to the guest. Emulate the behavior by clearing
+	 * MDSCR_EL1.MDE. In so doing, we ensure that host debug
+	 * exceptions are unaffected by guest configuration of the OS
+	 * Lock.
+	 */
+	} else if (kvm_vcpu_os_lock_enabled(vcpu)) {
+		mdscr = vcpu_read_sys_reg(vcpu, MDSCR_EL1);
+		mdscr &= ~DBG_MDSCR_MDE;
+		vcpu_write_sys_reg(vcpu, mdscr, MDSCR_EL1);
	}
 }

@@ -244,7 +259,10 @@ void kvm_arm_clear_debug(struct kvm_vcpu *vcpu)
 {
	trace_kvm_arm_clear_debug(vcpu->guest_debug);

-	if (vcpu->guest_debug) {
+	/*
+	 * Restore the guest's debug registers if we were using them.
+	 */
+	if (vcpu->guest_debug || kvm_vcpu_os_lock_enabled(vcpu)) {
		restore_guest_debug_regs(vcpu);

		/*

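For illustration only, this is roughly what the emulated OS Lock looks like from inside a guest; the helper below is a hypothetical sketch, not code from this series:

/*
 * Hypothetical guest-side sketch: set the OS Lock and read it back.
 * Writes to OSLAR_EL1 are trapped and emulated by KVM (the sys_regs
 * handling is not shown in these hunks); once OSLSR_EL1.OSLK is set, the
 * debug.c changes above trap debug register accesses and clear
 * MDSCR_EL1.MDE so that the lock's effect stays confined to the guest.
 */
static inline unsigned long guest_set_os_lock(void)
{
	unsigned long oslsr;

	asm volatile("msr oslar_el1, %0" : : "r" (1UL));	/* OSLAR_EL1.OSLK */
	asm volatile("isb" : : : "memory");
	asm volatile("mrs %0, oslsr_el1" : "=r" (oslsr));

	return oslsr & (1UL << 1);	/* OSLSR_EL1.OSLK, i.e. SYS_OSLSR_OSLK */
}
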
@@ -84,6 +84,11 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
		vcpu->arch.flags |= KVM_ARM64_HOST_SVE_ENABLED;
 }

+/*
+ * Called just before entering the guest once we are no longer
+ * preemptable. Syncs the host's TIF_FOREIGN_FPSTATE with the KVM
+ * mirror of the flag used by the hypervisor.
+ */
 void kvm_arch_vcpu_ctxflush_fp(struct kvm_vcpu *vcpu)
 {
	if (test_thread_flag(TIF_FOREIGN_FPSTATE))
@@ -93,10 +98,11 @@ void kvm_arch_vcpu_ctxflush_fp(struct kvm_vcpu *vcpu)
 }

 /*
- * If the guest FPSIMD state was loaded, update the host's context
- * tracking data mark the CPU FPSIMD regs as dirty and belonging to vcpu
- * so that they will be written back if the kernel clobbers them due to
- * kernel-mode NEON before re-entry into the guest.
+ * Called just after exiting the guest. If the guest FPSIMD state
+ * was loaded, update the host's context tracking data mark the CPU
+ * FPSIMD regs as dirty and belonging to vcpu so that they will be
+ * written back if the kernel clobbers them due to kernel-mode NEON
+ * before re-entry into the guest.
 */
 void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
 {

@@ -282,7 +282,7 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
		break;

	/*
-	 * Otherwide, this is a priviledged mode, and *all* the
+	 * Otherwise, this is a privileged mode, and *all* the
	 * registers must be narrowed to 32bit.
	 */
	default:

@@ -248,7 +248,7 @@ int handle_exit(struct kvm_vcpu *vcpu, int exception_index)
	case ARM_EXCEPTION_HYP_GONE:
		/*
		 * EL2 has been reset to the hyp-stub. This happens when a guest
-		 * is pre-empted by kvm_reboot()'s shutdown call.
+		 * is pre-emptied by kvm_reboot()'s shutdown call.
		 */
		run->exit_reason = KVM_EXIT_FAIL_ENTRY;
		return 0;

@@ -173,6 +173,8 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
		return false;

	/* Valid trap. Switch the context: */
+
+	/* First disable enough traps to allow us to update the registers */
	if (has_vhe()) {
		reg = CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN;
		if (sve_guest)
@@ -188,11 +190,13 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
	}
	isb();

+	/* Write out the host state if it's in the registers */
	if (vcpu->arch.flags & KVM_ARM64_FP_HOST) {
		__fpsimd_save_state(vcpu->arch.host_fpsimd_state);
		vcpu->arch.flags &= ~KVM_ARM64_FP_HOST;
	}

+	/* Restore the guest state */
	if (sve_guest)
		__hyp_sve_restore_guest(vcpu);
	else

@@ -13,10 +13,11 @@ lib-objs := clear_page.o copy_page.o memcpy.o memset.o
 lib-objs := $(addprefix ../../../lib/, $(lib-objs))

 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
-	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
+	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o page_alloc.o \
	 cache.o setup.o mm.o mem_protect.o sys_regs.o pkvm.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
+obj-$(CONFIG_DEBUG_LIST) += list_debug.o
 obj-y += $(lib-objs)

 ##

@@ -0,0 +1,54 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2022 - Google LLC
+ * Author: Keir Fraser <keirf@google.com>
+ */
+
+#include <linux/list.h>
+#include <linux/bug.h>
+
+static inline __must_check bool nvhe_check_data_corruption(bool v)
+{
+	return v;
+}
+
+#define NVHE_CHECK_DATA_CORRUPTION(condition)				 \
+	nvhe_check_data_corruption(({					 \
+		bool corruption = unlikely(condition);			 \
+		if (corruption) {					 \
+			if (IS_ENABLED(CONFIG_BUG_ON_DATA_CORRUPTION)) { \
+				BUG_ON(1);				 \
+			} else						 \
+				WARN_ON(1);				 \
+		}							 \
+		corruption;						 \
+	}))
+
+/* The predicates checked here are taken from lib/list_debug.c. */
+
+bool __list_add_valid(struct list_head *new, struct list_head *prev,
+		      struct list_head *next)
+{
+	if (NVHE_CHECK_DATA_CORRUPTION(next->prev != prev) ||
+	    NVHE_CHECK_DATA_CORRUPTION(prev->next != next) ||
+	    NVHE_CHECK_DATA_CORRUPTION(new == prev || new == next))
+		return false;
+
+	return true;
+}
+
+bool __list_del_entry_valid(struct list_head *entry)
+{
+	struct list_head *prev, *next;
+
+	prev = entry->prev;
+	next = entry->next;
+
+	if (NVHE_CHECK_DATA_CORRUPTION(next == LIST_POISON1) ||
+	    NVHE_CHECK_DATA_CORRUPTION(prev == LIST_POISON2) ||
+	    NVHE_CHECK_DATA_CORRUPTION(prev->next != entry) ||
+	    NVHE_CHECK_DATA_CORRUPTION(next->prev != entry))
+		return false;
+
+	return true;
+}

@@ -138,8 +138,7 @@ int kvm_host_prepare_stage2(void *pgt_pool_base)

	mmu->pgd_phys = __hyp_pa(host_kvm.pgt.pgd);
	mmu->pgt = &host_kvm.pgt;
-	WRITE_ONCE(mmu->vmid.vmid_gen, 0);
-	WRITE_ONCE(mmu->vmid.vmid, 0);
+	atomic64_set(&mmu->vmid.id, 0);

	return 0;
 }

@@ -102,7 +102,7 @@ static void __hyp_attach_page(struct hyp_pool *pool,
	 * Only the first struct hyp_page of a high-order page (otherwise known
	 * as the 'head') should have p->order set. The non-head pages should
	 * have p->order = HYP_NO_ORDER. Here @p may no longer be the head
-	 * after coallescing, so make sure to mark it HYP_NO_ORDER proactively.
+	 * after coalescing, so make sure to mark it HYP_NO_ORDER proactively.
	 */
	p->order = HYP_NO_ORDER;
	for (; (order + 1) < pool->max_order; order++) {
@@ -110,7 +110,7 @@ static void __hyp_attach_page(struct hyp_pool *pool,
		if (!buddy)
			break;

-		/* Take the buddy out of its list, and coallesce with @p */
+		/* Take the buddy out of its list, and coalesce with @p */
		page_remove_from_list(buddy);
		buddy->order = HYP_NO_ORDER;
		p = min(p, buddy);

@@ -1,22 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * Stubs for out-of-line function calls caused by re-using kernel
- * infrastructure at EL2.
- *
- * Copyright (C) 2020 - Google LLC
- */
-
-#include <linux/list.h>
-
-#ifdef CONFIG_DEBUG_LIST
-bool __list_add_valid(struct list_head *new, struct list_head *prev,
-		      struct list_head *next)
-{
-	return true;
-}
-
-bool __list_del_entry_valid(struct list_head *entry)
-{
-	return true;
-}
-#endif

@@ -135,7 +135,8 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
	 * volunteered to do so, and bail out otherwise.
	 */
	if (!kvm_vcpu_dabt_isvalid(vcpu)) {
-		if (vcpu->kvm->arch.return_nisv_io_abort_to_user) {
+		if (test_bit(KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER,
+			     &vcpu->kvm->arch.flags)) {
			run->exit_reason = KVM_EXIT_ARM_NISV;
			run->arm_nisv.esr_iss = kvm_vcpu_dabt_iss_nisv_sanitized(vcpu);
			run->arm_nisv.fault_ipa = fault_ipa;

@ -58,7 +58,7 @@ static int stage2_apply_range(struct kvm *kvm, phys_addr_t addr,
|
||||||
break;
|
break;
|
||||||
|
|
||||||
if (resched && next != end)
|
if (resched && next != end)
|
||||||
cond_resched_lock(&kvm->mmu_lock);
|
cond_resched_rwlock_write(&kvm->mmu_lock);
|
||||||
} while (addr = next, addr != end);
|
} while (addr = next, addr != end);
|
||||||
|
|
||||||
return ret;
|
return ret;
|
||||||
|
@ -179,7 +179,7 @@ static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64
|
||||||
struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
|
struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
|
||||||
phys_addr_t end = start + size;
|
phys_addr_t end = start + size;
|
||||||
|
|
||||||
assert_spin_locked(&kvm->mmu_lock);
|
lockdep_assert_held_write(&kvm->mmu_lock);
|
||||||
WARN_ON(size & ~PAGE_MASK);
|
WARN_ON(size & ~PAGE_MASK);
|
||||||
WARN_ON(stage2_apply_range(kvm, start, end, kvm_pgtable_stage2_unmap,
|
WARN_ON(stage2_apply_range(kvm, start, end, kvm_pgtable_stage2_unmap,
|
||||||
may_block));
|
may_block));
|
||||||
|
@ -213,13 +213,13 @@ static void stage2_flush_vm(struct kvm *kvm)
|
||||||
int idx, bkt;
|
int idx, bkt;
|
||||||
|
|
||||||
idx = srcu_read_lock(&kvm->srcu);
|
idx = srcu_read_lock(&kvm->srcu);
|
||||||
spin_lock(&kvm->mmu_lock);
|
write_lock(&kvm->mmu_lock);
|
||||||
|
|
||||||
slots = kvm_memslots(kvm);
|
slots = kvm_memslots(kvm);
|
||||||
kvm_for_each_memslot(memslot, bkt, slots)
|
kvm_for_each_memslot(memslot, bkt, slots)
|
||||||
stage2_flush_memslot(kvm, memslot);
|
stage2_flush_memslot(kvm, memslot);
|
||||||
|
|
||||||
spin_unlock(&kvm->mmu_lock);
|
write_unlock(&kvm->mmu_lock);
|
||||||
srcu_read_unlock(&kvm->srcu, idx);
|
srcu_read_unlock(&kvm->srcu, idx);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -615,7 +615,7 @@ static struct kvm_pgtable_mm_ops kvm_s2_mm_ops = {
|
||||||
};
|
};
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* kvm_init_stage2_mmu - Initialise a S2 MMU strucrure
|
* kvm_init_stage2_mmu - Initialise a S2 MMU structure
|
||||||
* @kvm: The pointer to the KVM structure
|
* @kvm: The pointer to the KVM structure
|
||||||
* @mmu: The pointer to the s2 MMU structure
|
* @mmu: The pointer to the s2 MMU structure
|
||||||
*
|
*
|
||||||
|
@ -653,7 +653,6 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu)
|
||||||
|
|
||||||
mmu->pgt = pgt;
|
mmu->pgt = pgt;
|
||||||
mmu->pgd_phys = __pa(pgt->pgd);
|
mmu->pgd_phys = __pa(pgt->pgd);
|
||||||
WRITE_ONCE(mmu->vmid.vmid_gen, 0);
|
|
||||||
return 0;
|
return 0;
|
||||||
|
|
||||||
out_destroy_pgtable:
|
out_destroy_pgtable:
|
||||||
|
@ -720,13 +719,13 @@ void stage2_unmap_vm(struct kvm *kvm)
|
||||||
|
|
||||||
idx = srcu_read_lock(&kvm->srcu);
|
idx = srcu_read_lock(&kvm->srcu);
|
||||||
mmap_read_lock(current->mm);
|
mmap_read_lock(current->mm);
|
||||||
spin_lock(&kvm->mmu_lock);
|
write_lock(&kvm->mmu_lock);
|
||||||
|
|
||||||
slots = kvm_memslots(kvm);
|
slots = kvm_memslots(kvm);
|
||||||
kvm_for_each_memslot(memslot, bkt, slots)
|
kvm_for_each_memslot(memslot, bkt, slots)
|
||||||
stage2_unmap_memslot(kvm, memslot);
|
stage2_unmap_memslot(kvm, memslot);
|
||||||
|
|
||||||
spin_unlock(&kvm->mmu_lock);
|
write_unlock(&kvm->mmu_lock);
|
||||||
mmap_read_unlock(current->mm);
|
mmap_read_unlock(current->mm);
|
||||||
srcu_read_unlock(&kvm->srcu, idx);
|
srcu_read_unlock(&kvm->srcu, idx);
|
||||||
}
|
}
|
||||||
|
@ -736,14 +735,14 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
|
||||||
struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
|
struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
|
||||||
struct kvm_pgtable *pgt = NULL;
|
struct kvm_pgtable *pgt = NULL;
|
||||||
|
|
||||||
spin_lock(&kvm->mmu_lock);
|
write_lock(&kvm->mmu_lock);
|
||||||
pgt = mmu->pgt;
|
pgt = mmu->pgt;
|
||||||
if (pgt) {
|
if (pgt) {
|
||||||
mmu->pgd_phys = 0;
|
mmu->pgd_phys = 0;
|
||||||
mmu->pgt = NULL;
|
mmu->pgt = NULL;
|
||||||
free_percpu(mmu->last_vcpu_ran);
|
free_percpu(mmu->last_vcpu_ran);
|
||||||
}
|
}
|
||||||
spin_unlock(&kvm->mmu_lock);
|
write_unlock(&kvm->mmu_lock);
|
||||||
|
|
||||||
if (pgt) {
|
if (pgt) {
|
||||||
kvm_pgtable_stage2_destroy(pgt);
|
kvm_pgtable_stage2_destroy(pgt);
|
||||||
|
@ -783,10 +782,10 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
|
||||||
if (ret)
|
if (ret)
|
||||||
break;
|
break;
|
||||||
|
|
||||||
spin_lock(&kvm->mmu_lock);
|
write_lock(&kvm->mmu_lock);
|
||||||
ret = kvm_pgtable_stage2_map(pgt, addr, PAGE_SIZE, pa, prot,
|
ret = kvm_pgtable_stage2_map(pgt, addr, PAGE_SIZE, pa, prot,
|
||||||
&cache);
|
&cache);
|
||||||
spin_unlock(&kvm->mmu_lock);
|
write_unlock(&kvm->mmu_lock);
|
||||||
if (ret)
|
if (ret)
|
||||||
break;
|
break;
|
||||||
|
|
||||||
|
@ -834,9 +833,9 @@ static void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
|
||||||
start = memslot->base_gfn << PAGE_SHIFT;
|
start = memslot->base_gfn << PAGE_SHIFT;
|
||||||
end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT;
|
end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT;
|
||||||
|
|
||||||
spin_lock(&kvm->mmu_lock);
|
write_lock(&kvm->mmu_lock);
|
||||||
stage2_wp_range(&kvm->arch.mmu, start, end);
|
stage2_wp_range(&kvm->arch.mmu, start, end);
|
||||||
spin_unlock(&kvm->mmu_lock);
|
write_unlock(&kvm->mmu_lock);
|
||||||
kvm_flush_remote_tlbs(kvm);
|
kvm_flush_remote_tlbs(kvm);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -1080,6 +1079,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
|
||||||
gfn_t gfn;
|
gfn_t gfn;
|
||||||
kvm_pfn_t pfn;
|
kvm_pfn_t pfn;
|
||||||
bool logging_active = memslot_is_logging(memslot);
|
bool logging_active = memslot_is_logging(memslot);
|
||||||
|
bool logging_perm_fault = false;
|
||||||
unsigned long fault_level = kvm_vcpu_trap_get_fault_level(vcpu);
|
unsigned long fault_level = kvm_vcpu_trap_get_fault_level(vcpu);
|
||||||
unsigned long vma_pagesize, fault_granule;
|
unsigned long vma_pagesize, fault_granule;
|
||||||
enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
|
enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
|
||||||
|
@ -1114,6 +1114,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
|
||||||
if (logging_active) {
|
if (logging_active) {
|
||||||
force_pte = true;
|
force_pte = true;
|
||||||
vma_shift = PAGE_SHIFT;
|
vma_shift = PAGE_SHIFT;
|
||||||
|
logging_perm_fault = (fault_status == FSC_PERM && write_fault);
|
||||||
} else {
|
} else {
|
||||||
vma_shift = get_vma_page_shift(vma, hva);
|
vma_shift = get_vma_page_shift(vma, hva);
|
||||||
}
|
}
|
||||||
|
@ -1212,7 +1213,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
|
||||||
if (exec_fault && device)
|
if (exec_fault && device)
|
||||||
return -ENOEXEC;
|
return -ENOEXEC;
|
||||||
|
|
||||||
spin_lock(&kvm->mmu_lock);
|
/*
|
||||||
|
* To reduce MMU contentions and enhance concurrency during dirty
|
||||||
|
* logging dirty logging, only acquire read lock for permission
|
||||||
|
* relaxation.
|
||||||
|
*/
|
||||||
|
if (logging_perm_fault)
|
||||||
|
read_lock(&kvm->mmu_lock);
|
||||||
|
else
|
||||||
|
write_lock(&kvm->mmu_lock);
|
||||||
pgt = vcpu->arch.hw_mmu->pgt;
|
pgt = vcpu->arch.hw_mmu->pgt;
|
||||||
if (mmu_notifier_retry(kvm, mmu_seq))
|
if (mmu_notifier_retry(kvm, mmu_seq))
|
||||||
goto out_unlock;
|
goto out_unlock;
|
||||||
|
@ -1271,7 +1280,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
|
||||||
}
|
}
|
||||||
|
|
||||||
out_unlock:
|
out_unlock:
|
||||||
spin_unlock(&kvm->mmu_lock);
|
if (logging_perm_fault)
|
||||||
|
read_unlock(&kvm->mmu_lock);
|
||||||
|
else
|
||||||
|
write_unlock(&kvm->mmu_lock);
|
||||||
kvm_set_pfn_accessed(pfn);
|
kvm_set_pfn_accessed(pfn);
|
||||||
kvm_release_pfn_clean(pfn);
|
kvm_release_pfn_clean(pfn);
|
||||||
return ret != -EAGAIN ? ret : 0;
|
return ret != -EAGAIN ? ret : 0;
|
||||||
|
@ -1286,10 +1298,10 @@ static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
|
||||||
|
|
||||||
trace_kvm_access_fault(fault_ipa);
|
trace_kvm_access_fault(fault_ipa);
|
||||||
|
|
||||||
spin_lock(&vcpu->kvm->mmu_lock);
|
write_lock(&vcpu->kvm->mmu_lock);
|
||||||
mmu = vcpu->arch.hw_mmu;
|
mmu = vcpu->arch.hw_mmu;
|
||||||
kpte = kvm_pgtable_stage2_mkyoung(mmu->pgt, fault_ipa);
|
kpte = kvm_pgtable_stage2_mkyoung(mmu->pgt, fault_ipa);
|
||||||
spin_unlock(&vcpu->kvm->mmu_lock);
|
write_unlock(&vcpu->kvm->mmu_lock);
|
||||||
|
|
||||||
pte = __pte(kpte);
|
pte = __pte(kpte);
|
||||||
if (pte_valid(pte))
|
if (pte_valid(pte))
|
||||||
|
@ -1692,9 +1704,9 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
|
||||||
gpa_t gpa = slot->base_gfn << PAGE_SHIFT;
|
gpa_t gpa = slot->base_gfn << PAGE_SHIFT;
|
||||||
phys_addr_t size = slot->npages << PAGE_SHIFT;
|
phys_addr_t size = slot->npages << PAGE_SHIFT;
|
||||||
|
|
||||||
spin_lock(&kvm->mmu_lock);
|
write_lock(&kvm->mmu_lock);
|
||||||
unmap_stage2_range(&kvm->arch.mmu, gpa, size);
|
unmap_stage2_range(&kvm->arch.mmu, gpa, size);
|
||||||
spin_unlock(&kvm->mmu_lock);
|
write_unlock(&kvm->mmu_lock);
|
||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
|
|
|
@ -7,6 +7,7 @@
|
||||||
#include <linux/cpu.h>
|
#include <linux/cpu.h>
|
||||||
#include <linux/kvm.h>
|
#include <linux/kvm.h>
|
||||||
#include <linux/kvm_host.h>
|
#include <linux/kvm_host.h>
|
||||||
|
#include <linux/list.h>
|
||||||
#include <linux/perf_event.h>
|
#include <linux/perf_event.h>
|
||||||
#include <linux/perf/arm_pmu.h>
|
#include <linux/perf/arm_pmu.h>
|
||||||
#include <linux/uaccess.h>
|
#include <linux/uaccess.h>
|
||||||
|
@ -16,6 +17,9 @@
|
||||||
|
|
||||||
DEFINE_STATIC_KEY_FALSE(kvm_arm_pmu_available);
|
DEFINE_STATIC_KEY_FALSE(kvm_arm_pmu_available);
|
||||||
|
|
||||||
|
static LIST_HEAD(arm_pmus);
|
||||||
|
static DEFINE_MUTEX(arm_pmus_lock);
|
||||||
|
|
||||||
static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx);
|
static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx);
|
||||||
static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx);
|
static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx);
|
||||||
static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc);
|
static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc);
|
||||||
|
@ -24,7 +28,11 @@ static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc);
|
||||||
|
|
||||||
static u32 kvm_pmu_event_mask(struct kvm *kvm)
|
static u32 kvm_pmu_event_mask(struct kvm *kvm)
|
||||||
{
|
{
|
||||||
switch (kvm->arch.pmuver) {
|
unsigned int pmuver;
|
||||||
|
|
||||||
|
pmuver = kvm->arch.arm_pmu->pmuver;
|
||||||
|
|
||||||
|
switch (pmuver) {
|
||||||
case ID_AA64DFR0_PMUVER_8_0:
|
case ID_AA64DFR0_PMUVER_8_0:
|
||||||
return GENMASK(9, 0);
|
return GENMASK(9, 0);
|
||||||
case ID_AA64DFR0_PMUVER_8_1:
|
case ID_AA64DFR0_PMUVER_8_1:
|
||||||
|
@ -33,7 +41,7 @@ static u32 kvm_pmu_event_mask(struct kvm *kvm)
|
||||||
case ID_AA64DFR0_PMUVER_8_7:
|
case ID_AA64DFR0_PMUVER_8_7:
|
||||||
return GENMASK(15, 0);
|
return GENMASK(15, 0);
|
||||||
default: /* Shouldn't be here, just for sanity */
|
default: /* Shouldn't be here, just for sanity */
|
||||||
WARN_ONCE(1, "Unknown PMU version %d\n", kvm->arch.pmuver);
|
WARN_ONCE(1, "Unknown PMU version %d\n", pmuver);
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
@ -600,6 +608,7 @@ static bool kvm_pmu_counter_is_enabled(struct kvm_vcpu *vcpu, u64 select_idx)
|
||||||
*/
|
*/
|
||||||
static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
|
static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
|
||||||
{
|
{
|
||||||
|
struct arm_pmu *arm_pmu = vcpu->kvm->arch.arm_pmu;
|
||||||
struct kvm_pmu *pmu = &vcpu->arch.pmu;
|
struct kvm_pmu *pmu = &vcpu->arch.pmu;
|
||||||
struct kvm_pmc *pmc;
|
struct kvm_pmc *pmc;
|
||||||
struct perf_event *event;
|
struct perf_event *event;
|
||||||
|
@ -636,7 +645,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
|
||||||
return;
|
return;
|
||||||
|
|
||||||
memset(&attr, 0, sizeof(struct perf_event_attr));
|
memset(&attr, 0, sizeof(struct perf_event_attr));
|
||||||
attr.type = PERF_TYPE_RAW;
|
attr.type = arm_pmu->pmu.type;
|
||||||
attr.size = sizeof(attr);
|
attr.size = sizeof(attr);
|
||||||
attr.pinned = 1;
|
attr.pinned = 1;
|
||||||
attr.disabled = !kvm_pmu_counter_is_enabled(vcpu, pmc->idx);
|
attr.disabled = !kvm_pmu_counter_is_enabled(vcpu, pmc->idx);
|
||||||
|
@ -745,17 +754,33 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
|
||||||
|
|
||||||
void kvm_host_pmu_init(struct arm_pmu *pmu)
|
void kvm_host_pmu_init(struct arm_pmu *pmu)
|
||||||
{
|
{
|
||||||
if (pmu->pmuver != 0 && pmu->pmuver != ID_AA64DFR0_PMUVER_IMP_DEF &&
|
struct arm_pmu_entry *entry;
|
||||||
!kvm_arm_support_pmu_v3() && !is_protected_kvm_enabled())
|
|
||||||
|
if (pmu->pmuver == 0 || pmu->pmuver == ID_AA64DFR0_PMUVER_IMP_DEF ||
|
||||||
|
is_protected_kvm_enabled())
|
||||||
|
return;
|
||||||
|
|
||||||
|
mutex_lock(&arm_pmus_lock);
|
||||||
|
|
||||||
|
entry = kmalloc(sizeof(*entry), GFP_KERNEL);
|
||||||
|
if (!entry)
|
||||||
|
goto out_unlock;
|
||||||
|
|
||||||
|
entry->arm_pmu = pmu;
|
||||||
|
list_add_tail(&entry->entry, &arm_pmus);
|
||||||
|
|
||||||
|
if (list_is_singular(&arm_pmus))
|
||||||
static_branch_enable(&kvm_arm_pmu_available);
|
static_branch_enable(&kvm_arm_pmu_available);
|
||||||
|
|
||||||
|
out_unlock:
|
||||||
|
mutex_unlock(&arm_pmus_lock);
|
||||||
}
|
}
|
||||||
|
|
||||||
static int kvm_pmu_probe_pmuver(void)
|
static struct arm_pmu *kvm_pmu_probe_armpmu(void)
|
||||||
{
|
{
|
||||||
struct perf_event_attr attr = { };
|
struct perf_event_attr attr = { };
|
||||||
struct perf_event *event;
|
struct perf_event *event;
|
||||||
struct arm_pmu *pmu;
|
struct arm_pmu *pmu = NULL;
|
||||||
int pmuver = ID_AA64DFR0_PMUVER_IMP_DEF;
|
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Create a dummy event that only counts user cycles. As we'll never
|
* Create a dummy event that only counts user cycles. As we'll never
|
||||||
|
@ -780,19 +805,20 @@ static int kvm_pmu_probe_pmuver(void)
|
||||||
if (IS_ERR(event)) {
|
if (IS_ERR(event)) {
|
||||||
pr_err_once("kvm: pmu event creation failed %ld\n",
|
pr_err_once("kvm: pmu event creation failed %ld\n",
|
||||||
PTR_ERR(event));
|
PTR_ERR(event));
|
||||||
return ID_AA64DFR0_PMUVER_IMP_DEF;
|
return NULL;
|
||||||
}
|
}
|
||||||
|
|
||||||
if (event->pmu) {
|
if (event->pmu) {
|
||||||
pmu = to_arm_pmu(event->pmu);
|
pmu = to_arm_pmu(event->pmu);
|
||||||
if (pmu->pmuver)
|
if (pmu->pmuver == 0 ||
|
||||||
pmuver = pmu->pmuver;
|
pmu->pmuver == ID_AA64DFR0_PMUVER_IMP_DEF)
|
||||||
|
pmu = NULL;
|
||||||
}
|
}
|
||||||
|
|
||||||
perf_event_disable(event);
|
perf_event_disable(event);
|
||||||
perf_event_release_kernel(event);
|
perf_event_release_kernel(event);
|
||||||
|
|
||||||
return pmuver;
|
return pmu;
|
||||||
}
|
}
|
||||||
|
|
||||||
u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
|
u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
|
||||||
|
@ -810,7 +836,7 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
|
||||||
* Don't advertise STALL_SLOT, as PMMIR_EL0 is handled
|
* Don't advertise STALL_SLOT, as PMMIR_EL0 is handled
|
||||||
* as RAZ
|
* as RAZ
|
||||||
*/
|
*/
|
||||||
if (vcpu->kvm->arch.pmuver >= ID_AA64DFR0_PMUVER_8_4)
|
if (vcpu->kvm->arch.arm_pmu->pmuver >= ID_AA64DFR0_PMUVER_8_4)
|
||||||
val &= ~BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT - 32);
|
val &= ~BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT - 32);
|
||||||
base = 32;
|
base = 32;
|
||||||
}
|
}
|
||||||
|
@ -922,26 +948,64 @@ static bool pmu_irq_is_valid(struct kvm *kvm, int irq)
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
|
||||||
|
{
|
||||||
|
struct kvm *kvm = vcpu->kvm;
|
||||||
|
struct arm_pmu_entry *entry;
|
||||||
|
struct arm_pmu *arm_pmu;
|
||||||
|
int ret = -ENXIO;
|
||||||
|
|
||||||
|
mutex_lock(&kvm->lock);
|
||||||
|
mutex_lock(&arm_pmus_lock);
|
||||||
|
|
||||||
|
list_for_each_entry(entry, &arm_pmus, entry) {
|
||||||
|
arm_pmu = entry->arm_pmu;
|
||||||
|
if (arm_pmu->pmu.type == pmu_id) {
|
||||||
|
if (test_bit(KVM_ARCH_FLAG_HAS_RAN_ONCE, &kvm->arch.flags) ||
|
||||||
|
(kvm->arch.pmu_filter && kvm->arch.arm_pmu != arm_pmu)) {
|
||||||
|
ret = -EBUSY;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
|
||||||
|
kvm->arch.arm_pmu = arm_pmu;
|
||||||
|
cpumask_copy(kvm->arch.supported_cpus, &arm_pmu->supported_cpus);
|
||||||
|
ret = 0;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
mutex_unlock(&arm_pmus_lock);
|
||||||
|
mutex_unlock(&kvm->lock);
|
||||||
|
return ret;
|
||||||
|
}
|
||||||
|
|
||||||
int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
|
int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
|
||||||
{
|
{
|
||||||
|
struct kvm *kvm = vcpu->kvm;
|
||||||
|
|
||||||
if (!kvm_vcpu_has_pmu(vcpu))
|
if (!kvm_vcpu_has_pmu(vcpu))
|
||||||
return -ENODEV;
|
return -ENODEV;
|
||||||
|
|
||||||
if (vcpu->arch.pmu.created)
|
if (vcpu->arch.pmu.created)
|
||||||
return -EBUSY;
|
return -EBUSY;
|
||||||
|
|
||||||
if (!vcpu->kvm->arch.pmuver)
|
mutex_lock(&kvm->lock);
|
||||||
vcpu->kvm->arch.pmuver = kvm_pmu_probe_pmuver();
|
if (!kvm->arch.arm_pmu) {
|
||||||
|
/* No PMU set, get the default one */
|
||||||
if (vcpu->kvm->arch.pmuver == ID_AA64DFR0_PMUVER_IMP_DEF)
|
kvm->arch.arm_pmu = kvm_pmu_probe_armpmu();
|
||||||
return -ENODEV;
|
if (!kvm->arch.arm_pmu) {
|
||||||
|
mutex_unlock(&kvm->lock);
|
||||||
|
return -ENODEV;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
mutex_unlock(&kvm->lock);
|
||||||
|
|
||||||
switch (attr->attr) {
|
switch (attr->attr) {
|
||||||
case KVM_ARM_VCPU_PMU_V3_IRQ: {
|
case KVM_ARM_VCPU_PMU_V3_IRQ: {
|
||||||
int __user *uaddr = (int __user *)(long)attr->addr;
|
int __user *uaddr = (int __user *)(long)attr->addr;
|
||||||
int irq;
|
int irq;
|
||||||
|
|
||||||
if (!irqchip_in_kernel(vcpu->kvm))
|
if (!irqchip_in_kernel(kvm))
|
||||||
return -EINVAL;
|
return -EINVAL;
|
||||||
|
|
||||||
if (get_user(irq, uaddr))
|
if (get_user(irq, uaddr))
|
||||||
|
@ -951,7 +1015,7 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
|
||||||
if (!(irq_is_ppi(irq) || irq_is_spi(irq)))
|
if (!(irq_is_ppi(irq) || irq_is_spi(irq)))
|
||||||
return -EINVAL;
|
return -EINVAL;
|
||||||
|
|
||||||
if (!pmu_irq_is_valid(vcpu->kvm, irq))
|
if (!pmu_irq_is_valid(kvm, irq))
|
||||||
return -EINVAL;
|
return -EINVAL;
|
||||||
|
|
||||||
if (kvm_arm_pmu_irq_initialized(vcpu))
|
if (kvm_arm_pmu_irq_initialized(vcpu))
|
||||||
|
@ -966,7 +1030,7 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
|
||||||
struct kvm_pmu_event_filter filter;
|
struct kvm_pmu_event_filter filter;
|
||||||
int nr_events;
|
int nr_events;
|
||||||
|
|
||||||
nr_events = kvm_pmu_event_mask(vcpu->kvm) + 1;
|
nr_events = kvm_pmu_event_mask(kvm) + 1;
|
||||||
|
|
||||||
uaddr = (struct kvm_pmu_event_filter __user *)(long)attr->addr;
|
uaddr = (struct kvm_pmu_event_filter __user *)(long)attr->addr;
|
||||||
|
|
||||||
|
@ -978,12 +1042,17 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
|
||||||
filter.action != KVM_PMU_EVENT_DENY))
|
filter.action != KVM_PMU_EVENT_DENY))
|
||||||
return -EINVAL;
|
return -EINVAL;
|
||||||
|
|
||||||
mutex_lock(&vcpu->kvm->lock);
|
mutex_lock(&kvm->lock);
|
||||||
|
|
||||||
if (!vcpu->kvm->arch.pmu_filter) {
|
if (test_bit(KVM_ARCH_FLAG_HAS_RAN_ONCE, &kvm->arch.flags)) {
|
||||||
vcpu->kvm->arch.pmu_filter = bitmap_alloc(nr_events, GFP_KERNEL_ACCOUNT);
|
mutex_unlock(&kvm->lock);
|
||||||
if (!vcpu->kvm->arch.pmu_filter) {
|
return -EBUSY;
|
||||||
mutex_unlock(&vcpu->kvm->lock);
|
}
|
||||||
|
|
||||||
|
if (!kvm->arch.pmu_filter) {
|
||||||
|
kvm->arch.pmu_filter = bitmap_alloc(nr_events, GFP_KERNEL_ACCOUNT);
|
||||||
|
if (!kvm->arch.pmu_filter) {
|
||||||
|
mutex_unlock(&kvm->lock);
|
||||||
return -ENOMEM;
|
return -ENOMEM;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -994,20 +1063,29 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
|
||||||
* events, the default is to allow.
|
* events, the default is to allow.
|
||||||
*/
|
*/
|
||||||
if (filter.action == KVM_PMU_EVENT_ALLOW)
|
if (filter.action == KVM_PMU_EVENT_ALLOW)
|
||||||
bitmap_zero(vcpu->kvm->arch.pmu_filter, nr_events);
|
bitmap_zero(kvm->arch.pmu_filter, nr_events);
|
||||||
else
|
else
|
||||||
bitmap_fill(vcpu->kvm->arch.pmu_filter, nr_events);
|
bitmap_fill(kvm->arch.pmu_filter, nr_events);
|
||||||
}
|
}
|
||||||
|
|
||||||
if (filter.action == KVM_PMU_EVENT_ALLOW)
|
if (filter.action == KVM_PMU_EVENT_ALLOW)
|
||||||
bitmap_set(vcpu->kvm->arch.pmu_filter, filter.base_event, filter.nevents);
|
bitmap_set(kvm->arch.pmu_filter, filter.base_event, filter.nevents);
|
||||||
else
|
else
|
||||||
bitmap_clear(vcpu->kvm->arch.pmu_filter, filter.base_event, filter.nevents);
|
bitmap_clear(kvm->arch.pmu_filter, filter.base_event, filter.nevents);
|
||||||
|
|
||||||
mutex_unlock(&vcpu->kvm->lock);
|
mutex_unlock(&kvm->lock);
|
||||||
|
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
case KVM_ARM_VCPU_PMU_V3_SET_PMU: {
|
||||||
|
int __user *uaddr = (int __user *)(long)attr->addr;
|
||||||
|
int pmu_id;
|
||||||
|
|
||||||
|
if (get_user(pmu_id, uaddr))
|
||||||
|
return -EFAULT;
|
||||||
|
|
||||||
|
return kvm_arm_pmu_v3_set_pmu(vcpu, pmu_id);
|
||||||
|
}
|
||||||
case KVM_ARM_VCPU_PMU_V3_INIT:
|
case KVM_ARM_VCPU_PMU_V3_INIT:
|
||||||
return kvm_arm_pmu_v3_init(vcpu);
|
return kvm_arm_pmu_v3_init(vcpu);
|
||||||
}
|
}
|
||||||
|
@ -1045,6 +1123,7 @@ int kvm_arm_pmu_v3_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
|
||||||
case KVM_ARM_VCPU_PMU_V3_IRQ:
|
case KVM_ARM_VCPU_PMU_V3_IRQ:
|
||||||
case KVM_ARM_VCPU_PMU_V3_INIT:
|
case KVM_ARM_VCPU_PMU_V3_INIT:
|
||||||
case KVM_ARM_VCPU_PMU_V3_FILTER:
|
case KVM_ARM_VCPU_PMU_V3_FILTER:
|
||||||
|
case KVM_ARM_VCPU_PMU_V3_SET_PMU:
|
||||||
if (kvm_vcpu_has_pmu(vcpu))
|
if (kvm_vcpu_has_pmu(vcpu))
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
|
@ -84,7 +84,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
|
||||||
if (!vcpu)
|
if (!vcpu)
|
||||||
return PSCI_RET_INVALID_PARAMS;
|
return PSCI_RET_INVALID_PARAMS;
|
||||||
if (!vcpu->arch.power_off) {
|
if (!vcpu->arch.power_off) {
|
||||||
if (kvm_psci_version(source_vcpu, kvm) != KVM_ARM_PSCI_0_1)
|
if (kvm_psci_version(source_vcpu) != KVM_ARM_PSCI_0_1)
|
||||||
return PSCI_RET_ALREADY_ON;
|
return PSCI_RET_ALREADY_ON;
|
||||||
else
|
else
|
||||||
return PSCI_RET_INVALID_PARAMS;
|
return PSCI_RET_INVALID_PARAMS;
|
||||||
|
@ -161,7 +161,7 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
|
||||||
return PSCI_0_2_AFFINITY_LEVEL_OFF;
|
return PSCI_0_2_AFFINITY_LEVEL_OFF;
|
||||||
}
|
}
|
||||||
|
|
||||||
static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type)
|
static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type, u64 flags)
|
||||||
{
|
{
|
||||||
unsigned long i;
|
unsigned long i;
|
||||||
struct kvm_vcpu *tmp;
|
struct kvm_vcpu *tmp;
|
||||||
|
@ -181,17 +181,24 @@ static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type)
|
||||||
|
|
||||||
memset(&vcpu->run->system_event, 0, sizeof(vcpu->run->system_event));
|
memset(&vcpu->run->system_event, 0, sizeof(vcpu->run->system_event));
|
||||||
vcpu->run->system_event.type = type;
|
vcpu->run->system_event.type = type;
|
||||||
|
vcpu->run->system_event.flags = flags;
|
||||||
vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
|
vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
|
||||||
}
|
}
|
||||||
|
|
||||||
static void kvm_psci_system_off(struct kvm_vcpu *vcpu)
|
static void kvm_psci_system_off(struct kvm_vcpu *vcpu)
|
||||||
{
|
{
|
||||||
kvm_prepare_system_event(vcpu, KVM_SYSTEM_EVENT_SHUTDOWN);
|
kvm_prepare_system_event(vcpu, KVM_SYSTEM_EVENT_SHUTDOWN, 0);
|
||||||
}
|
}
|
||||||
|
|
||||||
static void kvm_psci_system_reset(struct kvm_vcpu *vcpu)
|
static void kvm_psci_system_reset(struct kvm_vcpu *vcpu)
|
||||||
{
|
{
|
||||||
kvm_prepare_system_event(vcpu, KVM_SYSTEM_EVENT_RESET);
|
kvm_prepare_system_event(vcpu, KVM_SYSTEM_EVENT_RESET, 0);
|
||||||
|
}
|
||||||
|
|
||||||
|
static void kvm_psci_system_reset2(struct kvm_vcpu *vcpu)
|
||||||
|
{
|
||||||
|
kvm_prepare_system_event(vcpu, KVM_SYSTEM_EVENT_RESET,
|
||||||
|
KVM_SYSTEM_EVENT_RESET_FLAG_PSCI_RESET2);
|
||||||
}
|
}
|
||||||
|
|
||||||
static void kvm_psci_narrow_to_32bit(struct kvm_vcpu *vcpu)
|
static void kvm_psci_narrow_to_32bit(struct kvm_vcpu *vcpu)
|
||||||
|
@ -304,24 +311,27 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
|
||||||
return ret;
|
return ret;
|
||||||
}
|
}
|
||||||
|
|
||||||
static int kvm_psci_1_0_call(struct kvm_vcpu *vcpu)
|
static int kvm_psci_1_x_call(struct kvm_vcpu *vcpu, u32 minor)
|
||||||
{
|
{
|
||||||
u32 psci_fn = smccc_get_function(vcpu);
|
u32 psci_fn = smccc_get_function(vcpu);
|
||||||
u32 feature;
|
u32 arg;
|
||||||
unsigned long val;
|
unsigned long val;
|
||||||
int ret = 1;
|
int ret = 1;
|
||||||
|
|
||||||
|
if (minor > 1)
|
||||||
|
return -EINVAL;
|
||||||
|
|
||||||
switch(psci_fn) {
|
switch(psci_fn) {
|
||||||
case PSCI_0_2_FN_PSCI_VERSION:
|
case PSCI_0_2_FN_PSCI_VERSION:
|
||||||
val = KVM_ARM_PSCI_1_0;
|
val = minor == 0 ? KVM_ARM_PSCI_1_0 : KVM_ARM_PSCI_1_1;
|
||||||
break;
|
break;
|
||||||
case PSCI_1_0_FN_PSCI_FEATURES:
|
case PSCI_1_0_FN_PSCI_FEATURES:
|
||||||
feature = smccc_get_arg1(vcpu);
|
arg = smccc_get_arg1(vcpu);
|
||||||
val = kvm_psci_check_allowed_function(vcpu, feature);
|
val = kvm_psci_check_allowed_function(vcpu, arg);
|
||||||
if (val)
|
if (val)
|
||||||
break;
|
break;
|
||||||
|
|
||||||
switch(feature) {
|
switch(arg) {
|
||||||
case PSCI_0_2_FN_PSCI_VERSION:
|
case PSCI_0_2_FN_PSCI_VERSION:
|
||||||
case PSCI_0_2_FN_CPU_SUSPEND:
|
case PSCI_0_2_FN_CPU_SUSPEND:
|
||||||
case PSCI_0_2_FN64_CPU_SUSPEND:
|
case PSCI_0_2_FN64_CPU_SUSPEND:
|
||||||
|
@ -337,11 +347,36 @@ static int kvm_psci_1_0_call(struct kvm_vcpu *vcpu)
|
||||||
case ARM_SMCCC_VERSION_FUNC_ID:
|
case ARM_SMCCC_VERSION_FUNC_ID:
|
||||||
val = 0;
|
val = 0;
|
||||||
break;
|
break;
|
||||||
|
case PSCI_1_1_FN_SYSTEM_RESET2:
|
||||||
|
case PSCI_1_1_FN64_SYSTEM_RESET2:
|
||||||
|
if (minor >= 1) {
|
||||||
|
val = 0;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
fallthrough;
|
||||||
default:
|
default:
|
||||||
val = PSCI_RET_NOT_SUPPORTED;
|
val = PSCI_RET_NOT_SUPPORTED;
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
break;
|
break;
|
||||||
|
case PSCI_1_1_FN_SYSTEM_RESET2:
|
||||||
|
kvm_psci_narrow_to_32bit(vcpu);
|
||||||
|
fallthrough;
|
||||||
|
case PSCI_1_1_FN64_SYSTEM_RESET2:
|
||||||
|
if (minor >= 1) {
|
||||||
|
arg = smccc_get_arg1(vcpu);
|
||||||
|
|
||||||
|
if (arg <= PSCI_1_1_RESET_TYPE_SYSTEM_WARM_RESET ||
|
||||||
|
arg >= PSCI_1_1_RESET_TYPE_VENDOR_START) {
|
||||||
|
kvm_psci_system_reset2(vcpu);
|
||||||
|
vcpu_set_reg(vcpu, 0, PSCI_RET_INTERNAL_FAILURE);
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
val = PSCI_RET_INVALID_PARAMS;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
fallthrough;
|
||||||
default:
|
default:
|
||||||
return kvm_psci_0_2_call(vcpu);
|
return kvm_psci_0_2_call(vcpu);
|
||||||
}
|
}
|
||||||
|
@ -391,16 +426,18 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
|
||||||
*/
|
*/
|
||||||
int kvm_psci_call(struct kvm_vcpu *vcpu)
|
int kvm_psci_call(struct kvm_vcpu *vcpu)
|
||||||
{
|
{
|
||||||
switch (kvm_psci_version(vcpu, vcpu->kvm)) {
|
switch (kvm_psci_version(vcpu)) {
|
||||||
|
case KVM_ARM_PSCI_1_1:
|
||||||
|
return kvm_psci_1_x_call(vcpu, 1);
|
||||||
case KVM_ARM_PSCI_1_0:
|
case KVM_ARM_PSCI_1_0:
|
||||||
return kvm_psci_1_0_call(vcpu);
|
return kvm_psci_1_x_call(vcpu, 0);
|
||||||
case KVM_ARM_PSCI_0_2:
|
case KVM_ARM_PSCI_0_2:
|
||||||
return kvm_psci_0_2_call(vcpu);
|
return kvm_psci_0_2_call(vcpu);
|
||||||
case KVM_ARM_PSCI_0_1:
|
case KVM_ARM_PSCI_0_1:
|
||||||
return kvm_psci_0_1_call(vcpu);
|
return kvm_psci_0_1_call(vcpu);
|
||||||
default:
|
default:
|
||||||
return -EINVAL;
|
return -EINVAL;
|
||||||
};
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
|
int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
|
||||||
|
@ -484,7 +521,7 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
|
||||||
|
|
||||||
switch (reg->id) {
|
switch (reg->id) {
|
||||||
case KVM_REG_ARM_PSCI_VERSION:
|
case KVM_REG_ARM_PSCI_VERSION:
|
||||||
val = kvm_psci_version(vcpu, vcpu->kvm);
|
val = kvm_psci_version(vcpu);
|
||||||
break;
|
break;
|
||||||
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
|
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
|
||||||
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
|
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
|
||||||
|
@ -525,6 +562,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
|
||||||
return 0;
|
return 0;
|
||||||
case KVM_ARM_PSCI_0_2:
|
case KVM_ARM_PSCI_0_2:
|
||||||
case KVM_ARM_PSCI_1_0:
|
case KVM_ARM_PSCI_1_0:
|
||||||
|
case KVM_ARM_PSCI_1_1:
|
||||||
if (!wants_02)
|
if (!wants_02)
|
||||||
return -EINVAL;
|
return -EINVAL;
|
||||||
vcpu->kvm->arch.psci_version = val;
|
vcpu->kvm->arch.psci_version = val;
|
||||||
|
|
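As a companion to the change above (kvm_arm_set_fw_reg() now accepting KVM_ARM_PSCI_1_1), a hedged userspace sketch of pinning the advertised PSCI revision through the KVM_REG_ARM_PSCI_VERSION firmware pseudo-register. The register and ioctl names are existing arm64 uapi; the helper and its lack of error handling are illustrative assumptions.

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: vcpu_fd is an already-created vCPU file descriptor. */
int set_psci_1_1(int vcpu_fd)
{
	uint64_t version = (1U << 16) | 1;	/* PSCI 1.1, i.e. KVM_ARM_PSCI_1_1 */
	struct kvm_one_reg reg = {
		.id = KVM_REG_ARM_PSCI_VERSION,
		.addr = (uintptr_t)&version,
	};

	/* Returns 0 on success, -1 with errno set otherwise. */
	return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
}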
|
@ -44,6 +44,10 @@
|
||||||
* 64bit interface.
|
* 64bit interface.
|
||||||
*/
|
*/
|
||||||
|
|
||||||
|
static int reg_from_user(u64 *val, const void __user *uaddr, u64 id);
|
||||||
|
static int reg_to_user(void __user *uaddr, const u64 *val, u64 id);
|
||||||
|
static u64 sys_reg_to_index(const struct sys_reg_desc *reg);
|
||||||
|
|
||||||
static bool read_from_write_only(struct kvm_vcpu *vcpu,
|
static bool read_from_write_only(struct kvm_vcpu *vcpu,
|
||||||
struct sys_reg_params *params,
|
struct sys_reg_params *params,
|
||||||
const struct sys_reg_desc *r)
|
const struct sys_reg_desc *r)
|
||||||
|
@ -287,16 +291,55 @@ static bool trap_loregion(struct kvm_vcpu *vcpu,
|
||||||
return trap_raz_wi(vcpu, p, r);
|
return trap_raz_wi(vcpu, p, r);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static bool trap_oslar_el1(struct kvm_vcpu *vcpu,
|
||||||
|
struct sys_reg_params *p,
|
||||||
|
const struct sys_reg_desc *r)
|
||||||
|
{
|
||||||
|
u64 oslsr;
|
||||||
|
|
||||||
|
if (!p->is_write)
|
||||||
|
return read_from_write_only(vcpu, p, r);
|
||||||
|
|
||||||
|
/* Forward the OSLK bit to OSLSR */
|
||||||
|
oslsr = __vcpu_sys_reg(vcpu, OSLSR_EL1) & ~SYS_OSLSR_OSLK;
|
||||||
|
if (p->regval & SYS_OSLAR_OSLK)
|
||||||
|
oslsr |= SYS_OSLSR_OSLK;
|
||||||
|
|
||||||
|
__vcpu_sys_reg(vcpu, OSLSR_EL1) = oslsr;
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
|
static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
|
||||||
struct sys_reg_params *p,
|
struct sys_reg_params *p,
|
||||||
const struct sys_reg_desc *r)
|
const struct sys_reg_desc *r)
|
||||||
{
|
{
|
||||||
if (p->is_write) {
|
if (p->is_write)
|
||||||
return ignore_write(vcpu, p);
|
return write_to_read_only(vcpu, p, r);
|
||||||
} else {
|
|
||||||
p->regval = (1 << 3);
|
p->regval = __vcpu_sys_reg(vcpu, r->reg);
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static int set_oslsr_el1(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
|
||||||
|
const struct kvm_one_reg *reg, void __user *uaddr)
|
||||||
|
{
|
||||||
|
u64 id = sys_reg_to_index(rd);
|
||||||
|
u64 val;
|
||||||
|
int err;
|
||||||
|
|
||||||
|
err = reg_from_user(&val, uaddr, id);
|
||||||
|
if (err)
|
||||||
|
return err;
|
||||||
|
|
||||||
|
/*
|
||||||
|
* The only modifiable bit is the OSLK bit. Refuse the write if
|
||||||
|
* userspace attempts to change any other bit in the register.
|
||||||
|
*/
|
||||||
|
if ((val ^ rd->val) & ~SYS_OSLSR_OSLK)
|
||||||
|
return -EINVAL;
|
||||||
|
|
||||||
|
__vcpu_sys_reg(vcpu, rd->reg) = val;
|
||||||
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
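To make the OSLAR_EL1 -> OSLSR_EL1 forwarding above easy to eyeball, here is a standalone sketch of the same bit manipulation. The bit positions (OSLAR_EL1.OSLK at bit 0, OSLSR_EL1.OSLK at bit 1, OSLM signalled via bit 3 in the reset value) follow the definitions used above and are restated here as plain constants purely for illustration.

#include <stdint.h>
#include <stdio.h>

#define OSLAR_OSLK	(1UL << 0)
#define OSLSR_OSLK	(1UL << 1)

/* Mirror of the trap handler: copy the written OSLK bit into OSLSR. */
static uint64_t forward_oslk(uint64_t oslsr, uint64_t oslar_write)
{
	oslsr &= ~OSLSR_OSLK;
	if (oslar_write & OSLAR_OSLK)
		oslsr |= OSLSR_OSLK;
	return oslsr;
}

int main(void)
{
	/* Guest writes 1 to OSLAR_EL1: the OS lock is taken. */
	printf("OSLSR after lock:   %#llx\n",
	       (unsigned long long)forward_oslk(1UL << 3, OSLAR_OSLK));
	/* Guest writes 0 to OSLAR_EL1: the OS lock is released. */
	printf("OSLSR after unlock: %#llx\n",
	       (unsigned long long)forward_oslk((1UL << 3) | OSLSR_OSLK, 0));
	return 0;
}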
static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
|
static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
|
||||||
|
@ -1169,10 +1212,6 @@ static bool access_raz_id_reg(struct kvm_vcpu *vcpu,
|
||||||
return __access_id_reg(vcpu, p, r, true);
|
return __access_id_reg(vcpu, p, r, true);
|
||||||
}
|
}
|
||||||
|
|
||||||
static int reg_from_user(u64 *val, const void __user *uaddr, u64 id);
|
|
||||||
static int reg_to_user(void __user *uaddr, const u64 *val, u64 id);
|
|
||||||
static u64 sys_reg_to_index(const struct sys_reg_desc *reg);
|
|
||||||
|
|
||||||
/* Visibility overrides for SVE-specific control registers */
|
/* Visibility overrides for SVE-specific control registers */
|
||||||
static unsigned int sve_visibility(const struct kvm_vcpu *vcpu,
|
static unsigned int sve_visibility(const struct kvm_vcpu *vcpu,
|
||||||
const struct sys_reg_desc *rd)
|
const struct sys_reg_desc *rd)
|
||||||
|
@ -1423,9 +1462,9 @@ static unsigned int mte_visibility(const struct kvm_vcpu *vcpu,
|
||||||
* Debug handling: We do trap most, if not all debug related system
|
* Debug handling: We do trap most, if not all debug related system
|
||||||
* registers. The implementation is good enough to ensure that a guest
|
* registers. The implementation is good enough to ensure that a guest
|
||||||
* can use these with minimal performance degradation. The drawback is
|
* can use these with minimal performance degradation. The drawback is
|
||||||
* that we don't implement any of the external debug, none of the
|
* that we don't implement any of the external debug architecture.
|
||||||
* OSlock protocol. This should be revisited if we ever encounter a
|
* This should be revisited if we ever encounter a more demanding
|
||||||
* more demanding guest...
|
* guest...
|
||||||
*/
|
*/
|
||||||
static const struct sys_reg_desc sys_reg_descs[] = {
|
static const struct sys_reg_desc sys_reg_descs[] = {
|
||||||
{ SYS_DESC(SYS_DC_ISW), access_dcsw },
|
{ SYS_DESC(SYS_DC_ISW), access_dcsw },
|
||||||
|
@ -1452,8 +1491,9 @@ static const struct sys_reg_desc sys_reg_descs[] = {
|
||||||
DBG_BCR_BVR_WCR_WVR_EL1(15),
|
DBG_BCR_BVR_WCR_WVR_EL1(15),
|
||||||
|
|
||||||
{ SYS_DESC(SYS_MDRAR_EL1), trap_raz_wi },
|
{ SYS_DESC(SYS_MDRAR_EL1), trap_raz_wi },
|
||||||
{ SYS_DESC(SYS_OSLAR_EL1), trap_raz_wi },
|
{ SYS_DESC(SYS_OSLAR_EL1), trap_oslar_el1 },
|
||||||
{ SYS_DESC(SYS_OSLSR_EL1), trap_oslsr_el1 },
|
{ SYS_DESC(SYS_OSLSR_EL1), trap_oslsr_el1, reset_val, OSLSR_EL1,
|
||||||
|
SYS_OSLSR_OSLM_IMPLEMENTED, .set_user = set_oslsr_el1, },
|
||||||
{ SYS_DESC(SYS_OSDLR_EL1), trap_raz_wi },
|
{ SYS_DESC(SYS_OSDLR_EL1), trap_raz_wi },
|
||||||
{ SYS_DESC(SYS_DBGPRCR_EL1), trap_raz_wi },
|
{ SYS_DESC(SYS_DBGPRCR_EL1), trap_raz_wi },
|
||||||
{ SYS_DESC(SYS_DBGCLAIMSET_EL1), trap_raz_wi },
|
{ SYS_DESC(SYS_DBGCLAIMSET_EL1), trap_raz_wi },
|
||||||
|
@ -1925,10 +1965,10 @@ static const struct sys_reg_desc cp14_regs[] = {
|
||||||
|
|
||||||
DBGBXVR(0),
|
DBGBXVR(0),
|
||||||
/* DBGOSLAR */
|
/* DBGOSLAR */
|
||||||
{ Op1( 0), CRn( 1), CRm( 0), Op2( 4), trap_raz_wi },
|
{ Op1( 0), CRn( 1), CRm( 0), Op2( 4), trap_oslar_el1 },
|
||||||
DBGBXVR(1),
|
DBGBXVR(1),
|
||||||
/* DBGOSLSR */
|
/* DBGOSLSR */
|
||||||
{ Op1( 0), CRn( 1), CRm( 1), Op2( 4), trap_oslsr_el1 },
|
{ Op1( 0), CRn( 1), CRm( 1), Op2( 4), trap_oslsr_el1, NULL, OSLSR_EL1 },
|
||||||
DBGBXVR(2),
|
DBGBXVR(2),
|
||||||
DBGBXVR(3),
|
DBGBXVR(3),
|
||||||
/* DBGOSDLR */
|
/* DBGOSDLR */
|
||||||
|
|
|
@ -37,7 +37,7 @@ struct vgic_global kvm_vgic_global_state __ro_after_init = {
|
||||||
* If you need to take multiple locks, always take the upper lock first,
|
* If you need to take multiple locks, always take the upper lock first,
|
||||||
* then the lower ones, e.g. first take the its_lock, then the irq_lock.
|
* then the lower ones, e.g. first take the its_lock, then the irq_lock.
|
||||||
* If you are already holding a lock and need to take a higher one, you
|
* If you are already holding a lock and need to take a higher one, you
|
||||||
* have to drop the lower ranking lock first and re-aquire it after having
|
* have to drop the lower ranking lock first and re-acquire it after having
|
||||||
* taken the upper one.
|
* taken the upper one.
|
||||||
*
|
*
|
||||||
* When taking more than one ap_list_lock at the same time, always take the
|
* When taking more than one ap_list_lock at the same time, always take the
|
||||||
|
|
|
@ -0,0 +1,196 @@
|
||||||
|
// SPDX-License-Identifier: GPL-2.0
|
||||||
|
/*
|
||||||
|
* VMID allocator.
|
||||||
|
*
|
||||||
|
* Based on Arm64 ASID allocator algorithm.
|
||||||
|
* Please refer to arch/arm64/mm/context.c for detailed
|
||||||
|
* comments on algorithm.
|
||||||
|
*
|
||||||
|
* Copyright (C) 2002-2003 Deep Blue Solutions Ltd, all rights reserved.
|
||||||
|
* Copyright (C) 2012 ARM Ltd.
|
||||||
|
*/
|
||||||
|
|
||||||
|
#include <linux/bitfield.h>
|
||||||
|
#include <linux/bitops.h>
|
||||||
|
|
||||||
|
#include <asm/kvm_asm.h>
|
||||||
|
#include <asm/kvm_mmu.h>
|
||||||
|
|
||||||
|
unsigned int kvm_arm_vmid_bits;
|
||||||
|
static DEFINE_RAW_SPINLOCK(cpu_vmid_lock);
|
||||||
|
|
||||||
|
static atomic64_t vmid_generation;
|
||||||
|
static unsigned long *vmid_map;
|
||||||
|
|
||||||
|
static DEFINE_PER_CPU(atomic64_t, active_vmids);
|
||||||
|
static DEFINE_PER_CPU(u64, reserved_vmids);
|
||||||
|
|
||||||
|
#define VMID_MASK (~GENMASK(kvm_arm_vmid_bits - 1, 0))
|
||||||
|
#define VMID_FIRST_VERSION (1UL << kvm_arm_vmid_bits)
|
||||||
|
|
||||||
|
#define NUM_USER_VMIDS VMID_FIRST_VERSION
|
||||||
|
#define vmid2idx(vmid) ((vmid) & ~VMID_MASK)
|
||||||
|
#define idx2vmid(idx) vmid2idx(idx)
|
||||||
|
|
||||||
|
/*
|
||||||
|
* As VMID #0 is always reserved, we will never allocate it
|
||||||
|
* below, so it can be treated as invalid. This is used to
|
||||||
|
* set the active_vmids on vCPU schedule out.
|
||||||
|
*/
|
||||||
|
#define VMID_ACTIVE_INVALID VMID_FIRST_VERSION
|
||||||
|
|
||||||
|
#define vmid_gen_match(vmid) \
|
||||||
|
(!(((vmid) ^ atomic64_read(&vmid_generation)) >> kvm_arm_vmid_bits))
|
||||||
|
|
||||||
|
static void flush_context(void)
|
||||||
|
{
|
||||||
|
int cpu;
|
||||||
|
u64 vmid;
|
||||||
|
|
||||||
|
bitmap_clear(vmid_map, 0, NUM_USER_VMIDS);
|
||||||
|
|
||||||
|
for_each_possible_cpu(cpu) {
|
||||||
|
vmid = atomic64_xchg_relaxed(&per_cpu(active_vmids, cpu), 0);
|
||||||
|
|
||||||
|
/* Preserve reserved VMID */
|
||||||
|
if (vmid == 0)
|
||||||
|
vmid = per_cpu(reserved_vmids, cpu);
|
||||||
|
__set_bit(vmid2idx(vmid), vmid_map);
|
||||||
|
per_cpu(reserved_vmids, cpu) = vmid;
|
||||||
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Unlike the ASID allocator, we expect less frequent rollover in
|
||||||
|
* case of VMIDs. Hence, instead of marking the CPU as
|
||||||
|
* flush_pending and issuing a local context invalidation on
|
||||||
|
* the next context-switch, we broadcast TLB flush + I-cache
|
||||||
|
* invalidation over the inner shareable domain on rollover.
|
||||||
|
*/
|
||||||
|
kvm_call_hyp(__kvm_flush_vm_context);
|
||||||
|
}
|
||||||
|
|
||||||
|
static bool check_update_reserved_vmid(u64 vmid, u64 newvmid)
|
||||||
|
{
|
||||||
|
int cpu;
|
||||||
|
bool hit = false;
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Iterate over the set of reserved VMIDs looking for a match
|
||||||
|
* and update to use newvmid (i.e. the same VMID in the current
|
||||||
|
* generation).
|
||||||
|
*/
|
||||||
|
for_each_possible_cpu(cpu) {
|
||||||
|
if (per_cpu(reserved_vmids, cpu) == vmid) {
|
||||||
|
hit = true;
|
||||||
|
per_cpu(reserved_vmids, cpu) = newvmid;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return hit;
|
||||||
|
}
|
||||||
|
|
||||||
|
static u64 new_vmid(struct kvm_vmid *kvm_vmid)
|
||||||
|
{
|
||||||
|
static u32 cur_idx = 1;
|
||||||
|
u64 vmid = atomic64_read(&kvm_vmid->id);
|
||||||
|
u64 generation = atomic64_read(&vmid_generation);
|
||||||
|
|
||||||
|
if (vmid != 0) {
|
||||||
|
u64 newvmid = generation | (vmid & ~VMID_MASK);
|
||||||
|
|
||||||
|
if (check_update_reserved_vmid(vmid, newvmid)) {
|
||||||
|
atomic64_set(&kvm_vmid->id, newvmid);
|
||||||
|
return newvmid;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (!__test_and_set_bit(vmid2idx(vmid), vmid_map)) {
|
||||||
|
atomic64_set(&kvm_vmid->id, newvmid);
|
||||||
|
return newvmid;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
vmid = find_next_zero_bit(vmid_map, NUM_USER_VMIDS, cur_idx);
|
||||||
|
if (vmid != NUM_USER_VMIDS)
|
||||||
|
goto set_vmid;
|
||||||
|
|
||||||
|
/* We're out of VMIDs, so increment the global generation count */
|
||||||
|
generation = atomic64_add_return_relaxed(VMID_FIRST_VERSION,
|
||||||
|
&vmid_generation);
|
||||||
|
flush_context();
|
||||||
|
|
||||||
|
/* We have more VMIDs than CPUs, so this will always succeed */
|
||||||
|
vmid = find_next_zero_bit(vmid_map, NUM_USER_VMIDS, 1);
|
||||||
|
|
||||||
|
set_vmid:
|
||||||
|
__set_bit(vmid, vmid_map);
|
||||||
|
cur_idx = vmid;
|
||||||
|
vmid = idx2vmid(vmid) | generation;
|
||||||
|
atomic64_set(&kvm_vmid->id, vmid);
|
||||||
|
return vmid;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Called from vCPU sched out with preemption disabled */
|
||||||
|
void kvm_arm_vmid_clear_active(void)
|
||||||
|
{
|
||||||
|
atomic64_set(this_cpu_ptr(&active_vmids), VMID_ACTIVE_INVALID);
|
||||||
|
}
|
||||||
|
|
||||||
|
void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid)
|
||||||
|
{
|
||||||
|
unsigned long flags;
|
||||||
|
u64 vmid, old_active_vmid;
|
||||||
|
|
||||||
|
vmid = atomic64_read(&kvm_vmid->id);
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Please refer to the comments in check_and_switch_context() in
|
||||||
|
* arch/arm64/mm/context.c.
|
||||||
|
*
|
||||||
|
* Unlike the ASID allocator, we set the active_vmids to
|
||||||
|
* VMID_ACTIVE_INVALID on vCPU schedule out to avoid
|
||||||
|
* reserving the VMID space needlessly on rollover.
|
||||||
|
* Hence explicitly check here for a "!= 0" to
|
||||||
|
* handle the sync with a concurrent rollover.
|
||||||
|
*/
|
||||||
|
old_active_vmid = atomic64_read(this_cpu_ptr(&active_vmids));
|
||||||
|
if (old_active_vmid != 0 && vmid_gen_match(vmid) &&
|
||||||
|
0 != atomic64_cmpxchg_relaxed(this_cpu_ptr(&active_vmids),
|
||||||
|
old_active_vmid, vmid))
|
||||||
|
return;
|
||||||
|
|
||||||
|
raw_spin_lock_irqsave(&cpu_vmid_lock, flags);
|
||||||
|
|
||||||
|
/* Check that our VMID belongs to the current generation. */
|
||||||
|
vmid = atomic64_read(&kvm_vmid->id);
|
||||||
|
if (!vmid_gen_match(vmid))
|
||||||
|
vmid = new_vmid(kvm_vmid);
|
||||||
|
|
||||||
|
atomic64_set(this_cpu_ptr(&active_vmids), vmid);
|
||||||
|
raw_spin_unlock_irqrestore(&cpu_vmid_lock, flags);
|
||||||
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Initialize the VMID allocator
|
||||||
|
*/
|
||||||
|
int kvm_arm_vmid_alloc_init(void)
|
||||||
|
{
|
||||||
|
kvm_arm_vmid_bits = kvm_get_vmid_bits();
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Expect allocation after rollover to fail if we don't have
|
||||||
|
* at least one more VMID than CPUs. VMID #0 is always reserved.
|
||||||
|
*/
|
||||||
|
WARN_ON(NUM_USER_VMIDS - 1 <= num_possible_cpus());
|
||||||
|
atomic64_set(&vmid_generation, VMID_FIRST_VERSION);
|
||||||
|
vmid_map = kcalloc(BITS_TO_LONGS(NUM_USER_VMIDS),
|
||||||
|
sizeof(*vmid_map), GFP_KERNEL);
|
||||||
|
if (!vmid_map)
|
||||||
|
return -ENOMEM;
|
||||||
|
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
void kvm_arm_vmid_alloc_free(void)
|
||||||
|
{
|
||||||
|
kfree(vmid_map);
|
||||||
|
}
|
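A quick standalone illustration of the encoding this allocator relies on: the low kvm_arm_vmid_bits of the 64-bit value hold the VMID index and the upper bits hold the generation, which is exactly what vmid_gen_match() tests. A 16-bit VMID width is assumed for the example; the real width comes from kvm_get_vmid_bits().

#include <stdint.h>
#include <stdio.h>

#define VMID_BITS	16
#define GEN_FIRST	(1ULL << VMID_BITS)

/* Same test as vmid_gen_match(): generations must agree above VMID_BITS. */
static int gen_match(uint64_t vmid, uint64_t current_gen)
{
	return !((vmid ^ current_gen) >> VMID_BITS);
}

int main(void)
{
	uint64_t gen = GEN_FIRST;		/* generation 1 */
	uint64_t vmid = gen | 42;		/* index 42, generation 1 */

	printf("match in same generation: %d\n", gen_match(vmid, gen));
	gen += GEN_FIRST;			/* rollover -> generation 2 */
	printf("match after rollover:     %d\n", gen_match(vmid, gen));
	return 0;
}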
|
@ -252,7 +252,7 @@ int kvmppc_uvmem_slot_init(struct kvm *kvm, const struct kvm_memory_slot *slot)
|
||||||
p = kzalloc(sizeof(*p), GFP_KERNEL);
|
p = kzalloc(sizeof(*p), GFP_KERNEL);
|
||||||
if (!p)
|
if (!p)
|
||||||
return -ENOMEM;
|
return -ENOMEM;
|
||||||
p->pfns = vzalloc(array_size(slot->npages, sizeof(*p->pfns)));
|
p->pfns = vcalloc(slot->npages, sizeof(*p->pfns));
|
||||||
if (!p->pfns) {
|
if (!p->pfns) {
|
||||||
kfree(p);
|
kfree(p);
|
||||||
return -ENOMEM;
|
return -ENOMEM;
|
||||||
|
|
|
@ -228,6 +228,7 @@ void kvm_riscv_stage2_vmid_update(struct kvm_vcpu *vcpu);
|
||||||
|
|
||||||
void __kvm_riscv_unpriv_trap(void);
|
void __kvm_riscv_unpriv_trap(void);
|
||||||
|
|
||||||
|
void kvm_riscv_vcpu_wfi(struct kvm_vcpu *vcpu);
|
||||||
unsigned long kvm_riscv_vcpu_unpriv_read(struct kvm_vcpu *vcpu,
|
unsigned long kvm_riscv_vcpu_unpriv_read(struct kvm_vcpu *vcpu,
|
||||||
bool read_insn,
|
bool read_insn,
|
||||||
unsigned long guest_addr,
|
unsigned long guest_addr,
|
||||||
|
|
|
@ -12,7 +12,7 @@
|
||||||
#define KVM_SBI_IMPID 3
|
#define KVM_SBI_IMPID 3
|
||||||
|
|
||||||
#define KVM_SBI_VERSION_MAJOR 0
|
#define KVM_SBI_VERSION_MAJOR 0
|
||||||
#define KVM_SBI_VERSION_MINOR 2
|
#define KVM_SBI_VERSION_MINOR 3
|
||||||
|
|
||||||
struct kvm_vcpu_sbi_extension {
|
struct kvm_vcpu_sbi_extension {
|
||||||
unsigned long extid_start;
|
unsigned long extid_start;
|
||||||
|
@ -28,6 +28,9 @@ struct kvm_vcpu_sbi_extension {
|
||||||
};
|
};
|
||||||
|
|
||||||
void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run);
|
void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run);
|
||||||
|
void kvm_riscv_vcpu_sbi_system_reset(struct kvm_vcpu *vcpu,
|
||||||
|
struct kvm_run *run,
|
||||||
|
u32 type, u64 flags);
|
||||||
const struct kvm_vcpu_sbi_extension *kvm_vcpu_sbi_find_ext(unsigned long extid);
|
const struct kvm_vcpu_sbi_extension *kvm_vcpu_sbi_find_ext(unsigned long extid);
|
||||||
|
|
||||||
#endif /* __RISCV_KVM_VCPU_SBI_H__ */
|
#endif /* __RISCV_KVM_VCPU_SBI_H__ */
|
||||||
|
|
|
@ -71,15 +71,32 @@ enum sbi_ext_hsm_fid {
|
||||||
SBI_EXT_HSM_HART_START = 0,
|
SBI_EXT_HSM_HART_START = 0,
|
||||||
SBI_EXT_HSM_HART_STOP,
|
SBI_EXT_HSM_HART_STOP,
|
||||||
SBI_EXT_HSM_HART_STATUS,
|
SBI_EXT_HSM_HART_STATUS,
|
||||||
|
SBI_EXT_HSM_HART_SUSPEND,
|
||||||
};
|
};
|
||||||
|
|
||||||
enum sbi_hsm_hart_status {
|
enum sbi_hsm_hart_state {
|
||||||
SBI_HSM_HART_STATUS_STARTED = 0,
|
SBI_HSM_STATE_STARTED = 0,
|
||||||
SBI_HSM_HART_STATUS_STOPPED,
|
SBI_HSM_STATE_STOPPED,
|
||||||
SBI_HSM_HART_STATUS_START_PENDING,
|
SBI_HSM_STATE_START_PENDING,
|
||||||
SBI_HSM_HART_STATUS_STOP_PENDING,
|
SBI_HSM_STATE_STOP_PENDING,
|
||||||
|
SBI_HSM_STATE_SUSPENDED,
|
||||||
|
SBI_HSM_STATE_SUSPEND_PENDING,
|
||||||
|
SBI_HSM_STATE_RESUME_PENDING,
|
||||||
};
|
};
|
||||||
|
|
||||||
|
#define SBI_HSM_SUSP_BASE_MASK 0x7fffffff
|
||||||
|
#define SBI_HSM_SUSP_NON_RET_BIT 0x80000000
|
||||||
|
#define SBI_HSM_SUSP_PLAT_BASE 0x10000000
|
||||||
|
|
||||||
|
#define SBI_HSM_SUSPEND_RET_DEFAULT 0x00000000
|
||||||
|
#define SBI_HSM_SUSPEND_RET_PLATFORM SBI_HSM_SUSP_PLAT_BASE
|
||||||
|
#define SBI_HSM_SUSPEND_RET_LAST SBI_HSM_SUSP_BASE_MASK
|
||||||
|
#define SBI_HSM_SUSPEND_NON_RET_DEFAULT SBI_HSM_SUSP_NON_RET_BIT
|
||||||
|
#define SBI_HSM_SUSPEND_NON_RET_PLATFORM (SBI_HSM_SUSP_NON_RET_BIT | \
|
||||||
|
SBI_HSM_SUSP_PLAT_BASE)
|
||||||
|
#define SBI_HSM_SUSPEND_NON_RET_LAST (SBI_HSM_SUSP_NON_RET_BIT | \
|
||||||
|
SBI_HSM_SUSP_BASE_MASK)
|
||||||
|
|
||||||
enum sbi_ext_srst_fid {
|
enum sbi_ext_srst_fid {
|
||||||
SBI_EXT_SRST_RESET = 0,
|
SBI_EXT_SRST_RESET = 0,
|
||||||
};
|
};
|
||||||
|
|
|
@ -111,7 +111,7 @@ static int sbi_cpu_is_stopped(unsigned int cpuid)
|
||||||
|
|
||||||
rc = sbi_hsm_hart_get_status(hartid);
|
rc = sbi_hsm_hart_get_status(hartid);
|
||||||
|
|
||||||
if (rc == SBI_HSM_HART_STATUS_STOPPED)
|
if (rc == SBI_HSM_STATE_STOPPED)
|
||||||
return 0;
|
return 0;
|
||||||
return rc;
|
return rc;
|
||||||
}
|
}
|
||||||
|
|
|
@ -144,12 +144,7 @@ static int system_opcode_insn(struct kvm_vcpu *vcpu,
|
||||||
{
|
{
|
||||||
if ((insn & INSN_MASK_WFI) == INSN_MATCH_WFI) {
|
if ((insn & INSN_MASK_WFI) == INSN_MATCH_WFI) {
|
||||||
vcpu->stat.wfi_exit_stat++;
|
vcpu->stat.wfi_exit_stat++;
|
||||||
if (!kvm_arch_vcpu_runnable(vcpu)) {
|
kvm_riscv_vcpu_wfi(vcpu);
|
||||||
srcu_read_unlock(&vcpu->kvm->srcu, vcpu->arch.srcu_idx);
|
|
||||||
kvm_vcpu_halt(vcpu);
|
|
||||||
vcpu->arch.srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
|
|
||||||
kvm_clear_request(KVM_REQ_UNHALT, vcpu);
|
|
||||||
}
|
|
||||||
vcpu->arch.guest_context.sepc += INSN_LEN(insn);
|
vcpu->arch.guest_context.sepc += INSN_LEN(insn);
|
||||||
return 1;
|
return 1;
|
||||||
}
|
}
|
||||||
|
@ -453,6 +448,21 @@ static int stage2_page_fault(struct kvm_vcpu *vcpu, struct kvm_run *run,
|
||||||
return 1;
|
return 1;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* kvm_riscv_vcpu_wfi -- Emulate wait for interrupt (WFI) behaviour
|
||||||
|
*
|
||||||
|
* @vcpu: The VCPU pointer
|
||||||
|
*/
|
||||||
|
void kvm_riscv_vcpu_wfi(struct kvm_vcpu *vcpu)
|
||||||
|
{
|
||||||
|
if (!kvm_arch_vcpu_runnable(vcpu)) {
|
||||||
|
srcu_read_unlock(&vcpu->kvm->srcu, vcpu->arch.srcu_idx);
|
||||||
|
kvm_vcpu_halt(vcpu);
|
||||||
|
vcpu->arch.srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
|
||||||
|
kvm_clear_request(KVM_REQ_UNHALT, vcpu);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* kvm_riscv_vcpu_unpriv_read -- Read machine word from Guest memory
|
* kvm_riscv_vcpu_unpriv_read -- Read machine word from Guest memory
|
||||||
*
|
*
|
||||||
|
|
|
@ -45,6 +45,7 @@ extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base;
|
||||||
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_time;
|
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_time;
|
||||||
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_ipi;
|
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_ipi;
|
||||||
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence;
|
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence;
|
||||||
|
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_srst;
|
||||||
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm;
|
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm;
|
||||||
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_experimental;
|
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_experimental;
|
||||||
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_vendor;
|
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_vendor;
|
||||||
|
@ -55,6 +56,7 @@ static const struct kvm_vcpu_sbi_extension *sbi_ext[] = {
|
||||||
&vcpu_sbi_ext_time,
|
&vcpu_sbi_ext_time,
|
||||||
&vcpu_sbi_ext_ipi,
|
&vcpu_sbi_ext_ipi,
|
||||||
&vcpu_sbi_ext_rfence,
|
&vcpu_sbi_ext_rfence,
|
||||||
|
&vcpu_sbi_ext_srst,
|
||||||
&vcpu_sbi_ext_hsm,
|
&vcpu_sbi_ext_hsm,
|
||||||
&vcpu_sbi_ext_experimental,
|
&vcpu_sbi_ext_experimental,
|
||||||
&vcpu_sbi_ext_vendor,
|
&vcpu_sbi_ext_vendor,
|
||||||
|
@ -79,6 +81,23 @@ void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run)
|
||||||
run->riscv_sbi.ret[1] = cp->a1;
|
run->riscv_sbi.ret[1] = cp->a1;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
void kvm_riscv_vcpu_sbi_system_reset(struct kvm_vcpu *vcpu,
|
||||||
|
struct kvm_run *run,
|
||||||
|
u32 type, u64 flags)
|
||||||
|
{
|
||||||
|
unsigned long i;
|
||||||
|
struct kvm_vcpu *tmp;
|
||||||
|
|
||||||
|
kvm_for_each_vcpu(i, tmp, vcpu->kvm)
|
||||||
|
tmp->arch.power_off = true;
|
||||||
|
kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_SLEEP);
|
||||||
|
|
||||||
|
memset(&run->system_event, 0, sizeof(run->system_event));
|
||||||
|
run->system_event.type = type;
|
||||||
|
run->system_event.flags = flags;
|
||||||
|
run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
|
||||||
|
}
|
||||||
|
|
||||||
int kvm_riscv_vcpu_sbi_return(struct kvm_vcpu *vcpu, struct kvm_run *run)
|
int kvm_riscv_vcpu_sbi_return(struct kvm_vcpu *vcpu, struct kvm_run *run)
|
||||||
{
|
{
|
||||||
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
|
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
|
||||||
|
|
|
@ -60,9 +60,11 @@ static int kvm_sbi_hsm_vcpu_get_status(struct kvm_vcpu *vcpu)
|
||||||
if (!target_vcpu)
|
if (!target_vcpu)
|
||||||
return -EINVAL;
|
return -EINVAL;
|
||||||
if (!target_vcpu->arch.power_off)
|
if (!target_vcpu->arch.power_off)
|
||||||
return SBI_HSM_HART_STATUS_STARTED;
|
return SBI_HSM_STATE_STARTED;
|
||||||
|
else if (vcpu->stat.generic.blocking)
|
||||||
|
return SBI_HSM_STATE_SUSPENDED;
|
||||||
else
|
else
|
||||||
return SBI_HSM_HART_STATUS_STOPPED;
|
return SBI_HSM_STATE_STOPPED;
|
||||||
}
|
}
|
||||||
|
|
||||||
static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
|
static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
|
||||||
|
@ -91,6 +93,18 @@ static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
|
||||||
ret = 0;
|
ret = 0;
|
||||||
}
|
}
|
||||||
break;
|
break;
|
||||||
|
case SBI_EXT_HSM_HART_SUSPEND:
|
||||||
|
switch (cp->a0) {
|
||||||
|
case SBI_HSM_SUSPEND_RET_DEFAULT:
|
||||||
|
kvm_riscv_vcpu_wfi(vcpu);
|
||||||
|
break;
|
||||||
|
case SBI_HSM_SUSPEND_NON_RET_DEFAULT:
|
||||||
|
ret = -EOPNOTSUPP;
|
||||||
|
break;
|
||||||
|
default:
|
||||||
|
ret = -EINVAL;
|
||||||
|
}
|
||||||
|
break;
|
||||||
default:
|
default:
|
||||||
ret = -EOPNOTSUPP;
|
ret = -EOPNOTSUPP;
|
||||||
}
|
}
|
||||||
|
|
|
@ -130,3 +130,47 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence = {
|
||||||
.extid_end = SBI_EXT_RFENCE,
|
.extid_end = SBI_EXT_RFENCE,
|
||||||
.handler = kvm_sbi_ext_rfence_handler,
|
.handler = kvm_sbi_ext_rfence_handler,
|
||||||
};
|
};
|
||||||
|
|
||||||
|
static int kvm_sbi_ext_srst_handler(struct kvm_vcpu *vcpu,
|
||||||
|
struct kvm_run *run,
|
||||||
|
unsigned long *out_val,
|
||||||
|
struct kvm_cpu_trap *utrap, bool *exit)
|
||||||
|
{
|
||||||
|
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
|
||||||
|
unsigned long funcid = cp->a6;
|
||||||
|
u32 reason = cp->a1;
|
||||||
|
u32 type = cp->a0;
|
||||||
|
int ret = 0;
|
||||||
|
|
||||||
|
switch (funcid) {
|
||||||
|
case SBI_EXT_SRST_RESET:
|
||||||
|
switch (type) {
|
||||||
|
case SBI_SRST_RESET_TYPE_SHUTDOWN:
|
||||||
|
kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
|
||||||
|
KVM_SYSTEM_EVENT_SHUTDOWN,
|
||||||
|
reason);
|
||||||
|
*exit = true;
|
||||||
|
break;
|
||||||
|
case SBI_SRST_RESET_TYPE_COLD_REBOOT:
|
||||||
|
case SBI_SRST_RESET_TYPE_WARM_REBOOT:
|
||||||
|
kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
|
||||||
|
KVM_SYSTEM_EVENT_RESET,
|
||||||
|
reason);
|
||||||
|
*exit = true;
|
||||||
|
break;
|
||||||
|
default:
|
||||||
|
ret = -EOPNOTSUPP;
|
||||||
|
}
|
||||||
|
break;
|
||||||
|
default:
|
||||||
|
ret = -EOPNOTSUPP;
|
||||||
|
}
|
||||||
|
|
||||||
|
return ret;
|
||||||
|
}
|
||||||
|
|
||||||
|
const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_srst = {
|
||||||
|
.extid_start = SBI_EXT_SRST,
|
||||||
|
.extid_end = SBI_EXT_SRST,
|
||||||
|
.handler = kvm_sbi_ext_srst_handler,
|
||||||
|
};
|
||||||
|
|
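For reference, a rough guest-side sketch of the SBI SRST call that the new handler above services. The extension/function IDs and the a0/a1/a6/a7 calling convention follow the SBI v0.3 specification ("SRST" = 0x53525354, FID 0 = system reset, type 0 = shutdown); treat the exact encodings as restated assumptions, and note the snippet is only meaningful when built for and run inside a RISC-V guest.

#define SBI_EXT_SRST_EID	0x53525354UL
#define SBI_EXT_SRST_RESET_FID	0UL
#define SBI_SRST_TYPE_SHUTDOWN	0UL

/* Illustrative guest helper: ask the hypervisor to shut the VM down. */
long sbi_srst_shutdown(unsigned long reason)
{
	register unsigned long a0 asm("a0") = SBI_SRST_TYPE_SHUTDOWN;
	register unsigned long a1 asm("a1") = reason;
	register unsigned long a6 asm("a6") = SBI_EXT_SRST_RESET_FID;
	register unsigned long a7 asm("a7") = SBI_EXT_SRST_EID;

	asm volatile("ecall"
		     : "+r" (a0), "+r" (a1)
		     : "r" (a6), "r" (a7)
		     : "memory");
	return (long)a0;	/* SBI error code; not reached on success */
}

On success the call does not return; a non-zero value in a0 means the reset request was rejected and forwarded back to the guest.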
|
@ -14,21 +14,6 @@
|
||||||
#include <asm/kvm_vcpu_timer.h>
|
#include <asm/kvm_vcpu_timer.h>
|
||||||
#include <asm/kvm_vcpu_sbi.h>
|
#include <asm/kvm_vcpu_sbi.h>
|
||||||
|
|
||||||
static void kvm_sbi_system_shutdown(struct kvm_vcpu *vcpu,
|
|
||||||
struct kvm_run *run, u32 type)
|
|
||||||
{
|
|
||||||
unsigned long i;
|
|
||||||
struct kvm_vcpu *tmp;
|
|
||||||
|
|
||||||
kvm_for_each_vcpu(i, tmp, vcpu->kvm)
|
|
||||||
tmp->arch.power_off = true;
|
|
||||||
kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_SLEEP);
|
|
||||||
|
|
||||||
memset(&run->system_event, 0, sizeof(run->system_event));
|
|
||||||
run->system_event.type = type;
|
|
||||||
run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
|
|
||||||
}
|
|
||||||
|
|
||||||
static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
|
static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
|
||||||
unsigned long *out_val,
|
unsigned long *out_val,
|
||||||
struct kvm_cpu_trap *utrap,
|
struct kvm_cpu_trap *utrap,
|
||||||
|
@ -80,7 +65,8 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
|
||||||
}
|
}
|
||||||
break;
|
break;
|
||||||
case SBI_EXT_0_1_SHUTDOWN:
|
case SBI_EXT_0_1_SHUTDOWN:
|
||||||
kvm_sbi_system_shutdown(vcpu, run, KVM_SYSTEM_EVENT_SHUTDOWN);
|
kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
|
||||||
|
KVM_SYSTEM_EVENT_SHUTDOWN, 0);
|
||||||
*exit = true;
|
*exit = true;
|
||||||
break;
|
break;
|
||||||
case SBI_EXT_0_1_REMOTE_FENCE_I:
|
case SBI_EXT_0_1_REMOTE_FENCE_I:
|
||||||
|
@ -111,7 +97,7 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
|
||||||
default:
|
default:
|
||||||
ret = -EINVAL;
|
ret = -EINVAL;
|
||||||
break;
|
break;
|
||||||
};
|
}
|
||||||
|
|
||||||
return ret;
|
return ret;
|
||||||
}
|
}
|
||||||
|
|
|
@ -41,33 +41,37 @@ ENTRY(__kvm_riscv_switch_to)
|
||||||
REG_S s10, (KVM_ARCH_HOST_S10)(a0)
|
REG_S s10, (KVM_ARCH_HOST_S10)(a0)
|
||||||
REG_S s11, (KVM_ARCH_HOST_S11)(a0)
|
REG_S s11, (KVM_ARCH_HOST_S11)(a0)
|
||||||
|
|
||||||
/* Save Host and Restore Guest SSTATUS */
|
/* Load Guest CSR values */
|
||||||
REG_L t0, (KVM_ARCH_GUEST_SSTATUS)(a0)
|
REG_L t0, (KVM_ARCH_GUEST_SSTATUS)(a0)
|
||||||
|
REG_L t1, (KVM_ARCH_GUEST_HSTATUS)(a0)
|
||||||
|
REG_L t2, (KVM_ARCH_GUEST_SCOUNTEREN)(a0)
|
||||||
|
la t4, __kvm_switch_return
|
||||||
|
REG_L t5, (KVM_ARCH_GUEST_SEPC)(a0)
|
||||||
|
|
||||||
|
/* Save Host and Restore Guest SSTATUS */
|
||||||
csrrw t0, CSR_SSTATUS, t0
|
csrrw t0, CSR_SSTATUS, t0
|
||||||
REG_S t0, (KVM_ARCH_HOST_SSTATUS)(a0)
|
|
||||||
|
|
||||||
/* Save Host and Restore Guest HSTATUS */
|
/* Save Host and Restore Guest HSTATUS */
|
||||||
REG_L t1, (KVM_ARCH_GUEST_HSTATUS)(a0)
|
|
||||||
csrrw t1, CSR_HSTATUS, t1
|
csrrw t1, CSR_HSTATUS, t1
|
||||||
REG_S t1, (KVM_ARCH_HOST_HSTATUS)(a0)
|
|
||||||
|
|
||||||
/* Save Host and Restore Guest SCOUNTEREN */
|
/* Save Host and Restore Guest SCOUNTEREN */
|
||||||
REG_L t2, (KVM_ARCH_GUEST_SCOUNTEREN)(a0)
|
|
||||||
csrrw t2, CSR_SCOUNTEREN, t2
|
csrrw t2, CSR_SCOUNTEREN, t2
|
||||||
REG_S t2, (KVM_ARCH_HOST_SCOUNTEREN)(a0)
|
|
||||||
|
/* Save Host STVEC and change it to return path */
|
||||||
|
csrrw t4, CSR_STVEC, t4
|
||||||
|
|
||||||
/* Save Host SSCRATCH and change it to struct kvm_vcpu_arch pointer */
|
/* Save Host SSCRATCH and change it to struct kvm_vcpu_arch pointer */
|
||||||
csrrw t3, CSR_SSCRATCH, a0
|
csrrw t3, CSR_SSCRATCH, a0
|
||||||
REG_S t3, (KVM_ARCH_HOST_SSCRATCH)(a0)
|
|
||||||
|
|
||||||
/* Save Host STVEC and change it to return path */
|
|
||||||
la t4, __kvm_switch_return
|
|
||||||
csrrw t4, CSR_STVEC, t4
|
|
||||||
REG_S t4, (KVM_ARCH_HOST_STVEC)(a0)
|
|
||||||
|
|
||||||
/* Restore Guest SEPC */
|
/* Restore Guest SEPC */
|
||||||
REG_L t0, (KVM_ARCH_GUEST_SEPC)(a0)
|
csrw CSR_SEPC, t5
|
||||||
csrw CSR_SEPC, t0
|
|
||||||
|
/* Store Host CSR values */
|
||||||
|
REG_S t0, (KVM_ARCH_HOST_SSTATUS)(a0)
|
||||||
|
REG_S t1, (KVM_ARCH_HOST_HSTATUS)(a0)
|
||||||
|
REG_S t2, (KVM_ARCH_HOST_SCOUNTEREN)(a0)
|
||||||
|
REG_S t3, (KVM_ARCH_HOST_SSCRATCH)(a0)
|
||||||
|
REG_S t4, (KVM_ARCH_HOST_STVEC)(a0)
|
||||||
|
|
||||||
/* Restore Guest GPRs (except A0) */
|
/* Restore Guest GPRs (except A0) */
|
||||||
REG_L ra, (KVM_ARCH_GUEST_RA)(a0)
|
REG_L ra, (KVM_ARCH_GUEST_RA)(a0)
|
||||||
|
@ -145,32 +149,36 @@ __kvm_switch_return:
|
||||||
REG_S t5, (KVM_ARCH_GUEST_T5)(a0)
|
REG_S t5, (KVM_ARCH_GUEST_T5)(a0)
|
||||||
REG_S t6, (KVM_ARCH_GUEST_T6)(a0)
|
REG_S t6, (KVM_ARCH_GUEST_T6)(a0)
|
||||||
|
|
||||||
|
/* Load Host CSR values */
|
||||||
|
REG_L t1, (KVM_ARCH_HOST_STVEC)(a0)
|
||||||
|
REG_L t2, (KVM_ARCH_HOST_SSCRATCH)(a0)
|
||||||
|
REG_L t3, (KVM_ARCH_HOST_SCOUNTEREN)(a0)
|
||||||
|
REG_L t4, (KVM_ARCH_HOST_HSTATUS)(a0)
|
||||||
|
REG_L t5, (KVM_ARCH_HOST_SSTATUS)(a0)
|
||||||
|
|
||||||
/* Save Guest SEPC */
|
/* Save Guest SEPC */
|
||||||
csrr t0, CSR_SEPC
|
csrr t0, CSR_SEPC
|
||||||
REG_S t0, (KVM_ARCH_GUEST_SEPC)(a0)
|
|
||||||
|
|
||||||
/* Restore Host STVEC */
|
|
||||||
REG_L t1, (KVM_ARCH_HOST_STVEC)(a0)
|
|
||||||
csrw CSR_STVEC, t1
|
|
||||||
|
|
||||||
/* Save Guest A0 and Restore Host SSCRATCH */
|
/* Save Guest A0 and Restore Host SSCRATCH */
|
||||||
REG_L t2, (KVM_ARCH_HOST_SSCRATCH)(a0)
|
|
||||||
csrrw t2, CSR_SSCRATCH, t2
|
csrrw t2, CSR_SSCRATCH, t2
|
||||||
REG_S t2, (KVM_ARCH_GUEST_A0)(a0)
|
|
||||||
|
/* Restore Host STVEC */
|
||||||
|
csrw CSR_STVEC, t1
|
||||||
|
|
||||||
/* Save Guest and Restore Host SCOUNTEREN */
|
/* Save Guest and Restore Host SCOUNTEREN */
|
||||||
REG_L t3, (KVM_ARCH_HOST_SCOUNTEREN)(a0)
|
|
||||||
csrrw t3, CSR_SCOUNTEREN, t3
|
csrrw t3, CSR_SCOUNTEREN, t3
|
||||||
REG_S t3, (KVM_ARCH_GUEST_SCOUNTEREN)(a0)
|
|
||||||
|
|
||||||
/* Save Guest and Restore Host HSTATUS */
|
/* Save Guest and Restore Host HSTATUS */
|
||||||
REG_L t4, (KVM_ARCH_HOST_HSTATUS)(a0)
|
|
||||||
csrrw t4, CSR_HSTATUS, t4
|
csrrw t4, CSR_HSTATUS, t4
|
||||||
REG_S t4, (KVM_ARCH_GUEST_HSTATUS)(a0)
|
|
||||||
|
|
||||||
/* Save Guest and Restore Host SSTATUS */
|
/* Save Guest and Restore Host SSTATUS */
|
||||||
REG_L t5, (KVM_ARCH_HOST_SSTATUS)(a0)
|
|
||||||
csrrw t5, CSR_SSTATUS, t5
|
csrrw t5, CSR_SSTATUS, t5
|
||||||
|
|
||||||
|
/* Store Guest CSR values */
|
||||||
|
REG_S t0, (KVM_ARCH_GUEST_SEPC)(a0)
|
||||||
|
REG_S t2, (KVM_ARCH_GUEST_A0)(a0)
|
||||||
|
REG_S t3, (KVM_ARCH_GUEST_SCOUNTEREN)(a0)
|
||||||
|
REG_S t4, (KVM_ARCH_GUEST_HSTATUS)(a0)
|
||||||
REG_S t5, (KVM_ARCH_GUEST_SSTATUS)(a0)
|
REG_S t5, (KVM_ARCH_GUEST_SSTATUS)(a0)
|
||||||
|
|
||||||
/* Restore Host GPRs (except A0 and T0-T6) */
|
/* Restore Host GPRs (except A0 and T0-T6) */
|
||||||
|
|
|
@ -12,6 +12,8 @@
|
||||||
|
|
||||||
#define CR0_CLOCK_COMPARATOR_SIGN BIT(63 - 10)
|
#define CR0_CLOCK_COMPARATOR_SIGN BIT(63 - 10)
|
||||||
#define CR0_LOW_ADDRESS_PROTECTION BIT(63 - 35)
|
#define CR0_LOW_ADDRESS_PROTECTION BIT(63 - 35)
|
||||||
|
#define CR0_FETCH_PROTECTION_OVERRIDE BIT(63 - 38)
|
||||||
|
#define CR0_STORAGE_PROTECTION_OVERRIDE BIT(63 - 39)
|
||||||
#define CR0_EMERGENCY_SIGNAL_SUBMASK BIT(63 - 49)
|
#define CR0_EMERGENCY_SIGNAL_SUBMASK BIT(63 - 49)
|
||||||
#define CR0_EXTERNAL_CALL_SUBMASK BIT(63 - 50)
|
#define CR0_EXTERNAL_CALL_SUBMASK BIT(63 - 50)
|
||||||
#define CR0_CLOCK_COMPARATOR_SUBMASK BIT(63 - 52)
|
#define CR0_CLOCK_COMPARATOR_SUBMASK BIT(63 - 52)
|
||||||
|
|
|
@ -45,6 +45,8 @@
|
||||||
#define KVM_REQ_START_MIGRATION KVM_ARCH_REQ(3)
|
#define KVM_REQ_START_MIGRATION KVM_ARCH_REQ(3)
|
||||||
#define KVM_REQ_STOP_MIGRATION KVM_ARCH_REQ(4)
|
#define KVM_REQ_STOP_MIGRATION KVM_ARCH_REQ(4)
|
||||||
#define KVM_REQ_VSIE_RESTART KVM_ARCH_REQ(5)
|
#define KVM_REQ_VSIE_RESTART KVM_ARCH_REQ(5)
|
||||||
|
#define KVM_REQ_REFRESH_GUEST_PREFIX \
|
||||||
|
KVM_ARCH_REQ_FLAGS(6, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
|
||||||
|
|
||||||
#define SIGP_CTRL_C 0x80
|
#define SIGP_CTRL_C 0x80
|
||||||
#define SIGP_CTRL_SCN_MASK 0x3f
|
#define SIGP_CTRL_SCN_MASK 0x3f
|
||||||
|
|
|
@ -20,6 +20,8 @@
|
||||||
#define PAGE_SIZE _PAGE_SIZE
|
#define PAGE_SIZE _PAGE_SIZE
|
||||||
#define PAGE_MASK _PAGE_MASK
|
#define PAGE_MASK _PAGE_MASK
|
||||||
#define PAGE_DEFAULT_ACC 0
|
#define PAGE_DEFAULT_ACC 0
|
||||||
|
/* storage-protection override */
|
||||||
|
#define PAGE_SPO_ACC 9
|
||||||
#define PAGE_DEFAULT_KEY (PAGE_DEFAULT_ACC << 4)
|
#define PAGE_DEFAULT_KEY (PAGE_DEFAULT_ACC << 4)
|
||||||
|
|
||||||
#define HPAGE_SHIFT 20
|
#define HPAGE_SHIFT 20
|
||||||
|
|
|
@ -32,6 +32,28 @@ raw_copy_to_user(void __user *to, const void *from, unsigned long n);
|
||||||
#define INLINE_COPY_TO_USER
|
#define INLINE_COPY_TO_USER
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
unsigned long __must_check
|
||||||
|
_copy_from_user_key(void *to, const void __user *from, unsigned long n, unsigned long key);
|
||||||
|
|
||||||
|
static __always_inline unsigned long __must_check
|
||||||
|
copy_from_user_key(void *to, const void __user *from, unsigned long n, unsigned long key)
|
||||||
|
{
|
||||||
|
if (likely(check_copy_size(to, n, false)))
|
||||||
|
n = _copy_from_user_key(to, from, n, key);
|
||||||
|
return n;
|
||||||
|
}
|
||||||
|
|
||||||
|
unsigned long __must_check
|
||||||
|
_copy_to_user_key(void __user *to, const void *from, unsigned long n, unsigned long key);
|
||||||
|
|
||||||
|
static __always_inline unsigned long __must_check
|
||||||
|
copy_to_user_key(void __user *to, const void *from, unsigned long n, unsigned long key)
|
||||||
|
{
|
||||||
|
if (likely(check_copy_size(from, n, true)))
|
||||||
|
n = _copy_to_user_key(to, from, n, key);
|
||||||
|
return n;
|
||||||
|
}
|
||||||
|
|
||||||
int __put_user_bad(void) __attribute__((noreturn));
|
int __put_user_bad(void) __attribute__((noreturn));
|
||||||
int __get_user_bad(void) __attribute__((noreturn));
|
int __get_user_bad(void) __attribute__((noreturn));
|
||||||
|
|
||||||
|
|
|
@ -80,6 +80,7 @@ enum uv_cmds_inst {
|
||||||
|
|
||||||
enum uv_feat_ind {
|
enum uv_feat_ind {
|
||||||
BIT_UV_FEAT_MISC = 0,
|
BIT_UV_FEAT_MISC = 0,
|
||||||
|
BIT_UV_FEAT_AIV = 1,
|
||||||
};
|
};
|
||||||
|
|
||||||
struct uv_cb_header {
|
struct uv_cb_header {
|
||||||
|
|
|
@ -10,6 +10,7 @@
|
||||||
#include <linux/mm_types.h>
|
#include <linux/mm_types.h>
|
||||||
#include <linux/err.h>
|
#include <linux/err.h>
|
||||||
#include <linux/pgtable.h>
|
#include <linux/pgtable.h>
|
||||||
|
#include <linux/bitfield.h>
|
||||||
|
|
||||||
#include <asm/gmap.h>
|
#include <asm/gmap.h>
|
||||||
#include "kvm-s390.h"
|
#include "kvm-s390.h"
|
||||||
|
@ -794,6 +795,108 @@ static int low_address_protection_enabled(struct kvm_vcpu *vcpu,
|
||||||
return 1;
|
return 1;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static int vm_check_access_key(struct kvm *kvm, u8 access_key,
|
||||||
|
enum gacc_mode mode, gpa_t gpa)
|
||||||
|
{
|
||||||
|
u8 storage_key, access_control;
|
||||||
|
bool fetch_protected;
|
||||||
|
unsigned long hva;
|
||||||
|
int r;
|
||||||
|
|
||||||
|
if (access_key == 0)
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
hva = gfn_to_hva(kvm, gpa_to_gfn(gpa));
|
||||||
|
if (kvm_is_error_hva(hva))
|
||||||
|
return PGM_ADDRESSING;
|
||||||
|
|
||||||
|
mmap_read_lock(current->mm);
|
||||||
|
r = get_guest_storage_key(current->mm, hva, &storage_key);
|
||||||
|
mmap_read_unlock(current->mm);
|
||||||
|
if (r)
|
||||||
|
return r;
|
||||||
|
access_control = FIELD_GET(_PAGE_ACC_BITS, storage_key);
|
||||||
|
if (access_control == access_key)
|
||||||
|
return 0;
|
||||||
|
fetch_protected = storage_key & _PAGE_FP_BIT;
|
||||||
|
if ((mode == GACC_FETCH || mode == GACC_IFETCH) && !fetch_protected)
|
||||||
|
return 0;
|
||||||
|
return PGM_PROTECTION;
|
||||||
|
}
|
||||||
|
|
||||||
|
static bool fetch_prot_override_applicable(struct kvm_vcpu *vcpu, enum gacc_mode mode,
|
||||||
|
union asce asce)
|
||||||
|
{
|
||||||
|
psw_t *psw = &vcpu->arch.sie_block->gpsw;
|
||||||
|
unsigned long override;
|
||||||
|
|
||||||
|
if (mode == GACC_FETCH || mode == GACC_IFETCH) {
|
||||||
|
/* check if fetch protection override enabled */
|
||||||
|
override = vcpu->arch.sie_block->gcr[0];
|
||||||
|
override &= CR0_FETCH_PROTECTION_OVERRIDE;
|
||||||
|
/* not applicable if subject to DAT && private space */
|
||||||
|
override = override && !(psw_bits(*psw).dat && asce.p);
|
||||||
|
return override;
|
||||||
|
}
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
static bool fetch_prot_override_applies(unsigned long ga, unsigned int len)
|
||||||
|
{
|
||||||
|
return ga < 2048 && ga + len <= 2048;
|
||||||
|
}
|
||||||
|
|
||||||
|
static bool storage_prot_override_applicable(struct kvm_vcpu *vcpu)
|
||||||
|
{
|
||||||
|
/* check if storage protection override enabled */
|
||||||
|
return vcpu->arch.sie_block->gcr[0] & CR0_STORAGE_PROTECTION_OVERRIDE;
|
||||||
|
}
|
||||||
|
|
||||||
|
static bool storage_prot_override_applies(u8 access_control)
|
||||||
|
{
|
||||||
|
/* matches special storage protection override key (9) -> allow */
|
||||||
|
return access_control == PAGE_SPO_ACC;
|
||||||
|
}
|
||||||
|
|
||||||
|
static int vcpu_check_access_key(struct kvm_vcpu *vcpu, u8 access_key,
|
||||||
|
enum gacc_mode mode, union asce asce, gpa_t gpa,
|
||||||
|
unsigned long ga, unsigned int len)
|
||||||
|
{
|
||||||
|
u8 storage_key, access_control;
|
||||||
|
unsigned long hva;
|
||||||
|
int r;
|
||||||
|
|
||||||
|
/* access key 0 matches any storage key -> allow */
|
||||||
|
if (access_key == 0)
|
||||||
|
return 0;
|
||||||
|
/*
|
||||||
|
* caller needs to ensure that gfn is accessible, so we can
|
||||||
|
* assume that this cannot fail
|
||||||
|
*/
|
||||||
|
hva = gfn_to_hva(vcpu->kvm, gpa_to_gfn(gpa));
|
||||||
|
mmap_read_lock(current->mm);
|
||||||
|
r = get_guest_storage_key(current->mm, hva, &storage_key);
|
||||||
|
mmap_read_unlock(current->mm);
|
||||||
|
if (r)
|
||||||
|
return r;
|
||||||
|
access_control = FIELD_GET(_PAGE_ACC_BITS, storage_key);
|
||||||
|
/* access key matches storage key -> allow */
|
||||||
|
if (access_control == access_key)
|
||||||
|
return 0;
|
||||||
|
if (mode == GACC_FETCH || mode == GACC_IFETCH) {
|
||||||
|
/* it is a fetch and fetch protection is off -> allow */
|
||||||
|
if (!(storage_key & _PAGE_FP_BIT))
|
||||||
|
return 0;
|
||||||
|
if (fetch_prot_override_applicable(vcpu, mode, asce) &&
|
||||||
|
fetch_prot_override_applies(ga, len))
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
if (storage_prot_override_applicable(vcpu) &&
|
||||||
|
storage_prot_override_applies(access_control))
|
||||||
|
return 0;
|
||||||
|
return PGM_PROTECTION;
|
||||||
|
}
|
||||||
|
|
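The decision above boils down to comparing the 4-bit access key against the ACC field of the storage key, with fetch protection as the escape hatch; the override paths then relax it further. Below is a standalone sketch of just that core check (overrides left out), with the bit layout restated as illustrative constants matching _PAGE_ACC_BITS and _PAGE_FP_BIT.

#include <stdint.h>
#include <stdio.h>

#define KEY_ACC_MASK	0xf0	/* access-control bits of the storage key byte */
#define KEY_FP_BIT	0x08	/* fetch-protection bit */

/* 0 = access allowed, -1 = would raise a protection exception. */
static int key_check(uint8_t access_key, uint8_t storage_key, int is_fetch)
{
	uint8_t acc = (storage_key & KEY_ACC_MASK) >> 4;

	if (access_key == 0 || acc == access_key)
		return 0;			/* key 0 or matching key */
	if (is_fetch && !(storage_key & KEY_FP_BIT))
		return 0;			/* fetch, fetch protection off */
	return -1;				/* mismatch -> PGM_PROTECTION */
}

int main(void)
{
	printf("store, key mismatch:            %d\n", key_check(2, 0x30, 0));
	printf("fetch, mismatch, no fetch prot: %d\n", key_check(2, 0x30, 1));
	printf("fetch, mismatch, fetch prot:    %d\n", key_check(2, 0x38, 1));
	return 0;
}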
||||||
/**
|
/**
|
||||||
* guest_range_to_gpas() - Calculate guest physical addresses of page fragments
|
* guest_range_to_gpas() - Calculate guest physical addresses of page fragments
|
||||||
* covering a logical range
|
* covering a logical range
|
||||||
|
@ -804,6 +907,7 @@ static int low_address_protection_enabled(struct kvm_vcpu *vcpu,
|
||||||
* @len: length of range in bytes
|
* @len: length of range in bytes
|
||||||
* @asce: address-space-control element to use for translation
|
* @asce: address-space-control element to use for translation
|
||||||
* @mode: access mode
|
* @mode: access mode
|
||||||
|
* @access_key: access key to match the range's storage keys against
|
||||||
*
|
*
|
||||||
* Translate a logical range to a series of guest absolute addresses,
|
* Translate a logical range to a series of guest absolute addresses,
|
||||||
* such that the concatenation of page fragments starting at each gpa make up
|
* such that the concatenation of page fragments starting at each gpa make up
|
||||||
|
@ -830,7 +934,8 @@ static int low_address_protection_enabled(struct kvm_vcpu *vcpu,
|
||||||
*/
|
*/
|
||||||
static int guest_range_to_gpas(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar,
|
static int guest_range_to_gpas(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar,
|
||||||
unsigned long *gpas, unsigned long len,
|
unsigned long *gpas, unsigned long len,
|
||||||
const union asce asce, enum gacc_mode mode)
|
const union asce asce, enum gacc_mode mode,
|
||||||
|
u8 access_key)
|
||||||
{
|
{
|
||||||
psw_t *psw = &vcpu->arch.sie_block->gpsw;
|
psw_t *psw = &vcpu->arch.sie_block->gpsw;
|
||||||
unsigned int offset = offset_in_page(ga);
|
unsigned int offset = offset_in_page(ga);
|
||||||
|
@ -857,6 +962,10 @@ static int guest_range_to_gpas(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar,
|
||||||
}
|
}
|
||||||
if (rc)
|
if (rc)
|
||||||
return trans_exc(vcpu, rc, ga, ar, mode, prot);
|
return trans_exc(vcpu, rc, ga, ar, mode, prot);
|
||||||
|
rc = vcpu_check_access_key(vcpu, access_key, mode, asce, gpa, ga,
|
||||||
|
fragment_len);
|
||||||
|
if (rc)
|
||||||
|
return trans_exc(vcpu, rc, ga, ar, mode, PROT_TYPE_KEYC);
|
||||||
if (gpas)
|
if (gpas)
|
||||||
*gpas++ = gpa;
|
*gpas++ = gpa;
|
||||||
offset = 0;
|
offset = 0;
|
||||||
|
@ -880,16 +989,74 @@ static int access_guest_page(struct kvm *kvm, enum gacc_mode mode, gpa_t gpa,
|
||||||
return rc;
|
return rc;
|
||||||
}
|
}
|
||||||
|
|
||||||
int access_guest(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar, void *data,
|
static int
|
||||||
unsigned long len, enum gacc_mode mode)
|
access_guest_page_with_key(struct kvm *kvm, enum gacc_mode mode, gpa_t gpa,
|
||||||
|
void *data, unsigned int len, u8 access_key)
|
||||||
|
{
|
||||||
|
struct kvm_memory_slot *slot;
|
||||||
|
bool writable;
|
||||||
|
gfn_t gfn;
|
||||||
|
hva_t hva;
|
||||||
|
int rc;
|
||||||
|
|
||||||
|
gfn = gpa >> PAGE_SHIFT;
|
||||||
|
slot = gfn_to_memslot(kvm, gfn);
|
||||||
|
hva = gfn_to_hva_memslot_prot(slot, gfn, &writable);
|
||||||
|
|
||||||
|
if (kvm_is_error_hva(hva))
|
||||||
|
return PGM_ADDRESSING;
|
||||||
|
/*
|
||||||
|
* Check if it's a read-only memslot, even though that can't occur (they're unsupported).
|
||||||
|
* Don't try to actually handle that case.
|
||||||
|
*/
|
||||||
|
if (!writable && mode == GACC_STORE)
|
||||||
|
return -EOPNOTSUPP;
|
||||||
|
hva += offset_in_page(gpa);
|
||||||
|
if (mode == GACC_STORE)
|
||||||
|
rc = copy_to_user_key((void __user *)hva, data, len, access_key);
|
||||||
|
else
|
||||||
|
rc = copy_from_user_key(data, (void __user *)hva, len, access_key);
|
||||||
|
if (rc)
|
||||||
|
return PGM_PROTECTION;
|
||||||
|
if (mode == GACC_STORE)
|
||||||
|
mark_page_dirty_in_slot(kvm, slot, gfn);
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
int access_guest_abs_with_key(struct kvm *kvm, gpa_t gpa, void *data,
|
||||||
|
unsigned long len, enum gacc_mode mode, u8 access_key)
|
||||||
|
{
|
||||||
|
int offset = offset_in_page(gpa);
|
||||||
|
int fragment_len;
|
||||||
|
int rc;
|
||||||
|
|
||||||
|
while (min(PAGE_SIZE - offset, len) > 0) {
|
||||||
|
fragment_len = min(PAGE_SIZE - offset, len);
|
||||||
|
rc = access_guest_page_with_key(kvm, mode, gpa, data, fragment_len, access_key);
|
||||||
|
if (rc)
|
||||||
|
return rc;
|
||||||
|
offset = 0;
|
||||||
|
len -= fragment_len;
|
||||||
|
data += fragment_len;
|
||||||
|
gpa += fragment_len;
|
||||||
|
}
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
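The loop above splits an absolute access at page boundaries so every fragment is key-checked against its own page's storage key. A tiny standalone rendering of the same arithmetic, with illustrative values only:

#include <stdio.h>

#define PAGE_SIZE	4096UL
#define MIN(a, b)	((a) < (b) ? (a) : (b))

int main(void)
{
	unsigned long gpa = 0x1ff0;	/* starts 16 bytes before a page end */
	unsigned long len = 100;
	unsigned long offset = gpa & (PAGE_SIZE - 1);

	while (MIN(PAGE_SIZE - offset, len) > 0) {
		unsigned long fragment_len = MIN(PAGE_SIZE - offset, len);

		printf("access %lu bytes at gpa %#lx\n", fragment_len, gpa);
		offset = 0;
		len -= fragment_len;
		gpa += fragment_len;
	}
	return 0;
}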
|
int access_guest_with_key(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar,
|
||||||
|
void *data, unsigned long len, enum gacc_mode mode,
|
||||||
|
u8 access_key)
|
||||||
{
|
{
|
||||||
psw_t *psw = &vcpu->arch.sie_block->gpsw;
|
psw_t *psw = &vcpu->arch.sie_block->gpsw;
|
||||||
unsigned long nr_pages, idx;
|
unsigned long nr_pages, idx;
|
||||||
unsigned long gpa_array[2];
|
unsigned long gpa_array[2];
|
||||||
unsigned int fragment_len;
|
unsigned int fragment_len;
|
||||||
unsigned long *gpas;
|
unsigned long *gpas;
|
||||||
|
enum prot_type prot;
|
||||||
int need_ipte_lock;
|
int need_ipte_lock;
|
||||||
union asce asce;
|
union asce asce;
|
||||||
|
bool try_storage_prot_override;
|
||||||
|
bool try_fetch_prot_override;
|
||||||
int rc;
|
int rc;
|
||||||
|
|
||||||
if (!len)
|
if (!len)
|
||||||
|
@ -904,16 +1071,47 @@ int access_guest(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar, void *data,
|
||||||
gpas = vmalloc(array_size(nr_pages, sizeof(unsigned long)));
|
gpas = vmalloc(array_size(nr_pages, sizeof(unsigned long)));
|
||||||
if (!gpas)
|
if (!gpas)
|
||||||
return -ENOMEM;
|
return -ENOMEM;
|
||||||
|
try_fetch_prot_override = fetch_prot_override_applicable(vcpu, mode, asce);
|
||||||
|
try_storage_prot_override = storage_prot_override_applicable(vcpu);
|
||||||
need_ipte_lock = psw_bits(*psw).dat && !asce.r;
|
need_ipte_lock = psw_bits(*psw).dat && !asce.r;
|
||||||
if (need_ipte_lock)
|
if (need_ipte_lock)
|
||||||
ipte_lock(vcpu);
|
ipte_lock(vcpu);
|
||||||
rc = guest_range_to_gpas(vcpu, ga, ar, gpas, len, asce, mode);
|
/*
|
||||||
for (idx = 0; idx < nr_pages && !rc; idx++) {
|
* Since we do the access further down ultimately via a move instruction
|
||||||
|
* that does key checking and returns an error in case of a protection
|
||||||
|
* violation, we don't need to do the check during address translation.
|
||||||
|
* Skip it by passing access key 0, which matches any storage key,
|
||||||
|
* obviating the need for any further checks. As a result the check is
|
||||||
|
* handled entirely in hardware on access, we only need to take care to
|
||||||
|
* forego key protection checking if fetch protection override applies or
|
||||||
|
* retry with the special key 9 in case of storage protection override.
|
||||||
|
*/
|
||||||
|
rc = guest_range_to_gpas(vcpu, ga, ar, gpas, len, asce, mode, 0);
|
||||||
|
if (rc)
|
||||||
|
goto out_unlock;
|
||||||
|
for (idx = 0; idx < nr_pages; idx++) {
|
||||||
fragment_len = min(PAGE_SIZE - offset_in_page(gpas[idx]), len);
|
fragment_len = min(PAGE_SIZE - offset_in_page(gpas[idx]), len);
|
||||||
rc = access_guest_page(vcpu->kvm, mode, gpas[idx], data, fragment_len);
|
if (try_fetch_prot_override && fetch_prot_override_applies(ga, fragment_len)) {
|
||||||
|
rc = access_guest_page(vcpu->kvm, mode, gpas[idx],
|
||||||
|
data, fragment_len);
|
||||||
|
} else {
|
||||||
|
rc = access_guest_page_with_key(vcpu->kvm, mode, gpas[idx],
|
||||||
|
data, fragment_len, access_key);
|
||||||
|
}
|
||||||
|
if (rc == PGM_PROTECTION && try_storage_prot_override)
|
||||||
|
rc = access_guest_page_with_key(vcpu->kvm, mode, gpas[idx],
|
||||||
|
data, fragment_len, PAGE_SPO_ACC);
|
||||||
|
if (rc == PGM_PROTECTION)
|
||||||
|
prot = PROT_TYPE_KEYC;
|
||||||
|
if (rc)
|
||||||
|
break;
|
||||||
len -= fragment_len;
|
len -= fragment_len;
|
||||||
data += fragment_len;
|
data += fragment_len;
|
||||||
|
ga = kvm_s390_logical_to_effective(vcpu, ga + fragment_len);
|
||||||
}
|
}
|
||||||
|
if (rc > 0)
|
||||||
|
rc = trans_exc(vcpu, rc, ga, ar, mode, prot);
|
||||||
|
out_unlock:
|
||||||
if (need_ipte_lock)
|
if (need_ipte_lock)
|
||||||
ipte_unlock(vcpu);
|
ipte_unlock(vcpu);
|
||||||
if (nr_pages > ARRAY_SIZE(gpa_array))
|
if (nr_pages > ARRAY_SIZE(gpa_array))
|
||||||
|
@ -940,12 +1138,13 @@ int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra,
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* guest_translate_address - translate guest logical into guest absolute address
|
* guest_translate_address_with_key - translate guest logical into guest absolute address
|
||||||
* @vcpu: virtual cpu
|
* @vcpu: virtual cpu
|
||||||
* @gva: Guest virtual address
|
* @gva: Guest virtual address
|
||||||
* @ar: Access register
|
* @ar: Access register
|
||||||
* @gpa: Guest physical address
|
* @gpa: Guest physical address
|
||||||
* @mode: Translation access mode
|
* @mode: Translation access mode
|
||||||
|
* @access_key: access key to match the storage key with
|
||||||
*
|
*
|
||||||
* Parameter semantics are the same as the ones from guest_translate.
|
* Parameter semantics are the same as the ones from guest_translate.
|
||||||
* The memory contents at the guest address are not changed.
|
* The memory contents at the guest address are not changed.
|
||||||
|
@ -953,8 +1152,9 @@ int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra,
|
||||||
* Note: The IPTE lock is not taken during this function, so the caller
|
* Note: The IPTE lock is not taken during this function, so the caller
|
||||||
* has to take care of this.
|
* has to take care of this.
|
||||||
*/
|
*/
|
||||||
int guest_translate_address(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
|
int guest_translate_address_with_key(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
|
||||||
unsigned long *gpa, enum gacc_mode mode)
|
unsigned long *gpa, enum gacc_mode mode,
|
||||||
|
u8 access_key)
|
||||||
{
|
{
|
||||||
union asce asce;
|
union asce asce;
|
||||||
int rc;
|
int rc;
|
||||||
|
@ -963,7 +1163,8 @@ int guest_translate_address(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
|
||||||
rc = get_vcpu_asce(vcpu, &asce, gva, ar, mode);
|
rc = get_vcpu_asce(vcpu, &asce, gva, ar, mode);
|
||||||
if (rc)
|
if (rc)
|
||||||
return rc;
|
return rc;
|
||||||
return guest_range_to_gpas(vcpu, gva, ar, gpa, 1, asce, mode);
|
return guest_range_to_gpas(vcpu, gva, ar, gpa, 1, asce, mode,
|
||||||
|
access_key);
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
|
@ -973,9 +1174,10 @@ int guest_translate_address(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
|
||||||
* @ar: Access register
|
* @ar: Access register
|
||||||
* @length: Length of test range
|
* @length: Length of test range
|
||||||
* @mode: Translation access mode
|
* @mode: Translation access mode
|
||||||
|
* @access_key: access key to match the storage keys with
|
||||||
*/
|
*/
|
||||||
int check_gva_range(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
|
int check_gva_range(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
|
||||||
unsigned long length, enum gacc_mode mode)
|
unsigned long length, enum gacc_mode mode, u8 access_key)
|
||||||
{
|
{
|
||||||
union asce asce;
|
union asce asce;
|
||||||
int rc = 0;
|
int rc = 0;
|
||||||
|
@ -984,12 +1186,36 @@ int check_gva_range(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
|
||||||
if (rc)
|
if (rc)
|
||||||
return rc;
|
return rc;
|
||||||
ipte_lock(vcpu);
|
ipte_lock(vcpu);
|
||||||
rc = guest_range_to_gpas(vcpu, gva, ar, NULL, length, asce, mode);
|
rc = guest_range_to_gpas(vcpu, gva, ar, NULL, length, asce, mode,
|
||||||
|
access_key);
|
||||||
ipte_unlock(vcpu);
|
ipte_unlock(vcpu);
|
||||||
|
|
||||||
return rc;
|
return rc;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* check_gpa_range - test a range of guest physical addresses for accessibility
|
||||||
|
* @kvm: virtual machine instance
|
||||||
|
* @gpa: guest physical address
|
||||||
|
* @length: length of test range
|
||||||
|
* @mode: access mode to test, relevant for storage keys
|
||||||
|
* @access_key: access key to match the storage keys with
|
||||||
|
*/
|
||||||
|
int check_gpa_range(struct kvm *kvm, unsigned long gpa, unsigned long length,
|
||||||
|
enum gacc_mode mode, u8 access_key)
|
||||||
|
{
|
||||||
|
unsigned int fragment_len;
|
||||||
|
int rc = 0;
|
||||||
|
|
||||||
|
while (length && !rc) {
|
||||||
|
fragment_len = min(PAGE_SIZE - offset_in_page(gpa), length);
|
||||||
|
rc = vm_check_access_key(kvm, access_key, mode, gpa);
|
||||||
|
length -= fragment_len;
|
||||||
|
gpa += fragment_len;
|
||||||
|
}
|
||||||
|
return rc;
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* kvm_s390_check_low_addr_prot_real - check for low-address protection
|
* kvm_s390_check_low_addr_prot_real - check for low-address protection
|
||||||
* @vcpu: virtual cpu
|
* @vcpu: virtual cpu
|
||||||
|
|
|
@@ -186,24 +186,34 @@ enum gacc_mode {
 	GACC_IFETCH,
 };
 
-int guest_translate_address(struct kvm_vcpu *vcpu, unsigned long gva,
-			    u8 ar, unsigned long *gpa, enum gacc_mode mode);
-int check_gva_range(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
-		    unsigned long length, enum gacc_mode mode);
+int guest_translate_address_with_key(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
+				     unsigned long *gpa, enum gacc_mode mode,
+				     u8 access_key);
 
-int access_guest(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar, void *data,
-		 unsigned long len, enum gacc_mode mode);
+int check_gva_range(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
+		    unsigned long length, enum gacc_mode mode, u8 access_key);
+
+int check_gpa_range(struct kvm *kvm, unsigned long gpa, unsigned long length,
+		    enum gacc_mode mode, u8 access_key);
+
+int access_guest_abs_with_key(struct kvm *kvm, gpa_t gpa, void *data,
+			      unsigned long len, enum gacc_mode mode, u8 access_key);
+
+int access_guest_with_key(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar,
+			  void *data, unsigned long len, enum gacc_mode mode,
+			  u8 access_key);
 
 int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra,
 		      void *data, unsigned long len, enum gacc_mode mode);
 
 /**
- * write_guest - copy data from kernel space to guest space
+ * write_guest_with_key - copy data from kernel space to guest space
  * @vcpu: virtual cpu
  * @ga: guest address
  * @ar: access register
  * @data: source address in kernel space
  * @len: number of bytes to copy
+ * @access_key: access key the storage key needs to match
  *
  * Copy @len bytes from @data (kernel space) to @ga (guest address).
  * In order to copy data to guest space the PSW of the vcpu is inspected:
@@ -214,8 +224,8 @@ int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra,
  * The addressing mode of the PSW is also inspected, so that address wrap
  * around is taken into account for 24-, 31- and 64-bit addressing mode,
  * if the to be copied data crosses page boundaries in guest address space.
- * In addition also low address and DAT protection are inspected before
- * copying any data (key protection is currently not implemented).
+ * In addition low address, DAT and key protection checks are performed before
+ * copying any data.
  *
  * This function modifies the 'struct kvm_s390_pgm_info pgm' member of @vcpu.
  * In case of an access exception (e.g. protection exception) pgm will contain
@@ -243,10 +253,53 @@ int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra,
  * if data has been changed in guest space in case of an exception.
  */
 static inline __must_check
+int write_guest_with_key(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar,
+			 void *data, unsigned long len, u8 access_key)
+{
+	return access_guest_with_key(vcpu, ga, ar, data, len, GACC_STORE,
+				     access_key);
+}
+
+/**
+ * write_guest - copy data from kernel space to guest space
+ * @vcpu: virtual cpu
+ * @ga: guest address
+ * @ar: access register
+ * @data: source address in kernel space
+ * @len: number of bytes to copy
+ *
+ * The behaviour of write_guest is identical to write_guest_with_key, except
+ * that the PSW access key is used instead of an explicit argument.
+ */
+static inline __must_check
 int write_guest(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar, void *data,
 		unsigned long len)
 {
-	return access_guest(vcpu, ga, ar, data, len, GACC_STORE);
+	u8 access_key = psw_bits(vcpu->arch.sie_block->gpsw).key;
+
+	return write_guest_with_key(vcpu, ga, ar, data, len, access_key);
+}
+
+/**
+ * read_guest_with_key - copy data from guest space to kernel space
+ * @vcpu: virtual cpu
+ * @ga: guest address
+ * @ar: access register
+ * @data: destination address in kernel space
+ * @len: number of bytes to copy
+ * @access_key: access key the storage key needs to match
+ *
+ * Copy @len bytes from @ga (guest address) to @data (kernel space).
+ *
+ * The behaviour of read_guest_with_key is identical to write_guest_with_key,
+ * except that data will be copied from guest space to kernel space.
+ */
+static inline __must_check
+int read_guest_with_key(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar,
+			void *data, unsigned long len, u8 access_key)
+{
+	return access_guest_with_key(vcpu, ga, ar, data, len, GACC_FETCH,
+				     access_key);
 }
 
 /**
@@ -259,14 +312,16 @@ int write_guest(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar, void *data,
  *
  * Copy @len bytes from @ga (guest address) to @data (kernel space).
  *
- * The behaviour of read_guest is identical to write_guest, except that
- * data will be copied from guest space to kernel space.
+ * The behaviour of read_guest is identical to read_guest_with_key, except
+ * that the PSW access key is used instead of an explicit argument.
  */
 static inline __must_check
 int read_guest(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar, void *data,
 	       unsigned long len)
 {
-	return access_guest(vcpu, ga, ar, data, len, GACC_FETCH);
+	u8 access_key = psw_bits(vcpu->arch.sie_block->gpsw).key;
+
+	return read_guest_with_key(vcpu, ga, ar, data, len, access_key);
 }
 
 /**
@@ -287,7 +342,10 @@ static inline __must_check
 int read_guest_instr(struct kvm_vcpu *vcpu, unsigned long ga, void *data,
 		     unsigned long len)
 {
-	return access_guest(vcpu, ga, 0, data, len, GACC_IFETCH);
+	u8 access_key = psw_bits(vcpu->arch.sie_block->gpsw).key;
+
+	return access_guest_with_key(vcpu, ga, 0, data, len, GACC_IFETCH,
+				     access_key);
 }
 
 /**
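
The gaccess.h changes above split every guest copy into a *_with_key variant plus a thin wrapper that falls back to the PSW access key. As a rough illustration of how a caller is expected to choose between the two forms, here is a sketch; the handler below and the way the key reaches it are made up for the example, only read_guest_with_key(), write_guest() and kvm_s390_inject_prog_cond() are taken from the patch:

    /* Illustrative sketch only, not part of the patch. */
    static int copy_guest_buffer(struct kvm_vcpu *vcpu, unsigned long src_ga,
                                 unsigned long dst_ga, u8 ar, void *buf,
                                 unsigned long len, u8 insn_key)
    {
            int rc;

            /* Explicit key, e.g. taken from the intercepted instruction text. */
            rc = read_guest_with_key(vcpu, src_ga, ar, buf, len, insn_key);
            if (rc)
                    return kvm_s390_inject_prog_cond(vcpu, rc);
            /* Legacy wrapper: write_guest() now resolves to the PSW key. */
            rc = write_guest(vcpu, dst_ga, ar, buf, len);
            if (rc)
                    return kvm_s390_inject_prog_cond(vcpu, rc);
            return 0;
    }
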
@@ -331,18 +331,18 @@ static int handle_mvpg_pei(struct kvm_vcpu *vcpu)
 
 	kvm_s390_get_regs_rre(vcpu, &reg1, &reg2);
 
-	/* Make sure that the source is paged-in */
-	rc = guest_translate_address(vcpu, vcpu->run->s.regs.gprs[reg2],
-				     reg2, &srcaddr, GACC_FETCH);
+	/* Ensure that the source is paged-in, no actual access -> no key checking */
+	rc = guest_translate_address_with_key(vcpu, vcpu->run->s.regs.gprs[reg2],
+					      reg2, &srcaddr, GACC_FETCH, 0);
 	if (rc)
 		return kvm_s390_inject_prog_cond(vcpu, rc);
 	rc = kvm_arch_fault_in_page(vcpu, srcaddr, 0);
 	if (rc != 0)
 		return rc;
 
-	/* Make sure that the destination is paged-in */
-	rc = guest_translate_address(vcpu, vcpu->run->s.regs.gprs[reg1],
-				     reg1, &dstaddr, GACC_STORE);
+	/* Ensure that the source is paged-in, no actual access -> no key checking */
+	rc = guest_translate_address_with_key(vcpu, vcpu->run->s.regs.gprs[reg1],
+					      reg1, &dstaddr, GACC_STORE, 0);
 	if (rc)
 		return kvm_s390_inject_prog_cond(vcpu, rc);
 	rc = kvm_arch_fault_in_page(vcpu, dstaddr, 1);
@@ -1901,13 +1901,12 @@ static int __inject_io(struct kvm *kvm, struct kvm_s390_interrupt_info *inti)
 	isc = int_word_to_isc(inti->io.io_int_word);
 
 	/*
-	 * Do not make use of gisa in protected mode. We do not use the lock
-	 * checking variant as this is just a performance optimization and we
-	 * do not hold the lock here. This is ok as the code will pick
-	 * interrupts from both "lists" for delivery.
+	 * We do not use the lock checking variant as this is just a
+	 * performance optimization and we do not hold the lock here.
+	 * This is ok as the code will pick interrupts from both "lists"
+	 * for delivery.
 	 */
-	if (!kvm_s390_pv_get_handle(kvm) &&
-	    gi->origin && inti->type & KVM_S390_INT_IO_AI_MASK) {
+	if (gi->origin && inti->type & KVM_S390_INT_IO_AI_MASK) {
 		VM_EVENT(kvm, 4, "%s isc %1u", "inject: I/O (AI/gisa)", isc);
 		gisa_set_ipm_gisc(gi->origin, isc);
 		kfree(inti);
@@ -3171,9 +3170,33 @@ void kvm_s390_gisa_init(struct kvm *kvm)
 	VM_EVENT(kvm, 3, "gisa 0x%pK initialized", gi->origin);
 }
 
+void kvm_s390_gisa_enable(struct kvm *kvm)
+{
+	struct kvm_s390_gisa_interrupt *gi = &kvm->arch.gisa_int;
+	struct kvm_vcpu *vcpu;
+	unsigned long i;
+	u32 gisa_desc;
+
+	if (gi->origin)
+		return;
+	kvm_s390_gisa_init(kvm);
+	gisa_desc = kvm_s390_get_gisa_desc(kvm);
+	if (!gisa_desc)
+		return;
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		mutex_lock(&vcpu->mutex);
+		vcpu->arch.sie_block->gd = gisa_desc;
+		vcpu->arch.sie_block->eca |= ECA_AIV;
+		VCPU_EVENT(vcpu, 3, "AIV gisa format-%u enabled for cpu %03u",
+			   vcpu->arch.sie_block->gd & 0x3, vcpu->vcpu_id);
+		mutex_unlock(&vcpu->mutex);
+	}
+}
+
 void kvm_s390_gisa_destroy(struct kvm *kvm)
 {
 	struct kvm_s390_gisa_interrupt *gi = &kvm->arch.gisa_int;
+	struct kvm_s390_gisa *gisa = gi->origin;
 
 	if (!gi->origin)
 		return;
@@ -3184,6 +3207,25 @@ void kvm_s390_gisa_destroy(struct kvm *kvm)
 		cpu_relax();
 	hrtimer_cancel(&gi->timer);
 	gi->origin = NULL;
+	VM_EVENT(kvm, 3, "gisa 0x%pK destroyed", gisa);
+}
+
+void kvm_s390_gisa_disable(struct kvm *kvm)
+{
+	struct kvm_s390_gisa_interrupt *gi = &kvm->arch.gisa_int;
+	struct kvm_vcpu *vcpu;
+	unsigned long i;
+
+	if (!gi->origin)
+		return;
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		mutex_lock(&vcpu->mutex);
+		vcpu->arch.sie_block->eca &= ~ECA_AIV;
+		vcpu->arch.sie_block->gd = 0U;
+		mutex_unlock(&vcpu->mutex);
+		VCPU_EVENT(vcpu, 3, "AIV disabled for cpu %03u", vcpu->vcpu_id);
+	}
+	kvm_s390_gisa_destroy(kvm);
 }
 
 /**
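
kvm_s390_gisa_enable() and kvm_s390_gisa_disable() are meant to be used as a pair around protected-virtualization transitions; the real call sites are in the kvm-s390.c hunks that follow. The helper below only restates that pairing in one place and is not part of the patch:

    /* Sketch; the actual wiring lives in kvm_s390_cpus_to_pv()/_from_pv(). */
    static void gisa_for_pv_transition(struct kvm *kvm, bool entering_pv)
    {
            if (entering_pv) {
                    /* The ultravisor may not virtualize AIV: drop the GISA first. */
                    if (!test_bit_inv(BIT_UV_FEAT_AIV, &uv_info.uv_feature_indications))
                            kvm_s390_gisa_disable(kvm);
            } else {
                    /* Back to non-PV: re-arm the GISA for all vCPUs. */
                    kvm_s390_gisa_enable(kvm);
            }
    }
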
@@ -564,6 +564,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_S390_VCPU_RESETS:
 	case KVM_CAP_SET_GUEST_DEBUG:
 	case KVM_CAP_S390_DIAG318:
+	case KVM_CAP_S390_MEM_OP_EXTENSION:
 		r = 1;
 		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
@@ -2194,6 +2195,9 @@ static int kvm_s390_cpus_from_pv(struct kvm *kvm, u16 *rcp, u16 *rrcp)
 		}
 		mutex_unlock(&vcpu->mutex);
 	}
+	/* Ensure that we re-enable gisa if the non-PV guest used it but the PV guest did not. */
+	if (use_gisa)
+		kvm_s390_gisa_enable(kvm);
 	return ret;
 }
 
@@ -2205,6 +2209,10 @@ static int kvm_s390_cpus_to_pv(struct kvm *kvm, u16 *rc, u16 *rrc)
 
 	struct kvm_vcpu *vcpu;
 
+	/* Disable the GISA if the ultravisor does not support AIV. */
+	if (!test_bit_inv(BIT_UV_FEAT_AIV, &uv_info.uv_feature_indications))
+		kvm_s390_gisa_disable(kvm);
+
 	kvm_for_each_vcpu(i, vcpu, kvm) {
 		mutex_lock(&vcpu->mutex);
 		r = kvm_s390_pv_create_cpu(vcpu, rc, rrc);
@@ -2359,6 +2367,83 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
 	return r;
 }
 
+static bool access_key_invalid(u8 access_key)
+{
+	return access_key > 0xf;
+}
+
+static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop)
+{
+	void __user *uaddr = (void __user *)mop->buf;
+	u64 supported_flags;
+	void *tmpbuf = NULL;
+	int r, srcu_idx;
+
+	supported_flags = KVM_S390_MEMOP_F_SKEY_PROTECTION
+			  | KVM_S390_MEMOP_F_CHECK_ONLY;
+	if (mop->flags & ~supported_flags || !mop->size)
+		return -EINVAL;
+	if (mop->size > MEM_OP_MAX_SIZE)
+		return -E2BIG;
+	if (kvm_s390_pv_is_protected(kvm))
+		return -EINVAL;
+	if (mop->flags & KVM_S390_MEMOP_F_SKEY_PROTECTION) {
+		if (access_key_invalid(mop->key))
+			return -EINVAL;
+	} else {
+		mop->key = 0;
+	}
+	if (!(mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY)) {
+		tmpbuf = vmalloc(mop->size);
+		if (!tmpbuf)
+			return -ENOMEM;
+	}
+
+	srcu_idx = srcu_read_lock(&kvm->srcu);
+
+	if (kvm_is_error_gpa(kvm, mop->gaddr)) {
+		r = PGM_ADDRESSING;
+		goto out_unlock;
+	}
+
+	switch (mop->op) {
+	case KVM_S390_MEMOP_ABSOLUTE_READ: {
+		if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
+			r = check_gpa_range(kvm, mop->gaddr, mop->size, GACC_FETCH, mop->key);
+		} else {
+			r = access_guest_abs_with_key(kvm, mop->gaddr, tmpbuf,
+						      mop->size, GACC_FETCH, mop->key);
+			if (r == 0) {
+				if (copy_to_user(uaddr, tmpbuf, mop->size))
+					r = -EFAULT;
+			}
+		}
+		break;
+	}
+	case KVM_S390_MEMOP_ABSOLUTE_WRITE: {
+		if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
+			r = check_gpa_range(kvm, mop->gaddr, mop->size, GACC_STORE, mop->key);
+		} else {
+			if (copy_from_user(tmpbuf, uaddr, mop->size)) {
+				r = -EFAULT;
+				break;
+			}
+			r = access_guest_abs_with_key(kvm, mop->gaddr, tmpbuf,
						      mop->size, GACC_STORE, mop->key);
+		}
+		break;
+	}
+	default:
+		r = -EINVAL;
+	}
+
+out_unlock:
+	srcu_read_unlock(&kvm->srcu, srcu_idx);
+
+	vfree(tmpbuf);
+	return r;
+}
+
 long kvm_arch_vm_ioctl(struct file *filp,
 		       unsigned int ioctl, unsigned long arg)
 {
@@ -2483,6 +2568,15 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		}
 		break;
 	}
+	case KVM_S390_MEM_OP: {
+		struct kvm_s390_mem_op mem_op;
+
+		if (copy_from_user(&mem_op, argp, sizeof(mem_op)) == 0)
+			r = kvm_s390_vm_mem_op(kvm, &mem_op);
+		else
+			r = -EFAULT;
+		break;
+	}
 	default:
 		r = -ENOTTY;
 	}
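
With KVM_CAP_S390_MEM_OP_EXTENSION advertised, KVM_S390_MEM_OP is now also accepted on the VM file descriptor and can perform a storage-key-checked access to absolute guest memory, as handled by kvm_s390_vm_mem_op() above. A minimal userspace sketch follows; error handling and the VM fd setup are omitted, and the field and constant names simply mirror the uapi used by the handler (a new-enough linux/kvm.h is assumed):

    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* Read 256 bytes of guest absolute storage, letting KVM apply
     * key-controlled protection with access key 9. */
    static int read_guest_abs_keyed(int vm_fd, uint64_t gaddr, void *buf)
    {
            struct kvm_s390_mem_op op;

            memset(&op, 0, sizeof(op));
            op.op = KVM_S390_MEMOP_ABSOLUTE_READ;
            op.flags = KVM_S390_MEMOP_F_SKEY_PROTECTION;
            op.gaddr = gaddr;
            op.size = 256;
            op.buf = (uintptr_t)buf;
            op.key = 9;

            /* 0 on success; the handler above may instead return a program
             * interruption code such as PGM_ADDRESSING, or -1 with errno. */
            return ioctl(vm_fd, KVM_S390_MEM_OP, &op);
    }
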
@@ -3263,9 +3357,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.sie_block->icpua = vcpu->vcpu_id;
 	spin_lock_init(&vcpu->arch.local_int.lock);
-	vcpu->arch.sie_block->gd = (u32)(u64)vcpu->kvm->arch.gisa_int.origin;
-	if (vcpu->arch.sie_block->gd && sclp.has_gisaf)
-		vcpu->arch.sie_block->gd |= GISA_FORMAT1;
+	vcpu->arch.sie_block->gd = kvm_s390_get_gisa_desc(vcpu->kvm);
 	seqcount_init(&vcpu->arch.cputm_seqcount);
 
 	vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
@@ -3394,7 +3486,7 @@ static void kvm_gmap_notifier(struct gmap *gmap, unsigned long start,
 		if (prefix <= end && start <= prefix + 2*PAGE_SIZE - 1) {
 			VCPU_EVENT(vcpu, 2, "gmap notifier for %lx-%lx",
 				   start, end);
-			kvm_s390_sync_request(KVM_REQ_MMU_RELOAD, vcpu);
+			kvm_s390_sync_request(KVM_REQ_REFRESH_GUEST_PREFIX, vcpu);
 		}
 	}
 }
@@ -3796,19 +3888,19 @@ static int kvm_s390_handle_requests(struct kvm_vcpu *vcpu)
 	if (!kvm_request_pending(vcpu))
 		return 0;
 	/*
-	 * We use MMU_RELOAD just to re-arm the ipte notifier for the
+	 * If the guest prefix changed, re-arm the ipte notifier for the
 	 * guest prefix page. gmap_mprotect_notify will wait on the ptl lock.
 	 * This ensures that the ipte instruction for this request has
 	 * already finished. We might race against a second unmapper that
 	 * wants to set the blocking bit. Lets just retry the request loop.
 	 */
-	if (kvm_check_request(KVM_REQ_MMU_RELOAD, vcpu)) {
+	if (kvm_check_request(KVM_REQ_REFRESH_GUEST_PREFIX, vcpu)) {
 		int rc;
 		rc = gmap_mprotect_notify(vcpu->arch.gmap,
 					  kvm_s390_get_prefix(vcpu),
 					  PAGE_SIZE * 2, PROT_WRITE);
 		if (rc) {
-			kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu);
+			kvm_make_request(KVM_REQ_REFRESH_GUEST_PREFIX, vcpu);
 			return rc;
 		}
 		goto retry;
@@ -3869,14 +3961,12 @@ static int kvm_s390_handle_requests(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
-void kvm_s390_set_tod_clock(struct kvm *kvm,
-			    const struct kvm_s390_vm_tod_clock *gtod)
+static void __kvm_s390_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod)
 {
 	struct kvm_vcpu *vcpu;
 	union tod_clock clk;
 	unsigned long i;
 
-	mutex_lock(&kvm->lock);
 	preempt_disable();
 
 	store_tod_clock_ext(&clk);
@@ -3897,9 +3987,24 @@ void kvm_s390_set_tod_clock(struct kvm *kvm,
 
 	kvm_s390_vcpu_unblock_all(kvm);
 	preempt_enable();
+}
+
+void kvm_s390_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod)
+{
+	mutex_lock(&kvm->lock);
+	__kvm_s390_set_tod_clock(kvm, gtod);
 	mutex_unlock(&kvm->lock);
 }
 
+int kvm_s390_try_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod)
+{
+	if (!mutex_trylock(&kvm->lock))
+		return 0;
+	__kvm_s390_set_tod_clock(kvm, gtod);
+	mutex_unlock(&kvm->lock);
+	return 1;
+}
+
 /**
  * kvm_arch_fault_in_page - fault-in guest page if necessary
  * @vcpu: The corresponding virtual cpu
@@ -4655,8 +4760,8 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 	return r;
 }
 
-static long kvm_s390_guest_sida_op(struct kvm_vcpu *vcpu,
+static long kvm_s390_vcpu_sida_op(struct kvm_vcpu *vcpu,
 				   struct kvm_s390_mem_op *mop)
 {
 	void __user *uaddr = (void __user *)mop->buf;
 	int r = 0;
@@ -4685,24 +4790,29 @@ static long kvm_s390_guest_sida_op(struct kvm_vcpu *vcpu,
 	}
 	return r;
 }
-static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
-				  struct kvm_s390_mem_op *mop)
+
+static long kvm_s390_vcpu_mem_op(struct kvm_vcpu *vcpu,
+				 struct kvm_s390_mem_op *mop)
 {
 	void __user *uaddr = (void __user *)mop->buf;
 	void *tmpbuf = NULL;
 	int r = 0;
 	const u64 supported_flags = KVM_S390_MEMOP_F_INJECT_EXCEPTION
-				    | KVM_S390_MEMOP_F_CHECK_ONLY;
+				    | KVM_S390_MEMOP_F_CHECK_ONLY
+				    | KVM_S390_MEMOP_F_SKEY_PROTECTION;
 
 	if (mop->flags & ~supported_flags || mop->ar >= NUM_ACRS || !mop->size)
 		return -EINVAL;
 
 	if (mop->size > MEM_OP_MAX_SIZE)
 		return -E2BIG;
 
 	if (kvm_s390_pv_cpu_is_protected(vcpu))
 		return -EINVAL;
+	if (mop->flags & KVM_S390_MEMOP_F_SKEY_PROTECTION) {
+		if (access_key_invalid(mop->key))
+			return -EINVAL;
+	} else {
+		mop->key = 0;
+	}
 	if (!(mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY)) {
 		tmpbuf = vmalloc(mop->size);
 		if (!tmpbuf)
@@ -4712,11 +4822,12 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
 	switch (mop->op) {
 	case KVM_S390_MEMOP_LOGICAL_READ:
 		if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
-			r = check_gva_range(vcpu, mop->gaddr, mop->ar,
-					    mop->size, GACC_FETCH);
+			r = check_gva_range(vcpu, mop->gaddr, mop->ar, mop->size,
+					    GACC_FETCH, mop->key);
 			break;
 		}
-		r = read_guest(vcpu, mop->gaddr, mop->ar, tmpbuf, mop->size);
+		r = read_guest_with_key(vcpu, mop->gaddr, mop->ar, tmpbuf,
+					mop->size, mop->key);
 		if (r == 0) {
 			if (copy_to_user(uaddr, tmpbuf, mop->size))
 				r = -EFAULT;
@@ -4724,15 +4835,16 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
 		break;
 	case KVM_S390_MEMOP_LOGICAL_WRITE:
 		if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
-			r = check_gva_range(vcpu, mop->gaddr, mop->ar,
-					    mop->size, GACC_STORE);
+			r = check_gva_range(vcpu, mop->gaddr, mop->ar, mop->size,
					    GACC_STORE, mop->key);
 			break;
 		}
 		if (copy_from_user(tmpbuf, uaddr, mop->size)) {
 			r = -EFAULT;
 			break;
 		}
-		r = write_guest(vcpu, mop->gaddr, mop->ar, tmpbuf, mop->size);
+		r = write_guest_with_key(vcpu, mop->gaddr, mop->ar, tmpbuf,
+					 mop->size, mop->key);
 		break;
 	}
@@ -4743,8 +4855,8 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
 	return r;
 }
 
-static long kvm_s390_guest_memsida_op(struct kvm_vcpu *vcpu,
+static long kvm_s390_vcpu_memsida_op(struct kvm_vcpu *vcpu,
 				      struct kvm_s390_mem_op *mop)
 {
 	int r, srcu_idx;
 
@@ -4753,12 +4865,12 @@ static long kvm_s390_guest_memsida_op(struct kvm_vcpu *vcpu,
 	switch (mop->op) {
 	case KVM_S390_MEMOP_LOGICAL_READ:
 	case KVM_S390_MEMOP_LOGICAL_WRITE:
-		r = kvm_s390_guest_mem_op(vcpu, mop);
+		r = kvm_s390_vcpu_mem_op(vcpu, mop);
 		break;
 	case KVM_S390_MEMOP_SIDA_READ:
 	case KVM_S390_MEMOP_SIDA_WRITE:
 		/* we are locked against sida going away by the vcpu->mutex */
-		r = kvm_s390_guest_sida_op(vcpu, mop);
+		r = kvm_s390_vcpu_sida_op(vcpu, mop);
 		break;
 	default:
 		r = -EINVAL;
@@ -4921,7 +5033,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		struct kvm_s390_mem_op mem_op;
 
 		if (copy_from_user(&mem_op, argp, sizeof(mem_op)) == 0)
-			r = kvm_s390_guest_memsida_op(vcpu, &mem_op);
+			r = kvm_s390_vcpu_memsida_op(vcpu, &mem_op);
 		else
 			r = -EFAULT;
 		break;
@@ -105,7 +105,7 @@ static inline void kvm_s390_set_prefix(struct kvm_vcpu *vcpu, u32 prefix)
 		   prefix);
 	vcpu->arch.sie_block->prefix = prefix >> GUEST_PREFIX_SHIFT;
 	kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
-	kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu);
+	kvm_make_request(KVM_REQ_REFRESH_GUEST_PREFIX, vcpu);
 }
 
 static inline u64 kvm_s390_get_base_disp_s(struct kvm_vcpu *vcpu, u8 *ar)
@@ -231,6 +231,15 @@ static inline unsigned long kvm_s390_get_gfn_end(struct kvm_memslots *slots)
 	return ms->base_gfn + ms->npages;
 }
 
+static inline u32 kvm_s390_get_gisa_desc(struct kvm *kvm)
+{
+	u32 gd = (u32)(u64)kvm->arch.gisa_int.origin;
+
+	if (gd && sclp.has_gisaf)
+		gd |= GISA_FORMAT1;
+	return gd;
+}
+
 /* implemented in pv.c */
 int kvm_s390_pv_destroy_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc);
 int kvm_s390_pv_create_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc);
@@ -349,8 +358,8 @@ int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu);
 int kvm_s390_handle_sigp_pei(struct kvm_vcpu *vcpu);
 
 /* implemented in kvm-s390.c */
-void kvm_s390_set_tod_clock(struct kvm *kvm,
-			    const struct kvm_s390_vm_tod_clock *gtod);
+void kvm_s390_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod);
+int kvm_s390_try_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod);
 long kvm_arch_fault_in_page(struct kvm_vcpu *vcpu, gpa_t gpa, int writable);
 int kvm_s390_store_status_unloaded(struct kvm_vcpu *vcpu, unsigned long addr);
 int kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr);
@@ -450,6 +459,8 @@ int kvm_s390_get_irq_state(struct kvm_vcpu *vcpu,
 void kvm_s390_gisa_init(struct kvm *kvm);
 void kvm_s390_gisa_clear(struct kvm *kvm);
 void kvm_s390_gisa_destroy(struct kvm *kvm);
+void kvm_s390_gisa_disable(struct kvm *kvm);
+void kvm_s390_gisa_enable(struct kvm *kvm);
 int kvm_s390_gib_init(u8 nisc);
 void kvm_s390_gib_destroy(void);
@@ -102,7 +102,20 @@ static int handle_set_clock(struct kvm_vcpu *vcpu)
 		return kvm_s390_inject_prog_cond(vcpu, rc);
 
 	VCPU_EVENT(vcpu, 3, "SCK: setting guest TOD to 0x%llx", gtod.tod);
-	kvm_s390_set_tod_clock(vcpu->kvm, &gtod);
+	/*
+	 * To set the TOD clock the kvm lock must be taken, but the vcpu lock
+	 * is already held in handle_set_clock. The usual lock order is the
+	 * opposite. As SCK is deprecated and should not be used in several
+	 * cases, for example when the multiple epoch facility or TOD clock
+	 * steering facility is installed (see Principles of Operation), a
+	 * slow path can be used. If the lock can not be taken via try_lock,
+	 * the instruction will be retried via -EAGAIN at a later point in
+	 * time.
+	 */
+	if (!kvm_s390_try_set_tod_clock(vcpu->kvm, &gtod)) {
+		kvm_s390_retry_instr(vcpu);
+		return -EAGAIN;
+	}
 
 	kvm_s390_set_psw_cc(vcpu, 0);
 	return 0;
@@ -1443,10 +1456,11 @@ int kvm_s390_handle_eb(struct kvm_vcpu *vcpu)
 
 static int handle_tprot(struct kvm_vcpu *vcpu)
 {
-	u64 address1, address2;
-	unsigned long hva, gpa;
-	int ret = 0, cc = 0;
+	u64 address, operand2;
+	unsigned long gpa;
+	u8 access_key;
 	bool writable;
+	int ret, cc;
 	u8 ar;
 
 	vcpu->stat.instruction_tprot++;
@@ -1454,43 +1468,46 @@ static int handle_tprot(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
 		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
 
-	kvm_s390_get_base_disp_sse(vcpu, &address1, &address2, &ar, NULL);
+	kvm_s390_get_base_disp_sse(vcpu, &address, &operand2, &ar, NULL);
+	access_key = (operand2 & 0xf0) >> 4;
 
-	/* we only handle the Linux memory detection case:
-	 * access key == 0
-	 * everything else goes to userspace. */
-	if (address2 & 0xf0)
-		return -EOPNOTSUPP;
 	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_DAT)
 		ipte_lock(vcpu);
-	ret = guest_translate_address(vcpu, address1, ar, &gpa, GACC_STORE);
-	if (ret == PGM_PROTECTION) {
+	ret = guest_translate_address_with_key(vcpu, address, ar, &gpa,
+					       GACC_STORE, access_key);
+	if (ret == 0) {
+		gfn_to_hva_prot(vcpu->kvm, gpa_to_gfn(gpa), &writable);
+	} else if (ret == PGM_PROTECTION) {
+		writable = false;
 		/* Write protected? Try again with read-only... */
-		cc = 1;
-		ret = guest_translate_address(vcpu, address1, ar, &gpa,
-					      GACC_FETCH);
+		ret = guest_translate_address_with_key(vcpu, address, ar, &gpa,
						       GACC_FETCH, access_key);
 	}
-	if (ret) {
-		if (ret == PGM_ADDRESSING || ret == PGM_TRANSLATION_SPEC) {
-			ret = kvm_s390_inject_program_int(vcpu, ret);
-		} else if (ret > 0) {
-			/* Translation not available */
-			kvm_s390_set_psw_cc(vcpu, 3);
+	if (ret >= 0) {
+		cc = -1;
+
+		/* Fetching permitted; storing permitted */
+		if (ret == 0 && writable)
+			cc = 0;
+		/* Fetching permitted; storing not permitted */
+		else if (ret == 0 && !writable)
+			cc = 1;
+		/* Fetching not permitted; storing not permitted */
+		else if (ret == PGM_PROTECTION)
+			cc = 2;
+		/* Translation not available */
+		else if (ret != PGM_ADDRESSING && ret != PGM_TRANSLATION_SPEC)
+			cc = 3;
+
+		if (cc != -1) {
+			kvm_s390_set_psw_cc(vcpu, cc);
 			ret = 0;
+		} else {
+			ret = kvm_s390_inject_program_int(vcpu, ret);
 		}
-		goto out_unlock;
 	}
 
-	hva = gfn_to_hva_prot(vcpu->kvm, gpa_to_gfn(gpa), &writable);
-	if (kvm_is_error_hva(hva)) {
-		ret = kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
-	} else {
-		if (!writable)
-			cc = 1; /* Write not permitted ==> read-only */
-		kvm_s390_set_psw_cc(vcpu, cc);
-		/* Note: CC2 only occurs for storage keys (not supported yet) */
-	}
-out_unlock:
 	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_DAT)
 		ipte_unlock(vcpu);
 	return ret;
@@ -59,11 +59,13 @@ static inline int copy_with_mvcos(void)
 #endif
 
 static inline unsigned long copy_from_user_mvcos(void *x, const void __user *ptr,
-						 unsigned long size)
+						 unsigned long size, unsigned long key)
 {
 	unsigned long tmp1, tmp2;
 	union oac spec = {
+		.oac2.key = key,
 		.oac2.as = PSW_BITS_AS_SECONDARY,
+		.oac2.k = 1,
 		.oac2.a = 1,
 	};
 
@@ -94,19 +96,19 @@ static inline unsigned long copy_from_user_mvcos(void *x, const void __user *ptr
 }
 
 static inline unsigned long copy_from_user_mvcp(void *x, const void __user *ptr,
-						unsigned long size)
+						unsigned long size, unsigned long key)
 {
 	unsigned long tmp1, tmp2;
 
 	tmp1 = -256UL;
 	asm volatile(
 		" sacf 0\n"
-		"0: mvcp 0(%0,%2),0(%1),%3\n"
+		"0: mvcp 0(%0,%2),0(%1),%[key]\n"
 		"7: jz 5f\n"
 		"1: algr %0,%3\n"
 		" la %1,256(%1)\n"
 		" la %2,256(%2)\n"
-		"2: mvcp 0(%0,%2),0(%1),%3\n"
+		"2: mvcp 0(%0,%2),0(%1),%[key]\n"
 		"8: jnz 1b\n"
 		" j 5f\n"
 		"3: la %4,255(%1)\n" /* %4 = ptr + 255 */
@@ -115,7 +117,7 @@ static inline unsigned long copy_from_user_mvcp(void *x, const void __user *ptr,
 		" slgr %4,%1\n"
 		" clgr %0,%4\n" /* copy crosses next page boundary? */
 		" jnh 6f\n"
-		"4: mvcp 0(%4,%2),0(%1),%3\n"
+		"4: mvcp 0(%4,%2),0(%1),%[key]\n"
 		"9: slgr %0,%4\n"
 		" j 6f\n"
 		"5: slgr %0,%0\n"
@@ -123,24 +125,49 @@ static inline unsigned long copy_from_user_mvcp(void *x, const void __user *ptr,
 		EX_TABLE(0b,3b) EX_TABLE(2b,3b) EX_TABLE(4b,6b)
 		EX_TABLE(7b,3b) EX_TABLE(8b,3b) EX_TABLE(9b,6b)
 		: "+a" (size), "+a" (ptr), "+a" (x), "+a" (tmp1), "=a" (tmp2)
-		: : "cc", "memory");
+		: [key] "d" (key << 4)
+		: "cc", "memory");
 	return size;
 }
 
+static unsigned long raw_copy_from_user_key(void *to, const void __user *from,
+					    unsigned long n, unsigned long key)
+{
+	if (copy_with_mvcos())
+		return copy_from_user_mvcos(to, from, n, key);
+	return copy_from_user_mvcp(to, from, n, key);
+}
+
 unsigned long raw_copy_from_user(void *to, const void __user *from, unsigned long n)
 {
-	if (copy_with_mvcos())
-		return copy_from_user_mvcos(to, from, n);
-	return copy_from_user_mvcp(to, from, n);
+	return raw_copy_from_user_key(to, from, n, 0);
 }
 EXPORT_SYMBOL(raw_copy_from_user);
 
+unsigned long _copy_from_user_key(void *to, const void __user *from,
+				  unsigned long n, unsigned long key)
+{
+	unsigned long res = n;
+
+	might_fault();
+	if (!should_fail_usercopy()) {
+		instrument_copy_from_user(to, from, n);
+		res = raw_copy_from_user_key(to, from, n, key);
+	}
+	if (unlikely(res))
+		memset(to + (n - res), 0, res);
+	return res;
+}
+EXPORT_SYMBOL(_copy_from_user_key);
+
 static inline unsigned long copy_to_user_mvcos(void __user *ptr, const void *x,
-					       unsigned long size)
+					       unsigned long size, unsigned long key)
 {
 	unsigned long tmp1, tmp2;
 	union oac spec = {
+		.oac1.key = key,
 		.oac1.as = PSW_BITS_AS_SECONDARY,
+		.oac1.k = 1,
 		.oac1.a = 1,
 	};
 
@@ -171,19 +198,19 @@ static inline unsigned long copy_to_user_mvcos(void __user *ptr, const void *x,
 }
 
 static inline unsigned long copy_to_user_mvcs(void __user *ptr, const void *x,
-					      unsigned long size)
+					      unsigned long size, unsigned long key)
 {
 	unsigned long tmp1, tmp2;
 
 	tmp1 = -256UL;
 	asm volatile(
 		" sacf 0\n"
-		"0: mvcs 0(%0,%1),0(%2),%3\n"
+		"0: mvcs 0(%0,%1),0(%2),%[key]\n"
 		"7: jz 5f\n"
 		"1: algr %0,%3\n"
 		" la %1,256(%1)\n"
 		" la %2,256(%2)\n"
-		"2: mvcs 0(%0,%1),0(%2),%3\n"
+		"2: mvcs 0(%0,%1),0(%2),%[key]\n"
 		"8: jnz 1b\n"
 		" j 5f\n"
 		"3: la %4,255(%1)\n" /* %4 = ptr + 255 */
@@ -192,7 +219,7 @@ static inline unsigned long copy_to_user_mvcs(void __user *ptr, const void *x,
 		" slgr %4,%1\n"
 		" clgr %0,%4\n" /* copy crosses next page boundary? */
 		" jnh 6f\n"
-		"4: mvcs 0(%4,%1),0(%2),%3\n"
+		"4: mvcs 0(%4,%1),0(%2),%[key]\n"
 		"9: slgr %0,%4\n"
 		" j 6f\n"
 		"5: slgr %0,%0\n"
@@ -200,18 +227,36 @@ static inline unsigned long copy_to_user_mvcs(void __user *ptr, const void *x,
 		EX_TABLE(0b,3b) EX_TABLE(2b,3b) EX_TABLE(4b,6b)
 		EX_TABLE(7b,3b) EX_TABLE(8b,3b) EX_TABLE(9b,6b)
 		: "+a" (size), "+a" (ptr), "+a" (x), "+a" (tmp1), "=a" (tmp2)
-		: : "cc", "memory");
+		: [key] "d" (key << 4)
+		: "cc", "memory");
 	return size;
 }
 
+static unsigned long raw_copy_to_user_key(void __user *to, const void *from,
+					  unsigned long n, unsigned long key)
+{
+	if (copy_with_mvcos())
+		return copy_to_user_mvcos(to, from, n, key);
+	return copy_to_user_mvcs(to, from, n, key);
+}
+
 unsigned long raw_copy_to_user(void __user *to, const void *from, unsigned long n)
 {
-	if (copy_with_mvcos())
-		return copy_to_user_mvcos(to, from, n);
-	return copy_to_user_mvcs(to, from, n);
+	return raw_copy_to_user_key(to, from, n, 0);
 }
 EXPORT_SYMBOL(raw_copy_to_user);
 
+unsigned long _copy_to_user_key(void __user *to, const void *from,
+				unsigned long n, unsigned long key)
+{
+	might_fault();
+	if (should_fail_usercopy())
+		return n;
+	instrument_copy_to_user(to, from, n);
+	return raw_copy_to_user_key(to, from, n, key);
+}
+EXPORT_SYMBOL(_copy_to_user_key);
+
 static inline unsigned long clear_user_mvcos(void __user *to, unsigned long size)
 {
 	unsigned long tmp1, tmp2;
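
The new _copy_from_user_key()/_copy_to_user_key() exports let callers move data through the guest's user-space mapping while MVCOS/MVCP/MVCS perform the storage-key comparison in hardware. A hypothetical caller might look like the following; the wrapper name and the way the access key reaches it are assumptions, not part of this hunk:

    /* Hypothetical wrapper, for illustration only. */
    static int store_into_guest_mapping(void __user *guest_uaddr, const void *data,
                                        unsigned long len, u8 access_key)
    {
            unsigned long uncopied;

            /* Hardware compares access_key against the storage key of each page
             * touched; a mismatch or a fault shows up as a short copy. */
            uncopied = _copy_to_user_key(guest_uaddr, data, len, access_key);
            if (uncopied)
                    return -EFAULT;
            return 0;
    }
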
@@ -1,29 +1,30 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-#if !defined(KVM_X86_OP) || !defined(KVM_X86_OP_NULL)
+#if !defined(KVM_X86_OP) || !defined(KVM_X86_OP_OPTIONAL)
 BUILD_BUG_ON(1)
 #endif
 
 /*
- * KVM_X86_OP() and KVM_X86_OP_NULL() are used to help generate
- * "static_call()"s. They are also intended for use when defining
- * the vmx/svm kvm_x86_ops. KVM_X86_OP() can be used for those
- * functions that follow the [svm|vmx]_func_name convention.
- * KVM_X86_OP_NULL() can leave a NULL definition for the
- * case where there is no definition or a function name that
- * doesn't match the typical naming convention is supplied.
+ * KVM_X86_OP() and KVM_X86_OP_OPTIONAL() are used to help generate
+ * both DECLARE/DEFINE_STATIC_CALL() invocations and
+ * "static_call_update()" calls.
+ *
+ * KVM_X86_OP_OPTIONAL() can be used for those functions that can have
+ * a NULL definition, for example if "static_call_cond()" will be used
+ * at the call sites. KVM_X86_OP_OPTIONAL_RET0() can be used likewise
+ * to make a definition optional, but in this case the default will
+ * be __static_call_return0.
 */
-KVM_X86_OP_NULL(hardware_enable)
-KVM_X86_OP_NULL(hardware_disable)
-KVM_X86_OP_NULL(hardware_unsetup)
-KVM_X86_OP_NULL(cpu_has_accelerated_tpr)
+KVM_X86_OP(hardware_enable)
+KVM_X86_OP(hardware_disable)
+KVM_X86_OP(hardware_unsetup)
 KVM_X86_OP(has_emulated_msr)
 KVM_X86_OP(vcpu_after_set_cpuid)
 KVM_X86_OP(vm_init)
-KVM_X86_OP_NULL(vm_destroy)
+KVM_X86_OP_OPTIONAL(vm_destroy)
 KVM_X86_OP(vcpu_create)
 KVM_X86_OP(vcpu_free)
 KVM_X86_OP(vcpu_reset)
-KVM_X86_OP(prepare_guest_switch)
+KVM_X86_OP(prepare_switch_to_guest)
 KVM_X86_OP(vcpu_load)
 KVM_X86_OP(vcpu_put)
 KVM_X86_OP(update_exception_bitmap)
@@ -33,9 +34,9 @@ KVM_X86_OP(get_segment_base)
 KVM_X86_OP(get_segment)
 KVM_X86_OP(get_cpl)
 KVM_X86_OP(set_segment)
-KVM_X86_OP_NULL(get_cs_db_l_bits)
+KVM_X86_OP(get_cs_db_l_bits)
 KVM_X86_OP(set_cr0)
-KVM_X86_OP_NULL(post_set_cr3)
+KVM_X86_OP_OPTIONAL(post_set_cr3)
 KVM_X86_OP(is_valid_cr4)
 KVM_X86_OP(set_cr4)
 KVM_X86_OP(set_efer)
@@ -49,22 +50,22 @@ KVM_X86_OP(cache_reg)
 KVM_X86_OP(get_rflags)
 KVM_X86_OP(set_rflags)
 KVM_X86_OP(get_if_flag)
-KVM_X86_OP(tlb_flush_all)
-KVM_X86_OP(tlb_flush_current)
-KVM_X86_OP_NULL(tlb_remote_flush)
-KVM_X86_OP_NULL(tlb_remote_flush_with_range)
-KVM_X86_OP(tlb_flush_gva)
-KVM_X86_OP(tlb_flush_guest)
+KVM_X86_OP(flush_tlb_all)
+KVM_X86_OP(flush_tlb_current)
+KVM_X86_OP_OPTIONAL(tlb_remote_flush)
+KVM_X86_OP_OPTIONAL(tlb_remote_flush_with_range)
+KVM_X86_OP(flush_tlb_gva)
+KVM_X86_OP(flush_tlb_guest)
 KVM_X86_OP(vcpu_pre_run)
-KVM_X86_OP(run)
-KVM_X86_OP_NULL(handle_exit)
-KVM_X86_OP_NULL(skip_emulated_instruction)
-KVM_X86_OP_NULL(update_emulated_instruction)
+KVM_X86_OP(vcpu_run)
+KVM_X86_OP(handle_exit)
+KVM_X86_OP(skip_emulated_instruction)
+KVM_X86_OP_OPTIONAL(update_emulated_instruction)
 KVM_X86_OP(set_interrupt_shadow)
 KVM_X86_OP(get_interrupt_shadow)
 KVM_X86_OP(patch_hypercall)
-KVM_X86_OP(set_irq)
-KVM_X86_OP(set_nmi)
+KVM_X86_OP(inject_irq)
+KVM_X86_OP(inject_nmi)
 KVM_X86_OP(queue_exception)
 KVM_X86_OP(cancel_injection)
 KVM_X86_OP(interrupt_allowed)
@@ -73,22 +74,22 @@ KVM_X86_OP(get_nmi_mask)
 KVM_X86_OP(set_nmi_mask)
 KVM_X86_OP(enable_nmi_window)
 KVM_X86_OP(enable_irq_window)
-KVM_X86_OP(update_cr8_intercept)
+KVM_X86_OP_OPTIONAL(update_cr8_intercept)
 KVM_X86_OP(check_apicv_inhibit_reasons)
 KVM_X86_OP(refresh_apicv_exec_ctrl)
-KVM_X86_OP(hwapic_irr_update)
-KVM_X86_OP(hwapic_isr_update)
-KVM_X86_OP_NULL(guest_apic_has_interrupt)
-KVM_X86_OP(load_eoi_exitmap)
-KVM_X86_OP(set_virtual_apic_mode)
-KVM_X86_OP_NULL(set_apic_access_page_addr)
+KVM_X86_OP_OPTIONAL(hwapic_irr_update)
+KVM_X86_OP_OPTIONAL(hwapic_isr_update)
+KVM_X86_OP_OPTIONAL_RET0(guest_apic_has_interrupt)
+KVM_X86_OP_OPTIONAL(load_eoi_exitmap)
+KVM_X86_OP_OPTIONAL(set_virtual_apic_mode)
+KVM_X86_OP_OPTIONAL(set_apic_access_page_addr)
 KVM_X86_OP(deliver_interrupt)
-KVM_X86_OP_NULL(sync_pir_to_irr)
-KVM_X86_OP(set_tss_addr)
-KVM_X86_OP(set_identity_map_addr)
+KVM_X86_OP_OPTIONAL(sync_pir_to_irr)
+KVM_X86_OP_OPTIONAL_RET0(set_tss_addr)
+KVM_X86_OP_OPTIONAL_RET0(set_identity_map_addr)
 KVM_X86_OP(get_mt_mask)
 KVM_X86_OP(load_mmu_pgd)
-KVM_X86_OP_NULL(has_wbinvd_exit)
+KVM_X86_OP(has_wbinvd_exit)
 KVM_X86_OP(get_l2_tsc_offset)
 KVM_X86_OP(get_l2_tsc_multiplier)
 KVM_X86_OP(write_tsc_offset)
@@ -96,32 +97,36 @@ KVM_X86_OP(write_tsc_multiplier)
 KVM_X86_OP(get_exit_info)
 KVM_X86_OP(check_intercept)
 KVM_X86_OP(handle_exit_irqoff)
-KVM_X86_OP_NULL(request_immediate_exit)
+KVM_X86_OP(request_immediate_exit)
 KVM_X86_OP(sched_in)
-KVM_X86_OP_NULL(update_cpu_dirty_logging)
-KVM_X86_OP_NULL(vcpu_blocking)
-KVM_X86_OP_NULL(vcpu_unblocking)
-KVM_X86_OP_NULL(update_pi_irte)
-KVM_X86_OP_NULL(start_assignment)
-KVM_X86_OP_NULL(apicv_post_state_restore)
-KVM_X86_OP_NULL(dy_apicv_has_pending_interrupt)
-KVM_X86_OP_NULL(set_hv_timer)
-KVM_X86_OP_NULL(cancel_hv_timer)
+KVM_X86_OP_OPTIONAL(update_cpu_dirty_logging)
+KVM_X86_OP_OPTIONAL(vcpu_blocking)
+KVM_X86_OP_OPTIONAL(vcpu_unblocking)
+KVM_X86_OP_OPTIONAL(pi_update_irte)
+KVM_X86_OP_OPTIONAL(pi_start_assignment)
+KVM_X86_OP_OPTIONAL(apicv_post_state_restore)
+KVM_X86_OP_OPTIONAL_RET0(dy_apicv_has_pending_interrupt)
+KVM_X86_OP_OPTIONAL(set_hv_timer)
+KVM_X86_OP_OPTIONAL(cancel_hv_timer)
 KVM_X86_OP(setup_mce)
 KVM_X86_OP(smi_allowed)
 KVM_X86_OP(enter_smm)
 KVM_X86_OP(leave_smm)
 KVM_X86_OP(enable_smi_window)
-KVM_X86_OP_NULL(mem_enc_op)
-KVM_X86_OP_NULL(mem_enc_reg_region)
-KVM_X86_OP_NULL(mem_enc_unreg_region)
+KVM_X86_OP_OPTIONAL(mem_enc_ioctl)
+KVM_X86_OP_OPTIONAL(mem_enc_register_region)
+KVM_X86_OP_OPTIONAL(mem_enc_unregister_region)
+KVM_X86_OP_OPTIONAL(vm_copy_enc_context_from)
+KVM_X86_OP_OPTIONAL(vm_move_enc_context_from)
 KVM_X86_OP(get_msr_feature)
 KVM_X86_OP(can_emulate_instruction)
 KVM_X86_OP(apic_init_signal_blocked)
-KVM_X86_OP_NULL(enable_direct_tlbflush)
-KVM_X86_OP_NULL(migrate_timers)
+KVM_X86_OP_OPTIONAL(enable_direct_tlbflush)
+KVM_X86_OP_OPTIONAL(migrate_timers)
 KVM_X86_OP(msr_filter_changed)
-KVM_X86_OP_NULL(complete_emulated_msr)
+KVM_X86_OP(complete_emulated_msr)
+KVM_X86_OP(vcpu_deliver_sipi_vector)
 
 #undef KVM_X86_OP
-#undef KVM_X86_OP_NULL
+#undef KVM_X86_OP_OPTIONAL
+#undef KVM_X86_OP_OPTIONAL_RET0
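
The comment block above describes how the list is meant to be instantiated: a consumer defines the three macros before including the header. The exact definitions used elsewhere in KVM are not part of this hunk, so the following is only an illustration of the pattern, leaning on the generic DEFINE_STATIC_CALL_NULL()/DEFINE_STATIC_CALL_RET0() helpers:

    /* Illustrative consumer; not the verbatim KVM definitions. */
    #define KVM_X86_OP(func) \
            DEFINE_STATIC_CALL_NULL(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func));
    #define KVM_X86_OP_OPTIONAL KVM_X86_OP
    #define KVM_X86_OP_OPTIONAL_RET0(func) \
            DEFINE_STATIC_CALL_RET0(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func));
    #include <asm/kvm-x86-ops.h>

    /* Later, once kvm_x86_ops has been filled in by vmx/svm:
     *   static_call_update(kvm_x86_vcpu_run, kvm_x86_ops.vcpu_run);
     * and optional hooks are invoked through static_call_cond(). */
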
@@ -15,6 +15,7 @@
 #include <linux/cpumask.h>
 #include <linux/irq_work.h>
 #include <linux/irq.h>
+#include <linux/workqueue.h>
 
 #include <linux/kvm.h>
 #include <linux/kvm_para.h>
@@ -102,6 +103,8 @@
 #define KVM_REQ_MSR_FILTER_CHANGED	KVM_ARCH_REQ(29)
 #define KVM_REQ_UPDATE_CPU_DIRTY_LOGGING \
 	KVM_ARCH_REQ_FLAGS(30, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
+#define KVM_REQ_MMU_FREE_OBSOLETE_ROOTS \
+	KVM_ARCH_REQ_FLAGS(31, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 
 #define CR0_RESERVED_BITS \
 	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
@@ -432,8 +435,7 @@ struct kvm_mmu {
 	int (*sync_page)(struct kvm_vcpu *vcpu,
 			 struct kvm_mmu_page *sp);
 	void (*invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa);
-	hpa_t root_hpa;
-	gpa_t root_pgd;
+	struct kvm_mmu_root_info root;
 	union kvm_mmu_role mmu_role;
 	u8 root_level;
 	u8 shadow_root_level;
@@ -1128,10 +1130,6 @@ struct kvm_arch {
 	struct kvm_hv hyperv;
 	struct kvm_xen xen;
 
-#ifdef CONFIG_KVM_MMU_AUDIT
-	int audit_point;
-#endif
-
 	bool backwards_tsc_observed;
 	bool boot_vcpu_runs_old_kvmclock;
 	u32 bsp_vcpu_id;
@@ -1151,6 +1149,7 @@ struct kvm_arch {
 	bool exception_payload_enabled;
 
 	bool bus_lock_detection_enabled;
+	bool enable_pmu;
 	/*
 	 * If exit_on_emulation_error is set, and the in-kernel instruction
 	 * emulator fails to emulate an instruction, allow userspace
@@ -1220,6 +1219,7 @@ struct kvm_arch {
 	 * the thread holds the MMU lock in write mode.
 	 */
 	spinlock_t tdp_mmu_pages_lock;
+	struct workqueue_struct *tdp_mmu_zap_wq;
 #endif /* CONFIG_X86_64 */
 
 	/*
@@ -1318,7 +1318,6 @@ struct kvm_x86_ops {
 	int (*hardware_enable)(void);
 	void (*hardware_disable)(void);
 	void (*hardware_unsetup)(void);
-	bool (*cpu_has_accelerated_tpr)(void);
 	bool (*has_emulated_msr)(struct kvm *kvm, u32 index);
 	void (*vcpu_after_set_cpuid)(struct kvm_vcpu *vcpu);
 
@@ -1331,7 +1330,7 @@ struct kvm_x86_ops {
 	void (*vcpu_free)(struct kvm_vcpu *vcpu);
 	void (*vcpu_reset)(struct kvm_vcpu *vcpu, bool init_event);
 
-	void (*prepare_guest_switch)(struct kvm_vcpu *vcpu);
+	void (*prepare_switch_to_guest)(struct kvm_vcpu *vcpu);
 	void (*vcpu_load)(struct kvm_vcpu *vcpu, int cpu);
 	void (*vcpu_put)(struct kvm_vcpu *vcpu);
 
@@ -1361,8 +1360,8 @@ struct kvm_x86_ops {
 	void (*set_rflags)(struct kvm_vcpu *vcpu, unsigned long rflags);
 	bool (*get_if_flag)(struct kvm_vcpu *vcpu);
 
-	void (*tlb_flush_all)(struct kvm_vcpu *vcpu);
-	void (*tlb_flush_current)(struct kvm_vcpu *vcpu);
+	void (*flush_tlb_all)(struct kvm_vcpu *vcpu);
+	void (*flush_tlb_current)(struct kvm_vcpu *vcpu);
 	int (*tlb_remote_flush)(struct kvm *kvm);
 	int (*tlb_remote_flush_with_range)(struct kvm *kvm,
 					   struct kvm_tlb_range *range);
@@ -1373,16 +1372,16 @@ struct kvm_x86_ops {
 	 * Can potentially get non-canonical addresses through INVLPGs, which
 	 * the implementation may choose to ignore if appropriate.
 	 */
-	void (*tlb_flush_gva)(struct kvm_vcpu *vcpu, gva_t addr);
+	void (*flush_tlb_gva)(struct kvm_vcpu *vcpu, gva_t addr);
 
 	/*
 	 * Flush any TLB entries created by the guest. Like tlb_flush_gva(),
 	 * does not need to flush GPA->HPA mappings.
 	 */
-	void (*tlb_flush_guest)(struct kvm_vcpu *vcpu);
+	void (*flush_tlb_guest)(struct kvm_vcpu *vcpu);
 
 	int (*vcpu_pre_run)(struct kvm_vcpu *vcpu);
-	enum exit_fastpath_completion (*run)(struct kvm_vcpu *vcpu);
+	enum exit_fastpath_completion (*vcpu_run)(struct kvm_vcpu *vcpu);
 	int (*handle_exit)(struct kvm_vcpu *vcpu,
 			   enum exit_fastpath_completion exit_fastpath);
 	int (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
@@ -1391,8 +1390,8 @@ struct kvm_x86_ops {
 	u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu);
 	void (*patch_hypercall)(struct kvm_vcpu *vcpu,
 				unsigned char *hypercall_addr);
-	void (*set_irq)(struct kvm_vcpu *vcpu);
-	void (*set_nmi)(struct kvm_vcpu *vcpu);
+	void (*inject_irq)(struct kvm_vcpu *vcpu);
+	void (*inject_nmi)(struct kvm_vcpu *vcpu);
 	void (*queue_exception)(struct kvm_vcpu *vcpu);
 	void (*cancel_injection)(struct kvm_vcpu *vcpu);
 	int (*interrupt_allowed)(struct kvm_vcpu *vcpu, bool for_injection);
|
||||||
|
@ -1459,9 +1458,9 @@ struct kvm_x86_ops {
|
||||||
void (*vcpu_blocking)(struct kvm_vcpu *vcpu);
|
void (*vcpu_blocking)(struct kvm_vcpu *vcpu);
|
||||||
void (*vcpu_unblocking)(struct kvm_vcpu *vcpu);
|
void (*vcpu_unblocking)(struct kvm_vcpu *vcpu);
|
||||||
|
|
||||||
int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq,
|
int (*pi_update_irte)(struct kvm *kvm, unsigned int host_irq,
|
||||||
uint32_t guest_irq, bool set);
|
uint32_t guest_irq, bool set);
|
||||||
void (*start_assignment)(struct kvm *kvm);
|
void (*pi_start_assignment)(struct kvm *kvm);
|
||||||
void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu);
|
void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu);
|
||||||
bool (*dy_apicv_has_pending_interrupt)(struct kvm_vcpu *vcpu);
|
bool (*dy_apicv_has_pending_interrupt)(struct kvm_vcpu *vcpu);
|
||||||
|
|
||||||
|
@ -1476,9 +1475,9 @@ struct kvm_x86_ops {
|
||||||
int (*leave_smm)(struct kvm_vcpu *vcpu, const char *smstate);
|
int (*leave_smm)(struct kvm_vcpu *vcpu, const char *smstate);
|
||||||
void (*enable_smi_window)(struct kvm_vcpu *vcpu);
|
void (*enable_smi_window)(struct kvm_vcpu *vcpu);
|
||||||
|
|
||||||
int (*mem_enc_op)(struct kvm *kvm, void __user *argp);
|
int (*mem_enc_ioctl)(struct kvm *kvm, void __user *argp);
|
||||||
int (*mem_enc_reg_region)(struct kvm *kvm, struct kvm_enc_region *argp);
|
int (*mem_enc_register_region)(struct kvm *kvm, struct kvm_enc_region *argp);
|
||||||
int (*mem_enc_unreg_region)(struct kvm *kvm, struct kvm_enc_region *argp);
|
int (*mem_enc_unregister_region)(struct kvm *kvm, struct kvm_enc_region *argp);
|
||||||
int (*vm_copy_enc_context_from)(struct kvm *kvm, unsigned int source_fd);
|
int (*vm_copy_enc_context_from)(struct kvm *kvm, unsigned int source_fd);
|
||||||
int (*vm_move_enc_context_from)(struct kvm *kvm, unsigned int source_fd);
|
int (*vm_move_enc_context_from)(struct kvm *kvm, unsigned int source_fd);
|
||||||
|
|
||||||
|
@ -1541,15 +1540,22 @@ extern struct kvm_x86_ops kvm_x86_ops;
|
||||||
|
|
||||||
#define KVM_X86_OP(func) \
|
#define KVM_X86_OP(func) \
|
||||||
DECLARE_STATIC_CALL(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func));
|
DECLARE_STATIC_CALL(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func));
|
||||||
#define KVM_X86_OP_NULL KVM_X86_OP
|
#define KVM_X86_OP_OPTIONAL KVM_X86_OP
|
||||||
|
#define KVM_X86_OP_OPTIONAL_RET0 KVM_X86_OP
|
||||||
#include <asm/kvm-x86-ops.h>
|
#include <asm/kvm-x86-ops.h>
|
||||||
|
|
||||||
static inline void kvm_ops_static_call_update(void)
|
static inline void kvm_ops_static_call_update(void)
|
||||||
{
|
{
|
||||||
#define KVM_X86_OP(func) \
|
#define __KVM_X86_OP(func) \
|
||||||
static_call_update(kvm_x86_##func, kvm_x86_ops.func);
|
static_call_update(kvm_x86_##func, kvm_x86_ops.func);
|
||||||
#define KVM_X86_OP_NULL KVM_X86_OP
|
#define KVM_X86_OP(func) \
|
||||||
|
WARN_ON(!kvm_x86_ops.func); __KVM_X86_OP(func)
|
||||||
|
#define KVM_X86_OP_OPTIONAL __KVM_X86_OP
|
||||||
|
#define KVM_X86_OP_OPTIONAL_RET0(func) \
|
||||||
|
static_call_update(kvm_x86_##func, (void *)kvm_x86_ops.func ? : \
|
||||||
|
(void *)__static_call_return0);
|
||||||
#include <asm/kvm-x86-ops.h>
|
#include <asm/kvm-x86-ops.h>
|
||||||
|
#undef __KVM_X86_OP
|
||||||
}
|
}
|
||||||
|
|
||||||
#define __KVM_HAVE_ARCH_VM_ALLOC
|
#define __KVM_HAVE_ARCH_VM_ALLOC
|
||||||
|
@ -1587,6 +1593,13 @@ void kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
|
||||||
void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
|
void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
|
||||||
const struct kvm_memory_slot *memslot,
|
const struct kvm_memory_slot *memslot,
|
||||||
int start_level);
|
int start_level);
|
||||||
|
void kvm_mmu_slot_try_split_huge_pages(struct kvm *kvm,
|
||||||
|
const struct kvm_memory_slot *memslot,
|
||||||
|
int target_level);
|
||||||
|
void kvm_mmu_try_split_huge_pages(struct kvm *kvm,
|
||||||
|
const struct kvm_memory_slot *memslot,
|
||||||
|
u64 start, u64 end,
|
||||||
|
int target_level);
|
||||||
void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
|
void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
|
||||||
const struct kvm_memory_slot *memslot);
|
const struct kvm_memory_slot *memslot);
|
||||||
void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
|
void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
|
||||||
|
@ -1724,7 +1737,6 @@ int kvm_set_dr(struct kvm_vcpu *vcpu, int dr, unsigned long val);
|
||||||
void kvm_get_dr(struct kvm_vcpu *vcpu, int dr, unsigned long *val);
|
void kvm_get_dr(struct kvm_vcpu *vcpu, int dr, unsigned long *val);
|
||||||
unsigned long kvm_get_cr8(struct kvm_vcpu *vcpu);
|
unsigned long kvm_get_cr8(struct kvm_vcpu *vcpu);
|
||||||
void kvm_lmsw(struct kvm_vcpu *vcpu, unsigned long msw);
|
void kvm_lmsw(struct kvm_vcpu *vcpu, unsigned long msw);
|
||||||
void kvm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l);
|
|
||||||
int kvm_emulate_xsetbv(struct kvm_vcpu *vcpu);
|
int kvm_emulate_xsetbv(struct kvm_vcpu *vcpu);
|
||||||
|
|
||||||
int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr);
|
int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr);
|
||||||
|
@ -1769,9 +1781,9 @@ void kvm_inject_nmi(struct kvm_vcpu *vcpu);
|
||||||
void kvm_update_dr7(struct kvm_vcpu *vcpu);
|
void kvm_update_dr7(struct kvm_vcpu *vcpu);
|
||||||
|
|
||||||
int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn);
|
int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn);
|
||||||
void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
|
void kvm_mmu_free_roots(struct kvm *kvm, struct kvm_mmu *mmu,
|
||||||
ulong roots_to_free);
|
ulong roots_to_free);
|
||||||
void kvm_mmu_free_guest_mode_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu);
|
void kvm_mmu_free_guest_mode_roots(struct kvm *kvm, struct kvm_mmu *mmu);
|
||||||
gpa_t kvm_mmu_gva_to_gpa_read(struct kvm_vcpu *vcpu, gva_t gva,
|
gpa_t kvm_mmu_gva_to_gpa_read(struct kvm_vcpu *vcpu, gva_t gva,
|
||||||
struct x86_exception *exception);
|
struct x86_exception *exception);
|
||||||
gpa_t kvm_mmu_gva_to_gpa_fetch(struct kvm_vcpu *vcpu, gva_t gva,
|
gpa_t kvm_mmu_gva_to_gpa_fetch(struct kvm_vcpu *vcpu, gva_t gva,
|
||||||
|
@ -1878,7 +1890,7 @@ static inline bool kvm_is_supported_user_return_msr(u32 msr)
|
||||||
return kvm_find_user_return_msr(msr) >= 0;
|
return kvm_find_user_return_msr(msr) >= 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc, u64 ratio);
|
u64 kvm_scale_tsc(u64 tsc, u64 ratio);
|
||||||
u64 kvm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc);
|
u64 kvm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc);
|
||||||
u64 kvm_calc_nested_tsc_offset(u64 l1_offset, u64 l2_offset, u64 l2_multiplier);
|
u64 kvm_calc_nested_tsc_offset(u64 l1_offset, u64 l2_offset, u64 l2_multiplier);
|
||||||
u64 kvm_calc_nested_tsc_multiplier(u64 l1_multiplier, u64 l2_multiplier);
|
u64 kvm_calc_nested_tsc_multiplier(u64 l1_multiplier, u64 l2_multiplier);
|
||||||
|
@ -1955,4 +1967,11 @@ int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages);
|
||||||
#define KVM_CLOCK_VALID_FLAGS \
|
#define KVM_CLOCK_VALID_FLAGS \
|
||||||
(KVM_CLOCK_TSC_STABLE | KVM_CLOCK_REALTIME | KVM_CLOCK_HOST_TSC)
|
(KVM_CLOCK_TSC_STABLE | KVM_CLOCK_REALTIME | KVM_CLOCK_HOST_TSC)
|
||||||
|
|
||||||
|
#define KVM_X86_VALID_QUIRKS \
|
||||||
|
(KVM_X86_QUIRK_LINT0_REENABLED | \
|
||||||
|
KVM_X86_QUIRK_CD_NW_CLEARED | \
|
||||||
|
KVM_X86_QUIRK_LAPIC_MMIO_HOLE | \
|
||||||
|
KVM_X86_QUIRK_OUT_7E_INC_RIP | \
|
||||||
|
KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)
|
||||||
|
|
||||||
#endif /* _ASM_X86_KVM_HOST_H */
|
#endif /* _ASM_X86_KVM_HOST_H */
|

@@ -226,7 +226,7 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
 #define AVIC_LOGICAL_ID_ENTRY_VALID_BIT 31
 #define AVIC_LOGICAL_ID_ENTRY_VALID_MASK (1 << 31)

-#define AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK (0xFFULL)
+#define AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK GENMASK_ULL(11, 0)
 #define AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK (0xFFFFFFFFFFULL << 12)
 #define AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK (1ULL << 62)
 #define AVIC_PHYSICAL_ID_ENTRY_VALID_MASK (1ULL << 63)

@@ -126,13 +126,6 @@ config KVM_XEN

 If in doubt, say "N".

-config KVM_MMU_AUDIT
-bool "Audit KVM MMU"
-depends on KVM && TRACEPOINTS
-help
-This option adds a R/W kVM module parameter 'mmu_audit', which allows
-auditing of KVM MMU events at runtime.
-
 config KVM_EXTERNAL_WRITE_TRACKING
 bool


@@ -715,9 +715,30 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,

 entry = &array->entries[array->nent++];

+memset(entry, 0, sizeof(*entry));
 entry->function = function;
 entry->index = index;
-entry->flags = 0;
+switch (function & 0xC0000000) {
+case 0x40000000:
+/* Hypervisor leaves are always synthesized by __do_cpuid_func. */
+return entry;
+
+case 0x80000000:
+/*
+* 0x80000021 is sometimes synthesized by __do_cpuid_func, which
+* would result in out-of-bounds calls to do_host_cpuid.
+*/
+{
+static int max_cpuid_80000000;
+if (!READ_ONCE(max_cpuid_80000000))
+WRITE_ONCE(max_cpuid_80000000, cpuid_eax(0x80000000));
+if (function > READ_ONCE(max_cpuid_80000000))
+return entry;
+}
+
+default:
+break;
+}

 cpuid_count(entry->function, entry->index,
 &entry->eax, &entry->ebx, &entry->ecx, &entry->edx);
@@ -1061,7 +1082,15 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 entry->edx = 0;
 break;
 case 0x80000000:
-entry->eax = min(entry->eax, 0x8000001f);
+entry->eax = min(entry->eax, 0x80000021);
+/*
+* Serializing LFENCE is reported in a multitude of ways,
+* and NullSegClearsBase is not reported in CPUID on Zen2;
+* help userspace by providing the CPUID leaf ourselves.
+*/
+if (static_cpu_has(X86_FEATURE_LFENCE_RDTSC)
+|| !static_cpu_has_bug(X86_BUG_NULL_SEG))
+entry->eax = max(entry->eax, 0x80000021);
 break;
 case 0x80000001:
 cpuid_entry_override(entry, CPUID_8000_0001_EDX);
@@ -1132,6 +1161,27 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 entry->ebx &= ~GENMASK(11, 6);
 }
 break;
+case 0x80000020:
+entry->eax = entry->ebx = entry->ecx = entry->edx = 0;
+break;
+case 0x80000021:
+entry->ebx = entry->ecx = entry->edx = 0;
+/*
+* Pass down these bits:
+* EAX 0 NNDBP, Processor ignores nested data breakpoints
+* EAX 2 LAS, LFENCE always serializing
+* EAX 6 NSCB, Null selector clear base
+*
+* Other defined bits are for MSRs that KVM does not expose:
+* EAX 3 SPCL, SMM page configuration lock
+* EAX 13 PCMSR, Prefetch control MSR
+*/
+entry->eax &= BIT(0) | BIT(2) | BIT(6);
+if (static_cpu_has(X86_FEATURE_LFENCE_RDTSC))
+entry->eax |= BIT(2);
+if (!static_cpu_has_bug(X86_BUG_NULL_SEG))
+entry->eax |= BIT(6);
+break;
 /*Add support for Centaur's CPUID instruction*/
 case 0xC0000000:
 /*Just support up to 0xC0000004 now*/
@@ -1241,8 +1291,7 @@ int kvm_dev_ioctl_get_cpuid(struct kvm_cpuid2 *cpuid,
 if (sanity_check_entries(entries, cpuid->nent, type))
 return -EINVAL;

-array.entries = vzalloc(array_size(sizeof(struct kvm_cpuid_entry2),
+array.entries = kvcalloc(sizeof(struct kvm_cpuid_entry2), cpuid->nent, GFP_KERNEL);
-cpuid->nent));
 if (!array.entries)
 return -ENOMEM;

@@ -1260,7 +1309,7 @@ int kvm_dev_ioctl_get_cpuid(struct kvm_cpuid2 *cpuid,
 r = -EFAULT;

 out_free:
-vfree(array.entries);
+kvfree(array.entries);
 return r;
 }


@@ -1623,11 +1623,6 @@ static int __load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
 goto exception;
 }

-if (!seg_desc.p) {
-err_vec = (seg == VCPU_SREG_SS) ? SS_VECTOR : NP_VECTOR;
-goto exception;
-}
-
 dpl = seg_desc.dpl;

 switch (seg) {
@@ -1643,14 +1638,34 @@ static int __load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
 if (!(seg_desc.type & 8))
 goto exception;

-if (seg_desc.type & 4) {
+if (transfer == X86_TRANSFER_RET) {
-/* conforming */
+/* RET can never return to an inner privilege level. */
-if (dpl > cpl)
+if (rpl < cpl)
-goto exception;
-} else {
-/* nonconforming */
-if (rpl > cpl || dpl != cpl)
 goto exception;
+/* Outer-privilege level return is not implemented */
+if (rpl > cpl)
+return X86EMUL_UNHANDLEABLE;
+}
+if (transfer == X86_TRANSFER_RET || transfer == X86_TRANSFER_TASK_SWITCH) {
+if (seg_desc.type & 4) {
+/* conforming */
+if (dpl > rpl)
+goto exception;
+} else {
+/* nonconforming */
+if (dpl != rpl)
+goto exception;
+}
+} else { /* X86_TRANSFER_CALL_JMP */
+if (seg_desc.type & 4) {
+/* conforming */
+if (dpl > cpl)
+goto exception;
+} else {
+/* nonconforming */
+if (rpl > cpl || dpl != cpl)
+goto exception;
+}
+}
 }
 /* in long-mode d/b must be clear if l is set */
 if (seg_desc.d && seg_desc.l) {
@@ -1667,6 +1682,10 @@ static int __load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
 case VCPU_SREG_TR:
 if (seg_desc.s || (seg_desc.type != 1 && seg_desc.type != 9))
 goto exception;
+if (!seg_desc.p) {
+err_vec = NP_VECTOR;
+goto exception;
+}
 old_desc = seg_desc;
 seg_desc.type |= 2; /* busy */
 ret = ctxt->ops->cmpxchg_emulated(ctxt, desc_addr, &old_desc, &seg_desc,
@@ -1691,6 +1710,11 @@ static int __load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
 break;
 }

+if (!seg_desc.p) {
+err_vec = (seg == VCPU_SREG_SS) ? SS_VECTOR : NP_VECTOR;
+goto exception;
+}
+
 if (seg_desc.s) {
 /* mark segment as accessed */
 if (!(seg_desc.type & 1)) {
@@ -2216,9 +2240,6 @@ static int em_ret_far(struct x86_emulate_ctxt *ctxt)
 rc = emulate_pop(ctxt, &cs, ctxt->op_bytes);
 if (rc != X86EMUL_CONTINUE)
 return rc;
-/* Outer-privilege level return is not implemented */
-if (ctxt->mode >= X86EMUL_MODE_PROT16 && (cs & 3) > cpl)
-return X86EMUL_UNHANDLEABLE;
 rc = __load_segment_descriptor(ctxt, (u16)cs, VCPU_SREG_CS, cpl,
 X86_TRANSFER_RET,
 &new_desc);
@@ -2615,8 +2636,7 @@ static int em_rsm(struct x86_emulate_ctxt *ctxt)
 }

 static void
-setup_syscalls_segments(struct x86_emulate_ctxt *ctxt,
+setup_syscalls_segments(struct desc_struct *cs, struct desc_struct *ss)
-struct desc_struct *cs, struct desc_struct *ss)
 {
 cs->l = 0; /* will be adjusted later */
 set_desc_base(cs, 0); /* flat segment */
@@ -2705,7 +2725,7 @@ static int em_syscall(struct x86_emulate_ctxt *ctxt)
 if (!(efer & EFER_SCE))
 return emulate_ud(ctxt);

-setup_syscalls_segments(ctxt, &cs, &ss);
+setup_syscalls_segments(&cs, &ss);
 ops->get_msr(ctxt, MSR_STAR, &msr_data);
 msr_data >>= 32;
 cs_sel = (u16)(msr_data & 0xfffc);
@@ -2773,7 +2793,7 @@ static int em_sysenter(struct x86_emulate_ctxt *ctxt)
 if ((msr_data & 0xfffc) == 0x0)
 return emulate_gp(ctxt, 0);

-setup_syscalls_segments(ctxt, &cs, &ss);
+setup_syscalls_segments(&cs, &ss);
 ctxt->eflags &= ~(X86_EFLAGS_VM | X86_EFLAGS_IF);
 cs_sel = (u16)msr_data & ~SEGMENT_RPL_MASK;
 ss_sel = cs_sel + 8;
@@ -2810,7 +2830,7 @@ static int em_sysexit(struct x86_emulate_ctxt *ctxt)
 ctxt->mode == X86EMUL_MODE_VM86)
 return emulate_gp(ctxt, 0);

-setup_syscalls_segments(ctxt, &cs, &ss);
+setup_syscalls_segments(&cs, &ss);

 if ((ctxt->rex_prefix & 0x8) != 0x0)
 usermode = X86EMUL_MODE_PROT64;
@@ -3028,8 +3048,7 @@ static int load_state_from_tss16(struct x86_emulate_ctxt *ctxt,
 return X86EMUL_CONTINUE;
 }

-static int task_switch_16(struct x86_emulate_ctxt *ctxt,
+static int task_switch_16(struct x86_emulate_ctxt *ctxt, u16 old_tss_sel,
-u16 tss_selector, u16 old_tss_sel,
 ulong old_tss_base, struct desc_struct *new_desc)
 {
 struct tss_segment_16 tss_seg;
@@ -3167,8 +3186,7 @@ static int load_state_from_tss32(struct x86_emulate_ctxt *ctxt,
 return ret;
 }

-static int task_switch_32(struct x86_emulate_ctxt *ctxt,
+static int task_switch_32(struct x86_emulate_ctxt *ctxt, u16 old_tss_sel,
-u16 tss_selector, u16 old_tss_sel,
 ulong old_tss_base, struct desc_struct *new_desc)
 {
 struct tss_segment_32 tss_seg;
@@ -3276,10 +3294,9 @@ static int emulator_do_task_switch(struct x86_emulate_ctxt *ctxt,
 old_tss_sel = 0xffff;

 if (next_tss_desc.type & 8)
-ret = task_switch_32(ctxt, tss_selector, old_tss_sel,
+ret = task_switch_32(ctxt, old_tss_sel, old_tss_base, &next_tss_desc);
-old_tss_base, &next_tss_desc);
 else
-ret = task_switch_16(ctxt, tss_selector, old_tss_sel,
+ret = task_switch_16(ctxt, old_tss_sel,
 old_tss_base, &next_tss_desc);
 if (ret != X86EMUL_CONTINUE)
 return ret;

@@ -112,6 +112,9 @@ static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
 if (!!auto_eoi_old == !!auto_eoi_new)
 return;

+if (!enable_apicv)
+return;
+
 down_write(&vcpu->kvm->arch.apicv_update_lock);

 if (auto_eoi_new)
@@ -1710,32 +1713,47 @@ int kvm_hv_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, bool host)
 return kvm_hv_get_msr(vcpu, msr, pdata, host);
 }

-static __always_inline unsigned long *sparse_set_to_vcpu_mask(
+static void sparse_set_to_vcpu_mask(struct kvm *kvm, u64 *sparse_banks,
-struct kvm *kvm, u64 *sparse_banks, u64 valid_bank_mask,
+u64 valid_bank_mask, unsigned long *vcpu_mask)
-u64 *vp_bitmap, unsigned long *vcpu_bitmap)
 {
 struct kvm_hv *hv = to_kvm_hv(kvm);
+bool has_mismatch = atomic_read(&hv->num_mismatched_vp_indexes);
+u64 vp_bitmap[KVM_HV_MAX_SPARSE_VCPU_SET_BITS];
 struct kvm_vcpu *vcpu;
 int bank, sbank = 0;
 unsigned long i;
+u64 *bitmap;

-memset(vp_bitmap, 0,
+BUILD_BUG_ON(sizeof(vp_bitmap) >
-KVM_HV_MAX_SPARSE_VCPU_SET_BITS * sizeof(*vp_bitmap));
+sizeof(*vcpu_mask) * BITS_TO_LONGS(KVM_MAX_VCPUS));
+
+/*
+* If vp_index == vcpu_idx for all vCPUs, fill vcpu_mask directly, else
+* fill a temporary buffer and manually test each vCPU's VP index.
+*/
+if (likely(!has_mismatch))
+bitmap = (u64 *)vcpu_mask;
+else
+bitmap = vp_bitmap;
+
+/*
+* Each set of 64 VPs is packed into sparse_banks, with valid_bank_mask
+* having a '1' for each bank that exists in sparse_banks. Sets must
+* be in ascending order, i.e. bank0..bankN.
+*/
+memset(bitmap, 0, sizeof(vp_bitmap));
 for_each_set_bit(bank, (unsigned long *)&valid_bank_mask,
 KVM_HV_MAX_SPARSE_VCPU_SET_BITS)
-vp_bitmap[bank] = sparse_banks[sbank++];
+bitmap[bank] = sparse_banks[sbank++];

-if (likely(!atomic_read(&hv->num_mismatched_vp_indexes))) {
+if (likely(!has_mismatch))
-/* for all vcpus vp_index == vcpu_idx */
+return;
-return (unsigned long *)vp_bitmap;
-}

-bitmap_zero(vcpu_bitmap, KVM_MAX_VCPUS);
+bitmap_zero(vcpu_mask, KVM_MAX_VCPUS);
 kvm_for_each_vcpu(i, vcpu, kvm) {
 if (test_bit(kvm_hv_get_vpindex(vcpu), (unsigned long *)vp_bitmap))
-__set_bit(i, vcpu_bitmap);
+__set_bit(i, vcpu_mask);
 }
-return vcpu_bitmap;
 }

 struct kvm_hv_hcall {
@@ -1743,6 +1761,7 @@ struct kvm_hv_hcall {
 u64 ingpa;
 u64 outgpa;
 u16 code;
+u16 var_cnt;
 u16 rep_cnt;
 u16 rep_idx;
 bool fast;
@@ -1750,22 +1769,60 @@ struct kvm_hv_hcall {
 sse128_t xmm[HV_HYPERCALL_MAX_XMM_REGISTERS];
 };

-static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool ex)
+static u64 kvm_get_sparse_vp_set(struct kvm *kvm, struct kvm_hv_hcall *hc,
+int consumed_xmm_halves,
+u64 *sparse_banks, gpa_t offset)
 {
+u16 var_cnt;
 int i;
-gpa_t gpa;
+if (hc->var_cnt > 64)
+return -EINVAL;
+
+/* Ignore banks that cannot possibly contain a legal VP index. */
+var_cnt = min_t(u16, hc->var_cnt, KVM_HV_MAX_SPARSE_VCPU_SET_BITS);
+
+if (hc->fast) {
+/*
+* Each XMM holds two sparse banks, but do not count halves that
+* have already been consumed for hypercall parameters.
+*/
+if (hc->var_cnt > 2 * HV_HYPERCALL_MAX_XMM_REGISTERS - consumed_xmm_halves)
+return HV_STATUS_INVALID_HYPERCALL_INPUT;
+for (i = 0; i < var_cnt; i++) {
+int j = i + consumed_xmm_halves;
+if (j % 2)
+sparse_banks[i] = sse128_hi(hc->xmm[j / 2]);
+else
+sparse_banks[i] = sse128_lo(hc->xmm[j / 2]);
+}
+return 0;
+}
+
+return kvm_read_guest(kvm, hc->ingpa + offset, sparse_banks,
+var_cnt * sizeof(*sparse_banks));
+}
+
+static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
+{
 struct kvm *kvm = vcpu->kvm;
 struct hv_tlb_flush_ex flush_ex;
 struct hv_tlb_flush flush;
-u64 vp_bitmap[KVM_HV_MAX_SPARSE_VCPU_SET_BITS];
+DECLARE_BITMAP(vcpu_mask, KVM_MAX_VCPUS);
-DECLARE_BITMAP(vcpu_bitmap, KVM_MAX_VCPUS);
-unsigned long *vcpu_mask;
 u64 valid_bank_mask;
-u64 sparse_banks[64];
+u64 sparse_banks[KVM_HV_MAX_SPARSE_VCPU_SET_BITS];
-int sparse_banks_len;
 bool all_cpus;

-if (!ex) {
+/*
+* The Hyper-V TLFS doesn't allow more than 64 sparse banks, e.g. the
+* valid mask is a u64. Fail the build if KVM's max allowed number of
+* vCPUs (>4096) would exceed this limit, KVM will additional changes
+* for Hyper-V support to avoid setting the guest up to fail.
+*/
+BUILD_BUG_ON(KVM_HV_MAX_SPARSE_VCPU_SET_BITS > 64);
+
+if (hc->code == HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST ||
+hc->code == HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE) {
 if (hc->fast) {
 flush.address_space = hc->ingpa;
 flush.flags = hc->outgpa;
@@ -1812,30 +1869,22 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool
 all_cpus = flush_ex.hv_vp_set.format !=
 HV_GENERIC_SET_SPARSE_4K;

-sparse_banks_len = bitmap_weight((unsigned long *)&valid_bank_mask, 64);
+if (hc->var_cnt != bitmap_weight((unsigned long *)&valid_bank_mask, 64))
+return HV_STATUS_INVALID_HYPERCALL_INPUT;

-if (!sparse_banks_len && !all_cpus)
+if (all_cpus)
+goto do_flush;
+
+if (!hc->var_cnt)
 goto ret_success;

-if (!all_cpus) {
+if (kvm_get_sparse_vp_set(kvm, hc, 2, sparse_banks,
-if (hc->fast) {
+offsetof(struct hv_tlb_flush_ex,
-if (sparse_banks_len > HV_HYPERCALL_MAX_XMM_REGISTERS - 1)
+hv_vp_set.bank_contents)))
 return HV_STATUS_INVALID_HYPERCALL_INPUT;
-for (i = 0; i < sparse_banks_len; i += 2) {
-sparse_banks[i] = sse128_lo(hc->xmm[i / 2 + 1]);
-sparse_banks[i + 1] = sse128_hi(hc->xmm[i / 2 + 1]);
-}
-} else {
-gpa = hc->ingpa + offsetof(struct hv_tlb_flush_ex,
-hv_vp_set.bank_contents);
-if (unlikely(kvm_read_guest(kvm, gpa, sparse_banks,
-sparse_banks_len *
-sizeof(sparse_banks[0]))))
-return HV_STATUS_INVALID_HYPERCALL_INPUT;
-}
-}
 }

+do_flush:
 /*
 * vcpu->arch.cr3 may not be up-to-date for running vCPUs so we can't
 * analyze it here, flush TLB regardless of the specified address space.
@@ -1843,11 +1892,9 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool
 if (all_cpus) {
 kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH_GUEST);
 } else {
-vcpu_mask = sparse_set_to_vcpu_mask(kvm, sparse_banks, valid_bank_mask,
+sparse_set_to_vcpu_mask(kvm, sparse_banks, valid_bank_mask, vcpu_mask);
-vp_bitmap, vcpu_bitmap);

-kvm_make_vcpus_request_mask(kvm, KVM_REQ_TLB_FLUSH_GUEST,
+kvm_make_vcpus_request_mask(kvm, KVM_REQ_TLB_FLUSH_GUEST, vcpu_mask);
-vcpu_mask);
 }

 ret_success:
@@ -1875,21 +1922,18 @@ static void kvm_send_ipi_to_many(struct kvm *kvm, u32 vector,
 }
 }

-static u64 kvm_hv_send_ipi(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool ex)
+static u64 kvm_hv_send_ipi(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 {
 struct kvm *kvm = vcpu->kvm;
 struct hv_send_ipi_ex send_ipi_ex;
 struct hv_send_ipi send_ipi;
-u64 vp_bitmap[KVM_HV_MAX_SPARSE_VCPU_SET_BITS];
+DECLARE_BITMAP(vcpu_mask, KVM_MAX_VCPUS);
-DECLARE_BITMAP(vcpu_bitmap, KVM_MAX_VCPUS);
-unsigned long *vcpu_mask;
 unsigned long valid_bank_mask;
-u64 sparse_banks[64];
+u64 sparse_banks[KVM_HV_MAX_SPARSE_VCPU_SET_BITS];
-int sparse_banks_len;
 u32 vector;
 bool all_cpus;

-if (!ex) {
+if (hc->code == HVCALL_SEND_IPI) {
 if (!hc->fast) {
 if (unlikely(kvm_read_guest(kvm, hc->ingpa, &send_ipi,
 sizeof(send_ipi))))
@@ -1908,9 +1952,15 @@ static u64 kvm_hv_send_ipi(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool

 trace_kvm_hv_send_ipi(vector, sparse_banks[0]);
 } else {
-if (unlikely(kvm_read_guest(kvm, hc->ingpa, &send_ipi_ex,
+if (!hc->fast) {
-sizeof(send_ipi_ex))))
+if (unlikely(kvm_read_guest(kvm, hc->ingpa, &send_ipi_ex,
-return HV_STATUS_INVALID_HYPERCALL_INPUT;
+sizeof(send_ipi_ex))))
+return HV_STATUS_INVALID_HYPERCALL_INPUT;
+} else {
+send_ipi_ex.vector = (u32)hc->ingpa;
+send_ipi_ex.vp_set.format = hc->outgpa;
+send_ipi_ex.vp_set.valid_bank_mask = sse128_lo(hc->xmm[0]);
+}
+
 trace_kvm_hv_send_ipi_ex(send_ipi_ex.vector,
 send_ipi_ex.vp_set.format,
@@ -1918,22 +1968,20 @@ static u64 kvm_hv_send_ipi(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool

 vector = send_ipi_ex.vector;
 valid_bank_mask = send_ipi_ex.vp_set.valid_bank_mask;
-sparse_banks_len = bitmap_weight(&valid_bank_mask, 64) *
-sizeof(sparse_banks[0]);

 all_cpus = send_ipi_ex.vp_set.format == HV_GENERIC_SET_ALL;

+if (hc->var_cnt != bitmap_weight(&valid_bank_mask, 64))
+return HV_STATUS_INVALID_HYPERCALL_INPUT;

 if (all_cpus)
 goto check_and_send_ipi;

-if (!sparse_banks_len)
+if (!hc->var_cnt)
 goto ret_success;

-if (kvm_read_guest(kvm,
+if (kvm_get_sparse_vp_set(kvm, hc, 1, sparse_banks,
-hc->ingpa + offsetof(struct hv_send_ipi_ex,
+offsetof(struct hv_send_ipi_ex,
-vp_set.bank_contents),
+vp_set.bank_contents)))
-sparse_banks,
-sparse_banks_len))
 return HV_STATUS_INVALID_HYPERCALL_INPUT;
 }

@@ -1941,11 +1989,13 @@ static u64 kvm_hv_send_ipi(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool
 if ((vector < HV_IPI_LOW_VECTOR) || (vector > HV_IPI_HIGH_VECTOR))
 return HV_STATUS_INVALID_HYPERCALL_INPUT;

-vcpu_mask = all_cpus ? NULL :
+if (all_cpus) {
-sparse_set_to_vcpu_mask(kvm, sparse_banks, valid_bank_mask,
+kvm_send_ipi_to_many(kvm, vector, NULL);
-vp_bitmap, vcpu_bitmap);
+} else {
+sparse_set_to_vcpu_mask(kvm, sparse_banks, valid_bank_mask, vcpu_mask);

 kvm_send_ipi_to_many(kvm, vector, vcpu_mask);
+}

 ret_success:
 return HV_STATUS_SUCCESS;
@@ -2017,11 +2067,6 @@ int kvm_hv_set_enforce_cpuid(struct kvm_vcpu *vcpu, bool enforce)
 return ret;
 }

-bool kvm_hv_hypercall_enabled(struct kvm_vcpu *vcpu)
-{
-return vcpu->arch.hyperv_enabled && to_kvm_hv(vcpu->kvm)->hv_guest_os_id;
-}
-
 static void kvm_hv_hypercall_set_result(struct kvm_vcpu *vcpu, u64 result)
 {
 bool longmode;
@@ -2096,6 +2141,7 @@ static bool is_xmm_fast_hypercall(struct kvm_hv_hcall *hc)
 case HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE:
 case HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX:
 case HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX:
+case HVCALL_SEND_IPI_EX:
 return true;
 }

@@ -2191,19 +2237,25 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
 }

 hc.code = hc.param & 0xffff;
+hc.var_cnt = (hc.param & HV_HYPERCALL_VARHEAD_MASK) >> HV_HYPERCALL_VARHEAD_OFFSET;
 hc.fast = !!(hc.param & HV_HYPERCALL_FAST_BIT);
 hc.rep_cnt = (hc.param >> HV_HYPERCALL_REP_COMP_OFFSET) & 0xfff;
 hc.rep_idx = (hc.param >> HV_HYPERCALL_REP_START_OFFSET) & 0xfff;
 hc.rep = !!(hc.rep_cnt || hc.rep_idx);

-trace_kvm_hv_hypercall(hc.code, hc.fast, hc.rep_cnt, hc.rep_idx,
+trace_kvm_hv_hypercall(hc.code, hc.fast, hc.var_cnt, hc.rep_cnt,
-hc.ingpa, hc.outgpa);
+hc.rep_idx, hc.ingpa, hc.outgpa);

 if (unlikely(!hv_check_hypercall_access(hv_vcpu, hc.code))) {
 ret = HV_STATUS_ACCESS_DENIED;
 goto hypercall_complete;
 }

+if (unlikely(hc.param & HV_HYPERCALL_RSVD_MASK)) {
+ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
+goto hypercall_complete;
+}
+
 if (hc.fast && is_xmm_fast_hypercall(&hc)) {
 if (unlikely(hv_vcpu->enforce_cpuid &&
 !(hv_vcpu->cpuid_cache.features_edx &
@@ -2217,14 +2269,14 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)

 switch (hc.code) {
 case HVCALL_NOTIFY_LONG_SPIN_WAIT:
-if (unlikely(hc.rep)) {
+if (unlikely(hc.rep || hc.var_cnt)) {
 ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
 break;
 }
 kvm_vcpu_on_spin(vcpu, true);
 break;
 case HVCALL_SIGNAL_EVENT:
-if (unlikely(hc.rep)) {
+if (unlikely(hc.rep || hc.var_cnt)) {
 ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
 break;
 }
@@ -2234,7 +2286,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
 fallthrough; /* maybe userspace knows this conn_id */
 case HVCALL_POST_MESSAGE:
 /* don't bother userspace if it has no way to handle it */
-if (unlikely(hc.rep || !to_hv_synic(vcpu)->active)) {
+if (unlikely(hc.rep || hc.var_cnt || !to_hv_synic(vcpu)->active)) {
 ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
 break;
 }
@@ -2247,46 +2299,43 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
 kvm_hv_hypercall_complete_userspace;
 return 0;
 case HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST:
-if (unlikely(!hc.rep_cnt || hc.rep_idx)) {
+if (unlikely(hc.var_cnt)) {
 ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
 break;
 }
-ret = kvm_hv_flush_tlb(vcpu, &hc, false);
+fallthrough;
-break;
-case HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE:
-if (unlikely(hc.rep)) {
-ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
-break;
-}
-ret = kvm_hv_flush_tlb(vcpu, &hc, false);
-break;
 case HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX:
 if (unlikely(!hc.rep_cnt || hc.rep_idx)) {
 ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
 break;
 }
-ret = kvm_hv_flush_tlb(vcpu, &hc, true);
+ret = kvm_hv_flush_tlb(vcpu, &hc);
 break;
+case HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE:
+if (unlikely(hc.var_cnt)) {
+ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
+break;
+}
+fallthrough;
 case HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX:
 if (unlikely(hc.rep)) {
 ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
 break;
 }
-ret = kvm_hv_flush_tlb(vcpu, &hc, true);
+ret = kvm_hv_flush_tlb(vcpu, &hc);
 break;
 case HVCALL_SEND_IPI:
+if (unlikely(hc.var_cnt)) {
+ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
+break;
+}
+fallthrough;
+case HVCALL_SEND_IPI_EX:
 if (unlikely(hc.rep)) {
 ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
 break;
 }
-ret = kvm_hv_send_ipi(vcpu, &hc, false);
+ret = kvm_hv_send_ipi(vcpu, &hc);
-break;
-case HVCALL_SEND_IPI_EX:
-if (unlikely(hc.fast || hc.rep)) {
-ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
-break;
-}
-ret = kvm_hv_send_ipi(vcpu, &hc, true);
 break;
 case HVCALL_POST_DEBUG_DATA:
 case HVCALL_RETRIEVE_DEBUG_DATA:
@@ -2417,10 +2466,6 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
 if (kvm_x86_ops.nested_ops->get_evmcs_version)
 evmcs_ver = kvm_x86_ops.nested_ops->get_evmcs_version(vcpu);

-/* Skip NESTED_FEATURES if eVMCS is not supported */
-if (!evmcs_ver)
---nent;
-
 if (cpuid->nent < nent)
 return -E2BIG;

@@ -2520,8 +2565,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,

 case HYPERV_CPUID_NESTED_FEATURES:
 ent->eax = evmcs_ver;
-if (evmcs_ver)
+ent->eax |= HV_X64_NESTED_MSR_BITMAP;
-ent->eax |= HV_X64_NESTED_MSR_BITMAP;

 break;


@@ -89,7 +89,11 @@ static inline u32 kvm_hv_get_vpindex(struct kvm_vcpu *vcpu)
 int kvm_hv_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host);
 int kvm_hv_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, bool host);

-bool kvm_hv_hypercall_enabled(struct kvm_vcpu *vcpu);
+static inline bool kvm_hv_hypercall_enabled(struct kvm_vcpu *vcpu)
+{
+return vcpu->arch.hyperv_enabled && to_kvm_hv(vcpu->kvm)->hv_guest_os_id;
+}
+
 int kvm_hv_hypercall(struct kvm_vcpu *vcpu);

 void kvm_hv_irq_routing_update(struct kvm *kvm);

@@ -437,13 +437,13 @@ static u32 pic_ioport_read(void *opaque, u32 addr)
 return ret;
 }

-static void elcr_ioport_write(void *opaque, u32 addr, u32 val)
+static void elcr_ioport_write(void *opaque, u32 val)
 {
 struct kvm_kpic_state *s = opaque;
 s->elcr = val & s->elcr_mask;
 }

-static u32 elcr_ioport_read(void *opaque, u32 addr1)
+static u32 elcr_ioport_read(void *opaque)
 {
 struct kvm_kpic_state *s = opaque;
 return s->elcr;
@@ -474,7 +474,7 @@ static int picdev_write(struct kvm_pic *s,
 case 0x4d0:
 case 0x4d1:
 pic_lock(s);
-elcr_ioport_write(&s->pics[addr & 1], addr, data);
+elcr_ioport_write(&s->pics[addr & 1], data);
 pic_unlock(s);
 break;
 default:
@@ -505,7 +505,7 @@ static int picdev_read(struct kvm_pic *s,
 case 0x4d0:
 case 0x4d1:
 pic_lock(s);
-*data = elcr_ioport_read(&s->pics[addr & 1], addr);
+*data = elcr_ioport_read(&s->pics[addr & 1]);
 pic_unlock(s);
 break;
 default:

@@ -54,9 +54,7 @@ static void kvm_ioapic_update_eoi_one(struct kvm_vcpu *vcpu,
 int trigger_mode,
 int pin);

-static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
+static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic)
-unsigned long addr,
-unsigned long length)
 {
 unsigned long result = 0;

@@ -593,7 +591,7 @@ static int ioapic_mmio_read(struct kvm_vcpu *vcpu, struct kvm_io_device *this,
 break;

 case IOAPIC_REG_WINDOW:
-result = ioapic_read_indirect(ioapic, addr, len);
+result = ioapic_read_indirect(ioapic);
 break;

 default:
|
|
@ -92,3 +92,17 @@ int hv_remote_flush_tlb(struct kvm *kvm)
|
||||||
return hv_remote_flush_tlb_with_range(kvm, NULL);
|
return hv_remote_flush_tlb_with_range(kvm, NULL);
|
||||||
}
|
}
|
||||||
EXPORT_SYMBOL_GPL(hv_remote_flush_tlb);
|
EXPORT_SYMBOL_GPL(hv_remote_flush_tlb);
|
||||||
|
|
||||||
|
void hv_track_root_tdp(struct kvm_vcpu *vcpu, hpa_t root_tdp)
|
||||||
|
{
|
||||||
|
struct kvm_arch *kvm_arch = &vcpu->kvm->arch;
|
||||||
|
|
||||||
|
if (kvm_x86_ops.tlb_remote_flush == hv_remote_flush_tlb) {
|
||||||
|
spin_lock(&kvm_arch->hv_root_tdp_lock);
|
||||||
|
vcpu->arch.hv_root_tdp = root_tdp;
|
||||||
|
if (root_tdp != kvm_arch->hv_root_tdp)
|
||||||
|
kvm_arch->hv_root_tdp = INVALID_PAGE;
|
||||||
|
spin_unlock(&kvm_arch->hv_root_tdp_lock);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
EXPORT_SYMBOL_GPL(hv_track_root_tdp);
|

@@ -10,19 +10,7 @@
 int hv_remote_flush_tlb_with_range(struct kvm *kvm,
 struct kvm_tlb_range *range);
 int hv_remote_flush_tlb(struct kvm *kvm);
+void hv_track_root_tdp(struct kvm_vcpu *vcpu, hpa_t root_tdp);
-static inline void hv_track_root_tdp(struct kvm_vcpu *vcpu, hpa_t root_tdp)
-{
-struct kvm_arch *kvm_arch = &vcpu->kvm->arch;
-
-if (kvm_x86_ops.tlb_remote_flush == hv_remote_flush_tlb) {
-spin_lock(&kvm_arch->hv_root_tdp_lock);
-vcpu->arch.hv_root_tdp = root_tdp;
-if (root_tdp != kvm_arch->hv_root_tdp)
-kvm_arch->hv_root_tdp = INVALID_PAGE;
-spin_unlock(&kvm_arch->hv_root_tdp_lock);
-}
-}
 #else /* !CONFIG_HYPERV */
 static inline void hv_track_root_tdp(struct kvm_vcpu *vcpu, hpa_t root_tdp)
 {
||||||
|
|
|
@ -68,6 +68,39 @@ static bool lapic_timer_advance_dynamic __read_mostly;
|
||||||
/* step-by-step approximation to mitigate fluctuation */
|
/* step-by-step approximation to mitigate fluctuation */
|
||||||
#define LAPIC_TIMER_ADVANCE_ADJUST_STEP 8
|
#define LAPIC_TIMER_ADVANCE_ADJUST_STEP 8
|
||||||
|
|
||||||
|
static inline void __kvm_lapic_set_reg(char *regs, int reg_off, u32 val)
|
||||||
|
{
|
||||||
|
*((u32 *) (regs + reg_off)) = val;
|
||||||
|
}
|
||||||
|
|
||||||
|
static inline void kvm_lapic_set_reg(struct kvm_lapic *apic, int reg_off, u32 val)
|
||||||
|
{
|
||||||
|
__kvm_lapic_set_reg(apic->regs, reg_off, val);
|
||||||
|
}
|
||||||
|
|
||||||
|
static __always_inline u64 __kvm_lapic_get_reg64(char *regs, int reg)
|
||||||
|
{
|
||||||
|
BUILD_BUG_ON(reg != APIC_ICR);
|
||||||
|
return *((u64 *) (regs + reg));
|
||||||
|
}
|
||||||
|
|
||||||
|
static __always_inline u64 kvm_lapic_get_reg64(struct kvm_lapic *apic, int reg)
|
||||||
|
{
|
||||||
|
return __kvm_lapic_get_reg64(apic->regs, reg);
|
||||||
|
}
|
||||||
|
|
||||||
|
static __always_inline void __kvm_lapic_set_reg64(char *regs, int reg, u64 val)
|
||||||
|
{
|
||||||
|
BUILD_BUG_ON(reg != APIC_ICR);
|
||||||
|
*((u64 *) (regs + reg)) = val;
|
||||||
|
}
|
||||||
|
|
||||||
|
static __always_inline void kvm_lapic_set_reg64(struct kvm_lapic *apic,
|
||||||
|
int reg, u64 val)
|
||||||
|
{
|
||||||
|
__kvm_lapic_set_reg64(apic->regs, reg, val);
|
||||||
|
}
|
||||||
|
|
||||||
static inline int apic_test_vector(int vec, void *bitmap)
|
 static inline int apic_test_vector(int vec, void *bitmap)
 {
 	return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
@@ -113,7 +146,8 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic *apic)
 
 static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
 {
-	return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
+	return pi_inject_timer && kvm_vcpu_apicv_active(vcpu) &&
+		(kvm_mwait_in_guest(vcpu->kvm) || kvm_hlt_in_guest(vcpu->kvm));
 }
 
 bool kvm_can_use_hv_timer(struct kvm_vcpu *vcpu)
@@ -491,8 +525,7 @@ static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
 	if (unlikely(vcpu->arch.apicv_active)) {
 		/* need to update RVI */
 		kvm_lapic_clear_vector(vec, apic->regs + APIC_IRR);
-		static_call(kvm_x86_hwapic_irr_update)(vcpu,
-				apic_find_highest_irr(apic));
+		static_call_cond(kvm_x86_hwapic_irr_update)(vcpu, apic_find_highest_irr(apic));
 	} else {
 		apic->irr_pending = false;
 		kvm_lapic_clear_vector(vec, apic->regs + APIC_IRR);
@@ -522,7 +555,7 @@ static inline void apic_set_isr(int vec, struct kvm_lapic *apic)
 	 * just set SVI.
 	 */
 	if (unlikely(vcpu->arch.apicv_active))
-		static_call(kvm_x86_hwapic_isr_update)(vcpu, vec);
+		static_call_cond(kvm_x86_hwapic_isr_update)(vcpu, vec);
 	else {
 		++apic->isr_count;
 		BUG_ON(apic->isr_count > MAX_APIC_VECTOR);
@@ -570,8 +603,7 @@ static inline void apic_clear_isr(int vec, struct kvm_lapic *apic)
 	 * and must be left alone.
 	 */
 	if (unlikely(vcpu->arch.apicv_active))
-		static_call(kvm_x86_hwapic_isr_update)(vcpu,
-				apic_find_highest_isr(apic));
+		static_call_cond(kvm_x86_hwapic_isr_update)(vcpu, apic_find_highest_isr(apic));
 	else {
 		--apic->isr_count;
 		BUG_ON(apic->isr_count < 0);
@@ -1276,6 +1308,9 @@ void kvm_apic_send_ipi(struct kvm_lapic *apic, u32 icr_low, u32 icr_high)
 {
 	struct kvm_lapic_irq irq;
 
+	/* KVM has no delay and should always clear the BUSY/PENDING flag. */
+	WARN_ON_ONCE(icr_low & APIC_ICR_BUSY);
+
 	irq.vector = icr_low & APIC_VECTOR_MASK;
 	irq.delivery_mode = icr_low & APIC_MODE_MASK;
 	irq.dest_mode = icr_low & APIC_DEST_MASK;
@@ -1292,6 +1327,7 @@ void kvm_apic_send_ipi(struct kvm_lapic *apic, u32 icr_low, u32 icr_high)
 
 	kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
 }
+EXPORT_SYMBOL_GPL(kvm_apic_send_ipi);
 
 static u32 apic_get_tmcct(struct kvm_lapic *apic)
 {
@@ -1375,8 +1411,8 @@ static inline struct kvm_lapic *to_lapic(struct kvm_io_device *dev)
 #define APIC_REGS_MASK(first, count) \
 	(APIC_REG_MASK(first) * ((1ull << (count)) - 1))
 
-int kvm_lapic_reg_read(struct kvm_lapic *apic, u32 offset, int len,
-		       void *data)
+static int kvm_lapic_reg_read(struct kvm_lapic *apic, u32 offset, int len,
+			      void *data)
 {
 	unsigned char alignment = offset & 0xf;
 	u32 result;
@@ -1394,7 +1430,6 @@ int kvm_lapic_reg_read(struct kvm_lapic *apic, u32 offset, int len,
 		APIC_REGS_MASK(APIC_IRR, APIC_ISR_NR) |
 		APIC_REG_MASK(APIC_ESR) |
 		APIC_REG_MASK(APIC_ICR) |
-		APIC_REG_MASK(APIC_ICR2) |
 		APIC_REG_MASK(APIC_LVTT) |
 		APIC_REG_MASK(APIC_LVTTHMR) |
 		APIC_REG_MASK(APIC_LVTPC) |
@@ -1405,9 +1440,16 @@ int kvm_lapic_reg_read(struct kvm_lapic *apic, u32 offset, int len,
 		APIC_REG_MASK(APIC_TMCCT) |
 		APIC_REG_MASK(APIC_TDCR);
 
-	/* ARBPRI is not valid on x2APIC */
+	/*
+	 * ARBPRI and ICR2 are not valid in x2APIC mode.  WARN if KVM reads ICR
+	 * in x2APIC mode as it's an 8-byte register in x2APIC and needs to be
+	 * manually handled by the caller.
+	 */
 	if (!apic_x2apic_mode(apic))
-		valid_reg_mask |= APIC_REG_MASK(APIC_ARBPRI);
+		valid_reg_mask |= APIC_REG_MASK(APIC_ARBPRI) |
+				  APIC_REG_MASK(APIC_ICR2);
+	else
+		WARN_ON_ONCE(offset == APIC_ICR);
 
 	if (alignment + len > 4)
 		return 1;
@@ -1432,7 +1474,6 @@ int kvm_lapic_reg_read(struct kvm_lapic *apic, u32 offset, int len,
 	}
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_lapic_reg_read);
 
 static int apic_mmio_in_range(struct kvm_lapic *apic, gpa_t addr)
 {
@@ -1993,7 +2034,7 @@ static void apic_manage_nmi_watchdog(struct kvm_lapic *apic, u32 lvt0_val)
 	}
 }
 
-int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
+static int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 {
 	int ret = 0;
 
@@ -2052,16 +2093,18 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 		break;
 	}
 	case APIC_ICR:
+		WARN_ON_ONCE(apic_x2apic_mode(apic));
+
 		/* No delay here, so we always clear the pending bit */
-		val &= ~(1 << 12);
+		val &= ~APIC_ICR_BUSY;
 		kvm_apic_send_ipi(apic, val, kvm_lapic_get_reg(apic, APIC_ICR2));
 		kvm_lapic_set_reg(apic, APIC_ICR, val);
 		break;
-
 	case APIC_ICR2:
-		if (!apic_x2apic_mode(apic))
-			val &= 0xff000000;
-		kvm_lapic_set_reg(apic, APIC_ICR2, val);
+		if (apic_x2apic_mode(apic))
+			ret = 1;
+		else
+			kvm_lapic_set_reg(apic, APIC_ICR2, val & 0xff000000);
 		break;
 
 	case APIC_LVT0:
@@ -2121,10 +2164,9 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 		break;
 
 	case APIC_SELF_IPI:
-		if (apic_x2apic_mode(apic)) {
-			kvm_lapic_reg_write(apic, APIC_ICR,
-					    APIC_DEST_SELF | (val & APIC_VECTOR_MASK));
-		} else
+		if (apic_x2apic_mode(apic))
+			kvm_apic_send_ipi(apic, APIC_DEST_SELF | (val & APIC_VECTOR_MASK), 0);
+		else
 			ret = 1;
 		break;
 	default:
@@ -2132,11 +2174,15 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 		break;
 	}
 
+	/*
+	 * Recalculate APIC maps if necessary, e.g. if the software enable bit
+	 * was toggled, the APIC ID changed, etc...  The maps are marked dirty
+	 * on relevant changes, i.e. this is a nop for most writes.
+	 */
 	kvm_recalculate_apic_map(apic->vcpu->kvm);
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(kvm_lapic_reg_write);
 
 static int apic_mmio_write(struct kvm_vcpu *vcpu, struct kvm_io_device *this,
 			   gpa_t address, int len, const void *data)
@@ -2180,12 +2226,7 @@ EXPORT_SYMBOL_GPL(kvm_lapic_set_eoi);
 /* emulate APIC access in a trap manner */
 void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset)
 {
-	u32 val = 0;
-
-	/* hw has done the conditional check and inst decode */
-	offset &= 0xff0;
-
-	kvm_lapic_reg_read(vcpu->arch.apic, offset, 4, &val);
+	u32 val = kvm_lapic_get_reg(vcpu->arch.apic, offset);
 
 	/* TODO: optimize to just emulate side effect w/o one more write */
 	kvm_lapic_reg_write(vcpu->arch.apic, offset, val);
@@ -2242,10 +2283,7 @@ void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu, u64 data)
 
 void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8)
 {
-	struct kvm_lapic *apic = vcpu->arch.apic;
-
-	apic_set_tpr(apic, ((cr8 & 0x0f) << 4)
-		     | (kvm_lapic_get_reg(apic, APIC_TASKPRI) & 4));
+	apic_set_tpr(vcpu->arch.apic, (cr8 & 0x0f) << 4);
 }
 
 u64 kvm_lapic_get_cr8(struct kvm_vcpu *vcpu)
@@ -2287,7 +2325,7 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
 		kvm_apic_set_x2apic_id(apic, vcpu->vcpu_id);
 
 	if ((old_value ^ value) & (MSR_IA32_APICBASE_ENABLE | X2APIC_ENABLE))
-		static_call(kvm_x86_set_virtual_apic_mode)(vcpu);
+		static_call_cond(kvm_x86_set_virtual_apic_mode)(vcpu);
 
 	apic->base_address = apic->vcpu->arch.apic_base &
 			     MSR_IA32_APICBASE_BASE;
@@ -2356,8 +2394,12 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
 	if (!apic_x2apic_mode(apic))
 		kvm_apic_set_ldr(apic, 0);
 	kvm_lapic_set_reg(apic, APIC_ESR, 0);
-	kvm_lapic_set_reg(apic, APIC_ICR, 0);
-	kvm_lapic_set_reg(apic, APIC_ICR2, 0);
+	if (!apic_x2apic_mode(apic)) {
+		kvm_lapic_set_reg(apic, APIC_ICR, 0);
+		kvm_lapic_set_reg(apic, APIC_ICR2, 0);
+	} else {
+		kvm_lapic_set_reg64(apic, APIC_ICR, 0);
+	}
 	kvm_lapic_set_reg(apic, APIC_TDCR, 0);
 	kvm_lapic_set_reg(apic, APIC_TMICT, 0);
 	for (i = 0; i < 8; i++) {
@@ -2373,9 +2415,9 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
 	vcpu->arch.pv_eoi.msr_val = 0;
 	apic_update_ppr(apic);
 	if (vcpu->arch.apicv_active) {
-		static_call(kvm_x86_apicv_post_state_restore)(vcpu);
-		static_call(kvm_x86_hwapic_irr_update)(vcpu, -1);
-		static_call(kvm_x86_hwapic_isr_update)(vcpu, -1);
+		static_call_cond(kvm_x86_apicv_post_state_restore)(vcpu);
+		static_call_cond(kvm_x86_hwapic_irr_update)(vcpu, -1);
+		static_call_cond(kvm_x86_hwapic_isr_update)(vcpu, -1);
 	}
 
 	vcpu->arch.apic_arb_prio = 0;
@@ -2574,6 +2616,7 @@ static int kvm_apic_state_fixup(struct kvm_vcpu *vcpu,
 	if (apic_x2apic_mode(vcpu->arch.apic)) {
 		u32 *id = (u32 *)(s->regs + APIC_ID);
 		u32 *ldr = (u32 *)(s->regs + APIC_LDR);
+		u64 icr;
 
 		if (vcpu->kvm->arch.x2apic_format) {
 			if (*id != vcpu->vcpu_id)
@@ -2585,9 +2628,21 @@ static int kvm_apic_state_fixup(struct kvm_vcpu *vcpu,
 			*id <<= 24;
 		}
 
-		/* In x2APIC mode, the LDR is fixed and based on the id */
-		if (set)
+		/*
+		 * In x2APIC mode, the LDR is fixed and based on the id.  And
+		 * ICR is internally a single 64-bit register, but needs to be
+		 * split to ICR+ICR2 in userspace for backwards compatibility.
+		 */
+		if (set) {
 			*ldr = kvm_apic_calc_x2apic_ldr(*id);
+
+			icr = __kvm_lapic_get_reg(s->regs, APIC_ICR) |
+			      (u64)__kvm_lapic_get_reg(s->regs, APIC_ICR2) << 32;
+			__kvm_lapic_set_reg64(s->regs, APIC_ICR, icr);
+		} else {
+			icr = __kvm_lapic_get_reg64(s->regs, APIC_ICR);
+			__kvm_lapic_set_reg(s->regs, APIC_ICR2, icr >> 32);
+		}
 	}
 
 	return 0;
@@ -2638,11 +2693,9 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
 	kvm_apic_update_apicv(vcpu);
 	apic->highest_isr_cache = -1;
 	if (vcpu->arch.apicv_active) {
-		static_call(kvm_x86_apicv_post_state_restore)(vcpu);
-		static_call(kvm_x86_hwapic_irr_update)(vcpu,
-				apic_find_highest_irr(apic));
-		static_call(kvm_x86_hwapic_isr_update)(vcpu,
-				apic_find_highest_isr(apic));
+		static_call_cond(kvm_x86_apicv_post_state_restore)(vcpu);
+		static_call_cond(kvm_x86_hwapic_irr_update)(vcpu, apic_find_highest_irr(apic));
+		static_call_cond(kvm_x86_hwapic_isr_update)(vcpu, apic_find_highest_isr(apic));
 	}
 	kvm_make_request(KVM_REQ_EVENT, vcpu);
 	if (ioapic_in_kernel(vcpu->kvm))
@@ -2779,6 +2832,46 @@ int kvm_lapic_set_vapic_addr(struct kvm_vcpu *vcpu, gpa_t vapic_addr)
 	return 0;
 }
 
+int kvm_x2apic_icr_write(struct kvm_lapic *apic, u64 data)
+{
+	data &= ~APIC_ICR_BUSY;
+
+	kvm_apic_send_ipi(apic, (u32)data, (u32)(data >> 32));
+	kvm_lapic_set_reg64(apic, APIC_ICR, data);
+	trace_kvm_apic_write(APIC_ICR, data);
+	return 0;
+}
+
+static int kvm_lapic_msr_read(struct kvm_lapic *apic, u32 reg, u64 *data)
+{
+	u32 low;
+
+	if (reg == APIC_ICR) {
+		*data = kvm_lapic_get_reg64(apic, APIC_ICR);
+		return 0;
+	}
+
+	if (kvm_lapic_reg_read(apic, reg, 4, &low))
+		return 1;
+
+	*data = low;
+
+	return 0;
+}
+
+static int kvm_lapic_msr_write(struct kvm_lapic *apic, u32 reg, u64 data)
+{
+	/*
+	 * ICR is a 64-bit register in x2APIC mode (and Hyper'v PV vAPIC) and
+	 * can be written as such, all other registers remain accessible only
+	 * through 32-bit reads/writes.
+	 */
+	if (reg == APIC_ICR)
+		return kvm_x2apic_icr_write(apic, data);
+
+	return kvm_lapic_reg_write(apic, reg, (u32)data);
+}
+
 int kvm_x2apic_msr_write(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
@@ -2787,65 +2880,37 @@ int kvm_x2apic_msr_write(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 	if (!lapic_in_kernel(vcpu) || !apic_x2apic_mode(apic))
 		return 1;
 
-	if (reg == APIC_ICR2)
-		return 1;
-
-	/* if this is ICR write vector before command */
-	if (reg == APIC_ICR)
-		kvm_lapic_reg_write(apic, APIC_ICR2, (u32)(data >> 32));
-	return kvm_lapic_reg_write(apic, reg, (u32)data);
+	return kvm_lapic_msr_write(apic, reg, data);
 }
 
 int kvm_x2apic_msr_read(struct kvm_vcpu *vcpu, u32 msr, u64 *data)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
-	u32 reg = (msr - APIC_BASE_MSR) << 4, low, high = 0;
+	u32 reg = (msr - APIC_BASE_MSR) << 4;
 
 	if (!lapic_in_kernel(vcpu) || !apic_x2apic_mode(apic))
 		return 1;
 
-	if (reg == APIC_DFR || reg == APIC_ICR2)
+	if (reg == APIC_DFR)
 		return 1;
 
-	if (kvm_lapic_reg_read(apic, reg, 4, &low))
-		return 1;
-	if (reg == APIC_ICR)
-		kvm_lapic_reg_read(apic, APIC_ICR2, 4, &high);
-
-	*data = (((u64)high) << 32) | low;
-
-	return 0;
+	return kvm_lapic_msr_read(apic, reg, data);
 }
 
 int kvm_hv_vapic_msr_write(struct kvm_vcpu *vcpu, u32 reg, u64 data)
 {
-	struct kvm_lapic *apic = vcpu->arch.apic;
-
 	if (!lapic_in_kernel(vcpu))
 		return 1;
 
-	/* if this is ICR write vector before command */
-	if (reg == APIC_ICR)
-		kvm_lapic_reg_write(apic, APIC_ICR2, (u32)(data >> 32));
-	return kvm_lapic_reg_write(apic, reg, (u32)data);
+	return kvm_lapic_msr_write(vcpu->arch.apic, reg, data);
 }
 
 int kvm_hv_vapic_msr_read(struct kvm_vcpu *vcpu, u32 reg, u64 *data)
 {
-	struct kvm_lapic *apic = vcpu->arch.apic;
-	u32 low, high = 0;
-
 	if (!lapic_in_kernel(vcpu))
 		return 1;
 
-	if (kvm_lapic_reg_read(apic, reg, 4, &low))
-		return 1;
-	if (reg == APIC_ICR)
-		kvm_lapic_reg_read(apic, APIC_ICR2, 4, &high);
-
-	*data = (((u64)high) << 32) | low;
-
-	return 0;
+	return kvm_lapic_msr_read(vcpu->arch.apic, reg, data);
 }
 
 int kvm_lapic_set_pv_eoi(struct kvm_vcpu *vcpu, u64 data, unsigned long len)
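The comment added in kvm_lapic_msr_write() above captures the idea behind much of this lapic.c rework: in x2APIC mode the ICR is a single 64-bit register, while the legacy xAPIC register page (and the KVM_GET/SET_LAPIC state layout) keeps the destination in a separate ICR2 word, so KVM now joins and splits the two views at the boundaries. The following is a minimal standalone sketch of that split/join, not kernel code; the helper names icr_join()/icr_split() are purely illustrative.

/* Illustrative only: how a 64-bit x2APIC ICR maps onto xAPIC ICR/ICR2. */
#include <stdint.h>

static uint64_t icr_join(uint32_t icr_lo, uint32_t icr2_hi)
{
	/* ICR2 carries the destination field, i.e. the upper 32 bits. */
	return ((uint64_t)icr2_hi << 32) | icr_lo;
}

static void icr_split(uint64_t icr, uint32_t *icr_lo, uint32_t *icr2_hi)
{
	*icr_lo = (uint32_t)icr;
	*icr2_hi = (uint32_t)(icr >> 32);
}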
@@ -2933,7 +2998,7 @@ int kvm_apic_accept_events(struct kvm_vcpu *vcpu)
 			/* evaluate pending_events before reading the vector */
 			smp_rmb();
 			sipi_vector = apic->sipi_vector;
-			kvm_x86_ops.vcpu_deliver_sipi_vector(vcpu, sipi_vector);
+			static_call(kvm_x86_vcpu_deliver_sipi_vector)(vcpu, sipi_vector);
 			vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 		}
 	}
@@ -85,9 +85,6 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value);
 u64 kvm_lapic_get_base(struct kvm_vcpu *vcpu);
 void kvm_recalculate_apic_map(struct kvm *kvm);
 void kvm_apic_set_version(struct kvm_vcpu *vcpu);
-int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val);
-int kvm_lapic_reg_read(struct kvm_lapic *apic, u32 offset, int len,
-		       void *data);
 bool kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
 			 int shorthand, unsigned int dest, int dest_mode);
 int kvm_apic_compare_prio(struct kvm_vcpu *vcpu1, struct kvm_vcpu *vcpu2);
@@ -121,6 +118,7 @@ int kvm_lapic_set_vapic_addr(struct kvm_vcpu *vcpu, gpa_t vapic_addr);
 void kvm_lapic_sync_from_vapic(struct kvm_vcpu *vcpu);
 void kvm_lapic_sync_to_vapic(struct kvm_vcpu *vcpu);
 
+int kvm_x2apic_icr_write(struct kvm_lapic *apic, u64 data);
 int kvm_x2apic_msr_write(struct kvm_vcpu *vcpu, u32 msr, u64 data);
 int kvm_x2apic_msr_read(struct kvm_vcpu *vcpu, u32 msr, u64 *data);
 
@@ -153,19 +151,14 @@ static inline void kvm_lapic_set_irr(int vec, struct kvm_lapic *apic)
 	apic->irr_pending = true;
 }
 
+static inline u32 __kvm_lapic_get_reg(char *regs, int reg_off)
+{
+	return *((u32 *) (regs + reg_off));
+}
+
 static inline u32 kvm_lapic_get_reg(struct kvm_lapic *apic, int reg_off)
 {
-	return *((u32 *) (apic->regs + reg_off));
+	return __kvm_lapic_get_reg(apic->regs, reg_off);
+}
+
+static inline void __kvm_lapic_set_reg(char *regs, int reg_off, u32 val)
+{
+	*((u32 *) (regs + reg_off)) = val;
 }
 
 static inline void kvm_lapic_set_reg(struct kvm_lapic *apic, int reg_off, u32 val)
 {
-	*((u32 *) (apic->regs + reg_off)) = val;
+	__kvm_lapic_set_reg(apic->regs, reg_off, val);
 }
 
 DECLARE_STATIC_KEY_FALSE(kvm_has_noapic_vcpu);
@@ -48,6 +48,7 @@
 			       X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE)
 
 #define KVM_MMU_CR0_ROLE_BITS (X86_CR0_PG | X86_CR0_WP)
+#define KVM_MMU_EFER_ROLE_BITS (EFER_LME | EFER_NX)
 
 static __always_inline u64 rsvd_bits(int s, int e)
 {
@@ -79,12 +80,13 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
 
 int kvm_mmu_load(struct kvm_vcpu *vcpu);
 void kvm_mmu_unload(struct kvm_vcpu *vcpu);
+void kvm_mmu_free_obsolete_roots(struct kvm_vcpu *vcpu);
 void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu);
 void kvm_mmu_sync_prev_roots(struct kvm_vcpu *vcpu);
 
 static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
 {
-	if (likely(vcpu->arch.mmu->root_hpa != INVALID_PAGE))
+	if (likely(vcpu->arch.mmu->root.hpa != INVALID_PAGE))
 		return 0;
 
 	return kvm_mmu_load(vcpu);
@@ -106,7 +108,7 @@ static inline unsigned long kvm_get_active_pcid(struct kvm_vcpu *vcpu)
 
 static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu)
 {
-	u64 root_hpa = vcpu->arch.mmu->root_hpa;
+	u64 root_hpa = vcpu->arch.mmu->root.hpa;
 
 	if (!VALID_PAGE(root_hpa))
 		return;
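The root_hpa/root_pgd to root.hpa/root.pgd renames that run through the MMU hunks below come from folding the two loose fields into the same structure already used for the cached previous roots, so the current root and a cached root can be swapped wholesale. A simplified sketch of the shape of that change is shown here; it is not the kernel's definition (the real types live in asm/kvm_host.h) and the typedefs and struct name are illustrative.

/* Illustrative sketch, not the kernel's definitions. */
#include <stdint.h>

typedef uint64_t hpa_t;	/* host physical address */
typedef uint64_t gpa_t;	/* guest physical address */

#define KVM_MMU_NUM_PREV_ROOTS 3

struct kvm_mmu_root_info {
	gpa_t pgd;
	hpa_t hpa;
};

/*
 * Before: the active root was tracked as two separate fields (root_hpa,
 * root_pgd).  After: it is a kvm_mmu_root_info just like prev_roots[],
 * which is what makes the swap()-based root caching below possible.
 */
struct kvm_mmu_roots_example {
	struct kvm_mmu_root_info root;
	struct kvm_mmu_root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];
};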
@@ -202,44 +204,6 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	return vcpu->arch.mmu->page_fault(vcpu, &fault);
 }
 
-/*
- * Currently, we have two sorts of write-protection, a) the first one
- * write-protects guest page to sync the guest modification, b) another one is
- * used to sync dirty bitmap when we do KVM_GET_DIRTY_LOG. The differences
- * between these two sorts are:
- * 1) the first case clears MMU-writable bit.
- * 2) the first case requires flushing tlb immediately avoiding corrupting
- *    shadow page table between all vcpus so it should be in the protection of
- *    mmu-lock. And the another case does not need to flush tlb until returning
- *    the dirty bitmap to userspace since it only write-protects the page
- *    logged in the bitmap, that means the page in the dirty bitmap is not
- *    missed, so it can flush tlb out of mmu-lock.
- *
- * So, there is the problem: the first case can meet the corrupted tlb caused
- * by another case which write-protects pages but without flush tlb
- * immediately. In order to making the first case be aware this problem we let
- * it flush tlb if we try to write-protect a spte whose MMU-writable bit
- * is set, it works since another case never touches MMU-writable bit.
- *
- * Anyway, whenever a spte is updated (only permission and status bits are
- * changed) we need to check whether the spte with MMU-writable becomes
- * readonly, if that happens, we need to flush tlb. Fortunately,
- * mmu_spte_update() has already handled it perfectly.
- *
- * The rules to use MMU-writable and PT_WRITABLE_MASK:
- * - if we want to see if it has writable tlb entry or if the spte can be
- *   writable on the mmu mapping, check MMU-writable, this is the most
- *   case, otherwise
- * - if we fix page fault on the spte or do write-protection by dirty logging,
- *   check PT_WRITABLE_MASK.
- *
- * TODO: introduce APIs to split these two cases.
- */
-static inline bool is_writable_pte(unsigned long pte)
-{
-	return pte & PT_WRITABLE_MASK;
-}
-
 /*
  * Check if a given access (described through the I/D, W/R and U/S bits of a
  * page fault error code pfec) causes a permission fault with the given PTE
@@ -104,15 +104,6 @@ static int max_huge_page_level __read_mostly;
 static int tdp_root_level __read_mostly;
 static int max_tdp_level __read_mostly;
 
-enum {
-	AUDIT_PRE_PAGE_FAULT,
-	AUDIT_POST_PAGE_FAULT,
-	AUDIT_PRE_PTE_WRITE,
-	AUDIT_POST_PTE_WRITE,
-	AUDIT_PRE_SYNC,
-	AUDIT_POST_SYNC
-};
-
 #ifdef MMU_DEBUG
 bool dbg = 0;
 module_param(dbg, bool, 0644);
@@ -190,8 +181,6 @@ struct kmem_cache *mmu_page_header_cache;
 static struct percpu_counter kvm_total_used_mmu_pages;
 
 static void mmu_spte_set(u64 *sptep, u64 spte);
-static union kvm_mmu_page_role
-kvm_mmu_calc_root_page_role(struct kvm_vcpu *vcpu);
 
 struct kvm_mmu_role_regs {
 	const unsigned long cr0;
@@ -529,6 +518,7 @@ static u64 mmu_spte_update_no_track(u64 *sptep, u64 new_spte)
 	u64 old_spte = *sptep;
 
 	WARN_ON(!is_shadow_present_pte(new_spte));
+	check_spte_writable_invariants(new_spte);
 
 	if (!is_shadow_present_pte(old_spte)) {
 		mmu_spte_set(sptep, new_spte);
@@ -548,11 +538,9 @@
 /* Rules for using mmu_spte_update:
  * Update the state bits, it means the mapped pfn is not changed.
  *
- * Whenever we overwrite a writable spte with a read-only one we
- * should flush remote TLBs. Otherwise rmap_write_protect
- * will find a read-only spte, even though the writable spte
- * might be cached on a CPU's TLB, the return value indicates this
- * case.
+ * Whenever an MMU-writable SPTE is overwritten with a read-only SPTE, remote
+ * TLBs must be flushed. Otherwise rmap_write_protect will find a read-only
+ * spte, even though the writable spte might be cached on a CPU's TLB.
  *
  * Returns true if the TLB needs to be flushed
  */
@@ -646,24 +634,6 @@ static u64 mmu_spte_get_lockless(u64 *sptep)
 	return __get_spte_lockless(sptep);
 }
 
-/* Restore an acc-track PTE back to a regular PTE */
-static u64 restore_acc_track_spte(u64 spte)
-{
-	u64 new_spte = spte;
-	u64 saved_bits = (spte >> SHADOW_ACC_TRACK_SAVED_BITS_SHIFT)
-			 & SHADOW_ACC_TRACK_SAVED_BITS_MASK;
-
-	WARN_ON_ONCE(spte_ad_enabled(spte));
-	WARN_ON_ONCE(!is_access_track_spte(spte));
-
-	new_spte &= ~shadow_acc_track_mask;
-	new_spte &= ~(SHADOW_ACC_TRACK_SAVED_BITS_MASK <<
-		      SHADOW_ACC_TRACK_SAVED_BITS_SHIFT);
-	new_spte |= saved_bits;
-
-	return new_spte;
-}
-
 /* Returns the Accessed status of the PTE and resets it at the same time. */
 static bool mmu_spte_age(u64 *sptep)
 {
@@ -1229,9 +1199,8 @@ static bool spte_write_protect(u64 *sptep, bool pt_protect)
 	return mmu_spte_update(sptep, spte);
 }
 
-static bool __rmap_write_protect(struct kvm *kvm,
-				 struct kvm_rmap_head *rmap_head,
-				 bool pt_protect)
+static bool rmap_write_protect(struct kvm_rmap_head *rmap_head,
+			       bool pt_protect)
 {
 	u64 *sptep;
 	struct rmap_iterator iter;
@@ -1311,7 +1280,7 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
 	while (mask) {
 		rmap_head = gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask),
 					PG_LEVEL_4K, slot);
-		__rmap_write_protect(kvm, rmap_head, false);
+		rmap_write_protect(rmap_head, false);
 
 		/* clear the first set bit */
 		mask &= mask - 1;
@@ -1378,6 +1347,9 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 		gfn_t start = slot->base_gfn + gfn_offset + __ffs(mask);
 		gfn_t end = slot->base_gfn + gfn_offset + __fls(mask);
 
+		if (READ_ONCE(eager_page_split))
+			kvm_mmu_try_split_huge_pages(kvm, slot, start, end, PG_LEVEL_4K);
+
 		kvm_mmu_slot_gfn_write_protect(kvm, slot, start, PG_LEVEL_2M);
 
 		/* Cross two large pages? */
@@ -1410,7 +1382,7 @@ bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
 	if (kvm_memslots_have_rmaps(kvm)) {
 		for (i = min_level; i <= KVM_MAX_HUGEPAGE_LEVEL; ++i) {
 			rmap_head = gfn_to_rmap(gfn, i, slot);
-			write_protected |= __rmap_write_protect(kvm, rmap_head, true);
+			write_protected |= rmap_write_protect(rmap_head, true);
 		}
 	}
 
@@ -1421,7 +1393,7 @@ bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
 	return write_protected;
 }
 
-static bool rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn)
+static bool kvm_vcpu_write_protect_gfn(struct kvm_vcpu *vcpu, u64 gfn)
 {
 	struct kvm_memory_slot *slot;
 
@@ -1921,13 +1893,6 @@ static bool kvm_mmu_remote_flush_or_zap(struct kvm *kvm,
 	return true;
 }
 
-#ifdef CONFIG_KVM_MMU_AUDIT
-#include "mmu_audit.c"
-#else
-static void kvm_mmu_audit(struct kvm_vcpu *vcpu, int point) { }
-static void mmu_audit_disable(void) { }
-#endif
-
 static bool is_obsolete_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	if (sp->role.invalid)
@@ -2024,7 +1989,7 @@ static int mmu_sync_children(struct kvm_vcpu *vcpu,
 		bool protected = false;
 
 		for_each_sp(pages, sp, parents, i)
-			protected |= rmap_write_protect(vcpu, sp->gfn);
+			protected |= kvm_vcpu_write_protect_gfn(vcpu, sp->gfn);
 
 		if (protected) {
 			kvm_mmu_remote_flush_or_zap(vcpu->kvm, &invalid_list, true);
@@ -2149,7 +2114,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 	hlist_add_head(&sp->hash_link, sp_list);
 	if (!direct) {
 		account_shadowed(vcpu->kvm, sp);
-		if (level == PG_LEVEL_4K && rmap_write_protect(vcpu, gfn))
+		if (level == PG_LEVEL_4K && kvm_vcpu_write_protect_gfn(vcpu, gfn))
 			kvm_flush_remote_tlbs_with_address(vcpu->kvm, gfn, 1);
 	}
 	trace_kvm_mmu_get_page(sp, true);
@@ -2179,7 +2144,7 @@ static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterator,
 		 * prev_root is currently only used for 64-bit hosts. So only
 		 * the active root_hpa is valid here.
 		 */
-		BUG_ON(root != vcpu->arch.mmu->root_hpa);
+		BUG_ON(root != vcpu->arch.mmu->root.hpa);
 
 		iterator->shadow_addr
 			= vcpu->arch.mmu->pae_root[(addr >> 30) & 3];
@@ -2193,7 +2158,7 @@ static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterator,
 static void shadow_walk_init(struct kvm_shadow_walk_iterator *iterator,
 			     struct kvm_vcpu *vcpu, u64 addr)
 {
-	shadow_walk_init_using_root(iterator, vcpu, vcpu->arch.mmu->root_hpa,
+	shadow_walk_init_using_root(iterator, vcpu, vcpu->arch.mmu->root.hpa,
 				    addr);
 }
 
@@ -2307,7 +2272,7 @@ static int kvm_mmu_page_unlink_children(struct kvm *kvm,
 	return zapped;
 }
 
-static void kvm_mmu_unlink_parents(struct kvm *kvm, struct kvm_mmu_page *sp)
+static void kvm_mmu_unlink_parents(struct kvm_mmu_page *sp)
 {
 	u64 *sptep;
 	struct rmap_iterator iter;
@@ -2345,13 +2310,13 @@ static bool __kvm_mmu_prepare_zap_page(struct kvm *kvm,
 				       struct list_head *invalid_list,
 				       int *nr_zapped)
 {
-	bool list_unstable;
+	bool list_unstable, zapped_root = false;
 
 	trace_kvm_mmu_prepare_zap_page(sp);
 	++kvm->stat.mmu_shadow_zapped;
 	*nr_zapped = mmu_zap_unsync_children(kvm, sp, invalid_list);
 	*nr_zapped += kvm_mmu_page_unlink_children(kvm, sp, invalid_list);
-	kvm_mmu_unlink_parents(kvm, sp);
+	kvm_mmu_unlink_parents(sp);
 
 	/* Zapping children means active_mmu_pages has become unstable. */
 	list_unstable = *nr_zapped;
@@ -2387,14 +2352,20 @@ static bool __kvm_mmu_prepare_zap_page(struct kvm *kvm,
 		 * in kvm_mmu_zap_all_fast(). Note, is_obsolete_sp() also
 		 * treats invalid shadow pages as being obsolete.
 		 */
-		if (!is_obsolete_sp(kvm, sp))
-			kvm_reload_remote_mmus(kvm);
+		zapped_root = !is_obsolete_sp(kvm, sp);
 	}
 
 	if (sp->lpage_disallowed)
 		unaccount_huge_nx_page(kvm, sp);
 
 	sp->role.invalid = 1;
+
+	/*
+	 * Make the request to free obsolete roots after marking the root
+	 * invalid, otherwise other vCPUs may not see it as invalid.
+	 */
+	if (zapped_root)
+		kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_FREE_OBSOLETE_ROOTS);
 	return list_unstable;
 }
 
@@ -3239,6 +3210,8 @@ static void mmu_free_root_page(struct kvm *kvm, hpa_t *root_hpa,
 		return;
 
 	sp = to_shadow_page(*root_hpa & PT64_BASE_ADDR_MASK);
+	if (WARN_ON(!sp))
+		return;
 
 	if (is_tdp_mmu_page(sp))
 		kvm_tdp_mmu_put_root(kvm, sp, false);
@@ -3249,18 +3222,20 @@
 }
 
 /* roots_to_free must be some combination of the KVM_MMU_ROOT_* flags */
-void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+void kvm_mmu_free_roots(struct kvm *kvm, struct kvm_mmu *mmu,
 			ulong roots_to_free)
 {
-	struct kvm *kvm = vcpu->kvm;
 	int i;
 	LIST_HEAD(invalid_list);
-	bool free_active_root = roots_to_free & KVM_MMU_ROOT_CURRENT;
+	bool free_active_root;
 
 	BUILD_BUG_ON(KVM_MMU_NUM_PREV_ROOTS >= BITS_PER_LONG);
 
 	/* Before acquiring the MMU lock, see if we need to do any real work. */
-	if (!(free_active_root && VALID_PAGE(mmu->root_hpa))) {
+	free_active_root = (roots_to_free & KVM_MMU_ROOT_CURRENT)
+		&& VALID_PAGE(mmu->root.hpa);
+
+	if (!free_active_root) {
 		for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
 			if ((roots_to_free & KVM_MMU_ROOT_PREVIOUS(i)) &&
 			    VALID_PAGE(mmu->prev_roots[i].hpa))
@@ -3278,9 +3253,8 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 					   &invalid_list);
 
 	if (free_active_root) {
-		if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
-		    (mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
-			mmu_free_root_page(kvm, &mmu->root_hpa, &invalid_list);
+		if (to_shadow_page(mmu->root.hpa)) {
+			mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list);
 		} else if (mmu->pae_root) {
 			for (i = 0; i < 4; ++i) {
 				if (!IS_VALID_PAE_ROOT(mmu->pae_root[i]))
@@ -3291,8 +3265,8 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 				mmu->pae_root[i] = INVALID_PAE_ROOT;
 			}
 		}
-		mmu->root_hpa = INVALID_PAGE;
-		mmu->root_pgd = 0;
+		mmu->root.hpa = INVALID_PAGE;
+		mmu->root.pgd = 0;
 	}
 
 	kvm_mmu_commit_zap_page(kvm, &invalid_list);
@@ -3300,7 +3274,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_free_roots);
 
-void kvm_mmu_free_guest_mode_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
+void kvm_mmu_free_guest_mode_roots(struct kvm *kvm, struct kvm_mmu *mmu)
 {
 	unsigned long roots_to_free = 0;
 	hpa_t root_hpa;
@@ -3322,7 +3296,7 @@ void kvm_mmu_free_guest_mode_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
 		roots_to_free |= KVM_MMU_ROOT_PREVIOUS(i);
 	}
 
-	kvm_mmu_free_roots(vcpu, mmu, roots_to_free);
+	kvm_mmu_free_roots(kvm, mmu, roots_to_free);
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_free_guest_mode_roots);
 
@@ -3365,10 +3339,10 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 
 	if (is_tdp_mmu_enabled(vcpu->kvm)) {
 		root = kvm_tdp_mmu_get_vcpu_root_hpa(vcpu);
-		mmu->root_hpa = root;
+		mmu->root.hpa = root;
 	} else if (shadow_root_level >= PT64_ROOT_4LEVEL) {
 		root = mmu_alloc_root(vcpu, 0, 0, shadow_root_level, true);
-		mmu->root_hpa = root;
+		mmu->root.hpa = root;
 	} else if (shadow_root_level == PT32E_ROOT_LEVEL) {
 		if (WARN_ON_ONCE(!mmu->pae_root)) {
 			r = -EIO;
@@ -3383,15 +3357,15 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 			mmu->pae_root[i] = root | PT_PRESENT_MASK |
 					   shadow_me_mask;
 		}
-		mmu->root_hpa = __pa(mmu->pae_root);
+		mmu->root.hpa = __pa(mmu->pae_root);
 	} else {
 		WARN_ONCE(1, "Bad TDP root level = %d\n", shadow_root_level);
 		r = -EIO;
 		goto out_unlock;
 	}
 
-	/* root_pgd is ignored for direct MMUs. */
-	mmu->root_pgd = 0;
+	/* root.pgd is ignored for direct MMUs. */
+	mmu->root.pgd = 0;
 out_unlock:
 	write_unlock(&vcpu->kvm->mmu_lock);
 	return r;
@@ -3504,7 +3478,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	if (mmu->root_level >= PT64_ROOT_4LEVEL) {
 		root = mmu_alloc_root(vcpu, root_gfn, 0,
 				      mmu->shadow_root_level, false);
-		mmu->root_hpa = root;
+		mmu->root.hpa = root;
 		goto set_root_pgd;
 	}
 
@@ -3554,14 +3528,14 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	}
 
 	if (mmu->shadow_root_level == PT64_ROOT_5LEVEL)
-		mmu->root_hpa = __pa(mmu->pml5_root);
+		mmu->root.hpa = __pa(mmu->pml5_root);
 	else if (mmu->shadow_root_level == PT64_ROOT_4LEVEL)
-		mmu->root_hpa = __pa(mmu->pml4_root);
+		mmu->root.hpa = __pa(mmu->pml4_root);
 	else
-		mmu->root_hpa = __pa(mmu->pae_root);
+		mmu->root.hpa = __pa(mmu->pae_root);
 
 set_root_pgd:
-	mmu->root_pgd = root_pgd;
+	mmu->root.pgd = root_pgd;
 out_unlock:
 	write_unlock(&vcpu->kvm->mmu_lock);
 
@@ -3660,6 +3634,14 @@ static bool is_unsync_root(hpa_t root)
 	 */
 	smp_rmb();
 	sp = to_shadow_page(root);
+
+	/*
+	 * PAE roots (somewhat arbitrarily) aren't backed by shadow pages, the
+	 * PDPTEs for a given PAE root need to be synchronized individually.
+	 */
+	if (WARN_ON_ONCE(!sp))
+		return false;
+
 	if (sp->unsync || sp->unsync_children)
 		return true;
 
@@ -3674,30 +3656,25 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.mmu->direct_map)
 		return;
 
-	if (!VALID_PAGE(vcpu->arch.mmu->root_hpa))
+	if (!VALID_PAGE(vcpu->arch.mmu->root.hpa))
 		return;
 
 	vcpu_clear_mmio_info(vcpu, MMIO_GVA_ANY);
 
 	if (vcpu->arch.mmu->root_level >= PT64_ROOT_4LEVEL) {
-		hpa_t root = vcpu->arch.mmu->root_hpa;
+		hpa_t root = vcpu->arch.mmu->root.hpa;
 		sp = to_shadow_page(root);
 
 		if (!is_unsync_root(root))
 			return;
 
 		write_lock(&vcpu->kvm->mmu_lock);
-		kvm_mmu_audit(vcpu, AUDIT_PRE_SYNC);
-
 		mmu_sync_children(vcpu, sp, true);
-
-		kvm_mmu_audit(vcpu, AUDIT_POST_SYNC);
 		write_unlock(&vcpu->kvm->mmu_lock);
 		return;
 	}
 
 	write_lock(&vcpu->kvm->mmu_lock);
-	kvm_mmu_audit(vcpu, AUDIT_PRE_SYNC);
 
 	for (i = 0; i < 4; ++i) {
 		hpa_t root = vcpu->arch.mmu->pae_root[i];
@@ -3709,7 +3686,6 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
 		}
 	}
 
-	kvm_mmu_audit(vcpu, AUDIT_POST_SYNC);
 	write_unlock(&vcpu->kvm->mmu_lock);
 }
 
@@ -3723,7 +3699,7 @@ void kvm_mmu_sync_prev_roots(struct kvm_vcpu *vcpu)
 			roots_to_free |= KVM_MMU_ROOT_PREVIOUS(i);
 
 	/* sync prev_roots by simply freeing them */
-	kvm_mmu_free_roots(vcpu, vcpu->arch.mmu, roots_to_free);
+	kvm_mmu_free_roots(vcpu->kvm, vcpu->arch.mmu, roots_to_free);
 }
 
 static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
@@ -3982,7 +3958,7 @@ static bool kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 static bool is_page_fault_stale(struct kvm_vcpu *vcpu,
 				struct kvm_page_fault *fault, int mmu_seq)
 {
-	struct kvm_mmu_page *sp = to_shadow_page(vcpu->arch.mmu->root_hpa);
+	struct kvm_mmu_page *sp = to_shadow_page(vcpu->arch.mmu->root.hpa);
 
 	/* Special roots, e.g. pae_root, are not backed by shadow pages. */
 	if (sp && is_obsolete_sp(vcpu->kvm, sp))
@@ -3996,7 +3972,7 @@ static bool is_page_fault_stale(struct kvm_vcpu *vcpu,
 	 * previous root, then __kvm_mmu_prepare_zap_page() signals all vCPUs
 	 * to reload even if no vCPU is actively using the root.
 	 */
-	if (!sp && kvm_test_request(KVM_REQ_MMU_RELOAD, vcpu))
+	if (!sp && kvm_test_request(KVM_REQ_MMU_FREE_OBSOLETE_ROOTS, vcpu))
 		return true;
 
 	return fault->slot &&
@@ -4132,74 +4108,105 @@ static inline bool is_root_usable(struct kvm_mmu_root_info *root, gpa_t pgd,
 				  union kvm_mmu_page_role role)
 {
 	return (role.direct || pgd == root->pgd) &&
-	       VALID_PAGE(root->hpa) && to_shadow_page(root->hpa) &&
+	       VALID_PAGE(root->hpa) &&
 	       role.word == to_shadow_page(root->hpa)->role.word;
 }
 
 /*
- * Find out if a previously cached root matching the new pgd/role is available.
- * The current root is also inserted into the cache.
- * If a matching root was found, it is assigned to kvm_mmu->root_hpa and true is
- * returned.
- * Otherwise, the LRU root from the cache is assigned to kvm_mmu->root_hpa and
- * false is returned. This root should now be freed by the caller.
+ * Find out if a previously cached root matching the new pgd/role is available,
+ * and insert the current root as the MRU in the cache.
+ * If a matching root is found, it is assigned to kvm_mmu->root and
+ * true is returned.
+ * If no match is found, kvm_mmu->root is left invalid, the LRU root is
+ * evicted to make room for the current root, and false is returned.
  */
-static bool cached_root_available(struct kvm_vcpu *vcpu, gpa_t new_pgd,
-				  union kvm_mmu_page_role new_role)
+static bool cached_root_find_and_keep_current(struct kvm *kvm, struct kvm_mmu *mmu,
+					      gpa_t new_pgd,
+					      union kvm_mmu_page_role new_role)
 {
 	uint i;
-	struct kvm_mmu_root_info root;
-	struct kvm_mmu *mmu = vcpu->arch.mmu;
-
-	root.pgd = mmu->root_pgd;
-	root.hpa = mmu->root_hpa;
 
-	if (is_root_usable(&root, new_pgd, new_role))
+	if (is_root_usable(&mmu->root, new_pgd, new_role))
 		return true;
 
 	for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) {
-		swap(root, mmu->prev_roots[i]);
-
-		if (is_root_usable(&root, new_pgd, new_role))
-			break;
+		/*
+		 * The swaps end up rotating the cache like this:
+		 *   C   0 1 2 3   (on entry to the function)
+		 *   0   C 1 2 3
+		 *   1   C 0 2 3
+		 *   2   C 0 1 3
+		 *   3   C 0 1 2   (on exit from the loop)
+		 */
+		swap(mmu->root, mmu->prev_roots[i]);
+		if (is_root_usable(&mmu->root, new_pgd, new_role))
+			return true;
 	}
 
-	mmu->root_hpa = root.hpa;
-	mmu->root_pgd = root.pgd;
-
-	return i < KVM_MMU_NUM_PREV_ROOTS;
-}
-
-static bool fast_pgd_switch(struct kvm_vcpu *vcpu, gpa_t new_pgd,
-			    union kvm_mmu_page_role new_role)
-{
-	struct kvm_mmu *mmu = vcpu->arch.mmu;
-
-	/*
-	 * For now, limit the fast switch to 64-bit hosts+VMs in order to avoid
-	 * having to deal with PDPTEs. We may add support for 32-bit hosts/VMs
-	 * later if necessary.
-	 */
-	if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
-	    mmu->root_level >= PT64_ROOT_4LEVEL)
-		return cached_root_available(vcpu, new_pgd, new_role);
-
+	kvm_mmu_free_roots(kvm, mmu, KVM_MMU_ROOT_CURRENT);
 	return false;
 }
 
-static void __kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd,
-			      union kvm_mmu_page_role new_role)
+/*
+ * Find out if a previously cached root matching the new pgd/role is available.
+ * On entry, mmu->root is invalid.
+ * If a matching root is found, it is assigned to kvm_mmu->root, the LRU entry
+ * of the cache becomes invalid, and true is returned.
+ * If no match is found, kvm_mmu->root is left invalid and false is returned.
+ */
+static bool cached_root_find_without_current(struct kvm *kvm, struct kvm_mmu *mmu,
+					     gpa_t new_pgd,
+					     union kvm_mmu_page_role new_role)
 {
-	if (!fast_pgd_switch(vcpu, new_pgd, new_role)) {
-		kvm_mmu_free_roots(vcpu, vcpu->arch.mmu, KVM_MMU_ROOT_CURRENT);
+	uint i;
+
+	for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
+		if (is_root_usable(&mmu->prev_roots[i], new_pgd, new_role))
+			goto hit;
+
+	return false;
+
+hit:
+	swap(mmu->root, mmu->prev_roots[i]);
+	/* Bubble up the remaining roots. */
+	for (; i < KVM_MMU_NUM_PREV_ROOTS - 1; i++)
+		mmu->prev_roots[i] = mmu->prev_roots[i + 1];
+	mmu->prev_roots[i].hpa = INVALID_PAGE;
+	return true;
+}
+
+static bool fast_pgd_switch(struct kvm *kvm, struct kvm_mmu *mmu,
+			    gpa_t new_pgd, union kvm_mmu_page_role new_role)
+{
+	/*
+	 * For now, limit the caching to 64-bit hosts+VMs in order to avoid
+	 * having to deal with PDPTEs. We may add support for 32-bit hosts/VMs
+	 * later if necessary.
+	 */
+	if (VALID_PAGE(mmu->root.hpa) && !to_shadow_page(mmu->root.hpa))
+		kvm_mmu_free_roots(kvm, mmu, KVM_MMU_ROOT_CURRENT);
+
+	if (VALID_PAGE(mmu->root.hpa))
+		return cached_root_find_and_keep_current(kvm, mmu, new_pgd, new_role);
+	else
+		return cached_root_find_without_current(kvm, mmu, new_pgd, new_role);
+}
+
+void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd)
+{
+	struct kvm_mmu *mmu = vcpu->arch.mmu;
+	union kvm_mmu_page_role new_role = mmu->mmu_role.base;
+
+	if (!fast_pgd_switch(vcpu->kvm, mmu, new_pgd, new_role)) {
+		/* kvm_mmu_ensure_valid_pgd will set up a new root. */
 		return;
 	}
 
 	/*
 	 * It's possible that the cached previous root page is obsolete because
 	 * of a change in the MMU generation number. However, changing the
-	 * generation number is accompanied by KVM_REQ_MMU_RELOAD, which will
-	 * free the root set here and allocate a new one.
+	 * generation number is accompanied by KVM_REQ_MMU_FREE_OBSOLETE_ROOTS,
+	 * which will free the root set here and allocate a new one.
 	 */
 	kvm_make_request(KVM_REQ_LOAD_MMU_PGD, vcpu);
 
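The rotation diagram added in cached_root_find_and_keep_current() above is easy to verify in isolation: repeatedly swapping the current slot with prev_roots[i] walks the old current root down the cache while each cached candidate is tried in turn. Below is a standalone toy program that reproduces the diagram; it is plain C for illustration only, not kernel code, and the names cur/prev/NR_PREV are made up for the example.

/* Toy illustration of the swap()-based MRU rotation described above. */
#include <stdio.h>

#define NR_PREV 4

static void swap_int(int *a, int *b)
{
	int tmp = *a;

	*a = *b;
	*b = tmp;
}

int main(void)
{
	int cur = 'C', prev[NR_PREV] = { '0', '1', '2', '3' };
	int i;

	printf("entry:        %c   %c %c %c %c\n",
	       cur, prev[0], prev[1], prev[2], prev[3]);
	for (i = 0; i < NR_PREV; i++) {
		swap_int(&cur, &prev[i]);
		printf("after swap %d: %c   %c %c %c %c\n", i,
		       cur, prev[0], prev[1], prev[2], prev[3]);
	}
	return 0;
}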
@@ -4222,12 +4229,7 @@ static void __kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd,
 	 */
 	if (!new_role.direct)
 		__clear_sp_write_flooding_count(
-				to_shadow_page(vcpu->arch.mmu->root_hpa));
-}
-
-void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd)
-{
-	__kvm_mmu_new_pgd(vcpu, new_pgd, kvm_mmu_calc_root_page_role(vcpu));
+				to_shadow_page(vcpu->arch.mmu->root.hpa));
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_new_pgd);

@@ -4485,8 +4487,7 @@ static inline bool boot_cpu_is_amd(void)
  * possible, however, kvm currently does not do execution-protection.
  */
 static void
-reset_tdp_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
-				struct kvm_mmu *context)
+reset_tdp_shadow_zero_bits_mask(struct kvm_mmu *context)
 {
 	struct rsvd_bits_validate *shadow_zero_check;
 	int i;

@@ -4517,8 +4518,7 @@ reset_tdp_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
  * is the shadow page table for intel nested guest.
  */
 static void
-reset_ept_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
-				struct kvm_mmu *context, bool execonly)
+reset_ept_shadow_zero_bits_mask(struct kvm_mmu *context, bool execonly)
 {
 	__reset_rsvds_bits_mask_ept(&context->shadow_zero_check,
 				    reserved_hpa_bits(), execonly,

@@ -4805,7 +4805,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
 		context->gva_to_gpa = paging32_gva_to_gpa;
 
 	reset_guest_paging_metadata(vcpu, context);
-	reset_tdp_shadow_zero_bits_mask(vcpu, context);
+	reset_tdp_shadow_zero_bits_mask(context);
 }
 
 static union kvm_mmu_role

@@ -4899,9 +4899,8 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
 
 	new_role = kvm_calc_shadow_npt_root_page_role(vcpu, &regs);
 
-	__kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
-
 	shadow_mmu_init_context(vcpu, context, &regs, new_role);
+
+	kvm_mmu_new_pgd(vcpu, nested_cr3);
 }
 EXPORT_SYMBOL_GPL(kvm_init_shadow_npt_mmu);
@@ -4939,27 +4938,25 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 		kvm_calc_shadow_ept_root_page_role(vcpu, accessed_dirty,
 						   execonly, level);
 
-	__kvm_mmu_new_pgd(vcpu, new_eptp, new_role.base);
-
-	if (new_role.as_u64 == context->mmu_role.as_u64)
-		return;
-
-	context->mmu_role.as_u64 = new_role.as_u64;
-
-	context->shadow_root_level = level;
-
-	context->ept_ad = accessed_dirty;
-	context->page_fault = ept_page_fault;
-	context->gva_to_gpa = ept_gva_to_gpa;
-	context->sync_page = ept_sync_page;
-	context->invlpg = ept_invlpg;
-	context->root_level = level;
-	context->direct_map = false;
-
-	update_permission_bitmask(context, true);
-	context->pkru_mask = 0;
-	reset_rsvds_bits_mask_ept(vcpu, context, execonly, huge_page_level);
-	reset_ept_shadow_zero_bits_mask(vcpu, context, execonly);
+	if (new_role.as_u64 != context->mmu_role.as_u64) {
+		context->mmu_role.as_u64 = new_role.as_u64;
+
+		context->shadow_root_level = level;
+
+		context->ept_ad = accessed_dirty;
+		context->page_fault = ept_page_fault;
+		context->gva_to_gpa = ept_gva_to_gpa;
+		context->sync_page = ept_sync_page;
+		context->invlpg = ept_invlpg;
+		context->root_level = level;
+		context->direct_map = false;
+		update_permission_bitmask(context, true);
+		context->pkru_mask = 0;
+		reset_rsvds_bits_mask_ept(vcpu, context, execonly, huge_page_level);
+		reset_ept_shadow_zero_bits_mask(context, execonly);
+	}
+
+	kvm_mmu_new_pgd(vcpu, new_eptp);
 }
 EXPORT_SYMBOL_GPL(kvm_init_shadow_ept_mmu);
@@ -5044,20 +5041,6 @@ void kvm_init_mmu(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_init_mmu);
 
-static union kvm_mmu_page_role
-kvm_mmu_calc_root_page_role(struct kvm_vcpu *vcpu)
-{
-	struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
-	union kvm_mmu_role role;
-
-	if (tdp_enabled)
-		role = kvm_calc_tdp_mmu_root_page_role(vcpu, &regs, true);
-	else
-		role = kvm_calc_shadow_mmu_root_page_role(vcpu, &regs, true);
-
-	return role.base;
-}
-
 void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 {
 	/*
@@ -5111,17 +5094,73 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
 	kvm_mmu_sync_roots(vcpu);
 
 	kvm_mmu_load_pgd(vcpu);
-	static_call(kvm_x86_tlb_flush_current)(vcpu);
+
+	/*
+	 * Flush any TLB entries for the new root, the provenance of the root
+	 * is unknown.  Even if KVM ensures there are no stale TLB entries
+	 * for a freed root, in theory another hypervisor could have left
+	 * stale entries.  Flushing on alloc also allows KVM to skip the TLB
+	 * flush when freeing a root (see kvm_tdp_mmu_put_root()).
+	 */
+	static_call(kvm_x86_flush_tlb_current)(vcpu);
 out:
 	return r;
 }
 
 void kvm_mmu_unload(struct kvm_vcpu *vcpu)
 {
-	kvm_mmu_free_roots(vcpu, &vcpu->arch.root_mmu, KVM_MMU_ROOTS_ALL);
-	WARN_ON(VALID_PAGE(vcpu->arch.root_mmu.root_hpa));
-	kvm_mmu_free_roots(vcpu, &vcpu->arch.guest_mmu, KVM_MMU_ROOTS_ALL);
-	WARN_ON(VALID_PAGE(vcpu->arch.guest_mmu.root_hpa));
+	struct kvm *kvm = vcpu->kvm;
+
+	kvm_mmu_free_roots(kvm, &vcpu->arch.root_mmu, KVM_MMU_ROOTS_ALL);
+	WARN_ON(VALID_PAGE(vcpu->arch.root_mmu.root.hpa));
+	kvm_mmu_free_roots(kvm, &vcpu->arch.guest_mmu, KVM_MMU_ROOTS_ALL);
+	WARN_ON(VALID_PAGE(vcpu->arch.guest_mmu.root.hpa));
+	vcpu_clear_mmio_info(vcpu, MMIO_GVA_ANY);
+}
+
+static bool is_obsolete_root(struct kvm *kvm, hpa_t root_hpa)
+{
+	struct kvm_mmu_page *sp;
+
+	if (!VALID_PAGE(root_hpa))
+		return false;
+
+	/*
+	 * When freeing obsolete roots, treat roots as obsolete if they don't
+	 * have an associated shadow page.  This does mean KVM will get false
+	 * positives and free roots that don't strictly need to be freed, but
+	 * such false positives are relatively rare:
+	 *
+	 * (a) only PAE paging and nested NPT has roots without shadow pages
+	 * (b) remote reloads due to a memslot update obsoletes _all_ roots
+	 * (c) KVM doesn't track previous roots for PAE paging, and the guest
+	 *     is unlikely to zap an in-use PGD.
+	 */
+	sp = to_shadow_page(root_hpa);
+	return !sp || is_obsolete_sp(kvm, sp);
+}
+
+static void __kvm_mmu_free_obsolete_roots(struct kvm *kvm, struct kvm_mmu *mmu)
+{
+	unsigned long roots_to_free = 0;
+	int i;
+
+	if (is_obsolete_root(kvm, mmu->root.hpa))
+		roots_to_free |= KVM_MMU_ROOT_CURRENT;
+
+	for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) {
+		if (is_obsolete_root(kvm, mmu->prev_roots[i].hpa))
+			roots_to_free |= KVM_MMU_ROOT_PREVIOUS(i);
+	}
+
+	if (roots_to_free)
+		kvm_mmu_free_roots(kvm, mmu, roots_to_free);
+}
+
+void kvm_mmu_free_obsolete_roots(struct kvm_vcpu *vcpu)
+{
+	__kvm_mmu_free_obsolete_roots(vcpu->kvm, &vcpu->arch.root_mmu);
+	__kvm_mmu_free_obsolete_roots(vcpu->kvm, &vcpu->arch.guest_mmu);
 }
 
 static bool need_remote_flush(u64 old, u64 new)
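As a rough model of the obsolete-root test above (user-space C, not KVM code; the structures and names are the demo's own), a root is freed when its generation is stale or when it has no backing shadow page at all, accepting the occasional false positive:

#include <stdbool.h>
#include <stdio.h>

struct demo_sp   { unsigned long mmu_valid_gen; };	/* shadow page backing a root */
struct demo_root { struct demo_sp *sp; };		/* sp == NULL: PAE/nested-NPT-style special root */

static bool demo_is_obsolete_root(const struct demo_root *root, unsigned long current_gen)
{
	/* No shadow page: treat as obsolete (possible false positive, but rare). */
	if (!root->sp)
		return true;
	return root->sp->mmu_valid_gen != current_gen;
}

int main(void)
{
	unsigned long current_gen = 1;
	struct demo_sp old_sp = { .mmu_valid_gen = 0 };
	struct demo_sp new_sp = { .mmu_valid_gen = 1 };
	struct demo_root roots[] = {
		{ .sp = &new_sp },	/* current generation, kept */
		{ .sp = &old_sp },	/* stale generation, freed */
		{ .sp = NULL },		/* special root, freed conservatively */
	};

	for (int i = 0; i < 3; i++)
		printf("root %d obsolete: %s\n", i,
		       demo_is_obsolete_root(&roots[i], current_gen) ? "yes" : "no");
	return 0;
}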
@@ -5271,7 +5310,6 @@ static void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	gentry = mmu_pte_write_fetch_gpte(vcpu, &gpa, &bytes);
 
 	++vcpu->kvm->stat.mmu_pte_write;
-	kvm_mmu_audit(vcpu, AUDIT_PRE_PTE_WRITE);
 
 	for_each_gfn_indirect_valid_sp(vcpu->kvm, sp, gfn) {
 		if (detect_write_misaligned(sp, gpa, bytes) ||

@@ -5296,7 +5334,6 @@ static void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 		}
 	}
 	kvm_mmu_remote_flush_or_zap(vcpu->kvm, &invalid_list, flush);
-	kvm_mmu_audit(vcpu, AUDIT_POST_PTE_WRITE);
 	write_unlock(&vcpu->kvm->mmu_lock);
 }

@@ -5306,7 +5343,7 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,
 	int r, emulation_type = EMULTYPE_PF;
 	bool direct = vcpu->arch.mmu->direct_map;
 
-	if (WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root_hpa)))
+	if (WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root.hpa)))
 		return RET_PF_RETRY;
 
 	r = RET_PF_INVALID;

@@ -5371,14 +5408,14 @@ void kvm_mmu_invalidate_gva(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 		if (is_noncanonical_address(gva, vcpu))
 			return;
 
-		static_call(kvm_x86_tlb_flush_gva)(vcpu, gva);
+		static_call(kvm_x86_flush_tlb_gva)(vcpu, gva);
 	}
 
 	if (!mmu->invlpg)
 		return;
 
 	if (root_hpa == INVALID_PAGE) {
-		mmu->invlpg(vcpu, gva, mmu->root_hpa);
+		mmu->invlpg(vcpu, gva, mmu->root.hpa);
 
 		/*
 		 * INVLPG is required to invalidate any global mappings for the VA,

@@ -5414,7 +5451,7 @@ void kvm_mmu_invpcid_gva(struct kvm_vcpu *vcpu, gva_t gva, unsigned long pcid)
 	uint i;
 
 	if (pcid == kvm_get_active_pcid(vcpu)) {
-		mmu->invlpg(vcpu, gva, mmu->root_hpa);
+		mmu->invlpg(vcpu, gva, mmu->root.hpa);
 		tlb_flush = true;
 	}

@@ -5427,7 +5464,7 @@ void kvm_mmu_invpcid_gva(struct kvm_vcpu *vcpu, gva_t gva, unsigned long pcid)
 	}
 
 	if (tlb_flush)
-		static_call(kvm_x86_tlb_flush_gva)(vcpu, gva);
+		static_call(kvm_x86_flush_tlb_gva)(vcpu, gva);
 
 	++vcpu->stat.invlpg;

@@ -5527,8 +5564,8 @@ static int __kvm_mmu_create(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
 	struct page *page;
 	int i;
 
-	mmu->root_hpa = INVALID_PAGE;
-	mmu->root_pgd = 0;
+	mmu->root.hpa = INVALID_PAGE;
+	mmu->root.pgd = 0;
 	for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
 		mmu->prev_roots[i] = KVM_MMU_ROOT_INFO_INVALID;
@@ -5648,9 +5685,13 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm)
 	}
 
 	/*
-	 * Trigger a remote TLB flush before freeing the page tables to ensure
-	 * KVM is not in the middle of a lockless shadow page table walk, which
-	 * may reference the pages.
+	 * Kick all vCPUs (via remote TLB flush) before freeing the page tables
+	 * to ensure KVM is not in the middle of a lockless shadow page table
+	 * walk, which may reference the pages.  The remote TLB flush itself is
+	 * not required and is simply a convenient way to kick vCPUs as needed.
+	 * KVM performs a local TLB flush when allocating a new root (see
+	 * kvm_mmu_load()), and the reload in the caller ensure no vCPUs are
+	 * running with an obsolete MMU.
 	 */
 	kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
 }

@@ -5680,11 +5721,11 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
 	 */
 	kvm->arch.mmu_valid_gen = kvm->arch.mmu_valid_gen ? 0 : 1;
 
-	/* In order to ensure all threads see this change when
-	 * handling the MMU reload signal, this must happen in the
-	 * same critical section as kvm_reload_remote_mmus, and
-	 * before kvm_zap_obsolete_pages as kvm_zap_obsolete_pages
-	 * could drop the MMU lock and yield.
+	/*
+	 * In order to ensure all vCPUs drop their soon-to-be invalid roots,
+	 * invalidating TDP MMU roots must be done while holding mmu_lock for
+	 * write and in the same critical section as making the reload request,
+	 * e.g. before kvm_zap_obsolete_pages() could drop mmu_lock and yield.
 	 */
 	if (is_tdp_mmu_enabled(kvm))
 		kvm_tdp_mmu_invalidate_all_roots(kvm);

@@ -5697,17 +5738,22 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
 	 * Note: we need to do this under the protection of mmu_lock,
 	 * otherwise, vcpu would purge shadow page but miss tlb flush.
 	 */
-	kvm_reload_remote_mmus(kvm);
+	kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_FREE_OBSOLETE_ROOTS);
 
 	kvm_zap_obsolete_pages(kvm);
 
 	write_unlock(&kvm->mmu_lock);
 
-	if (is_tdp_mmu_enabled(kvm)) {
-		read_lock(&kvm->mmu_lock);
+	/*
+	 * Zap the invalidated TDP MMU roots, all SPTEs must be dropped before
+	 * returning to the caller, e.g. if the zap is in response to a memslot
+	 * deletion, mmu_notifier callbacks will be unable to reach the SPTEs
+	 * associated with the deleted memslot once the update completes, and
+	 * Deferring the zap until the final reference to the root is put would
+	 * lead to use-after-free.
+	 */
+	if (is_tdp_mmu_enabled(kvm))
 		kvm_tdp_mmu_zap_invalidated_roots(kvm);
-		read_unlock(&kvm->mmu_lock);
-	}
 }
 
 static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm)
@@ -5813,7 +5859,7 @@ static bool slot_rmap_write_protect(struct kvm *kvm,
 				    struct kvm_rmap_head *rmap_head,
 				    const struct kvm_memory_slot *slot)
 {
-	return __rmap_write_protect(kvm, rmap_head, false);
+	return rmap_write_protect(rmap_head, false);
 }
 
 void kvm_mmu_slot_remove_write_access(struct kvm *kvm,

@@ -5857,12 +5903,52 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
 	 * will clear a separate software-only bit (MMU-writable) and skip the
 	 * flush if-and-only-if this bit was already clear.
 	 *
-	 * See DEFAULT_SPTE_MMU_WRITEABLE for more details.
+	 * See is_writable_pte() for more details.
 	 */
 	if (flush)
 		kvm_arch_flush_remote_tlbs_memslot(kvm, memslot);
 }
 
+/* Must be called with the mmu_lock held in write-mode. */
+void kvm_mmu_try_split_huge_pages(struct kvm *kvm,
+				  const struct kvm_memory_slot *memslot,
+				  u64 start, u64 end,
+				  int target_level)
+{
+	if (is_tdp_mmu_enabled(kvm))
+		kvm_tdp_mmu_try_split_huge_pages(kvm, memslot, start, end,
+						 target_level, false);
+
+	/*
+	 * A TLB flush is unnecessary at this point for the same reasons as in
+	 * kvm_mmu_slot_try_split_huge_pages().
+	 */
+}
+
+void kvm_mmu_slot_try_split_huge_pages(struct kvm *kvm,
+				       const struct kvm_memory_slot *memslot,
+				       int target_level)
+{
+	u64 start = memslot->base_gfn;
+	u64 end = start + memslot->npages;
+
+	if (is_tdp_mmu_enabled(kvm)) {
+		read_lock(&kvm->mmu_lock);
+		kvm_tdp_mmu_try_split_huge_pages(kvm, memslot, start, end, target_level, true);
+		read_unlock(&kvm->mmu_lock);
+	}
+
+	/*
+	 * No TLB flush is necessary here. KVM will flush TLBs after
+	 * write-protecting and/or clearing dirty on the newly split SPTEs to
+	 * ensure that guest writes are reflected in the dirty log before the
+	 * ioctl to enable dirty logging on this memslot completes. Since the
+	 * split SPTEs retain the write and dirty bits of the huge SPTE, it is
+	 * safe for KVM to decide if a TLB flush is necessary based on the split
+	 * SPTEs.
+	 */
+}
+
 static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
 					 struct kvm_rmap_head *rmap_head,
 					 const struct kvm_memory_slot *slot)
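As an illustrative aside (user-space C, not KVM code; the 4KiB pages and 512-entry tables are the demo's own assumptions mirroring x86), the sketch below counts how many huge mappings a memslot-sized range contains at each level, which is the amount of work eager splitting takes on up front instead of paying it on guest write faults:

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12
/* x86-style levels for the demo: 1 = 4KiB, 2 = 2MiB, 3 = 1GiB; 512 entries per table. */
static uint64_t pages_per_hpage(int level)
{
	return 1ULL << ((level - 1) * 9);
}

int main(void)
{
	/* A hypothetical memslot: 4 GiB of guest memory starting at gfn 0. */
	uint64_t base_gfn = 0;
	uint64_t npages = (4ULL << 30) >> PAGE_SHIFT;
	uint64_t start = base_gfn, end = base_gfn + npages;

	for (int level = 3; level >= 2; level--) {
		/* Each huge mapping at this level splits into 512 next-level entries. */
		uint64_t huge_mappings = (end - start) / pages_per_hpage(level);
		printf("level %d: %llu huge mappings, each split into 512 level-%d entries\n",
		       level, (unsigned long long)huge_mappings, level - 1);
	}
	return 0;
}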
@@ -6202,7 +6288,6 @@ void kvm_mmu_module_exit(void)
 	mmu_destroy_caches();
 	percpu_counter_destroy(&kvm_total_used_mmu_pages);
 	unregister_shrinker(&mmu_shrinker);
-	mmu_audit_disable();
 }
 
 /*
@@ -6272,6 +6357,13 @@ static void kvm_recover_nx_lpages(struct kvm *kvm)
 	rcu_idx = srcu_read_lock(&kvm->srcu);
 	write_lock(&kvm->mmu_lock);
 
+	/*
+	 * Zapping TDP MMU shadow pages, including the remote TLB flush, must
+	 * be done under RCU protection, because the pages are freed via RCU
+	 * callback.
+	 */
+	rcu_read_lock();
+
 	ratio = READ_ONCE(nx_huge_pages_recovery_ratio);
 	to_zap = ratio ? DIV_ROUND_UP(nx_lpage_splits, ratio) : 0;
 	for ( ; to_zap; --to_zap) {

@@ -6296,12 +6388,18 @@ static void kvm_recover_nx_lpages(struct kvm *kvm)
 
 		if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) {
 			kvm_mmu_remote_flush_or_zap(kvm, &invalid_list, flush);
+			rcu_read_unlock();
+
 			cond_resched_rwlock_write(&kvm->mmu_lock);
 			flush = false;
+
+			rcu_read_lock();
 		}
 	}
 	kvm_mmu_remote_flush_or_zap(kvm, &invalid_list, flush);
 
+	rcu_read_unlock();
+
 	write_unlock(&kvm->mmu_lock);
 	srcu_read_unlock(&kvm->srcu, rcu_idx);
 }
@@ -1,303 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * mmu_audit.c:
- *
- * Audit code for KVM MMU
- *
- * Copyright (C) 2006 Qumranet, Inc.
- * Copyright 2010 Red Hat, Inc. and/or its affiliates.
- *
- * Authors:
- *   Yaniv Kamay  <yaniv@qumranet.com>
- *   Avi Kivity   <avi@qumranet.com>
- *   Marcelo Tosatti <mtosatti@redhat.com>
- *   Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
- */
-
-#include <linux/ratelimit.h>
-
-static char const *audit_point_name[] = {
-	"pre page fault",
-	"post page fault",
-	"pre pte write",
-	"post pte write",
-	"pre sync",
-	"post sync"
-};
-
-#define audit_printk(kvm, fmt, args...)		\
-	printk(KERN_ERR "audit: (%s) error: "	\
-		fmt, audit_point_name[kvm->arch.audit_point], ##args)
-
-typedef void (*inspect_spte_fn) (struct kvm_vcpu *vcpu, u64 *sptep, int level);
-
-static void __mmu_spte_walk(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
-			    inspect_spte_fn fn, int level)
-{
-	int i;
-
-	for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
-		u64 *ent = sp->spt;
-
-		fn(vcpu, ent + i, level);
-
-		if (is_shadow_present_pte(ent[i]) &&
-		    !is_last_spte(ent[i], level)) {
-			struct kvm_mmu_page *child;
-
-			child = to_shadow_page(ent[i] & PT64_BASE_ADDR_MASK);
-			__mmu_spte_walk(vcpu, child, fn, level - 1);
-		}
-	}
-}
-
-static void mmu_spte_walk(struct kvm_vcpu *vcpu, inspect_spte_fn fn)
-{
-	int i;
-	struct kvm_mmu_page *sp;
-
-	if (!VALID_PAGE(vcpu->arch.mmu->root_hpa))
-		return;
-
-	if (vcpu->arch.mmu->root_level >= PT64_ROOT_4LEVEL) {
-		hpa_t root = vcpu->arch.mmu->root_hpa;
-
-		sp = to_shadow_page(root);
-		__mmu_spte_walk(vcpu, sp, fn, vcpu->arch.mmu->root_level);
-		return;
-	}
-
-	for (i = 0; i < 4; ++i) {
-		hpa_t root = vcpu->arch.mmu->pae_root[i];
-
-		if (IS_VALID_PAE_ROOT(root)) {
-			root &= PT64_BASE_ADDR_MASK;
-			sp = to_shadow_page(root);
-			__mmu_spte_walk(vcpu, sp, fn, 2);
-		}
-	}
-
-	return;
-}
-
-typedef void (*sp_handler) (struct kvm *kvm, struct kvm_mmu_page *sp);
-
-static void walk_all_active_sps(struct kvm *kvm, sp_handler fn)
-{
-	struct kvm_mmu_page *sp;
-
-	list_for_each_entry(sp, &kvm->arch.active_mmu_pages, link)
-		fn(kvm, sp);
-}
-
-static void audit_mappings(struct kvm_vcpu *vcpu, u64 *sptep, int level)
-{
-	struct kvm_mmu_page *sp;
-	gfn_t gfn;
-	kvm_pfn_t pfn;
-	hpa_t hpa;
-
-	sp = sptep_to_sp(sptep);
-
-	if (sp->unsync) {
-		if (level != PG_LEVEL_4K) {
-			audit_printk(vcpu->kvm, "unsync sp: %p "
-				     "level = %d\n", sp, level);
-			return;
-		}
-	}
-
-	if (!is_shadow_present_pte(*sptep) || !is_last_spte(*sptep, level))
-		return;
-
-	gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
-	pfn = kvm_vcpu_gfn_to_pfn_atomic(vcpu, gfn);
-
-	if (is_error_pfn(pfn))
-		return;
-
-	hpa = pfn << PAGE_SHIFT;
-	if ((*sptep & PT64_BASE_ADDR_MASK) != hpa)
-		audit_printk(vcpu->kvm, "levels %d pfn %llx hpa %llx "
-			     "ent %llxn", vcpu->arch.mmu->root_level, pfn,
-			     hpa, *sptep);
-}
-
-static void inspect_spte_has_rmap(struct kvm *kvm, u64 *sptep)
-{
-	static DEFINE_RATELIMIT_STATE(ratelimit_state, 5 * HZ, 10);
-	struct kvm_rmap_head *rmap_head;
-	struct kvm_mmu_page *rev_sp;
-	struct kvm_memslots *slots;
-	struct kvm_memory_slot *slot;
-	gfn_t gfn;
-
-	rev_sp = sptep_to_sp(sptep);
-	gfn = kvm_mmu_page_get_gfn(rev_sp, sptep - rev_sp->spt);
-
-	slots = kvm_memslots_for_spte_role(kvm, rev_sp->role);
-	slot = __gfn_to_memslot(slots, gfn);
-	if (!slot) {
-		if (!__ratelimit(&ratelimit_state))
-			return;
-		audit_printk(kvm, "no memslot for gfn %llx\n", gfn);
-		audit_printk(kvm, "index %ld of sp (gfn=%llx)\n",
-			     (long int)(sptep - rev_sp->spt), rev_sp->gfn);
-		dump_stack();
-		return;
-	}
-
-	rmap_head = gfn_to_rmap(gfn, rev_sp->role.level, slot);
-	if (!rmap_head->val) {
-		if (!__ratelimit(&ratelimit_state))
-			return;
-		audit_printk(kvm, "no rmap for writable spte %llx\n",
-			     *sptep);
-		dump_stack();
-	}
-}
-
-static void audit_sptes_have_rmaps(struct kvm_vcpu *vcpu, u64 *sptep, int level)
-{
-	if (is_shadow_present_pte(*sptep) && is_last_spte(*sptep, level))
-		inspect_spte_has_rmap(vcpu->kvm, sptep);
-}
-
-static void audit_spte_after_sync(struct kvm_vcpu *vcpu, u64 *sptep, int level)
-{
-	struct kvm_mmu_page *sp = sptep_to_sp(sptep);
-
-	if (vcpu->kvm->arch.audit_point == AUDIT_POST_SYNC && sp->unsync)
-		audit_printk(vcpu->kvm, "meet unsync sp(%p) after sync "
-			     "root.\n", sp);
-}
-
-static void check_mappings_rmap(struct kvm *kvm, struct kvm_mmu_page *sp)
-{
-	int i;
-
-	if (sp->role.level != PG_LEVEL_4K)
-		return;
-
-	for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
-		if (!is_shadow_present_pte(sp->spt[i]))
-			continue;
-
-		inspect_spte_has_rmap(kvm, sp->spt + i);
-	}
-}
-
-static void audit_write_protection(struct kvm *kvm, struct kvm_mmu_page *sp)
-{
-	struct kvm_rmap_head *rmap_head;
-	u64 *sptep;
-	struct rmap_iterator iter;
-	struct kvm_memslots *slots;
-	struct kvm_memory_slot *slot;
-
-	if (sp->role.direct || sp->unsync || sp->role.invalid)
-		return;
-
-	slots = kvm_memslots_for_spte_role(kvm, sp->role);
-	slot = __gfn_to_memslot(slots, sp->gfn);
-	rmap_head = gfn_to_rmap(sp->gfn, PG_LEVEL_4K, slot);
-
-	for_each_rmap_spte(rmap_head, &iter, sptep) {
-		if (is_writable_pte(*sptep))
-			audit_printk(kvm, "shadow page has writable "
-				     "mappings: gfn %llx role %x\n",
-				     sp->gfn, sp->role.word);
-	}
-}
-
-static void audit_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
-{
-	check_mappings_rmap(kvm, sp);
-	audit_write_protection(kvm, sp);
-}
-
-static void audit_all_active_sps(struct kvm *kvm)
-{
-	walk_all_active_sps(kvm, audit_sp);
-}
-
-static void audit_spte(struct kvm_vcpu *vcpu, u64 *sptep, int level)
-{
-	audit_sptes_have_rmaps(vcpu, sptep, level);
-	audit_mappings(vcpu, sptep, level);
-	audit_spte_after_sync(vcpu, sptep, level);
-}
-
-static void audit_vcpu_spte(struct kvm_vcpu *vcpu)
-{
-	mmu_spte_walk(vcpu, audit_spte);
-}
-
-static bool mmu_audit;
-static DEFINE_STATIC_KEY_FALSE(mmu_audit_key);
-
-static void __kvm_mmu_audit(struct kvm_vcpu *vcpu, int point)
-{
-	static DEFINE_RATELIMIT_STATE(ratelimit_state, 5 * HZ, 10);
-
-	if (!__ratelimit(&ratelimit_state))
-		return;
-
-	vcpu->kvm->arch.audit_point = point;
-	audit_all_active_sps(vcpu->kvm);
-	audit_vcpu_spte(vcpu);
-}
-
-static inline void kvm_mmu_audit(struct kvm_vcpu *vcpu, int point)
-{
-	if (static_branch_unlikely((&mmu_audit_key)))
-		__kvm_mmu_audit(vcpu, point);
-}
-
-static void mmu_audit_enable(void)
-{
-	if (mmu_audit)
-		return;
-
-	static_branch_inc(&mmu_audit_key);
-	mmu_audit = true;
-}
-
-static void mmu_audit_disable(void)
-{
-	if (!mmu_audit)
-		return;
-
-	static_branch_dec(&mmu_audit_key);
-	mmu_audit = false;
-}
-
-static int mmu_audit_set(const char *val, const struct kernel_param *kp)
-{
-	int ret;
-	unsigned long enable;
-
-	ret = kstrtoul(val, 10, &enable);
-	if (ret < 0)
-		return -EINVAL;
-
-	switch (enable) {
-	case 0:
-		mmu_audit_disable();
-		break;
-	case 1:
-		mmu_audit_enable();
-		break;
-	default:
-		return -EINVAL;
-	}
-
-	return 0;
-}
-
-static const struct kernel_param_ops audit_param_ops = {
-	.set = mmu_audit_set,
-	.get = param_get_bool,
-};
-
-arch_param_cb(mmu_audit, &audit_param_ops, &mmu_audit, 0644);
@@ -30,6 +30,8 @@ extern bool dbg;
 #define INVALID_PAE_ROOT	0
 #define IS_VALID_PAE_ROOT(x)	(!!(x))
 
+typedef u64 __rcu *tdp_ptep_t;
+
 struct kvm_mmu_page {
 	/*
 	 * Note, "link" through "spt" fit in a single 64 byte cache line on

@@ -59,8 +61,17 @@ struct kvm_mmu_page {
 		refcount_t tdp_mmu_root_count;
 	};
 	unsigned int unsync_children;
-	struct kvm_rmap_head parent_ptes; /* rmap pointers to parent sptes */
-	DECLARE_BITMAP(unsync_child_bitmap, 512);
+	union {
+		struct kvm_rmap_head parent_ptes; /* rmap pointers to parent sptes */
+		tdp_ptep_t ptep;
+	};
+	union {
+		DECLARE_BITMAP(unsync_child_bitmap, 512);
+		struct {
+			struct work_struct tdp_mmu_async_work;
+			void *tdp_mmu_async_data;
+		};
+	};
 
 	struct list_head lpage_disallowed_link;
 #ifdef CONFIG_X86_32
@@ -416,6 +416,29 @@ TRACE_EVENT(
 	)
 );
 
+TRACE_EVENT(
+	kvm_mmu_split_huge_page,
+	TP_PROTO(u64 gfn, u64 spte, int level, int errno),
+	TP_ARGS(gfn, spte, level, errno),
+
+	TP_STRUCT__entry(
+		__field(u64, gfn)
+		__field(u64, spte)
+		__field(int, level)
+		__field(int, errno)
+	),
+
+	TP_fast_assign(
+		__entry->gfn = gfn;
+		__entry->spte = spte;
+		__entry->level = level;
+		__entry->errno = errno;
+	),
+
+	TP_printk("gfn %llx spte %llx level %d errno %d",
+		  __entry->gfn, __entry->spte, __entry->level, __entry->errno)
+);
+
 #endif /* _TRACE_KVMMMU_H */
 
 #undef TRACE_INCLUDE_PATH
@@ -47,8 +47,8 @@ int kvm_page_track_create_memslot(struct kvm *kvm,
 			continue;
 
 		slot->arch.gfn_track[i] =
-			kvcalloc(npages, sizeof(*slot->arch.gfn_track[i]),
+			__vcalloc(npages, sizeof(*slot->arch.gfn_track[i]),
 				  GFP_KERNEL_ACCOUNT);
 		if (!slot->arch.gfn_track[i])
 			goto track_free;
 	}

@@ -75,7 +75,8 @@ int kvm_page_track_write_tracking_alloc(struct kvm_memory_slot *slot)
 	if (slot->arch.gfn_track[KVM_PAGE_TRACK_WRITE])
 		return 0;
 
-	gfn_track = kvcalloc(slot->npages, sizeof(*gfn_track), GFP_KERNEL_ACCOUNT);
+	gfn_track = __vcalloc(slot->npages, sizeof(*gfn_track),
+			      GFP_KERNEL_ACCOUNT);
 	if (gfn_track == NULL)
 		return -ENOMEM;
@@ -668,7 +668,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 	if (FNAME(gpte_changed)(vcpu, gw, top_level))
 		goto out_gpte_changed;
 
-	if (WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root_hpa)))
+	if (WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root.hpa)))
 		goto out_gpte_changed;
 
 	for (shadow_walk_init(&it, vcpu, fault->addr);

@@ -904,12 +904,10 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	if (is_page_fault_stale(vcpu, fault, mmu_seq))
 		goto out_unlock;
 
-	kvm_mmu_audit(vcpu, AUDIT_PRE_PAGE_FAULT);
 	r = make_mmu_pages_available(vcpu);
 	if (r)
 		goto out_unlock;
 	r = FNAME(fetch)(vcpu, fault, &walker);
-	kvm_mmu_audit(vcpu, AUDIT_POST_PAGE_FAULT);
 
 out_unlock:
 	write_unlock(&vcpu->kvm->mmu_lock);
@@ -192,6 +192,65 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	return wrprot;
 }
 
+static u64 make_spte_executable(u64 spte)
+{
+	bool is_access_track = is_access_track_spte(spte);
+
+	if (is_access_track)
+		spte = restore_acc_track_spte(spte);
+
+	spte &= ~shadow_nx_mask;
+	spte |= shadow_x_mask;
+
+	if (is_access_track)
+		spte = mark_spte_for_access_track(spte);
+
+	return spte;
+}
+
+/*
+ * Construct an SPTE that maps a sub-page of the given huge page SPTE where
+ * `index` identifies which sub-page.
+ *
+ * This is used during huge page splitting to build the SPTEs that make up the
+ * new page table.
+ */
+u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index)
+{
+	u64 child_spte;
+	int child_level;
+
+	if (WARN_ON_ONCE(!is_shadow_present_pte(huge_spte)))
+		return 0;
+
+	if (WARN_ON_ONCE(!is_large_pte(huge_spte)))
+		return 0;
+
+	child_spte = huge_spte;
+	child_level = huge_level - 1;
+
+	/*
+	 * The child_spte already has the base address of the huge page being
+	 * split. So we just have to OR in the offset to the page at the next
+	 * lower level for the given index.
+	 */
+	child_spte |= (index * KVM_PAGES_PER_HPAGE(child_level)) << PAGE_SHIFT;
+
+	if (child_level == PG_LEVEL_4K) {
+		child_spte &= ~PT_PAGE_SIZE_MASK;
+
+		/*
+		 * When splitting to a 4K page, mark the page executable as the
+		 * NX hugepage mitigation no longer applies.
+		 */
+		if (is_nx_huge_page_enabled())
+			child_spte = make_spte_executable(child_spte);
+	}
+
+	return child_spte;
+}
+
 u64 make_nonleaf_spte(u64 *child_pt, bool ad_disabled)
 {
 	u64 spte = SPTE_MMU_PRESENT_MASK;

@@ -250,14 +309,7 @@ u64 mark_spte_for_access_track(u64 spte)
 	if (is_access_track_spte(spte))
 		return spte;
 
-	/*
-	 * Making an Access Tracking PTE will result in removal of write access
-	 * from the PTE. So, verify that we will be able to restore the write
-	 * access in the fast page fault path later on.
-	 */
-	WARN_ONCE((spte & PT_WRITABLE_MASK) &&
-		  !spte_can_locklessly_be_made_writable(spte),
-		  "kvm: Writable SPTE is not locklessly dirty-trackable\n");
+	check_spte_writable_invariants(spte);
 
 	WARN_ONCE(spte & (SHADOW_ACC_TRACK_SAVED_BITS_MASK <<
 			  SHADOW_ACC_TRACK_SAVED_BITS_SHIFT),

@@ -368,8 +420,8 @@ void kvm_mmu_reset_all_pte_masks(void)
 	shadow_acc_track_mask	= 0;
 	shadow_me_mask		= sme_me_mask;
 
-	shadow_host_writable_mask = DEFAULT_SPTE_HOST_WRITEABLE;
-	shadow_mmu_writable_mask  = DEFAULT_SPTE_MMU_WRITEABLE;
+	shadow_host_writable_mask = DEFAULT_SPTE_HOST_WRITABLE;
+	shadow_mmu_writable_mask  = DEFAULT_SPTE_MMU_WRITABLE;
 
 	/*
 	 * Set a reserved PA bit in MMIO SPTEs to generate page faults with
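To make the offset arithmetic in make_huge_page_split_spte() concrete, here is a stand-alone sketch (user-space C, not KVM code; the page size and table fan-out are assumed) that prints the address contribution OR'ed into each child SPTE when a 2MiB mapping is split into 4KiB entries:

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12
/* Demo levels: 1 = 4KiB, 2 = 2MiB; 512 entries per table (assumed). */
static uint64_t pages_per_hpage(int level)
{
	return 1ULL << ((level - 1) * 9);
}

int main(void)
{
	int huge_level = 2;			/* splitting a 2MiB mapping */
	int child_level = huge_level - 1;	/* ...into 4KiB children */

	for (int index = 0; index < 4; index++) {
		/* Offset OR'ed into the child SPTE's physical-address bits. */
		uint64_t offset = (uint64_t)index * pages_per_hpage(child_level) << PAGE_SHIFT;
		printf("index %d -> offset 0x%llx\n", index, (unsigned long long)offset);
	}
	return 0;
}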
@@ -75,33 +75,13 @@ static_assert(SPTE_TDP_AD_ENABLED_MASK == 0);
 static_assert(!(SPTE_TDP_AD_MASK & SHADOW_ACC_TRACK_SAVED_MASK));
 
 /*
- * *_SPTE_HOST_WRITEABLE (aka Host-writable) indicates whether the host permits
- * writes to the guest page mapped by the SPTE. This bit is cleared on SPTEs
- * that map guest pages in read-only memslots and read-only VMAs.
- *
- * Invariants:
- *  - If Host-writable is clear, PT_WRITABLE_MASK must be clear.
- *
- *
- * *_SPTE_MMU_WRITEABLE (aka MMU-writable) indicates whether the shadow MMU
- * allows writes to the guest page mapped by the SPTE. This bit is cleared when
- * the guest page mapped by the SPTE contains a page table that is being
- * monitored for shadow paging. In this case the SPTE can only be made writable
- * by unsyncing the shadow page under the mmu_lock.
- *
- * Invariants:
- *  - If MMU-writable is clear, PT_WRITABLE_MASK must be clear.
- *  - If MMU-writable is set, Host-writable must be set.
- *
- * If MMU-writable is set, PT_WRITABLE_MASK is normally set but can be cleared
- * to track writes for dirty logging. For such SPTEs, KVM will locklessly set
- * PT_WRITABLE_MASK upon the next write from the guest and record the write in
- * the dirty log (see fast_page_fault()).
+ * {DEFAULT,EPT}_SPTE_{HOST,MMU}_WRITABLE are used to keep track of why a given
+ * SPTE is write-protected. See is_writable_pte() for details.
  */
 
 /* Bits 9 and 10 are ignored by all non-EPT PTEs. */
-#define DEFAULT_SPTE_HOST_WRITEABLE	BIT_ULL(9)
-#define DEFAULT_SPTE_MMU_WRITEABLE	BIT_ULL(10)
+#define DEFAULT_SPTE_HOST_WRITABLE	BIT_ULL(9)
+#define DEFAULT_SPTE_MMU_WRITABLE	BIT_ULL(10)
 
 /*
  * Low ignored bits are at a premium for EPT, use high ignored bits, taking care

@@ -339,15 +319,86 @@ static __always_inline bool is_rsvd_spte(struct rsvd_bits_validate *rsvd_check,
 		__is_rsvd_bits_set(rsvd_check, spte, level);
 }
 
+/*
+ * A shadow-present leaf SPTE may be non-writable for 3 possible reasons:
+ *
+ *  1. To intercept writes for dirty logging. KVM write-protects huge pages
+ *     so that they can be split down into the dirty logging
+ *     granularity (4KiB) whenever the guest writes to them. KVM also
+ *     write-protects 4KiB pages so that writes can be recorded in the dirty log
+ *     (e.g. if not using PML). SPTEs are write-protected for dirty logging
+ *     during the VM-ioctls that enable dirty logging.
+ *
+ *  2. To intercept writes to guest page tables that KVM is shadowing. When a
+ *     guest writes to its page table the corresponding shadow page table will
+ *     be marked "unsync". That way KVM knows which shadow page tables need to
+ *     be updated on the next TLB flush, INVLPG, etc. and which do not.
+ *
+ *  3. To prevent guest writes to read-only memory, such as for memory in a
+ *     read-only memslot or guest memory backed by a read-only VMA. Writes to
+ *     such pages are disallowed entirely.
+ *
+ * To keep track of why a given SPTE is write-protected, KVM uses 2
+ * software-only bits in the SPTE:
+ *
+ *  shadow_mmu_writable_mask, aka MMU-writable -
+ *    Cleared on SPTEs that KVM is currently write-protecting for shadow paging
+ *    purposes (case 2 above).
+ *
+ *  shadow_host_writable_mask, aka Host-writable -
+ *    Cleared on SPTEs that are not host-writable (case 3 above)
+ *
+ * Note, not all possible combinations of PT_WRITABLE_MASK,
+ * shadow_mmu_writable_mask, and shadow_host_writable_mask are valid. A given
+ * SPTE can be in only one of the following states, which map to the
+ * aforementioned 3 cases:
+ *
+ *   shadow_host_writable_mask | shadow_mmu_writable_mask | PT_WRITABLE_MASK
+ *   ------------------------- | ------------------------ | ----------------
+ *   1                         | 1                        | 1 (writable)
+ *   1                         | 1                        | 0 (case 1)
+ *   1                         | 0                        | 0 (case 2)
+ *   0                         | 0                        | 0 (case 3)
+ *
+ * The valid combinations of these bits are checked by
+ * check_spte_writable_invariants() whenever an SPTE is modified.
+ *
+ * Clearing the MMU-writable bit is always done under the MMU lock and always
+ * accompanied by a TLB flush before dropping the lock to avoid corrupting the
+ * shadow page tables between vCPUs. Write-protecting an SPTE for dirty logging
+ * (which does not clear the MMU-writable bit), does not flush TLBs before
+ * dropping the lock, as it only needs to synchronize guest writes with the
+ * dirty bitmap.
+ *
+ * So, there is the problem: clearing the MMU-writable bit can encounter a
+ * write-protected SPTE while CPUs still have writable mappings for that SPTE
+ * cached in their TLB. To address this, KVM always flushes TLBs when
+ * write-protecting SPTEs if the MMU-writable bit is set on the old SPTE.
+ *
+ * The Host-writable bit is not modified on present SPTEs, it is only set or
+ * cleared when an SPTE is first faulted in from non-present and then remains
+ * immutable.
+ */
+static inline bool is_writable_pte(unsigned long pte)
+{
+	return pte & PT_WRITABLE_MASK;
+}
+
+/* Note: spte must be a shadow-present leaf SPTE. */
+static inline void check_spte_writable_invariants(u64 spte)
+{
+	if (spte & shadow_mmu_writable_mask)
+		WARN_ONCE(!(spte & shadow_host_writable_mask),
+			  "kvm: MMU-writable SPTE is not Host-writable: %llx",
+			  spte);
+	else
+		WARN_ONCE(is_writable_pte(spte),
+			  "kvm: Writable SPTE is not MMU-writable: %llx", spte);
+}
+
 static inline bool spte_can_locklessly_be_made_writable(u64 spte)
 {
-	if (spte & shadow_mmu_writable_mask) {
-		WARN_ON_ONCE(!(spte & shadow_host_writable_mask));
-		return true;
-	}
-
-	WARN_ON_ONCE(spte & PT_WRITABLE_MASK);
-	return false;
+	return spte & shadow_mmu_writable_mask;
 }
 
 static inline u64 get_mmio_spte_generation(u64 spte)

@@ -364,9 +415,25 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	       unsigned int pte_access, gfn_t gfn, kvm_pfn_t pfn,
 	       u64 old_spte, bool prefetch, bool can_unsync,
 	       bool host_writable, u64 *new_spte);
+u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index);
 u64 make_nonleaf_spte(u64 *child_pt, bool ad_disabled);
 u64 make_mmio_spte(struct kvm_vcpu *vcpu, u64 gfn, unsigned int access);
 u64 mark_spte_for_access_track(u64 spte);
+
+/* Restore an acc-track PTE back to a regular PTE */
+static inline u64 restore_acc_track_spte(u64 spte)
+{
+	u64 saved_bits = (spte >> SHADOW_ACC_TRACK_SAVED_BITS_SHIFT)
+			 & SHADOW_ACC_TRACK_SAVED_BITS_MASK;
+
+	spte &= ~shadow_acc_track_mask;
+	spte &= ~(SHADOW_ACC_TRACK_SAVED_BITS_MASK <<
+		  SHADOW_ACC_TRACK_SAVED_BITS_SHIFT);
+	spte |= saved_bits;
+
+	return spte;
+}
+
 u64 kvm_mmu_changed_pte_notifier_make_spte(u64 old_spte, kvm_pfn_t new_pfn);
 
 void kvm_mmu_reset_all_pte_masks(void);
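The state table above can be sanity-checked with a small stand-alone model (user-space C, not KVM code; the bit positions below are the demo's own assumptions, not KVM's real masks) that rejects exactly the combinations the comment rules out:

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Demo-only bit positions. */
#define DEMO_HOST_WRITABLE	(1ULL << 9)
#define DEMO_MMU_WRITABLE	(1ULL << 10)
#define DEMO_PT_WRITABLE	(1ULL << 1)

/* Mirrors the invariants: MMU-writable implies Host-writable,
 * and a writable PTE must be MMU-writable. */
static bool writable_state_is_valid(uint64_t spte)
{
	if ((spte & DEMO_MMU_WRITABLE) && !(spte & DEMO_HOST_WRITABLE))
		return false;
	if ((spte & DEMO_PT_WRITABLE) && !(spte & DEMO_MMU_WRITABLE))
		return false;
	return true;
}

int main(void)
{
	/* The four legal rows of the table. */
	assert(writable_state_is_valid(DEMO_HOST_WRITABLE | DEMO_MMU_WRITABLE | DEMO_PT_WRITABLE));
	assert(writable_state_is_valid(DEMO_HOST_WRITABLE | DEMO_MMU_WRITABLE));	/* case 1: dirty logging */
	assert(writable_state_is_valid(DEMO_HOST_WRITABLE));				/* case 2: shadow paging */
	assert(writable_state_is_valid(0));						/* case 3: read-only memory */

	/* Combinations the comment forbids. */
	assert(!writable_state_is_valid(DEMO_PT_WRITABLE));
	assert(!writable_state_is_valid(DEMO_MMU_WRITABLE));

	printf("all writability-state checks passed\n");
	return 0;
}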
@@ -12,7 +12,7 @@ static void tdp_iter_refresh_sptep(struct tdp_iter *iter)
 {
 	iter->sptep = iter->pt_path[iter->level - 1] +
 		SHADOW_PT_INDEX(iter->gfn << PAGE_SHIFT, iter->level);
-	iter->old_spte = READ_ONCE(*rcu_dereference(iter->sptep));
+	iter->old_spte = kvm_tdp_mmu_read_spte(iter->sptep);
 }
 
 static gfn_t round_gfn_for_level(gfn_t gfn, int level)

@@ -40,17 +40,19 @@ void tdp_iter_restart(struct tdp_iter *iter)
  * Sets a TDP iterator to walk a pre-order traversal of the paging structure
  * rooted at root_pt, starting with the walk to translate next_last_level_gfn.
  */
-void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level,
+void tdp_iter_start(struct tdp_iter *iter, struct kvm_mmu_page *root,
 		    int min_level, gfn_t next_last_level_gfn)
 {
+	int root_level = root->role.level;
+
 	WARN_ON(root_level < 1);
 	WARN_ON(root_level > PT64_ROOT_MAX_LEVEL);
 
 	iter->next_last_level_gfn = next_last_level_gfn;
 	iter->root_level = root_level;
 	iter->min_level = min_level;
-	iter->pt_path[iter->root_level - 1] = (tdp_ptep_t)root_pt;
-	iter->as_id = kvm_mmu_page_as_id(sptep_to_sp(root_pt));
+	iter->pt_path[iter->root_level - 1] = (tdp_ptep_t)root->spt;
+	iter->as_id = kvm_mmu_page_as_id(root);
 
 	tdp_iter_restart(iter);
 }

@@ -87,7 +89,7 @@ static bool try_step_down(struct tdp_iter *iter)
 	 * Reread the SPTE before stepping down to avoid traversing into page
 	 * tables that are no longer linked from this entry.
 	 */
-	iter->old_spte = READ_ONCE(*rcu_dereference(iter->sptep));
+	iter->old_spte = kvm_tdp_mmu_read_spte(iter->sptep);
 
 	child_pt = spte_to_child_pt(iter->old_spte, iter->level);
 	if (!child_pt)

@@ -121,7 +123,7 @@ static bool try_step_side(struct tdp_iter *iter)
 	iter->gfn += KVM_PAGES_PER_HPAGE(iter->level);
 	iter->next_last_level_gfn = iter->gfn;
 	iter->sptep++;
-	iter->old_spte = READ_ONCE(*rcu_dereference(iter->sptep));
+	iter->old_spte = kvm_tdp_mmu_read_spte(iter->sptep);
 
 	return true;
 }
@@ -7,7 +7,20 @@
 
 #include "mmu.h"
 
-typedef u64 __rcu *tdp_ptep_t;
+/*
+ * TDP MMU SPTEs are RCU protected to allow paging structures (non-leaf SPTEs)
+ * to be zapped while holding mmu_lock for read, and to allow TLB flushes to be
+ * batched without having to collect the list of zapped SPs.  Flows that can
+ * remove SPs must service pending TLB flushes prior to dropping RCU protection.
+ */
+static inline u64 kvm_tdp_mmu_read_spte(tdp_ptep_t sptep)
+{
+	return READ_ONCE(*rcu_dereference(sptep));
+}
+static inline void kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 val)
+{
+	WRITE_ONCE(*rcu_dereference(sptep), val);
+}
 
 /*
  * A TDP iterator performs a pre-order walk over a TDP paging structure.

@@ -57,17 +70,17 @@ struct tdp_iter {
  * Iterates over every SPTE mapping the GFN range [start, end) in a
  * preorder traversal.
  */
-#define for_each_tdp_pte_min_level(iter, root, root_level, min_level, start, end) \
-	for (tdp_iter_start(&iter, root, root_level, min_level, start); \
+#define for_each_tdp_pte_min_level(iter, root, min_level, start, end) \
+	for (tdp_iter_start(&iter, root, min_level, start); \
 	     iter.valid && iter.gfn < end;		     \
 	     tdp_iter_next(&iter))
 
-#define for_each_tdp_pte(iter, root, root_level, start, end) \
-	for_each_tdp_pte_min_level(iter, root, root_level, PG_LEVEL_4K, start, end)
+#define for_each_tdp_pte(iter, root, start, end) \
+	for_each_tdp_pte_min_level(iter, root, PG_LEVEL_4K, start, end)
 
 tdp_ptep_t spte_to_child_pt(u64 pte, int level);
 
-void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level,
+void tdp_iter_start(struct tdp_iter *iter, struct kvm_mmu_page *root,
 		    int min_level, gfn_t next_last_level_gfn);
 void tdp_iter_next(struct tdp_iter *iter);
 void tdp_iter_restart(struct tdp_iter *iter);

(File diff suppressed because it is too large.)
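The read/write helpers above amount to single-copy-atomic accesses to an SPTE slot performed under RCU protection. A rough user-space analogue (C11 atomics, demo-only names and types; the RCU lifetime guarantees themselves are not modeled here) looks like this:

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Demo stand-in for a page-table slot; in KVM this is a u64 __rcu *tdp_ptep_t. */
typedef _Atomic uint64_t demo_ptep_t;

static inline uint64_t demo_read_spte(demo_ptep_t *sptep)
{
	/* Single-copy-atomic read, like READ_ONCE() on the dereferenced slot. */
	return atomic_load_explicit(sptep, memory_order_relaxed);
}

static inline void demo_write_spte(demo_ptep_t *sptep, uint64_t val)
{
	/* Single-copy-atomic write, like WRITE_ONCE(). */
	atomic_store_explicit(sptep, val, memory_order_relaxed);
}

int main(void)
{
	demo_ptep_t slot = 0;

	demo_write_spte(&slot, 0xdeadbeefULL);
	printf("spte = 0x%llx\n", (unsigned long long)demo_read_spte(&slot));
	return 0;
}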
@@ -7,12 +7,8 @@
 
 hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu);
 
-__must_check static inline bool kvm_tdp_mmu_get_root(struct kvm *kvm,
-						     struct kvm_mmu_page *root)
+__must_check static inline bool kvm_tdp_mmu_get_root(struct kvm_mmu_page *root)
 {
-	if (root->role.invalid)
-		return false;
-
 	return refcount_inc_not_zero(&root->tdp_mmu_root_count);
 }

@@ -26,24 +22,8 @@ static inline bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, int as_id,
 {
 	return __kvm_tdp_mmu_zap_gfn_range(kvm, as_id, start, end, true, flush);
 }
-static inline bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
-{
-	gfn_t end = sp->gfn + KVM_PAGES_PER_HPAGE(sp->role.level + 1);
-
-	/*
-	 * Don't allow yielding, as the caller may have a flush pending.  Note,
-	 * if mmu_lock is held for write, zapping will never yield in this case,
-	 * but explicitly disallow it for safety.  The TDP MMU does not yield
-	 * until it has made forward progress (steps sideways), and when zapping
-	 * a single shadow page that it's guaranteed to see (thus the mmu_lock
-	 * requirement), its "step sideways" will always step beyond the bounds
-	 * of the shadow page's gfn range and stop iterating before yielding.
-	 */
-	lockdep_assert_held_write(&kvm->mmu_lock);
-	return __kvm_tdp_mmu_zap_gfn_range(kvm, kvm_mmu_page_as_id(sp),
-					   sp->gfn, end, false, false);
-}
 
+bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp);
 void kvm_tdp_mmu_zap_all(struct kvm *kvm);
 void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm);
 void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm);

@@ -71,6 +51,11 @@ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm,
 				   struct kvm_memory_slot *slot, gfn_t gfn,
 				   int min_level);
 
+void kvm_tdp_mmu_try_split_huge_pages(struct kvm *kvm,
+				      const struct kvm_memory_slot *slot,
+				      gfn_t start, gfn_t end,
+				      int target_level, bool shared);
+
 static inline void kvm_tdp_mmu_walk_lockless_begin(void)
 {
 	rcu_read_lock();

@@ -94,7 +79,7 @@ static inline bool is_tdp_mmu_page(struct kvm_mmu_page *sp) { return sp->tdp_mmu
 static inline bool is_tdp_mmu(struct kvm_mmu *mmu)
 {
 	struct kvm_mmu_page *sp;
-	hpa_t hpa = mmu->root_hpa;
+	hpa_t hpa = mmu->root.hpa;
 
 	if (WARN_ON(!VALID_PAGE(hpa)))
 		return false;
@@ -324,18 +324,18 @@ int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu)
 	switch (id) {
 	case AVIC_IPI_FAILURE_INVALID_INT_TYPE:
 		/*
-		 * AVIC hardware handles the generation of
-		 * IPIs when the specified Message Type is Fixed
-		 * (also known as fixed delivery mode) and
-		 * the Trigger Mode is edge-triggered. The hardware
-		 * also supports self and broadcast delivery modes
-		 * specified via the Destination Shorthand(DSH)
-		 * field of the ICRL. Logical and physical APIC ID
-		 * formats are supported. All other IPI types cause
-		 * a #VMEXIT, which needs to emulated.
+		 * Emulate IPIs that are not handled by AVIC hardware, which
+		 * only virtualizes Fixed, Edge-Triggered INTRs.  The exit is
+		 * a trap, e.g. ICR holds the correct value and RIP has been
+		 * advanced, KVM is responsible only for emulating the IPI.
+		 * Sadly, hardware may sometimes leave the BUSY flag set, in
+		 * which case KVM needs to emulate the ICR write as well in
+		 * order to clear the BUSY flag.
 		 */
-		kvm_lapic_reg_write(apic, APIC_ICR2, icrh);
-		kvm_lapic_reg_write(apic, APIC_ICR, icrl);
+		if (icrl & APIC_ICR_BUSY)
+			kvm_apic_write_nodecode(vcpu, APIC_ICR);
+		else
+			kvm_apic_send_ipi(apic, icrl, icrh);
 		break;
 	case AVIC_IPI_FAILURE_TARGET_NOT_RUNNING:
 		/*

@@ -477,30 +477,28 @@ static void avic_handle_dfr_update(struct kvm_vcpu *vcpu)
 	svm->dfr_reg = dfr;
 }
 
-static int avic_unaccel_trap_write(struct vcpu_svm *svm)
+static int avic_unaccel_trap_write(struct kvm_vcpu *vcpu)
 {
-	struct kvm_lapic *apic = svm->vcpu.arch.apic;
-	u32 offset = svm->vmcb->control.exit_info_1 &
+	u32 offset = to_svm(vcpu)->vmcb->control.exit_info_1 &
 		     AVIC_UNACCEL_ACCESS_OFFSET_MASK;
 
 	switch (offset) {
 	case APIC_ID:
-		if (avic_handle_apic_id_update(&svm->vcpu))
+		if (avic_handle_apic_id_update(vcpu))
 			return 0;
 		break;
 	case APIC_LDR:
-		if (avic_handle_ldr_update(&svm->vcpu))
+		if (avic_handle_ldr_update(vcpu))
 			return 0;
 		break;
 	case APIC_DFR:
-		avic_handle_dfr_update(&svm->vcpu);
+		avic_handle_dfr_update(vcpu);
 		break;
 	default:
 		break;
 	}
 
-	kvm_lapic_reg_write(apic, offset, kvm_lapic_get_reg(apic, offset));
+	kvm_apic_write_nodecode(vcpu, offset);
 
 	return 1;
 }

@@ -550,7 +548,7 @@ int avic_unaccelerated_access_interception(struct kvm_vcpu *vcpu)
 	if (trap) {
 		/* Handling Trap */
 		WARN_ONCE(!write, "svm: Handling trap read.\n");
-		ret = avic_unaccel_trap_write(svm);
+		ret = avic_unaccel_trap_write(vcpu);
 	} else {
 		/* Handling Fault */
 		ret = kvm_emulate_instruction(vcpu, 0);

@@ -578,7 +576,7 @@ int avic_init_vcpu(struct vcpu_svm *svm)
 	return ret;
 }
 
-void avic_post_state_restore(struct kvm_vcpu *vcpu)
+void avic_apicv_post_state_restore(struct kvm_vcpu *vcpu)
 {
 	if (avic_handle_apic_id_update(vcpu) != 0)
 		return;

@@ -586,20 +584,7 @@ void avic_post_state_restore(struct kvm_vcpu *vcpu)
 	avic_handle_ldr_update(vcpu);
 }
 
-void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
-{
-	return;
-}
-
-void svm_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
-{
-}
-
-void svm_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr)
-{
-}
-
-static int svm_set_pi_irte_mode(struct kvm_vcpu *vcpu, bool activate)
+static int avic_set_pi_irte_mode(struct kvm_vcpu *vcpu, bool activate)
|
|
||||||
{
|
{
|
||||||
int ret = 0;
|
int ret = 0;
|
||||||
unsigned long flags;
|
unsigned long flags;
|
||||||
|
@ -631,48 +616,6 @@ static int svm_set_pi_irte_mode(struct kvm_vcpu *vcpu, bool activate)
|
||||||
return ret;
|
return ret;
|
||||||
}
|
}
|
||||||
|
|
||||||
void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
|
|
||||||
{
|
|
||||||
struct vcpu_svm *svm = to_svm(vcpu);
|
|
||||||
struct vmcb *vmcb = svm->vmcb01.ptr;
|
|
||||||
bool activated = kvm_vcpu_apicv_active(vcpu);
|
|
||||||
|
|
||||||
if (!enable_apicv)
|
|
||||||
return;
|
|
||||||
|
|
||||||
if (activated) {
|
|
||||||
/**
|
|
||||||
* During AVIC temporary deactivation, guest could update
|
|
||||||
* APIC ID, DFR and LDR registers, which would not be trapped
|
|
||||||
* by avic_unaccelerated_access_interception(). In this case,
|
|
||||||
* we need to check and update the AVIC logical APIC ID table
|
|
||||||
* accordingly before re-activating.
|
|
||||||
*/
|
|
||||||
avic_post_state_restore(vcpu);
|
|
||||||
vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
|
|
||||||
} else {
|
|
||||||
vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
|
|
||||||
}
|
|
||||||
vmcb_mark_dirty(vmcb, VMCB_AVIC);
|
|
||||||
|
|
||||||
if (activated)
|
|
||||||
avic_vcpu_load(vcpu, vcpu->cpu);
|
|
||||||
else
|
|
||||||
avic_vcpu_put(vcpu);
|
|
||||||
|
|
||||||
svm_set_pi_irte_mode(vcpu, activated);
|
|
||||||
}
|
|
||||||
|
|
||||||
void svm_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
|
|
||||||
{
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
|
|
||||||
bool svm_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu)
|
|
||||||
{
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
static void svm_ir_list_del(struct vcpu_svm *svm, struct amd_iommu_pi_data *pi)
|
static void svm_ir_list_del(struct vcpu_svm *svm, struct amd_iommu_pi_data *pi)
|
||||||
{
|
{
|
||||||
unsigned long flags;
|
unsigned long flags;
|
||||||
|
@ -770,7 +713,7 @@ get_pi_vcpu_info(struct kvm *kvm, struct kvm_kernel_irq_routing_entry *e,
|
||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* svm_update_pi_irte - set IRTE for Posted-Interrupts
|
* avic_pi_update_irte - set IRTE for Posted-Interrupts
|
||||||
*
|
*
|
||||||
* @kvm: kvm
|
* @kvm: kvm
|
||||||
* @host_irq: host irq of the interrupt
|
* @host_irq: host irq of the interrupt
|
||||||
|
@ -778,8 +721,8 @@ get_pi_vcpu_info(struct kvm *kvm, struct kvm_kernel_irq_routing_entry *e,
|
||||||
* @set: set or unset PI
|
* @set: set or unset PI
|
||||||
* returns 0 on success, < 0 on failure
|
* returns 0 on success, < 0 on failure
|
||||||
*/
|
*/
|
||||||
int svm_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
|
int avic_pi_update_irte(struct kvm *kvm, unsigned int host_irq,
|
||||||
uint32_t guest_irq, bool set)
|
uint32_t guest_irq, bool set)
|
||||||
{
|
{
|
||||||
struct kvm_kernel_irq_routing_entry *e;
|
struct kvm_kernel_irq_routing_entry *e;
|
||||||
struct kvm_irq_routing_table *irq_rt;
|
struct kvm_irq_routing_table *irq_rt;
|
||||||
|
@ -879,7 +822,7 @@ int svm_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
|
||||||
return ret;
|
return ret;
|
||||||
}
|
}
|
||||||
|
|
||||||
bool svm_check_apicv_inhibit_reasons(ulong bit)
|
bool avic_check_apicv_inhibit_reasons(ulong bit)
|
||||||
{
|
{
|
||||||
ulong supported = BIT(APICV_INHIBIT_REASON_DISABLE) |
|
ulong supported = BIT(APICV_INHIBIT_REASON_DISABLE) |
|
||||||
BIT(APICV_INHIBIT_REASON_ABSENT) |
|
BIT(APICV_INHIBIT_REASON_ABSENT) |
|
||||||
|
@ -924,20 +867,15 @@ avic_update_iommu_vcpu_affinity(struct kvm_vcpu *vcpu, int cpu, bool r)
|
||||||
return ret;
|
return ret;
|
||||||
}
|
}
|
||||||
|
|
||||||
void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
|
void __avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
|
||||||
{
|
{
|
||||||
u64 entry;
|
u64 entry;
|
||||||
/* ID = 0xff (broadcast), ID > 0xff (reserved) */
|
|
||||||
int h_physical_id = kvm_cpu_get_apicid(cpu);
|
int h_physical_id = kvm_cpu_get_apicid(cpu);
|
||||||
struct vcpu_svm *svm = to_svm(vcpu);
|
struct vcpu_svm *svm = to_svm(vcpu);
|
||||||
|
|
||||||
lockdep_assert_preemption_disabled();
|
lockdep_assert_preemption_disabled();
|
||||||
|
|
||||||
/*
|
if (WARN_ON(h_physical_id & ~AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK))
|
||||||
* Since the host physical APIC id is 8 bits,
|
|
||||||
* we can support host APIC ID upto 255.
|
|
||||||
*/
|
|
||||||
if (WARN_ON(h_physical_id > AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK))
|
|
||||||
return;
|
return;
|
||||||
|
|
||||||
/*
|
/*
|
||||||
|
@ -961,7 +899,7 @@ void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
|
||||||
avic_update_iommu_vcpu_affinity(vcpu, h_physical_id, true);
|
avic_update_iommu_vcpu_affinity(vcpu, h_physical_id, true);
|
||||||
}
|
}
|
||||||
|
|
||||||
void avic_vcpu_put(struct kvm_vcpu *vcpu)
|
void __avic_vcpu_put(struct kvm_vcpu *vcpu)
|
||||||
{
|
{
|
||||||
u64 entry;
|
u64 entry;
|
||||||
struct vcpu_svm *svm = to_svm(vcpu);
|
struct vcpu_svm *svm = to_svm(vcpu);
|
||||||
|
@ -980,13 +918,63 @@ void avic_vcpu_put(struct kvm_vcpu *vcpu)
|
||||||
WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
|
WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static void avic_vcpu_load(struct kvm_vcpu *vcpu)
|
||||||
|
{
|
||||||
|
int cpu = get_cpu();
|
||||||
|
|
||||||
|
WARN_ON(cpu != vcpu->cpu);
|
||||||
|
|
||||||
|
__avic_vcpu_load(vcpu, cpu);
|
||||||
|
|
||||||
|
put_cpu();
|
||||||
|
}
|
||||||
|
|
||||||
|
static void avic_vcpu_put(struct kvm_vcpu *vcpu)
|
||||||
|
{
|
||||||
|
preempt_disable();
|
||||||
|
|
||||||
|
__avic_vcpu_put(vcpu);
|
||||||
|
|
||||||
|
preempt_enable();
|
||||||
|
}
|
||||||
|
|
||||||
|
void avic_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
|
||||||
|
{
|
||||||
|
struct vcpu_svm *svm = to_svm(vcpu);
|
||||||
|
struct vmcb *vmcb = svm->vmcb01.ptr;
|
||||||
|
bool activated = kvm_vcpu_apicv_active(vcpu);
|
||||||
|
|
||||||
|
if (!enable_apicv)
|
||||||
|
return;
|
||||||
|
|
||||||
|
if (activated) {
|
||||||
|
/**
|
||||||
|
* During AVIC temporary deactivation, guest could update
|
||||||
|
* APIC ID, DFR and LDR registers, which would not be trapped
|
||||||
|
* by avic_unaccelerated_access_interception(). In this case,
|
||||||
|
* we need to check and update the AVIC logical APIC ID table
|
||||||
|
* accordingly before re-activating.
|
||||||
|
*/
|
||||||
|
avic_apicv_post_state_restore(vcpu);
|
||||||
|
vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
|
||||||
|
} else {
|
||||||
|
vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
|
||||||
|
}
|
||||||
|
vmcb_mark_dirty(vmcb, VMCB_AVIC);
|
||||||
|
|
||||||
|
if (activated)
|
||||||
|
avic_vcpu_load(vcpu);
|
||||||
|
else
|
||||||
|
avic_vcpu_put(vcpu);
|
||||||
|
|
||||||
|
avic_set_pi_irte_mode(vcpu, activated);
|
||||||
|
}
|
||||||
|
|
||||||
void avic_vcpu_blocking(struct kvm_vcpu *vcpu)
|
void avic_vcpu_blocking(struct kvm_vcpu *vcpu)
|
||||||
{
|
{
|
||||||
if (!kvm_vcpu_apicv_active(vcpu))
|
if (!kvm_vcpu_apicv_active(vcpu))
|
||||||
return;
|
return;
|
||||||
|
|
||||||
preempt_disable();
|
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Unload the AVIC when the vCPU is about to block, _before_
|
* Unload the AVIC when the vCPU is about to block, _before_
|
||||||
* the vCPU actually blocks.
|
* the vCPU actually blocks.
|
||||||
|
@ -1001,21 +989,12 @@ void avic_vcpu_blocking(struct kvm_vcpu *vcpu)
|
||||||
* the cause of errata #1235).
|
* the cause of errata #1235).
|
||||||
*/
|
*/
|
||||||
avic_vcpu_put(vcpu);
|
avic_vcpu_put(vcpu);
|
||||||
|
|
||||||
preempt_enable();
|
|
||||||
}
|
}
|
||||||
|
|
||||||
void avic_vcpu_unblocking(struct kvm_vcpu *vcpu)
|
void avic_vcpu_unblocking(struct kvm_vcpu *vcpu)
|
||||||
{
|
{
|
||||||
int cpu;
|
|
||||||
|
|
||||||
if (!kvm_vcpu_apicv_active(vcpu))
|
if (!kvm_vcpu_apicv_active(vcpu))
|
||||||
return;
|
return;
|
||||||
|
|
||||||
cpu = get_cpu();
|
avic_vcpu_load(vcpu);
|
||||||
WARN_ON(cpu != vcpu->cpu);
|
|
||||||
|
|
||||||
avic_vcpu_load(vcpu, cpu);
|
|
||||||
|
|
||||||
put_cpu();
|
|
||||||
}
|
}
|
||||||
|
|
|
@ -0,0 +1,35 @@
|
||||||
|
/* SPDX-License-Identifier: GPL-2.0-only */
|
||||||
|
/*
|
||||||
|
* Common Hyper-V on KVM and KVM on Hyper-V definitions (SVM).
|
||||||
|
*/
|
||||||
|
|
||||||
|
#ifndef __ARCH_X86_KVM_SVM_HYPERV_H__
|
||||||
|
#define __ARCH_X86_KVM_SVM_HYPERV_H__
|
||||||
|
|
||||||
|
#include <asm/mshyperv.h>
|
||||||
|
|
||||||
|
#include "../hyperv.h"
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Hyper-V uses the software reserved 32 bytes in VMCB
|
||||||
|
* control area to expose SVM enlightenments to guests.
|
||||||
|
*/
|
||||||
|
struct hv_enlightenments {
|
||||||
|
struct __packed hv_enlightenments_control {
|
||||||
|
u32 nested_flush_hypercall:1;
|
||||||
|
u32 msr_bitmap:1;
|
||||||
|
u32 enlightened_npt_tlb: 1;
|
||||||
|
u32 reserved:29;
|
||||||
|
} __packed hv_enlightenments_control;
|
||||||
|
u32 hv_vp_id;
|
||||||
|
u64 hv_vm_id;
|
||||||
|
u64 partition_assist_page;
|
||||||
|
u64 reserved;
|
||||||
|
} __packed;
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Hyper-V uses the software reserved clean bit in VMCB
|
||||||
|
*/
|
||||||
|
#define VMCB_HV_NESTED_ENLIGHTENMENTS VMCB_SW
|
||||||
|
|
||||||
|
#endif /* __ARCH_X86_KVM_SVM_HYPERV_H__ */
|
|
@ -28,6 +28,7 @@
|
||||||
#include "cpuid.h"
|
#include "cpuid.h"
|
||||||
#include "lapic.h"
|
#include "lapic.h"
|
||||||
#include "svm.h"
|
#include "svm.h"
|
||||||
|
#include "hyperv.h"
|
||||||
|
|
||||||
#define CC KVM_NESTED_VMENTER_CONSISTENCY_CHECK
|
#define CC KVM_NESTED_VMENTER_CONSISTENCY_CHECK
|
||||||
|
|
||||||
|
@ -165,15 +166,31 @@ void recalc_intercepts(struct vcpu_svm *svm)
|
||||||
vmcb_set_intercept(c, INTERCEPT_VMSAVE);
|
vmcb_set_intercept(c, INTERCEPT_VMSAVE);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Merge L0's (KVM) and L1's (Nested VMCB) MSR permission bitmaps. The function
|
||||||
|
* is optimized in that it only merges the parts where KVM MSR permission bitmap
|
||||||
|
* may contain zero bits.
|
||||||
|
*/
|
||||||
static bool nested_svm_vmrun_msrpm(struct vcpu_svm *svm)
|
static bool nested_svm_vmrun_msrpm(struct vcpu_svm *svm)
|
||||||
{
|
{
|
||||||
/*
|
struct hv_enlightenments *hve =
|
||||||
* This function merges the msr permission bitmaps of kvm and the
|
(struct hv_enlightenments *)svm->nested.ctl.reserved_sw;
|
||||||
* nested vmcb. It is optimized in that it only merges the parts where
|
|
||||||
* the kvm msr permission bitmap may contain zero bits
|
|
||||||
*/
|
|
||||||
int i;
|
int i;
|
||||||
|
|
||||||
|
/*
|
||||||
|
* MSR bitmap update can be skipped when:
|
||||||
|
* - MSR bitmap for L1 hasn't changed.
|
||||||
|
* - Nested hypervisor (L1) is attempting to launch the same L2 as
|
||||||
|
* before.
|
||||||
|
* - Nested hypervisor (L1) is using Hyper-V emulation interface and
|
||||||
|
* tells KVM (L0) there were no changes in MSR bitmap for L2.
|
||||||
|
*/
|
||||||
|
if (!svm->nested.force_msr_bitmap_recalc &&
|
||||||
|
kvm_hv_hypercall_enabled(&svm->vcpu) &&
|
||||||
|
hve->hv_enlightenments_control.msr_bitmap &&
|
||||||
|
(svm->nested.ctl.clean & BIT(VMCB_HV_NESTED_ENLIGHTENMENTS)))
|
||||||
|
goto set_msrpm_base_pa;
|
||||||
|
|
||||||
if (!(vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_MSR_PROT)))
|
if (!(vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_MSR_PROT)))
|
||||||
return true;
|
return true;
|
||||||
|
|
||||||
|
@ -193,6 +210,9 @@ static bool nested_svm_vmrun_msrpm(struct vcpu_svm *svm)
|
||||||
svm->nested.msrpm[p] = svm->msrpm[p] | value;
|
svm->nested.msrpm[p] = svm->msrpm[p] | value;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
svm->nested.force_msr_bitmap_recalc = false;
|
||||||
|
|
||||||
|
set_msrpm_base_pa:
|
||||||
svm->vmcb->control.msrpm_base_pa = __sme_set(__pa(svm->nested.msrpm));
|
svm->vmcb->control.msrpm_base_pa = __sme_set(__pa(svm->nested.msrpm));
|
||||||
|
|
||||||
return true;
|
return true;
|
||||||
|
@ -298,7 +318,8 @@ static bool nested_vmcb_check_controls(struct kvm_vcpu *vcpu)
|
||||||
}
|
}
|
||||||
|
|
||||||
static
|
static
|
||||||
void __nested_copy_vmcb_control_to_cache(struct vmcb_ctrl_area_cached *to,
|
void __nested_copy_vmcb_control_to_cache(struct kvm_vcpu *vcpu,
|
||||||
|
struct vmcb_ctrl_area_cached *to,
|
||||||
struct vmcb_control_area *from)
|
struct vmcb_control_area *from)
|
||||||
{
|
{
|
||||||
unsigned int i;
|
unsigned int i;
|
||||||
|
@ -331,12 +352,19 @@ void __nested_copy_vmcb_control_to_cache(struct vmcb_ctrl_area_cached *to,
|
||||||
to->asid = from->asid;
|
to->asid = from->asid;
|
||||||
to->msrpm_base_pa &= ~0x0fffULL;
|
to->msrpm_base_pa &= ~0x0fffULL;
|
||||||
to->iopm_base_pa &= ~0x0fffULL;
|
to->iopm_base_pa &= ~0x0fffULL;
|
||||||
|
|
||||||
|
/* Hyper-V extensions (Enlightened VMCB) */
|
||||||
|
if (kvm_hv_hypercall_enabled(vcpu)) {
|
||||||
|
to->clean = from->clean;
|
||||||
|
memcpy(to->reserved_sw, from->reserved_sw,
|
||||||
|
sizeof(struct hv_enlightenments));
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
void nested_copy_vmcb_control_to_cache(struct vcpu_svm *svm,
|
void nested_copy_vmcb_control_to_cache(struct vcpu_svm *svm,
|
||||||
struct vmcb_control_area *control)
|
struct vmcb_control_area *control)
|
||||||
{
|
{
|
||||||
__nested_copy_vmcb_control_to_cache(&svm->nested.ctl, control);
|
__nested_copy_vmcb_control_to_cache(&svm->vcpu, &svm->nested.ctl, control);
|
||||||
}
|
}
|
||||||
|
|
||||||
static void __nested_copy_vmcb_save_to_cache(struct vmcb_save_area_cached *to,
|
static void __nested_copy_vmcb_save_to_cache(struct vmcb_save_area_cached *to,
|
||||||
|
@ -464,14 +492,14 @@ static int nested_svm_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
|
||||||
CC(!load_pdptrs(vcpu, cr3)))
|
CC(!load_pdptrs(vcpu, cr3)))
|
||||||
return -EINVAL;
|
return -EINVAL;
|
||||||
|
|
||||||
if (!nested_npt)
|
|
||||||
kvm_mmu_new_pgd(vcpu, cr3);
|
|
||||||
|
|
||||||
vcpu->arch.cr3 = cr3;
|
vcpu->arch.cr3 = cr3;
|
||||||
|
|
||||||
/* Re-initialize the MMU, e.g. to pick up CR4 MMU role changes. */
|
/* Re-initialize the MMU, e.g. to pick up CR4 MMU role changes. */
|
||||||
kvm_init_mmu(vcpu);
|
kvm_init_mmu(vcpu);
|
||||||
|
|
||||||
|
if (!nested_npt)
|
||||||
|
kvm_mmu_new_pgd(vcpu, cr3);
|
||||||
|
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -494,6 +522,7 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
|
||||||
if (svm->nested.vmcb12_gpa != svm->nested.last_vmcb12_gpa) {
|
if (svm->nested.vmcb12_gpa != svm->nested.last_vmcb12_gpa) {
|
||||||
new_vmcb12 = true;
|
new_vmcb12 = true;
|
||||||
svm->nested.last_vmcb12_gpa = svm->nested.vmcb12_gpa;
|
svm->nested.last_vmcb12_gpa = svm->nested.vmcb12_gpa;
|
||||||
|
svm->nested.force_msr_bitmap_recalc = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_SEG))) {
|
if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_SEG))) {
|
||||||
|
@ -1302,6 +1331,7 @@ static void nested_copy_vmcb_cache_to_control(struct vmcb_control_area *dst,
|
||||||
dst->virt_ext = from->virt_ext;
|
dst->virt_ext = from->virt_ext;
|
||||||
dst->pause_filter_count = from->pause_filter_count;
|
dst->pause_filter_count = from->pause_filter_count;
|
||||||
dst->pause_filter_thresh = from->pause_filter_thresh;
|
dst->pause_filter_thresh = from->pause_filter_thresh;
|
||||||
|
/* 'clean' and 'reserved_sw' are not changed by KVM */
|
||||||
}
|
}
|
||||||
|
|
||||||
static int svm_get_nested_state(struct kvm_vcpu *vcpu,
|
static int svm_get_nested_state(struct kvm_vcpu *vcpu,
|
||||||
|
@ -1434,7 +1464,7 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
|
||||||
goto out_free;
|
goto out_free;
|
||||||
|
|
||||||
ret = -EINVAL;
|
ret = -EINVAL;
|
||||||
__nested_copy_vmcb_control_to_cache(&ctl_cached, ctl);
|
__nested_copy_vmcb_control_to_cache(vcpu, &ctl_cached, ctl);
|
||||||
if (!__nested_vmcb_check_controls(vcpu, &ctl_cached))
|
if (!__nested_vmcb_check_controls(vcpu, &ctl_cached))
|
||||||
goto out_free;
|
goto out_free;
|
||||||
|
|
||||||
|
@ -1495,6 +1525,7 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
|
||||||
if (WARN_ON_ONCE(ret))
|
if (WARN_ON_ONCE(ret))
|
||||||
goto out_free;
|
goto out_free;
|
||||||
|
|
||||||
|
svm->nested.force_msr_bitmap_recalc = true;
|
||||||
|
|
||||||
kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
|
kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
|
||||||
ret = 0;
|
ret = 0;
|
||||||
|
|
|
@ -101,7 +101,7 @@ static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
|
||||||
{
|
{
|
||||||
struct kvm_vcpu *vcpu = pmu_to_vcpu(pmu);
|
struct kvm_vcpu *vcpu = pmu_to_vcpu(pmu);
|
||||||
|
|
||||||
if (!enable_pmu)
|
if (!vcpu->kvm->arch.enable_pmu)
|
||||||
return NULL;
|
return NULL;
|
||||||
|
|
||||||
switch (msr) {
|
switch (msr) {
|
||||||
|
|
|
@ -258,6 +258,7 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
|
||||||
goto e_free;
|
goto e_free;
|
||||||
|
|
||||||
INIT_LIST_HEAD(&sev->regions_list);
|
INIT_LIST_HEAD(&sev->regions_list);
|
||||||
|
INIT_LIST_HEAD(&sev->mirror_vms);
|
||||||
|
|
||||||
return 0;
|
return 0;
|
||||||
|
|
||||||
|
@ -1623,9 +1624,12 @@ static void sev_unlock_vcpus_for_migration(struct kvm *kvm)
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
static void sev_migrate_from(struct kvm_sev_info *dst,
|
static void sev_migrate_from(struct kvm *dst_kvm, struct kvm *src_kvm)
|
||||||
struct kvm_sev_info *src)
|
|
||||||
{
|
{
|
||||||
|
struct kvm_sev_info *dst = &to_kvm_svm(dst_kvm)->sev_info;
|
||||||
|
struct kvm_sev_info *src = &to_kvm_svm(src_kvm)->sev_info;
|
||||||
|
struct kvm_sev_info *mirror;
|
||||||
|
|
||||||
dst->active = true;
|
dst->active = true;
|
||||||
dst->asid = src->asid;
|
dst->asid = src->asid;
|
||||||
dst->handle = src->handle;
|
dst->handle = src->handle;
|
||||||
|
@ -1639,6 +1643,30 @@ static void sev_migrate_from(struct kvm_sev_info *dst,
|
||||||
src->enc_context_owner = NULL;
|
src->enc_context_owner = NULL;
|
||||||
|
|
||||||
list_cut_before(&dst->regions_list, &src->regions_list, &src->regions_list);
|
list_cut_before(&dst->regions_list, &src->regions_list, &src->regions_list);
|
||||||
|
|
||||||
|
/*
|
||||||
|
* If this VM has mirrors, "transfer" each mirror's refcount of the
|
||||||
|
* source to the destination (this KVM). The caller holds a reference
|
||||||
|
* to the source, so there's no danger of use-after-free.
|
||||||
|
*/
|
||||||
|
list_cut_before(&dst->mirror_vms, &src->mirror_vms, &src->mirror_vms);
|
||||||
|
list_for_each_entry(mirror, &dst->mirror_vms, mirror_entry) {
|
||||||
|
kvm_get_kvm(dst_kvm);
|
||||||
|
kvm_put_kvm(src_kvm);
|
||||||
|
mirror->enc_context_owner = dst_kvm;
|
||||||
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* If this VM is a mirror, remove the old mirror from the owners list
|
||||||
|
* and add the new mirror to the list.
|
||||||
|
*/
|
||||||
|
if (is_mirroring_enc_context(dst_kvm)) {
|
||||||
|
struct kvm_sev_info *owner_sev_info =
|
||||||
|
&to_kvm_svm(dst->enc_context_owner)->sev_info;
|
||||||
|
|
||||||
|
list_del(&src->mirror_entry);
|
||||||
|
list_add_tail(&dst->mirror_entry, &owner_sev_info->mirror_vms);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
static int sev_es_migrate_from(struct kvm *dst, struct kvm *src)
|
static int sev_es_migrate_from(struct kvm *dst, struct kvm *src)
|
||||||
|
@ -1681,7 +1709,7 @@ static int sev_es_migrate_from(struct kvm *dst, struct kvm *src)
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd)
|
int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
|
||||||
{
|
{
|
||||||
struct kvm_sev_info *dst_sev = &to_kvm_svm(kvm)->sev_info;
|
struct kvm_sev_info *dst_sev = &to_kvm_svm(kvm)->sev_info;
|
||||||
struct kvm_sev_info *src_sev, *cg_cleanup_sev;
|
struct kvm_sev_info *src_sev, *cg_cleanup_sev;
|
||||||
|
@ -1708,15 +1736,6 @@ int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd)
|
||||||
|
|
||||||
src_sev = &to_kvm_svm(source_kvm)->sev_info;
|
src_sev = &to_kvm_svm(source_kvm)->sev_info;
|
||||||
|
|
||||||
/*
|
|
||||||
* VMs mirroring src's encryption context rely on it to keep the
|
|
||||||
* ASID allocated, but below we are clearing src_sev->asid.
|
|
||||||
*/
|
|
||||||
if (src_sev->num_mirrored_vms) {
|
|
||||||
ret = -EBUSY;
|
|
||||||
goto out_unlock;
|
|
||||||
}
|
|
||||||
|
|
||||||
dst_sev->misc_cg = get_current_misc_cg();
|
dst_sev->misc_cg = get_current_misc_cg();
|
||||||
cg_cleanup_sev = dst_sev;
|
cg_cleanup_sev = dst_sev;
|
||||||
if (dst_sev->misc_cg != src_sev->misc_cg) {
|
if (dst_sev->misc_cg != src_sev->misc_cg) {
|
||||||
|
@ -1738,7 +1757,8 @@ int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd)
|
||||||
if (ret)
|
if (ret)
|
||||||
goto out_source_vcpu;
|
goto out_source_vcpu;
|
||||||
}
|
}
|
||||||
sev_migrate_from(dst_sev, src_sev);
|
|
||||||
|
sev_migrate_from(kvm, source_kvm);
|
||||||
kvm_vm_dead(source_kvm);
|
kvm_vm_dead(source_kvm);
|
||||||
cg_cleanup_sev = src_sev;
|
cg_cleanup_sev = src_sev;
|
||||||
ret = 0;
|
ret = 0;
|
||||||
|
@ -1761,7 +1781,7 @@ int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd)
|
||||||
return ret;
|
return ret;
|
||||||
}
|
}
|
||||||
|
|
||||||
int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
|
int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
|
||||||
{
|
{
|
||||||
struct kvm_sev_cmd sev_cmd;
|
struct kvm_sev_cmd sev_cmd;
|
||||||
int r;
|
int r;
|
||||||
|
@ -1858,8 +1878,8 @@ int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
|
||||||
return r;
|
return r;
|
||||||
}
|
}
|
||||||
|
|
||||||
int svm_register_enc_region(struct kvm *kvm,
|
int sev_mem_enc_register_region(struct kvm *kvm,
|
||||||
struct kvm_enc_region *range)
|
struct kvm_enc_region *range)
|
||||||
{
|
{
|
||||||
struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
|
struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
|
||||||
struct enc_region *region;
|
struct enc_region *region;
|
||||||
|
@ -1932,8 +1952,8 @@ static void __unregister_enc_region_locked(struct kvm *kvm,
|
||||||
kfree(region);
|
kfree(region);
|
||||||
}
|
}
|
||||||
|
|
||||||
int svm_unregister_enc_region(struct kvm *kvm,
|
int sev_mem_enc_unregister_region(struct kvm *kvm,
|
||||||
struct kvm_enc_region *range)
|
struct kvm_enc_region *range)
|
||||||
{
|
{
|
||||||
struct enc_region *region;
|
struct enc_region *region;
|
||||||
int ret;
|
int ret;
|
||||||
|
@ -1972,7 +1992,7 @@ int svm_unregister_enc_region(struct kvm *kvm,
|
||||||
return ret;
|
return ret;
|
||||||
}
|
}
|
||||||
|
|
||||||
int svm_vm_copy_asid_from(struct kvm *kvm, unsigned int source_fd)
|
int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd)
|
||||||
{
|
{
|
||||||
struct file *source_kvm_file;
|
struct file *source_kvm_file;
|
||||||
struct kvm *source_kvm;
|
struct kvm *source_kvm;
|
||||||
|
@ -2008,10 +2028,10 @@ int svm_vm_copy_asid_from(struct kvm *kvm, unsigned int source_fd)
|
||||||
*/
|
*/
|
||||||
source_sev = &to_kvm_svm(source_kvm)->sev_info;
|
source_sev = &to_kvm_svm(source_kvm)->sev_info;
|
||||||
kvm_get_kvm(source_kvm);
|
kvm_get_kvm(source_kvm);
|
||||||
source_sev->num_mirrored_vms++;
|
mirror_sev = &to_kvm_svm(kvm)->sev_info;
|
||||||
|
list_add_tail(&mirror_sev->mirror_entry, &source_sev->mirror_vms);
|
||||||
|
|
||||||
/* Set enc_context_owner and copy its encryption context over */
|
/* Set enc_context_owner and copy its encryption context over */
|
||||||
mirror_sev = &to_kvm_svm(kvm)->sev_info;
|
|
||||||
mirror_sev->enc_context_owner = source_kvm;
|
mirror_sev->enc_context_owner = source_kvm;
|
||||||
mirror_sev->active = true;
|
mirror_sev->active = true;
|
||||||
mirror_sev->asid = source_sev->asid;
|
mirror_sev->asid = source_sev->asid;
|
||||||
|
@ -2019,6 +2039,7 @@ int svm_vm_copy_asid_from(struct kvm *kvm, unsigned int source_fd)
|
||||||
mirror_sev->es_active = source_sev->es_active;
|
mirror_sev->es_active = source_sev->es_active;
|
||||||
mirror_sev->handle = source_sev->handle;
|
mirror_sev->handle = source_sev->handle;
|
||||||
INIT_LIST_HEAD(&mirror_sev->regions_list);
|
INIT_LIST_HEAD(&mirror_sev->regions_list);
|
||||||
|
INIT_LIST_HEAD(&mirror_sev->mirror_vms);
|
||||||
ret = 0;
|
ret = 0;
|
||||||
|
|
||||||
/*
|
/*
|
||||||
|
@ -2041,19 +2062,17 @@ void sev_vm_destroy(struct kvm *kvm)
|
||||||
struct list_head *head = &sev->regions_list;
|
struct list_head *head = &sev->regions_list;
|
||||||
struct list_head *pos, *q;
|
struct list_head *pos, *q;
|
||||||
|
|
||||||
WARN_ON(sev->num_mirrored_vms);
|
|
||||||
|
|
||||||
if (!sev_guest(kvm))
|
if (!sev_guest(kvm))
|
||||||
return;
|
return;
|
||||||
|
|
||||||
|
WARN_ON(!list_empty(&sev->mirror_vms));
|
||||||
|
|
||||||
/* If this is a mirror_kvm release the enc_context_owner and skip sev cleanup */
|
/* If this is a mirror_kvm release the enc_context_owner and skip sev cleanup */
|
||||||
if (is_mirroring_enc_context(kvm)) {
|
if (is_mirroring_enc_context(kvm)) {
|
||||||
struct kvm *owner_kvm = sev->enc_context_owner;
|
struct kvm *owner_kvm = sev->enc_context_owner;
|
||||||
struct kvm_sev_info *owner_sev = &to_kvm_svm(owner_kvm)->sev_info;
|
|
||||||
|
|
||||||
mutex_lock(&owner_kvm->lock);
|
mutex_lock(&owner_kvm->lock);
|
||||||
if (!WARN_ON(!owner_sev->num_mirrored_vms))
|
list_del(&sev->mirror_entry);
|
||||||
owner_sev->num_mirrored_vms--;
|
|
||||||
mutex_unlock(&owner_kvm->lock);
|
mutex_unlock(&owner_kvm->lock);
|
||||||
kvm_put_kvm(owner_kvm);
|
kvm_put_kvm(owner_kvm);
|
||||||
return;
|
return;
|
||||||
|
@ -2173,7 +2192,7 @@ void __init sev_hardware_setup(void)
|
||||||
#endif
|
#endif
|
||||||
}
|
}
|
||||||
|
|
||||||
void sev_hardware_teardown(void)
|
void sev_hardware_unsetup(void)
|
||||||
{
|
{
|
||||||
if (!sev_enabled)
|
if (!sev_enabled)
|
||||||
return;
|
return;
|
||||||
|
@ -2358,7 +2377,7 @@ static void sev_es_sync_from_ghcb(struct vcpu_svm *svm)
|
||||||
memset(ghcb->save.valid_bitmap, 0, sizeof(ghcb->save.valid_bitmap));
|
memset(ghcb->save.valid_bitmap, 0, sizeof(ghcb->save.valid_bitmap));
|
||||||
}
|
}
|
||||||
|
|
||||||
static bool sev_es_validate_vmgexit(struct vcpu_svm *svm)
|
static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
|
||||||
{
|
{
|
||||||
struct kvm_vcpu *vcpu;
|
struct kvm_vcpu *vcpu;
|
||||||
struct ghcb *ghcb;
|
struct ghcb *ghcb;
|
||||||
|
@ -2463,7 +2482,7 @@ static bool sev_es_validate_vmgexit(struct vcpu_svm *svm)
|
||||||
goto vmgexit_err;
|
goto vmgexit_err;
|
||||||
}
|
}
|
||||||
|
|
||||||
return true;
|
return 0;
|
||||||
|
|
||||||
vmgexit_err:
|
vmgexit_err:
|
||||||
vcpu = &svm->vcpu;
|
vcpu = &svm->vcpu;
|
||||||
|
@ -2486,7 +2505,8 @@ static bool sev_es_validate_vmgexit(struct vcpu_svm *svm)
|
||||||
ghcb_set_sw_exit_info_1(ghcb, 2);
|
ghcb_set_sw_exit_info_1(ghcb, 2);
|
||||||
ghcb_set_sw_exit_info_2(ghcb, reason);
|
ghcb_set_sw_exit_info_2(ghcb, reason);
|
||||||
|
|
||||||
return false;
|
/* Resume the guest to "return" the error code. */
|
||||||
|
return 1;
|
||||||
}
|
}
|
||||||
|
|
||||||
void sev_es_unmap_ghcb(struct vcpu_svm *svm)
|
void sev_es_unmap_ghcb(struct vcpu_svm *svm)
|
||||||
|
@ -2545,7 +2565,7 @@ void pre_sev_run(struct vcpu_svm *svm, int cpu)
|
||||||
}
|
}
|
||||||
|
|
||||||
#define GHCB_SCRATCH_AREA_LIMIT (16ULL * PAGE_SIZE)
|
#define GHCB_SCRATCH_AREA_LIMIT (16ULL * PAGE_SIZE)
|
||||||
static bool setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 len)
|
static int setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 len)
|
||||||
{
|
{
|
||||||
struct vmcb_control_area *control = &svm->vmcb->control;
|
struct vmcb_control_area *control = &svm->vmcb->control;
|
||||||
struct ghcb *ghcb = svm->sev_es.ghcb;
|
struct ghcb *ghcb = svm->sev_es.ghcb;
|
||||||
|
@ -2598,14 +2618,14 @@ static bool setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 len)
|
||||||
}
|
}
|
||||||
scratch_va = kvzalloc(len, GFP_KERNEL_ACCOUNT);
|
scratch_va = kvzalloc(len, GFP_KERNEL_ACCOUNT);
|
||||||
if (!scratch_va)
|
if (!scratch_va)
|
||||||
goto e_scratch;
|
return -ENOMEM;
|
||||||
|
|
||||||
if (kvm_read_guest(svm->vcpu.kvm, scratch_gpa_beg, scratch_va, len)) {
|
if (kvm_read_guest(svm->vcpu.kvm, scratch_gpa_beg, scratch_va, len)) {
|
||||||
/* Unable to copy scratch area from guest */
|
/* Unable to copy scratch area from guest */
|
||||||
pr_err("vmgexit: kvm_read_guest for scratch area failed\n");
|
pr_err("vmgexit: kvm_read_guest for scratch area failed\n");
|
||||||
|
|
||||||
kvfree(scratch_va);
|
kvfree(scratch_va);
|
||||||
goto e_scratch;
|
return -EFAULT;
|
||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
|
@ -2621,13 +2641,13 @@ static bool setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 len)
|
||||||
svm->sev_es.ghcb_sa = scratch_va;
|
svm->sev_es.ghcb_sa = scratch_va;
|
||||||
svm->sev_es.ghcb_sa_len = len;
|
svm->sev_es.ghcb_sa_len = len;
|
||||||
|
|
||||||
return true;
|
return 0;
|
||||||
|
|
||||||
e_scratch:
|
e_scratch:
|
||||||
ghcb_set_sw_exit_info_1(ghcb, 2);
|
ghcb_set_sw_exit_info_1(ghcb, 2);
|
||||||
ghcb_set_sw_exit_info_2(ghcb, GHCB_ERR_INVALID_SCRATCH_AREA);
|
ghcb_set_sw_exit_info_2(ghcb, GHCB_ERR_INVALID_SCRATCH_AREA);
|
||||||
|
|
||||||
return false;
|
return 1;
|
||||||
}
|
}
|
||||||
|
|
||||||
static void set_ghcb_msr_bits(struct vcpu_svm *svm, u64 value, u64 mask,
|
static void set_ghcb_msr_bits(struct vcpu_svm *svm, u64 value, u64 mask,
|
||||||
|
@ -2765,17 +2785,18 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
|
||||||
|
|
||||||
exit_code = ghcb_get_sw_exit_code(ghcb);
|
exit_code = ghcb_get_sw_exit_code(ghcb);
|
||||||
|
|
||||||
if (!sev_es_validate_vmgexit(svm))
|
ret = sev_es_validate_vmgexit(svm);
|
||||||
return 1;
|
if (ret)
|
||||||
|
return ret;
|
||||||
|
|
||||||
sev_es_sync_from_ghcb(svm);
|
sev_es_sync_from_ghcb(svm);
|
||||||
ghcb_set_sw_exit_info_1(ghcb, 0);
|
ghcb_set_sw_exit_info_1(ghcb, 0);
|
||||||
ghcb_set_sw_exit_info_2(ghcb, 0);
|
ghcb_set_sw_exit_info_2(ghcb, 0);
|
||||||
|
|
||||||
ret = 1;
|
|
||||||
switch (exit_code) {
|
switch (exit_code) {
|
||||||
case SVM_VMGEXIT_MMIO_READ:
|
case SVM_VMGEXIT_MMIO_READ:
|
||||||
if (!setup_vmgexit_scratch(svm, true, control->exit_info_2))
|
ret = setup_vmgexit_scratch(svm, true, control->exit_info_2);
|
||||||
|
if (ret)
|
||||||
break;
|
break;
|
||||||
|
|
||||||
ret = kvm_sev_es_mmio_read(vcpu,
|
ret = kvm_sev_es_mmio_read(vcpu,
|
||||||
|
@ -2784,7 +2805,8 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
|
||||||
svm->sev_es.ghcb_sa);
|
svm->sev_es.ghcb_sa);
|
||||||
break;
|
break;
|
||||||
case SVM_VMGEXIT_MMIO_WRITE:
|
case SVM_VMGEXIT_MMIO_WRITE:
|
||||||
if (!setup_vmgexit_scratch(svm, false, control->exit_info_2))
|
ret = setup_vmgexit_scratch(svm, false, control->exit_info_2);
|
||||||
|
if (ret)
|
||||||
break;
|
break;
|
||||||
|
|
||||||
ret = kvm_sev_es_mmio_write(vcpu,
|
ret = kvm_sev_es_mmio_write(vcpu,
|
||||||
|
@ -2817,6 +2839,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
|
||||||
ghcb_set_sw_exit_info_2(ghcb, GHCB_ERR_INVALID_INPUT);
|
ghcb_set_sw_exit_info_2(ghcb, GHCB_ERR_INVALID_INPUT);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
ret = 1;
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
case SVM_VMGEXIT_UNSUPPORTED_EVENT:
|
case SVM_VMGEXIT_UNSUPPORTED_EVENT:
|
||||||
|
@ -2836,6 +2859,7 @@ int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in)
|
||||||
{
|
{
|
||||||
int count;
|
int count;
|
||||||
int bytes;
|
int bytes;
|
||||||
|
int r;
|
||||||
|
|
||||||
if (svm->vmcb->control.exit_info_2 > INT_MAX)
|
if (svm->vmcb->control.exit_info_2 > INT_MAX)
|
||||||
return -EINVAL;
|
return -EINVAL;
|
||||||
|
@ -2844,8 +2868,9 @@ int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in)
|
||||||
if (unlikely(check_mul_overflow(count, size, &bytes)))
|
if (unlikely(check_mul_overflow(count, size, &bytes)))
|
||||||
return -EINVAL;
|
return -EINVAL;
|
||||||
|
|
||||||
if (!setup_vmgexit_scratch(svm, in, bytes))
|
r = setup_vmgexit_scratch(svm, in, bytes);
|
||||||
return 1;
|
if (r)
|
||||||
|
return r;
|
||||||
|
|
||||||
return kvm_sev_es_string_io(&svm->vcpu, size, port, svm->sev_es.ghcb_sa,
|
return kvm_sev_es_string_io(&svm->vcpu, size, port, svm->sev_es.ghcb_sa,
|
||||||
count, in);
|
count, in);
|
||||||
|
@ -2907,20 +2932,16 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm)
|
||||||
sev_enc_bit));
|
sev_enc_bit));
|
||||||
}
|
}
|
||||||
|
|
||||||
void sev_es_prepare_guest_switch(struct vcpu_svm *svm, unsigned int cpu)
|
void sev_es_prepare_switch_to_guest(struct vmcb_save_area *hostsa)
|
||||||
{
|
{
|
||||||
struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
|
|
||||||
struct vmcb_save_area *hostsa;
|
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* As an SEV-ES guest, hardware will restore the host state on VMEXIT,
|
* As an SEV-ES guest, hardware will restore the host state on VMEXIT,
|
||||||
* of which one step is to perform a VMLOAD. Since hardware does not
|
* of which one step is to perform a VMLOAD. KVM performs the
|
||||||
* perform a VMSAVE on VMRUN, the host savearea must be updated.
|
* corresponding VMSAVE in svm_prepare_guest_switch for both
|
||||||
|
* traditional and SEV-ES guests.
|
||||||
*/
|
*/
|
||||||
vmsave(__sme_page_pa(sd->save_area));
|
|
||||||
|
|
||||||
/* XCR0 is restored on VMEXIT, save the current host value */
|
/* XCR0 is restored on VMEXIT, save the current host value */
|
||||||
hostsa = (struct vmcb_save_area *)(page_address(sd->save_area) + 0x400);
|
|
||||||
hostsa->xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
|
hostsa->xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
|
||||||
|
|
||||||
/* PKRU is restored on VMEXIT, save the current host value */
|
/* PKRU is restored on VMEXIT, save the current host value */
|
||||||
|
|
|
@ -263,7 +263,7 @@ u32 svm_msrpm_offset(u32 msr)
|
||||||
return MSR_INVALID;
|
return MSR_INVALID;
|
||||||
}
|
}
|
||||||
|
|
||||||
#define MAX_INST_SIZE 15
|
static void svm_flush_tlb_current(struct kvm_vcpu *vcpu);
|
||||||
|
|
||||||
static int get_npt_level(void)
|
static int get_npt_level(void)
|
||||||
{
|
{
|
||||||
|
@ -353,7 +353,7 @@ static void svm_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
|
static int svm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
|
||||||
{
|
{
|
||||||
struct vcpu_svm *svm = to_svm(vcpu);
|
struct vcpu_svm *svm = to_svm(vcpu);
|
||||||
|
|
||||||
|
@ -401,7 +401,7 @@ static void svm_queue_exception(struct kvm_vcpu *vcpu)
|
||||||
* raises a fault that is not intercepted. Still better than
|
* raises a fault that is not intercepted. Still better than
|
||||||
* failing in all cases.
|
* failing in all cases.
|
||||||
*/
|
*/
|
||||||
(void)skip_emulated_instruction(vcpu);
|
(void)svm_skip_emulated_instruction(vcpu);
|
||||||
rip = kvm_rip_read(vcpu);
|
rip = kvm_rip_read(vcpu);
|
||||||
svm->int3_rip = rip + svm->vmcb->save.cs.base;
|
svm->int3_rip = rip + svm->vmcb->save.cs.base;
|
||||||
svm->int3_injected = rip - old_rip;
|
svm->int3_injected = rip - old_rip;
|
||||||
|
@ -668,6 +668,7 @@ static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
|
||||||
static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
|
static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
|
||||||
u32 msr, int read, int write)
|
u32 msr, int read, int write)
|
||||||
{
|
{
|
||||||
|
struct vcpu_svm *svm = to_svm(vcpu);
|
||||||
u8 bit_read, bit_write;
|
u8 bit_read, bit_write;
|
||||||
unsigned long tmp;
|
unsigned long tmp;
|
||||||
u32 offset;
|
u32 offset;
|
||||||
|
@ -698,7 +699,7 @@ static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
|
||||||
msrpm[offset] = tmp;
|
msrpm[offset] = tmp;
|
||||||
|
|
||||||
svm_hv_vmcb_dirty_nested_enlightenments(vcpu);
|
svm_hv_vmcb_dirty_nested_enlightenments(vcpu);
|
||||||
|
svm->nested.force_msr_bitmap_recalc = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
void set_msr_interception(struct kvm_vcpu *vcpu, u32 *msrpm, u32 msr,
|
void set_msr_interception(struct kvm_vcpu *vcpu, u32 *msrpm, u32 msr,
|
||||||
|
@ -873,11 +874,11 @@ static void shrink_ple_window(struct kvm_vcpu *vcpu)
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
static void svm_hardware_teardown(void)
|
static void svm_hardware_unsetup(void)
|
||||||
{
|
{
|
||||||
int cpu;
|
int cpu;
|
||||||
|
|
||||||
sev_hardware_teardown();
|
sev_hardware_unsetup();
|
||||||
|
|
||||||
for_each_possible_cpu(cpu)
|
for_each_possible_cpu(cpu)
|
||||||
svm_cpu_uninit(cpu);
|
svm_cpu_uninit(cpu);
|
||||||
|
@ -1175,7 +1176,7 @@ void svm_switch_vmcb(struct vcpu_svm *svm, struct kvm_vmcb_info *target_vmcb)
|
||||||
svm->vmcb = target_vmcb->ptr;
|
svm->vmcb = target_vmcb->ptr;
|
||||||
}
|
}
|
||||||
|
|
||||||
static int svm_create_vcpu(struct kvm_vcpu *vcpu)
|
static int svm_vcpu_create(struct kvm_vcpu *vcpu)
|
||||||
{
|
{
|
||||||
struct vcpu_svm *svm;
|
struct vcpu_svm *svm;
|
||||||
struct page *vmcb01_page;
|
struct page *vmcb01_page;
|
||||||
|
@ -1246,7 +1247,7 @@ static void svm_clear_current_vmcb(struct vmcb *vmcb)
|
||||||
cmpxchg(&per_cpu(svm_data, i)->current_vmcb, vmcb, NULL);
|
cmpxchg(&per_cpu(svm_data, i)->current_vmcb, vmcb, NULL);
|
||||||
}
|
}
|
||||||
|
|
||||||
static void svm_free_vcpu(struct kvm_vcpu *vcpu)
|
static void svm_vcpu_free(struct kvm_vcpu *vcpu)
|
||||||
{
|
{
|
||||||
struct vcpu_svm *svm = to_svm(vcpu);
|
struct vcpu_svm *svm = to_svm(vcpu);
|
||||||
|
|
||||||
|
@ -1265,7 +1266,7 @@ static void svm_free_vcpu(struct kvm_vcpu *vcpu)
|
||||||
__free_pages(virt_to_page(svm->msrpm), get_order(MSRPM_SIZE));
|
__free_pages(virt_to_page(svm->msrpm), get_order(MSRPM_SIZE));
|
||||||
}
|
}
|
||||||
|
|
||||||
static void svm_prepare_guest_switch(struct kvm_vcpu *vcpu)
|
static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
|
||||||
{
|
{
|
||||||
struct vcpu_svm *svm = to_svm(vcpu);
|
struct vcpu_svm *svm = to_svm(vcpu);
|
||||||
struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
|
struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
|
||||||
|
@ -1280,10 +1281,12 @@ static void svm_prepare_guest_switch(struct kvm_vcpu *vcpu)
|
||||||
* Save additional host state that will be restored on VMEXIT (sev-es)
|
* Save additional host state that will be restored on VMEXIT (sev-es)
|
||||||
* or subsequent vmload of host save area.
|
* or subsequent vmload of host save area.
|
||||||
*/
|
*/
|
||||||
|
vmsave(__sme_page_pa(sd->save_area));
|
||||||
if (sev_es_guest(vcpu->kvm)) {
|
if (sev_es_guest(vcpu->kvm)) {
|
||||||
sev_es_prepare_guest_switch(svm, vcpu->cpu);
|
struct vmcb_save_area *hostsa;
|
||||||
} else {
|
hostsa = (struct vmcb_save_area *)(page_address(sd->save_area) + 0x400);
|
||||||
vmsave(__sme_page_pa(sd->save_area));
|
|
||||||
|
sev_es_prepare_switch_to_guest(hostsa);
|
||||||
}
|
}
|
||||||
|
|
||||||
if (tsc_scaling) {
|
if (tsc_scaling) {
|
||||||
|
@ -1315,13 +1318,13 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
|
||||||
indirect_branch_prediction_barrier();
|
indirect_branch_prediction_barrier();
|
||||||
}
|
}
|
||||||
if (kvm_vcpu_apicv_active(vcpu))
|
if (kvm_vcpu_apicv_active(vcpu))
|
||||||
avic_vcpu_load(vcpu, cpu);
|
__avic_vcpu_load(vcpu, cpu);
|
||||||
}
|
}
|
||||||
|
|
||||||
static void svm_vcpu_put(struct kvm_vcpu *vcpu)
|
static void svm_vcpu_put(struct kvm_vcpu *vcpu)
|
||||||
{
|
{
|
||||||
if (kvm_vcpu_apicv_active(vcpu))
|
if (kvm_vcpu_apicv_active(vcpu))
|
||||||
avic_vcpu_put(vcpu);
|
__avic_vcpu_put(vcpu);
|
||||||
|
|
||||||
svm_prepare_host_switch(vcpu);
|
svm_prepare_host_switch(vcpu);
|
||||||
|
|
||||||
|
@ -1529,6 +1532,15 @@ static int svm_get_cpl(struct kvm_vcpu *vcpu)
|
||||||
return save->cpl;
|
return save->cpl;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static void svm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l)
|
||||||
|
{
|
||||||
|
struct kvm_segment cs;
|
||||||
|
|
||||||
|
svm_get_segment(vcpu, &cs, VCPU_SREG_CS);
|
||||||
|
*db = cs.db;
|
||||||
|
*l = cs.l;
|
||||||
|
}
|
||||||
|
|
||||||
static void svm_get_idt(struct kvm_vcpu *vcpu, struct desc_ptr *dt)
|
static void svm_get_idt(struct kvm_vcpu *vcpu, struct desc_ptr *dt)
|
||||||
{
|
{
|
||||||
struct vcpu_svm *svm = to_svm(vcpu);
|
struct vcpu_svm *svm = to_svm(vcpu);
|
||||||
|
@ -1563,7 +1575,7 @@ static void svm_set_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt)
|
||||||
vmcb_mark_dirty(svm->vmcb, VMCB_DT);
|
vmcb_mark_dirty(svm->vmcb, VMCB_DT);
|
||||||
}
|
}
|
||||||
|
|
||||||
static void svm_post_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
|
static void sev_post_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
|
||||||
{
|
{
|
||||||
struct vcpu_svm *svm = to_svm(vcpu);
|
struct vcpu_svm *svm = to_svm(vcpu);
|
||||||
|
|
||||||
|
@ -1647,7 +1659,7 @@ void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
|
||||||
unsigned long old_cr4 = vcpu->arch.cr4;
|
unsigned long old_cr4 = vcpu->arch.cr4;
|
||||||
|
|
||||||
if (npt_enabled && ((old_cr4 ^ cr4) & X86_CR4_PGE))
|
if (npt_enabled && ((old_cr4 ^ cr4) & X86_CR4_PGE))
|
||||||
svm_flush_tlb(vcpu);
|
svm_flush_tlb_current(vcpu);
|
||||||
|
|
||||||
vcpu->arch.cr4 = cr4;
|
vcpu->arch.cr4 = cr4;
|
||||||
if (!npt_enabled) {
|
if (!npt_enabled) {
|
||||||
|
@ -2269,7 +2281,7 @@ static int task_switch_interception(struct kvm_vcpu *vcpu)
|
||||||
int_type == SVM_EXITINTINFO_TYPE_SOFT ||
|
int_type == SVM_EXITINTINFO_TYPE_SOFT ||
|
||||||
(int_type == SVM_EXITINTINFO_TYPE_EXEPT &&
|
(int_type == SVM_EXITINTINFO_TYPE_EXEPT &&
|
||||||
(int_vec == OF_VECTOR || int_vec == BP_VECTOR))) {
|
(int_vec == OF_VECTOR || int_vec == BP_VECTOR))) {
|
||||||
if (!skip_emulated_instruction(vcpu))
|
if (!svm_skip_emulated_instruction(vcpu))
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -3149,7 +3161,7 @@ static void dump_vmcb(struct kvm_vcpu *vcpu)
|
||||||
"excp_to:", save->last_excp_to);
|
"excp_to:", save->last_excp_to);
|
||||||
}
|
}
|
||||||
|
|
||||||
static bool svm_check_exit_valid(struct kvm_vcpu *vcpu, u64 exit_code)
|
static bool svm_check_exit_valid(u64 exit_code)
|
||||||
{
|
{
|
||||||
return (exit_code < ARRAY_SIZE(svm_exit_handlers) &&
|
return (exit_code < ARRAY_SIZE(svm_exit_handlers) &&
|
||||||
svm_exit_handlers[exit_code]);
|
svm_exit_handlers[exit_code]);
|
||||||
|
@ -3169,7 +3181,7 @@ static int svm_handle_invalid_exit(struct kvm_vcpu *vcpu, u64 exit_code)
|
||||||
|
|
||||||
int svm_invoke_exit_handler(struct kvm_vcpu *vcpu, u64 exit_code)
|
int svm_invoke_exit_handler(struct kvm_vcpu *vcpu, u64 exit_code)
|
||||||
{
|
{
|
||||||
if (!svm_check_exit_valid(vcpu, exit_code))
|
if (!svm_check_exit_valid(exit_code))
|
||||||
return svm_handle_invalid_exit(vcpu, exit_code);
|
return svm_handle_invalid_exit(vcpu, exit_code);
|
||||||
|
|
||||||
#ifdef CONFIG_RETPOLINE
|
#ifdef CONFIG_RETPOLINE
|
||||||
|
@ -3204,7 +3216,7 @@ static void svm_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason,
|
||||||
*error_code = 0;
|
*error_code = 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
static int handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
|
static int svm_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
|
||||||
{
|
{
|
||||||
struct vcpu_svm *svm = to_svm(vcpu);
|
struct vcpu_svm *svm = to_svm(vcpu);
|
||||||
struct kvm_run *kvm_run = vcpu->run;
|
struct kvm_run *kvm_run = vcpu->run;
|
||||||
|
@ -3301,7 +3313,7 @@ static void svm_inject_nmi(struct kvm_vcpu *vcpu)
|
||||||
++vcpu->stat.nmi_injections;
|
++vcpu->stat.nmi_injections;
|
||||||
}
|
}
|
||||||
|
|
||||||
static void svm_set_irq(struct kvm_vcpu *vcpu)
|
static void svm_inject_irq(struct kvm_vcpu *vcpu)
|
||||||
{
|
{
|
||||||
struct vcpu_svm *svm = to_svm(vcpu);
|
struct vcpu_svm *svm = to_svm(vcpu);
|
||||||
|
|
||||||
|
@ -3531,17 +3543,7 @@ static void svm_enable_nmi_window(struct kvm_vcpu *vcpu)
|
||||||
svm->vmcb->save.rflags |= (X86_EFLAGS_TF | X86_EFLAGS_RF);
|
svm->vmcb->save.rflags |= (X86_EFLAGS_TF | X86_EFLAGS_RF);
|
||||||
}
|
}
|
||||||
|
|
||||||
static int svm_set_tss_addr(struct kvm *kvm, unsigned int addr)
|
static void svm_flush_tlb_current(struct kvm_vcpu *vcpu)
|
||||||
{
|
|
||||||
return 0;
|
|
||||||
}
|
|
||||||
|
|
||||||
static int svm_set_identity_map_addr(struct kvm *kvm, u64 ident_addr)
|
|
||||||
{
|
|
||||||
return 0;
|
|
||||||
}
|
|
||||||
|
|
||||||
void svm_flush_tlb(struct kvm_vcpu *vcpu)
|
|
||||||
{
|
{
|
||||||
struct vcpu_svm *svm = to_svm(vcpu);
|
struct vcpu_svm *svm = to_svm(vcpu);
|
||||||
|
|
||||||
|
@ -3915,11 +3917,6 @@ static int __init svm_check_processor_compat(void)
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
static bool svm_cpu_has_accelerated_tpr(void)
|
|
||||||
{
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* The kvm parameter can be NULL (module initialization, or invocation before
|
* The kvm parameter can be NULL (module initialization, or invocation before
|
||||||
* VM creation). Be sure to check the kvm parameter before using it.
|
* VM creation). Be sure to check the kvm parameter before using it.
|
||||||
|
@ -4254,7 +4251,7 @@ static int svm_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
|
||||||
* by 0x400 (matches the offset of 'struct vmcb_save_area'
|
* by 0x400 (matches the offset of 'struct vmcb_save_area'
|
||||||
* within 'struct vmcb'). Note: HSAVE area may also be used by
|
* within 'struct vmcb'). Note: HSAVE area may also be used by
|
||||||
* L1 hypervisor to save additional host context (e.g. KVM does
|
* L1 hypervisor to save additional host context (e.g. KVM does
|
||||||
* that, see svm_prepare_guest_switch()) which must be
|
* that, see svm_prepare_switch_to_guest()) which must be
|
||||||
* preserved.
|
* preserved.
|
||||||
*/
|
*/
|
||||||
if (kvm_vcpu_map(vcpu, gpa_to_gfn(svm->nested.hsave_msr),
|
if (kvm_vcpu_map(vcpu, gpa_to_gfn(svm->nested.hsave_msr),
|
||||||
|
@ -4529,21 +4526,20 @@ static int svm_vm_init(struct kvm *kvm)
|
||||||
static struct kvm_x86_ops svm_x86_ops __initdata = {
|
static struct kvm_x86_ops svm_x86_ops __initdata = {
|
||||||
.name = "kvm_amd",
|
.name = "kvm_amd",
|
||||||
|
|
||||||
.hardware_unsetup = svm_hardware_teardown,
|
.hardware_unsetup = svm_hardware_unsetup,
|
||||||
.hardware_enable = svm_hardware_enable,
|
.hardware_enable = svm_hardware_enable,
|
||||||
.hardware_disable = svm_hardware_disable,
|
.hardware_disable = svm_hardware_disable,
|
||||||
.cpu_has_accelerated_tpr = svm_cpu_has_accelerated_tpr,
|
|
||||||
.has_emulated_msr = svm_has_emulated_msr,
|
.has_emulated_msr = svm_has_emulated_msr,
|
||||||
|
|
||||||
.vcpu_create = svm_create_vcpu,
|
.vcpu_create = svm_vcpu_create,
|
||||||
.vcpu_free = svm_free_vcpu,
|
.vcpu_free = svm_vcpu_free,
|
||||||
.vcpu_reset = svm_vcpu_reset,
|
.vcpu_reset = svm_vcpu_reset,
|
||||||
|
|
||||||
.vm_size = sizeof(struct kvm_svm),
|
.vm_size = sizeof(struct kvm_svm),
|
||||||
.vm_init = svm_vm_init,
|
.vm_init = svm_vm_init,
|
||||||
.vm_destroy = svm_vm_destroy,
|
.vm_destroy = svm_vm_destroy,
|
||||||
|
|
||||||
.prepare_guest_switch = svm_prepare_guest_switch,
|
.prepare_switch_to_guest = svm_prepare_switch_to_guest,
|
||||||
.vcpu_load = svm_vcpu_load,
|
.vcpu_load = svm_vcpu_load,
|
||||||
.vcpu_put = svm_vcpu_put,
|
.vcpu_put = svm_vcpu_put,
|
||||||
.vcpu_blocking = avic_vcpu_blocking,
|
.vcpu_blocking = avic_vcpu_blocking,
|
||||||
|
@ -4557,9 +4553,9 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
|
||||||
.get_segment = svm_get_segment,
|
.get_segment = svm_get_segment,
|
||||||
.set_segment = svm_set_segment,
|
.set_segment = svm_set_segment,
|
||||||
.get_cpl = svm_get_cpl,
|
.get_cpl = svm_get_cpl,
|
||||||
.get_cs_db_l_bits = kvm_get_cs_db_l_bits,
|
.get_cs_db_l_bits = svm_get_cs_db_l_bits,
|
||||||
.set_cr0 = svm_set_cr0,
|
.set_cr0 = svm_set_cr0,
|
||||||
.post_set_cr3 = svm_post_set_cr3,
|
.post_set_cr3 = sev_post_set_cr3,
|
||||||
.is_valid_cr4 = svm_is_valid_cr4,
|
.is_valid_cr4 = svm_is_valid_cr4,
|
||||||
.set_cr4 = svm_set_cr4,
|
.set_cr4 = svm_set_cr4,
|
||||||
.set_efer = svm_set_efer,
|
.set_efer = svm_set_efer,
|
||||||
|
@ -4574,21 +4570,21 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
|
||||||
.set_rflags = svm_set_rflags,
|
.set_rflags = svm_set_rflags,
|
||||||
.get_if_flag = svm_get_if_flag,
|
.get_if_flag = svm_get_if_flag,
|
||||||
|
|
||||||
.tlb_flush_all = svm_flush_tlb,
|
.flush_tlb_all = svm_flush_tlb_current,
|
||||||
.tlb_flush_current = svm_flush_tlb,
|
.flush_tlb_current = svm_flush_tlb_current,
|
||||||
.tlb_flush_gva = svm_flush_tlb_gva,
|
.flush_tlb_gva = svm_flush_tlb_gva,
|
||||||
.tlb_flush_guest = svm_flush_tlb,
|
.flush_tlb_guest = svm_flush_tlb_current,
|
||||||
|
|
||||||
.vcpu_pre_run = svm_vcpu_pre_run,
|
.vcpu_pre_run = svm_vcpu_pre_run,
|
||||||
.run = svm_vcpu_run,
|
.vcpu_run = svm_vcpu_run,
|
||||||
.handle_exit = handle_exit,
|
.handle_exit = svm_handle_exit,
|
||||||
.skip_emulated_instruction = skip_emulated_instruction,
|
.skip_emulated_instruction = svm_skip_emulated_instruction,
|
||||||
.update_emulated_instruction = NULL,
|
.update_emulated_instruction = NULL,
|
||||||
.set_interrupt_shadow = svm_set_interrupt_shadow,
|
.set_interrupt_shadow = svm_set_interrupt_shadow,
|
||||||
.get_interrupt_shadow = svm_get_interrupt_shadow,
|
.get_interrupt_shadow = svm_get_interrupt_shadow,
|
||||||
.patch_hypercall = svm_patch_hypercall,
|
.patch_hypercall = svm_patch_hypercall,
|
||||||
.set_irq = svm_set_irq,
|
.inject_irq = svm_inject_irq,
|
||||||
.set_nmi = svm_inject_nmi,
|
.inject_nmi = svm_inject_nmi,
|
||||||
.queue_exception = svm_queue_exception,
|
.queue_exception = svm_queue_exception,
|
||||||
.cancel_injection = svm_cancel_injection,
|
.cancel_injection = svm_cancel_injection,
|
||||||
.interrupt_allowed = svm_interrupt_allowed,
|
.interrupt_allowed = svm_interrupt_allowed,
|
||||||
|
@ -4598,18 +4594,11 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
|
||||||
.enable_nmi_window = svm_enable_nmi_window,
|
.enable_nmi_window = svm_enable_nmi_window,
|
||||||
.enable_irq_window = svm_enable_irq_window,
|
.enable_irq_window = svm_enable_irq_window,
|
||||||
.update_cr8_intercept = svm_update_cr8_intercept,
|
.update_cr8_intercept = svm_update_cr8_intercept,
|
||||||
.set_virtual_apic_mode = svm_set_virtual_apic_mode,
|
.refresh_apicv_exec_ctrl = avic_refresh_apicv_exec_ctrl,
|
||||||
.refresh_apicv_exec_ctrl = svm_refresh_apicv_exec_ctrl,
|
.check_apicv_inhibit_reasons = avic_check_apicv_inhibit_reasons,
|
||||||
.check_apicv_inhibit_reasons = svm_check_apicv_inhibit_reasons,
|
.apicv_post_state_restore = avic_apicv_post_state_restore,
|
||||||
.load_eoi_exitmap = svm_load_eoi_exitmap,
|
|
||||||
.hwapic_irr_update = svm_hwapic_irr_update,
|
|
||||||
.hwapic_isr_update = svm_hwapic_isr_update,
|
|
||||||
.apicv_post_state_restore = avic_post_state_restore,
|
|
||||||
|
|
||||||
.set_tss_addr = svm_set_tss_addr,
|
|
||||||
.set_identity_map_addr = svm_set_identity_map_addr,
|
|
||||||
.get_mt_mask = svm_get_mt_mask,
|
.get_mt_mask = svm_get_mt_mask,
|
||||||
|
|
||||||
.get_exit_info = svm_get_exit_info,
|
.get_exit_info = svm_get_exit_info,
|
||||||
|
|
||||||
.vcpu_after_set_cpuid = svm_vcpu_after_set_cpuid,
|
.vcpu_after_set_cpuid = svm_vcpu_after_set_cpuid,
|
||||||
|
@ -4634,8 +4623,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
|
||||||
.nested_ops = &svm_nested_ops,
|
.nested_ops = &svm_nested_ops,
|
||||||
|
|
||||||
.deliver_interrupt = svm_deliver_interrupt,
|
.deliver_interrupt = svm_deliver_interrupt,
|
||||||
.dy_apicv_has_pending_interrupt = svm_dy_apicv_has_pending_interrupt,
|
.pi_update_irte = avic_pi_update_irte,
|
||||||
.update_pi_irte = svm_update_pi_irte,
|
|
||||||
.setup_mce = svm_setup_mce,
|
.setup_mce = svm_setup_mce,
|
||||||
|
|
||||||
.smi_allowed = svm_smi_allowed,
|
.smi_allowed = svm_smi_allowed,
|
||||||
|
@ -4643,12 +4631,12 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
|
||||||
.leave_smm = svm_leave_smm,
|
.leave_smm = svm_leave_smm,
|
||||||
.enable_smi_window = svm_enable_smi_window,
|
.enable_smi_window = svm_enable_smi_window,
|
||||||
|
|
||||||
.mem_enc_op = svm_mem_enc_op,
|
.mem_enc_ioctl = sev_mem_enc_ioctl,
|
||||||
.mem_enc_reg_region = svm_register_enc_region,
|
.mem_enc_register_region = sev_mem_enc_register_region,
|
||||||
.mem_enc_unreg_region = svm_unregister_enc_region,
|
.mem_enc_unregister_region = sev_mem_enc_unregister_region,
|
||||||
|
|
||||||
.vm_copy_enc_context_from = svm_vm_copy_asid_from,
|
.vm_copy_enc_context_from = sev_vm_copy_enc_context_from,
|
||||||
.vm_move_enc_context_from = svm_vm_migrate_from,
|
.vm_move_enc_context_from = sev_vm_move_enc_context_from,
|
||||||
|
|
||||||
.can_emulate_instruction = svm_can_emulate_instruction,
|
.can_emulate_instruction = svm_can_emulate_instruction,
|
||||||
|
|
||||||
|
@ -4893,7 +4881,7 @@ static __init int svm_hardware_setup(void)
|
||||||
return 0;
|
return 0;
|
||||||
|
|
||||||
err:
|
err:
|
||||||
svm_hardware_teardown();
|
svm_hardware_unsetup();
|
||||||
return r;
|
return r;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
@@ -79,7 +79,8 @@ struct kvm_sev_info {
 	struct list_head regions_list;  /* List of registered regions */
 	u64 ap_jump_table;	/* SEV-ES AP Jump Table address */
 	struct kvm *enc_context_owner; /* Owner of copied encryption context */
-	unsigned long num_mirrored_vms; /* Number of VMs sharing this ASID */
+	struct list_head mirror_vms; /* List of VMs mirroring */
+	struct list_head mirror_entry; /* Use as a list entry of mirrors */
 	struct misc_cg *misc_cg; /* For misc cgroup accounting */
 	atomic_t migration_in_progress;
 };
@@ -137,6 +138,8 @@ struct vmcb_ctrl_area_cached {
 	u32 event_inj_err;
 	u64 nested_cr3;
 	u64 virt_ext;
+	u32 clean;
+	u8 reserved_sw[32];
 };
 
 struct svm_nested_state {
@@ -163,6 +166,15 @@ struct svm_nested_state {
 	struct vmcb_save_area_cached save;
 
 	bool initialized;
+
+	/*
+	 * Indicates whether MSR bitmap for L2 needs to be rebuilt due to
+	 * changes in MSR bitmap for L1 or switching to a different L2. Note,
+	 * this flag can only be used reliably in conjunction with a paravirt L1
+	 * which informs L0 whether any changes to MSR bitmap for L2 were done
+	 * on its side.
+	 */
+	bool force_msr_bitmap_recalc;
 };
 
 struct vcpu_sev_es_state {
@@ -321,7 +333,7 @@ static __always_inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
 
 /*
  * Only the PDPTRs are loaded on demand into the shadow MMU. All other
- * fields are synchronized in handle_exit, because accessing the VMCB is cheap.
+ * fields are synchronized on VM-Exit, because accessing the VMCB is cheap.
  *
  * CR3 might be out of date in the VMCB but it is not marked dirty; instead,
  * KVM_REQ_LOAD_MMU_PGD is always requested when the cached vcpu->arch.cr3
@@ -480,7 +492,6 @@ void svm_vcpu_free_msrpm(u32 *msrpm);
 int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer);
 void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
 void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
-void svm_flush_tlb(struct kvm_vcpu *vcpu);
 void disable_nmi_singlestep(struct vcpu_svm *svm);
 bool svm_smi_blocked(struct kvm_vcpu *vcpu);
 bool svm_nmi_blocked(struct kvm_vcpu *vcpu);
@@ -558,6 +569,17 @@ extern struct kvm_x86_nested_ops svm_nested_ops;
 
 /* avic.c */
 
+#define AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK (0xFF)
+#define AVIC_LOGICAL_ID_ENTRY_VALID_BIT 31
+#define AVIC_LOGICAL_ID_ENTRY_VALID_MASK (1 << 31)
+
+#define AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK GENMASK_ULL(11, 0)
+#define AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK (0xFFFFFFFFFFULL << 12)
+#define AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK (1ULL << 62)
+#define AVIC_PHYSICAL_ID_ENTRY_VALID_MASK (1ULL << 63)
+
+#define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL
+
 int avic_ga_log_notifier(u32 ga_tag);
 void avic_vm_destroy(struct kvm *kvm);
 int avic_vm_init(struct kvm *kvm);
@@ -565,18 +587,17 @@ void avic_init_vmcb(struct vcpu_svm *svm);
 int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu);
 int avic_unaccelerated_access_interception(struct kvm_vcpu *vcpu);
 int avic_init_vcpu(struct vcpu_svm *svm);
-void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
-void avic_vcpu_put(struct kvm_vcpu *vcpu);
-void avic_post_state_restore(struct kvm_vcpu *vcpu);
-void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu);
-void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu);
-bool svm_check_apicv_inhibit_reasons(ulong bit);
-void svm_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
-void svm_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr);
-void svm_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr);
-bool svm_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu);
-int svm_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
-		       uint32_t guest_irq, bool set);
+void __avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
+void __avic_vcpu_put(struct kvm_vcpu *vcpu);
+void avic_apicv_post_state_restore(struct kvm_vcpu *vcpu);
+void avic_set_virtual_apic_mode(struct kvm_vcpu *vcpu);
+void avic_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu);
+bool avic_check_apicv_inhibit_reasons(ulong bit);
+void avic_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr);
+void avic_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr);
+bool avic_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu);
+int avic_pi_update_irte(struct kvm *kvm, unsigned int host_irq,
+			uint32_t guest_irq, bool set);
 void avic_vcpu_blocking(struct kvm_vcpu *vcpu);
 void avic_vcpu_unblocking(struct kvm_vcpu *vcpu);
 void avic_ring_doorbell(struct kvm_vcpu *vcpu);
@@ -590,17 +611,17 @@ void avic_ring_doorbell(struct kvm_vcpu *vcpu);
 extern unsigned int max_sev_asid;
 
 void sev_vm_destroy(struct kvm *kvm);
-int svm_mem_enc_op(struct kvm *kvm, void __user *argp);
-int svm_register_enc_region(struct kvm *kvm,
-			    struct kvm_enc_region *range);
-int svm_unregister_enc_region(struct kvm *kvm,
-			      struct kvm_enc_region *range);
-int svm_vm_copy_asid_from(struct kvm *kvm, unsigned int source_fd);
-int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd);
+int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp);
+int sev_mem_enc_register_region(struct kvm *kvm,
+				struct kvm_enc_region *range);
+int sev_mem_enc_unregister_region(struct kvm *kvm,
+				  struct kvm_enc_region *range);
+int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd);
+int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd);
 void pre_sev_run(struct vcpu_svm *svm, int cpu);
 void __init sev_set_cpu_caps(void);
 void __init sev_hardware_setup(void);
-void sev_hardware_teardown(void);
+void sev_hardware_unsetup(void);
 int sev_cpu_init(struct svm_cpu_data *sd);
 void sev_free_vcpu(struct kvm_vcpu *vcpu);
 int sev_handle_vmgexit(struct kvm_vcpu *vcpu);
@@ -608,7 +629,7 @@ int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in);
 void sev_es_init_vmcb(struct vcpu_svm *svm);
 void sev_es_vcpu_reset(struct vcpu_svm *svm);
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
-void sev_es_prepare_guest_switch(struct vcpu_svm *svm, unsigned int cpu);
+void sev_es_prepare_switch_to_guest(struct vmcb_save_area *hostsa);
 void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 
 /* vmenter.S */
@@ -7,35 +7,12 @@
 #define __ARCH_X86_KVM_SVM_ONHYPERV_H__
 
 #if IS_ENABLED(CONFIG_HYPERV)
-#include <asm/mshyperv.h>
-
-#include "hyperv.h"
 #include "kvm_onhyperv.h"
+#include "svm/hyperv.h"
 
 static struct kvm_x86_ops svm_x86_ops;
 
-/*
- * Hyper-V uses the software reserved 32 bytes in VMCB
- * control area to expose SVM enlightenments to guests.
- */
-struct hv_enlightenments {
-	struct __packed hv_enlightenments_control {
-		u32 nested_flush_hypercall:1;
-		u32 msr_bitmap:1;
-		u32 enlightened_npt_tlb: 1;
-		u32 reserved:29;
-	} __packed hv_enlightenments_control;
-	u32 hv_vp_id;
-	u64 hv_vm_id;
-	u64 partition_assist_page;
-	u64 reserved;
-} __packed;
-
-/*
- * Hyper-V uses the software reserved clean bit in VMCB
- */
-#define VMCB_HV_NESTED_ENLIGHTENMENTS VMCB_SW
-
 int svm_hv_enable_direct_tlbflush(struct kvm_vcpu *vcpu);
 
 static inline void svm_hv_init_vmcb(struct vmcb *vmcb)
@@ -64,9 +64,9 @@ TRACE_EVENT(kvm_hypercall,
  * Tracepoint for hypercall.
  */
 TRACE_EVENT(kvm_hv_hypercall,
-	TP_PROTO(__u16 code, bool fast, __u16 rep_cnt, __u16 rep_idx,
-		 __u64 ingpa, __u64 outgpa),
-	TP_ARGS(code, fast, rep_cnt, rep_idx, ingpa, outgpa),
+	TP_PROTO(__u16 code, bool fast, __u16 var_cnt, __u16 rep_cnt,
+		 __u16 rep_idx, __u64 ingpa, __u64 outgpa),
+	TP_ARGS(code, fast, var_cnt, rep_cnt, rep_idx, ingpa, outgpa),
 
 	TP_STRUCT__entry(
 		__field( __u16, rep_cnt )
@@ -74,6 +74,7 @@ TRACE_EVENT(kvm_hv_hypercall,
 		__field( __u64, ingpa )
 		__field( __u64, outgpa )
 		__field( __u16, code )
+		__field( __u16, var_cnt )
 		__field( bool, fast )
 	),
 
@@ -83,13 +84,14 @@ TRACE_EVENT(kvm_hv_hypercall,
 		__entry->ingpa = ingpa;
 		__entry->outgpa = outgpa;
 		__entry->code = code;
+		__entry->var_cnt = var_cnt;
 		__entry->fast = fast;
 	),
 
-	TP_printk("code 0x%x %s cnt 0x%x idx 0x%x in 0x%llx out 0x%llx",
+	TP_printk("code 0x%x %s var_cnt 0x%x rep_cnt 0x%x idx 0x%x in 0x%llx out 0x%llx",
 		  __entry->code, __entry->fast ? "fast" : "slow",
-		  __entry->rep_cnt, __entry->rep_idx, __entry->ingpa,
-		  __entry->outgpa)
+		  __entry->var_cnt, __entry->rep_cnt, __entry->rep_idx,
+		  __entry->ingpa, __entry->outgpa)
 );
 
 TRACE_EVENT(kvm_hv_hypercall_done,
@@ -251,13 +253,13 @@ TRACE_EVENT(kvm_cpuid,
  * Tracepoint for apic access.
  */
 TRACE_EVENT(kvm_apic,
-	TP_PROTO(unsigned int rw, unsigned int reg, unsigned int val),
+	TP_PROTO(unsigned int rw, unsigned int reg, u64 val),
 	TP_ARGS(rw, reg, val),
 
 	TP_STRUCT__entry(
 		__field( unsigned int, rw )
 		__field( unsigned int, reg )
-		__field( unsigned int, val )
+		__field( u64, val )
 	),
 
 	TP_fast_assign(
@@ -266,7 +268,7 @@ TRACE_EVENT(kvm_apic,
 		__entry->val = val;
 	),
 
-	TP_printk("apic_%s %s = 0x%x",
+	TP_printk("apic_%s %s = 0x%llx",
 		  __entry->rw ? "write" : "read",
 		  __print_symbolic(__entry->reg, kvm_trace_symbol_apic),
 		  __entry->val)
@@ -320,7 +320,7 @@ static void free_nested(struct kvm_vcpu *vcpu)
 	kvm_vcpu_unmap(vcpu, &vmx->nested.pi_desc_map, true);
 	vmx->nested.pi_desc = NULL;
 
-	kvm_mmu_free_roots(vcpu, &vcpu->arch.guest_mmu, KVM_MMU_ROOTS_ALL);
+	kvm_mmu_free_roots(vcpu->kvm, &vcpu->arch.guest_mmu, KVM_MMU_ROOTS_ALL);
 
 	nested_release_evmcs(vcpu);
 
@@ -1125,15 +1125,15 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
 		return -EINVAL;
 	}
 
-	if (!nested_ept)
-		kvm_mmu_new_pgd(vcpu, cr3);
-
 	vcpu->arch.cr3 = cr3;
 	kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3);
 
 	/* Re-initialize the MMU, e.g. to pick up CR4 MMU role changes. */
 	kvm_init_mmu(vcpu);
 
+	if (!nested_ept)
+		kvm_mmu_new_pgd(vcpu, cr3);
+
 	return 0;
 }
 
@@ -4802,7 +4802,8 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
 	return 0;
 }
 
-void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
+void nested_vmx_pmu_refresh(struct kvm_vcpu *vcpu,
+			    bool vcpu_has_perf_global_ctrl)
 {
 	struct vcpu_vmx *vmx;
 
@@ -4810,7 +4811,7 @@ void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 		return;
 
 	vmx = to_vmx(vcpu);
-	if (kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL)) {
+	if (vcpu_has_perf_global_ctrl) {
 		vmx->nested.msrs.entry_ctls_high |=
 				VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
 		vmx->nested.msrs.exit_ctls_high |=
@@ -5011,7 +5012,7 @@ static inline void nested_release_vmcs12(struct kvm_vcpu *vcpu)
 				  vmx->nested.current_vmptr >> PAGE_SHIFT,
 				  vmx->nested.cached_vmcs12, 0, VMCS12_SIZE);
 
-	kvm_mmu_free_roots(vcpu, &vcpu->arch.guest_mmu, KVM_MMU_ROOTS_ALL);
+	kvm_mmu_free_roots(vcpu->kvm, &vcpu->arch.guest_mmu, KVM_MMU_ROOTS_ALL);
 
 	vmx->nested.current_vmptr = INVALID_GPA;
 }
@@ -5470,7 +5471,7 @@ static int handle_invept(struct kvm_vcpu *vcpu)
 					VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);
 
 		roots_to_free = 0;
-		if (nested_ept_root_matches(mmu->root_hpa, mmu->root_pgd,
+		if (nested_ept_root_matches(mmu->root.hpa, mmu->root.pgd,
 					    operand.eptp))
 			roots_to_free |= KVM_MMU_ROOT_CURRENT;
 
@@ -5490,7 +5491,7 @@ static int handle_invept(struct kvm_vcpu *vcpu)
 	}
 
 	if (roots_to_free)
-		kvm_mmu_free_roots(vcpu, mmu, roots_to_free);
+		kvm_mmu_free_roots(vcpu->kvm, mmu, roots_to_free);
 
 	return nested_vmx_succeed(vcpu);
 }
@@ -5579,7 +5580,7 @@ static int handle_invvpid(struct kvm_vcpu *vcpu)
 	 * TODO: sync only the affected SPTEs for INVDIVIDUAL_ADDR.
 	 */
 	if (!enable_ept)
-		kvm_mmu_free_guest_mode_roots(vcpu, &vcpu->arch.root_mmu);
+		kvm_mmu_free_guest_mode_roots(vcpu->kvm, &vcpu->arch.root_mmu);
 
 	return nested_vmx_succeed(vcpu);
 }
@@ -32,7 +32,8 @@ int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
 int vmx_get_vmx_msr(struct nested_vmx_msrs *msrs, u32 msr_index, u64 *pdata);
 int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
 			u32 vmx_instruction_info, bool wr, int len, gva_t *ret);
-void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu);
+void nested_vmx_pmu_refresh(struct kvm_vcpu *vcpu,
+			    bool vcpu_has_perf_global_ctrl);
 void nested_mark_vmcs12_pages_dirty(struct kvm_vcpu *vcpu);
 bool nested_vmx_check_io_bitmaps(struct kvm_vcpu *vcpu, unsigned int port,
 				 int size);
@@ -487,7 +487,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	pmu->reserved_bits = 0xffffffff00200000ull;
 
 	entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
-	if (!entry || !enable_pmu)
+	if (!entry || !vcpu->kvm->arch.enable_pmu)
 		return;
 	eax.full = entry->eax;
 	edx.full = entry->edx;
@@ -541,7 +541,8 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	bitmap_set(pmu->all_valid_pmc_idx,
 		INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);
 
-	nested_vmx_pmu_entry_exit_ctls_update(vcpu);
+	nested_vmx_pmu_refresh(vcpu,
+			       intel_is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL));
 
 	if (intel_pmu_lbr_is_compatible(vcpu))
 		x86_perf_get_lbr(&lbr_desc->records);
@@ -244,7 +244,7 @@ void vmx_pi_start_assignment(struct kvm *kvm)
 }
 
 /*
- * pi_update_irte - set IRTE for Posted-Interrupts
+ * vmx_pi_update_irte - set IRTE for Posted-Interrupts
  *
  * @kvm: kvm
  * @host_irq: host irq of the interrupt
@@ -252,8 +252,8 @@ void vmx_pi_start_assignment(struct kvm *kvm)
  * @set: set or unset PI
  * returns 0 on success, < 0 on failure
  */
-int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
-		   bool set)
+int vmx_pi_update_irte(struct kvm *kvm, unsigned int host_irq,
+		       uint32_t guest_irq, bool set)
 {
 	struct kvm_kernel_irq_routing_entry *e;
 	struct kvm_irq_routing_table *irq_rt;
@@ -97,8 +97,8 @@ void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu);
 void pi_wakeup_handler(void);
 void __init pi_init_cpu(int cpu);
 bool pi_has_pending_interrupt(struct kvm_vcpu *vcpu);
-int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
-		   bool set);
+int vmx_pi_update_irte(struct kvm *kvm, unsigned int host_irq,
+		       uint32_t guest_irq, bool set);
 void vmx_pi_start_assignment(struct kvm *kvm);
 
 #endif /* __KVM_X86_VMX_POSTED_INTR_H */
@@ -541,11 +541,6 @@ static inline bool cpu_need_virtualize_apic_accesses(struct kvm_vcpu *vcpu)
 	return flexpriority_enabled && lapic_in_kernel(vcpu);
 }
 
-static inline bool report_flexpriority(void)
-{
-	return flexpriority_enabled;
-}
-
 static int possible_passthrough_msr_slot(u32 msr)
 {
 	u32 i;
@@ -645,10 +640,10 @@ static void __loaded_vmcs_clear(void *arg)
 
 	/*
 	 * Ensure all writes to loaded_vmcs, including deleting it from its
-	 * current percpu list, complete before setting loaded_vmcs->vcpu to
-	 * -1, otherwise a different cpu can see vcpu == -1 first and add
-	 * loaded_vmcs to its percpu list before it's deleted from this cpu's
-	 * list. Pairs with the smp_rmb() in vmx_vcpu_load_vmcs().
+	 * current percpu list, complete before setting loaded_vmcs->cpu to
+	 * -1, otherwise a different cpu can see loaded_vmcs->cpu == -1 first
+	 * and add loaded_vmcs to its percpu list before it's deleted from this
+	 * cpu's list. Pairs with the smp_rmb() in vmx_vcpu_load_vmcs().
 	 */
 	smp_wmb();
 
@@ -2334,7 +2329,7 @@ static int kvm_cpu_vmxon(u64 vmxon_pointer)
 	return -EFAULT;
 }
 
-static int hardware_enable(void)
+static int vmx_hardware_enable(void)
 {
 	int cpu = raw_smp_processor_id();
 	u64 phys_addr = __pa(per_cpu(vmxarea, cpu));
@@ -2375,7 +2370,7 @@ static void vmclear_local_loaded_vmcss(void)
 		__loaded_vmcs_clear(v);
 }
 
-static void hardware_disable(void)
+static void vmx_hardware_disable(void)
 {
 	vmclear_local_loaded_vmcss();
 
@@ -2950,7 +2945,7 @@ static inline int vmx_get_current_vpid(struct kvm_vcpu *vcpu)
 static void vmx_flush_tlb_current(struct kvm_vcpu *vcpu)
 {
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
-	u64 root_hpa = mmu->root_hpa;
+	u64 root_hpa = mmu->root.hpa;
 
 	/* No flush required if the current context is invalid. */
 	if (!VALID_PAGE(root_hpa))
@@ -3930,31 +3925,33 @@ static inline void kvm_vcpu_trigger_posted_interrupt(struct kvm_vcpu *vcpu,
 #ifdef CONFIG_SMP
 	if (vcpu->mode == IN_GUEST_MODE) {
 		/*
-		 * The vector of interrupt to be delivered to vcpu had
-		 * been set in PIR before this function.
+		 * The vector of the virtual has already been set in the PIR.
+		 * Send a notification event to deliver the virtual interrupt
+		 * unless the vCPU is the currently running vCPU, i.e. the
+		 * event is being sent from a fastpath VM-Exit handler, in
+		 * which case the PIR will be synced to the vIRR before
+		 * re-entering the guest.
 		 *
-		 * Following cases will be reached in this block, and
-		 * we always send a notification event in all cases as
-		 * explained below.
+		 * When the target is not the running vCPU, the following
+		 * possibilities emerge:
 		 *
-		 * Case 1: vcpu keeps in non-root mode. Sending a
-		 * notification event posts the interrupt to vcpu.
+		 * Case 1: vCPU stays in non-root mode. Sending a notification
+		 * event posts the interrupt to the vCPU.
 		 *
-		 * Case 2: vcpu exits to root mode and is still
-		 * runnable. PIR will be synced to vIRR before the
-		 * next vcpu entry. Sending a notification event in
-		 * this case has no effect, as vcpu is not in root
-		 * mode.
+		 * Case 2: vCPU exits to root mode and is still runnable. The
+		 * PIR will be synced to the vIRR before re-entering the guest.
+		 * Sending a notification event is ok as the host IRQ handler
+		 * will ignore the spurious event.
 		 *
-		 * Case 3: vcpu exits to root mode and is blocked.
-		 * vcpu_block() has already synced PIR to vIRR and
-		 * never blocks vcpu if vIRR is not cleared. Therefore,
-		 * a blocked vcpu here does not wait for any requested
-		 * interrupts in PIR, and sending a notification event
-		 * which has no effect is safe here.
+		 * Case 3: vCPU exits to root mode and is blocked. vcpu_block()
+		 * has already synced PIR to vIRR and never blocks the vCPU if
+		 * the vIRR is not empty. Therefore, a blocked vCPU here does
+		 * not wait for any requested interrupts in PIR, and sending a
+		 * notification event also results in a benign, spurious event.
 		 */
 
-		apic->send_IPI_mask(get_cpu_mask(vcpu->cpu), pi_vec);
+		if (vcpu != kvm_get_running_vcpu())
+			apic->send_IPI_mask(get_cpu_mask(vcpu->cpu), pi_vec);
 		return;
 	}
 #endif
@@ -5177,7 +5174,7 @@ static int handle_dr(struct kvm_vcpu *vcpu)
 		if (!kvm_require_dr(vcpu, dr))
 			return 1;
 
-		if (kvm_x86_ops.get_cpl(vcpu) > 0)
+		if (vmx_get_cpl(vcpu) > 0)
 			goto out;
 
 		dr7 = vmcs_readl(GUEST_DR7);
@@ -5310,9 +5307,16 @@ static int handle_apic_eoi_induced(struct kvm_vcpu *vcpu)
 static int handle_apic_write(struct kvm_vcpu *vcpu)
 {
 	unsigned long exit_qualification = vmx_get_exit_qual(vcpu);
-	u32 offset = exit_qualification & 0xfff;
 
-	/* APIC-write VM exit is trap-like and thus no need to adjust IP */
+	/*
+	 * APIC-write VM-Exit is trap-like, KVM doesn't need to advance RIP and
+	 * hardware has done any necessary aliasing, offset adjustments, etc...
+	 * for the access. I.e. the correct value has already been written to
+	 * the vAPIC page for the correct 16-byte chunk. KVM needs only to
+	 * retrieve the register value and emulate the access.
+	 */
+	u32 offset = exit_qualification & 0xff0;
+
 	kvm_apic_write_nodecode(vcpu, offset);
 	return 1;
 }
@@ -6975,7 +6979,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	return vmx_exit_handlers_fastpath(vcpu);
 }
 
-static void vmx_free_vcpu(struct kvm_vcpu *vcpu)
+static void vmx_vcpu_free(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
@@ -6986,7 +6990,7 @@ static void vmx_free_vcpu(struct kvm_vcpu *vcpu)
 	free_loaded_vmcs(vmx->loaded_vmcs);
 }
 
-static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
+static int vmx_vcpu_create(struct kvm_vcpu *vcpu)
 {
 	struct vmx_uret_msr *tsx_ctrl;
 	struct vcpu_vmx *vmx;
@@ -7348,11 +7352,11 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 					  vmx_secondary_exec_control(vmx));
 
 	if (nested_vmx_allowed(vcpu))
-		to_vmx(vcpu)->msr_ia32_feature_control_valid_bits |=
+		vmx->msr_ia32_feature_control_valid_bits |=
 			FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
 			FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
 	else
-		to_vmx(vcpu)->msr_ia32_feature_control_valid_bits &=
+		vmx->msr_ia32_feature_control_valid_bits &=
 			~(FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
 			  FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX);
 
@@ -7691,7 +7695,7 @@ static void vmx_migrate_timers(struct kvm_vcpu *vcpu)
 	}
 }
 
-static void hardware_unsetup(void)
+static void vmx_hardware_unsetup(void)
 {
 	kvm_set_posted_intr_wakeup_handler(NULL);
 
@@ -7714,21 +7718,20 @@ static bool vmx_check_apicv_inhibit_reasons(ulong bit)
 static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.name = "kvm_intel",
 
-	.hardware_unsetup = hardware_unsetup,
+	.hardware_unsetup = vmx_hardware_unsetup,
 
-	.hardware_enable = hardware_enable,
-	.hardware_disable = hardware_disable,
-	.cpu_has_accelerated_tpr = report_flexpriority,
+	.hardware_enable = vmx_hardware_enable,
+	.hardware_disable = vmx_hardware_disable,
 	.has_emulated_msr = vmx_has_emulated_msr,
 
 	.vm_size = sizeof(struct kvm_vmx),
 	.vm_init = vmx_vm_init,
 
-	.vcpu_create = vmx_create_vcpu,
-	.vcpu_free = vmx_free_vcpu,
+	.vcpu_create = vmx_vcpu_create,
+	.vcpu_free = vmx_vcpu_free,
 	.vcpu_reset = vmx_vcpu_reset,
 
-	.prepare_guest_switch = vmx_prepare_switch_to_guest,
+	.prepare_switch_to_guest = vmx_prepare_switch_to_guest,
 	.vcpu_load = vmx_vcpu_load,
 	.vcpu_put = vmx_vcpu_put,
 
@@ -7756,21 +7759,21 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.set_rflags = vmx_set_rflags,
 	.get_if_flag = vmx_get_if_flag,
 
-	.tlb_flush_all = vmx_flush_tlb_all,
-	.tlb_flush_current = vmx_flush_tlb_current,
-	.tlb_flush_gva = vmx_flush_tlb_gva,
-	.tlb_flush_guest = vmx_flush_tlb_guest,
+	.flush_tlb_all = vmx_flush_tlb_all,
+	.flush_tlb_current = vmx_flush_tlb_current,
+	.flush_tlb_gva = vmx_flush_tlb_gva,
+	.flush_tlb_guest = vmx_flush_tlb_guest,
 
 	.vcpu_pre_run = vmx_vcpu_pre_run,
-	.run = vmx_vcpu_run,
+	.vcpu_run = vmx_vcpu_run,
 	.handle_exit = vmx_handle_exit,
 	.skip_emulated_instruction = vmx_skip_emulated_instruction,
 	.update_emulated_instruction = vmx_update_emulated_instruction,
 	.set_interrupt_shadow = vmx_set_interrupt_shadow,
 	.get_interrupt_shadow = vmx_get_interrupt_shadow,
 	.patch_hypercall = vmx_patch_hypercall,
-	.set_irq = vmx_inject_irq,
-	.set_nmi = vmx_inject_nmi,
+	.inject_irq = vmx_inject_irq,
+	.inject_nmi = vmx_inject_nmi,
 	.queue_exception = vmx_queue_exception,
 	.cancel_injection = vmx_cancel_injection,
 	.interrupt_allowed = vmx_interrupt_allowed,
@@ -7823,8 +7826,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.pmu_ops = &intel_pmu_ops,
 	.nested_ops = &vmx_nested_ops,
 
-	.update_pi_irte = pi_update_irte,
-	.start_assignment = vmx_pi_start_assignment,
+	.pi_update_irte = vmx_pi_update_irte,
+	.pi_start_assignment = vmx_pi_start_assignment,
 
 #ifdef CONFIG_X86_64
 	.set_hv_timer = vmx_set_hv_timer,
@@ -8059,7 +8062,7 @@ static __init int hardware_setup(void)
 	vmx_set_cpu_caps();
 
 	r = alloc_kvm_area();
-	if (r)
+	if (r && nested)
 		nested_vmx_hardware_unsetup();
 
 	kvm_set_posted_intr_wakeup_handler(pi_wakeup_handler);
@@ -8139,7 +8142,6 @@ static int __init vmx_init(void)
 	    ms_hyperv.hints & HV_X64_ENLIGHTENED_VMCS_RECOMMENDED &&
 	    (ms_hyperv.nested_features & HV_X64_ENLIGHTENED_VMCS_VERSION) >=
 	    KVM_EVMCS_VERSION) {
-		int cpu;
 
 		/* Check that we have assist pages on all online CPUs */
 		for_each_online_cpu(cpu) {
@@ -110,6 +110,8 @@ static u64 __read_mostly cr4_reserved_bits = CR4_RESERVED_BITS;
 
 #define KVM_EXIT_HYPERCALL_VALID_MASK (1 << KVM_HC_MAP_GPA_RANGE)
 
+#define KVM_CAP_PMU_VALID_MASK KVM_PMU_CAP_DISABLE
+
 #define KVM_X2APIC_API_VALID_FLAGS (KVM_X2APIC_API_USE_32BIT_IDS | \
                                     KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK)
 
@@ -126,16 +128,15 @@ static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
 static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
 
 struct kvm_x86_ops kvm_x86_ops __read_mostly;
-EXPORT_SYMBOL_GPL(kvm_x86_ops);
 
 #define KVM_X86_OP(func) \
 	DEFINE_STATIC_CALL_NULL(kvm_x86_##func, \
 				*(((struct kvm_x86_ops *)0)->func));
-#define KVM_X86_OP_NULL KVM_X86_OP
+#define KVM_X86_OP_OPTIONAL KVM_X86_OP
+#define KVM_X86_OP_OPTIONAL_RET0 KVM_X86_OP
 #include <asm/kvm-x86-ops.h>
 EXPORT_STATIC_CALL_GPL(kvm_x86_get_cs_db_l_bits);
 EXPORT_STATIC_CALL_GPL(kvm_x86_cache_reg);
-EXPORT_STATIC_CALL_GPL(kvm_x86_tlb_flush_current);
 
 static bool __read_mostly ignore_msrs = 0;
 module_param(ignore_msrs, bool, S_IRUGO | S_IWUSR);
@@ -194,6 +195,9 @@ bool __read_mostly enable_pmu = true;
 EXPORT_SYMBOL_GPL(enable_pmu);
 module_param(enable_pmu, bool, 0444);
 
+bool __read_mostly eager_page_split = true;
+module_param(eager_page_split, bool, 0644);
+
 /*
  * Restoring the host value for MSRs that are only consumed when running in
  * usermode, e.g. SYSCALL MSRs and TSC_AUX, can be deferred until the CPU
@@ -760,7 +764,7 @@ bool kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu,
 	if ((fault->error_code & PFERR_PRESENT_MASK) &&
 	    !(fault->error_code & PFERR_RSVD_MASK))
 		kvm_mmu_invalidate_gva(vcpu, fault_mmu, fault->address,
-				       fault_mmu->root_hpa);
+				       fault_mmu->root.hpa);
 
 	fault_mmu->inject_page_fault(vcpu, fault);
 	return fault->nested_page_fault;
@@ -853,7 +857,7 @@ int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
 	 * Shadow page roots need to be reconstructed instead.
 	 */
 	if (!tdp_enabled && memcmp(mmu->pdptrs, pdpte, sizeof(mmu->pdptrs)))
-		kvm_mmu_free_roots(vcpu, mmu, KVM_MMU_ROOT_CURRENT);
+		kvm_mmu_free_roots(vcpu->kvm, mmu, KVM_MMU_ROOT_CURRENT);
 
 	memcpy(mmu->pdptrs, pdpte, sizeof(mmu->pdptrs));
 	kvm_register_mark_dirty(vcpu, VCPU_EXREG_PDPTR);
@@ -869,6 +873,13 @@ void kvm_post_set_cr0(struct kvm_vcpu *vcpu, unsigned long old_cr0, unsigned lon
 	if ((cr0 ^ old_cr0) & X86_CR0_PG) {
 		kvm_clear_async_pf_completion_queue(vcpu);
 		kvm_async_pf_hash_reset(vcpu);
+
+		/*
+		 * Clearing CR0.PG is defined to flush the TLB from the guest's
+		 * perspective.
+		 */
+		if (!(cr0 & X86_CR0_PG))
+			kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu);
 	}
 
 	if ((cr0 ^ old_cr0) & KVM_MMU_CR0_ROLE_BITS)
@@ -1067,28 +1078,43 @@ EXPORT_SYMBOL_GPL(kvm_is_valid_cr4);
 
 void kvm_post_set_cr4(struct kvm_vcpu *vcpu, unsigned long old_cr4, unsigned long cr4)
 {
+	if ((cr4 ^ old_cr4) & KVM_MMU_CR4_ROLE_BITS)
+		kvm_mmu_reset_context(vcpu);
+
 	/*
-	 * If any role bit is changed, the MMU needs to be reset.
-	 *
-	 * If CR4.PCIDE is changed 1 -> 0, the guest TLB must be flushed.
 	 * If CR4.PCIDE is changed 0 -> 1, there is no need to flush the TLB
 	 * according to the SDM; however, stale prev_roots could be reused
 	 * incorrectly in the future after a MOV to CR3 with NOFLUSH=1, so we
-	 * free them all. KVM_REQ_MMU_RELOAD is fit for the both cases; it
-	 * is slow, but changing CR4.PCIDE is a rare case.
-	 *
-	 * If CR4.PGE is changed, the guest TLB must be flushed.
-	 *
-	 * Note: resetting MMU is a superset of KVM_REQ_MMU_RELOAD and
-	 * KVM_REQ_MMU_RELOAD is a superset of KVM_REQ_TLB_FLUSH_GUEST, hence
-	 * the usage of "else if".
+	 * free them all. This is *not* a superset of KVM_REQ_TLB_FLUSH_GUEST
+	 * or KVM_REQ_TLB_FLUSH_CURRENT, because the hardware TLB is not flushed,
+	 * so fall through.
 	 */
-	if ((cr4 ^ old_cr4) & KVM_MMU_CR4_ROLE_BITS)
-		kvm_mmu_reset_context(vcpu);
-	else if ((cr4 ^ old_cr4) & X86_CR4_PCIDE)
-		kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu);
-	else if ((cr4 ^ old_cr4) & X86_CR4_PGE)
+	if (!tdp_enabled &&
+	    (cr4 & X86_CR4_PCIDE) && !(old_cr4 & X86_CR4_PCIDE))
+		kvm_mmu_unload(vcpu);
+
+	/*
+	 * The TLB has to be flushed for all PCIDs if any of the following
+	 * (architecturally required) changes happen:
+	 * - CR4.PCIDE is changed from 1 to 0
+	 * - CR4.PGE is toggled
+	 *
+	 * This is a superset of KVM_REQ_TLB_FLUSH_CURRENT.
+	 */
+	if (((cr4 ^ old_cr4) & X86_CR4_PGE) ||
+	    (!(cr4 & X86_CR4_PCIDE) && (old_cr4 & X86_CR4_PCIDE)))
 		kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu);
+
+	/*
+	 * The TLB has to be flushed for the current PCID if any of the
+	 * following (architecturally required) changes happen:
+	 * - CR4.SMEP is changed from 0 to 1
+	 * - CR4.PAE is toggled
+	 */
+	else if (((cr4 ^ old_cr4) & X86_CR4_PAE) ||
+		 ((cr4 & X86_CR4_SMEP) && !(old_cr4 & X86_CR4_SMEP)))
+		kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
+
 }
 EXPORT_SYMBOL_GPL(kvm_post_set_cr4);
 
@@ -1166,7 +1192,7 @@ static void kvm_invalidate_pcid(struct kvm_vcpu *vcpu, unsigned long pcid)
 		if (kvm_get_pcid(vcpu, mmu->prev_roots[i].pgd) == pcid)
 			roots_to_free |= KVM_MMU_ROOT_PREVIOUS(i);
 
-	kvm_mmu_free_roots(vcpu, mmu, roots_to_free);
+	kvm_mmu_free_roots(vcpu->kvm, mmu, roots_to_free);
 }
 
 int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
@@ -1656,8 +1682,7 @@ static int set_efer(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		return r;
 	}
 
-	/* Update reserved bits */
-	if ((efer ^ old_efer) & EFER_NX)
+	if ((efer ^ old_efer) & KVM_MMU_EFER_ROLE_BITS)
 		kvm_mmu_reset_context(vcpu);
 
 	return 0;
@@ -2026,17 +2051,10 @@ static int handle_fastpath_set_x2apic_icr_irqoff(struct kvm_vcpu *vcpu, u64 data
 		return 1;
 
 	if (((data & APIC_SHORT_MASK) == APIC_DEST_NOSHORT) &&
 	    ((data & APIC_DEST_MASK) == APIC_DEST_PHYSICAL) &&
 	    ((data & APIC_MODE_MASK) == APIC_DM_FIXED) &&
-	    ((u32)(data >> 32) != X2APIC_BROADCAST)) {
-		data &= ~(1 << 12);
-		kvm_apic_send_ipi(vcpu->arch.apic, (u32)data, (u32)(data >> 32));
-		kvm_lapic_set_reg(vcpu->arch.apic, APIC_ICR2, (u32)(data >> 32));
-		kvm_lapic_set_reg(vcpu->arch.apic, APIC_ICR, (u32)data);
-		trace_kvm_apic_write(APIC_ICR, (u32)data);
-		return 0;
-	}
+	    ((u32)(data >> 32) != X2APIC_BROADCAST))
+		return kvm_x2apic_icr_write(vcpu->arch.apic, data);
 
 	return 1;
 }
@@ -2413,7 +2431,7 @@ static inline u64 __scale_tsc(u64 ratio, u64 tsc)
 	return mul_u64_u64_shr(tsc, ratio, kvm_tsc_scaling_ratio_frac_bits);
 }
 
-u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc, u64 ratio)
+u64 kvm_scale_tsc(u64 tsc, u64 ratio)
 {
 	u64 _tsc = tsc;
 
@@ -2428,7 +2446,7 @@ static u64 kvm_compute_l1_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
 {
 	u64 tsc;
 
-	tsc = kvm_scale_tsc(vcpu, rdtsc(), vcpu->arch.l1_tsc_scaling_ratio);
+	tsc = kvm_scale_tsc(rdtsc(), vcpu->arch.l1_tsc_scaling_ratio);
 
 	return target_tsc - tsc;
 }
@@ -2436,7 +2454,7 @@ static u64 kvm_compute_l1_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
 u64 kvm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc)
 {
 	return vcpu->arch.l1_tsc_offset +
-		kvm_scale_tsc(vcpu, host_tsc, vcpu->arch.l1_tsc_scaling_ratio);
+		kvm_scale_tsc(host_tsc, vcpu->arch.l1_tsc_scaling_ratio);
 }
 EXPORT_SYMBOL_GPL(kvm_read_l1_tsc);
 
@@ -2639,7 +2657,7 @@ static inline void adjust_tsc_offset_host(struct kvm_vcpu *vcpu, s64 adjustment)
 {
 	if (vcpu->arch.l1_tsc_scaling_ratio != kvm_default_tsc_scaling_ratio)
 		WARN_ON(adjustment < 0);
-	adjustment = kvm_scale_tsc(vcpu, (u64) adjustment,
+	adjustment = kvm_scale_tsc((u64) adjustment,
 				   vcpu->arch.l1_tsc_scaling_ratio);
 	adjust_tsc_offset_guest(vcpu, adjustment);
 }
@@ -3059,7 +3077,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 	/* With all the info we got, fill in the values */
 
 	if (kvm_has_tsc_control)
-		tgt_tsc_khz = kvm_scale_tsc(v, tgt_tsc_khz,
+		tgt_tsc_khz = kvm_scale_tsc(tgt_tsc_khz,
 					    v->arch.l1_tsc_scaling_ratio);
 
 	if (unlikely(vcpu->hw_tsc_khz != tgt_tsc_khz)) {
@@ -3282,7 +3300,7 @@ static void kvmclock_reset(struct kvm_vcpu *vcpu)
 static void kvm_vcpu_flush_tlb_all(struct kvm_vcpu *vcpu)
 {
 	++vcpu->stat.tlb_flush;
-	static_call(kvm_x86_tlb_flush_all)(vcpu);
+	static_call(kvm_x86_flush_tlb_all)(vcpu);
 }
 
 static void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu)
@@ -3300,14 +3318,14 @@ static void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu)
 		kvm_mmu_sync_prev_roots(vcpu);
 	}
 
-	static_call(kvm_x86_tlb_flush_guest)(vcpu);
+	static_call(kvm_x86_flush_tlb_guest)(vcpu);
 }
 
 
 static inline void kvm_vcpu_flush_tlb_current(struct kvm_vcpu *vcpu)
 {
 	++vcpu->stat.tlb_flush;
-	static_call(kvm_x86_tlb_flush_current)(vcpu);
+	static_call(kvm_x86_flush_tlb_current)(vcpu);
 }
 
 /*
@@ -3869,7 +3887,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			ratio = vcpu->arch.tsc_scaling_ratio;
 		}
 
-		msr_info->data = kvm_scale_tsc(vcpu, rdtsc(), ratio) + offset;
+		msr_info->data = kvm_scale_tsc(rdtsc(), ratio) + offset;
 		break;
 	}
 	case MSR_MTRRcap:
@@ -4245,6 +4263,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_EXIT_ON_EMULATION_FAILURE:
 	case KVM_CAP_VCPU_ATTRIBUTES:
 	case KVM_CAP_SYS_ATTRIBUTES:
+	case KVM_CAP_VAPIC:
 	case KVM_CAP_ENABLE_CAP:
 		r = 1;
 		break;
@@ -4286,9 +4305,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		 */
 		r = static_call(kvm_x86_has_emulated_msr)(kvm, MSR_IA32_SMBASE);
 		break;
-	case KVM_CAP_VAPIC:
-		r = !static_call(kvm_x86_cpu_has_accelerated_tpr)();
-		break;
 	case KVM_CAP_NR_VCPUS:
 		r = min_t(unsigned int, num_online_cpus(), KVM_MAX_VCPUS);
 		break;
@@ -4343,7 +4359,13 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		if (r < sizeof(struct kvm_xsave))
 			r = sizeof(struct kvm_xsave);
 		break;
+	case KVM_CAP_PMU_CAPABILITY:
+		r = enable_pmu ? KVM_CAP_PMU_VALID_MASK : 0;
+		break;
 	}
+	case KVM_CAP_DISABLE_QUIRKS2:
+		r = KVM_X86_VALID_QUIRKS;
+		break;
 	default:
 		break;
 	}
@@ -5145,7 +5167,7 @@ static int kvm_arch_tsc_set_attr(struct kvm_vcpu *vcpu,
 			   kvm->arch.last_tsc_khz == vcpu->arch.virtual_tsc_khz &&
 			   kvm->arch.last_tsc_offset == offset);
 
-		tsc = kvm_scale_tsc(vcpu, rdtsc(), vcpu->arch.l1_tsc_scaling_ratio) + offset;
+		tsc = kvm_scale_tsc(rdtsc(), vcpu->arch.l1_tsc_scaling_ratio) + offset;
 		ns = get_kvmclock_base_ns();
 
 		__kvm_synchronize_tsc(vcpu, offset, tsc, ns, matched);
@ -5890,6 +5912,11 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
|
||||||
return -EINVAL;
|
return -EINVAL;
|
||||||
|
|
||||||
switch (cap->cap) {
|
switch (cap->cap) {
|
||||||
|
case KVM_CAP_DISABLE_QUIRKS2:
|
||||||
|
r = -EINVAL;
|
||||||
|
if (cap->args[0] & ~KVM_X86_VALID_QUIRKS)
|
||||||
|
break;
|
||||||
|
fallthrough;
|
||||||
case KVM_CAP_DISABLE_QUIRKS:
|
case KVM_CAP_DISABLE_QUIRKS:
|
||||||
kvm->arch.disabled_quirks = cap->args[0];
|
kvm->arch.disabled_quirks = cap->args[0];
|
||||||
r = 0;
|
r = 0;
|
||||||
|
@ -5990,15 +6017,18 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
|
||||||
#endif
|
#endif
|
||||||
case KVM_CAP_VM_COPY_ENC_CONTEXT_FROM:
|
case KVM_CAP_VM_COPY_ENC_CONTEXT_FROM:
|
||||||
r = -EINVAL;
|
r = -EINVAL;
|
||||||
if (kvm_x86_ops.vm_copy_enc_context_from)
|
if (!kvm_x86_ops.vm_copy_enc_context_from)
|
||||||
r = kvm_x86_ops.vm_copy_enc_context_from(kvm, cap->args[0]);
|
break;
|
||||||
return r;
|
|
||||||
|
r = static_call(kvm_x86_vm_copy_enc_context_from)(kvm, cap->args[0]);
|
||||||
|
break;
|
||||||
case KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM:
|
case KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM:
|
||||||
r = -EINVAL;
|
r = -EINVAL;
|
||||||
if (kvm_x86_ops.vm_move_enc_context_from)
|
if (!kvm_x86_ops.vm_move_enc_context_from)
|
||||||
r = kvm_x86_ops.vm_move_enc_context_from(
|
break;
|
||||||
kvm, cap->args[0]);
|
|
||||||
return r;
|
r = static_call(kvm_x86_vm_move_enc_context_from)(kvm, cap->args[0]);
|
||||||
|
break;
|
||||||
case KVM_CAP_EXIT_HYPERCALL:
|
case KVM_CAP_EXIT_HYPERCALL:
|
||||||
if (cap->args[0] & ~KVM_EXIT_HYPERCALL_VALID_MASK) {
|
if (cap->args[0] & ~KVM_EXIT_HYPERCALL_VALID_MASK) {
|
||||||
r = -EINVAL;
|
r = -EINVAL;
|
||||||
|
@ -6014,6 +6044,18 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
|
||||||
kvm->arch.exit_on_emulation_error = cap->args[0];
|
kvm->arch.exit_on_emulation_error = cap->args[0];
|
||||||
r = 0;
|
r = 0;
|
||||||
break;
|
break;
|
||||||
|
case KVM_CAP_PMU_CAPABILITY:
|
||||||
|
r = -EINVAL;
|
||||||
|
if (!enable_pmu || (cap->args[0] & ~KVM_CAP_PMU_VALID_MASK))
|
||||||
|
break;
|
||||||
|
|
||||||
|
mutex_lock(&kvm->lock);
|
||||||
|
if (!kvm->created_vcpus) {
|
||||||
|
kvm->arch.enable_pmu = !(cap->args[0] & KVM_PMU_CAP_DISABLE);
|
||||||
|
r = 0;
|
||||||
|
}
|
||||||
|
mutex_unlock(&kvm->lock);
|
||||||
|
break;
|
||||||
default:
|
default:
|
||||||
r = -EINVAL;
|
r = -EINVAL;
|
||||||
break;
|
break;
|
||||||
@@ -6473,8 +6515,10 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		break;
 	case KVM_MEMORY_ENCRYPT_OP: {
 		r = -ENOTTY;
-		if (kvm_x86_ops.mem_enc_op)
-			r = static_call(kvm_x86_mem_enc_op)(kvm, argp);
+		if (!kvm_x86_ops.mem_enc_ioctl)
+			goto out;
+
+		r = static_call(kvm_x86_mem_enc_ioctl)(kvm, argp);
 		break;
 	}
 	case KVM_MEMORY_ENCRYPT_REG_REGION: {
@@ -6485,8 +6529,10 @@ long kvm_arch_vm_ioctl(struct file *filp,
 			goto out;
 
 		r = -ENOTTY;
-		if (kvm_x86_ops.mem_enc_reg_region)
-			r = static_call(kvm_x86_mem_enc_reg_region)(kvm, &region);
+		if (!kvm_x86_ops.mem_enc_register_region)
+			goto out;
+
+		r = static_call(kvm_x86_mem_enc_register_region)(kvm, &region);
 		break;
 	}
 	case KVM_MEMORY_ENCRYPT_UNREG_REGION: {
@@ -6497,8 +6543,10 @@ long kvm_arch_vm_ioctl(struct file *filp,
 			goto out;
 
 		r = -ENOTTY;
-		if (kvm_x86_ops.mem_enc_unreg_region)
-			r = static_call(kvm_x86_mem_enc_unreg_region)(kvm, &region);
+		if (!kvm_x86_ops.mem_enc_unregister_region)
+			goto out;
+
+		r = static_call(kvm_x86_mem_enc_unregister_region)(kvm, &region);
 		break;
 	}
 	case KVM_HYPERV_EVENTFD: {
@@ -8426,8 +8474,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 		kvm_rip_write(vcpu, ctxt->eip);
 		if (r && (ctxt->tf || (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)))
 			r = kvm_vcpu_do_singlestep(vcpu);
-		if (kvm_x86_ops.update_emulated_instruction)
-			static_call(kvm_x86_update_emulated_instruction)(vcpu);
+		static_call_cond(kvm_x86_update_emulated_instruction)(vcpu);
 		__kvm_set_rflags(vcpu, ctxt->eflags);
 	}
 
@@ -8826,6 +8873,12 @@ int kvm_arch_init(void *opaque)
 		goto out;
 	}
 
+	if (IS_ENABLED(CONFIG_PREEMPT_RT) && !boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) {
+		pr_err("RT requires X86_FEATURE_CONSTANT_TSC\n");
+		r = -EOPNOTSUPP;
+		goto out;
+	}
+
 	r = -ENOMEM;
 
 	x86_emulator_cache = kvm_alloc_emulator_cache();
@@ -8985,7 +9038,7 @@ static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr,
  *
  * @apicid - apicid of vcpu to be kicked.
  */
-static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
+static void kvm_pv_kick_cpu_op(struct kvm *kvm, int apicid)
 {
 	struct kvm_lapic_irq lapic_irq;
 
@@ -9104,7 +9157,7 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 		if (!guest_pv_has(vcpu, KVM_FEATURE_PV_UNHALT))
 			break;
 
-		kvm_pv_kick_cpu_op(vcpu->kvm, a0, a1);
+		kvm_pv_kick_cpu_op(vcpu->kvm, a1);
 		kvm_sched_yield(vcpu, a1);
 		ret = 0;
 		break;
@@ -9268,10 +9321,10 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit)
 	 */
 	else if (!vcpu->arch.exception.pending) {
 		if (vcpu->arch.nmi_injected) {
-			static_call(kvm_x86_set_nmi)(vcpu);
+			static_call(kvm_x86_inject_nmi)(vcpu);
 			can_inject = false;
 		} else if (vcpu->arch.interrupt.injected) {
-			static_call(kvm_x86_set_irq)(vcpu);
+			static_call(kvm_x86_inject_irq)(vcpu);
 			can_inject = false;
 		}
 	}
@@ -9351,7 +9404,7 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit)
 		if (r) {
 			--vcpu->arch.nmi_pending;
 			vcpu->arch.nmi_injected = true;
-			static_call(kvm_x86_set_nmi)(vcpu);
+			static_call(kvm_x86_inject_nmi)(vcpu);
 			can_inject = false;
 			WARN_ON(static_call(kvm_x86_nmi_allowed)(vcpu, true) < 0);
 		}
@@ -9365,7 +9418,7 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit)
 			goto out;
 		if (r) {
 			kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu), false);
-			static_call(kvm_x86_set_irq)(vcpu);
+			static_call(kvm_x86_inject_irq)(vcpu);
 			WARN_ON(static_call(kvm_x86_interrupt_allowed)(vcpu, true) < 0);
 		}
 		if (kvm_cpu_has_injectable_intr(vcpu))
@@ -9693,8 +9746,7 @@ void __kvm_request_apicv_update(struct kvm *kvm, bool activate, ulong bit)
 
 	lockdep_assert_held_write(&kvm->arch.apicv_update_lock);
 
-	if (!kvm_x86_ops.check_apicv_inhibit_reasons ||
-	    !static_call(kvm_x86_check_apicv_inhibit_reasons)(bit))
+	if (!static_call(kvm_x86_check_apicv_inhibit_reasons)(bit))
 		return;
 
 	old = new = kvm->arch.apicv_inhibit_reasons;
@@ -9727,10 +9779,12 @@ void __kvm_request_apicv_update(struct kvm *kvm, bool activate, ulong bit)
 	} else
 		kvm->arch.apicv_inhibit_reasons = new;
 }
-EXPORT_SYMBOL_GPL(__kvm_request_apicv_update);
 
 void kvm_request_apicv_update(struct kvm *kvm, bool activate, ulong bit)
 {
+	if (!enable_apicv)
+		return;
+
 	down_write(&kvm->arch.apicv_update_lock);
 	__kvm_request_apicv_update(kvm, activate, bit);
 	up_write(&kvm->arch.apicv_update_lock);
@@ -9769,11 +9823,11 @@ static void vcpu_load_eoi_exitmap(struct kvm_vcpu *vcpu)
 		bitmap_or((ulong *)eoi_exit_bitmap,
 			  vcpu->arch.ioapic_handled_vectors,
 			  to_hv_synic(vcpu)->vec_bitmap, 256);
-		static_call(kvm_x86_load_eoi_exitmap)(vcpu, eoi_exit_bitmap);
+		static_call_cond(kvm_x86_load_eoi_exitmap)(vcpu, eoi_exit_bitmap);
 		return;
 	}
 
-	static_call(kvm_x86_load_eoi_exitmap)(
+	static_call_cond(kvm_x86_load_eoi_exitmap)(
 		vcpu, (u64 *)vcpu->arch.ioapic_handled_vectors);
 }
 
@@ -9796,10 +9850,7 @@ static void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
 	if (!lapic_in_kernel(vcpu))
 		return;
 
-	if (!kvm_x86_ops.set_apic_access_page_addr)
-		return;
-
-	static_call(kvm_x86_set_apic_access_page_addr)(vcpu);
+	static_call_cond(kvm_x86_set_apic_access_page_addr)(vcpu);
 }
 
 void __kvm_request_immediate_exit(struct kvm_vcpu *vcpu)
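Several hunks in this file replace the open-coded "if (kvm_x86_ops.hook) static_call(kvm_x86_hook)(...)" pattern with static_call_cond(), which turns a call through a NULL hook into a no-op; in the kernel it also patches the call site itself, so a missing hook costs no branch at run time, and it is only usable where the return value is ignored. The sketch below is a plain-C userspace analogue of that calling convention, not the kernel's static-call machinery; the ops structure and the CALL_COND macro are made up purely for illustration.

#include <stdio.h>
#include <stddef.h>

/* Hypothetical ops table standing in for struct kvm_x86_ops. */
struct ops {
	void (*set_apic_access_page_addr)(int vcpu);	/* optional hook */
};

/* Analogue of static_call_cond(): calling a missing hook is a no-op. */
#define CALL_COND(ops, hook, ...)				\
	do {							\
		if ((ops)->hook)				\
			(ops)->hook(__VA_ARGS__);		\
	} while (0)

static struct ops vendor_ops = {
	.set_apic_access_page_addr = NULL,	/* some vendors leave this unset */
};

int main(void)
{
	/* Old style: every caller open-codes the NULL check. */
	if (vendor_ops.set_apic_access_page_addr)
		vendor_ops.set_apic_access_page_addr(0);

	/* New style: the NULL handling lives in one place. */
	CALL_COND(&vendor_ops, set_apic_access_page_addr, 0);

	puts("missing hooks were skipped");
	return 0;
}

The design win is the same in either form: the optional-hook handling is centralized instead of being repeated at every call site.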
@@ -9844,8 +9895,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 				goto out;
 			}
 		}
-		if (kvm_check_request(KVM_REQ_MMU_RELOAD, vcpu))
-			kvm_mmu_unload(vcpu);
+		if (kvm_check_request(KVM_REQ_MMU_FREE_OBSOLETE_ROOTS, vcpu))
+			kvm_mmu_free_obsolete_roots(vcpu);
 		if (kvm_check_request(KVM_REQ_MIGRATE_TIMER, vcpu))
 			__kvm_migrate_timers(vcpu);
 		if (kvm_check_request(KVM_REQ_MASTERCLOCK_UPDATE, vcpu))
@@ -9990,7 +10041,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 
 	preempt_disable();
 
-	static_call(kvm_x86_prepare_guest_switch)(vcpu);
+	static_call(kvm_x86_prepare_switch_to_guest)(vcpu);
 
 	/*
 	 * Disable IRQs before setting IN_GUEST_MODE. Posted interrupt
@@ -10071,7 +10122,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		 */
 		WARN_ON_ONCE(kvm_apicv_activated(vcpu->kvm) != kvm_vcpu_apicv_active(vcpu));
 
-		exit_fastpath = static_call(kvm_x86_run)(vcpu);
+		exit_fastpath = static_call(kvm_x86_vcpu_run)(vcpu);
 		if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
 			break;
 
@@ -10373,10 +10424,7 @@ static int complete_emulated_mmio(struct kvm_vcpu *vcpu)
 /* Swap (qemu) user FPU context for the guest FPU context. */
 static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
 {
-	/*
-	 * Exclude PKRU from restore as restored separately in
-	 * kvm_x86_ops.run().
-	 */
+	/* Exclude PKRU, it's restored separately immediately after VM-Exit. */
 	fpu_swap_kvm_fpstate(&vcpu->arch.guest_fpu, true);
 	trace_kvm_fpu(1);
 }
@@ -10566,16 +10614,6 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 	return 0;
 }
 
-void kvm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l)
-{
-	struct kvm_segment cs;
-
-	kvm_get_segment(vcpu, &cs, VCPU_SREG_CS);
-	*db = cs.db;
-	*l = cs.l;
-}
-EXPORT_SYMBOL_GPL(kvm_get_cs_db_l_bits);
-
 static void __get_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
 {
 	struct desc_ptr dt;
@@ -11348,15 +11386,17 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	static_call(kvm_x86_update_exception_bitmap)(vcpu);
 
 	/*
-	 * Reset the MMU context if paging was enabled prior to INIT (which is
-	 * implied if CR0.PG=1 as CR0 will be '0' prior to RESET). Unlike the
-	 * standard CR0/CR4/EFER modification paths, only CR0.PG needs to be
-	 * checked because it is unconditionally cleared on INIT and all other
-	 * paging related bits are ignored if paging is disabled, i.e. CR0.WP,
-	 * CR4, and EFER changes are all irrelevant if CR0.PG was '0'.
+	 * On the standard CR0/CR4/EFER modification paths, there are several
+	 * complex conditions determining whether the MMU has to be reset and/or
+	 * which PCIDs have to be flushed. However, CR0.WP and the paging-related
+	 * bits in CR4 and EFER are irrelevant if CR0.PG was '0'; and a reset+flush
+	 * is needed anyway if CR0.PG was '1' (which can only happen for INIT, as
+	 * CR0 will be '0' prior to RESET). So we only need to check CR0.PG here.
 	 */
-	if (old_cr0 & X86_CR0_PG)
+	if (old_cr0 & X86_CR0_PG) {
+		kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu);
 		kvm_mmu_reset_context(vcpu);
+	}
 
 	/*
 	 * Intel's SDM states that all TLB entries are flushed on INIT. AMD's
@@ -11614,6 +11654,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
 
 	kvm->arch.guest_can_read_msr_platform_info = true;
+	kvm->arch.enable_pmu = enable_pmu;
 
 #if IS_ENABLED(CONFIG_HYPERV)
 	spin_lock_init(&kvm->arch.hv_root_tdp_lock);
@@ -11811,7 +11852,7 @@ int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages)
 		if (slot->arch.rmap[i])
 			continue;
 
-		slot->arch.rmap[i] = kvcalloc(lpages, sz, GFP_KERNEL_ACCOUNT);
+		slot->arch.rmap[i] = __vcalloc(lpages, sz, GFP_KERNEL_ACCOUNT);
 		if (!slot->arch.rmap[i]) {
 			memslot_rmap_free(slot);
 			return -ENOMEM;
@@ -11848,7 +11889,7 @@ static int kvm_alloc_memslot_metadata(struct kvm *kvm,
 
 		lpages = __kvm_mmu_slot_lpages(slot, npages, level);
 
-		linfo = kvcalloc(lpages, sizeof(*linfo), GFP_KERNEL_ACCOUNT);
+		linfo = __vcalloc(lpages, sizeof(*linfo), GFP_KERNEL_ACCOUNT);
 		if (!linfo)
 			goto out_free;
 
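Both allocations above move from kvcalloc() to __vcalloc(), the helper this merge introduces for very large, memcg-accounted arrays: it keeps the calloc-style contract (zeroed memory, rejection of n * size overflow, GFP flags such as GFP_KERNEL_ACCOUNT) but is backed by vmalloc rather than trying kmalloc first. __vcalloc() itself is kernel-internal; the snippet below is only a rough userspace analogue of that contract, with vcalloc_like() as a made-up name, and it does not model GFP flags or memcg accounting.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Zeroed array allocation that rejects n * size overflow, like the
 * kernel helpers these call sites rely on. */
static void *vcalloc_like(size_t n, size_t size)
{
	void *p;

	if (size && n > SIZE_MAX / size)	/* n * size would overflow */
		return NULL;

	p = malloc(n * size);
	if (p)
		memset(p, 0, n * size);		/* zeroed, like __GFP_ZERO */
	return p;
}

int main(void)
{
	unsigned long long *rmap = vcalloc_like(1 << 20, sizeof(*rmap));

	if (!rmap)
		return 1;
	printf("first entry starts zeroed: %llu\n", rmap[0]);
	free(rmap);
	return 0;
}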
@@ -11998,6 +12039,9 @@ static void kvm_mmu_slot_apply_flags(struct kvm *kvm,
 		if (kvm_dirty_log_manual_protect_and_init_set(kvm))
 			return;
 
+		if (READ_ONCE(eager_page_split))
+			kvm_mmu_slot_try_split_huge_pages(kvm, new, PG_LEVEL_4K);
+
 		if (kvm_x86_ops.cpu_dirty_log_size) {
 			kvm_mmu_slot_leaf_clear_dirty(kvm, new);
 			kvm_mmu_slot_remove_write_access(kvm, new, PG_LEVEL_2M);
@@ -12042,8 +12086,7 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
 static inline bool kvm_guest_apic_has_interrupt(struct kvm_vcpu *vcpu)
 {
 	return (is_guest_mode(vcpu) &&
-		kvm_x86_ops.guest_apic_has_interrupt &&
-		static_call(kvm_x86_guest_apic_has_interrupt)(vcpu));
+		static_call(kvm_x86_guest_apic_has_interrupt)(vcpu));
 }
 
 static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu)
@@ -12296,14 +12339,28 @@ static inline bool apf_pageready_slot_free(struct kvm_vcpu *vcpu)
 
 static bool kvm_can_deliver_async_pf(struct kvm_vcpu *vcpu)
 {
-	if (!vcpu->arch.apf.delivery_as_pf_vmexit && is_guest_mode(vcpu))
+	if (!kvm_pv_async_pf_enabled(vcpu))
 		return false;
 
-	if (!kvm_pv_async_pf_enabled(vcpu) ||
-	    (vcpu->arch.apf.send_user_only && static_call(kvm_x86_get_cpl)(vcpu) == 0))
+	if (vcpu->arch.apf.send_user_only &&
+	    static_call(kvm_x86_get_cpl)(vcpu) == 0)
 		return false;
 
-	return true;
+	if (is_guest_mode(vcpu)) {
+		/*
+		 * L1 needs to opt into the special #PF vmexits that are
+		 * used to deliver async page faults.
+		 */
+		return vcpu->arch.apf.delivery_as_pf_vmexit;
+	} else {
+		/*
+		 * Play it safe in case the guest temporarily disables paging.
+		 * The real mode IDT in particular is unlikely to have a #PF
+		 * exception setup.
+		 */
+		return is_paging(vcpu);
+	}
 }
 
 bool kvm_can_do_async_pf(struct kvm_vcpu *vcpu)
@@ -12398,7 +12455,7 @@ bool kvm_arch_can_dequeue_async_page_present(struct kvm_vcpu *vcpu)
 void kvm_arch_start_assignment(struct kvm *kvm)
 {
 	if (atomic_inc_return(&kvm->arch.assigned_device_count) == 1)
-		static_call_cond(kvm_x86_start_assignment)(kvm);
+		static_call_cond(kvm_x86_pi_start_assignment)(kvm);
 }
 EXPORT_SYMBOL_GPL(kvm_arch_start_assignment);
 
@@ -12446,7 +12503,7 @@ int kvm_arch_irq_bypass_add_producer(struct irq_bypass_consumer *cons,
 
 	irqfd->producer = prod;
 	kvm_arch_start_assignment(irqfd->kvm);
-	ret = static_call(kvm_x86_update_pi_irte)(irqfd->kvm,
+	ret = static_call(kvm_x86_pi_update_irte)(irqfd->kvm,
 					   prod->irq, irqfd->gsi, 1);
 
 	if (ret)
@@ -12471,7 +12528,7 @@ void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons,
 	 * when the irq is masked/disabled or the consumer side (KVM
 	 * int this case doesn't want to receive the interrupts.
	 */
-	ret = static_call(kvm_x86_update_pi_irte)(irqfd->kvm, prod->irq, irqfd->gsi, 0);
+	ret = static_call(kvm_x86_pi_update_irte)(irqfd->kvm, prod->irq, irqfd->gsi, 0);
 	if (ret)
 		printk(KERN_INFO "irq bypass consumer (token %p) unregistration"
 		       " fails: %d\n", irqfd->consumer.token, ret);
@@ -12482,7 +12539,7 @@ void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons,
 int kvm_arch_update_irqfd_routing(struct kvm *kvm, unsigned int host_irq,
 				  uint32_t guest_irq, bool set)
 {
-	return static_call(kvm_x86_update_pi_irte)(kvm, host_irq, guest_irq, set);
+	return static_call(kvm_x86_pi_update_irte)(kvm, host_irq, guest_irq, set);
 }
 
 bool kvm_arch_irqfd_route_changed(struct kvm_kernel_irq_routing_entry *old,
@@ -302,6 +302,8 @@ extern int pi_inject_timer;
 
 extern bool report_ignored_msrs;
 
+extern bool eager_page_split;
+
 static inline u64 nsec_to_cycles(struct kvm_vcpu *vcpu, u64 nsec)
 {
 	return pvclock_scale_delta(nsec, vcpu->arch.virtual_tsc_mult,
@@ -732,7 +732,7 @@ int kvm_xen_write_hypercall_page(struct kvm_vcpu *vcpu, u64 data)
 	instructions[0] = 0xb8;
 
 	/* vmcall / vmmcall */
-	kvm_x86_ops.patch_hypercall(vcpu, instructions + 5);
+	static_call(kvm_x86_patch_hypercall)(vcpu, instructions + 5);
 
 	/* ret */
 	instructions[8] = 0xc3;
@@ -867,7 +867,7 @@ int kvm_xen_hypercall(struct kvm_vcpu *vcpu)
 	vcpu->run->exit_reason = KVM_EXIT_XEN;
 	vcpu->run->xen.type = KVM_EXIT_XEN_HCALL;
 	vcpu->run->xen.u.hcall.longmode = longmode;
-	vcpu->run->xen.u.hcall.cpl = kvm_x86_ops.get_cpl(vcpu);
+	vcpu->run->xen.u.hcall.cpl = static_call(kvm_x86_get_cpl)(vcpu);
 	vcpu->run->xen.u.hcall.input = input;
 	vcpu->run->xen.u.hcall.params[0] = params[0];
 	vcpu->run->xen.u.hcall.params[1] = params[1];