The 'absolute' clock offset type has a 'start' attribute which is an
unix epoch timestamp to which the hardware clock is always set at start
of the VM.
This is useful if some VM needs to be kept set to an arbitrary time for
e.g. testing or working around broken software.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The qemu-vdagent channel is introduced since:
"05b09f039e conf: add qemu-vdagent channel"
It will be in the version 8.4.0.
Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Add the ability to configure a qemu-vdagent in guest domains. This
device is similar to the spice vdagent channel except that qemu handles
the spice-vdagent protocol messages itself rather than routing them over
a spice protocol channel.
The qemu-vdagent device has two notable configuration options which
determine whether qemu will handle particular vdagent features:
'clipboard' and 'mouse'.
The 'clipboard' option allows qemu to synchronize its internal clipboard
manager with the guest clipboard, which enables client<->guest clipboard
synchronization for non-spice guests such as vnc.
The 'mouse' option allows absolute mouse positioning to be sent over the
vdagent channel rather than using a usb or virtio tablet device.
Sample configuration:
<channel type='qemu-vdagent'>
<target type='virtio' name='com.redhat.spice.0'/>
<source>
<clipboard copypaste='yes'/>
<mouse mode='client'/>
</source>
</channel>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Most of the anchors that were forward ported to formatdomain.rst when it
was converted are not actually referenced by our documentation. Since
it's now quite some time after the conversion was done we can remove
them.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Additionally hyperlinks in other parts of the documentation are updated
to match.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Also adjust direct links from other pages.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Added "rss" and "rss_hash_report" configuration that should be
used with qemu virtio RSS. Both options are triswitches. Used as
"driver" options and affects only NIC with model type "virtio".
In other patches - options should turn on virtio-net RSS and hash
properties.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Fix the referenced anchor in 'formatdomain.rst' right away.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
In 0895a0e, it was noted that the "sockets" value in the topology
section of capabilities reflects not the number of sockets per NUMA
node, not the total number.
Unfortunately, the fix was applied to the wrong place: the domain XML
format documentation, not that for the capabilities output. And, in
fact, the domain XML interprets "sockets" as the total number, not a
per-node value.
Back out this change in favour of a note in the capabilities
documentation instead.
Fixes: 0895a0e75d
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
This reverts commit 150540394d.
Turns out, this feature is not needed and QEMU will fix TSC
without any intervention from outside.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>P
Some versions of Windows hang on reboot if their TSC value is greater
than 2^54. The workaround is to reset the TSC to a small value. Add
to the domain configuration an attribute for this. It can be used
by QEMU and in principle also by ESXi, which has a property called
monitor_control.enable_softResetClearTSC as well.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Since its v5.0.0 release QEMU is capable of specifying number of
threads used to allocate memory. It defaults to 1, which may be
too low for humongous guests with gigantic pages.
In general, on QEMU cmd line level it is possible to use
different number of threads per each memory-backend-* object, in
practical terms it's not useful. Therefore, use <memoryBacking/>
to set guest wide value and let all memory devices 'inherit' it,
silently. IOW, don't introduce per device knob because that would
only complicate things for a little or no benefit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The idea of the manual mode is to allow a synchronized snapshot in cases
when the storage is outsourced to an unmanaged storage provider which
requires cooperation with snapshotting.
The mode will instruct the hypervisor to pause along when the other
components are snapshotted and the 'manual' disk can be snapshotted
along. This increases latency of the snapshot but allows them in
otherwise impossible situations.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In cases when the hostname of the NBD server doesn't match the hostname
in the TLS certificate the new attribute 'tlsHostname' can be used to
override it.
Add the XML infrastructure and tests.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There are dupliacated and non-continuous CPU IDs used in HMAT
example. Fix that.
Signed-off-by: Jing Qi <jinqi@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Most people will want to use isa-debugcon to obtain debug output
for SeaBIOS / EDK II, so let's include a ready-made example for
that scenario in our documentation.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Introduce support for
<serial type='pty'>
<target type='isa-debug'>
<model type='isa-debugcon'/>
</target>
<address type='isa' iobase='0x402'/>
</console>
which is used as a way to receive debug messages from the
firmware on x86 platforms.
Note that the default port is hypervisor specific, with QEMU
currently using 0xe9 since that's the original Bochs debug port.
For use with SeaBIOS/OVMF, the iobase port needs to be explicitly
set to 0x402.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Many domain elements have "QEMU and KVM only" or "QEMU/KVM since x.y.z"
remarks. Most of the elements work for HVF domain, so it makes sense to
add respective notices for HVF domain.
All the elements have been manually tested.
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Tested-by: Brad Laue <brad@brad-x.com>
Tested-by: Christophe Fergeau <cfergeau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
QEMU supports Hypervisor.framework since 2.12 as hvf accel.
Hypervisor.framework provides a lightweight interface to run a virtual
cpu on macOS without the need to install third-party kernel
extensions (KEXTs).
It's supported since macOS 10.10 on machines with Intel VT-x feature
set that includes Extended Page Tables (EPT) and Unrestricted Mode.
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Tested-by: Brad Laue <brad@brad-x.com>
Tested-by: Christophe Fergeau <cfergeau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The docs illustration for the <os> schema contains a mixture of
incompatible configuration options. This is rather confusing and
misleading to users. Splitting the illustration into four separate
examples clarifies the situation.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Add a sentence to the active_pcr_banks node documentation that clarifies
that when the active_pcr_banks node is removed from the XML or when it
is omitted that the set of active PCR banks is not changed anymore.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2039246
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The whole idea of VIR_DOMAIN_NUMATUNE_MEM_RESTRICTIVE is that the
memory location is restricted only via CGroups and thus can be
changed on the fly (which is exactly what
qemuDomainSetNumaParamsLive() does. Allow this mode there then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Normally the SEV measurement only covers the firmware
loader contents. When doing a direct kernel boot, however,
with new enough OVMF it is possible to ask for the
measurement to cover the kernel, ramdisk and command line.
It can't be done automatically as that would break existing
guests using direct kernel boot with old firmware, so there
is a new XML setting allowing this behaviour to be toggled.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Dirty ring feature was introduced in qemu-6.1.0, this patch
add the corresponding feature named 'dirty-ring', which enable
dirty ring feature when starting VM.
To enable the feature, the following XML needs to be added to
the guest's domain description:
<features>
<kvm>
<dirty-ring state='on' size='xxx'>
</kvm>
</features>
If property "state=on", property "size" must be specified, which
should be power of 2 and range in [1024, 65526].
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
After previous commit it's possible for domains to fine tune TCG
features (well, just one - tb-cache). Check that domain has TCG
enabled, otherwise the feature makes no sense.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
It may come handy to be able to tweak TCG options, in this
specific case the size of translation block cache size (tb-size).
Since we can expect more knobs to tweak let's put them under
common element, like this:
<domain>
<features>
<tcg>
<tb-cache unit='MiB'>128</tb-cache>
</tcg>
</features>
</domain>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Extend the TPM backend XML with a node 'active_pcr_banks' that allows a
user to specify the PCR banks to activate before starting a VM. Valid
choices for PCR banks are sha1, sha256, sha384 and sha512. When the XML
node is provided, the set of active PCR banks is 'enforced' by running
swtpm_setup before every start of the VM. The activation requires that
swtpm_setup v0.7 or later is installed and may not have any effect
otherwise.
<tpm model='tpm-tis'>
<backend type='emulator' version='2.0'>
<active_pcr_banks>
<sha256/>
<sha384/>
</active_pcr_banks>
</backend>
</tpm>
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2016599
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
QEMU version 3.1 introduced PV_SEND_IPI CPUID feature bit under
commit 7f710c32bb8 (target-i386: adds PV_SEND_IPI CPUID feature bit).
This patch adds a new KVM feature 'pv-ipi' to disable this feature
(enabled by default). Newer CPU platform (Ex, AMD Zen2) supports
hardware accelation for IPI in guest, to use this feature to get
better performance in some scenarios. Detailed about the discussion:
https://lkml.org/lkml/2021/10/20/423
To disable kvm-pv-ipi and have libvirt add "-cpu host,kvm-pv-ipi=off"
to the QEMU command line, the following XML code needs to be added to the
guest's domain description:
<features>
<kvm>
<pv-ipi state='off'/>
</kvm>
</features>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This reverts commit 7300ccc9b3.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ani Sinha <ani@anisinha.ca>
This change introduces a new libvirt sub-element <pci> under
<features> that can be used to configure all pci related features.
Currently the only sub-sub element supported by this sub-element is
'acpi-bridge-hotplug' as shown below:
<features>
<pci>
<acpi-bridge-hotplug state='on|off'/>
</pci>
</features>
The above option is only available for the QEMU driver, for x86 guests
only. It is a global option, affecting all PCI bridge controllers on
the guest.
The 'acpi-bridge-hotplug' option enables or disables ACPI hotplug
support for cold-plugged pci bridges. Examples of bridges include the
PCI-PCI bridge (pci-bridge controller) for pc (i440fx) machinetypes,
or PCIe-PCI bridges and pcie-root-port controllers for q35
machinetypes.
For pc machinetypes in x86, this option has been available in QEMU
since version 2.1. Please see the following changes in qemu repo:
9e047b982452c6 ("piix4: add acpi pci hotplug support")
133a2da488062e ("pc: acpi: generate AML only for PCI0 devices if PCI
bridge hotplug is disabled")
For q35 machinetypes, this was introduced in QEMU 6.1 with the
following changes in qemu repo:
(a) c0e427d6eb5fef ("hw/acpi/ich9: Enable ACPI PCI hot-plug")
(b) 17858a16950860 ("hw/acpi/ich9: Set ACPI PCI hot-plug as default on
Q35")
The reasons for enabling ACPI based hotplug for PCIe (q35) based
machines (as opposed to native hotplug) are outlined in (b). There are
use cases where users would still want to use native
hotplug. Therefore, this config option enables users to choose either
ACPI based hotplug or native hotplug for bridges (for example for pcie
root port controller in q35 machines).
Qemu capability validation checks have also been added along with
related unit tests to exercise the new conf option.
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Reviewed-by: Laine Stump <laine@redhat.com>
This change introduces libvirt xml support to enable/disable hotplug on the
pci-root controller. It adds a 'target' subelement for the pci-root controller
with a 'hotplug' property. This property can be used to enable or disable
hotplug for the pci-root controller. For example, in order to disable hotplug
on the pci-root controller, one has to use set '<target hotplug='off'>' as
shown below:
<controller type='pci' model='pci-root'>
<target hotplug='off'/>
</controller>
'<target hotplug='on'>' option would enable hotplug for pci-root controller.
This is also the default value. This option is only available for pc machine
types and is applicable for qemu/kvm accelerator only.This feature was
introduced from qemu version 5.2 with the following change in qemu repository:
3d7e78aa7777f ("Introduce a new flag for i440fx to disable PCI hotplug on the root bus")
The above qemu commit describes some reasons why users might to disable hotplug
on PCI root buses.
Related unit tests to exercise the new conf option has also been added.
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
The virtio-mem has another property that isn't exposed yet:
current size exposed to the guest. Please note, that this is
different to <requested/> because esp. on sizing the memory
down guest may refuse to release some blocks. Therefore, let's
have another size to report in the XML. But because of its
nature, the <current/> won't be parsed and is report only (for
live XMLs).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The virtio-mem is paravirtualized mechanism of adding/removing
memory to/from a VM. A virtio-mem-pci device is split into blocks
of equal size which are then exposed (all or only a requested
portion of them) to the guest kernel to use as regular memory.
Therefore, the device has two important attributes:
1) block-size, which defines the size of a block
2) requested-size, which defines how much memory (in bytes)
is the device requested to expose to the guest.
The 'block-size' is configured on command line and immutable
throughout device's lifetime. The 'requested-size' can be set on
the command line too, but also is adjustable via monitor. In
fact, that is how management software places its requests to
change the memory allocation. If it wants to give more memory to
the guest it changes 'requested-size' to a bigger value, and if it
wants to shrink guest memory it changes the 'requested-size' to a
smaller value. Note, value of zero means that guest should
release all memory offered by the device. Of course, guest has to
cooperate. Therefore, there is a third attribute 'size' which is
read only and reflects how much memory the guest still has. This
can be different to 'requested-size', obviously. Because of name
clash, I've named it 'current' and it is dealt with in future
commits (it is a runtime information anyway).
In the backend, memory for virtio-mem is backed by usual objects:
memory-backend-{ram,file,memfd} and their size puts the cap on
the amount of memory that a virtio-mem device can offer to a
guest. But we are already able to express this info using <size/>
under <target/>.
Therefore, we need only two more elements to cover 'block-size'
and 'requested-size' attributes. This is the XML I've came up
with:
<memory model='virtio-mem'>
<source>
<nodemask>1-3</nodemask>
<pagesize unit='KiB'>2048</pagesize>
</source>
<target>
<size unit='KiB'>2097152</size>
<node>0</node>
<block unit='KiB'>2048</block>
<requested unit='KiB'>1048576</requested>
</target>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</memory>
I hope by now it is obvious that:
1) 'requested-size' must be an integer multiple of
'block-size', and
2) virtio-mem-pci device goes onto PCI bus and thus needs PCI
address.
Then there is a limitation that the minimal 'block-size' is
transparent huge page size (I'll leave this without explanation).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The option "queue-size" for virtio-blk was added in qemu-2.12.0, and
default value increased from qemu-5.0.0.
However, increasing this value may lead to drop of random access
performance.
Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In the example for <memory model='dimm'/> we show how to
configure hugepages as backend. In the example we show 4MiB
hugepages which are non-standard and thus at the first glance may
mislead users thinking that a regular sized pages (4K) will be
used. Use 2MiB as the value instead.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The virtiofs project started off using "virtio-fs" but later switched to
the "virtiofs" spelling because it matches the spelling of the mount -t
virtiofs command-line. Update the kbase article with the new spelling so
it matches the virtiofs website.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We simply terminate qemu instead of issuing a reset as the semantics of
the setting dictate.
Fix it by handling it identically to 'fake reboot'.
We need to forbid the combination of 'onReboot' -> 'destroy' and
'onPoweroff' -> reboot though as the handling would be hairy and it
honetly makes no sense.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The qemu driver didn't ever implement any meaningful handling for the
'preserve' action.
Forbid the flag in the qemu def validator and update the documentation
to be factual.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The qemu driver didn't ever implement any meaningful handling for the
'rename-restart' action.
At this point the following handling would take place:
'on_reboot' set to 'rename-restart' is ignored on guest-initiated
reboots, the guest simply reboots.
For on_poweroff set to 'rename-restart' the following happens:
guest initiated shutdown -> 'destroy'
libvirt initiated shutdown -> 'reboot'
In addition when 'on_reboot' is 'destroy' in addition to 'on_poweroff'
being 'rename-restart' the guest is able to execute instructions after
issuing a reset before libvirt terminates it. This will be addressed
separately later.
Forbid the flag in the qemu def validator and update the documentation
to be factual.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>