Add more elements for tuning the virtiofsd daemon
and the vhost-user-fs device:
<driver type='virtiofs' queue='1024' xattr='on'>
<binary path='/usr/libexec/virtiofsd'>
<cache mode='always'/>
<lock posix='off' flock='off'/>
</binary>
</driver>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
Introduce a new 'virtiofs' driver type for filesystem.
<filesystem type='mount' accessmode='passthrough'>
<driver type='virtiofs'/>
<source dir='/path'/>
<target dir='mount_tag'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</filesystem>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
Allow adding new groups without changing indentation.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
This is a very simple thing to parse and format, but needs to be done
in 4 places, so two trivial utility functions have been made that can
be called from all the higher level parser/formatters:
<domain><interface>
<domain><interface><actual> (only in domain status)
<network>
<networkport>
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This is in the data structure and the parse/format functions, and is
getting passed all around correctly, it just was omitted from the RNG,
which hasn't been noticed because no human is creating <networkport>
XML, and so it's never getting validated against the schema.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We are going to add support for specifying offset and size attributes
which will allow controling where the image and where the guest data
itself starts in the source of the disk. This will be represented by
a <slices> element filled with either a <slice type='storage'> for the
offset of the image format data.
Add the XML documentation and RNG schema.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This new timer model will be used to control the behavior of the
virtual timer for KVM ARM/virt guests.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This patch adds support for the tpm-spapr device model for ppc64. The XML for
this type of TPM looks as follows:
<tpm model='tpm-spapr'>
<backend type='emulator'/>
</tpm>
Extend the documentation.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
The subelement <teaming> of <interface> devices is used to configure a
simple teaming association between two interfaces in a domain. Example:
<interface type='bridge'>
<source bridge='br0'/>
<model type='virtio'/>
<mac address='00:11:22:33:44:55'/>
<alias name='ua-backup0'/>
<teaming type='persistent'/>
</interface>
<interface type='hostdev'>
<source>
<address type='pci' bus='0x02' slot='0x10' function='0x4'/>
</source>
<mac address='00:11:22:33:44:55'/>
<teaming type='transient' persistent='ua-backup0'/>
</interface>
The interface with <teaming type='persistent'/> is assumed to always
be present, while the interface with type='transient' may be be
unplugged and later re-plugged; the persistent='blah' attribute (and
in the one currently available implementation, also the matching MAC
addresses) is what associates the two devices with each other. It is
up to the hypervisor and the guest network drivers to determine what
to do with this information.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The 'builtin' rng backend model can be used as following:
<rng model='virtio'>
<backend model='builtin'/>
</rng>
Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
If users wish to use different name for exported disks or bitmaps
the new fields allow to do so. Additionally they also document the
current settings.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Update the host CPU code to report the die_id in the NUMA topology
capabilities. On systems with multiple dies, this fixes the bug
where CPU cores can't be distinguished:
<cpus num='12'>
<cpu id='0' socket_id='0' core_id='0' siblings='0'/>
<cpu id='1' socket_id='0' core_id='1' siblings='1'/>
<cpu id='2' socket_id='0' core_id='0' siblings='2'/>
<cpu id='3' socket_id='0' core_id='1' siblings='3'/>
</cpus>
Notice how core_id is repeated within the scope of the same socket_id.
It now reports
<cpus num='12'>
<cpu id='0' socket_id='0' die_id='0' core_id='0' siblings='0'/>
<cpu id='1' socket_id='0' die_id='0' core_id='1' siblings='1'/>
<cpu id='2' socket_id='0' die_id='1' core_id='0' siblings='2'/>
<cpu id='3' socket_id='0' die_id='1' core_id='1' siblings='3'/>
</cpus>
So core_id is now unique within a (socket_id, die_id) pair.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Recently CPU hardware vendors have started to support a new structure
inside the CPU package topology known as a "die". Thus the hierarchy
is now:
sockets > dies > cores > threads
This adds support for "dies" in the XML parser, with the value
defaulting to 1 if not specified for backwards compatibility.
For example a system with 64 logical CPUs might report
<topology sockets="4" dies="2" cores="4" threads="2"/>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
There is no need to require users to produce iSCSI disk source
following our ordering of children elements. In fact, we don't
even accept our own order in the schema :(.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
One of the first versions thought of using disk path as the second
option but this was dropped as being a legacy interface. Remove the
leftover pointless <choice> wrapper for the disk name as there's just
one option now.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
While command line arguments are sort of positional (because you
have to have two entries, one for "-arg" the other for "value"),
it doesn't really matter whether env variables come before or
after command line arguments.
And it matters even less when playing with qemu capabilities.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
The phyp driver was added in 2009 and does not appear to have had any
real feature change since 2011. There's virtually no evidence online
of users actually using it. IMO it's time to kill it.
This was discussed a bit in April 2016:
https://www.redhat.com/archives/libvir-list/2016-April/msg01060.html
Final discussion is here:
https://www.redhat.com/archives/libvir-list/2019-December/msg01162.html
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
It will be used to represent the type of a filesystem pool in ESXi.
Signed-off-by: Pino Toscano <ptoscano@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
This patch introduces a new PCI hostdev address type called
'unassigned'. This new type gives users the option to add
PCI hostdevs to the domain XML in an 'unassigned' state, meaning
that the device exists in the domain, is managed by Libvirt
like any regular PCI hostdev, but the guest does not have
access to it.
This adds extra options for managing PCI device binding
inside Libvirt, for example, making all the managed PCI hostdevs
declared in the domain XML to be detached from the host and bind
to the chosen driver and, at the same time, allowing just a
subset of these devices to be usable by the guest.
Next patch will use this new address type in the QEMU driver to
avoid adding unassigned devices to the QEMU launch command line.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
There is this class of PCI devices that act like disks: NVMe.
Therefore, they are both PCI devices and disks. While we already
have <hostdev/> (and can assign a NVMe device to a domain
successfully) we don't have disk representation. There are three
problems with PCI assignment in case of a NVMe device:
1) domains with <hostdev/> can't be migrated
2) NVMe device is assigned whole, there's no way to assign only a
namespace
3) Because hypervisors see <hostdev/> they don't put block layer
on top of it - users don't get all the fancy features like
snapshots
NVMe namespaces are way of splitting one continuous NVDIMM memory
into smaller ones, effectively creating smaller NVMe-s (which can
then be partitioned, LVMed, etc.)
Because of all of this the following XML was chosen to model a
NVMe device:
<disk type='nvme' device='disk'>
<driver name='qemu' type='raw'/>
<source type='pci' managed='yes' namespace='1'>
<address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</source>
<target dev='vda' bus='virtio'/>
</disk>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Following domain configuration changes create two memory bandwidth
monitors: one is monitoring the bandwidth consumed by vCPU 0,
another is for vCPU 5.
```
<cputune>
<memorytune vcpus='0-4'>
<node id='0' bandwidth='20'/>
<node id='1' bandwidth='30'/>
+ <monitor vcpus='0'/>
</memorytune>
+ <memorytune vcpus='5'>
+ <monitor vcpus='5'/>
+ </memorytune>
</cputune>
```
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Huaqiang <huaqiang.wang@intel.com>
Originally, inside <cputune/cachetune>, it requires the <cache> element to
be in the position before <monitor>, and following configuration is not
permitted by schema, but it is better to let it be valid.
<cputune>
<cachetune vcpus='0-1'>
<monitor level='3' vcpus='0-1'/>
^
|__ Not permitted originally because it is in the place
before <cache> element.
<cache id='0' level='3' type='both' size='3' unit='MiB'/>
<cache id='1' level='3' type='both' size='3' unit='MiB'/>
</cachetune>
...
</cputune>
And, let schema do more strict check by identifying following configuration to
be invalid, due to <cachetune> should contain at least one <cache> or <monitor>
element.
<cputune>
<cachetune vcpus='0-1'>
^
|__ a <cachetune> SHOULD contain at least one <cache> or <monitor>
</cachetune>
...
</cputune>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Huaqiang <huaqiang.wang@intel.com>
This flag will allow figuring out whether the hypervisor supports the
incremental backup and checkpoint features.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Prepare for new backup APIs by describing the XML that will represent
a backup. The XML resembles snapshots and checkpoints in being able
to select actions for a set of disks, but has other differences. It
can support both push model (the hypervisor does the backup directly
into the destination file) and pull model (the hypervisor exposes an
access port for a third party to grab what is necessary). Add
testsuite coverage for some minimal uses of the XML.
The <disk> element within <domainbackup> tries to model the same
elements as a <disk> under <domain>, but sharing the RNG grammar
proved to be hairy. That is in part because while <domain> use
<source> to describe a host resource in use by the guest, a backup job
is using a host resource that is not visible to the guest: a push
backup action is instead describing a <target> (which ultimately could
be a remote network resource, but for simplicity the RNG just
validates a local file for now), and a pull backup action is instead
describing a temporary local file <scratch> (which probably should not
be a remote resource). A future refactoring may thus introduce some
way to parameterize RNG to accept <disk type='FOO'>...</disk> so that
the name of the subelement can be <source> for domain, or <target> or
<scratch> as needed for backups. Future patches may improve this area
of code.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Since the max unit of virtio scsi disk is 16383, update the range of
driveUnit to it.
Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Historically we've only supported the <backingStore> as an output-only
element for domain disks. The documentation states that it may become
supported on input. To allow management apps detectin once that happens
add a domain capability which will be asserted if the hypervisor driver
will be able to obey the <backingStore> as configured on input.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The 'ramfb' attribute provides a framebuffer to the guest that can be
used as a boot display for the vgpu
For example, the following configuration can be used to provide a vgpu
with a boot display:
<hostdev mode='subsystem' type='mdev' model='vfio-pci' display='on' ramfb='on'>
<source>
<address uuid='$UUID'/>
</source>
</hostdev>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
The libxl driver exposes a 'hap' feature in the capability XML but our
schema didn't cover it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This commit adds resolution element with parameters 'x' and 'y' into video
XML domain group definition. Both, properties were added into an element
called 'resolution' and it was added inside 'model' element. They are set
as optional. This element does not follow QEMU properties 'xres' and
'yres' format. Both HTML documentation and schema were changed too. This
commit includes a simple test case to cover resolution for QEMU video
models. The new XML format for resolution looks like:
<model ...>
<resolution x='800' y='600'/>
</model>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
This patch adds the implementation of the ccf-assist pSeries
feature, based on the QEMU_CAPS_MACHINE_PSERIES_CAP_CCF_ASSIST
capability that was added in the previous patch.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
This device is a very simple framebuffer device supported by qemu that
is mostly intended to use as a boot framebuffer in conjunction with a
vgpu. However, there is also a standalone ramfb device that can be used
as a primary display device and is useful for e.g. aarch64 guests where
different memory mappings between the host and guest can prevent use of
other devices with framebuffers such as virtio-vga.
https://bugzilla.redhat.com/show_bug.cgi?id=1679680 describes the
issues in more detail.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
vhost-user-gpu helper takes --render-node option to specify on which
GPU should the renderning be done.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Accept a new driver name attribute to specify usage of helper process, ex:
<video>
<driver name='vhostuser'/>
<model type='virtio'/>
</video>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Although <interface type='ethernet'> has always been able to use an
existing tap device, this is just a coincidence due to the fact that
the same ioctl is used to create a new tap device or get a handle to
an existing device.
Even then, once we have the handle to the device, we still insist on
doing extra setup to it (setting the MAC address and IFF_UP). That
*might* be okay if libvirtd is running as a privileged process, but if
libvirtd is running as an unprivileged user, those attempted
modifications to the tap device will fail (yes, even if the tap is set
to be owned by the user running libvirtd). We could avoid this if we
knew that the device already existed, but as stated above, an existing
device and new device are both accessed in the same manner, and
anyway, we need to preserve existing behavior for those who are
already using pre-existing devices with privileged libvirtd (and
allowing/expecting libvirt to configure the pre-existing device).
In order to cleanly support the idea of using a pre-existing and
pre-configured tap device, this patch introduces a new optional
attribute "managed" for the interface <target> element. This
attribute is only valid for <interface type='ethernet'> (since all
other interface types have mandatory config that doesn't apply in the
case where we expect the tap device to be setup before we
get it). The syntax would look something like this:
<interface type='ethernet'>
<target dev='mytap0' managed='no'/>
...
</interface>
This patch just adds managed to the grammar and parser for <target>,
but has no functionality behind it.
(NB: when managed='no' (the default when not specified is 'yes'), the
target dev is always a name explicitly provided, so we don't
auto-remove it from the config just because it starts with "vnet"
(VIR_NET_GENERATED_TAP_PREFIX); this makes it possible to use the
same pattern of names that libvirt itself uses when it automatically
creates the tap devices.)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Support 'Direct Mode' for Hyper-V Synthetic Timers in domain config.
Make it 'stimer' enlightenment option as it is not a separate thing.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
QEMU version 2.12.1 introduced a performance feature under commit
be7773268d98 ("target-i386: add KVM_HINTS_DEDICATED performance hint")
This patch adds a new KVM feature 'hint-dedicated' to set this performance
hint for KVM guests. The feature is off by default.
To enable this hint and have libvirt add "-cpu host,kvm-hint-dedicated=on"
to the QEMU command line, the following XML code needs to be added to the
guest's domain description in conjunction with CPU mode='host-passthrough'.
<features>
<kvm>
<hint-dedicated state='on'/>
</kvm>
</features>
...
<cpu mode='host-passthrough ... />
Signed-off-by: Wim ten Have <wim.ten.have@oracle.com>
Signed-off-by: Menno Lageman <menno.lageman@oracle.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
There is no restriction on maximum value of PCI domain. In fact,
Linux kernel uses plain atomic inc when assigning PCI domains:
drivers/pci/pci.c:static int pci_get_new_domain_nr(void)
drivers/pci/pci.c-{
drivers/pci/pci.c- return atomic_inc_return(&__domain_nr);
drivers/pci/pci.c-}
Of course, this function is called only if kernel was compiled
without PCI domain support or ACPI did not provide PCI domain.
However, QEMU still has the same restriction as us: in
set_pci_host_devaddr() QEMU checks if domain isn't greater than
0xffff. But one can argue that that's a QEMU limitation. We still
want to be able to cope with other hypervisors that don't have
this limitation (possibly).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Prepare for new checkpoint APIs by describing the XML that will
represent a checkpoint. The checkpoint XML is modeled heavily after
virDomainSnapshotPtr. See the docs for more details.
Add testsuite coverage for some minimal uses of the XML (bare minimum,
the sample from html, and a full dumpxml, and some counter-examples
that should fail schema validation). Although use of the REDEFINE flag
will require the <domain> subelement to be present, it is easier for
most of the tests to provide counterpart output produced with the
NO_DOMAIN flag (particularly since synthesizing a valid <domain>
during testing is not trivial).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Extend the TPM device XML parser and XML generator with emulator
state encryption support.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Add support for usage type vTPM to secret.
Extend the schema for the Secret to support the vTPM usage type
and add a test case for parsing the Secret with usage type vTPM.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Update schema and configuration to allow specifying new video type of
'bochs'. Add implementation and tests for qemu.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Similarly how we allow adding arbitrary command line arguments and
environment variables this patch introduces the ability to control
libvirt's perception of the qemu process by tweaking the capability bits
for testing purposes.
The idea is to allow developers and users either test a new feature by
enabling it early or disabling it to see whether it introduced
regressions.
This feature is not meant for production use though, so users should
handle it with care.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Allow using seclabels the same way as disk images allow it. Currently
the snapshot code copies the seclabels from the original image if no
seclabel is provided. Also there's no code change required as the
snapshot XML parser actually uses parts of the disk parser thus
seclabels are already parsed and formatted and even applied thus this is
just a formalization of our support for this.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
According to sPAPR, addresses are 32-bit (8 hex digits) rather
than 64-bit (16 hex digits). Update the schema accordingly.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Hosts for rbd are ceph monitor daemons. These have fixed IP addresses,
so they are often referenced by IP rather than hostname for
convenience, or to avoid relying on DNS. Using IPv4 addresses as the
host name works already, but IPv6 addresses require rbd-specific
escaping because the colon is used as an option separator in the
string passed to librados.
Escape these colons, and enclose the IPv6 address in square brackets
so it is distinguished from the port, which is currently mandatory.
Signed-off-by: Yi Li <yili@winhong.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>