QCOW2 images now support 'extended_l2' which splits the default clusters
into 32 subcluster allocation units. This allows the allocation units to
be smaller without increasing the size of L2 table too much and thus also
the cache requirements for holding the full L2 table in memory.
Unfortunately it's incompatible with qemu versions older than 5.2 thus
can't be used as default.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function was formatted weirdly which prompted additions to conform
to the unusual style.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
After previous commit, it's no longer possible to change nodeset
for strict numatune. Therefore, we can start generating
host-nodes onto command line again.
This partially reverts d73265af6e.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Let's imagine a guest that's configured with strict numatune:
<numatune>
<memory mode='strict' nodeset='0'/>
</numatune>
For guests with NUMA:
Depending on machine type used (see commit v6.4.0-rc1~75) we
generate either:
1) -object '{"qom-type":"memory-backend-ram","id":"ram-node0",\
"size":20971520,"host-nodes":[0],"policy":"preferred"}' \
-numa node,nodeid=0,cpus=0,memdev=ram-node0
or
2) -numa node,nodeid=0,cpus=0,mem=20480
Later, when QEMU boots up and cpuset CGroup controller is
available we further restrict QEMU there too. But there's a
behaviour difference hidden: while in case 1) QEMU is restricted
from beginning, in case 2) it is not and thus it may happen that
it will allocate memory from different NUMA node and even though
CGroup will try to migrate it, it may fail to do so (e.g. because
memory is locked). Therefore, one can argue that case 2) is
broken. NB, case 2) is exactly what mode 'restrictive' is for.
However, in case 1) we are unable to update QEMU with new
host-nodes, simply because it's lacking a command to do so.
For guests without NUMA:
It's very close to case 2) from above. We have commit
v7.10.0-rc1~163 that prevents us from outputting host-nodes when
generating memory-backend-* for system memory, but that simply
allows QEMU to allocate memory anywhere and then relies on
CGroups to move it to desired location.
Due to all of this, there is no reliable way to change nodeset
for mode 'strict'. Let's forbid it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The whole idea of VIR_DOMAIN_NUMATUNE_MEM_RESTRICTIVE is that the
memory location is restricted only via CGroups and thus can be
changed on the fly (which is exactly what
qemuDomainSetNumaParamsLive() does. Allow this mode there then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Set the kernel-hashes property on the sev-guest object if the config
asked for it explicitly. While QEMU machine types currently default to
having this setting off, it is not guaranteed to remain this way.
We can't assume that the QEMU capabilities were generated on an AMD host
with SEV, so we must force set the QEMU_CAPS_SEV_GUEST. This also means
that the 'sev' info in the qemuCaps struct might be NULL, but this is
harmless from POV of testing the CLI generator.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This sev-guest object property indicates whether QEMU should
expose the kernel, ramdisk, cmdline hashes to the firmware
for measurement.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Normally the SEV measurement only covers the firmware
loader contents. When doing a direct kernel boot, however,
with new enough OVMF it is possible to ask for the
measurement to cover the kernel, ramdisk and command line.
It can't be done automatically as that would break existing
guests using direct kernel boot with old firmware, so there
is a new XML setting allowing this behaviour to be toggled.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The VNC password authentication scheme is quite horrendous in that it
takes the user password and directly uses it as a DES case. DES is a
byte 8 keyed cipher, so the VNC password can never be more than 8
characters long. Anything over that length will be silently dropped.
We should validate this length restriction when accepting user XML
configs and report an error. For the global VNC password we don't
really want to break daemon startup by reporting an error, but
logging a warning is worthwhile.
https://bugzilla.redhat.com/show_bug.cgi?id=1506689
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Over time, the code using them got split into other files.
(Mostly qemu_interface.c and qemu_process.c)
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ani Sinha <ani@anisinha.ca>
'virDomainDiskInsert' orders the inserted disks by target. If the target
is not provided though it would try to parse it anyways. This lead to a
crash when parsing a definition where there are multiple disks and of
two disks sharing the bus at least one also misses the target.
Since we want to actually use the parser for stuff which doesn't
necessarily need the disk target, we make virDomainDiskInsert tolerant
of missing target instead. The definition will be rejected by the
validator regardless of the order the disks were inserted in.
Fixes: 61fd7174
Closes: https://gitlab.com/libvirt/libvirt/-/issues/257
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Commit 52521de8332c2323bd ("qemu: Use qemuDomainSaveStatus") replaced a call
to virDomainObjSave() with qemuDomainSaveStatus() as a part of cleanup. Since
qemuDomainSaveStatus() does not indicate any failure through its return code,
the error handling cleanup code got eliminated in the process. Thus upon
failure, we will no longer killing the started qemu process. This commit fixes
this by reverting the change that was introduced with the above commit.
Fixes: 52521de8332c2323bd ("qemu: Use qemuDomainSaveStatus")
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Now that we only check whether the dnsmasq version is new enough,
there is no need for the caps field.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Since dnsmasq supports --ra-param for a long time, this code is now
unused.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
All the capabilities should be supported in 2.67.
Make this the minimum version, since even the oldest
distros we support have moved on:
Debian 8: 2.72
CentOS 7: 2.76
Ubuntu 18.04: 2.79
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
The admin module is very closely tied to RPC. If we are
building without RPC support there's not much use for the
admin module, in fact it fails to build.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The logging manager is very closely tied to RPC. If we are
building without RPC support there's not much use for the
manager, in fact it fails to build.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Our RPC layer is as tied to XDR as possible. Therefore, if we
haven't detected and XDR library there's not much sense in trying
to build RPC layer.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There's nothing RPC specific about virnettlscontext.c or
virnetsocket.c. We use TLS for other things than just RPC
encryption (e.g. for generating random numbers) and sockets can
be used even without RPC.
Move these two sources into a static library (virt_socket) so
that other areas can use it even when RPC is disabled.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When implementing sparse streams, one of improvements I did was
to increase client buffer size for sending/receiving stream data
(commit v1.3.5-rc1~502). Previously, we were using 64KiB buffer
while packets on RPC are 256KiB (usable data is slightly less
because of the header). This meant that it took multiple calls of
virStreamRecv()/virStreamSend() to serve a single packet of data.
In my fix, I've included the virnetprotocol.h file which provides
VIR_NET_MESSAGE_LEGACY_PAYLOAD_MAX macro which is the exact size
of data in a single packet. However, including the file from
libvirt-stream.c which implements public APIs is not right. If
RPC module is not built then the file doesn't exists.
Redefine the macro and drop the include. The size can never
change anyways.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
And its callers. The parameter is no longer used since virDomainObjSave
was replaced with qemuDomainSaveStatus wrapper.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
It is a nice wrapper around virDomainObjSave which logs a warning, but
otherwise ignores the error. Let's use it where appropriate.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When return-path is enabled, QEMU on the source host won't report
completed migration until the destination QEMU sends a confirmation it
successfully loaded all data. Libvirt would detect such situation in the
Finish phase and report the error read from QEMU's stderr back to the
source, but using return-path could give use a bit better error
reporting with an earlier restart of vCPUs on the source.
The capability is only enabled when the connection between QEMU
processes on the source and destination hosts is bidirectional. In other
words, only when VIR_MIGRATE_TUNNELLED is not set, because our tunnel
only allows one-way communication from the source to the destination.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
So far we were enabling specific migration capabilities when a
corresponding API flag is set. We need to generalize our code to be able
to enable some migration capabilities unless a particular API flag is
used.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Different CPU generations have different limits on the number
of SEV/SEV-ES guests that can be run. Since both limits come
from the same overall set, there is typically also BIOS config
to set the tradeoff betweeen SEV and SEV-ES guest limits.
This is important information to expose for a mgmt application
scheduling guests to hosts.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This will be needed directly in the QEMU driver in a later patch.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
There are limits on the number of SEV/SEV-ES guests that can
be run on machines, which may be influenced by firmware
settings. This is important to expose to users.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Report extra info about the SEV setup, returning those fields
that are required to calculate the expected launch measurement
HMAC(0x04 || API_MAJOR || API_MINOR || BUILD ||
GCTX.POLICY || GCTX.LD || MNONCE; GCTX.TIK)
specified in section 6.5.1 of AMD Secure Encrypted
Virtualization API.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We're only returning the set of fields needed to perform an
attestation, per the SEV API docs.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Querying launch params on a inactive guest currently triggers
a warning about the monitor being NULL.
https://bugzilla.redhat.com/show_bug.cgi?id=2030437
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Since commit 46783e6307, the 'virsh dominfo' command calls
virDomainGetMessages to report any messages from the domain.
Hypervisors not implementing the API now get the following
libvirtd log message when clients invoke 'virsh dominfo'
this function is not supported by the connection driver: virDomainGetMessages
Although libxl currently does not support any tainting or
deprecation messages, provide an implementation to squelch
the previously unseen error message when collecting dominfo.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Currently, this attribute may either have a value of "custom", or be absent
(which defaults to "custom"), for backwards compatibility.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use two variables with automatic cleanup instead of reusing one.
Remove the pointless cleanup label.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use g_auto and get rid of the cleanup label, as well as the ret
variable.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use g_auto where possible, reduce scope of some variables and remove
pointless ret and rc variables.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
On QEMU command line it's represented by the dirty-ring-size
attribute of KVM accelerator.
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Dirty ring feature was introduced in qemu-6.1.0, this patch
add the corresponding feature named 'dirty-ring', which enable
dirty ring feature when starting VM.
To enable the feature, the following XML needs to be added to
the guest's domain description:
<features>
<kvm>
<dirty-ring state='on' size='xxx'>
</kvm>
</features>
If property "state=on", property "size" must be specified, which
should be power of 2 and range in [1024, 65526].
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In future commits we will need to store not just an array of
VIR_TRISTATE_SWITCH_* but also an additional integer. Follow the
example of TCG and introduce a structure where both the array an
integer can live.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There is no longer anything to initialize at binary startup time.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Since the currentBackend (direct vs. firewalld) setting is no longer
used for anything, we don't need to set it (either explicitly from
tests, or implicitly during init), and can completely remove it.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
It's unclear exactly why this check exists; possibly a parallel to a
long-removed check for the firewall-cmd binary (added to viriptables.c
with the initial support for firewalld in commit bf156385a0 in 2012,
and long since removed), or possibly because virFirewallOnceInit() was
intended to be called at daemon startup, and it seemed like a good
idea to just log this error once when trying to determine whether to
use firewalld, or direct iptables commands, and then not waste time
building commands that could never be executed. The odd thing is that
it would sometimes result in logging an error when it couldn't find a
binary that wasn't needed anyway (e.g., if all the rules were iptables
rules, but ebtables and/or ip6tables weren't also installed).
If we just remove this check, then virCommandRun() will end up logging
an error and failing if the needed binary isn't found when we try to
execute it, which seems like it should just as good (or at least good
enough, especially since we eventually want to get rid of iptables
completely).
So let's remove it!
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This function doesn't have anything to do with manipulating
virFirewall objects, but rather should be called in response to dbus
events about the firewalld service. Move this function into
virfirewalld.c, and rename it to virFirewallDSynchronize().
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This function doesn't need to check for a backend - synchronization
with firewalld should always be done whenever firewalld is registered
and available, not just when the firewalld backend is selected.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Since commit b19863640 both useful cases of the switch statement in
this function have made the same call (and the other/default case is
just an error that can never happen). Eliminate the switch to help
eliminate use of currentBackend.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Rather than calling these "ADD" and "REMOVE", which could be confused
with some other random items with the same names, make them more
specific by prepending "VIR_NETFILTER_" (because they will also be
used by the nftables backend) and rename them to match the
iptables/nftables operators they signify, i.e. INSERT and DELETE, just
to eliminate confusion (in particular, in case someone ever decides
that we need to also use the nftables "add" operator, which appends a
rule to a chain rather than inserting it at the beginning of the
chain).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This function formats an address + prefix as, e.g. 192.168.122.0/24,
which is useful in places other than iptables. Move it to
virsocketaddr.c and make it public so that others can use it. While
moving, the bit that masks off the host bits of the address is made
optional, so that the function is more generally useful.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The network driver has put all its rules into private chains (created
by libvirt) since commit 7431b3eb9a, which was included in
libvirt-5.1.0. When the conversion was made, code was included that
would attempt to delete existing rules in the default chains, to make
it possible to upgrade libvirt without restarting the host OS.
Almost 3 years has passed, and it is doubtful that anyone will be
attempting to upgrade directly from a pre-5.1.0 libvirt to something
as new as 8.0.0 (possibly with the exception of upgrading the entire
OS to a new release, which would include also rebooting), so it is now
safe to remove this code.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use g_auto for virCommand and char * and drop the cleanup label.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use automatic cleanup for virCommand, steal it on success
and remove the error label.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use g_auto and remove the ret variable, as well as the cleanup label.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use g_auto and remove the 'ret' variable, as well as the out label.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use g_auto and remove the 'ret' variable, as well as the cleanup label.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use g_auto and remove the 'ret' variable, as well as the cleanup label.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Generating command line is pretty easy - just put tb-size=XXX
onto -accel tcg part. Note, that QEMU expects the size in MiB.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/229
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
After previous commit it's possible for domains to fine tune TCG
features (well, just one - tb-cache). Check that domain has TCG
enabled, otherwise the feature makes no sense.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
It may come handy to be able to tweak TCG options, in this
specific case the size of translation block cache size (tb-size).
Since we can expect more knobs to tweak let's put them under
common element, like this:
<domain>
<features>
<tcg>
<tb-cache unit='MiB'>128</tb-cache>
</tcg>
</features>
</domain>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When using the monolithic daemon the driver for virStream is
always virFDStreamDrv and thus calling virStreamInData() results
in calling virFDStreamInData().
But things are different with split daemon, especially when a
client connects to one of hypervisor daemons (e.g. virtqemud) and
then lets the daemon connect to the storage daemon for
vol-upload/vol-download. Here, the hypervisor daemon acts like
both client and server. This is reflected by stream->driver
pointing to remoteStreamDrv, which doesn't have streamInData
callback implemented and thus vol-upload/vol-download with sparse
flag fails.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2026537
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The aim of this function is to look at a virNetClientStream and
tell whether the incoming packet (if there's one) contains data
(type VIR_NET_STREAM) or a hole (type VIR_NET_STREAM_HOLE) and
how big the section is. This function will be called from the
remote driver in one of future commits.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>