And remove the superfluous integer return value.
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Error propagation is already there for socket backends. Add it to other
protocols, simplifying code that tests for errors that will never happen.
With all protocols understanding Error, the code can be simplified
further by removing the return value.
Unfortunately, the quality of error messages varies depending
on where the error is detected, because no Error is passed to the
NonBlockingConnectHandler. Thus, the exact error message still cannot
be sent to the user if the OS reports it asynchronously via SO_ERROR.
If NonBlockingConnectHandler received an Error**, we could for
example report the error class and/or message via a new field of the
query-migration command even if it is reported asynchronously.
Before:
(qemu) migrate fd:ffff
migrate: An undefined error has occurred
(qemu) info migrate
(qemu)
After:
(qemu) migrate fd:ffff
migrate: File descriptor named 'ffff' has not been found
(qemu) info migrate
capabilities: xbzrle: off
Migration status: failed
total time: 0 milliseconds
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This makes migration-unix.c again a cut-and-paste job from migration-tcp.c,
exactly as it was in the beginning. :)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The call to migrate_fd_error() was missing for non-socket backends, so
centralize it in qmp_migrate().
Before:
(qemu) migrate fd:ffff
migrate: An undefined error has occurred
(qemu) info migrate
(qemu)
After:
(qemu) migrate fd:ffff
migrate: An undefined error has occurred
(qemu) info migrate
capabilities: xbzrle: off
Migration status: failed
total time: 0 milliseconds
(The awful error message will be fixed later in the series).
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The migration code is using errp to detect "internal" errors, this means
that it relies on errp being non-NULL.
No impact so far because our only QMP clients (the QMP marshaller and HMP)
never pass a NULL Error **. But if we had others, this patch would make
sure that migration can work with a NULL Error **.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We only use it once, just remove the callback indirection.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
We only used it once, just remove the callback indirection
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Change XBZRLE cache size in bytes (the size should be a power of 2, it will be
rounded down to the nearest power of 2).
If XBZRLE cache size is too small there will be many cache miss.
New query-migrate-cache-size QMP command and 'info migrate_cache_size' HMP
command to query cache value.
Signed-off-by: Benoit Hudzia <benoit.hudzia@sap.com>
Signed-off-by: Petter Svard <petters@cs.umu.se>
Signed-off-by: Aidan Shribman <aidan.shribman@sap.com>
Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
In the outgoing migration check to see if the page is cached and
changed, then send compressed page by using save_xbrle_page function.
In the incoming migration check to see if RAM_SAVE_FLAG_XBZRLE is set
and decompress the page (by using load_xbrle function).
Signed-off-by: Benoit Hudzia <benoit.hudzia@sap.com>
Signed-off-by: Petter Svard <petters@cs.umu.se>
Signed-off-by: Aidan Shribman <aidan.shribman@sap.com>
Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The management can enable/disable a capability for the next migration by using
migrate-set-capabilities QMP command.
The user can use migrate_set_capability HMP command.
Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The management can query the current migration capabilities using
query-migrate-capabilities QMP command.
The user can use 'info migrate_capabilities' HMP command.
Currently only XBZRLE capability is available.
Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
We add time spent for migration to the output of "info migrate"
command. 'total_time' means time since the start fo migration if
migration is 'active', and total time of migration if migration is
completed. As we are also interested in transferred ram when
migration completes, adding all ram statistics
Signed-off-by: Juan Quintela <quintela@redhat.com>
Use help functions in qemu-socket.c for tcp migration,
which already support ipv6 addresses.
Currently errp will be set to UNDEFINED_ERROR when migration fails,
qemu would output "migration failed: ...", and current user can
see a message("An undefined error has occurred") in monitor.
This patch changed tcp_start_outgoing_migration()/inet_connect()
/inet_connect_opts(), socket error would be passed back,
then current user can see a meaningful err message in monitor.
Qemu will exit if listening fails, so output socket error
to qemu stderr.
For IPv6 brackets must be mandatory if you require a port.
Referencing to RFC5952, the recommended format is:
[2312::8274]:5200
test status: Successed
listen side: qemu-kvm .... -incoming tcp:[2312::8274]:5200
client side: qemu-kvm ...
(qemu) migrate -d tcp:[2312::8274]:5200
Signed-off-by: Amos Kong <akong@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Wakeup the guest when the live part of the migation is finished.
This avoids being in suspended state on migration, so we don't
have to save the is_suspended bit.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
The migrate command is one of those commands where HMP and QMP completely
mix up together. This made the conversion to the QAPI (which separates the
command into QMP and HMP parts) a bit difficult.
The first important change to be noticed is that this commit completes the
removal of the Monitor object from migration code, started by the previous
commit.
Another important and tricky change is about supporting the non-detached
mode. That is, if the user doesn't pass '-d' the migrate command will lock
the monitor and will only release it when migration is finished.
To support this in the new HMP command (hmp_migrate()), it is necessary
to create a timer which runs every second and checks if the migration is
still active. If it is, the timer callback will re-schedule itself to run
one second in the future. If the migration has already finished, the
monitor lock is released and the user can use it normally.
All these changes should be transparent to the user.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
The Monitor object is passed back and forth within the migration/savevm
code so that it can print errors and progress to the user.
However, that approach assumes a HMP monitor, being completely invalid
in QMP.
This commit drops almost every single usage of the Monitor object, all
monitor_printf() calls have been converted into DPRINTF() ones.
There are a few remaining Monitor objects, those are going to be dropped
by the next commit.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Notifiers do not need to access both ends of the list, and using
a QLIST also simplifies the API.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
All files under GPLv2 will get GPLv2+ changes starting tomorrow.
event_notifier.c and exec-obsolete.h were only ever touched by Red Hat
employees and can be relicensed now.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Also, we now return the qemu_fclose() value unchanged to the caller. For
reference, the migrate_fd_cleanup() callers are the following:
- migrate_fd_completed(): any negative value is considered an
error, so the change is OK.
- migrate_fd_error(): doesn't check the migrate_fd_cleanup() return value
- migrate_fd_cancel(): doesn't check the migrate_fd_cleanup() return
value
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Image files have two types of data: immutable data that describes things like
image size, backing files, etc. and mutable data that includes offset and
reference count tables.
Today, image formats aggressively cache mutable data to improve performance. In
some cases, this happens before a guest even starts. When dealing with live
migration, since a file is open on two machines, the caching of meta data can
lead to data corruption.
This patch addresses this by introducing a mechanism to invalidate any cached
mutable data a block driver may have which is then used by the live migration
code.
NB, this still requires coherent shared storage. Addressing migration without
coherent shared storage (i.e. NFS) requires additional work.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This lets different subsystems register an Error that is thrown whenever
migration is attempted. This works nicely because it gracefully supports
things like hotplug.
Right now, if multiple errors are registered, only one of them is reported.
I expect that for 1.1, we'll extend query-migrate to return all of the reasons
why migration is disabled at any given point in time.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Migration with fd uses s->mon to pass the fd. But we only assign the
s->mon for !detached migration. Fix it. Once there add a comment
indicating that s->mon has two uses.
Bug reported by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
CC: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
A simple migration reproduces it:
1. Start the source VM with:
# qemu [...] -S
2. Start the destination VM with:
# qemu <source VM cmd-line> -incoming tcp:0:4444
3. In the source VM:
(qemu) migrate -d tcp:0:4444
4. The source VM will segfault as soon as migration completes (might not
happen in the first try)
What is happening here is that qemu_file_put_notify() can end up closing
's->file' (in which case it's also set to NULL). The call stack is rather
complex, but Eduardo helped tracking it to:
select loop -> migrate_fd_put_notify() -> qemu_file_put_notify() ->
buffered_put_buffer() -> migrate_fd_put_ready() ->
migrate_fd_completed() -> migrate_fd_cleanup().
To be honest, it's not completely clear to me in which cases 's->file'
is not closed (on error maybe)? But I doubt this fix will make anything
worse.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This means we can remove the two forward declarations.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
It is only used inside migration.c, and fields on that struct are
accessed all around the place on that file.
Signed-off-by: Juan Quintela <quintela@redhat.com>
We called it from a single place, and always with state !=
MIG_STATE_ACTIVE. Just remove the whole callback. For users of the
notifier, notice that this is exactly the case where they don't care,
we are just freeing the state from previous failed migration (it can't
be a sucessful one, otherwise we would not be running on that machine
in the first place).
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
This function is a bit different of the others that change the state,
in the sense that if migrate_fd_cleanup() returns an error, it set the
status to error, not completed.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Use MIG_STATE_ACTIVE only when migration has really started. Use this
new state to setup migration parameters. Change defines for an
anonymous struct.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Once there, remove all parameters that don't need to be passed to
*start_outgoing_migration() functions
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
I have to move two functions postions to avoid forward declarations
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Now the function returned errno, so it is better the new name.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
make functions propagate errno, instead of just using -EIO. Add a
comment about what are the return value of qemu_savevm_state_iterate().
Signed-off-by: Juan Quintela <quintela@redhat.com>
Although migrate_fd_put_buffer() sets MIG_STATE_ERROR if it failed,
since migrate_fd_put_notify() isn't checking error of underlying
QEMUFile, those resources are kept open. This patch checks it and
calls migrate_fd_error() in case of error.
Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Once there, make sure that if we already know that there is one error,
just call migration_fd_cleanup() with the ERROR state.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
It should be a matter of allowing the transition POSTMIGRATE ->
FINISH_MIGRATE, but it turns out that the VM won't do the
transition the second time because it's already stopped.
So this commit also adds vm_stop_force_state() which performs
the transition even if the VM is already stopped.
While there also allow other states to migrate.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Next commit will convert the query-status command to use the
RunState type as generated by the QAPI.
In order to "transparently" replace the current enum by the QAPI
one, we have to make some changes to some enum values.
As the changes are simple renames, I'll do them in one shot. The
changes are:
- Rename the prefix from RSTATE_ to RUN_STATE_
- RUN_STATE_SAVEVM to RUN_STATE_SAVE_VM
- RUN_STATE_IN_MIGRATE to RUN_STATE_INMIGRATE
- RUN_STATE_PANICKED to RUN_STATE_INTERNAL_ERROR
- RUN_STATE_POST_MIGRATE to RUN_STATE_POSTMIGRATE
- RUN_STATE_PRE_LAUNCH to RUN_STATE_PRELAUNCH
- RUN_STATE_PRE_MIGRATE to RUN_STATE_PREMIGRATE
- RUN_STATE_RESTORE to RUN_STATE_RESTORE_VM
- RUN_STATE_PRE_MIGRATE to RUN_STATE_FINISH_MIGRATE
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Test against RSTATE_IN_MIGRATE instead.
Please, note that the RSTATE_IN_MIGRATE state is only set when all the
initial VM setup is done, while 'incoming_expected' was set right in
the beginning when parsing command-line options. Shouldn't be a problem
as far as I could check.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Currently, only vm_start() and vm_stop() change the VM state.
That's, the state is only changed when starting or stopping the VM.
This commit adds the runstate_set() function, which makes it possible
to also do state transitions when the VM is stopped or running.
Additional states are also added and the current state is stored.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Today, when notifying a VM state change with vm_state_notify(),
we pass a VMSTOP macro as the 'reason' argument. This is not ideal
because the VMSTOP macros tell why qemu stopped and not exactly
what the current VM state is.
One example to demonstrate this problem is that vm_start() calls
vm_state_notify() with reason=0, which turns out to be VMSTOP_USER.
This commit fixes that by replacing the VMSTOP macros with a proper
state type called RunState.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
If migration failed in migrate_fd_put_buffer, the monitor may have been
resumed not only in the error path of that function but also once again
in migrate_fd_put_ready which is called unconditionally by
migrate_fd_connect.
Fix this by establishing a cleaner policy: the monitor shall be resumed
when the migration file is closed, either via callback
(migrate_fd_close) or in migrate_fd_cleanup if no file is open (i.e. no
callback invoked).
Reported-By: Michael Tokarev <mjt@tls.msk.ru>
Tested-By: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This allows to pass additional information to the notifier callback
which is useful if sender and receiver do not share any other distinct
data structure.
Will be used first for the clock reset notifier.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Define and use dedicated constants for vm_stop reasons, they actually
have nothing to do with the EXCP_* defines used so far. At this chance,
specify more detailed reasons so that VM state change handlers can
evaluate them.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
This patch adds functions to register and unregister notifiers for
migration state changes and a function to query the migration state.
The notifier is called on every state change. Once after establishing a
new migration object (which is in active state then) and once when the
state changes from active to completed, canceled or error.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The no_migrate save state flag is currently only checked in the
last phase of migration. This means that we potentially waste
a lot of time and bandwidth with the live state handlers before
we ever check the no_migrate flags. The error message printed
when we catch a non-migratable device doesn't get printed for
a detached migration. And, no_migrate does nothing to prevent
an incoming migration to a target that includes a non-migratable
device. This attempts to fix all of these.
One notable difference in behavior is that an outgoing migration
now checks for non-migratable devices before ever connecting to
the target system. This means the target will remain listening
rather than exit from failure.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There's no need to flush requests after vmstop
as vmstop does it for us automatically now.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Jason Wang <jasowang@redhat.com>
I'd like to disable bandwidth limit or make it very high,
Use int64_t all over to make values >= 4g work.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Jason Wang <jasowang@redhat.com>
Clarify default value of MB in migration speed argument in monitor, if
no suffix is specified. This differ from previous default of bytes,
but is consistent with the rest of the places where we accept a size
argument.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
If ->write fails, declare migration status as MIG_STATE_ERROR.
Also, in buffered_file.c, ->close the object in case of an
error.
Fixes "migrate -d "exec:dd of=file", where dd fails to open file.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When a 'cont' is issued on a VM that's just waiting for an incoming
migration, the VM reboots and boots into the guest, possibly corrupting
its storage since it could be shared with another VM running elsewhere.
Ensure that a VM started with '-incoming' is only run when an incoming
migration successfully completes.
A new qerror, QERR_MIGRATION_EXPECTED, is added to signal that 'cont'
failed due to no incoming migration has been attempted yet.
Reported-by: Laine Stump <laine@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Historically, user monitor arguments beginning with '-' (eg. '-f')
were passed as integers down to handlers.
I've maintained this behavior in the new monitor because we didn't
have a boolean type at the very beginning of QMP. Today we have it
and this behavior is causing trouble to QMP's argument checker.
This commit fixes the problem by doing the following changes:
1. User Monitor
Before: the optional arg was represented as a QInt, we'd pass 1
down to handlers if the user specified the argument or
0 otherwise
This commit: the optional arg is represented as a QBool, we pass
true down to handlers if the user specified the
argument, otherwise _nothing_ is passed
2. QMP
Before: the client was required to pass the arg as QBool, but we'd
convert it to QInt internally. If the argument wasn't passed,
we'd pass 0 down
This commit: still require a QBool, but doesn't do any conversion and
doesn't pass any default value
3. Convert existing handlers (do_eject()/do_migrate()) to the new way
Before: Both handlers would expect a QInt value, either 0 or 1
This commit: Change the handlers to accept a QBool, they handle the
following cases:
A) true is passed: the option is enabled
B) false is passed: the option is disabled
C) nothing is passed: option not specified, use
default behavior
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Although there is no difference, other migration related code use
qemu_free(), and it should be better to be consistent.
Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch makes sure that if the exec: process exits with a non-zero return
status, we treat the migration as failed.
This fixes https://bugs.launchpad.net/qemu/+bug/391879
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Previous commit added QMP documentation to the qemu-monitor.hx
file, it's is a copy of this information.
While it's good to keep it near code, maintaining two copies of
the same information is too hard and has little benefit as we
don't expect client writers to consult the code to find how to
use a QMP command.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
While there I'm also dropping a unneeded else clause (the last
one in the function).
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The following handlers always succeed and hence can be converted
to cmd_new_ret() in the same commit.
- do_stop()
- do_quit()
- do_system_reset()
- do_system_powerdown()
- do_migrate_cancel()
- do_qmp_capabilities()
- do_migrate_set_speed()
- do_migrate_set_downtime()
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
It's not needed to check the return of qobject_from_jsonf()
anymore, as an assert() has been added there.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>