Commit Graph

84 Commits

Author SHA1 Message Date
Fam Zheng e4654d2d94 block: per caller dirty bitmap
Previously a BlockDriverState has only one dirty bitmap, so only one
caller (e.g. a block job) can keep track of writing. This changes the
dirty bitmap to a list and creates a BdrvDirtyBitmap for each caller, the
lifecycle is managed with these new functions:

    bdrv_create_dirty_bitmap
    bdrv_release_dirty_bitmap

Where BdrvDirtyBitmap is a linked list wrapper structure of HBitmap.

In place of bdrv_set_dirty_tracking, a BdrvDirtyBitmap pointer argument
is added to these functions, since each caller has its own dirty bitmap:

    bdrv_get_dirty
    bdrv_dirty_iter_init
    bdrv_get_dirty_count

bdrv_set_dirty and bdrv_reset_dirty prototypes are unchanged but will
internally walk the list of all dirty bitmaps and set them one by one.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-29 13:40:33 +01:00
Peter Lieven d32f35cbc5 block: introduce BDRV_REQ_MAY_UNMAP request flag
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:51 +01:00
Peter Lieven aa7bfbfff7 block: add flags to bdrv_*_write_zeroes
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:51 +01:00
Fam Zheng 8442cfd034 migration: omit drive ref as we have bdrv_ref now
block-migration.c does not actually use DriveInfo anywhere.  Hence it's
safe to drive ref code, we really only care about referencing BDS.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Peter Lieven 323004a39d block-migration: efficiently encode zero blocks
this patch adds a efficient encoding for zero blocks by
adding a new flag indicating a block is completely zero.

additionally bdrv_write_zeros() is used at the destination
to efficiently write these zeroes. depending on the implementation
this avoids that the destination target gets fully provisioned.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-07-19 12:29:21 +08:00
Paolo Bonzini 9b09503752 migration: run setup callbacks out of big lock
Only the migration_bitmap_sync() call needs the iothread lock.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini 32c835ba39 migration: run pending/iterate callbacks out of big lock
This makes it possible to do blocking writes directly to the socket,
with no buffer in the middle.  For RAM, only the migration_bitmap_sync()
call needs the iothread lock.  For block migration, it is needed by
the block layer (including bdrv_drain_all and dirty bitmap access),
but because some code is shared between iterate and complete, all of
mig_save_device_dirty is run with the lock taken.

In the savevm case, the iterate callback runs within the big lock.
This is annoying because it complicates the rules.  Luckily we do not
need to do anything about it: the RAM iterate callback does not need
the iothread lock, and block migration never runs during savevm.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini 52e850dea9 block-migration: add lock
Some state is shared between the block migration code and its AIO
callbacks.  Once block migration will run outside the iothread,
the block migration code and the AIO callbacks will be able to
run concurrently.  Protect the critical sections with a separate
lock.  Do the same for completed_sectors, which can be used from
the monitor.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini 323920c4ea block-migration: document usage of state across threads
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini 13197e3cba block-migration: small preparatory changes for locking
Some small changes that will simplify the positioning of lock/unlock
primitives.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini a55ce1c851 block-migration: remove variables that are never read
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini d418cf57a3 block-migration: remove useless calls to blk_mig_cleanup
Now that the cancel callback is called consistently for all errors,
we can avoid doing its work in the other callbacks.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:00 +01:00
Stefan Hajnoczi 6aaa9dae80 block-migration: fix pending() and iterate() return values
The return value of .save_live_pending() is the number of bytes
remaining.  This is just an estimate because we do not know how many
blocks will be dirtied by the running guest.

Currently our return value for .save_live_pending() is wrong because it
includes dirty blocks but not in-flight bdrv_aio_readv() requests or
unsent blocks.  Crucially, it also doesn't include the bulk phase where
the entire device is transferred - therefore we risk completing block
migration before all blocks have been transferred!

The return value of .save_live_iterate() is the number of bytes
transferred this iteration.  Currently we return whether there are bytes
remaining, which is incorrect.

Move the bytes remaining calculation into .save_live_pending() and
really return the number of bytes transferred this iteration in
.save_live_iterate().

Also fix the %ld format specifier which was used for a uint64_t
argument.  PRIu64 must be use to avoid warnings on 32-bit hosts.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-id: 1360661835-28663-3-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-12 16:26:44 -06:00
Stefan Hajnoczi 2c5a7f2011 block-migration: fix block_save_iterate() return value
The .save_live_iterate() function returns 0 to continue iterating or 1
to stop iterating.

Since 16310a3cca it only ever returns 0,
leading to an infinite loop.

Return 1 if we have finished sending dirty blocks.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1360534366-26723-4-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-11 08:14:05 -06:00
Stefan Hajnoczi 9ee0cb201e block-migration: fix blk_mig_save_dirty_block() return value checking
Commit 43be3a25c9 changed the
blk_mig_save_dirty_block() return code handling.  The function's doc
comment says:

  /* return value:
   * 0: too much data for max_downtime
   * 1: few enough data for max_downtime
   */

Because of the 1 return value, callers must check for ret < 0 instead of
just:

  if (ret) { ... }

We do not want to bail when 1 is returned, only on error.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1360534366-26723-3-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-11 08:14:04 -06:00
Stefan Hajnoczi d5f1f286ef block-migration: improve "Unknown flags" error message
Show the actual flags value and include "block migration" in the error
message so it's clear where the error is coming from.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1360534366-26723-2-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-11 08:14:04 -06:00
Paolo Bonzini 50717e941b block: allow customizing the granularity of the dirty bitmap
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25 18:18:34 +01:00
Paolo Bonzini acc906c6c5 block: return count of dirty sectors, not chunks
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25 18:18:33 +01:00
Juan Quintela e4ed1541ac savevm: New save live migration method: pending
Code just now does (simplified for clarity)

    if (qemu_savevm_state_iterate(s->file) == 1) {
       vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
       qemu_savevm_state_complete(s->file);
    }

Problem here is that qemu_savevm_state_iterate() returns 1 when it
knows that remaining memory to sent takes less than max downtime.

But this means that we could end spending 2x max_downtime, one
downtime in qemu_savevm_iterate, and the other in
qemu_savevm_state_complete.

Changed code to:

    pending_size = qemu_savevm_state_pending(s->file, max_size);
    DPRINTF("pending size %lu max %lu\n", pending_size, max_size);
    if (pending_size >= max_size) {
        ret = qemu_savevm_state_iterate(s->file);
     } else {
        vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
        qemu_savevm_state_complete(s->file);
     }

So what we do is: at current network speed, we calculate the maximum
number of bytes we can sent: max_size.

Then we ask every save_live section how much they have pending.  If
they are less than max_size, we move to complete phase, otherwise we
do an iterate one.

This makes things much simpler, because now individual sections don't
have to caluclate the bandwidth (it was implossible to do right from
there).

Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-20 23:09:25 +01:00
Paolo Bonzini 9c17d615a6 softmmu: move include files to include/sysemu/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:32:45 +01:00
Paolo Bonzini 1de7afc984 misc: move include files to include/qemu/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:32:39 +01:00
Paolo Bonzini caf71f86a3 migration: move include files to include/migration/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:32 +01:00
Paolo Bonzini 737e150e89 block: move include files to include/block/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:31 +01:00
Juan Quintela 43be3a25c9 block-migration: handle errors with the return codes correctly
Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2012-10-17 18:34:59 +02:00
Juan Quintela ceb2bd09a1 block-migration: Switch meaning of return value
Make consistent the result of blk_mig_save_dirty_block() and
mig_save_device_dirty()

Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2012-10-17 18:34:59 +02:00
Juan Quintela 59feec4247 block-migration: make flush_blks() return errors
This means we don't need to pass through qemu_file to get the errors.
Adjust all callers.

Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2012-10-17 18:34:59 +02:00
Kevin Wolf 946d58be15 block-migration: Flush requests in blk_mig_cleanup
When cancelling block migration, all in-flight requests of the block
migration must be completed before the data can be freed. This was
visible as failing assertions and segfaults.

Reported-by: Peter Lieven <pl@dlhnet.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-28 17:43:28 +02:00
Juan Quintela 16310a3cca savevm: split save_live into stage2 and stage3
We split it into 2 functions, foo_live_iterate, and foo_live_complete.
At this point, we only remove the bits that are for the other stage,
functionally this is equivalent to previous code.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-07-20 08:19:27 +02:00
Juan Quintela d1315aac6e savevm: split save_live_setup from save_live_state
This patch splits stage 1 to its own function for both save_live
users, ram and block.  It is just a copy of the function, removing the
parts of the other stages.  Optimizations would came later.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-07-20 08:19:27 +02:00
Juan Quintela 6bd6878133 savevm: introduce is_active method
Enable the creation of a method to tell migration if that section is
active and should be migrate.  We use it for blk-migration, that is
normally not active.  We don't create the method for RAM, as setups
without RAM are very strange O:-)

Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-07-20 08:19:27 +02:00
Juan Quintela 9b5bfab05f savevm: Refactor cancel operation in its own operation
Intead of abusing stage with value -1.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-07-20 08:19:27 +02:00
Juan Quintela 7908c78d3e savevm: Live migration handlers register the struct directly
Notice that the live migration users never unregister, so no problem
about freeing the ops structure.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-07-20 08:19:27 +02:00
Isaku Yamahata 6607ae235b Add MigrationParams structure
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
2012-06-29 13:18:21 +02:00
Luiz Capitulino 539de1246d Purge migration of (almost) everything to do with monitors
The Monitor object is passed back and forth within the migration/savevm
code so that it can print errors and progress to the user.

However, that approach assumes a HMP monitor, being completely invalid
in QMP.

This commit drops almost every single usage of the Monitor object, all
monitor_printf() calls have been converted into DPRINTF() ones.

There are a few remaining Monitor objects, those are going to be dropped
by the next commit.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2012-03-15 10:39:52 -03:00
Paolo Bonzini 6b620ca3b0 prepare for future GPLv2+ relicensing
All files under GPLv2 will get GPLv2+ changes starting tomorrow.
event_notifier.c and exec-obsolete.h were only ever touched by Red Hat
employees and can be relicensed now.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-01-13 10:55:56 -06:00
Paolo Bonzini ad54ae80c7 block: bdrv_aio_* do not return NULL
Initially done with the following semantic patch:

@ rule1 @
expression E;
statement S;
@@
  E =
(
   bdrv_aio_readv
|  bdrv_aio_writev
|  bdrv_aio_flush
|  bdrv_aio_discard
|  bdrv_aio_ioctl
)
     (...);
(
- if (E == NULL) { ... }
|
- if (E)
    { <... S ...> }
)

which however missed the occurrence in block/blkverify.c
(as it should have done), and left behind some unused
variables.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-15 12:40:07 +01:00
Stefan Hajnoczi 922453bca6 block: convert qemu_aio_flush() calls to bdrv_drain_all()
Many places in QEMU call qemu_aio_flush() to complete all pending
asynchronous I/O.  Most of these places actually want to drain all block
requests but there is no block layer API to do so.

This patch introduces the bdrv_drain_all() API to wait for requests
across all BlockDriverStates to complete.  As a bonus we perform checks
after qemu_aio_wait() to ensure that requests really have finished.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05 14:56:06 +01:00
Stefan Weil 4238e26416 Fix some spelling bugs in documentation and comments
These errors were detected by codespell:

remaing -> remaining
soley -> solely
virutal -> virtual
seperate -> separate

libcacard.txt still needs some more patches.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-11-17 12:57:36 +00:00
Juan Quintela 2975725f6b migration: make *save_live return errors
Make *save_live() return negative values when there is one error, and
updates all callers to check for the error.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2011-10-20 13:23:52 +02:00
Juan Quintela 42802d47dd migration: use qemu_file_get_error() return value when possible
Signed-off-by: Juan Quintela <quintela@redhat.com>
2011-10-20 13:23:52 +02:00
Juan Quintela 624b9cc209 migration: rename qemu_file_has_error to qemu_file_get_error
Now the function returned errno, so it is better the new name.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
2011-10-20 13:23:52 +02:00
Juan Quintela dcd1d224df migration: change has_error to contain errno values
We normally already have an errno value.  When not, abuse EIO.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2011-10-20 13:23:52 +02:00
Anthony Liguori 7267c0947d Use glib memory allocation and free functions
qemu_malloc/qemu_free no longer exist after this commit.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-20 23:01:08 -05:00
Markus Armbruster 6daf194dde Strip trailing '\n' from error_report()'s first argument
error_report() prepends location, and appends a newline.  The message
constructed from the arguments should not contain a newline.  Fix the
obvious offenders.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-06-24 09:13:36 +01:00
Avishay Traeger ff5c52a379 Improve accuracy of block migration bandwidth calculation
block_mig_state.total_time is currently the sum of the read request
latencies.  This is not very accurate because block migration uses aio and
so several requests can be submitted at once.  Bandwidth should be computed
with wall-clock time, not by adding the latencies.  In this case,
"total_time" has a higher value than it should, and so the computed
bandwidth is lower than it is in reality.  This means that migration can
take longer than it needs to.
However, we don't want to use pure wall-clock time here.  We are computing
bandwidth in the asynchronous phase, where the migration repeatedly wakes
up and sends some aio requests.  The computed bandwidth will be used for
synchronous transfer.

Signed-off-by: Avishay Traeger <avishay@il.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-04-27 14:36:57 +02:00
Avishay Traeger 155eb9aa09 Fix integer overflow in block migration bandwidth calculation
block_mig_state.reads is an int, and multiplying by BLOCK_SIZE yielded a
negative number, resulting in a negative bandwidth (running on a 32-bit
machine).  Change order to avoid.

Signed-off-by: Avishay Traeger <avishay@il.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-04-07 13:51:48 +02:00
Marcelo Tosatti 8591675f44 block: enable in_use flag
Set block device in use during block migration, disallow drive_del and
bdrv_truncate for in use devices.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-02-07 12:51:19 +01:00
Marcelo Tosatti f48905d44f block-migration: add reference to target DriveInfo
So that ejection of attached device by guest does not free data
in use by block migration instance.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-02-07 12:51:19 +01:00
Marcelo Tosatti 8f794c557c block-migration: actually disable dirty tracking on cleanup
Call to set_dirty_tracking() is misplaced.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-02-07 12:51:19 +01:00
Pierre Riteau 77358b59f6 Fix block migration when the device size is not a multiple of 1 MB
b02bea3a85 added a check on the return
value of bdrv_write and aborts migration when it fails. However, if the
size of the block device to migrate is not a multiple of BLOCK_SIZE
(currently 1 MB), the last bdrv_write will fail with -EIO.

Fixed by calling bdrv_write with the correct size of the last block.

Signed-off-by: Pierre Riteau <Pierre.Riteau@irisa.fr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-01-24 16:41:50 +01:00