openkylin/qemu - qemu - 红山开源项目托管

Commit Graph

Author	SHA1	Message	Date
Fam Zheng	cb6414dfec	vhdx: Use QEMU UUID API This removes our dependency to libuuid, so that the driver can always be built. Similar to how we handled data plane configure options, --enable-vhdx and --disable-vhdx are also changed to a nop with a message saying it's obsolete. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-Id: <1474432046-325-4-git-send-email-famz@redhat.com>	2016-09-23 11:42:52 +08:00
Colin Lord	4be4879ff8	blockdev: Modularize nfs block driver Modularizes the nfs block driver so that it gets dynamically loaded. Signed-off-by: Colin Lord <clord@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1471008424-16465-5-git-send-email-clord@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-09-20 22:12:57 +02:00
Marc Mari	88d88798b7	blockdev: Add dynamic module loading for block drivers Extend the current module interface to allow for block drivers to be loaded dynamically on request. The only block drivers that can be converted into modules are the drivers that don't perform any init operation except for registering themselves. In addition, only the protocol drivers are being modularized, as they are the only ones which see significant performance benefits. The format drivers do not generally link to external libraries, so modularizing them is of no benefit from a performance perspective. All the necessary module information is located in a new structure found in module_block.h This spoils the purpose of `5505e8b76f` (block/dmg: make it modular). Before this patch, if module build is enabled, block-dmg.so is linked to libbz2, whereas the main binary is not. In downstream, theoretically, it means only the qemu-block-extra package depends on libbz2, while the main QEMU package needn't to. With this patch, we (temporarily) change the case so that the main QEMU depends on libbz2 again. Signed-off-by: Marc Marí <markmb@redhat.com> Signed-off-by: Colin Lord <clord@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1471008424-16465-4-git-send-email-clord@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> [mreitz: Do a signed comparison against the length of block_driver_modules[], so it will not cause a compile error when empty] Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-09-20 22:12:03 +02:00
Wen Congyang	29ff789060	replication: Implement new driver for block replication Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Signed-off-by: Wang WeiWei <wangww.fnst@cn.fujitsu.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Message-id: 1469602913-20979-10-git-send-email-xiecl.fnst@cn.fujitsu.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-09-13 11:00:56 +01:00
Wen Congyang	258854ad1d	block: Link backup into block core Some programs that add a dependency on it will use the block layer directly. Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Signed-off-by: Wang WeiWei <wangww.fnst@cn.fujitsu.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1469602913-20979-5-git-send-email-xiecl.fnst@cn.fujitsu.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-09-13 11:00:56 +01:00
Kevin Wolf	83fd6dd3e7	block: Move bdrv_commit() to block/commit.c No code changes, just moved from one file to another. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Pavel Dovgalyuk	63785678f3	replay: introduce block devices record/replay This patch introduces block driver that implement recording and replaying of block devices' operations. All block completion operations are added to the queue. Queue is flushed at checkpoints and information about processed requests is recorded to the log. In replay phase the queue is matched with events read from the log. Therefore block devices requests are processed deterministically. Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> [ kwolf: Rebased onto modified and already applied part of the series ] Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-03-30 12:15:57 +02:00
Daniel P. Berrange	78368575a6	block: add generic full disk encryption driver Add a block driver that is capable of supporting any full disk encryption format. This utilizes the previously added block encryption code, and at this time supports the LUKS format. The driver code is capable of supporting any format supported by the QCryptoBlock module, so it registers one block driver for each format. This patch only registers the "luks" driver since the "qcow" driver is there only for back-compatibility with existing qcow built-in encryption. New LUKS compatible volumes can be formatted using qemu-img with defaults for all settings. $ qemu-img create --object secret,data=123456,id=sec0 \ -f luks -o key-secret=sec0 demo.luks 10G Alternatively the cryptographic settings can be explicitly set $ qemu-img create --object secret,data=123456,id=sec0 \ -f luks -o key-secret=sec0,cipher-alg=aes-256,\ cipher-mode=cbc,ivgen-alg=plain64,hash-alg=sha256 \ demo.luks 10G And query its size $ qemu-img info demo.img image: demo.img file format: luks virtual size: 10G (10737418240 bytes) disk size: 132K encrypted: yes Note that it was not necessary to provide the password when querying info for the volume. The password is only required when performing I/O on the volume All volumes created by this new 'luks' driver should be capable of being opened by the kernel dm-crypt driver. The only algorithms listed in the LUKS spec that are not currently supported by this impl are sha512 and ripemd160 hashes and cast6 cipher. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> [ kwolf - Added #include to resolve conflict with `da34e65c` ] Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-03-30 12:11:26 +02:00
Fam Zheng	ebab225910	block: Move block dirty bitmap code to separate files The only code change is making bdrv_dirty_bitmap_truncate public. It is used in block.c. Also two long lines (bdrv_get_dirty) are wrapped. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 1457412306-18940-5-git-send-email-famz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-03-14 17:35:05 +01:00
Daniel P. Berrange	488981a4af	block: convert quorum blockdrv to use crypto APIs Get rid of direct use of gnutls APIs in quorum blockdrv in favour of using the crypto APIs. This avoids the need to do conditional compilation of the quorum driver. It can simply report an error at file open file instead if the required hash algorithm isn't supported by QEMU. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1435770638-25715-8-git-send-email-berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-07-08 13:11:01 +02:00
Alberto Garcia	2ff1f2e3a3	throttle: Add throttle group infrastructure Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 2fdb4de17210b733a13eb472c33cd08b45f8fd21.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-12 14:00:00 +01:00
Stefan Hajnoczi	61007b316c	block: move I/O request processing to block/io.c The block.c file has grown to over 6000 lines. It is time to split this file so there are fewer conflicts and the code is easier to maintain. Extract I/O request processing code: * Read * Write * Zero writes and making the image empty * Flush * Discard * ioctl * Tracked requests and queuing * Throttling and copy-on-read * Block status and allocated functions * Refreshing block limits * Reading/writing vmstate * qemu_blockalign() and friends The patch simply moves code from block.c into block/io.c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:17 +02:00
Michael Tokarev	5505e8b76f	block/dmg: make it modular dmg can optionally utilize libbz2, make it modular Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:11 +02:00
Peter Wu	6b383c08c4	block/dmg: support bzip2 block entry types This patch adds support for bzip2-compressed block entries as introduced with OS X 10.4 (source: https://en.wikipedia.org/wiki/Apple_Disk_Image). It was tested against a 5.2G "OS X Yosemite" installation image which stores the BLXX block in the XML property list (instead of resource forks) and has over 5k chunks. New configure entries are added (--enable-bzip2 / --disable-bzip2) to control inclusion of bzip2 functionality (which requires linking against libbz2). The help message suggests that this option is needed for DMG files, but the tests are generic enough that other parts of QEMU can use bzip2 if needed. The identifiers are based on http://newosxbook.com/DMG.html. The decompression routines are based on the zlib case, but as there is no way to reset the decompression state (unlike zlib), memory is allocated and deallocated for every decompression. This should not be problematic as the decompression takes most of the time and as blocks are typically about/over 1 MiB in size, only one allocation is done every 2000 sectors. Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: John Snow <jsnow@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1420566495-13284-12-git-send-email-peter@lekensteyn.nl Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Francesco Romani	e2462113b2	block: add event when disk usage exceeds threshold Managing applications, like oVirt (http://www.ovirt.org), make extensive use of thin-provisioned disk images. To let the guest run smoothly and be not unnecessarily paused, oVirt sets a disk usage threshold (so called 'high water mark') based on the occupation of the device, and automatically extends the image once the threshold is reached or exceeded. In order to detect the crossing of the threshold, oVirt has no choice but aggressively polling the QEMU monitor using the query-blockstats command. This lead to unnecessary system load, and is made even worse under scale: deployments with hundreds of VMs are no longer rare. To fix this, this patch adds: * A new monitor command `block-set-write-threshold', to set a mark for a given block device. * A new event `BLOCK_WRITE_THRESHOLD', to report if a block device usage exceeds the threshold. * A new `write_threshold' field into the `BlockDeviceInfo' structure, to report the configured threshold. This will allow the managing application to use smarter and more efficient monitoring, greatly reducing the need of polling. [Updated qemu-iotests 067 output to add the new 'write_threshold' property. --Stefan] [Changed g_assert_false() to !g_assert() to fix the build on older glib versions. --Kevin] Signed-off-by: Francesco Romani <fromani@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1421068273-692-1-git-send-email-fromani@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Max Reitz	d4a3238af5	qemu-img: Implement commit like QMP qemu-img should use QMP commands whenever possible in order to ensure feature completeness of both online and offline image operations. As qemu-img itself has no access to QMP (since this would basically require just everything being linked into qemu-img), imitate QMP's implementation of block-commit by using commit_active_start() and then waiting for the block job to finish. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1414159063-25977-9-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-11-03 11:41:48 +00:00
Markus Armbruster	26f54e9a3c	block: New BlockBackend A block device consists of a frontend device model and a backend. A block backend has a tree of block drivers doing the actual work. The tree is managed by the block layer. We currently use a single abstraction BlockDriverState both for tree nodes and the backend as a whole. Drawbacks: * Its API includes both stuff that makes sense only at the block backend level (root of the tree) and stuff that's only for use within the block layer. This makes the API bigger and more complex than necessary. Moreover, it's not obvious which interfaces are meant for device models, and which really aren't. * Since device models keep a reference to their backend, the backend object can't just be destroyed. But for media change, we need to replace the tree. Our solution is to make the BlockDriverState generic, with actual driver state in a separate object, pointed to by member opaque. That lets us replace the tree by deinitializing and reinitializing its root. This special need of the root makes the data structure awkward everywhere in the tree. The general plan is to separate the APIs into "block backend", for use by device models, monitor and whatever other code dealing with block backends, and "block driver", for use by the block layer and whatever other code (if any) dealing with trees and tree nodes. Code dealing with block backends, device models in particular, should become completely oblivious of BlockDriverState. This should let us clean up both APIs, and the tree data structures. This commit is a first step. It creates a minimal "block backend" API: type BlockBackend and functions to create, destroy and find them. BlockBackend objects are created and destroyed exactly when root BlockDriverState objects are created and destroyed. "Root" in the sense of "in bdrv_states". They're not yet used for anything; that'll come shortly. A root BlockDriverState is created with bdrv_new_root(), so where to create a BlockBackend is obvious. Where these roots get destroyed isn't always as obvious. It is obvious in qemu-img.c, qemu-io.c and qemu-nbd.c, and in error paths of blockdev_init(), blk_connect(). That leaves destruction of objects successfully created by blockdev_init() and blk_connect(). blockdev_init() is used only by drive_new() and qmp_blockdev_add(). Objects created by the latter are currently indestructible (see commit `48f364d` "blockdev: Refuse to drive_del something added with blockdev-add" and commit `2d246f0` "blockdev: Introduce DriveInfo.enable_auto_del"). Objects created by the former get destroyed by drive_del(). Objects created by blk_connect() get destroyed by blk_disconnect(). BlockBackend is reference-counted. Its reference count never exceeds one so far, but that's going to change. In drive_del(), the BB's reference count is surely one now. The BDS's reference count is greater than one when something else is holding a reference, such as a block job. In this case, the BB is destroyed right away, but the BDS lives on until all extra references get dropped. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-10-20 13:41:26 +02:00
Stefan Hajnoczi	550830f935	block: delete cow block driver This patch removes support for the cow file format. Normally we do not break backwards compatibility but in this case there is no impact and it is the most logical option. Extraordinary claims require extraordinary evidence so I will show why removing the cow block driver is the right thing to do. The cow file format is the disk image format for Usermode Linux, a way of running a Linux system in userspace. The performance of UML was never great and it was hacky, but it enjoyed some popularity before hardware virtualization support became mainstream. QEMU's block/cow.c is supposed to read this image file format. Unfortunately the file format was underspecified: 1. Earlier Linux versions used the MAXPATHLEN constant for the backing filename field. The value of MAXPATHLEN can change, so Linux switched to a 4096 literal but QEMU has a 1024 literal. 2. Padding was not used on the header struct (both in the Linux kernel and in QEMU) so the struct layout varied across architectures. In particular, i386 and x86_64 were different due to int64_t alignment differences. Linux now uses __attribute__((packed)), QEMU does not. Therefore: 1. QEMU cow images do not conform to the Linux cow image file format. 2. cow images cannot be shared between different host architectures. This means QEMU cow images are useless and QEMU has not had bug reports from users actually hitting these issues. Let's get rid of this thing, it serves no purpose and no one will be affected. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-id: 1410877464-20481-1-git-send-email-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-09-22 11:39:45 +01:00
Fam Zheng	e819ab2277	block: Introduce "null" drivers This is an analogue to Linux null_blk. It can be used for testing or benchmarking block device emulation and general block layer functionalities such as coroutines and throttling, where disk IO is not necessary or wanted. Use null-aio:// for AIO version, and null-co:// for coroutine version. [Resolved conflict with Fam's async bdrv_aio_cancel() series: 1. Drop .bdrv_aio_cancel() since it is now done by block.c 2. Rename qemu_aio_release() to qemu_aio_unref() --Stefan] Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Benoît Canet <benoit.canet@nodalink.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1410415798-20673-2-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-09-22 11:39:23 +01:00
Benoît Canet	5e5a94b605	block: Extract the block accounting code The plan is to add new accounting metrics (latency, invalid requests, failed requests, queue depth) and block.c is overpopulated so it will be better to work in a separate module. Moreover the long term plan is to have statistics in each of the BDS of the graph for metrology purpose; this means that the device model statistics must move from the topmost BDS to the device model. So we need to decouple the statistic code from BlockDriverState. This is another argument for the extraction of the code in a separate module. CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Eric Blake <eblake@redhat.com> CC: Benoit Canet <benoit@irqsave.net> CC: Fam Zheng <famz@redhat.com> CC: Peter Crosthwaite <peter.crosthwaite@xilinx.com> CC: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Benoît Canet <benoit.canet@nodalink.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-09-10 10:41:29 +02:00
Paolo Bonzini	b493317d34	aio-win32: add support for sockets Uses the same select/WSAEventSelect scheme as main-loop.c. WSAEventSelect() is edge-triggered, so it cannot be used directly, but it is still used as a way to exit from a blocking g_poll(). Before g_poll() is called, we poll sockets with a non-blocking select() to achieve the level-triggered semantics we require: if a socket is ready, the g_poll() is made non-blocking too. Based on a patch from Or Goshen. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-08-29 10:46:58 +01:00
Chrysostomos Nanakos	c9a12e751b	block: Support Archipelago as a QEMU block backend VM Image on Archipelago volume is specified like this: file.driver=archipelago,file.volume=<volumename>[,file.mport=<mapperd_port>[, file.vport=<vlmcd_port>][,file.segment=<segment_name>]] 'archipelago' is the protocol. 'mport' is the port number on which mapperd is listening. This is optional and if not specified, QEMU will make Archipelago to use the default port. 'vport' is the port number on which vlmcd is listening. This is optional and if not specified, QEMU will make Archipelago to use the default port. 'segment' is the name of the shared memory segment Archipelago stack is using. This is optional and if not specified, QEMU will make Archipelago to use the default value, 'archipelago'. Examples: file.driver=archipelago,file.volume=my_vm_volume file.driver=archipelago,file.volume=my_vm_volume,file.mport=123 file.driver=archipelago,file.volume=my_vm_volume,file.mport=123, file.vport=1234 file.driver=archipelago,file.volume=my_vm_volume,file.mport=123, file.vport=1234,file.segment=my_segment Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-08-15 15:07:14 +02:00
Peter Maydell	e7a1d6c52a	Block patches -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJTB8hAAAoJEH8JsnLIjy/WQdgP/jEu5baA1/qKanDsS9l+81u1 /sIYSWpHDEJ0uavqTMBeyMOwkzel7SZRusIwA/d5pMqxbY6/86YJumTTozFWvtqc IABqHtRKCxjcLdZRPbkuNAOiw6p76vSZa543o2t8OAhK2DIFy530wWXeoQEYvuJX 4pOh0lTradOrF1z6uW4ozgQ1efPppwh/iqwfWWNJVTgfnWxJk6qQaATEgkuSdsUN Wp78UzOxLGO6JKJB6kP3LfNL0ANTYHpfH2/wkE6cW6TkSUduOm6hIBY+tb9khqYt INOKxqFADK6EOgjvJBsZuZUtOnHK5oM921LepN/mOPAs6gKcn2j+FfqJrl3I1/5M AXM3M0FPuijEKPGWw7pCLt7j84KJkD9a/rsKO37yRzw17fOma2Rpr4TrX43BF+5t CGqQ7PzDJ6Fng4EXjyNDzviwXIK8xmG1tfn92tq/BUd6OuM9MCyzEGvEiGOMBoXv w4iOV7UC+1P3TjnTBhMlBVGywSfdOJoHr9k4lXGNp0h8fPhM9rfruI3BFysxaas6 GmKbd7yvKwXOTptd3I9SB8BzVUL3CcD3FK24+cWKAl8GgyiDIWRlvBYyMp3p8Z8f NDzcxYP6aRGsoddvpIWr3Tz89hw5wTW5u3RmNgxJUguz6HYKFbl30dpGT+96q2BN YIAANTdPxn7BP6r3glQH =ZaDG -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block patches # gpg: Signature made Fri 21 Feb 2014 21:42:24 GMT using RSA key ID C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" * remotes/kevin/tags/for-upstream: (54 commits) iotests: Mixed quorum child device specifications quorum: Simplify quorum_open() quorum: Add unit test. quorum: Add quorum_open() and quorum_close(). quorum: Implement recursive .bdrv_recurse_is_first_non_filter in quorum. quorum: Add quorum_co_flush(). quorum: Add quorum_invalidate_cache(). quorum: Add quorum_getlength(). quorum: Add quorum mechanism. quorum: Add quorum_aio_readv. blkverify: Extract qemu_iovec_clone() and qemu_iovec_compare() from blkverify. quorum: Add quorum_aio_writev and its dependencies. quorum: Create BDRVQuorumState and BlkDriver and do init. quorum: Create quorum.c, add QuorumChildRequest and QuorumAIOCB. check-qdict: Test termination of qdict_array_split() check-qdict: Adjust test for qdict_array_split() qdict: Extract non-QDicts in qdict_array_split() qemu-config: Sections must consist of keys qemu-iotests: Check qemu-img command line parsing qemu-img: Allow -o help with incomplete argument list ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2014-02-25 10:50:11 +00:00
Benoît Canet	95c6bff356	quorum: Add quorum mechanism. This patchset enables the core of the quorum mechanism. The num_children reads are compared to get the majority version and if this version exists more than threshold times the guest won't see the error at all. If a block is corrupted or if an error occurs during an IO or if the quorum cannot be established QMP events are used to report to the management. Use gnutls's SHA-256 to compare versions. --enable-quorum must be used to enable the feature. Signed-off-by: Benoit Canet <benoit@irqsave.net> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-02-21 22:29:50 +01:00
Benoît Canet	27cec15e4e	quorum: Create quorum.c, add QuorumChildRequest and QuorumAIOCB. Quorum is a block filter mirroring writes to num_children children. For reads quorum reads each children and does a vote. If more than vote_threshold versions are identical the quorum is reached and this winning version is returned to the guest. So quorum prevents bit corruption. For high availability purpose minority errors are reported via QMP but the guest does not see them. This patch creates the driver C source file and introduces the structures that will be used in asynchronous reads and writes. Signed-off-by: Benoit Canet <benoit@irqsave.net> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-02-21 22:29:48 +01:00
Fam Zheng	6ebc91e5d0	block: use per-object cflags and libs No longer adds flags and libs for them to global variables, instead create config-host.mak variables like FOO_CFLAGS and FOO_LIBS, which is used as per object cflags and libs. This removes unwanted dependencies from libcacard. Signed-off-by: Fam Zheng <famz@redhat.com> [Split from Fam's patch to enable modules. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-02-20 13:12:54 +01:00
Peter Lieven	6542aa9c75	block: add native support for NFS This patch adds native support for accessing images on NFS shares without the requirement to actually mount the entire NFS share on the host. NFS Images can simply be specified by an url of the form: nfs://<host>/<export>/<filename>[?param=value[&param2=value2[&...]]] For example: qemu-img create -f qcow2 nfs://10.0.0.1/qemu-images/test.qcow2 You need LibNFS from Ronnie Sahlberg available at: git://github.com/sahlberg/libnfs.git for this to work. During configure it is automatically probed for libnfs and support is enabled on-the-fly. You can forbid or enforce libnfs support with --disable-libnfs or --enable-libnfs respectively. Due to NFS restrictions you might need to execute your binaries as root, allow them to open priviledged ports (<1024) or specify insecure option on the NFS server. For additional information on ROOT vs. non-ROOT operation and URL format + parameters see: https://raw.github.com/sahlberg/libnfs/master/README Supported by qemu are the uid, gid and tcp-syncnt URL parameters. LibNFS currently support NFS version 3 only. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-02-09 09:12:38 +01:00
Marc-André Lureau	2302c1cafb	Split nbd block client code Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2013-12-16 10:12:20 +01:00
Jeff Cody	0a43a1b5d7	block: vhdx - log parsing, replay, and flush support This adds support for VHDX v0 logs, as specified in Microsoft's VHDX Specification Format v1.00: https://www.microsoft.com/en-us/download/details.aspx?id=34750 The following support is added: * Log parsing, and validation - validate that an existing log is correct. * Log search - search through an existing log, to find any valid sequence of entries. * Log replay and flush - replay an existing log, and flush/clear the log when complete. The VHDX log is a circular buffer, with elements (sectors) of 4KB. A log entry is a variably-length number of sectors, that is comprised of a header and 'descriptors', that describe each sector. A log may contain multiple entries, know as a log sequence. In a log sequence, each log entry immediately follows the previous entry, with an incrementing sequence number. There can only ever be one active and valid sequence in the log. Each log entry must match the file log GUID in order to be valid (along with other criteria). Once we have flushed all valid log entries, we marked the file log GUID to be zero, which indicates a buffer with no valid entries. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-11-07 13:58:58 +01:00
Jeff Cody	0f48e8f097	block: vhdx - break endian translation functions out This moves the endian translation functions out from the vhdx.c source, into a separate source file. In addition to the previously defined endian functions, new endian translation functions for log support are added as well. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-11-07 13:58:58 +01:00
Jeff Cody	4f18b7824a	block: vhdx - add header update capability. This adds the ability to update the headers in a VHDX image, including generating a new MS-compatible GUID. As VHDX depends on uuid.h, VHDX is now a configurable build option. If VHDX support is enabled, that will also enable uuid as well. The default is to have VHDX enabled. To enable/disable VHDX: --enable-vhdx, --disable-vhdx Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-11-07 13:58:58 +01:00
Laszlo Ersek	7a6d3fc594	switch raw block driver from "raw.o" to "raw_bsd.o" "Incoming" function prototypes and "outgoing" function calls must match reality. Implemented using the "struct BlockDriver" definition in "include/block/block_int.h", and gcc errors & warnings. v1->v2: On 08/20/13 09:51, Kevin Wolf wrote: > Am 18.08.2013 um 16:29 hat Paolo Bonzini geschrieben: >> Il 16/08/2013 16:15, Laszlo Ersek ha scritto: >>> +static int raw_reopen_prepare(BDRVReopenState reopen_state, >>> + BlockReopenQueue queue, Error *errp) >>> { >>> - return bdrv_reopen_prepare(bs->file); >>> + BDRVReopenState tmp = reopen_state; >>> + >>> + tmp.bs = tmp.bs->file; >>> + return bdrv_reopen_prepare(&tmp, queue, errp); >>> } >> >> This should just return zero, my fault. > > Which is because bdrv_reopen_queue() already queues bs->file for reopen. > The simple return 0; implementation is shared by all other format drivers > that support reopening images. Signed-off-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-08-30 15:28:52 +02:00
Dietmar Maurer	98d2c6f2cd	block: add basic backup support to block driver backup_start() creates a block job that copies a point-in-time snapshot of a block device to a target block device. We call backup_do_cow() for each write during backup. That function reads the original data from the block device before it gets overwritten. The data is then written to the target device. Currently backup cluster size is hardcoded to 65536 bytes. [I made a number of changes to Dietmar's original patch and folded them in to make code review easy. Here is the full list: * Drop BackupDumpFunc interface in favor of a target block device * Detect zero clusters with buffer_is_zero() and use bdrv_co_write_zeroes() * Use 0 delay instead of 1us, like other block jobs * Unify creation/start functions into backup_start() * Simplify cleanup, free bitmap in backup_run() instead of cb * function * Use HBitmap to avoid duplicating bitmap code * Use bdrv_getlength() instead of accessing ->total_sectors * directly * Delete the backup.h header file, it is no longer necessary * Move ./backup.c to block/backup.c * Remove #ifdefed out code * Coding style and whitespace cleanups * Use bdrv_add_before_write_notifier() instead of blockjob-specific hooks * Keep our own in-flight CowRequest list instead of using block.c tracked requests. This means a little code duplication but is much simpler than trying to share the tracked requests list and use the backup block size. * Add on_source_error and on_target_error error handling. * Use trace events instead of DPRINTF() -- stefanha] Signed-off-by: Dietmar Maurer <dietmar@proxmox.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-06-28 09:20:26 +02:00
Wenchao Xia	f364ec65b5	block: move qmp and info dump related code to block/qapi.c This patch is a pure code move patch, except following modification: 1 get_human_readable_size() is changed to static function. 2 dump_human_image_info() is renamed to bdrv_image_info_dump(). 3 in qmp_query_block() and qmp_query_blockstats, use bdrv_next(bs) instead of direct traverse of global array 'bdrv_states'. 4 collect_snapshots() and collect_image_info() are renamed, unused parameter *fmt in collect_image_info() is removed. 5 code style fix. To avoid conflict and tip better, macro in header file is BLOCK_QAPI_H instead of QAPI_H. Now block.h and snapshot.h are at the same level in include path, block_int.h and qapi.h will both include them. Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-06-04 13:56:30 +02:00
Wenchao Xia	de08c606f9	block: move snapshot code in block.c to block/snapshot.c All snapshot related code, except bdrv_snapshot_dump() and bdrv_is_snapshot(), is moved to block/snapshot.c. bdrv_snapshot_dump() will be moved to another file later. bdrv_is_snapshot() is not related with internal snapshot. It also fixes small code style errors reported by check script. Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-06-04 13:56:30 +02:00
Jeff Cody	e8d4e5ffdb	block: initial VHDX driver support framework - supports open and probe This is the initial block driver framework for VHDX image support (i.e. Hyper-V image file formats), that supports opening VHDX files, and parsing the headers. This commit does not yet enable: - reading - writing - updating the header - differencing files (images with parents) - log replay / dirty logs (only clean images) This is based on Microsoft's VHDX specification: "VHDX Format Specification v0.95", published 4/12/2012 https://www.microsoft.com/en-us/download/details.aspx?id=29681 Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-05-03 10:31:58 +02:00
Richard W.M. Jones	0a12ec87a5	block: Add support for Secure Shell (ssh) block device. qemu-system-x86_64 -drive file=ssh://hostname/some/image QEMU will ssh into 'hostname' and open '/some/image' which is made available as a standard block device. You can specify a username (ssh://user@host/...) and/or a port number (ssh://host:port/...). You can also use an alternate syntax using properties (file.user, file.host, file.port, file.path). Current limitations: - Authentication must be done without passwords or passphrases, using ssh-agent. Other authentication methods are not supported. - Uses a single connection, instead of concurrent AIO with multiple SSH connections. This is implemented using libssh2 on the client side. The server just requires a regular ssh daemon with sftp-server support. Most ssh daemons on Unix/Linux systems will work out of the box. Signed-off-by: Richard W.M. Jones <rjones@redhat.com> Cc: Stefan Hajnoczi <stefanha@gmail.com> Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-04-15 10:18:05 +02:00
Paolo Bonzini	525877c999	build: move rules from Makefile to */Makefile.objs Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:29:06 +01:00
Paolo Bonzini	f563a5d7a8	Merge remote-tracking branch 'origin/master' into threadpool Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:42:51 +01:00
Paolo Bonzini	a27365265c	raw-win32: implement native asynchronous I/O With the new support for EventNotifiers in the AIO event loop, we can hook a completion port to every opened file and use asynchronous I/O on them. Wine's support is extremely inefficient, also because it really does the I/O synchronously on regular files. (!) But it works, and it is good to keep the Win32 and POSIX ports as similar as possible. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:38:13 +01:00
Paolo Bonzini	10fb6e0682	raw-posix: move linux-aio.c to block/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:38:13 +01:00
Paolo Bonzini	f42b22077b	aio: add Win32 implementation The Win32 implementation will only accept EventNotifiers, thus a few drivers are disabled under Windows. EventNotifiers are a good match for the GSource implementation, too, because the Win32 port of glib allows to place their HANDLEs in a GPollFD. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-30 09:30:53 +01:00
Paolo Bonzini	893f7ebafe	mirror: introduce mirror job This patch adds the implementation of a new job that mirrors a disk to a new image while letting the guest continue using the old image. The target is treated as a "black box" and data is copied from the source to the target in the background. This can be used for several purposes, including storage migration, continuous replication, and observation of the guest I/O in an external program. It is also a first step in replacing the inefficient block migration code that is part of QEMU. The job is possibly never-ending, but it is logically structured into two phases: 1) copy all data as fast as possible until the target first gets in sync with the source; 2) keep target in sync and ensure that reopening to the target gets a correct (full) copy of the source data. The second phase is indicated by the progress in "info block-jobs" reporting the current offset to be equal to the length of the file. When the job is cancelled in the second phase, QEMU will run the job until the source is clean and quiescent, then it will report successful completion of the job. In other words, the BLOCK_JOB_CANCELLED event means that the target may _not_ be consistent with a past state of the source; the BLOCK_JOB_COMPLETED event means that the target is consistent with a past state of the source. (Note that it could already happen that management lost the race against QEMU and got a completion event instead of cancellation). It is not yet possible to complete the job and switch over to the target disk. The next patches will fix this and add many refinements to the basic idea introduced here. These include improved error management, some tunable knobs and performance optimizations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-10-24 10:26:19 +02:00
Paolo Bonzini	2f0c9fe64c	block: move job APIs to separate files Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-28 19:14:26 +02:00
Jeff Cody	747ff60263	block: add live block commit functionality This adds the live commit coroutine. This iteration focuses on the commit only below the active layer, and not the active layer itself. The behaviour is similar to block streaming; the sectors are walked through, and anything that exists above 'base' is committed back down into base. At the end, intermediate images are deleted, and the chain stitched together. Images are restored to their original open flags upon completion. Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-28 18:23:12 +02:00
Bharata B Rao	8d6d89cb63	block: Support GlusterFS as a QEMU block backend. This patch adds gluster as the new block backend in QEMU. This gives QEMU the ability to boot VM images from gluster volumes. Its already possible to boot from VM images on gluster volumes using FUSE mount, but this patchset provides the ability to boot VM images from gluster volumes by by-passing the FUSE layer in gluster. This is made possible by using libgfapi routines to perform IO on gluster volumes directly. VM Image on gluster volume is specified like this: file=gluster[+transport]://[server[:port]]/volname/image[?socket=...] 'gluster' is the protocol. 'transport' specifies the transport type used to connect to gluster management daemon (glusterd). Valid transport types are tcp, unix and rdma. If a transport type isn't specified, then tcp type is assumed. 'server' specifies the server where the volume file specification for the given volume resides. This can be either hostname, ipv4 address or ipv6 address. ipv6 address needs to be within square brackets [ ]. If transport type is 'unix', then 'server' field should not be specifed. The 'socket' field needs to be populated with the path to unix domain socket. 'port' is the port number on which glusterd is listening. This is optional and if not specified, QEMU will send 0 which will make gluster to use the default port. If the transport type is unix, then 'port' should not be specified. 'volname' is the name of the gluster volume which contains the VM image. 'image' is the path to the actual VM image that resides on gluster volume. Examples: file=gluster://1.2.3.4/testvol/a.img file=gluster+tcp://1.2.3.4/testvol/a.img file=gluster+tcp://1.2.3.4:24007/testvol/dir/a.img file=gluster+tcp://[1:2:3:4:5:6:7:8]/testvol/dir/a.img file=gluster+tcp://[1:2:3:4:5:6:7:8]:24007/testvol/dir/a.img file=gluster+tcp://server.domain.com:24007/testvol/dir/a.img file=gluster+unix:///testvol/dir/a.img?socket=/tmp/glusterd.socket file=gluster+rdma://1.2.3.4:24007/testvol/a.img Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-28 17:58:12 +02:00
Paolo Bonzini	7456e4ce8d	build: move block/ objects to nested Makefile.objs Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-06-07 09:21:13 +02:00

47 Commits