Vladimir Sementsov-Ogievskiy af5bcd775f block: copy-before-write: realize snapshot-access API
Current scheme of image fleecing looks like this:

[guest]                    [NBD export]
  |                              |
  |root                          | root
  v                              v
[copy-before-write] -----> [temp.qcow2]
  |                 target  |
  |file                     |backing
  v                         |
[active disk] <-------------+

 - On guest writes, the copy-before-write filter copies old data from the
   active disk to temp.qcow2. The fleecing client (NBD export) then reads
   changed regions from the temp.qcow2 image and unchanged ones from the
   active disk through the backing link.
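
As a rough illustration of the old scheme, here is a toy Python model (not
QEMU code; the class and method names are made up for this sketch, and
clusters are plain dictionary entries). A guest write first preserves the old
cluster in the temporary image, and a fleecing read falls back to the active
disk for clusters that were never copied:

.. code-block:: python

    class OldSchemeFleecing:
        """Toy model of the backing-link scheme; hypothetical names."""

        def __init__(self, active):
            self.active = dict(active)  # cluster index -> data on the active disk
            self.temp = {}              # temp.qcow2: only clusters copied so far

        def guest_write(self, cluster, data):
            # Copy-before-write: preserve the old data once, then apply the write.
            if cluster not in self.temp:
                self.temp[cluster] = self.active.get(cluster)
            self.active[cluster] = data

        def fleecing_read(self, cluster):
            # Changed clusters come from temp.qcow2, unchanged ones are read
            # from the active disk (the "backing link" in the diagram above).
            if cluster in self.temp:
                return self.temp[cluster]
            return self.active.get(cluster)


    disk = OldSchemeFleecing({0: "a", 1: "b"})
    disk.guest_write(0, "A")
    print(disk.fleecing_read(0), disk.fleecing_read(1))  # prints: a b

The model ignores request serialisation entirely; it only shows where each
read is served from.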

This patch makes a new image fleecing scheme possible:

[guest]                   [NBD export]
   |                            |
   | root                       | root
   v                 file       v
[copy-before-write]<------[snapshot-access]
   |           |
   | file      | target
   v           v
[active-disk] [temp.img]

 - copy-before-write does CBW operations and also provides the
   snapshot-access API. The API may be accessed through the
   snapshot-access driver.

Benefits of the new scheme:

1. Access control: if a remote client tries to read data that is not
   covered by the original dirty bitmap used on copy-before-write open,
   the client gets -EACCES.

2. Discard support: if a remote client does a DISCARD, then in addition
   to discarding data in temp.img, this tells the block-copy process not
   to copy these clusters. The next read from the discarded area will
   return -EACCES. This is significant: when the fleecing user reads
   data that was not yet copied to temp.img and then discards it, we
   can avoid copying it on a later guest write.

3. Synchronisation between client reads and block-copy writes is more
   efficient. In the old scheme we just rely on the BDRV_REQ_SERIALISING
   flag used for writes to temp.qcow2. The new scheme is less blocking:
     - fleecing reads are never blocked: if the data region is untouched
       or in-flight, we just read from active-disk, otherwise we read
       from temp.img
     - writes to temp.img are not blocked by fleecing reads
     - still, guest writes are of course blocked by in-flight fleecing
       reads that are currently reading from active-disk; this is the
       minimum necessary blocking

4. The temporary image may be of any format, as we don't rely on the
   backing feature.

5. Permission relations are simplified. With the old scheme we have to
   share write permission on the target child of copy-before-write,
   otherwise the backing link conflicts with the write permissions of
   the copy-before-write file child. With the new scheme we don't have
   a backing link, and the copy-before-write node may have unshared
   access to the temporary node. (Not implemented in this commit; will
   be done in the future.)

6. Having control over fleecing reads, we'll be able to implement
   alternative behavior on failed copy-before-write operations.
   Currently we just break the guest request (that's the historical
   behavior of backup). But in some scenarios this is bad behavior: it
   is better to drop the backup as failed without breaking the guest
   request. With the new scheme we can simply unset some bits in a
   bitmap on CBW failure, and further fleecing reads will fail with
   -EACCES, or something like this. (Not implemented in this commit;
   will be done in the future.) An additional application for this is
   implementing a timeout for CBW operations.
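
The access-control, discard and read-routing rules above can be sketched as a
toy Python model. This is only an illustration under simplifying assumptions:
the names (``SnapshotAccessModel``, ``access``, ``done``) are made up for this
sketch, clusters are dictionary entries, and in-flight requests are not
modelled at all.

.. code-block:: python

    import errno


    class SnapshotAccessModel:
        """Toy model of copy-before-write plus snapshot-access; hypothetical names."""

        def __init__(self, active, bitmap):
            self.active = dict(active)  # cluster index -> data on the active disk
            self.temp = {}              # temp.img contents
            self.access = set(bitmap)   # clusters the fleecing client may still read
            self.done = set()           # clusters already copied to temp.img

        def guest_write(self, cluster, data):
            # Copy old data only if a client may still need it and it is not copied yet.
            if cluster in self.access and cluster not in self.done:
                self.temp[cluster] = self.active.get(cluster)
                self.done.add(cluster)
            self.active[cluster] = data

        def snapshot_read(self, cluster):
            # Benefit 1: anything outside the access set is refused with EACCES.
            if cluster not in self.access:
                raise PermissionError(errno.EACCES, "cluster is not accessible")
            # Benefit 3: copied clusters come from temp.img, untouched ones from
            # the active disk.
            if cluster in self.done:
                return self.temp[cluster]
            return self.active.get(cluster)

        def snapshot_discard(self, cluster):
            # Benefit 2: drop the data and stop copying this cluster on guest writes.
            self.access.discard(cluster)
            self.done.discard(cluster)
            self.temp.pop(cluster, None)


    model = SnapshotAccessModel({0: "a", 1: "b", 2: "c"}, bitmap={0, 1})
    model.guest_write(0, "A")
    print(model.snapshot_read(0), model.snapshot_read(1))  # prints: a b
    model.snapshot_discard(1)
    model.guest_write(1, "B")  # nothing is copied: cluster 1 was discarded

After the discard, ``model.snapshot_read(1)`` raises ``PermissionError`` with
``errno.EACCES``, as does a read of cluster 2, which was never in the bitmap.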

Iotest 257 output is updated, as two more bitmaps now live in the
copy-before-write filter.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20220303194349.2304213-13-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:31 +01:00
.github/workflows .github: move repo lockdown to the v2 configuration 2021-10-12 08:38:10 +01:00
.gitlab/issue_templates GitLab: Add "Feature Request" issue template. 2021-06-25 10:08:37 +01:00
.gitlab-ci.d gitlab: upgrade the job definition for s390x to 20.04 2022-02-28 16:42:35 +00:00
accel accel/tcg/cpu-exec: Fix precise single-stepping after interrupt 2022-02-28 08:04:06 -10:00
audio coreaudio: Notify error in coreaudio_init_out 2022-03-04 11:22:40 +01:00
authz configure, meson: convert pam detection to meson 2021-06-25 10:54:10 +02:00
backends * More Meson conversions (0.59.x now required rather than suggested) 2022-02-21 17:24:05 +00:00
block block: copy-before-write: realize snapshot-access API 2022-03-07 09:33:31 +01:00
bsd-user bsd-user: Add safe system call macros 2022-02-26 21:05:21 -07:00
capstone@f8b1b83301 capstone: Update to upstream "next" branch 2020-10-03 04:23:14 -05:00
chardev chardev/char-socket: tcp_chr_sync_read: don't clobber errno 2022-01-07 05:19:55 -05:00
common-user common-user/host/sparc64: Fix safe_syscall_base 2022-02-09 08:46:23 +11:00
configs hw/openrisc/openrisc_sim: Add automatic device tree generation 2022-02-26 10:39:36 +09:00
contrib meson: refine check for whether to look for virglrenderer 2022-02-21 10:35:53 +01:00
crypto configure, meson: move some default-disabled options to meson_options.txt 2022-02-21 10:35:53 +01:00
disas target/riscv: setup everything for rv64 to support rv128 execution 2022-01-08 15:46:10 +10:00
docs Block layer patches 2022-03-05 10:59:04 +00:00
dtc@b6910bec11 dtc: Update to version 1.6.1 2021-10-14 08:08:11 +02:00
dump dump: Remove is_zero_page() 2021-12-15 10:31:42 +01:00
ebpf ebpf: only include in system emulators 2021-09-17 16:07:52 +08:00
fpu softfloat: Add float64r32 arithmetic routines 2021-12-17 17:57:15 +01:00
fsdev 9pfs: make V9fsPath usable via P9Array API 2021-10-27 14:45:22 +02:00
gdb-xml target/arm: Advertise MVE to gdb when present 2021-11-02 14:14:55 -04:00
hw ide: Increment BB in-flight counter for TRIM BH 2022-03-07 09:19:20 +01:00
include block: introduce snapshot-access block driver 2022-03-07 09:33:31 +01:00
io aio-posix: split poll check from ready handler 2022-01-12 17:09:39 +00:00
libdecnumber libdecnumber: Introduce decNumberIntegralToInt128 2021-11-09 10:32:52 +11:00
linux-headers linux-headers: Update headers to v5.17-rc1 2022-02-17 17:21:45 +00:00
linux-user linux-user: Add missing "qemu/timer.h" include 2022-02-21 10:18:06 +01:00
meson@12f9f04ba0 meson: bump submodule to 0.59.3 2021-11-02 15:57:28 +01:00
migration include/block/snapshot: global state API + assertions 2022-03-04 18:18:25 +01:00
monitor block: rename bdrv_invalidate_cache_all, blk_invalidate_cache and test_sync_op_invalidate_cache 2022-03-04 18:14:40 +01:00
nbd nbd/server.c: Remove unused field 2022-01-28 16:48:28 -06:00
net Trivial branch pull request 20220222 2022-02-22 20:17:09 +00:00
pc-bios spapr/vof: Install rom and nvram binaries 2022-02-09 09:08:56 +01:00
plugins * Improve virtio-net failover test 2022-02-22 13:07:32 +00:00
po po: update turkish translation 2021-10-22 18:07:30 +02:00
python Revert "python: pin setuptools below v60.0.0" 2022-02-23 17:07:26 -05:00
qapi block: introduce snapshot-access block driver 2022-03-07 09:33:31 +01:00
qga meson, configure: move ntddscsi API check to meson 2022-02-21 10:35:54 +01:00
qobject qobject: braces {} are necessary for all arms of this statement 2021-02-04 13:20:29 +01:00
qom Mark remaining global TypeInfo instances as const 2022-02-21 13:30:20 +00:00
replay replay: notify CPU on event 2021-04-01 10:37:20 +02:00
roms Fixes and updates for hppa target 2022-02-02 19:54:30 +00:00
scripts Testing and semihosting updates: 2022-03-02 10:46:16 +00:00
scsi error: Use error_fatal to simplify obvious fatal errors (again) 2021-08-26 17:15:28 +02:00
semihosting semihosting/arm-compat: replace heuristic for softmmu SYS_HEAPINFO 2022-02-28 16:42:35 +00:00
slirp@a88d9ace23 Update libslirp to v4.6.1 2021-08-03 16:07:22 +04:00
softmmu Block layer patches 2022-03-05 10:59:04 +00:00
storage-daemon qsd: Add --daemonize 2022-03-04 18:14:40 +01:00
stubs main-loop.h: introduce qemu_in_main_thread() 2022-03-04 18:18:15 +01:00
subprojects/libvhost-user libvhost-user: Map shared RAM with MAP_NORESERVE to support virtio-mem with hugetlb 2022-02-04 09:07:43 -05:00
target target/ppc: Add missing helper_reset_fpstatus to helper_XVCVSPBF16 2022-03-05 07:16:48 +01:00
tcg tcg/i386: Implement bitsel for avx512 2022-03-04 08:50:41 -10:00
tests block: copy-before-write: realize snapshot-access API 2022-03-07 09:33:31 +01:00
tools virtiofsd: Let meson check for statx.stx_mnt_id 2022-03-02 18:12:40 +00:00
trace tracing: excise the tcg related from tracetool 2022-02-09 12:08:42 +00:00
ui ui/cocoa: Add Services menu 2022-03-04 11:29:55 +01:00
util block/dirty-bitmap: introduce bdrv_dirty_bitmap_status() 2022-03-07 09:33:30 +01:00
.cirrus.yml tests: Manually remove libxml2 on MSYS2 runners 2022-02-09 12:08:42 +00:00
.dir-locals.el Add .dir-locals.el file to configure emacs coding style 2015-10-08 19:46:01 +03:00
.editorconfig .editorconfig: update the automatic mode setting for Emacs 2021-03-10 15:34:11 +00:00
.exrc qemu: add .exrc 2012-09-07 09:02:44 +03:00
.gdbinit .gdbinit: load QEMU sub-commands when gdb starts 2017-06-07 14:38:45 +01:00
.gitattributes maint: Tell git that *.py files should use python diff hunks 2021-02-15 22:13:34 -05:00
.gitignore .gitignore: add .gcov pattern 2022-02-09 12:08:41 +00:00
.gitlab-ci.yml docs: Document GitLab custom CI/CD variables 2021-07-29 07:56:01 +02:00
.gitmodules gitmodules: Correct libvirt-ci submodule URL 2022-02-09 12:08:41 +00:00
.gitpublish Add a git-publish configuration file 2018-03-05 09:03:17 +00:00
.mailmap MAINTAINERS: Change philmd's email address 2021-12-31 13:42:54 +01:00
.patchew.yml scripts/checkpatch: roll diff tweaking into checkpatch itself 2021-06-25 10:08:33 +01:00
.readthedocs.yml readthedocs: build with Python 3.6 2020-10-05 16:30:45 +01:00
.travis.yml travis.yml: Update the s390x jobs to Ubuntu Focal 2022-02-28 16:42:35 +00:00
COPYING COPYING: update from FSF 2008-10-12 17:54:42 +00:00
COPYING.LIB COPYING.LIB: Synchronize the LGPL 2.1 with the version from gnu.org 2019-01-30 11:01:22 +01:00
Kconfig meson: Introduce target-specific Kconfig 2021-07-09 18:21:34 +02:00
Kconfig.host kconfig: split CONFIG_SPARSE_MEM from fuzzing 2021-10-14 09:50:56 +02:00
LICENSE tcg/LICENSE: Remove out of date claim about TCG subdirectory licensing 2019-11-11 15:11:21 +01:00
MAINTAINERS block: introduce snapshot-access block driver 2022-03-07 09:33:31 +01:00
Makefile Makefile: also remove .gcno files when cleaning 2022-02-09 12:08:41 +00:00
README.rst README: Fix some documentation URLs 2021-10-23 20:28:12 +02:00
VERSION Open 6.3 development tree 2021-12-14 12:40:12 -08:00
block.c block_int-common.h: assertions in the callers of BdrvChildClass function pointers 2022-03-04 18:18:25 +01:00
blockdev-nbd.c block/nbd: Use qcrypto_tls_creds_check_endpoint() 2021-06-29 18:29:47 +01:00
blockdev.c assertions for blockdev.h global state API 2022-03-04 18:18:25 +01:00
blockjob.c assertions for blockjob.h global state API 2022-03-04 18:18:25 +01:00
configure Use long endian options for ppc64 2022-03-05 07:16:46 +01:00
cpu.c cpu.c: Make start-powered-off settable after realize 2022-02-08 10:56:27 +00:00
cpus-common.c overall/alpha tcg cpus|hppa: Fix Lesser GPL version number 2020-11-15 16:43:54 +01:00
disas.c Do not include cpu.h if it's not really necessary 2021-05-02 17:24:51 +02:00
gdbstub.c gdbstub, kvm: let KVM report supported singlestep flags 2021-12-10 09:47:18 +01:00
gitdm.config contrib/gitdm: add a new interns group-map for GSoC/Outreachy work 2021-07-23 17:22:16 +01:00
hmp-commands-info.hx monitor: move x-query-profile into accel/tcg to fix build 2022-01-18 16:42:42 +00:00
hmp-commands.hx qapi/monitor: allow VNC display id in set/expire_password 2022-03-02 18:12:40 +00:00
iothread.c iothread: use IOThreadParamInfo in iothread_[set|get]_param() 2021-10-07 15:29:50 +01:00
job-qmp.c progressmeter: protect with a mutex 2021-06-25 14:24:24 +03:00
job.c job.h: assertions in the callers of JobDriver function pointers 2022-03-04 18:18:26 +01:00
memory_ldst.c.inc exec/memory_ldst: Use correct type sizes 2021-05-26 08:35:51 -07:00
meson.build target/nios2: Replace MMU_LOG with tracepoints 2022-03-03 09:36:38 -10:00
meson_options.txt configure, meson: move CONFIG_IASL to a Meson option 2022-02-21 10:35:54 +01:00
module-common.c all: Clean up includes 2016-02-04 17:41:30 +00:00
os-posix.c os-posix: Add os_set_daemonize() 2022-03-04 18:14:40 +01:00
os-win32.c remove qemu-options* from root directory 2021-05-26 14:49:46 +02:00
page-vary-common.c exec: Build page-vary-common.c with -fno-lto 2021-03-23 19:36:47 -06:00
page-vary.c exec: Build page-vary-common.c with -fno-lto 2021-03-23 19:36:47 -06:00
qemu-bridge-helper.c qemu-bridge-helper: relocate path to default ACL 2020-09-30 19:11:36 +02:00
qemu-edid.c edid: set default resolution to 1280x800 (WXGA) 2022-01-13 10:59:16 +01:00
qemu-img-cmds.hx qemu-img: Unify [-b [-F]] documentation 2022-02-01 13:49:15 +01:00
qemu-img.c qemu-img: make is_allocated_sectors() more efficient 2022-01-14 12:03:16 +01:00
qemu-io-cmds.c block: Acquire AioContexts during bdrv_reopen_multiple() 2021-07-09 13:19:11 +02:00
qemu-io.c error: Use error_fatal to simplify obvious fatal errors (again) 2021-08-26 17:15:28 +02:00
qemu-keymap.c qemu-keymap: Add license in generated files 2021-12-17 10:41:50 +01:00
qemu-nbd.c nbd/server: Add --selinux-label option 2021-11-16 10:16:38 -06:00
qemu-options.hx qemu-options: fix incorrect description for '-drive index=' 2022-02-22 17:15:21 +01:00
qemu.nsi nsis: adjust for new MinGW paths 2021-01-23 15:55:05 -05:00
qemu.sasl sasl: remove comment about obsolete kerberos versions 2021-06-14 13:28:50 +01:00
replication.c replication: move include out of root directory 2021-05-26 14:49:46 +02:00
trace-events tracing: remove TCG memory access tracing 2022-02-09 12:08:42 +00:00
version.rc configure: remove CONFIG_FILEVERSION and CONFIG_PRODUCTVERSION 2021-01-02 21:03:37 +01:00

README.rst

===========
QEMU README
===========

QEMU is a generic and open source machine & userspace emulator and
virtualizer.

QEMU is capable of emulating a complete machine in software without any
need for hardware virtualization support. By using dynamic translation,
it achieves very good performance. QEMU can also integrate with the Xen
and KVM hypervisors to provide emulated hardware while allowing the
hypervisor to manage the CPU. With hypervisor support, QEMU can achieve
near native performance for CPUs. When QEMU emulates CPUs directly it is
capable of running operating systems made for one machine (e.g. an ARMv7
board) on a different machine (e.g. an x86_64 PC board).

QEMU is also capable of providing userspace API virtualization for Linux
and BSD kernel interfaces. This allows binaries compiled against one
architecture ABI (e.g. the Linux PPC64 ABI) to be run on a host using a
different architecture ABI (e.g. the Linux x86_64 ABI). This does not
involve any hardware emulation, simply CPU and syscall emulation.

QEMU aims to fit into a variety of use cases. It can be invoked directly
by users wishing to have full control over its behaviour and settings.
It also aims to facilitate integration into higher level management
layers, by providing a stable command line interface and monitor API.
It is commonly invoked indirectly via the libvirt library when using
open source applications such as oVirt, OpenStack and virt-manager.

QEMU as a whole is released under the GNU General Public License,
version 2. For full licensing details, consult the LICENSE file.


Documentation
=============

Documentation can be found hosted online at
`<https://www.qemu.org/documentation/>`_. The documentation for the
current development version that is available at
`<https://www.qemu.org/docs/master/>`_ is generated from the ``docs/``
folder in the source tree, and is built by `Sphinx
<https://www.sphinx-doc.org/en/master/>`_.


Building
========

QEMU is multi-platform software intended to be buildable on all modern
Linux platforms, macOS, Win32 (via the Mingw64 toolchain) and a variety
of other UNIX targets. The simple steps to build QEMU are:


.. code-block:: shell

  mkdir build
  cd build
  ../configure
  make

Additional information can also be found online via the QEMU website:

* `<https://wiki.qemu.org/Hosts/Linux>`_
* `<https://wiki.qemu.org/Hosts/Mac>`_
* `<https://wiki.qemu.org/Hosts/W32>`_


Submitting patches
==================

The QEMU source code is maintained under the Git version control system.

.. code-block:: shell

   git clone https://gitlab.com/qemu-project/qemu.git

When submitting patches, one common approach is to use 'git
format-patch' and/or 'git send-email' to format & send the mail to the
qemu-devel@nongnu.org mailing list. All patches submitted must contain
a 'Signed-off-by' line from the author. Patches should follow the
guidelines set out in the `style section
<https://www.qemu.org/docs/master/devel/style.html>`_ of
the Developers Guide.

Additional information on submitting patches can be found online via
the QEMU website:

* `<https://wiki.qemu.org/Contribute/SubmitAPatch>`_
* `<https://wiki.qemu.org/Contribute/TrivialPatches>`_

The QEMU website is also maintained under source control.

.. code-block:: shell

  git clone https://gitlab.com/qemu-project/qemu-web.git

* `<https://www.qemu.org/2017/02/04/the-new-qemu-website-is-up/>`_

A 'git-publish' utility was created to make the above process less
cumbersome, and is highly recommended for making regular contributions,
or even just for sending consecutive patch series revisions. It also
requires a working 'git send-email' setup, and by default doesn't
automate everything, so you may want to go through the above steps
manually once.

For installation instructions, please go to

*  `<https://github.com/stefanha/git-publish>`_

The workflow with 'git-publish' is:

.. code-block:: shell

  $ git checkout master -b my-feature
  $ # work on new commits, add your 'Signed-off-by' lines to each
  $ git publish

Your patch series will be sent and tagged as my-feature-v1, in case you need
to refer back to it in the future.

Sending v2:

.. code-block:: shell

  $ git checkout my-feature # same topic branch
  $ # making changes to the commits (using 'git rebase', for example)
  $ git publish

Your patch series will be sent with a 'v2' tag in the subject and the git tip
will be tagged as my-feature-v2.

Bug reporting
=============

The QEMU project uses GitLab issues to track bugs. Bugs
found when running code built from QEMU git or upstream released sources
should be reported via:

* `<https://gitlab.com/qemu-project/qemu/-/issues>`_

If using QEMU via an operating system vendor pre-built binary package, it
is preferable to report bugs to the vendor's own bug tracker first. If
the bug is also known to affect the latest upstream code, it can also be
reported via GitLab.

For additional information on bug reporting consult:

* `<https://wiki.qemu.org/Contribute/ReportABug>`_


ChangeLog
=========

For version history and release notes, please visit
`<https://wiki.qemu.org/ChangeLog/>`_ or look at the git history for
more detailed information.


Contact
=======

The QEMU community can be contacted in a number of ways, with the two
main methods being email and IRC:

* `<mailto:qemu-devel@nongnu.org>`_
* `<https://lists.nongnu.org/mailman/listinfo/qemu-devel>`_
* #qemu on irc.oftc.net

Information on additional methods of contacting the community can be
found online via the QEMU website:

* `<https://wiki.qemu.org/Contribute/StartHere>`_