Allow bdrv_open() to handle references to existing block devices just as
bdrv_file_open() is already capable of.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Make bdrv_open() take a pointer to a BDS pointer, similarly to
bdrv_file_open(). If a pointer to a NULL pointer is given, bdrv_open()
will create a new BDS with an empty name; if the BDS pointer is not
NULL, that existing BDS will be reused (in the same way as bdrv_open()
already did).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of making the backing file contents visible again after a discard
request, set the zero flag if possible (i.e. on version >= 3).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
error_is_set(&var) is the same as var != NULL, but it takes
whole-program analysis to figure that out. Unnecessarily hard for
optimizers, static checkers, and human readers. Dumb it down to
obvious.
Gets rid of several dozen Coverity false positives.
Note that the obvious form is already used in many places.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
When starting a block job, commit_active_start() relies on whether *errp
is set by mirror_start_job. This allows it to determine if the mirror
job start failed, so that it can clean up any changes to open flags from
the bdrv_reopen(). If errp is NULL, then it will not be able to
determine if mirror_start_job failed or not.
To avoid this, use a local Error variable, and then propagate the error
(if any) to errp.
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
There are a handful of places in the block layer where a failure path
has a valid -errno value, yet error_setg() is used. Those instances
should instead use error_setg_errno(), to preserve as much error
information as possible.
This patch replaces those instances with error_setg_errno(), so that
errno is passed up the stack in the error message.
Reported-By: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
iSCSI currently does not need to do any actions to support the
current usage of bdrv_reopen(). However, it is important to note
a couple of things: 1.) A connection will not be re-established to
an iSCSI target, and 2.) If iscsi_open() is changed to parse 'flags',
then iscsi_reopen_prepare() may need to be more than a stub.
In light of the above, this commit adds comments above both of the
functions to bring attention to these facts.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
raw copies over the BlockLimits of bs->file during bdrv_open().
However, since commit d34682cd it is immediately overwritten during
bdrv_refresh_limits(). This caused all fields except for
opt_transfer_length and opt_mem_alignment (which happen to be correctly
inherited in generic code) to be zeroed.
Move the BlockLimit assignment to a .bdrv_refresh_limits() callback to
make it work again for all fields.
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
In the case of a metadata preallocation with a large cluster size,
qcow2_alloc_cluster_offset() can allocate nothing and returns a
NULL l2meta. This patch checks for it and link2 l2 with only valid
l2meta.
Replace 9 and 512 with BDRV_SECTOR_BITS, BDRV_SECTOR_SIZE
respectively while at the function.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When cluster size is big enough it can lead to an offset overflow
in qcow2_alloc_clusters_at(). This patch fixes it.
The allocation is stopped each time at L2 table boundary
(see handle_alloc()), so the possible maximum bytes could be
2^(cluster_bits - 3 + cluster_bits)
cluster_bits - 3 is used to compute the number of entry by L2
and the additional cluster_bits is to take into account each
clusters referenced by the L2 entries.
so int is safe for cluster_bits<=17, unsafe otherwise.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
n_start can be actually calculated from offset. The number of
sectors to be allocated(n_end - n_start) can be passed in in
num. By removing n_start and n_end, we can save two parameters.
The side effect is there is a bug in qcow2.c:preallocate() that
passes incorrect n_start to qcow2_alloc_cluster_offset() is
fixed. The bug can be triggerred by a larger cluster size than
the default value(65536), for example:
./qemu-img create -f qcow2 \
-o 'cluster_size=131072,preallocation=metadata' file.img 4G
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
the opt_transfer_length has nothing to do with logical
block provisioning stuff so always copy it from
the block limits VPD page.
Reported-By: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch adds native support for accessing images on NFS
shares without the requirement to actually mount the entire
NFS share on the host.
NFS Images can simply be specified by an url of the form:
nfs://<host>/<export>/<filename>[?param=value[¶m2=value2[&...]]]
For example:
qemu-img create -f qcow2 nfs://10.0.0.1/qemu-images/test.qcow2
You need LibNFS from Ronnie Sahlberg available at:
git://github.com/sahlberg/libnfs.git
for this to work.
During configure it is automatically probed for libnfs and support
is enabled on-the-fly. You can forbid or enforce libnfs support
with --disable-libnfs or --enable-libnfs respectively.
Due to NFS restrictions you might need to execute your binaries
as root, allow them to open priviledged ports (<1024) or specify
insecure option on the NFS server.
For additional information on ROOT vs. non-ROOT operation and URL
format + parameters see:
https://raw.github.com/sahlberg/libnfs/master/README
Supported by qemu are the uid, gid and tcp-syncnt URL parameters.
LibNFS currently support NFS version 3 only.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Errors are inadvertently ignored in a few places. Has always been
broken. Spotted by Coverity.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
this adds a basic vmdk corruption check. it should detect severe
table corruptions and file truncation.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The QCOW2 .bdrv_make_empty implementation always returns 0 for success,
but does not actually do anything.
The proper way to not support an optional driver function stub is to
just not implement it, so let's remove the stub.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The QED .bdrv_make_empty() implementation does nothing but return
-ENOTSUP, which causes problems in bdrv_commit(). Since the function
stub exists for QED, it is called, which then always returns an error.
The proper way to not support an optional driver function stub is to
just not implement it, so let's remove the stub.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* bonzini/scsi-next:
scsi: Support TEST UNIT READY in the dummy LUN0
block: add .bdrv_reopen_prepare() stub for iscsi
virtio-scsi: Prevent assertion on missed events
virtio-scsi: Cleanup of I/Os that never started
scsi: Assign cancel_io vector for scsi_disk_emulate_ops
Conflicts:
block/iscsi.c
aliguori: resolve trivial merge conflict in block/iscsi.c
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
The new 'align' option of blkdebug can be used in order to emulate
backends with a required 4k alignment on hosts which only really require
512 byte alignment.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The iSCSI backend already gets the block size from the READ CAPACITY
command it sends. Save it so that the generic block layer gets it
too.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Add a bs->request_alignment field that contains the required
offset/length alignment for I/O requests and fill it in the raw block
drivers. Use ioctls if possible, else see what alignment it takes for
O_DIRECT to succeed.
While at it, also expose the memory alignment requirements, which may be
(and in practice are) different from the disk alignment requirements.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
When reopening with different flags, or when backing files disappear
from the chain, the limits may change. Make sure they get updated in
these cases.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit@irqsave.net>
This function separates filling the BlockLimits from bdrv_open(), which
allows it to call it from other operations which may change the limits
(e.g. modifications to the backing file chain or bdrv_reopen)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
If the top image to commit is the active layer, and also larger than
the base image, then an I/O error will likely be returned during
block-commit.
For instance, if we have a base image with a virtual size 10G, and a
active layer image of size 20G, then committing the snapshot via
'block-commit' will likely fail.
This will automatically attempt to resize the base image, if the
active layer image to be committed is larger.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
libcurl versions 7.16.0 and later have a timer callback interface which
must be implemented in order for libcurl to make forward progress (it
will sometimes rely on being called back on the timeout if there are
no file descriptors registered). Implement the callback, and use a
QEMU AIO timer to ensure we prod libcurl again when it asks us to.
Based on Peter's original patch plus my fix to add curl_multi_timeout_do.
Should compile just fine even on older versions of libcurl.
I also tried copy-on-read and streaming:
$ ./qemu-img create -f qcow2 -o \
backing_file=http://download.fedoraproject.org/pub/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso \
foo.qcow2 1G
$ x86_64-softmmu/qemu-system-x86_64 \
-drive if=none,file=foo.qcow2,copy-on-read=on,id=cd \
-device ide-cd,drive=cd --enable-kvm -m 1024
Direct http usage is probably too slow, but with copy-on-read ultimately
the image does boot!
After some time, streaming gets canceled by an EIO, which needs further
investigation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently there is no way to query BlockStats of the backing chain. This
adds "backing" field into BlockStats to make it possible.
The comment of "parent" is reworded.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In the function mirror_iteration() -> qemu_iovec_init(),
it allocates memory for op->qiov.iov, when the write request calls back,
but in the function mirror_iteration_done(), it only frees the op,
not free the op->qiov.iov, so this causes memory leak.
It should use qemu_iovec_destroy() to free op->qiov.
Signed-off-by: Zhang Min <rudy.zhangmin@huawei.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It was muted in the previous commit 4bc74be9. Let's revive it since nothing
prevents us to do it.
With this patch, following command will work as other formats:
$ qemu-img map sheepdog:image
Cc: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Accoring to qcow spec, the offset fields in l1e, l2e and ref table entry
start at bit 9. The offset is cluster offset, and the smallest possible
cluster size is 512 bytes.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the filename is not prefixed by "blkverify:" in
blkverify_parse_filename(), the blkverify driver was not selected
through that protocol prefix, but by an explicit command line (or QMP)
option (like driver=blkverify).
If blkverify_parse_filename() has been called, a filename has been
given. If it is not prefixed, it is probably really just a plain
filename. This is no problem, since we can use it as the test image
filename and rely on the user to specify the raw image filename through
the new corresponding option.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Introduce the "test" and "raw" options for specifying images.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Introduce the "image" option as an alternative to specifying the image
through the filename.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Allow specifying a reference to an existing block device (by name) for
bdrv_file_open() instead of a filename and/or options.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Use qemu_config_parse_qdict() to parse the command-line options in
addition to the config file.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Move the check whether there actually is a config file into the
read_config() function.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the filename is not prefixed by "blkdebug:" in
blkdebug_parse_filename(), the blkdebug driver was not selected through
that protocol prefix, but by an explicit command line option
(file.driver=blkdebug or something similar). Contrary to the current
reaction, this is not a problem at all; we just need to store the
filename (in the x-image option) and can go on; the user just has to
manually specify the config option.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Use an Error variable in the read_config() function.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Local variable "n" as int64_t avoids overflow with large sector number
calculation. See test case change for failure case.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We should pass base_inode->vdi_id to base_vdi_id of SheepdogVdiReq so that sheep
can create a clone instead a fresh volume.
This fixes following command:
qemu-create -b sheepdog:base sheepdog:clone
so users can boot sheepdog:clone as a normal volume.
Cc: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
GlusterFS supports creation of zero-filled file on GlusterFS volume
by means of an API called glfs_zerofill(). Use this API from QEMU to
create an image that is filled with zeroes by using the preallocation
option of qemu-img.
qemu-img create gluster://server/volume/image -o preallocation=full 10G
The allowed values for preallocation are 'full' and 'off'. By default
preallocation is off and image is not zero-filled.
glfs_zerofill() offloads the writing of zeroes to the server and if
the storage supports SCSI WRITESAME, GlusterFS server can issue
BLKZEROOUT ioctl to achieve the zeroing.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Support .bdrv_co_write_zeroes() from gluster driver by using GlusterFS API
glfs_zerofill() that off-loads the writing of zeroes to GlusterFS server.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Convert the read, write, flush and discard implementations from aio-based
ones to coroutine based ones.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>