Commit Graph

63372 Commits

Author SHA1 Message Date
Alberto Garcia 1bab38e7bd block: Simplify bdrv_reopen_abort()
If a bdrv_reopen_multiple() call fails, then the explicit_options
QDict has to be deleted for every entry in the reopen queue. This must
happen regardless of whether that entry's bdrv_reopen_prepare() call
succeeded or not.

This patch simplifies the cleanup code a bit.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-08-15 12:50:39 +02:00
Alberto Garcia 2f624b80ba block: Remove children options from bs->{options,explicit_options}
When bdrv_open_inherit() opens a BlockDriverState the options QDict
can contain options for some of its children, passed in the form of
child-name.option=value

So while each child is opened with that subset of options, those same
options remain stored in the parent BDS, leaving (at least) two copies
of each one of them ("child-name.option=value" in the parent and
"option=value" in the child).

Having the children options stored in the parent is unnecessary and it
can easily lead to an inconsistent state:

  $ qemu-img create -f qcow2 hd0.qcow2 10M
  $ qemu-img create -f qcow2 -b hd0.qcow2 hd1.qcow2
  $ qemu-img create -f qcow2 -b hd1.qcow2 hd2.qcow2

  $ $QEMU -drive file=hd2.qcow2,node-name=hd2,backing.node-name=hd1

This opens a chain of images hd0 <- hd1 <- hd2. Now let's remove hd1
using block_stream:

  (qemu) block_stream hd2 0 hd0.qcow2

After this hd2 contains backing.node-name=hd1, which is no longer
correct because hd1 doesn't exist anymore.

This patch removes all children options from the parent dictionaries
at the end of bdrv_open_inherit() and bdrv_reopen_queue_child().

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-08-15 12:50:39 +02:00
Alberto Garcia 655b4b67e3 qdict: Make qdict_extract_subqdict() accept dst = NULL
This function extracts all options from a QDict starting with a
certain prefix and puts them in a new QDict.

We'll have a couple of cases where we simply want to discard those
options instead of copying them, and that's what this patch does.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-08-15 12:50:39 +02:00
Vladimir Sementsov-Ogievskiy f66b1f0e27 block: drop empty .bdrv_close handlers
.bdrv_close handler is optional after previous commit, no needs to keep
empty functions more.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-08-15 12:50:39 +02:00
Vladimir Sementsov-Ogievskiy 3c005293c2 block: make .bdrv_close optional
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-08-15 12:50:39 +02:00
Daniel P. Berrangé 8d65a3ccfd qemu-img: fix regression copying secrets during convert
When the convert command is creating an output file that needs
secrets, we need to ensure those secrets are passed to both the
blk_new_open and bdrv_create API calls.

This is done by qemu-img extracting all opts matching the name
suffix "key-secret". Unfortunately the code doing this was run after the
call to bdrv_create(), which meant the QemuOpts it was extracting
secrets from was now empty.

Previously this worked by luks as a bug meant the "key-secret"
parameters were not purged from the QemuOpts. This bug was fixed in

  commit b76b4f6045
  Author: Kevin Wolf <kwolf@redhat.com>
  Date:   Thu Jan 11 16:18:08 2018 +0100

    qcow2: Use visitor for options in qcow2_create()

Exposing the latent bug in qemu-img. This fix simply moves the copying
of secrets to before the bdrv_create() call.

Cc: qemu-stable@nongnu.org
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-08-15 12:50:39 +02:00
Kevin Wolf 86fae10c64 mirror: Fail gracefully for source == target
blockdev-mirror with the same node for source and target segfaults
today: A node is in its own backing chain, so mirror_start_job() decides
that this is an active commit. When adding the intermediate nodes with
block_job_add_bdrv(), it starts the iteration through the subchain with
the backing file of source, though, so it never reaches target and
instead runs into NULL at the base.

While we could fix that by starting with source itself, there is no
point in allowing mirroring a node into itself and I wouldn't be
surprised if this caused more problems later.

So just check for this scenario and error out.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-08-15 12:50:39 +02:00
Kevin Wolf dbfdf6cb36 qapi/block: Document restrictions for node names
blockdev-add fails if an invalid node name is given, so we should
document what a valid node name even is.

Reported-by: Cong Li <coli@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Cong Li <coli@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
2018-08-15 12:50:39 +02:00
Kevin Wolf 6984eb8da2 block: Remove dead deprecation warning code
This reinstates commit 6266e900b8,
which was temporarily reverted for the 3.0 release so that libvirt gets
some extra time to update their command lines.

We removed all options from the 'deprecated' array, so the code is dead
and can be removed as well.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2018-08-15 12:50:39 +02:00
Kevin Wolf 572023f7b2 block: Remove deprecated -drive option serial
This reinstates commit b008326744,
which was temporarily reverted for the 3.0 release so that libvirt gets
some extra time to update their command lines.

The -drive option serial was deprecated in QEMU 2.10. It's time to
remove it.

Tests need to be updated to set the serial number with -global instead
of using the -drive option.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2018-08-15 12:50:39 +02:00
Kevin Wolf 7f8fc97155 block: Remove deprecated -drive option addr
This reinstates commit eae3bd1eb7,
which was temporarily reverted for the 3.0 release so that libvirt gets
some extra time to update their command lines.

The -drive option addr was deprecated in QEMU 2.10. It's time to remove
it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2018-08-15 12:50:39 +02:00
Kevin Wolf b24ec3c462 block: Remove deprecated -drive geometry options
This reinstates commit a7aff6dd10,
which was temporarily reverted for the 3.0 release so that libvirt gets
some extra time to update their command lines.

The -drive options cyls, heads, secs and trans were deprecated in
QEMU 2.10. It's time to remove them.

hd-geo-test tested both the old version with geometry options in -drive
and the new one with -device. Therefore the code using -drive doesn't
have to be replaced there, we just need to remove the -drive test cases.
This in turn allows some simplification of the code.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2018-08-15 12:50:39 +02:00
Fam Zheng 497da8236a luks: Allow share-rw=on
Format drivers such as qcow2 don't allow sharing the same image between
two QEMU instances in order to prevent image corruptions, because of
metadata cache. LUKS driver don't modify metadata except for when
creating image, so it is safe to relax the permission. This makes
share-rw=on property work on virtual devices.

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-08-15 12:50:39 +02:00
Alberto Garcia 25b8e4db7f throttle-groups: Don't allow timers without throttled requests
Commit 6fccbb475b fixed a bug caused by
QEMU attempting to remove a throttle group member with no pending
requests but an active timer set. This was the result of a previous
bdrv_drained_begin() call processing the throttled requests but
leaving the timer untouched.

Although the commit does solve the problem, the situation shouldn't
happen in the first place. If we try to drain a throttle group member
which has a timer set, we should cancel the timer instead of ignoring
it.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-08-15 12:50:39 +02:00
Alberto Garcia 3db3e9c621 qemu-iotests: Update 093 to improve the draining test
The previous patch fixes a problem in which draining a block device
with more than one throttled request can make it wait first for the
completion of requests in other members of the same group.

This patch updates test_remove_group_member() in iotest 093 to
reproduce that scenario. This updated test would hang QEMU without the
fix from the previous patch.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-08-15 12:50:39 +02:00
Alberto Garcia 5d8e4ca035 throttle-groups: Skip the round-robin if a member is being drained
In the throttling code after an I/O request has been completed the
next one is selected from a different member using a round-robin
algorithm. This ensures that all members get a chance to finish their
pending I/O requests.

However, if a group member has its I/O limits disabled (because it's
being drained) then we should always give it priority in order to have
all its pending requests finished as soon as possible.

If we don't do this we could have a member in the process of being
drained waiting for the throttled requests of other members, for which
the I/O limits still apply.

This can have additional consequences: if we're running in qtest mode
(with QEMU_CLOCK_VIRTUAL) then timers can only fire if we advance the
clock manually, so attempting to drain a block device can hang QEMU in
the BDRV_POLL_WHILE() loop at the end of bdrv_do_drained_begin().

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-08-15 12:50:39 +02:00
Alberto Garcia ef7a6a3c2a qemu-iotests: Test removing a throttle group member with a pending timer
A throttle group can have several members, and each one of them can
have several pending requests in the queue.

The requests are processed in a round-robin fashion, so the algorithm
decides the drive that is going to run the next request and sets a
timer in it. Once the timer fires and the throttled request is run
then the next drive from the group is selected and a new timer is set.

If the user tried to remove a drive from a group and that drive had a
timer set then the code was not taking care of setting up a new timer
in one of the remaining members of the group, freezing their I/O.

This problem was fixed in 6fccbb475b,
and this patch adds a new test case that reproduces this exact
scenario.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-08-15 12:50:39 +02:00
Kevin Wolf f62492bb8d block/qapi: Fix memory leak in qmp_query_blockstats()
For BlockBackends that are skipped in query-blockstats, we would leak
info since commit 567dcb31. Allocate info only later to avoid the memory
leak.

Fixes: CID 1394727
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
2018-08-15 12:50:39 +02:00
Marc-André Lureau cb9ec42f33 monitor: fix oob command leak
Spotted by ASAN, during make check...

Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7f8e27262c48 in malloc (/lib64/libasan.so.5+0xeec48)
    #1 0x7f8e26a5f3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5)
    #2 0x555ab67078a8 in qstring_from_str /home/elmarco/src/qq/qobject/qstring.c:67
    #3 0x555ab67071e4 in qstring_new /home/elmarco/src/qq/qobject/qstring.c:24
    #4 0x555ab6713fbf in qstring_from_escaped_str /home/elmarco/src/qq/qobject/json-parser.c:144
    #5 0x555ab671738c in parse_literal /home/elmarco/src/qq/qobject/json-parser.c:506
    #6 0x555ab67179c3 in parse_value /home/elmarco/src/qq/qobject/json-parser.c:569
    #7 0x555ab6715123 in parse_pair /home/elmarco/src/qq/qobject/json-parser.c:306
    #8 0x555ab6715483 in parse_object /home/elmarco/src/qq/qobject/json-parser.c:357
    #9 0x555ab671798b in parse_value /home/elmarco/src/qq/qobject/json-parser.c:561
    #10 0x555ab6717a6b in json_parser_parse_err /home/elmarco/src/qq/qobject/json-parser.c:592
    #11 0x555ab4fd4dcf in handle_qmp_command /home/elmarco/src/qq/monitor.c:4257
    #12 0x555ab6712c4d in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:105
    #13 0x555ab67e01e2 in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:323
    #14 0x555ab67e0af6 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:373
    #15 0x555ab6713010 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:124
    #16 0x555ab4fd58ec in monitor_qmp_read /home/elmarco/src/qq/monitor.c:4337
    #17 0x555ab6559df2 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:175
    #18 0x555ab6559e95 in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:187
    #19 0x555ab6560127 in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66
    #20 0x555ab65d9c73 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84
    #21 0x7f8e26a598ac in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x4c8ac)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180809114417.28718-4-marcandre.lureau@redhat.com>
[Screwed up in commit b27314567d]
Cc: qemu-stable@nongnu.org
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-08-15 08:12:57 +02:00
Marc-André Lureau 42478dacc8 tests: fix crumple/recursive leak
Spotted by ASAN:

=================================================================
==27907==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 4120 byte(s) in 1 object(s) allocated from:
    #0 0x7f913458ce50 in calloc (/lib64/libasan.so.5+0xeee50)
    #1 0x7f9133fd641d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d)
    #2 0x5561c6643c95 in qdict_crumple_test_recursive /home/elmarco/src/qq/tests/check-block-qdict.c:438
    #3 0x7f9133ff7c49  (/lib64/libglib-2.0.so.0+0x73c49)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180809114417.28718-2-marcandre.lureau@redhat.com>
[Screwed up in commit 2860b2b2cb]
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-08-15 08:12:19 +02:00
Markus Armbruster b736e25a18 qapi: Fix some pycodestyle-3 complaints
Fix the following issues:

    common.py:873:13: E129 visually indented line with same indent as next logical line
    common.py:1766:5: E741 ambiguous variable name 'l'
    common.py:1784:1: E305 expected 2 blank lines after class or function definition, found 1
    common.py:1833:1: E305 expected 2 blank lines after class or function definition, found 1
    common.py:1843:1: E305 expected 2 blank lines after class or function definition, found 1
    visit.py:181:18: E127 continuation line over-indented for visual indent

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180621083551.775-1-armbru@redhat.com>
[Fixup squashed in:]
Message-ID: <871sd0nzw9.fsf@dusky.pond.sub.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2018-08-15 07:24:22 +02:00
Marc-André Lureau 214e4a5b38 tests: change /0.15/* tests to /qmp/*
Presumably 0.15 was the version it was first introduced, but
qmp keeps evolving. There is no point in having that version
as test prefix, 'qmp' makes more sense here.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180326150916.9602-12-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-08-15 07:24:22 +02:00
Marc-André Lureau fcfab75410 qmp-shell: learn to send commands with quoted arguments
Use shlex to split the CLI command, respecting quoted arguments, and
also comments. This allows to call for ex:

(QEMU) human-monitor-command command-line="screendump /dev/null"
{"execute": "human-monitor-command", "arguments": {"command-line": "screendump /dev/null"}}

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180326150916.9602-3-marcandre.lureau@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-08-15 07:24:22 +02:00
Fam Zheng 37a81812f7 aio-posix: Improve comment around marking node deleted
The counter is for qemu_lockcnt_inc/dec sections (read side),
qemu_lockcnt_lock/unlock is for the write side.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180803063917.30292-1-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Fam Zheng af7e916869 tests/vm: Add vm-build-all/vm-clean-all in help text
Suggested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180727083445.21436-1-famz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Peter Maydell f2d4becdc7 tests/vm: Use make's --output-sync option
Use make's --output-sync option when running tests inside VMs,
so that if we're building with parallelization the output doesn't
get scrambled.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180803085230.30574-6-peter.maydell@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Peter Maydell eb2712f568 tests/vm: Bump guest RAM up from 2G to 4G
Currently we run the guests in a VM which is given only 2G of RAM.
Since the guests are configured without any swap space, builds
can fail because the system runs out of memory and kills the
compiler, especially if the job count is set for a lot of
parallelism. Bump the setting up from 2G to 4G to give us some
more headroom.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180803085230.30574-5-peter.maydell@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Peter Maydell 41e3340afe tests/vm: Propagate V=1 down into the make inside the VM
Invoking 'make vm-build-freebsd' and friends with V=1 should
propagate that verbosity setting down into the build run
inside the VM. Make sure we do that. This brings it into
line with how the container tests handle V=1.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180803085230.30574-4-peter.maydell@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Peter Maydell ebb61f804d tests/vm: Pass the jobs parallelism setting to 'make check'
Our test suite works for parallel execution too, and this can
noticeably speed up a test run; pass the 'jobs' setting to
it as well as to the build proper.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180803085230.30574-3-peter.maydell@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Fam Zheng ebe95fa094 tests: vm: Add vm-clean-all
The images are big. Add a rule to clean up easily.

Suggested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180716020008.31468-1-famz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Fam Zheng 1bd2698808 tests: Add centos VM testing
This one does docker testing in the VM. It is intended to replace the
native docker testing on patchew testers.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180712012829.20231-5-famz@redhat.com>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Fam Zheng 73fb4f1de3 tests: Allow overriding archive path with SRC_ARCHIVE
In VM based tests, the source archive is created in host, we don't have
to run archive-source.sh again, as it complicates the Makefile and
scripts.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180712012829.20231-4-famz@redhat.com>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Fam Zheng 983c2a777b tests: Add an option for snapshot (default: off)
Not using snapshot has the benefit of automatically persisting useful
test harnesses, such as docker images and ccache database. Although it
will lose some cleanness, it is imaginably useful for patchew.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180712012829.20231-2-famz@redhat.com>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Fam Zheng 8158ed48bb docker: Install more packages in centos7
This makes test-block work.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180711065813.14894-1-famz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Fam Zheng b37548fcd1 aio: Do aio_notify_accept only during blocking aio_poll
An aio_notify() pairs with an aio_notify_accept(). The former should
happen in the main thread or a vCPU thread, and the latter should be
done in the IOThread.

There is one rare case that the main thread or vCPU thread may "steal"
the aio_notify() event just raised by itself, in bdrv_set_aio_context()
[1]. The sequence is like this:

    main thread                     IO Thread
    ===============================================================
    bdrv_drained_begin()
      aio_disable_external(ctx)
                                    aio_poll(ctx, true)
                                      ctx->notify_me += 2
    ...
    bdrv_drained_end()
      ...
        aio_notify()
    ...
    bdrv_set_aio_context()
      aio_poll(ctx, false)
[1]     aio_notify_accept(ctx)
                                      ppoll() /* Hang! */

[1] is problematic. It will clear the ctx->notifier event so that
the blocked ppoll() will not return.

(For the curious, this bug was noticed when booting a number of VMs
simultaneously in RHV.  One or two of the VMs will hit this race
condition, making the VIRTIO device unresponsive to I/O commands. When
it hangs, Seabios is busy waiting for a read request to complete (read
MBR), right after initializing the virtio-blk-pci device, using 100%
guest CPU. See also https://bugzilla.redhat.com/show_bug.cgi?id=1562750
for the original bug analysis.)

aio_notify() only injects an event when ctx->notify_me is set,
correspondingly aio_notify_accept() is only useful when ctx->notify_me
_was_ set. Move the call to it into the "blocking" branch. This will
effectively skip [1] and fix the hang.

Furthermore, blocking aio_poll is only allowed on home thread
(in_aio_context_home_thread), because otherwise two blocking
aio_poll()'s can steal each other's ctx->notifier event and cause
hanging just like described above.

Cc: qemu-stable@nongnu.org
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180809132259.18402-3-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Fam Zheng 70232b5253 aio-posix: Don't count ctx->notifier as progress when polling
The same logic exists in fd polling. This change is especially important
to avoid busy loop once we limit aio_notify_accept() to blocking
aio_poll().

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180809132259.18402-2-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Paolo Bonzini 2f0d8947a6 nvme: simplify plug/unplug
bdrv_io_plug/bdrv_io_unplug take care of keeping a nesting count,
so change s->plugged to just a bool.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180813144320.12382-2-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Fam Zheng 9582f357bb nvme: Fix nvme_init error handling
It is wrong to leave this field as 1, as nvme_close() called in the
error handling code in nvme_file_open() will use it and try to free
s->queues again.

Another problem is the cleaning ups are duplicated between the fail*
labels of nvme_init() and nvme_file_open(), which calls nvme_close().

A third problem is nvme_close() misses g_free() and
event_notifier_cleanup().

Fix all of them.

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>

Message-Id: <20180712025420.4932-1-famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Philippe Mathieu-Daudé a3f9f64bf9 tests/vm: Add flex and bison to the vm image
Similar to 79f24568e5, this fixes the following warnings:

           CHK version_gen.h
           LEX convert-dtsv0-lexer.lex.c
  make[1]: flex: Command not found
           BISON dtc-parser.tab.c
  make[1]: bison: Command not found
           LEX dtc-lexer.lex.c
  make[1]: flex: Command not found

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180628153535.1411-5-f4bug@amsat.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Philippe Mathieu-Daudé dcf7ea4a78 tests/vm: Only use -cpu 'host' if KVM is available
If KVM is not available, then use the 'max' cpu.

This fixes:

  ERROR:root:Log:
  ERROR:root:qemu-system-x86_64: CPU model 'host' requires KVM
  Failed to prepare guest environment
  error: [Errno 104] Connection reset by peer
  source/qemu/tests/vm/Makefile.include:25: recipe for target 'tests/vm/ubuntu.i386.img' failed
  make: *** [tests/vm/ubuntu.i386.img] Error 2

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180628153535.1411-4-f4bug@amsat.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15 10:12:35 +08:00
Richard Henderson 054e7adf4e target/arm: Fix typo in helper_sve_movz_d
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20180801123111.3595-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-14 17:17:22 +01:00
Richard Henderson bbd0968c45 target/arm: Reorganize SVE WHILE
The pseudocode for this operation is an increment + compare loop,
so comparing <= the maximum integer produces an all-true predicate.

Rather than bound in both the inline code and the helper, pass the
helper the number of predicate bits to set instead of the number
of predicate elements to set.

Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20180801123111.3595-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-14 17:17:22 +01:00
Richard Henderson 7a31e0c6c6 target/arm: Fix typo in do_sat_addsub_64
Used the wrong temporary in the computation of subtractive overflow.

Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20180801123111.3595-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-14 17:17:22 +01:00
Richard Henderson df4e001093 target/arm: Fix sign of sve_cmpeq_ppzw/sve_cmpne_ppzw
The normal vector element is sign-extended before
comparing with the wide vector element.

Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20180801123111.3595-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-14 17:17:22 +01:00
Peter Maydell 5f62d3b9e6 target/arm: Implement tailchaining for M profile cores
Tailchaining is an optimization in handling of exception return
for M-profile cores: if we are about to pop the exception stack
for an exception return, but there is a pending exception which
is higher priority than the priority we are returning to, then
instead of unstacking and then immediately taking the exception
and stacking registers again, we can chain to the pending
exception without unstacking and stacking.

For v6M and v7M it is IMPDEF whether tailchaining happens for pending
exceptions; for v8M this is architecturally required.  Implement it
in QEMU for all M-profile cores, since in practice v6M and v7M
hardware implementations generally do have it.

(We were already doing tailchaining for derived exceptions which
happened during exception return, like the validity checks and
stack access failures; these have always been required to be
tailchained for all versions of the architecture.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180720145647.8810-5-peter.maydell@linaro.org
2018-08-14 17:17:22 +01:00
Peter Maydell 89b1fec193 target/arm: Restore M-profile CONTROL.SPSEL before any tailchaining
On exception return for M-profile, we must restore the CONTROL.SPSEL
bit from the EXCRET value before we do any kind of tailchaining,
including for the derived exceptions on integrity check failures.
Otherwise we will give the guest an incorrect EXCRET.SPSEL value on
exception entry for the tailchained exception.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180720145647.8810-4-peter.maydell@linaro.org
2018-08-14 17:17:22 +01:00
Peter Maydell b8109608bc target/arm: Initialize exc_secure correctly in do_v7m_exception_exit()
In do_v7m_exception_exit(), we use the exc_secure variable to track
whether the exception we're returning from is secure or non-secure.
Unfortunately the statement initializing this was accidentally
inside an "if (env->v7m.exception != ARMV7M_EXCP_NMI)" conditional,
which meant that we were using the wrong value for NMI handlers.
Move the initialization out to the right place.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180720145647.8810-3-peter.maydell@linaro.org
2018-08-14 17:17:21 +01:00
Peter Maydell a9074977ef target/arm: Improve exception-taken logging
Improve the exception-taken logging by logging in
v7m_exception_taken() the exception we're going to take
and whether it is secure/nonsecure.

This requires us to move logging at many callsites from after the
call to before it, so that the logging appears in a sensible order.

(This will make tail-chaining produce more useful logs; for the
current callers of v7m_exception_taken() we know which exception
we're going to take, so custom log messages at the callsite sufficed;
for tail-chaining only v7m_exception_taken() knows the exception
number that we're going to tail-chain to.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180720145647.8810-2-peter.maydell@linaro.org
2018-08-14 17:17:21 +01:00
Peter Maydell 3d0e3080d8 target/arm: Treat SCTLR_EL1.M as if it were zero when HCR_EL2.TGE is set
One of the required effects of setting HCR_EL2.TGE is that when
SCR_EL3.NS is 1 then SCTLR_EL1.M must behave as if it is zero for
all purposes except direct reads. That is, it effectively disables
the MMU for the NS EL0/EL1 translation regime.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180724115950.17316-6-peter.maydell@linaro.org
2018-08-14 17:17:21 +01:00
Peter Maydell ac656b166b target/arm: Provide accessor functions for HCR_EL2.{IMO, FMO, AMO}
The IMO, FMO and AMO bits in HCR_EL2 are defined to "behave as
1 for all purposes other than direct reads" if HCR_EL2.TGE
is set and HCR_EL2.E2H is 0, and to "behave as 0 for all
purposes other than direct reads" if HCR_EL2.TGE is set
and HRC_EL2.E2H is 1.

To avoid having to check E2H and TGE everywhere where we test IMO and
FMO, provide accessors arm_hcr_el2_imo(), arm_hcr_el2_fmo()and
arm_hcr_el2_amo().  We don't implement ARMv8.1-VHE yet, so the E2H
case will never be true, but we include the logic to save effort when
we eventually do get to that.

(Note that in several of these callsites the change doesn't
actually make a difference as either the callsite is handling
TGE specially anyway, or the CPU can't get into that situation
with TGE set; we change everywhere for consistency.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180724115950.17316-5-peter.maydell@linaro.org
2018-08-14 17:17:21 +01:00