Move to system/, as this is mostly about configuring vfio-ap.
Message-Id: <20200213162942.14177-3-cohuck@redhat.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
While at it, also fix the numbering in 'What QEMU does'.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20200213162942.14177-2-cohuck@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
There is a special quiesce PSW that we check for "shutdown". Otherwise disabled
wait is detected as "crashed". Architecturally we must only check PSW bits
116-127. Fix this.
Cc: qemu-stable@nongnu.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <1582204582-22995-1-git-send-email-borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Up to now we only had an ioctl to reset vcpu data QEMU couldn't reach
for the initial reset, which was also called for the clear reset. To
be architecture compliant, we also need to clear local interrupts on a
normal reset.
Because of this and the upcoming protvirt support we need to add
ioctls for the missing clear and normal resets.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200214151636.8764-3-frankja@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Update to commit b1da3acc781c ("Merge tag 'ecryptfs-5.6-rc3-fixes' of
git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs")
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
RNSBG is handled via the op_rosbg() helper function. But RNSBG has
the opcode 0xEC54, i.e. 0x54 as second byte, while op_rosbg() currently
checks for 0x55. This seems to be a typo, fix it to use 0x54 instead,
so that op_rosbg() does not abort() anymore if a program uses RNSBG.
I've checked with a simple test function that I now get the same results
with KVM and with TCG:
static void test_rnsbg(void)
{
uint64_t r1, r2;
r2 = 0xffff000000000000UL;
r1 = 0x123456789bdfaaaaUL;
asm volatile (" rnsbg %0,%1,12,61,16 " : "+r"(r1) : "r"(r2));
printf("r1 afterwards: 0x%lx\n", r1);
}
Buglink: https://bugs.launchpad.net/qemu/+bug/1860920
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20200130133417.10531-1-thuth@redhat.com>
Fixes: d6c6372e18 ("target-s390: Implement R[NOX]SBG")
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
* FXAM fix (myself)
* memdev refactoring (Igor)
* memory region API cleanups (Peter, Philippe)
* ioeventfd optimization (Stefan)
* new WHPX maintainer (Sunil)
* Large guest startup optimizations (Chen)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJeVRYwAAoJEL/70l94x66DfNMH/3GFuVzL9eqM6kvuqT8HdG79
SG3+XyVEuStBDkj3BMNMx2ua1pUaT52UOCpmYdfTUni5pKbbrSlRLH/O4lhHW+M4
51KgDI8UjEz+R4LjtRSEAPFA9FOCqKVi7F+jy3Qk8xFQ9OF2qF/QYOr+vOkJmBJg
pHYa12yseHLRy+mG7ou7rcU2xxk2a13zA3z4ANjZzgtVkDMErEEbvkLfRxn9ksyT
m1HIrgXByLXfsRCrLa0YFjJNpuazvcDT1LFLODORJA6czcw7fXtnwN5QARZhEZNu
UpFasLarWcF4TpAXdXvqOqkprAlXZg8qojMcX3ENmIkNAQWN+1pjGbN1izMiWm4=
=cY/g
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* device_del fix (Julia)
* FXAM fix (myself)
* memdev refactoring (Igor)
* memory region API cleanups (Peter, Philippe)
* ioeventfd optimization (Stefan)
* new WHPX maintainer (Sunil)
* Large guest startup optimizations (Chen)
# gpg: Signature made Tue 25 Feb 2020 12:42:24 GMT
# gpg: using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (104 commits)
WHPX: Assigning maintainer for Windows Hypervisor Platform
accel/kvm: Check ioctl(KVM_SET_USER_MEMORY_REGION) return value
target/i386: check for empty register in FXAM
qdev-monitor: Forbid repeated device_del
mem-prealloc: optimize large guest startup
memory: batch allocate ioeventfds[] in address_space_update_ioeventfds()
Avoid cpu_physical_memory_rw() with a constant is_write argument
Let cpu_[physical]_memory() calls pass a boolean 'is_write' argument
exec: Let cpu_[physical]_memory API use a boolean 'is_write' argument
Avoid address_space_rw() with a constant is_write argument
Let address_space_rw() calls pass a boolean 'is_write' argument
exec: Let address_space_unmap() use a boolean 'is_write' argument
hw/virtio: Let vhost_memory_map() use a boolean 'is_write' argument
hw/virtio: Let virtqueue_map_iovec() use a boolean 'is_write' argument
hw/ide: Let the DMAIntFunc prototype use a boolean 'is_write' argument
hw/ide/internal: Remove unused DMARestartFunc typedef
Remove unnecessary cast when using the cpu_[physical]_memory API
exec: Let the cpu_[physical]_memory API use void pointer arguments
Remove unnecessary cast when using the address_space API
hw/net: Avoid casting non-const pointer, use address_space_write()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Move the following tools documentation files to the new tools manual:
docs/interop/qemu-img.rst
docs/interop/qemu-nbd.rst
docs/interop/virtfs-proxy-helper.rst
docs/interop/qemu-trace-stap.rst
docs/interop/virtiofsd.rst
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20200217155415.30949-4-peter.maydell@linaro.org
The qemu-option-trace.rst.inc file contains a rST documentation
fragment which describes trace options common to qemu-nbd and
qemu-img. We put this file into interop/, but we'd like to move the
qemu-nbd and qemu-img files into the tools/ manual. We could move
the .rst.inc file along with them, but we're eventually going to want
to use it for the main QEMU binary options documentation too, and
that will be in system/. So move qemu-option-trace.rst.inc to the
top-level docs/ directory, where all these files can include it via
.. include:: ../qemu-option-trace.rst.inc
This does have the slight downside that we now need to explicitly
tell Make which manuals use this file rather than relying on
a wildcard for all .rst.inc in the manual.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20200217155415.30949-3-peter.maydell@linaro.org
Some of the documentation for QEMU "tools" which are standalone
binaries like qemu-img is an awkward fit in our current 5-manual
split. We've put it into "interop", but they're not really
about interoperability.
Create a new top level manual "tools" which will be a better
home for this documentation. This commit creates an empty
initial manual; we will move the relevant documentation
files in a subsequent commit.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20200217155415.30949-2-peter.maydell@linaro.org
This series removes ad hoc RAM allocation API (memory_region_allocate_system_memory)
and consolidates it around hostmem backend. It allows to
* resolve conflicts between global -mem-prealloc and hostmem's "policy" option,
fixing premature allocation before binding policy is applied
* simplify complicated memory allocation routines which had to deal with 2 ways
to allocate RAM.
* reuse hostmem backends of a choice for main RAM without adding extra CLI
options to duplicate hostmem features. A recent case was -mem-shared, to
enable vhost-user on targets that don't support hostmem backends [1] (ex: s390)
* move RAM allocation from individual boards into generic machine code and
provide them with prepared MemoryRegion.
* clean up deprecated NUMA features which were tied to the old API (see patches)
- "numa: remove deprecated -mem-path fallback to anonymous RAM"
- (POSTPONED, waiting on libvirt side) "forbid '-numa node,mem' for 5.0 and newer machine types"
- (POSTPONED) "numa: remove deprecated implicit RAM distribution between nodes"
Introduce a new machine.memory-backend property and wrapper code that aliases
global -mem-path and -mem-alloc into automatically created hostmem backend
properties (provided memory-backend was not set explicitly given by user).
A bulk of trivial patches then follow to incrementally convert individual
boards to using machine.memory-backend provided MemoryRegion.
Board conversion typically involves:
* providing MachineClass::default_ram_size and MachineClass::default_ram_id
so generic code could create default backend if user didn't explicitly provide
memory-backend or -m options
* dropping memory_region_allocate_system_memory() call
* using convenience MachineState::ram MemoryRegion, which points to MemoryRegion
allocated by ram-memdev
On top of that for some boards:
* missing ram_size checks are added (typically it were boards with fixed ram size)
* ram_size fixups are replaced by checks and hard errors, forcing user to
provide correct "-m" values instead of ignoring it and continuing running.
After all boards are converted, the old API is removed and memory allocation
routines are cleaned up.
kvm_vm_ioctl() can fail, check its return value, and log an error
when it failed. This fixes Coverity CID 1412229:
Unchecked return value (CHECKED_RETURN)
check_return: Calling kvm_vm_ioctl without checking return value
Reported-by: Coverity (CID 1412229)
Fixes: 235e8982ad ("support using KVM_MEM_READONLY flag for regions")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20200221163336.2362-1-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The fxam instruction returns the wrong result after fdecstp or after
an underflow. Check fptags to handle this.
Reported-by: <chengang@emindsoft.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Device unplug can be done asynchronously. Thus, sending the second
device_del before the previous unplug is complete may lead to
unexpected results. On PCIe devices, this cancels the hot-unplug
process.
Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200220165556.39388-1-jusual@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[desc]:
Large memory VM starts slowly when using -mem-prealloc, and
there are some areas to optimize in current method;
1、mmap will be used to alloc threads stack during create page
clearing threads, and it will attempt mm->mmap_sem for write
lock, but clearing threads have hold read lock, this competition
will cause threads createion very slow;
2、methods of calcuating pages for per threads is not well;if we use
64 threads to split 160 hugepage,63 threads clear 2page,1 thread
clear 34 page,so the entire speed is very slow;
to solve the first problem,we add a mutex in thread function,and
start all threads when all threads finished createion;
and the second problem, we spread remainder to other threads,in
situation that 160 hugepage and 64 threads, there are 32 threads
clear 3 pages,and 32 threads clear 2 pages.
[test]:
320G 84c VM start time can be reduced to 10s
680G 84c VM start time can be reduced to 18s
Signed-off-by: bauerchen <bauerchen@tencent.com>
Reviewed-by: Pan Rui <ruippan@tencent.com>
Reviewed-by: Ivan Ren <ivanren@tencent.com>
[Simplify computation of the number of pages per thread. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reallocing the ioeventfds[] array each time an element is added is very
expensive as the number of ioeventfds increases. Batch allocate instead
to amortize the cost of realloc.
This patch reduces Linux guest boot times from 362s to 140s when there
are 2 virtio-blk devices with 1 virtqueue and 99 virtio-blk devices with
32 virtqueues.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200218182226.913977-1-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This pull request contains a virtio-blk/scsi performance optimization, event
loop scalability improvements, and a qtest-based device fuzzing framework. I
am including the fuzzing patches because I have reviewed them and Thomas Huth
is currently away on leave.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAl5Q6z0ACgkQnKSrs4Gr
c8hWHwf6A8f4+CY+wvW1k0GkXlo6Z8QT3Lms0pMlbCcM+eTgRTz4HE2FK7lDFM8b
xUfjy78pImXJyquGeLyzcFDMrJNJkHfz1BncYxDHs/rweWNM8R9Wy08my3YLRAnK
MIM0unoWzXY8QKx+f61IEDGz1ZMVHTMpV99GhacMdx3xkyOTQk5TW1HYUUGYyiL6
gE3eVlo6a9SR06fznmBIuV3Thbqu/jI4SVk3n+gb6+HU5L9T+X6y9UsungVAd68M
oByV/RNu6+HL+YPt39XmIaUA2iOz4aq+7pw3r7w9BMrTr7AK522yoGkYXFpipCJA
spLoz+YS1Lpmezl+CuJgIak+wTJqmA==
=OnmP
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Pull request
This pull request contains a virtio-blk/scsi performance optimization, event
loop scalability improvements, and a qtest-based device fuzzing framework. I
am including the fuzzing patches because I have reviewed them and Thomas Huth
is currently away on leave.
# gpg: Signature made Sat 22 Feb 2020 08:50:05 GMT
# gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request: (31 commits)
fuzz: add documentation to docs/devel/
fuzz: add virtio-scsi fuzz target
fuzz: add virtio-net fuzz target
fuzz: add i440fx fuzz targets
fuzz: add configure flag --enable-fuzzing
fuzz: add target/fuzz makefile rules
fuzz: add support for qos-assisted fuzz targets
fuzz: support for fork-based fuzzing.
main: keep rcu_atfork callback enabled for qtest
exec: keep ram block across fork when using qtest
fuzz: add fuzzer skeleton
libqos: move useful qos-test funcs to qos_external
libqos: split qos-test and libqos makefile vars
libqos: rename i2c_send and i2c_recv
qtest: add in-process incoming command handler
libqtest: make bufwrite rely on the TransportOps
libqtest: add a layer of abstraction to send/recv
qtest: add qtest_server_send abstraction
fuzz: add FUZZ_TARGET module type
module: check module wasn't already initialized
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-23-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The virtio-scsi fuzz target sets up and fuzzes the available virtio-scsi
queues. After an element is placed on a queue, the fuzzer can select
whether to perform a kick, or continue adding elements.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-22-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The virtio-net fuzz target feeds inputs to all three virtio-net
virtqueues, and uses forking to avoid leaking state between fuzz runs.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-21-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
These three targets should simply fuzz reads/writes to a couple ioports,
but they mostly serve as examples of different ways to write targets.
They demonstrate using qtest and qos for fuzzing, as well as using
rebooting and forking to reset state, or not resetting it at all.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-20-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-19-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200220041118.23264-18-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-17-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
fork() is a simple way to ensure that state does not leak in between
fuzzing runs. Unfortunately, the fuzzer mutation engine relies on
bitmaps which contain coverage information for each fuzzing run, and
these bitmaps should be copied from the child to the parent(where the
mutation occurs). These bitmaps are created through compile-time
instrumentation and they are not shared with fork()-ed processes, by
default. To address this, we create a shared memory region, adjust its
size and map it _over_ the counter region. Furthermore, libfuzzer
doesn't generally expose the globals that specify the location of the
counters/coverage bitmap. As a workaround, we rely on a custom linker
script which forces all of the bitmaps we care about to be placed in a
contiguous region, which is easy to locate and mmap over.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-16-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The qtest-based fuzzer makes use of forking to reset-state between
tests. Keep the callback enabled, so the call_rcu thread gets created
within the child process.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200220041118.23264-15-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Ram blocks were marked MADV_DONTFORK breaking fuzzing-tests which
execute each test-input in a forked process.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-14-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
tests/fuzz/fuzz.c serves as the entry point for the virtual-device
fuzzer. Namely, libfuzzer invokes the LLVMFuzzerInitialize and
LLVMFuzzerTestOneInput functions, both of which are defined in this
file. This change adds a "FuzzTarget" struct, along with the
fuzz_add_target function, which should be used to define new fuzz
targets.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-13-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The moved functions are not specific to qos-test and might be useful
elsewhere. For example the virtual-device fuzzer makes use of them for
qos-assisted fuzz-targets.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-12-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Most qos-related objects were specified in the qos-test-obj-y variable.
qos-test-obj-y also included qos-test.o which defines a main().
This made it difficult to repurpose qos-test-obj-y to link anything
beside tests/qos-test against libqos. This change separates objects that
are libqos-specific and ones that are qos-test specific into different
variables.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200220041118.23264-11-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The names i2c_send and i2c_recv collide with functions defined in
hw/i2c/core.c. This causes an error when linking against libqos and
softmmu simultaneously (for example when using qtest inproc). Rename the
libqos functions to avoid this.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-id: 20200220041118.23264-10-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The handler allows a qtest client to send commands to the server by
directly calling a function, rather than using a file/CharBackend
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-9-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When using qtest "in-process" communication, qtest_sendf directly calls
a function in the server (qtest.c). Previously, bufwrite used
socket_send, which bypasses the TransportOps enabling the call into
qtest.c. This change replaces the socket_send calls with ops->send,
maintaining the benefits of the direct socket_send call, while adding
support for in-process qtest calls.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-8-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This makes it simple to swap the transport functions for qtest commands
to and from the qtest client. For example, now it is possible to
directly pass qtest commands to a server handler that exists within the
same process, without the standard way of writing to a file descriptor.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-7-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qtest_server_send is a function pointer specifying the handler used to
transmit data to the qtest client. In the standard configuration, this
calls the CharBackend handler, but now it is possible for other types of
handlers, e.g direct-function calls if the qtest client and server
exist within the same process (inproc)
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-id: 20200220041118.23264-6-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-5-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The virtual-device fuzzer must initialize QOM, prior to running
vl:qemu_init, so that it can use the qos_graph to identify the arguments
required to initialize a guest for libqos-assisted fuzzing. This change
prevents errors when vl:qemu_init tries to (re)initialize the previously
initialized QOM module.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200220041118.23264-4-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
A program might rely on functions implemented in vl.c, but implement its
own main(). By placing main into a separate source file, there are no
complaints about duplicate main()s when linking against vl.o. For
example, the virtual-device fuzzer uses a main() provided by libfuzzer,
and needs to perform some initialization before running the softmmu
initialization. Now, main simply calls three vl.c functions which
handle the guest initialization, main loop and cleanup.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-3-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Move vl.c to a separate directory, similar to linux-user/
Update the chechpatch and get_maintainer scripts, since they relied on
/vl.c for top_of_tree checks.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-2-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
File descriptor monitoring is O(1) with epoll(7), but
aio_dispatch_handlers() still scans all AioHandlers instead of
dispatching just those that are ready. This makes aio_poll() O(n) with
respect to the total number of registered handlers.
Add a local ready_list to aio_poll() so that each nested aio_poll()
builds a list of handlers ready to be dispatched. Since file descriptor
polling is level-triggered, nested aio_poll() calls also see fds that
were ready in the parent but not yet dispatched. This guarantees that
nested aio_poll() invocations will dispatch all fds, even those that
became ready before the nested invocation.
Since only handlers ready to be dispatched are placed onto the
ready_list, the new aio_dispatch_ready_handlers() function provides O(1)
dispatch.
Note that AioContext polling is still O(n) and currently cannot be fully
disabled. This still needs to be fixed before aio_poll() is fully O(1).
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20200214171712.541358-6-stefanha@redhat.com
[Fix compilation error on macOS where there is no epoll(87). The
aio_epoll() prototype was out of date and aio_add_ready_list() needed to
be moved outside the ifdef.
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
It is not necessary to scan all AioHandlers for deletion. Keep a list
of deleted handlers instead of scanning the full list of all handlers.
The AioHandler->deleted field can be dropped. Let's check if the
handler has been inserted into the deleted list instead. Add a new
QLIST_IS_INSERTED() API for this check.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20200214171712.541358-5-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
QLIST_REMOVE() assumes the element is in a list. It also leaves the
element's linked list pointers dangling.
Introduce a safe version of QLIST_REMOVE() and convert open-coded
instances of this pattern.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20200214171712.541358-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Don't pass the nanosecond timeout into epoll_wait(), which expects
milliseconds.
The epoll_wait() timeout value does not matter if qemu_poll_ns()
determined that the poll fd is ready, but passing a value in the wrong
units is still ugly. Pass a 0 timeout to epoll_wait() instead.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20200214171712.541358-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
epoll_handler is a stack variable and must not be accessed after it goes
out of scope:
if (aio_epoll_check_poll(ctx, pollfds, npfd, timeout)) {
AioHandler epoll_handler;
...
add_pollfd(&epoll_handler);
ret = aio_epoll(ctx, pollfds, npfd, timeout);
} ...
...
/* if we have any readable fds, dispatch event */
if (ret > 0) {
for (i = 0; i < npfd; i++) {
nodes[i]->pfd.revents = pollfds[i].revents;
}
}
nodes[0] is &epoll_handler, which has already gone out of scope.
There is no need to use pollfds[] for epoll. We don't need an
AioHandler for the epoll fd.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20200214171712.541358-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The ctx->first_bh list contains all created BHs, including those that
are not scheduled. The list is iterated by the event loop and therefore
has O(n) time complexity with respected to the number of created BHs.
Rewrite BHs so that only scheduled or deleted BHs are enqueued.
Only BHs that actually require action will be iterated.
One semantic change is required: qemu_bh_delete() enqueues the BH and
therefore invokes aio_notify(). The
tests/test-aio.c:test_source_bh_delete_from_cb() test case assumed that
g_main_context_iteration(NULL, false) returns false after
qemu_bh_delete() but it now returns true for one iteration. Fix up the
test case.
This patch makes aio_compute_timeout() and aio_bh_poll() drop from a CPU
profile reported by perf-top(1). Previously they combined to 9% CPU
utilization when AioContext polling is commented out and the guest has 2
virtio-blk,num-queues=1 and 99 virtio-blk,num-queues=32 devices.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20200221093951.1414693-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
QSLIST is the only family of lists for which we do not have RCU-friendly accessors,
add them.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200220103828.24525-1-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The first rcu_read_lock/unlock() is expensive. Nested calls are cheap.
This optimization increases IOPS from 73k to 162k with a Linux guest
that has 2 virtio-blk,num-queues=1 and 99 virtio-blk,num-queues=32
devices.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20200218182708.914552-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>