Recently, linux support multiqueue tap which could let userspace call TUNSETIFF
for a signle device many times to create multiple file descriptors as
independent queues. User could also enable/disabe a specific queue through
TUNSETQUEUE.
The patch adds the generic infrastructure to create multiqueue taps. To achieve
this a new parameter "queues" were introduced to specify how many queues were
expected to be created for tap by qemu itself. Alternatively, management could
also pass multiple pre-created tap file descriptors separated with ':' through a
new parameter fds like -netdev tap,id=hn0,fds="X:Y:..:Z". Multiple vhost file
descriptors could also be passed in this way.
Each TAPState were still associated to a tap fd, which mean multiple TAPStates
were created when user needs multiqueue taps. Since each TAPState contains one
NetClientState, with the multiqueue nic support, an N peers of NetClientState
were built up.
A new parameter, mq_required were introduce in tap_open() to create multiqueue
tap fds.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch introduces a helper tap_get_ifname() to get the device name of tap
device. This is needed when ifname is unspecified in the command line and qemu
were asked to create tap device by itself. In this situation, the name were
allocated by kernel, so if multiqueue is asked, we need to fetch its name after
creating the first queue.
Only linux has this support since it's the only platform that supports
multiqueue tap.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch introduce a new bit - enabled in TAPState which tracks whether a
specific queue/fd is enabled. The tap/fd is enabled during initialization and
could be enabled/disabled by tap_enalbe() and tap_disable() which calls platform
specific helpers to do the real work. Polling of a tap fd can only done when
the tap was enabled.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch adds basic multiqueue support for qemu. The idea is simple, an array
of NetClientStates were introduced in NICState, parse_netdev() were extended to
find and match all NetClientStates belongs to the backend and place their
pointers in NICConf. Then qemu_new_nic can setup a N:N mapping between NICStates
that belongs to a nic and NICStates belongs to the netdev. And a queue_index
were introduced in NetClientState to track its index. After this, each peers of
a NICState were abstracted as a queue.
After this change, all NetClientState that belongs to the same backend/nic has
the same id. When use want to change the link status, all NetClientStates that
belongs to the same backend/nic will be also changed. When user want to delete
a device or netdev, all NetClientStates that belongs to the same backend/nic
will be deleted also. Changing or deleting an specific queue is not allowed.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
To allow allocating an array of NetClientState and free it once, this patch
introduces destructor of NetClientState. Which could do type specific free,
which could be used by multiqueue to free the array once.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In multiqueue, all NetClientState that belongs to the same netdev or nic has the
same id. So this patches introduces an helper qemu_find_net_clients_except()
which finds all NetClientState with the same id. This will be used by multiqueue
networking.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
To support multiqueue nic, this patch separate the nic destructor from
qemu_del_net_client() to a new helper qemu_del_nic() since the mapping bettween
NiCState and NetClientState were not 1:1 in multiqueue. The following patches
would refactor this function to support multiqueue nic.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
To support multiqueue, this patch introduces a helper qemu_get_nic() to get
NICState from a NetClientState. The following patches would refactor this helper
to support multiqueue.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
To support multiqueue, the patch introduce a helper qemu_get_queue()
which is used to get the NetClientState of a device. The following patches would
refactor this helper to support multiqueue.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Remove an unnecessary mutual inclusion loop between qemu-pixman.h and
console.h, since the former was only including the latter for
'PixelFormat*', which can be provided by typedefs.h. This requires a
minor adjustment to the files which included qemu-pixman.h, since
they were relying on it implicitly dragging in all of console.h.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* afaerber/qom-cpu: (37 commits)
kvm: Pass CPUState to kvm_on_sigbus_vcpu()
cpu: Unconditionalize CPUState fields
target-m68k: Use type_register() instead of type_register_static()
target-unicore32: Use type_register() instead of type_register_static()
target-openrisc: Use type_register() instead of type_register_static()
target-unicore32: Catch attempt to instantiate abstract type in cpu_init()
target-openrisc: Catch attempt to instantiate abstract type in cpu_init()
target-m68k: Catch attempt to instantiate abstract type in cpu_init()
target-arm: Catch attempt to instantiate abstract type in cpu_init()
target-alpha: Catch attempt to instantiate abstract type in cpu_init()
qom: Introduce object_class_is_abstract()
target-unicore32: Detect attempt to instantiate non-CPU type in cpu_init()
target-openrisc: Detect attempt to instantiate non-CPU type in cpu_init()
target-m68k: Detect attempt to instantiate non-CPU type in cpu_init()
target-alpha: Detect attempt to instantiate non-CPU type in cpu_init()
target-arm: Detect attempt to instantiate non-CPU type in cpu_init()
cpu: Add model resolution support to CPUClass
target-i386: Remove setting tsc-frequency from x86_def_t
target-i386: Set custom features/properties without intermediate x86_def_t
target-i386: Remove vendor_override field from CPUX86State
...
Conflicts:
tests/Makefile
Resolved simple conflict caused by lack of context in Makefile
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
# By Paolo Bonzini (14) and others
# Via Kevin Wolf
* kwolf/for-anthony: (24 commits)
ide: Add fall through annotations
block: Create proper size file for disk mirror
ahci: Add migration support
ahci: Change data types in preparation for migration
ahci: Remove unused AHCIDevice fields
hbitmap: add assertion on hbitmap_iter_init
mirror: do nothing on zero-sized disk
block/vdi: Check for bad signature
block/vdi: Improved return values from vdi_open
block/vdi: Improve debug output for signature
block: Use error code EMEDIUMTYPE for wrong format in some block drivers
block: Add special error code for wrong format
mirror: support arbitrarily-sized iterations
mirror: support more than one in-flight AIO operation
mirror: add buf-size argument to drive-mirror
mirror: switch mirror_iteration to AIO
mirror: allow customizing the granularity
block: allow customizing the granularity of the dirty bitmap
block: return count of dirty sectors, not chunks
mirror: perform COW if the cluster size is bigger than the granularity
...
Since commit 20d695a925 (kvm: Pass
CPUState to kvm_arch_*) CPUArchState is no longer needed.
Allows to change qemu_kvm_eat_signals() argument as well.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Commits fc8c5b8c41 (Makefile.user: Define
CONFIG_USER_ONLY for libuser/) and
dd83b06ae6 (qom: Introduce CPU class)
specifically prepared the qom/cpu.c file to be compiled differently for
softmmu and *-user. This broke as part of build system refactorings
while CPU patches were in flight, adding conditional fields
kvm_fd (8737c51c04) and
kvm_vcpu_dirty (20d695a925) for softmmu.
linux-user and bsd-user would therefore get a CPUState type with
instance_size ~8 bytes longer than expected.
Fix this by unconditionally having the fields in CPUState.
In practice, target-specific CPU types' instance_size would compensate
this, and upstream qom/cpu.c does not yet touch any affected field.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
This lets a caller check if an ObjectClass as returned by, e.g.,
object_class_by_name() is instantiatable.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Cc: Anthony Liguori <anthony@codemonkey.ws>
Introduce CPUClass::class_by_name and add a default implementation.
Hook up the alpha and ppc implementations.
Introduce a wrapper function cpu_class_by_name().
Signed-off-by: Andreas Färber <afaerber@suse.de>
The code that calculates the APIC ID will use smp_cores/smp_threads, so
just define them as 1 on *-user to avoid #ifdefs in the code.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
This will allow each architecture to define how the VCPU ID is set on
the KVM_CREATE_VCPU ioctl call.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
read_splashfile() passes the address of an int variable as size_t *
parameter to g_file_get_contents(), with a cast to gag the compiler.
No problem on machines where sizeof(size_t) == sizeof(int).
Happens to work on my x86_64 box (64 bit little endian): the least
significant 32 bits of the file size end up in the right place
(caller's variable file_size), and the most significant 32 bits
clobber a place that gets assigned to before its next use (caller's
variable file_type).
I'd expect it to break on a 64 bit big-endian box.
Fix up the variable types and drop the problematic cast.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
hbitmap_iter_init causes an out-of-bounds access when the "first"
argument is or greater than or equal to the size of the bitmap.
Forbid this with an assertion, and remove the failing testcase.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The block drivers need a special error code for "wrong format".
From the available error codes EMEDIUMTYPE fits best.
It is not available on all platforms, so a definition in
qemu-common.h and a specific error report are needed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This makes sense when the next commit starts using the extra buffer space
to perform many I/O operations asynchronously.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The desired granularity may be very different depending on the kind of
operation (e.g. continuous replication vs. collapse-to-raw) and whether
the VM is expected to perform lots of I/O while mirroring is in progress.
Allow the user to customize it, while providing a sane default so that
in general there will be no extra allocated space in the target compared
to the source.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is needed in the following patch.
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This actually uses the dirty bitmap in the block layer, and converts
mirroring to use an HBitmapIter.
Reviewed-by: Laszlo Ersek <lersek@redhat.com> (except block/mirror.c parts)
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
HBitmaps provides an array of bits. The bits are stored as usual in an
array of unsigned longs, but HBitmap is also optimized to provide fast
iteration over set bits; going from one bit to the next is O(logB n)
worst case, with B = sizeof(long) * CHAR_BIT: the result is low enough
that the number of levels is in fact fixed.
In order to do this, it stacks multiple bitmaps with progressively coarser
granularity; in all levels except the last, bit N is set iff the N-th
unsigned long is nonzero in the immediately next level. When iteration
completes on the last level it can examine the 2nd-last level to quickly
skip entire words, and even do so recursively to skip blocks of 64 words or
powers thereof (32 on 32-bit machines).
Given an index in the bitmap, it can be split in group of bits like
this (for the 64-bit case):
bits 0-57 => word in the last bitmap | bits 58-63 => bit in the word
bits 0-51 => word in the 2nd-last bitmap | bits 52-57 => bit in the word
bits 0-45 => word in the 3rd-last bitmap | bits 46-51 => bit in the word
So it is easy to move up simply by shifting the index right by
log2(BITS_PER_LONG) bits. To move down, you shift the index left
similarly, and add the word index within the group. Iteration uses
ffs (find first set bit) to find the next word to examine; this
operation can be done in constant time in most current architectures.
Setting or clearing a range of m bits on all levels, the work to perform
is O(m + m/W + m/W^2 + ...), which is O(m) like on a regular bitmap.
When iterating on a bitmap, each bit (on any level) is only visited
once. Hence, The total cost of visiting a bitmap with m bits in it is
the number of bits that are set in all bitmaps. Unless the bitmap is
extremely sparse, this is also O(m + m/W + m/W^2 + ...), so the amortized
cost of advancing from one bit to the next is usually constant.
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We can provide fast versions based on the other functions defined
by host-utils.h. Some care is required on glibc, which provides
ffsl already.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
# By Juan Quintela (7) and Paolo Bonzini (6)
# Via Juan Quintela
* quintela/thread.next:
migration: remove argument to qemu_savevm_state_cancel
migration: Only go to the iterate stage if there is anything to send
migration: unfold rest of migrate_fd_put_ready() into thread
migration: move exit condition to migration thread
migration: Add buffered_flush error handling
migration: move beginning stage to the migration thread
qemu-file: Only set last_error if it is not already set
migration: fix off-by-one in buffered_rate_limit
migration: remove double call to migrate_fd_close
migration: make function static
use XFER_LIMIT_RATIO consistently
Protect migration_bitmap_sync() with the ramlist lock
Unlock ramlist lock also in error case
# By Kevin Wolf (4) and others
# Via Stefan Hajnoczi
* stefanha/block:
dataplane: support viostor virtio-pci status bit setting
dataplane: avoid reentrancy during virtio_blk_data_plane_stop()
win32-aio: use iov utility functions instead of open-coding them
win32-aio: Fix memory leak
win32-aio: Fix vectored reads
aio: Fix return value of aio_poll()
ide: Remove wrong assertion
block: fix null-pointer bug on error case in block commit
s390x-linux-user now also uses GETPC. Instead of adding it to the list of
targets which use GETPC, the macro is now defined unconditionally.
This avoids future build regressions like this one:
CC s390x-linux-user/target-s390x/int_helper.o
cc1: warnings being treated as errors
qemu/target-s390x/int_helper.c: In function ‘helper_divs32’:
qemu/target-s390x/int_helper.c:47: error: implicit declaration of function ‘GETPC’
qemu/target-s390x/int_helper.c:47: error: nested extern declaration of ‘GETPC’
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Commit c64ca8140e (cpu: Move
queued_work_{first,last} to CPUState) moved the qemu_work_item fields
away. Clean up the now unused prototype.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Code mixes uint32_t, int and size_t. Very unlikely to go wrong in
practice, but clean it up anyway.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
# By Wenchao Xia
# Via Luiz Capitulino
* luiz/queue/qmp:
HMP: add sub command table to info
HMP: move define of mon_cmds
HMP: add infrastructure for sub command
HMP: delete info handler
HMP: add QDict to info callback handler
Add a documentation section "Methods" and discuss among others how to
handle overriding virtual methods.
Clarify DeviceClass::realize documentation and refer to the above.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Reviewed-by: Eric Blake <eblake@redhat.com>
This patch change all info call back function to take
additional QDict * parameter, which allow those command
take parameter. Now it is set to NULL at default case.
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
aio_poll() must return true if any work is still pending, even if it
didn't make progress, so that bdrv_drain_all() doesn't stop waiting too
early. The possibility of stopping early occasionally lead to a failed
assertion in bdrv_drain_all(), when some in-flight request was missed
and the function didn't really drain all requests.
In order to make that change, the return value as specified in the
function comment must change for blocking = false; fortunately, the
return value of blocking = false callers is only used in test cases, so
this change shouldn't cause any trouble.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
OpenBSD system compiler (gcc 4.2.1) has problems with concatenation
of macro arguments in macro functions:
CC aes.o
In file included from /src/qemu/include/qemu-common.h:126,
from /src/qemu/aes.c:30:
/src/qemu/include/qemu/bswap.h: In function 'leul_to_cpu':
/src/qemu/include/qemu/bswap.h:461: warning: implicit declaration of function 'bswapHOST_LONG_BITS'
/src/qemu/include/qemu/bswap.h:461: warning: nested extern declaration of 'bswapHOST_LONG_BITS'
Function leul_to_cpu() is only used in kvm-all.c, so the warnings
are not fatal on OpenBSD without -Werror.
Fix by applying glue(). Also add do {} while(0) wrapping and fix
semicolon use while at it.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
qemu_chr_new_from_opts handles QemuOpts release now, so callers don't
have to worry. It will either be saved in CharDriverState, then
released in qemu_chr_delete, or in the error case released instantly.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
A usage with a hardcoded partial path such as
object_resolve_path_component(obj, "foo")
is totally valid but currently leads to a compilation error. Fix this.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Any KVM-specific code that use these constants must check if
kvm_enabled() is true before using them.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Move the declaration to qemu/cpu.h and add documentation.
The implementation still depends on CPUArchState for CPU iteration.
Signed-off-by: Andreas Färber <afaerber@suse.de>