Commit Graph

43 Commits

Author SHA1 Message Date
Markus Armbruster 095ed5be7b posix-aio-compat: Plug memory leak on paio_init() error path
Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-11 12:49:52 -06:00
Frediano Ziglio e1d3b25499 block: avoid SIGUSR2
Now that iothread is always compiled sending a signal seems only an
additional step. This patch also avoid writing to two pipe (one from signal
and one in qemu_service_io).

Work with kvm enabled or disabled. strace output is more readable (less syscalls).

[ kwolf: Merged build fix by Paolo Bonzini ]

Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-20 14:32:56 +02:00
Frediano Ziglio 21cfa41e91 posix-aio-compat: Removed unused offset variable
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-20 12:27:43 +02:00
Avi Kivity e4ea78ee76 posix-aio-compat: fix latency issues
In certain circumstances, posix-aio-compat can incur a lot of latency:
 - threads are created by vcpu threads, so if vcpu affinity is set,
   aio threads inherit vcpu affinity.  This can cause many aio threads
   to compete for one cpu.
 - we can create up to max_threads (64) aio threads in one go; since a
   pthread_create can take around 30μs, we have up to 2ms of cpu time
   under a global lock.

Fix by:
 - moving thread creation to the main thread, so we inherit the main
   thread's affinity instead of the vcpu thread's affinity.
 - if a thread is currently being created, and we need to create yet
   another thread, let thread being born create the new thread, reducing
   the amount of time we spend under the main thread.
 - drop the local lock while creating a thread (we may still hold the
   global mutex, though)

Note this doesn't eliminate latency completely; scheduler artifacts or
lack of host cpu resources can still cause it.  We may want pre-allocated
threads when this cannot be tolerated.

Thanks to Uli Obergfell of Red Hat for his excellent analysis and suggestions.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Anthony Liguori 7267c0947d Use glib memory allocation and free functions
qemu_malloc/qemu_free no longer exist after this commit.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-20 23:01:08 -05:00
Kevin Wolf ba1d1afdfe posix-aio-compat: Allow read after EOF
In order to be able to transparently replace bdrv_read calls by bdrv_co_read,
reading beyond EOF must produce zeros instead of short reads for AIO, too.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-02 15:53:41 +02:00
Kevin Wolf 384acbf46b async: Remove AsyncContext
The purpose of AsyncContexts was to protect qcow and qcow2 against reentrancy
during an emulated bdrv_read/write (which includes a qemu_aio_wait() call and
can run AIO callbacks of different requests if it weren't for AsyncContexts).

Now both qcow and qcow2 are protected by CoMutexes and AsyncContexts can be
removed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-02 15:53:41 +02:00
Alexandre Raymond 9bf0960a9a Fix compilation warning due to missing header for sigaction (followup)
This patch removes all references to signal.h when qemu-common.h is included
as they become redundant.

Signed-off-by: Alexandre Raymond <cerbere@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-06-08 09:04:29 +01:00
Kevin Wolf 5be4aab701 posix-aio-compat: Fix idle_threads counter
A thread should only be counted as idle when it really is waiting for new
requests. Without this patch, sometimes too few threads are started as busy
threads are counted as idle.

Not sure if it makes a difference in practice outside some artificial
qemu-io/qemu-img tests, but I think the change makes sense in any case.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-05-18 14:38:45 +02:00
Stefan Hajnoczi ddca9fb2b5 trace: Trace posix-aio-compat.c completion and cancellation
This patch adds paio_complete() and paio_cancel() trace events to
complement the paio_submit() event.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-03-07 15:34:46 +00:00
Jes Sorensen dc786bc910 Move qemu_gettimeofday() to OS specific files
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-10-30 08:02:38 +00:00
Stefan Weil b0cd712cc3 Fix spelling in comments
multifuction -> multifunction
successfull -> successful.

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
2010-10-05 13:53:56 -05:00
Christoph Hellwig 72aef7318f use qemu_blockalign consistently
Use qemu_blockalign for all allocations in the block layer.  This allows
increasing the required alignment, which is need to support O_DIRECT on
devices with large block sizes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21 15:39:42 +02:00
Stefan Hajnoczi 6d519a5f95 trace: Trace virtio-blk, multiwrite, and paio_submit
This patch adds trace events that make it possible to observe
virtio-blk.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2010-09-09 16:22:45 -05:00
Andrew de Quincey 34cf008129 posix-aio-compat: Fix async_conmtext for ioctl
Set the async_context_id field when queuing an async ioctl call

Signed-off-by: Andrew de Quincey <adq@lidskialf.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-08-30 18:29:22 +02:00
Stefan Hajnoczi b587a52c00 posix-aio-compat: Expand tabs that have crept in
This patch expands tabs on a few lines so the code formats nicely and
follows the QEMU coding style.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28 13:14:26 +02:00
Kirill A. Shutemov 4817d32757 posix-aio-compat.c: fix warning with _FORTIFY_SOURCE
CC    posix-aio-compat.o
cc1: warnings being treated as errors
posix-aio-compat.c: In function 'aio_signal_handler':
posix-aio-compat.c:505: error: ignoring return value of 'write', declared with attribute warn_unused_result
make: *** [posix-aio-compat.o] Error 1

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26 14:59:19 -06:00
Kevin Wolf 6769da29c7 posix-aio-compat: Fix error check
Checking for nbytes < 0 is pointless as long as it's a size_t. If we want to
use negative numbers for error codes, we should use signed types.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03 15:26:00 -06:00
Kevin Wolf 40ff6d7e8d Don't leak file descriptors
We're leaking file descriptors to child processes. Set FD_CLOEXEC on file
descriptors that don't need to be passed to children to stop this misbehaviour.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03 11:45:50 -06:00
Kevin Wolf 1e5b9d2fcc Remove aio_ctx from paio_* interface
The context parameter in paio_submit isn't used anyway, so there is no reason
why block drivers should need to remember it. This also avoids passing a Linux
AIO context to paio_submit (which doesn't do any harm as long as the parameter
is unused, but it is highly confusing).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-30 08:39:34 -05:00
Kevin Wolf e5f37649c6 posix-aio-compat: Honour AsyncContext
Don't call callbacks that don't belong to the active AsyncContext.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-27 12:28:59 -05:00
Kevin Wolf 8febfa2684 Add qemu_aio_process_queue()
We'll leave some AIO completions unhandled when we can't call the callback.
qemu_aio_process_queue() is used later to run any callbacks that are left and
can be run then.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-27 12:28:59 -05:00
Kevin Wolf 59c7b155aa posix-aio-compat: Split out posix_aio_process_queue
We need to process the request queue and run callbacks separately from reading
out the queue in a later patch, so split it out.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-27 12:28:58 -05:00
malc ee3993069f posix-aio-compat: avoid signal race when spawning a thread
Signed-off-by: malc <av1474@comtv.ru>
2009-09-27 04:16:02 +04:00
Blue Swirl 72cf2d4f0e Fix sys-queue.h conflict for good
Problem: Our file sys-queue.h is a copy of the BSD file, but there are
some additions and it's not entirely compatible. Because of that, there have
been conflicts with system headers on BSD systems. Some hacks have been
introduced in the commits 15cc923584,
f40d753718,
96555a96d7 and
3990d09adf but the fixes were fragile.

Solution: Avoid the conflict entirely by renaming the functions and the
file. Revert the previous hacks.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-09-12 07:36:22 +00:00
Blue Swirl 47faadc676 Unbreak BSD: use qemu_fdatasync instead of fdatasync
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-09-12 06:19:14 +00:00
Christoph Hellwig b2e12bc6e3 block: add aio_flush operation
Instead stalling the VCPU while serving a cache flush try to do it
asynchronously.  Use our good old helper thread pool to issue an
asynchronous fdatasync for raw-posix.  Note that while Linux AIO
implements a fdatasync operation it is not useful for us because
it isn't actually implement in asynchronous fashion.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-09-11 10:19:46 -05:00
Christoph Hellwig 9ef91a6771 raw-posix: refactor AIO support
Currently the raw-posix.c code contains a lot of knowledge about the
asynchronous I/O scheme that is mostly implemented in posix-aio-compat.c.
All this code does not really belong here and is getting a bit in the
way of implementing native AIO on Linux.

So instead move all the guts of the AIO implementation into
posix-aio-compat.c (which might need a better name, btw).

There's now a very small interface between the AIO providers and raw-posix.c:

 - an init routine is called from raw_open_common to return an AIO context
   for this drive.  An AIO implementation may either re-use one context
   for all drives, or use a different one for each as the Linux native
   AIO support will do.
 - an submit routine is called from the aio_reav/writev methods to submit
   an AIO request

There are no indirect calls involved in this interface as we need to
decide which one to call manually.  We will only call the Linux AIO native
init function if we were requested to by vl.c, and we will only call
the native submit function if we are asked to and the request is properly
aligned.  That's also the reason why the alignment check actually does
the inverse move and now goes into raw-posix.c.

The old posix-aio-compat.h headers is removed now that most of it's
content is private to posix-aio-compat.c, and instead we add a new
block/raw-posix-aio.h headers is created containing only the tiny interface
between raw-posix.c and the AIO implementation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-08-27 20:30:22 -05:00
Juan Quintela 2341f9a1ad rename HAVE_PREADV to CONFIG_PREADV
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-27 14:09:20 -05:00
Christoph Hellwig e7d54ae83c fix asynchronous ioctls
posix_aio_read expect aio requests to return the number of bytes
requests to be successfull, so we need to fake this up for ioctls.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-08 15:51:19 -05:00
aliguori ceb42de899 native preadv/pwritev support (Christoph Hellwig)
This ties up the preadv/pwritev syscalls to qemu if they are declared in
unistd.h.  This is the case currently on at least NetBSD and OpenBSD and
will hopefully soon be the case on Linux.

Thanks to Blue Swirl and Gerd Hoffmann for the configure autodetection
of preadv/pwritev.


Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7021 c046a42c-6fe2-441c-8c8c-71466251a162
2009-04-07 18:43:28 +00:00
aliguori f141eafe28 push down vector linearization to posix-aio-compat.c (Christoph Hellwig)
Make all AIO requests vectored and defer linearization until the actual
I/O thread.  This prepares for using native preadv/pwritev.

Also enables asynchronous direct I/O by handling that case in the I/O thread.

Qcow and qcow2 propably want to be adopted to directly deal with multi-segment
requests, but that can be implemented later.


Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7020 c046a42c-6fe2-441c-8c8c-71466251a162
2009-04-07 18:43:24 +00:00
aliguori 221f715d90 new scsi-generic abstraction, use SG_IO (Christoph Hellwig)
Okay, I started looking into how to handle scsi-generic I/O in the
new world order.

I think the best is to use the SG_IO ioctl instead of the read/write
interface as that allows us to support scsi passthrough on disk/cdrom
devices, too.  See Hannes patch on the kvm list from August for an
example.

Now that we always do ioctls we don't need another abstraction than
bdrv_ioctl for the synchronous requests for now, and for asynchronous
requests I've added a aio_ioctl abstraction keeping it simple.

Long-term we might want to move the ops to a higher-level abstraction
and let the low-level code fill out the request header, but I'm lazy
enough to leave that to the people trying to support scsi-passthrough
on a non-Linux OS.

Tested lightly by issuing various sg_ commands from sg3-utils in a guest
to a host CDROM device.


Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6895 c046a42c-6fe2-441c-8c8c-71466251a162
2009-03-28 17:28:41 +00:00
malc 514f7a2774 Properly handle pthread_cond_timedwait timing out
pthread_cond_timedwait is allowed to both consume the signal and
return with the value indicating the timeout, hence predicate should
always be (re)checked before taking an action

git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6634 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-21 05:48:19 +00:00
malc a8227a5a20 Cosmetics
Avoid repeated creation/initalization/destruction of attr and calls to
getpid

git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6633 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-21 05:48:17 +00:00
malc 5d47e3728b Avoid thundering herd problem
Broadcast was used so that the I/O threads would wakeup, reset their
ts values and all but one go to sleep, in other words an optimization
to prevent threads from exiting in presence of continuing I/O
activity. Spurious wakeups make the looping around cond_timedwait with
ever reinitialized ts potentially unsafe and as such ts in no longer
reinitilized inside the loop, hence switch to signal is warranted and
this benefits of this particlaur optimization are lost.

(It's worth noting that timed variants of pthread calls use realtime
clock by default, and therefore can hang "forever" should the host
time be changed. Unfortunatelly not all host systems QEMU runs on
support CLOCK_MONOTONIC and/or pthread_condattr_setclock with this
value)

git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6632 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-21 05:48:15 +00:00
malc 30525aff78 Avoid infinite loop around timed condition variable
This can happen due to spurious wakeups

git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6631 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-21 05:48:13 +00:00
malc 8653c0158c Error checking
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6630 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-21 05:48:11 +00:00
blueswir1 55f11ca3c2 Rename sigev_signo to avoid FreeBSD problems (Juergen Lock)
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6414 c046a42c-6fe2-441c-8c8c-71466251a162
2009-01-24 11:54:21 +00:00
blueswir1 c9db92fcc1 Use kill instead of sigqueue: re-enables AIO on OpenBSD
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6360 c046a42c-6fe2-441c-8c8c-71466251a162
2009-01-17 06:49:15 +00:00
aliguori f094a78220 Fix race in POSIX AIO emulation (Jan Kiszka)
When we cancel an AIO request that is already being processed by
aio_thread, qemu_paio_cancel should return QEMU_PAIO_NOTCANCELED as long
as aio_thread isn't done with this request. But as the latter currently
updates aiocb->ret after every block of the request, we may report
QEMU_PAIO_ALLDONE too early.

Futhermore, in case some zero-length request should have been queued,
aiocb->ret is never set to != -EINPROGRESS and callers like
raw_aio_cancel could get stuck in an endless loop.

Fix those issues by updating aiocb->ret _after_ the request has been
fully processed. This also simplifies the locking.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6278 c046a42c-6fe2-441c-8c8c-71466251a162
2009-01-13 15:13:53 +00:00
blueswir1 1d6198c3b0 Remove unnecessary trailing newlines
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6000 c046a42c-6fe2-441c-8c8c-71466251a162
2008-12-13 09:32:43 +00:00
aliguori 3c529d9359 Replace posix-aio with custom thread pool
glibc implements posix-aio as a thread pool and imposes a number of limitations.

1) it limits one request per-file descriptor.  we hack around this by dup()'ing
file descriptors which is hideously ugly

2) it's impossible to add new interfaces and we need a vectored read/write
operation to properly support a zero-copy API.

What has been suggested to me by glibc folks, is to implement whatever new
interfaces we want and then it can eventually be proposed for standardization.
This requires that we implement our own posix-aio implementation though.

This patch implements posix-aio using pthreads.  It immediately eliminates the
need for fd pooling.

It performs at least as well as the current posix-aio code (in some
circumstances, even better).

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5996 c046a42c-6fe2-441c-8c8c-71466251a162
2008-12-12 16:41:40 +00:00