Commit Graph

38 Commits

Author SHA1 Message Date
Michael S. Tsirkin 89c473fd82 virtio-pci: fix bus master work around on load
Commit c81131db15
detects old guests by comparing virtio and
PCI status. It attempts to do this on load,
as well, but load_config callback in a binding
is invoked too early and so the virtio status
isn't set yet.

We could add yet another callback to the
binding, to invoke after load, but it
seems easier to reuse the existing vmstate
callback.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
2011-03-28 18:34:23 +02:00
Amit Shah 6b331efb73 virtio-serial: Use a struct to pass config information from proxy
Instead of using a single variable to pass to the virtio_serial_init
function, use a struct so that expanding the number of variables to be
passed on later is easier.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
2011-03-21 16:55:11 +05:30
mst@redhat.com 5430a28fe4 vhost: force vhost off for non-MSI guests
When MSI is off, each interrupt needs to be bounced through the io
thread when it's set/cleared, so vhost-net causes more context switches and
higher CPU utilization than userspace virtio which handles networking in
the same thread.

We'll need to fix this by adding level irq support in kvm irqfd,
for now disable vhost-net in these configurations.

Added a vhostforce flag to force vhost-net back on.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-02-01 16:50:44 -06:00
Stefan Hajnoczi 25db9ebe15 virtio-pci: Use ioeventfd for virtqueue notify
Virtqueue notify is currently handled synchronously in userspace virtio.  This
prevents the vcpu from executing guest code while hardware emulation code
handles the notify.

On systems that support KVM, the ioeventfd mechanism can be used to make
virtqueue notify a lightweight exit by deferring hardware emulation to the
iothread and allowing the VM to continue execution.  This model is similar to
how vhost receives virtqueue notifies.

The result of this change is improved performance for userspace virtio devices.
Virtio-blk throughput increases especially for multithreaded scenarios and
virtio-net transmit throughput increases substantially.

Some virtio devices are known to have guest drivers which expect a notify to be
processed synchronously and spin waiting for completion.
For virtio-net, this also seems to interact with the guest stack in strange
ways so that TCP throughput for small message sizes (~200bytes)
is harmed. Only enable ioeventfd for virtio-blk for now.

Care must be taken not to interfere with vhost-net, which uses host
notifiers.  If the set_host_notifier() API is used by a device
virtio-pci will disable virtio-ioeventfd and let the device deal with
host notifiers as it wishes.

Finally, there used to be a limit of 6 KVM io bus devices inside the
kernel.  On such a kernel, don't use ioeventfd for virtqueue host
notification since the limit is reached too easily.  This ensures that
existing vhost-net setups (which always use ioeventfd) have ioeventfds
available so they can continue to work.

After migration and on VM change state (running/paused) virtio-ioeventfd
will enable/disable itself.

 * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd
 * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd
 * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd
 * vm_change_state(running=0) -> disable virtio-ioeventfd
 * vm_change_state(running=1) -> enable virtio-ioeventfd

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-01-10 14:44:16 +02:00
Michael S. Tsirkin 85cf2a8d74 virtio: move vmstate change tracking to core
Move tracking vmstate change from virtio-net to virtio.c
as it is going to be used by virito-blk and virtio-pci
for the ioeventfd support.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-01-10 14:44:07 +02:00
Michael S. Tsirkin 54dd932128 virtio: change set guest notifier to per-device
When using irqfd with vhost-net to inject interrupts,
a single evenfd might inject multiple interrupts.
Implementing this is much easier with a single
per-device callback to set guest notifiers.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-10-07 12:19:47 +02:00
Anthony Liguori aab2e8f79a Merge remote branch 'kwolf/for-anthony' into staging 2010-09-08 14:26:57 -05:00
Alex Williamson f0c07c7c7b virtio-net: Make tx_timer timeout configurable
Add an option to make the TX mitigation timer adjustable as a device
option.  The 150us hard coded default used currently is reasonable,
but may not be suitable for all workloads, this gives us a way to
adjust it using a single binary.  We can't support any random option
though, so use the "x-" prefix to indicate this is a developer
option.  Usage:

-device virtio-net-pci,x-txtimer=500000,... # .5ms timeout

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-09-07 20:29:24 +03:00
Kevin Wolf 42fb2e0720 virtio: Factor virtqueue_map_sg out
Separate the mapping of requests to host memory from the descriptor iteration.
The next patch will make use of it in a different context.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-08-30 18:29:19 +02:00
Amit Shah 8b53a86577 virtio-serial: Cleanup on device hot-unplug
Free malloc'ed memory, unregister from savevm and clean up virtio-common
bits on device hot-unplug.

This was found performing a migration after device hot-unplug.

Reported-by: <lihuang@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-08-22 16:19:00 -05:00
Alex Williamson 9d0d313859 virtio-blk: Create exit function to unregister savevm
Otherwise we can't migrate after we've removed a virtio block device.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-26 13:39:39 +02:00
Anthony Liguori 9f10751365 virtio-9p: Add a virtio 9p device to qemu
This patch doesn't implement the 9p protocol handling
code. It adds a simple device which dump the protocol data.

[jvrao@linux.vnet.ibm.com: Little-Endian to host format conversion]
[aneesh.kumar@linux.vnet.ibm.com: Multiple-mounts support]

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-05-03 12:17:37 -05:00
Michael S. Tsirkin 2be24aaafe virtio: move typedef to qemu-common
make it possible to use type without header include,
simplifying header dependencies.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-04-01 13:56:43 -05:00
Michael S. Tsirkin 3e607cb503 virtio: add set_status callback
vhost net backend needs to be notified when
frontend status changes. Add a callback,
similar to set_features.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-04-01 13:56:43 -05:00
Michael S. Tsirkin 1cbdabe203 virtio: notifier support + APIs for queue fields
vhost needs physical addresses for ring and other queue fields,
so add APIs for these. In particular, add binding API to set
host/guest notifiers.  Will be used by vhost.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-04-01 13:56:43 -05:00
Christoph Hellwig 428c149b0b block: add topology qdev properties
Add three new qdev properties to export block topology information to
the guest.  This is needed to get optimal I/O alignment for RAID arrays
or SSDs.

The options are:

 - physical_block_size to specify the physical block size of the device,
   this is going to increase from 512 bytes to 4096 kilobytes for many
   modern storage devices
 - min_io_size to specify the minimal I/O size without performance impact,
   this is typically set to the RAID chunk size for arrays.
 - opt_io_size to specify the optimal sustained I/O size, this is
   typically the RAID stripe width for arrays.

I decided to not auto-probe these values from blkid which might easily
be possible as I don't know how to deal with these issues on migration.

Note that we specificly only set the physical_block_size, and not the
logial one which is the unit all I/O is described in.  The reason for
that is that IDE does not support increasing the logical block size and
at last for now I want to stick to one meachnisms in queue and allow
for easy switching of transports for a given backing image which would
not be possible if scsi and virtio use real 4k sectors, while ide only
uses the physical block exponent.

To make this more common for the different block drivers introduce a
new BlockConf structure holding all common block properties and a
DEFINE_BLOCK_PROPERTIES macro to add them all together, mirroring
what is done for network drivers.  Also switch over all block drivers
to use it, except for the floppy driver which has weird driveA/driveB
properties and probably won't require any advanced block options ever.

Example usage for a virtio device with 4k physical block size and
8k optimal I/O size:

  -drive file=scratch.img,media=disk,cache=none,id=scratch \
  -device virtio-blk-pci,drive=scratch,physical_block_size=4096,opt_io_size=8192

aliguori: updated patch to take into account BLOCK events

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-10 16:53:25 -06:00
Amit Shah 98b19252cf virtio-console: qdev conversion, new virtio-serial-bus
This commit converts the virtio-console device to create a new
virtio-serial bus that can host console and generic serial ports. The
file hosting this code is now called virtio-serial-bus.c.

The virtio console is now a very simple qdev device that sits on the
virtio-serial-bus and communicates between the bus and qemu's chardevs.

This commit also includes a few changes to the virtio backing code for
pci and s390 to spawn the virtio-serial bus.

As a result of the qdev conversion, we get rid of a lot of legacy code.
The old-style way of instantiating a virtio console using

    -virtioconsole ...

is maintained, but the new, preferred way is to use

    -device virtio-serial -device virtconsole,chardev=...

With this commit, multiple devices as well as multiple ports with a
single device can be supported.

For multiple ports support, each port gets an IO vq pair. Since the
guest needs to know in advance how many vqs a particular device will
need, we have to set this number as a property of the virtio-serial
device and also as a config option.

In addition, we also spawn a pair of control IO vqs. This is an internal
channel meant for guest-host communication for things like port
open/close, sending port properties over to the guest, etc.

This commit is a part of a series of other commits to get the full
implementation of multiport support. Future commits will add other
support as well as ride on the savevm version that we bump up here.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-20 08:25:23 -06:00
Amit Shah bb61564c77 virtio: Remove duplicate macro definition for max. virtqueues, bump up the max
VIRTIO_PCI_QUEUE_MAX is redefined in hw/virtio.c. Let's just keep it in
hw/virtio.h.

Also, bump up the value of the maximum allowed virtqueues to 64. This is
in preparation to allow multiple ports per virtio-console device.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-20 08:25:23 -06:00
Michael S. Tsirkin 8172539d21 virtio: add features as qdev properties
Add feature bits as properties to virtio. This makes it possible to e.g. define
machine without indirect buffer support, which is required for 0.10
compatibility, or without hardware checksum support, which is required for 0.11
compatibility.  Since default values for optional features are now set by qdev,
get_features callback has been modified: it sets non-optional bits, and clears
bits not supported by host.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-11 13:40:59 -06:00
Michael S. Tsirkin 704a76fcd2 virtio: rename features -> guest_features
Rename features->guest_features. This is
what they are, avoid confusion with
host features which we also need to keep around.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-11 13:40:59 -06:00
Michael S. Tsirkin 6d74ca5aa8 virtio: verify features on load
migrating between hosts which have different features
might break silently, if the migration destination
does not support some features supported by source.

Prevent this from happening by comparing acked feature
bits with the mask supported by the device.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-12 07:59:38 -06:00
Gerd Hoffmann 97b156213e virtio: use qdev properties for configuration.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-27 12:28:40 -05:00
Anthony Liguori c227f0995e Revert "Get rid of _t suffix"
In the very least, a change like this requires discussion on the list.

The naming convention is goofy and it causes a massive merge problem.  Something
like this _must_ be presented on the list first so people can provide input
and cope with it.

This reverts commit 99a0949b72.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-01 16:12:16 -05:00
malc 99a0949b72 Get rid of _t suffix
Some not so obvious bits, slirp and Xen were left alone for the time
being.

Signed-off-by: malc <av1474@comtv.ru>
2009-10-01 22:45:02 +04:00
Stefan Weil d89c682f20 Suppress some variants of English in comments
Replace surpress, supress by suppress.

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2009-09-25 16:31:35 +02:00
Gerd Hoffmann d176c495b6 qdev-ify virtio-blk.
First user of the new drive property.  With this patch applied host
and guest config can be specified separately, like this:

  -drive if=none,id=disk1,file=/path/to/disk.img
  -device virtio-blk-pci,drive=disk1

You can set any property for virtio-blk-pci now.  You can set the pci
address via addr=.  You can switch the device into 0.10 compat mode
using class=0x0180.  As this is per device you can have one 0.10 and one
0.11 virtio block device in a single virtual machine.

Old syntax continues to work.  Internally it does the same as the two
lines above though.  One side effect this has is a different
initialization order, which might result in a different pci address
being assigned by default.

Long term plan here is to have this working for all block devices, i.e.
once all scsi is properly qdev-ified you will be able to do something
like this:

  -drive if=none,id=sda,file=/path/to/disk.img
  -device lsi,id=lsi,addr=<pciaddr>
  -device scsi-disk,drive=sda,bus=lsi.0,lun=<n>

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Message-Id:
2009-08-10 13:05:28 -05:00
Michael S. Tsirkin ff24bd589c qemu/virtio: virtio save/load bindings
Implement bindings for virtio save/load. Use them in virtio pci.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-24 09:09:15 -05:00
Michael S. Tsirkin 7055e687cd qemu/virtio: virtio support for many interrupt vectors
Extend virtio to support many interrupt vectors, and rearrange code in
preparation for multi-vector support (mostly move reset out to bindings,
because we will have to reset the vectors in transport-specific code).
Actual bindings in pci, and use in net, to follow.
Load and save are not connected to bindings yet, so they are left
stubbed out for now.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-24 09:09:14 -05:00
Mark McLoughlin efeea6d048 virtio: add support for indirect ring entries
Support a new feature flag for indirect ring entries. These are ring
entries which point to a table of buffer descriptors.

The idea here is to increase the ring capacity by allowing a larger
effective ring size whereby the ring size dictates the number of
requests that may be outstanding, rather than the size of those
requests.

This should be most effective in the case of block I/O where we can
potentially benefit by concurrently dispatching a large number of
large requests. Even in the simple case of single segment block
requests, this results in a threefold increase in ring capacity.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-22 10:10:50 -05:00
Paul Brook 53c25cea7d Separate virtio PCI code
Split the PCI host bindings from the VRing transport implementation.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2009-05-18 18:26:33 +01:00
Paul Brook cf21e106cd Virtio-net qdev conversion
Signed-off-by: Paul Brook <paul@codesourcery.com>
2009-05-14 22:35:07 +01:00
aliguori b946a15332 Introduce VLANClientState::cleanup() (Mark McLoughlin)
We're currently leaking memory and file descriptors on device
hot-unplug.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7150 c046a42c-6fe2-441c-8c8c-71466251a162
2009-04-17 17:11:08 +00:00
aliguori 8eca6b1bc7 Fix oops on 2.6.25 guest (Rusty Russell)
I believe this is behind the following:
https://bugs.edge.launchpad.net/ubuntu/jaunty/+source/linux/+bug/331128

virtio_pci in 2.6.25 didn't do feature negotiation correctly: it acked every
bit.  Fortunately, we can detect this.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6975 c046a42c-6fe2-441c-8c8c-71466251a162
2009-04-05 17:40:08 +00:00
blueswir1 173a543b36 Add and use #defines for PCI device classes
This patch adds and uses #defines for PCI device classes and subclases,
using a new pci_config_set_class() function, similar to the recently
added pci_config_set_vendor_id() and pci_config_set_device_id().

Change since v1: fixed compilation of hw/sun4u.c

Signed-off-by: Stuart Brady <stuart.brady@gmail.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6491 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-01 19:26:20 +00:00
aliguori bf9298b90e Make struct iovec universally available
Vectored IO APIs will require some sort of vector argument.  It makes sense to
use struct iovec and just define it globally for Windows.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5889 c046a42c-6fe2-441c-8c8c-71466251a162
2008-12-05 20:05:26 +00:00
aliguori bb6834cfae Fix windows build after virtio changes
Windows does not have sys/uio.h and does not have err.h.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5877 c046a42c-6fe2-441c-8c8c-71466251a162
2008-12-04 21:28:28 +00:00
aliguori f46f15bca7 Remove TARGET_PAGE_SIZE from virtio interface (Hollis Blanchard)
TARGET_PAGE_SIZE should only be used internal to qemu, not in guest/host
interfaces. The virtio frontend code in Linux uses two constants (PFN shift
and vring alignment) for the interface, so update qemu to match.

I've tested this with PowerPC KVM and confirmed that it fixes virtio problems
when using non-TARGET_PAGE_SIZE pages in the guest.

Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5871 c046a42c-6fe2-441c-8c8c-71466251a162
2008-12-04 19:58:45 +00:00
aliguori 967f97fa00 Virtio core support
This patch adds core support for VirtIO.  VirtIO is a paravirtualization
framework that has been in Linux since 2.6.21.  A PCI transport has been
available since 2.6.25.  Network drivers are also available for Windows.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5869 c046a42c-6fe2-441c-8c8c-71466251a162
2008-12-04 19:38:57 +00:00