Commit Graph

61331 Commits

Author SHA1 Message Date
Peter Maydell 75578d6fce linux-user: Assert on bad type in thunk_type_align() and thunk_type_size()
In thunk_type_align() and thunk_type_size() we currently return
-1 if the value at the type_ptr isn't one of the TYPE_* values
we understand. However, this should never happen, and if it does
then the calling code will go confusingly wrong because none
of the callsites try to handle an error return. Switch to an
assertion instead, so that if this does somehow happen we'll have
a nice clear backtrace of what happened rather than a weird crash
or misbehaviour.

This also silences various Coverity complaints about not handling
the negative return value (CID 1005735, 1005736, 1005738, 1390582).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180514174616.19601-1-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-05-24 20:46:54 +02:00
Peter Maydell 62b9b076d9 vga: catch depth 0
hw/display: add new bochs-display device
 some cleanups.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJbBt4qAAoJEEy22O7T6HE49zIQAJpmAAqJidE14l6mcnxzxqoS
 yC0VyXFqEbwueTeNd4VIjrJRkOHk8auWCes+QKoR1ZfbEEsLDzD8CF4tKbSj8Y41
 0Poe+vYrOQQazkxWSB08xBE6mdTtXTkDC4zkmNSDxb5LIBGBa1A/juPksaXawFDK
 pfDaCYZdO7RgnU2EgoykDteHMnzKeCO9fMPQvdVuen4DDqeIlXyYAUDEYDag5gJy
 DhZqDDz31m7g9JYLOVSGW0Qd2uZXvw55A3pnDNiCyyDKJl0xLeJOBif714J2GI0r
 wDLDY9sSjx78d/qmNtX+X4lvXW7GWGDw228VjW2XBKqzligg9rM/8h6aIPbqqT5N
 XWdOT6Qx4lSbcsbinSWVmWu9XDb7NXKSu3oc/tTC/ImJTuBPUkaGuMTmd5kCe5Mk
 VG5ZY8ow9pQ9THU1pFW6CA/MRWWfm5IPxkJBiT4sJb8i5aSPxTSBEZjMfVdCvh78
 av2g5NB4IYCOHDDwAOra+NDKULuDFRBYFKyb4Ge52zKe2UB3sKnxlIWeOGS88C6D
 gH/j+02eO4Q8EWauUUFUJNW3TGUirlcl5oxvyZjhEbjsh+2RasOT1ULW54IIdpnU
 ijhHfnRcMNG4iNq1HA49tjHq3QraGQ9zZF3wOSrwyq47uU2Rjp+HS5TnZuzz3Mi4
 VEjZXPmVd9R9++c3E+o1
 =FJjD
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kraxel/tags/vga-20180524-pull-request' into staging

vga: catch depth 0
hw/display: add new bochs-display device
some cleanups.

# gpg: Signature made Thu 24 May 2018 16:45:46 BST
# gpg:                using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/vga-20180524-pull-request:
  MAINTAINERS: add vga entries
  bochs-display: add pcie support
  bochs-display: add dirty tracking support
  hw/display: add new bochs-display device
  vga-pci: use PCI_VGA_MMIO_SIZE
  vga: move bochs vbe defines to header file
  vga: catch depth 0

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-05-24 17:48:01 +01:00
Peter Maydell 45eabb2ede pc, pci, virtio, vhost: fixes, features
Beginning of merging vDPA, new PCI ID, a new virtio balloon stat, intel
 iommu rework fixing a couple of security problems (no CVEs yet), fixes
 all over the place.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbBX2cAAoJECgfDbjSjVRpOEYIAIR6KGkwbAJ9SnO9B71DQHl1
 yYYgM7i2HwyZ1YPnXOYWnI1lzQ1bARTf2krQJFGmfjlDaueFf9KnXdNByoVCmG8m
 UhF/rQp3DcJ4wTABktPtME8gWdQxKPmDxlN5W3f29Zrm3g9S+Hshi+sfPZUkBxL4
 gQMFRctb2SxvQXG+lusHVwo1oF6pzGZMmX35906he3m4xS/cfoeCP7Qj6nSvHZq7
 lsLoOeYxHtXWA9gTYxpd7zW+hhUxkspoOqcXySHfO7e5enJANaulTxKuC0T+6HL4
 O2iUM+1wjUYE0tQcNJ6x7emA82k5OdG2OMD6gbR1oSdquttJo7+4R+goqpb44rc=
 =NUoY
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc, pci, virtio, vhost: fixes, features

Beginning of merging vDPA, new PCI ID, a new virtio balloon stat, intel
iommu rework fixing a couple of security problems (no CVEs yet), fixes
all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Wed 23 May 2018 15:41:32 BST
# gpg:                using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (28 commits)
  intel-iommu: rework the page walk logic
  util: implement simple iova tree
  intel-iommu: trace domain id during page walk
  intel-iommu: pass in address space when page walk
  intel-iommu: introduce vtd_page_walk_info
  intel-iommu: only do page walk for MAP notifiers
  intel-iommu: add iommu lock
  intel-iommu: remove IntelIOMMUNotifierNode
  intel-iommu: send PSI always even if across PDEs
  nvdimm: fix typo in label-size definition
  contrib/vhost-user-blk: enable protocol feature for vhost-user-blk
  hw/virtio: Fix brace Werror with clang 6.0.0
  libvhost-user: Send messages with no data
  vhost-user+postcopy: Use qemu_set_nonblock
  virtio: support setting memory region based host notifier
  vhost-user: support receiving file descriptors in slave_read
  vhost-user: add Net prefix to internal state structure
  linux-headers: add kvm header for mips
  linux-headers: add unistd.h on all arches
  update-linux-headers.sh: unistd.h, kvm consistency
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-05-24 14:22:23 +01:00
Peter Maydell 37cbe4da61 Block layer patches:
- Generic background jobs
 - qemu-iotests fixes for NFS and the 'migration' group
 - sheepdog: Minor code simplification
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbBV+tAAoJEH8JsnLIjy/WiBoP/0C7IObZ6/yOedSFsCADdt1U
 Kv2So7V7OBC8QeDhJqB9x5GZ4XieBWHqCR7yFKG3/piMuv1+HRiuA4aNiW1mwreU
 VqI6Ls4pZ8jEWYl6p0+enhA2cQVEDJjICC7n6xw8rhu1OaMdXwzNXBjXVMJHmMCg
 m6KixQXB1JCPuGZmoNE6wE34gsCiWTKrbt6T83RMcacJaVBcumN6RyzkFeLZ2jef
 XXNXYGrZ2/UBGJOBxpLjbZvgYehMcUd20HZ0jNFtiA9+APBVoc6iFAXseT9tbedm
 QBdOw8ESshTKOfpNxLEDrQomCpCAmGf5q3z7omikFk3VBaap1cE5ZCA/ylvb/sxp
 ZoigX/VfZ8auol+y/kLS4SAGKbnyg6qT7LX1G1op+Fwewo+mMTpBeNvDGQIIGR33
 cCRl7YBpHvtN2T4GzmGhPKcWqNpykqmNgW/ZMW1BG1NeR8zkdEtvH4xESxAYkaXm
 a7nFCaxAQlU1l0EV9jVKSnS07/8eKCXStGatBBMYn4VjRYZSVaWW1J5WW16nJpip
 eDfv3QQPujAsSAwC1V+4WyrcqBx1Ya2Accic2rbpiLFHAsEFl+UMhiw2ken1x85W
 +ZOyAMXXRPVsYMPAOCZIemYONAons2GSJsJvIZC9ln4yjAFNuGNTvTDgoiZfYxBC
 P9eAY6T905Y2f7bMYV9y
 =cNlf
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches:

- Generic background jobs
- qemu-iotests fixes for NFS and the 'migration' group
- sheepdog: Minor code simplification

# gpg: Signature made Wed 23 May 2018 13:33:49 BST
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (46 commits)
  qemu-iotests: Test job-* with block jobs
  iotests: Move qmp_to_opts() to VM
  blockjob: Remove BlockJob.driver
  job: Add query-jobs QMP command
  job: Add lifecycle QMP commands
  job: Add JOB_STATUS_CHANGE QMP event
  job: Introduce qapi/job.json
  job: Move progress fields to Job
  job: Add job_transition_to_ready()
  job: Add job_is_ready()
  job: Add job_dismiss()
  job: Add job_yield()
  block: Cancel job in bdrv_close_all() callers
  job: Move completion and cancellation to Job
  job: Move transactions to Job
  job: Switch transactions to JobTxn
  job: Move job_finish_sync() to Job
  job: Move .complete callback to Job
  job: Add job_drain()
  job: Convert block_job_cancel_async() to Job
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-05-24 13:24:22 +01:00
Peter Maydell 5ff2a4b97f Xen 2018/05/22
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJbBGT2AAoJEIlPj0hw4a6Q0BIQAIciEH80ZLPSQPJciRoBCB/s
 1BUsbFelfeHcTXAOxiahmaQbzzffosifK0uJAH5Quy5RotnKknguRgfgAl8nWX34
 vXBbYKOmdudNwk3wIhcnnLfiRvNcdF4yDVVArsnvEbCWUBV2mpTrhuqmepZMiY7s
 GQU42ly09WQePBPFW19RS/kxsaR1lSuGIuSxtc+0QvDQpqMRK3HslHrjD0jgUISh
 pitqL3ztPHpJN1d9KI9fNT066magjnAkWV1gSYH9MAFvWV6JVuXqyJPY/x+YcZEU
 uPgZdEmONIoFqLD7yEHf/kXtCoRZavnvX3t0OPvUOPrN8eeui/UrtOyElY0NxmS/
 RGKggOAjN4VYdAfXHqLIWHkkE8CicFkukr6hk2nyfDKT63ZTR6TfEESrweL3clej
 J2dQ1bPvmAhlPU7tFeo8drAFgeIjLl9QAXEtp2+1FXGYBOT4avwtnt4rrGpnSbpL
 wV6SObd4d8yJR1JrA42jXuL2qlubainlceg4dSFGf84iUx68Sq+DXM+ssbVxs4Tc
 VEJ2nIGK2OF0jafFUV4sqZ7iDQoor0A/SGiOHk8j3kr/kWbidid6opl+MRcr/II3
 CYHyR8IH3LLY/CY18ppVbuCNxovPTilwRsleNy6ZxM1uFeWIOGWUy0WutPTi8wq8
 Ksa1MVPftDaMOLL2WE5C
 =e+YI
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/sstabellini-http/tags/xen-20180522-tag' into staging

Xen 2018/05/22

# gpg: Signature made Tue 22 May 2018 19:44:06 BST
# gpg:                using RSA key 894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
# gpg:                 aka "Stefano Stabellini <sstabellini@kernel.org>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3  0AEA 894F 8F48 70E1 AE90

* remotes/sstabellini-http/tags/xen-20180522-tag:
  xen_disk: be consistent with use of xendev and blkdev->xendev
  xen_disk: use a single entry iovec
  xen_backend: make the xen_feature_grant_copy flag private
  xen_disk: remove use of grant map/unmap
  xen_backend: add an emulation of grant copy
  xen: remove other open-coded use of libxengnttab
  xen_disk: remove open-coded use of libxengnttab
  xen_backend: add grant table helpers
  xen: add a meaningful declaration of grant_copy_segment into xen_common.h
  checkpatch: generalize xen handle matching in the list of types
  xen-hvm: create separate function for ioreq server initialization
  xen_pt: Present the size of 64 bit BARs correctly
  configure: Add explanation for --enable-xen-pci-passthrough
  xen/pt: use address_space_memory object for memory region hooks
  xen-pvdevice: Introduce a simplistic xen-pvdevice save state

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-05-24 11:30:59 +01:00
Peter Maydell 9cac60db45 target/lm32: BQL patch
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEIZA+SEU3p8KQzj6ytFirsNjTeOMFAlsEYJoACgkQtFirsNjT
 eONBxRAAin6dSNGmN38I1znn/yBaxFnK+jEeEJgQX1WKS29str1UCGsnxT6Y47Hk
 hptrXTsS1uQGC0cr8d4TCxi2lckW87T73eANX52Y7JWf1KPHmLlFsZ4CezdkbhT6
 lANubq9HMM8SMCQGnIlz+NO4hDn/+peGWRH+2ehy7eMKHRY+vyVLAaUqnWgu17kn
 y/p4CPnT/89QwVQotistJLPayTCQ1Emmq6B1wUs4y2MsLDmHBti1pYPolUb1vpKi
 TM98Ah/U9UYxIn6uxGAuzO46Bkge5CNZZ/ev9o5OoJI7g9mXcn9e1fhbAyrsh/uR
 GABEou6j4m5VgRBzH5KAGTO8FLkx/8q24eD8IcVqFRnKZfpPv7EGqoqYyE3HhX62
 JMOXDXdKZjaHSG1J8ndM8FdmJg1NyD2XJ4yR8kTQUHUYayfZblnG/yiEw2CF+zdU
 prpQctGIqC1FeCGx/FF7tCl4/C1TH1ekWVv8qv4ZlKXdwebSqr9+YuvQFdKBZzFd
 g7f1A91nVITf8BfEjKji8+gA2yBMMHsbd8wiQLHjaNhC3wCUiPvCvP5h/f+3WzFA
 o5tx2/C6WlEQ9cJXT7X9cVnYhC55JukP+ZTk2Q7lXNX+4Yn6ge0NCeWL0sRQD1j7
 xlw9swt6GXv354LzK/2MgxPvkZ5EVUWazysGPfFpw34fojjt72w=
 =M2dr
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mwalle/tags/lm32-queue/20180521' into staging

target/lm32: BQL patch

# gpg: Signature made Tue 22 May 2018 19:25:30 BST
# gpg:                using RSA key B458ABB0D8D378E3
# gpg: Good signature from "Michael Walle <michael@walle.cc>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2190 3E48 4537 A7C2 90CE  3EB2 B458 ABB0 D8D3 78E3

* remotes/mwalle/tags/lm32-queue/20180521:
  lm32: take BQL before writing IP/IM register

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-05-24 10:25:43 +01:00
Gerd Hoffmann dbb2e4726c MAINTAINERS: add vga entries
Add entries for standard vga, virtio-gpu and cirrus.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180522165058.15404-7-kraxel@redhat.com
2018-05-24 10:42:13 +02:00
Gerd Hoffmann f2581064a6 bochs-display: add pcie support
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180522165058.15404-6-kraxel@redhat.com
2018-05-24 10:42:13 +02:00
Gerd Hoffmann 33ebad5405 bochs-display: add dirty tracking support
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180522165058.15404-5-kraxel@redhat.com
2018-05-24 10:42:13 +02:00
Gerd Hoffmann 765c942908 hw/display: add new bochs-display device
After writing up the virtual mdev device emulating a display supporting
the bochs vbe dispi interface (mbochs.ko) and seeing how simple it
actually is I've figured that would be useful for qemu too.

So, here it is, -device bochs-display.  It is basically -device VGA
without legacy vga emulation.  PCI bar 0 is the framebuffer, PCI bar 2
is mmio with the registers.  The vga registers are simply not there
though, neither in the legacy ioport location nor in the mmio bar.
Consequently it is PCI class DISPLAY_OTHER not DISPLAY_VGA.

So there is no text mode emulation, no weird video modes (planar,
256color palette), no memory window at 0xa0000.  Just a linear
framebuffer in the pci memory bar.  And the amount of code to emulate
this (and therefore the attack surface) is an order of magnitude smaller
when compared to vga emulation.

Compatibility wise it works with OVMF (latest git master).
The bochs-drm.ko linux kernel module can handle it just fine too.
So UEFI guests should not see any functional difference to VGA.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180522165058.15404-4-kraxel@redhat.com
2018-05-24 10:42:13 +02:00
Gerd Hoffmann 83ff909f93 vga-pci: use PCI_VGA_MMIO_SIZE
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180522165058.15404-3-kraxel@redhat.com
2018-05-24 10:42:13 +02:00
Gerd Hoffmann a3ee49f075 vga: move bochs vbe defines to header file
Create a new header file, move the bochs vbe dispi interface
defines to it, so they can be used outside vga code.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180522165058.15404-2-kraxel@redhat.com
2018-05-24 10:42:13 +02:00
Gerd Hoffmann a89fe6c329 vga: catch depth 0
depth == 0 is used to indicate 256 color modes.  Our region calculation
goes wrong in that case.  So detect that and just take the safe code
path we already have for the wraparound case.

While being at it also catch depth == 15 (where our region size
calculation goes wrong too).  And make the comment more verbose,
explaining what is going on here.

Without this windows guest install might trigger an assert due to trying
to check dirty bitmap outside the snapshot region.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1575541
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180514103117.21059-1-kraxel@redhat.com
2018-05-24 10:42:13 +02:00
Peter Xu 63b88968f1 intel-iommu: rework the page walk logic
This patch fixes a potential small window that the DMA page table might
be incomplete or invalid when the guest sends domain/context
invalidations to a device.  This can cause random DMA errors for
assigned devices.

This is a major change to the VT-d shadow page walking logic. It
includes but is not limited to:

- For each VTDAddressSpace, now we maintain what IOVA ranges we have
  mapped and what we have not.  With that information, now we only send
  MAP or UNMAP when necessary.  Say, we don't send MAP notifies if we
  know we have already mapped the range, meanwhile we don't send UNMAP
  notifies if we know we never mapped the range at all.

- Introduce vtd_sync_shadow_page_table[_range] APIs so that we can call
  in any places to resync the shadow page table for a device.

- When we receive domain/context invalidation, we should not really run
  the replay logic, instead we use the new sync shadow page table API to
  resync the whole shadow page table without unmapping the whole
  region.  After this change, we'll only do the page walk once for each
  domain invalidations (before this, it can be multiple, depending on
  number of notifiers per address space).

While at it, the page walking logic is also refactored to be simpler.

CC: QEMU Stable <qemu-stable@nongnu.org>
Reported-by: Jintack Lim <jintack@cs.columbia.edu>
Tested-by: Jintack Lim <jintack@cs.columbia.edu>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:34:05 +03:00
Peter Xu eecf5eedbd util: implement simple iova tree
Introduce a simplest iova tree implementation based on GTree.

CC: QEMU Stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:33:58 +03:00
Peter Xu d118c06ebb intel-iommu: trace domain id during page walk
This patch only modifies the trace points.

Previously we were tracing page walk levels.  They are redundant since
we have page mask (size) already.  Now we trace something much more
useful which is the domain ID of the page walking.  That can be very
useful when we trace more than one devices on the same system, so that
we can know which map is for which domain.

CC: QEMU Stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:33:58 +03:00
Peter Xu 2f764fa87d intel-iommu: pass in address space when page walk
We pass in the VTDAddressSpace too.  It'll be used in the follow up
patches.

CC: QEMU Stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:33:58 +03:00
Peter Xu fe215b0cbb intel-iommu: introduce vtd_page_walk_info
During the recursive page walking of IOVA page tables, some stack
variables are constant variables and never changed during the whole page
walking procedure.  Isolate them into a struct so that we don't need to
pass those contants down the stack every time and multiple times.

CC: QEMU Stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:33:58 +03:00
Peter Xu 4f8a62a933 intel-iommu: only do page walk for MAP notifiers
For UNMAP-only IOMMU notifiers, we don't need to walk the page tables.
Fasten that procedure by skipping the page table walk.  That should
boost performance for UNMAP-only notifiers like vhost.

CC: QEMU Stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:33:57 +03:00
Peter Xu 1d9efa73e1 intel-iommu: add iommu lock
SECURITY IMPLICATION: this patch fixes a potential race when multiple
threads access the IOMMU IOTLB cache.

Add a per-iommu big lock to protect IOMMU status.  Currently the only
thing to be protected is the IOTLB/context cache, since that can be
accessed even without BQL, e.g., in IO dataplane.

Note that we don't need to protect device page tables since that's fully
controlled by the guest kernel.  However there is still possibility that
malicious drivers will program the device to not obey the rule.  In that
case QEMU can't really do anything useful, instead the guest itself will
be responsible for all uncertainties.

CC: QEMU Stable <qemu-stable@nongnu.org>
Reported-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:33:57 +03:00
Peter Xu b4a4ba0d68 intel-iommu: remove IntelIOMMUNotifierNode
That is not really necessary.  Removing that node struct and put the
list entry directly into VTDAddressSpace.  It simplfies the code a lot.
Since at it, rename the old notifiers_list into vtd_as_with_notifiers.

CC: QEMU Stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:33:57 +03:00
Peter Xu 36d2d52bdb intel-iommu: send PSI always even if across PDEs
SECURITY IMPLICATION: without this patch, any guest with both assigned
device and a vIOMMU might encounter stale IO page mappings even if guest
has already unmapped the page, which may lead to guest memory
corruption.  The stale mappings will only be limited to the guest's own
memory range, so it should not affect the host memory or other guests on
the host.

During IOVA page table walking, there is a special case when the PSI
covers one whole PDE (Page Directory Entry, which contains 512 Page
Table Entries) or more.  In the past, we skip that entry and we don't
notify the IOMMU notifiers.  This is not correct.  We should send UNMAP
notification to registered UNMAP notifiers in this case.

For UNMAP only notifiers, this might cause IOTLBs cached in the devices
even if they were already invalid.  For MAP/UNMAP notifiers like
vfio-pci, this will cause stale page mappings.

This special case doesn't trigger often, but it is very easy to be
triggered by nested device assignments, since in that case we'll
possibly map the whole L2 guest RAM region into the device's IOVA
address space (several GBs at least), which is far bigger than normal
kernel driver usages of the device (tens of MBs normally).

Without this patch applied to L1 QEMU, nested device assignment to L2
guests will dump some errors like:

qemu-system-x86_64: VFIO_MAP_DMA: -17
qemu-system-x86_64: vfio_dma_map(0x557305420c30, 0xad000, 0x1000,
                    0x7f89a920d000) = -17 (File exists)

CC: QEMU Stable <qemu-stable@nongnu.org>
Acked-by: Jason Wang <jasowang@redhat.com>
[peterx: rewrite the commit message]
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:33:57 +03:00
Ross Zwisler 1a97a478e6 nvdimm: fix typo in label-size definition
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Fixes: commit da6789c27c ("nvdimm: add a macro for property "label-size"")
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Cc: Haozhong Zhang <haozhong.zhang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:02:03 +03:00
Changpeng Liu 7d405b2ffb contrib/vhost-user-blk: enable protocol feature for vhost-user-blk
This patch reports the protocol feature that is only advertised by
QEMU if the device implements the config ops.

Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:02:03 +03:00
Richard Henderson ebf2a499a5 hw/virtio: Fix brace Werror with clang 6.0.0
The warning is

hw/virtio/vhost-user.c:1319:26: error: suggest braces
      around initialization of subobject [-Werror,-Wmissing-braces]
    VhostUserMsg msg = { 0 };
                         ^
                         {}

While the original code is correct, and technically exactly correct
as per ISO C89, both GCC and Clang support plain empty set of braces
as an extension.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:02:02 +03:00
Dr. David Alan Gilbert e3638151da libvhost-user: Send messages with no data
The response to a VHOST_USER_POSTCOPY_ADVISE contains a fd but doesn't
actually contain any data.   FIx vu_message_write so that it doesn't
do a 0-byte write() call, since this was ending up with rc=0
that was confusing the error handling code.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:02:02 +03:00
Dr. David Alan Gilbert 9952e807fd vhost-user+postcopy: Use qemu_set_nonblock
Use qemu_set_nonblock rather than a simple fcntl; cleaner
and I have no reason to change other flags.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:02:02 +03:00
Tiwei Bie 6f80e6170e virtio: support setting memory region based host notifier
This patch introduces the support for setting memory region
based host notifiers for virtio device. This is helpful when
using a hardware accelerator for a virtio device, because
hardware heavily depends on the notification, this will allow
the guest driver in the VM to notify the hardware directly.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:01:54 +03:00
Tiwei Bie 1f3a4519b1 vhost-user: support receiving file descriptors in slave_read
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:01:54 +03:00
Kevin Wolf bdebdc712b qemu-iotests: Test job-* with block jobs
This adds a test case that tests the new job-* QMP commands with
mirror and backup block jobs.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-05-23 14:30:52 +02:00
Kevin Wolf 62a9428812 iotests: Move qmp_to_opts() to VM
qmp_to_opts() used to be a method of QMPTestCase, but recently we
started to add more Python test cases that don't make use of
QMPTestCase. In order to make the method usable there, move it to VM.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-05-23 14:30:51 +02:00
Kevin Wolf 9f6bb4c004 blockjob: Remove BlockJob.driver
BlockJob.driver is redundant with Job.driver and only used in very few
places any more. Remove it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-05-23 14:30:51 +02:00
Kevin Wolf 456273b024 job: Add query-jobs QMP command
This adds a minimal query-jobs implementation that shouldn't pose many
design questions. It can later be extended to expose more information,
and especially job-specific information.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-05-23 14:30:51 +02:00
Kevin Wolf 1a90bc8128 job: Add lifecycle QMP commands
This adds QMP commands that control the transition between states of the
job lifecycle.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-05-23 14:30:51 +02:00
Kevin Wolf 1dac83f1a1 job: Add JOB_STATUS_CHANGE QMP event
This adds a QMP event that is emitted whenever a job transitions from
one status to another.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-05-23 14:30:51 +02:00
Kevin Wolf bf42508f24 job: Introduce qapi/job.json
This adds a separate schema file for all job-related definitions that
aren't tied to the block layer.

For a start, move the enums JobType, JobStatus and JobVerb.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-05-23 14:30:51 +02:00
Kevin Wolf 30a5c887bf job: Move progress fields to Job
BlockJob has fields .offset and .len, which are actually misnomers today
because they are no longer tied to block device sizes, but just progress
counters. As such they make a lot of sense in generic Jobs.

This patch moves the fields to Job and renames them to .progress_current
and .progress_total to describe their function better.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-05-23 14:30:51 +02:00
Kevin Wolf 2e1795b581 job: Add job_transition_to_ready()
The transition to the READY state was still performed in the BlockJob
layer, in the same function that sent the BLOCK_JOB_READY QMP event.

This patch brings the state transition to the Job layer and implements
the QMP event using a notifier called from the Job layer, like we
already do for other events related to state transitions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-05-23 14:30:51 +02:00
Kevin Wolf df956ae201 job: Add job_is_ready()
Instead of having a 'bool ready' in BlockJob, add a function that
derives its value from the job status.

At the same time, this fixes the behaviour to match what the QAPI
documentation promises for query-block-job: 'true if the job may be
completed'. When the ready flag was introduced in commit ef6dbf1e46,
the flag never had to be reset to match the description because after
being ready, the jobs would immediately complete and disappear.

Job transactions and manual job finalisation were introduced only later.
With these changes, jobs may stay around even after having completed
(and they are not ready to be completed a second time), however their
patches forgot to reset the ready flag.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-05-23 14:30:51 +02:00
Kevin Wolf 5f9a6a08e8 job: Add job_dismiss()
This moves block_job_dismiss() to the Job layer.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-05-23 14:30:51 +02:00
Kevin Wolf 198c49cc8d job: Add job_yield()
This moves block_job_yield() to the Job layer.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-05-23 14:30:51 +02:00
Kevin Wolf b3b5299d58 block: Cancel job in bdrv_close_all() callers
Now that we cancel all jobs and not only block jobs on shutdown, doing
that in bdrv_close_all() isn't really appropriate any more. Move the
job_cancel_sync_all() call to the callers, and only assert that there
are no job running in bdrv_close_all().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-05-23 14:30:51 +02:00
Kevin Wolf 3d70ff53b6 job: Move completion and cancellation to Job
This moves the top-level job completion and cancellation functions from
BlockJob to Job.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-05-23 14:30:51 +02:00
Kevin Wolf 7eaa8fb57d job: Move transactions to Job
This moves the logic that implements job transactions from BlockJob to
Job.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-05-23 14:30:51 +02:00
Kevin Wolf 62c9e4162a job: Switch transactions to JobTxn
This doesn't actually move any transaction code to Job yet, but it
renames the type for transactions from BlockJobTxn to JobTxn and makes
them contain Jobs rather than BlockJobs

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-05-23 14:30:50 +02:00
Kevin Wolf 6a74c075ac job: Move job_finish_sync() to Job
block_job_finish_sync() doesn't contain anything block job specific any
more, so it can be moved to Job.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-05-23 14:30:50 +02:00
Kevin Wolf 3453d97243 job: Move .complete callback to Job
This moves the .complete callback that tells a READY job to complete
from BlockJobDriver to JobDriver. The wrapper function job_complete()
doesn't require anything block job specific any more and can be moved
to Job.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-05-23 14:30:50 +02:00
Kevin Wolf b69f777dd9 job: Add job_drain()
block_job_drain() contains a blk_drain() call which cannot be moved to
Job, so add a new JobDriver callback JobDriver.drain which has a common
implementation for all BlockJobs. In addition to this we keep the
existing BlockJobDriver.drain callback that is called by the common
drain implementation for all block jobs.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-05-23 14:30:50 +02:00
Kevin Wolf 004e95df98 job: Convert block_job_cancel_async() to Job
block_job_cancel_async() did two things that were still block job
specific:

* Setting job->force. This field makes sense on the Job level, so we can
  just move it. While at it, rename it to job->force_cancel to make its
  purpose more obvious.

* Resetting the I/O status. This can't be moved because generic Jobs
  don't have an I/O status. What the function really implements is a
  user resume, except without entering the coroutine. Consequently, it
  makes sense to call the .user_resume driver callback here which
  already resets the I/O status.

  The old block_job_cancel_async() has two separate if statements that
  check job->iostatus != BLOCK_DEVICE_IO_STATUS_OK and job->user_paused.
  However, the former condition always implies the latter (as is
  asserted in block_job_iostatus_reset()), so changing the explicit call
  of block_job_iostatus_reset() on the former condition with the
  .user_resume callback on the latter condition is equivalent and
  doesn't need to access any BlockJob specific state.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-05-23 14:30:50 +02:00
Kevin Wolf 4ad351819b job: Move single job finalisation to Job
This moves the finalisation of a single job from BlockJob to Job.

Some part of this code depends on job transactions, and job transactions
call this code, we introduce some temporary calls from Job functions to
BlockJob ones. This will be fixed once transactions move to Job, too.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-05-23 14:30:50 +02:00