VFIO updates for v4.10-rc1

- VFIO updates for v4.10 primarily include a new Mediated Device
    interface, which essentially allows software defined devices to be
    exposed to users through VFIO.  The host vendor driver providing
    this virtual device polices, or mediates user access to the device.
    These devices often incorporate portions of real devices, for
    instance the primary initial users of this interface expose vGPUs
    which allow the user to map mediated devices, or mdevs, to a
    portion of a physical GPU.  QEMU composes these mdevs into PCI
    representations using the existing VFIO user API.  This enables
    both Intel KVM-GT support, which is also expected to arrive into
    Linux mainline during the v4.10 merge window, as well as NVIDIA
    vGPU, and also Channel I/O devices (aka CCW devices) for s390
    virtualization support. (Kirti Wankhede, Neo Jia)
 
  - Drop unnecessary uses of pcibios_err_to_errno() (Cao Jin)
 
  - Fixes to VFIO capability chain handling (Eric Auger)
 
  - Error handling fixes for fallout from mdev (Christophe JAILLET)
 
  - Notifiers to expose struct kvm to mdev vendor drivers (Jike Song)
 
  - type1 IOMMU model search fixes (Kirti Wankhede, Neo Jia)

Merge tag 'vfio-v4.10-rc1' of git://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:

 - VFIO updates for v4.10 primarily include a new Mediated Device
   interface, which essentially allows software defined devices to be
   exposed to users through VFIO. The host vendor driver providing this
   virtual device polices, or mediates user access to the device.

   These devices often incorporate portions of real devices, for
   instance the primary initial users of this interface expose vGPUs
   which allow the user to map mediated devices, or mdevs, to a portion
   of a physical GPU. QEMU composes these mdevs into PCI representations
   using the existing VFIO user API. This enables both Intel KVM-GT
   support, which is also expected to arrive into Linux mainline during
   the v4.10 merge window, as well as NVIDIA vGPU, and also Channel I/O
   devices (aka CCW devices) for s390 virtualization support. (Kirti
   Wankhede, Neo Jia)

 - Drop unnecessary uses of pcibios_err_to_errno() (Cao Jin)

 - Fixes to VFIO capability chain handling (Eric Auger)

 - Error handling fixes for fallout from mdev (Christophe JAILLET)

 - Notifiers to expose struct kvm to mdev vendor drivers (Jike Song)

 - type1 IOMMU model search fixes (Kirti Wankhede, Neo Jia)

* tag 'vfio-v4.10-rc1' of git://github.com/awilliam/linux-vfio: (30 commits)
  vfio iommu type1: Fix size argument to vfio_find_dma() in pin_pages/unpin_pages
  vfio iommu type1: Fix size argument to vfio_find_dma() during DMA UNMAP.
  vfio iommu type1: WARN_ON if notifier block is not unregistered
  kvm: set/clear kvm to/from vfio_group when group add/delete
  vfio: support notifier chain in vfio_group
  vfio: vfio_register_notifier: classify iommu notifier
  vfio: Fix handling of error returned by 'vfio_group_get_from_dev()'
  vfio: fix vfio_info_cap_add/shift
  vfio/pci: Drop unnecessary pcibios_err_to_errno()
  MAINTAINERS: Add entry VFIO based Mediated device drivers
  docs: Sample driver to demonstrate how to use Mediated device framework.
  docs: Sysfs ABI for mediated device framework
  docs: Add Documentation for Mediated devices
  vfio: Define device_api strings
  vfio_platform: Updated to use vfio_set_irqs_validate_and_prepare()
  vfio_pci: Updated to use vfio_set_irqs_validate_and_prepare()
  vfio: Introduce vfio_set_irqs_validate_and_prepare()
  vfio_pci: Update vfio_pci to use vfio_info_add_capability()
  vfio: Introduce common function to add capabilities
  vfio iommu: Add blocking notifier to notify DMA_UNMAP
  ...
Linus Torvalds 2016-12-13 09:23:56 -08:00
commit edc5f445a6
23 changed files with 4490 additions and 269 deletions


@ -0,0 +1,111 @@
What: /sys/.../<device>/mdev_supported_types/
Date: October 2016
Contact: Kirti Wankhede <kwankhede@nvidia.com>
Description:
This directory contains one subdirectory for each mediated
device type that <device> currently supports, together with
the details of that type. The supported type attributes are
defined by the vendor driver that registers with the
mediated device framework. Each supported type is a
directory whose name is created by adding the device driver
string as a prefix to the string provided by the vendor
driver.
What: /sys/.../<device>/mdev_supported_types/<type-id>/
Date: October 2016
Contact: Kirti Wankhede <kwankhede@nvidia.com>
Description:
This directory gives the details of a supported type, such
as name, description, available_instances and device_api.
'device_api' and 'available_instances' are mandatory
attributes that the vendor driver must provide. 'name',
'description' and other vendor-driver-specific attributes
are optional.
What: /sys/.../mdev_supported_types/<type-id>/create
Date: October 2016
Contact: Kirti Wankhede <kwankhede@nvidia.com>
Description:
Writing a UUID to this file creates a mediated device of
type <type-id> for the parent device <device>. This is a
write-only file.
For example:
# echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001" > \
/sys/devices/foo/mdev_supported_types/foo-1/create
What: /sys/.../mdev_supported_types/<type-id>/devices/
Date: October 2016
Contact: Kirti Wankhede <kwankhede@nvidia.com>
Description:
This directory contains symbolic links to the sysfs entries
of the mdev devices that have been created with this <type-id>.
What: /sys/.../mdev_supported_types/<type-id>/available_instances
Date: October 2016
Contact: Kirti Wankhede <kwankhede@nvidia.com>
Description:
Reading this attribute shows the number of mediated
devices of type <type-id> that can be created. This is a
read-only file.
Users:
Userspace applications interested in creating a mediated
device of that type. A userspace application should check
the number of available instances before creating a
mediated device of this type.
What: /sys/.../mdev_supported_types/<type-id>/device_api
Date: October 2016
Contact: Kirti Wankhede <kwankhede@nvidia.com>
Description:
Reading this attribute shows the VFIO device API supported
by this type. For example, "vfio-pci" for a PCI device,
"vfio-platform" for a platform device.
What: /sys/.../mdev_supported_types/<type-id>/name
Date: October 2016
Contact: Kirti Wankhede <kwankhede@nvidia.com>
Description:
Reading this attribute shows the human-readable name of the
mediated device that will be created for type <type-id>.
This is an optional attribute. For example: "Grid M60-0Q"
Users:
Userspace applications interested in knowing the name of
a particular <type-id> that can help in understanding the
type of mediated device.
What: /sys/.../mdev_supported_types/<type-id>/description
Date: October 2016
Contact: Kirti Wankhede <kwankhede@nvidia.com>
Description:
Reading this attribute shows a description of the mediated
device that will be created for type <type-id>.
This is an optional attribute. For example:
"2 heads, 512M FB, 2560x1600 maximum resolution"
Users:
Userspace applications interested in knowing the details of
a particular <type-id> that can help in understanding the
features provided by that type of mediated device.
What: /sys/.../<device>/<UUID>/
Date: October 2016
Contact: Kirti Wankhede <kwankhede@nvidia.com>
Description:
This directory represents the device directory of a mediated
device. It contains all the attributes related to the
mediated device.
What: /sys/.../<device>/<UUID>/mdev_type
Date: October 2016
Contact: Kirti Wankhede <kwankhede@nvidia.com>
Description:
This is a symbolic link pointing to the supported type
directory, <type-id>, from which this mediated device was created.
What: /sys/.../<device>/<UUID>/remove
Date: October 2016
Contact: Kirti Wankhede <kwankhede@nvidia.com>
Description:
Writing '1' to this file destroys the mediated device. The
vendor driver can fail the remove() callback if that device
is active and the vendor driver doesn't support hot unplug.
Example:
# echo 1 > /sys/bus/mdev/devices/<UUID>/remove
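The ABI entries above suggest that userspace read available_instances before writing a UUID to create. The following is a minimal userspace sketch of that sequence, not code from this commit: mdev_create_checked() and its return codes are hypothetical, and both paths are passed in by the caller so that nothing about the sysfs layout is assumed beyond what the entries describe.

```c
#include <stdio.h>

/*
 * Hypothetical userspace helper, not part of any library.
 * Reads the instance count from avail_path; if at least one instance is
 * available, writes uuid to create_path.
 * Returns 0 on success, -1 on I/O or parse error, -2 if no instance is left.
 */
int mdev_create_checked(const char *avail_path, const char *create_path,
			const char *uuid)
{
	FILE *f;
	int avail = 0;
	int ok;

	f = fopen(avail_path, "r");
	if (!f)
		return -1;
	ok = (fscanf(f, "%d", &avail) == 1);
	fclose(f);
	if (!ok)
		return -1;
	if (avail <= 0)
		return -2;

	f = fopen(create_path, "w");
	if (!f)
		return -1;
	ok = (fputs(uuid, f) >= 0);
	/* A rejected write may only be reported when the stream is flushed. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

The check and the write are two separate steps, so another process could take the last instance in between. A caller should still treat a failed write to create as the authoritative answer.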


@ -0,0 +1,398 @@
/*
* VFIO Mediated devices
*
* Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
* Author: Neo Jia <cjia@nvidia.com>
* Kirti Wankhede <kwankhede@nvidia.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
Virtual Function I/O (VFIO) Mediated devices[1]
===============================================
The number of use cases for virtualizing DMA devices that do not have built-in
SR-IOV capability is increasing. Previously, to virtualize such devices,
developers had to create their own management interfaces and APIs, and then
integrate them with user space software. To simplify integration with user space
software, we have identified common requirements and a unified management
interface for such devices.
The VFIO driver framework provides unified APIs for direct device access. It is
an IOMMU/device-agnostic framework for exposing direct device access to user
space in a secure, IOMMU-protected environment. This framework is used for
multiple devices, such as GPUs, network adapters, and compute accelerators. With
direct device access, virtual machines or user space applications have direct
access to the physical device. This framework is reused for mediated devices.
The mediated core driver provides a common interface for mediated device
management that can be used by drivers of different devices. This module
provides a generic interface to perform these operations:
* Create and destroy a mediated device
* Add a mediated device to and remove it from a mediated bus driver
* Add a mediated device to and remove it from an IOMMU group
The mediated core driver also provides an interface to register a bus driver.
For example, the mediated VFIO mdev driver is designed for mediated devices and
supports VFIO APIs. The mediated bus driver adds a mediated device to and
removes it from a VFIO group.
The following high-level block diagram shows the main components and interfaces
in the VFIO mediated driver framework. The diagram shows NVIDIA, Intel, and IBM
devices as examples, as these devices are the first devices to use this module.
+---------------+
| |
| +-----------+ | mdev_register_driver() +--------------+
| | | +<------------------------+ |
| | mdev | | | |
| | bus | +------------------------>+ vfio_mdev.ko |<-> VFIO user
| | driver | | probe()/remove() | | APIs
| | | | +--------------+
| +-----------+ |
| |
| MDEV CORE |
| MODULE |
| mdev.ko |
| +-----------+ | mdev_register_device() +--------------+
| | | +<------------------------+ |
| | | | | nvidia.ko |<-> physical
| | | +------------------------>+ | device
| | | | callbacks +--------------+
| | Physical | |
| | device | | mdev_register_device() +--------------+
| | interface | |<------------------------+ |
| | | | | i915.ko |<-> physical
| | | +------------------------>+ | device
| | | | callbacks +--------------+
| | | |
| | | | mdev_register_device() +--------------+
| | | +<------------------------+ |
| | | | | ccw_device.ko|<-> physical
| | | +------------------------>+ | device
| | | | callbacks +--------------+
| +-----------+ |
+---------------+
Registration Interfaces
=======================
The mediated core driver provides the following types of registration
interfaces:
* Registration interface for a mediated bus driver
* Physical device driver interface
Registration Interface for a Mediated Bus Driver
------------------------------------------------
The registration interface for a mediated bus driver provides the following
structure to represent a mediated device's driver:
    /*
     * struct mdev_driver [2] - Mediated device's driver
     * @name: driver name
     * @probe: called when new device created
     * @remove: called when device removed
     * @driver: device driver structure
     */
    struct mdev_driver {
            const char *name;
            int  (*probe)  (struct device *dev);
            void (*remove) (struct device *dev);
            struct device_driver    driver;
    };
A mediated bus driver for mdev should use this structure in the function calls
to register and unregister itself with the core driver:
* Register:
extern int mdev_register_driver(struct mdev_driver *drv,
struct module *owner);
* Unregister:
extern void mdev_unregister_driver(struct mdev_driver *drv);
The mediated bus driver is responsible for adding mediated devices to the VFIO
group when devices are bound to the driver and removing mediated devices from
the VFIO group when devices are unbound from the driver.
Physical Device Driver Interface
--------------------------------
The physical device driver interface provides the parent_ops[3] structure to
define the APIs to manage work in the mediated core driver that is related to
the physical device.
The structures in the parent_ops structure are as follows:
* dev_attr_groups: attributes of the parent device
* mdev_attr_groups: attributes of the mediated device
* supported_type_groups: attributes to define supported configurations
The functions in the parent_ops structure are as follows:
* create: allocate basic resources in a driver for a mediated device
* remove: free resources in a driver when a mediated device is destroyed
The callbacks in the parent_ops structure are as follows:
* open: open callback of mediated device
* close: close callback of mediated device
* ioctl: ioctl callback of mediated device
* read : read emulation callback
* write: write emulation callback
* mmap: mmap emulation callback
A driver should use the parent_ops structure in the function call to register
itself with the mdev core driver:
extern int mdev_register_device(struct device *dev,
const struct parent_ops *ops);
However, the parent_ops structure is not required in the function call that a
driver should use to unregister itself with the mdev core driver:
extern void mdev_unregister_device(struct device *dev);
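The core code in this commit rejects a registration whose ops table lacks create, remove or supported_type_groups. The userspace mock below illustrates only that validation step. struct mock_parent_ops and its field types are simplified placeholders, not the real struct parent_ops from include/linux/mdev.h, and every mock_* name is hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for struct parent_ops; field types are placeholders. */
struct mock_parent_ops {
	int (*create)(void *mdev);
	int (*remove)(void *mdev);
	const void *supported_type_groups;
};

/* Mirrors the mandatory-ops test at the top of mdev_register_device(). */
int mock_check_parent_ops(const struct mock_parent_ops *ops)
{
	if (!ops || !ops->create || !ops->remove || !ops->supported_type_groups)
		return -EINVAL;
	return 0;
}

/* Trivial callbacks so that a complete ops table can be built. */
int mock_create(void *mdev) { (void)mdev; return 0; }
int mock_remove(void *mdev) { (void)mdev; return 0; }
```

A vendor driver that leaves any of the three fields unset would therefore get -EINVAL back before the core takes a reference on the parent device.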
Mediated Device Management Interface Through sysfs
==================================================
The management interface through sysfs enables user space software, such as
libvirt, to query and configure mediated devices in a hardware-agnostic fashion.
This management interface provides flexibility to the underlying physical
device's driver to support features such as:
* Mediated device hot plug
* Multiple mediated devices in a single virtual machine
* Multiple mediated devices from different physical devices
Links in the mdev_bus Class Directory
-------------------------------------
The /sys/class/mdev_bus/ directory contains links to devices that are registered
with the mdev core driver.
Directories and files under the sysfs for Each Physical Device
--------------------------------------------------------------
|- [parent physical device]
|--- Vendor-specific-attributes [optional]
|--- [mdev_supported_types]
| |--- [<type-id>]
| | |--- create
| | |--- name
| | |--- available_instances
| | |--- device_api
| | |--- description
| | |--- [devices]
| |--- [<type-id>]
| | |--- create
| | |--- name
| | |--- available_instances
| | |--- device_api
| | |--- description
| | |--- [devices]
| |--- [<type-id>]
| |--- create
| |--- name
| |--- available_instances
| |--- device_api
| |--- description
| |--- [devices]
* [mdev_supported_types]
The list of currently supported mediated device types and their details.
[<type-id>], device_api, and available_instances are mandatory attributes
that should be provided by vendor driver.
* [<type-id>]
The [<type-id>] name is created by adding the device driver string as a
prefix to the string provided by the vendor driver. The format of this name
is as follows:
sprintf(buf, "%s-%s", dev_driver_string(parent->dev), group->name);
* device_api
This attribute should show which device API is being created, for example,
"vfio-pci" for a PCI device.
* available_instances
This attribute should show the number of devices of type <type-id> that can be
created.
* [devices]
This directory contains links to the devices of type <type-id> that have been
created.
* name
This attribute should show a human-readable name. This is an optional attribute.
* description
This attribute should show a brief description of the features of the type.
This is an optional attribute.
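The [<type-id>] naming rule described in the list above can be tried out in userspace. The sketch below is an illustration only: mdev_type_id() is a hypothetical helper, and it uses snprintf() so that an oversized name is reported instead of overflowing the buffer.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Userspace illustration only: mdev_type_id() is a hypothetical helper, not a
 * kernel API. It applies the "%s-%s" format (driver string, then the string
 * provided by the vendor driver).
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
int mdev_type_id(char *buf, size_t len,
		 const char *driver_string, const char *group_name)
{
	int n = snprintf(buf, len, "%s-%s", driver_string, group_name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Assuming a driver string of "mtty" and a vendor-provided string of "1", the helper produces "mtty-1", which matches the directory name shown for the sample driver.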
Directories and Files Under the sysfs for Each mdev Device
----------------------------------------------------------
|- [parent phy device]
|--- [$MDEV_UUID]
|--- remove
|--- mdev_type {link to its type}
|--- vendor-specific-attributes [optional]
* remove (write only)
Writing '1' to the 'remove' file destroys the mdev device. The vendor driver can
fail the remove() callback if that device is active and the vendor driver
doesn't support hot unplug.
Example:
# echo 1 > /sys/bus/mdev/devices/$mdev_UUID/remove
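Whether a remove request can fail depends on who issued it: a write to the sysfs 'remove' file is a non-forced request, while unregistering the parent device forces removal. The mock below reproduces that decision as it appears in mdev_device_remove_ops() in this commit. The function name and the plain int standing in for the vendor callback's result are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Mock of the decision made in mdev_device_remove_ops(): vendor_ret stands in
 * for the return value of the vendor driver's remove() callback.
 * A vendor error blocks removal only when the request is not forced.
 */
int mock_remove_decision(int vendor_ret, bool force_remove)
{
	if (vendor_ret && !force_remove)
		return -EBUSY;
	return 0;
}
```

Note that any non-zero vendor result is reported to the sysfs writer as -EBUSY, so the original error code is not visible to userspace.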
Mediated Device Hot Plug
------------------------
Mediated devices can be created and assigned at runtime. The procedure to hot
plug a mediated device is the same as the procedure to hot plug a PCI device.
Translation APIs for Mediated Devices
=====================================
The following APIs are provided for translating user pfn to host pfn in a VFIO
driver:
extern int vfio_pin_pages(struct device *dev, unsigned long *user_pfn,
int npage, int prot, unsigned long *phys_pfn);
extern int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn,
int npage);
These functions call back into the back-end IOMMU module by using the pin_pages
and unpin_pages callbacks of the struct vfio_iommu_driver_ops[4]. Currently
these callbacks are supported in the TYPE1 IOMMU module. To enable them for
other IOMMU backend modules, such as PPC64 sPAPR module, they need to provide
these two callback functions.
Using the Sample Code
=====================
mtty.c in the samples/vfio-mdev/ directory is a sample driver program that
demonstrates how to use the mediated device framework.
The sample driver creates an mdev device that simulates a serial port over a PCI
card.
1. Build and load the mtty.ko module.
This step creates a dummy device, /sys/devices/virtual/mtty/mtty/
Files in this device directory in sysfs are similar to the following:
# tree /sys/devices/virtual/mtty/mtty/
/sys/devices/virtual/mtty/mtty/
|-- mdev_supported_types
| |-- mtty-1
| | |-- available_instances
| | |-- create
| | |-- device_api
| | |-- devices
| | `-- name
| `-- mtty-2
| |-- available_instances
| |-- create
| |-- device_api
| |-- devices
| `-- name
|-- mtty_dev
| `-- sample_mtty_dev
|-- power
| |-- autosuspend_delay_ms
| |-- control
| |-- runtime_active_time
| |-- runtime_status
| `-- runtime_suspended_time
|-- subsystem -> ../../../../class/mtty
`-- uevent
2. Create a mediated device by using the dummy device that you created in the
previous step.
# echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001" > \
/sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create
3. Add parameters to qemu-kvm.
-device vfio-pci,\
sysfsdev=/sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001
4. Boot the VM.
In the Linux guest VM, with no hardware on the host, the device appears
as follows:
# lspci -s 00:05.0 -xxvv
00:05.0 Serial controller: Device 4348:3253 (rev 10) (prog-if 02 [16550])
Subsystem: Device 4348:3253
Physical Slot: 5
Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 10
Region 0: I/O ports at c150 [size=8]
Region 1: I/O ports at c158 [size=8]
Kernel driver in use: serial
00: 48 43 53 32 01 00 00 02 10 02 00 07 00 00 00 00
10: 51 c1 00 00 59 c1 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 48 43 53 32
30: 00 00 00 00 00 00 00 00 00 00 00 00 0a 01 00 00
In the Linux guest VM, dmesg output for the device is as follows:
serial 0000:00:05.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, high) -> IRQ 10
0000:00:05.0: ttyS1 at I/O 0xc150 (irq = 10) is a 16550A
0000:00:05.0: ttyS2 at I/O 0xc158 (irq = 10) is a 16550A
5. In the Linux guest VM, check the serial ports.
# setserial -g /dev/ttyS*
/dev/ttyS0, UART: 16550A, Port: 0x03f8, IRQ: 4
/dev/ttyS1, UART: 16550A, Port: 0xc150, IRQ: 10
/dev/ttyS2, UART: 16550A, Port: 0xc158, IRQ: 10
6. Using minicom or any terminal emulation program, open port /dev/ttyS1 or
/dev/ttyS2 with hardware flow control disabled.
7. Type data on the minicom terminal or send data to the terminal emulation
program and read the data.
Data is looped back from the host's mtty driver.
8. Destroy the mediated device that you created.
# echo 1 > /sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001/remove
References
==========
[1] See Documentation/vfio.txt for more information on VFIO.
[2] struct mdev_driver in include/linux/mdev.h
[3] struct parent_ops in include/linux/mdev.h
[4] struct vfio_iommu_driver_ops in include/linux/vfio.h


@ -12791,6 +12791,15 @@ F: drivers/vfio/
F: include/linux/vfio.h
F: include/uapi/linux/vfio.h
VFIO MEDIATED DEVICE DRIVERS
M: Kirti Wankhede <kwankhede@nvidia.com>
L: kvm@vger.kernel.org
S: Maintained
F: Documentation/vfio-mediated-device.txt
F: drivers/vfio/mdev/
F: include/linux/mdev.h
F: samples/vfio-mdev/
VFIO PLATFORM DRIVER
M: Baptiste Reynal <b.reynal@virtualopensystems.com>
L: kvm@vger.kernel.org


@ -48,4 +48,5 @@ menuconfig VFIO_NOIOMMU
source "drivers/vfio/pci/Kconfig"
source "drivers/vfio/platform/Kconfig"
source "drivers/vfio/mdev/Kconfig"
source "virt/lib/Kconfig"


@ -7,3 +7,4 @@ obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
obj-$(CONFIG_VFIO_SPAPR_EEH) += vfio_spapr_eeh.o
obj-$(CONFIG_VFIO_PCI) += pci/
obj-$(CONFIG_VFIO_PLATFORM) += platform/
obj-$(CONFIG_VFIO_MDEV) += mdev/

drivers/vfio/mdev/Kconfig

@ -0,0 +1,17 @@
config VFIO_MDEV
	tristate "Mediated device driver framework"
	depends on VFIO
	default n
	help
	  Provides a framework to virtualize devices.
	  See Documentation/vfio-mediated-device.txt for more details.

	  If you don't know what to do here, say N.

config VFIO_MDEV_DEVICE
	tristate "VFIO driver for Mediated devices"
	depends on VFIO && VFIO_MDEV
	default n
	help
	  VFIO based driver for Mediated devices.


@ -0,0 +1,5 @@
mdev-y := mdev_core.o mdev_sysfs.o mdev_driver.o
obj-$(CONFIG_VFIO_MDEV) += mdev.o
obj-$(CONFIG_VFIO_MDEV_DEVICE) += vfio_mdev.o


@ -0,0 +1,385 @@
/*
* Mediated device Core Driver
*
* Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
* Author: Neo Jia <cjia@nvidia.com>
* Kirti Wankhede <kwankhede@nvidia.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/module.h>
#include <linux/device.h>
#include <linux/slab.h>
#include <linux/uuid.h>
#include <linux/sysfs.h>
#include <linux/mdev.h>
#include "mdev_private.h"
#define DRIVER_VERSION "0.1"
#define DRIVER_AUTHOR "NVIDIA Corporation"
#define DRIVER_DESC "Mediated device Core Driver"
static LIST_HEAD(parent_list);
static DEFINE_MUTEX(parent_list_lock);
static struct class_compat *mdev_bus_compat_class;
static int _find_mdev_device(struct device *dev, void *data)
{
struct mdev_device *mdev;
if (!dev_is_mdev(dev))
return 0;
mdev = to_mdev_device(dev);
if (uuid_le_cmp(mdev->uuid, *(uuid_le *)data) == 0)
return 1;
return 0;
}
static bool mdev_device_exist(struct parent_device *parent, uuid_le uuid)
{
struct device *dev;
dev = device_find_child(parent->dev, &uuid, _find_mdev_device);
if (dev) {
put_device(dev);
return true;
}
return false;
}
/* Should be called holding parent_list_lock */
static struct parent_device *__find_parent_device(struct device *dev)
{
struct parent_device *parent;
list_for_each_entry(parent, &parent_list, next) {
if (parent->dev == dev)
return parent;
}
return NULL;
}
static void mdev_release_parent(struct kref *kref)
{
struct parent_device *parent = container_of(kref, struct parent_device,
ref);
struct device *dev = parent->dev;
kfree(parent);
put_device(dev);
}
static
inline struct parent_device *mdev_get_parent(struct parent_device *parent)
{
if (parent)
kref_get(&parent->ref);
return parent;
}
static inline void mdev_put_parent(struct parent_device *parent)
{
if (parent)
kref_put(&parent->ref, mdev_release_parent);
}
static int mdev_device_create_ops(struct kobject *kobj,
struct mdev_device *mdev)
{
struct parent_device *parent = mdev->parent;
int ret;
ret = parent->ops->create(kobj, mdev);
if (ret)
return ret;
ret = sysfs_create_groups(&mdev->dev.kobj,
parent->ops->mdev_attr_groups);
if (ret)
parent->ops->remove(mdev);
return ret;
}
/*
* mdev_device_remove_ops gets called from sysfs's 'remove' and when the parent
* device is being unregistered from the mdev device framework.
* - 'force_remove' is set to 'false' when called from sysfs's 'remove'. If the
* mdev device is active (in use by a VMM or userspace application), the
* vendor driver may return an error, in which case the device is not removed.
* - 'force_remove' is set to 'true' when called from mdev_unregister_device(),
* which indicates that the parent device is being removed from the mdev
* device framework, so the mdev device is removed forcefully.
*/
static int mdev_device_remove_ops(struct mdev_device *mdev, bool force_remove)
{
struct parent_device *parent = mdev->parent;
int ret;
/*
* Vendor driver can return error if VMM or userspace application is
* using this mdev device.
*/
ret = parent->ops->remove(mdev);
if (ret && !force_remove)
return -EBUSY;
sysfs_remove_groups(&mdev->dev.kobj, parent->ops->mdev_attr_groups);
return 0;
}
static int mdev_device_remove_cb(struct device *dev, void *data)
{
if (!dev_is_mdev(dev))
return 0;
return mdev_device_remove(dev, data ? *(bool *)data : true);
}
/*
 * mdev_register_device : Register a device
 * @dev: device structure representing parent device.
 * @ops: Parent device operation structure to be registered.
 *
 * Add device to list of registered parent devices.
 * Returns a negative value on error, otherwise 0.
 */
int mdev_register_device(struct device *dev, const struct parent_ops *ops)
{
	int ret;
	struct parent_device *parent;

	/* check for mandatory ops */
	if (!ops || !ops->create || !ops->remove || !ops->supported_type_groups)
		return -EINVAL;

	dev = get_device(dev);
	if (!dev)
		return -EINVAL;

	mutex_lock(&parent_list_lock);

	/* Check for duplicate */
	parent = __find_parent_device(dev);
	if (parent) {
		/*
		 * The existing entry is not ours to release: clear the
		 * pointer so that the error path only drops the device
		 * reference taken above.
		 */
		parent = NULL;
		ret = -EEXIST;
		goto add_dev_err;
	}

	parent = kzalloc(sizeof(*parent), GFP_KERNEL);
	if (!parent) {
		ret = -ENOMEM;
		goto add_dev_err;
	}

	kref_init(&parent->ref);
	mutex_init(&parent->lock);

	parent->dev = dev;
	parent->ops = ops;

	if (!mdev_bus_compat_class) {
		mdev_bus_compat_class = class_compat_register("mdev_bus");
		if (!mdev_bus_compat_class) {
			ret = -ENOMEM;
			goto add_dev_err;
		}
	}

	ret = parent_create_sysfs_files(parent);
	if (ret)
		goto add_dev_err;

	ret = class_compat_create_link(mdev_bus_compat_class, dev, NULL);
	if (ret)
		dev_warn(dev, "Failed to create compatibility class link\n");

	list_add(&parent->next, &parent_list);
	mutex_unlock(&parent_list_lock);

	dev_info(dev, "MDEV: Registered\n");
	return 0;

add_dev_err:
	mutex_unlock(&parent_list_lock);
	/* A freshly allocated parent owns the device reference; see release. */
	if (parent)
		mdev_put_parent(parent);
	else
		put_device(dev);
	return ret;
}
EXPORT_SYMBOL(mdev_register_device);
/*
* mdev_unregister_device : Unregister a parent device
* @dev: device structure representing parent device.
*
* Remove device from list of registered parent devices. Give a chance to free
* existing mediated devices for given device.
*/
void mdev_unregister_device(struct device *dev)
{
struct parent_device *parent;
bool force_remove = true;
mutex_lock(&parent_list_lock);
parent = __find_parent_device(dev);
if (!parent) {
mutex_unlock(&parent_list_lock);
return;
}
dev_info(dev, "MDEV: Unregistering\n");
list_del(&parent->next);
class_compat_remove_link(mdev_bus_compat_class, dev, NULL);
device_for_each_child(dev, (void *)&force_remove,
mdev_device_remove_cb);
parent_remove_sysfs_files(parent);
mutex_unlock(&parent_list_lock);
mdev_put_parent(parent);
}
EXPORT_SYMBOL(mdev_unregister_device);
static void mdev_device_release(struct device *dev)
{
struct mdev_device *mdev = to_mdev_device(dev);
dev_dbg(&mdev->dev, "MDEV: destroying\n");
kfree(mdev);
}
int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid)
{
int ret;
struct mdev_device *mdev;
struct parent_device *parent;
struct mdev_type *type = to_mdev_type(kobj);
parent = mdev_get_parent(type->parent);
if (!parent)
return -EINVAL;
mutex_lock(&parent->lock);
/* Check for duplicate */
if (mdev_device_exist(parent, uuid)) {
ret = -EEXIST;
goto create_err;
}
mdev = kzalloc(sizeof(*mdev), GFP_KERNEL);
if (!mdev) {
ret = -ENOMEM;
goto create_err;
}
memcpy(&mdev->uuid, &uuid, sizeof(uuid_le));
mdev->parent = parent;
kref_init(&mdev->ref);
mdev->dev.parent = dev;
mdev->dev.bus = &mdev_bus_type;
mdev->dev.release = mdev_device_release;
dev_set_name(&mdev->dev, "%pUl", uuid.b);
ret = device_register(&mdev->dev);
if (ret) {
put_device(&mdev->dev);
goto create_err;
}
ret = mdev_device_create_ops(kobj, mdev);
if (ret)
goto create_failed;
ret = mdev_create_sysfs_files(&mdev->dev, type);
if (ret) {
mdev_device_remove_ops(mdev, true);
goto create_failed;
}
mdev->type_kobj = kobj;
dev_dbg(&mdev->dev, "MDEV: created\n");
mutex_unlock(&parent->lock);
return ret;
create_failed:
device_unregister(&mdev->dev);
create_err:
mutex_unlock(&parent->lock);
mdev_put_parent(parent);
return ret;
}
int mdev_device_remove(struct device *dev, bool force_remove)
{
struct mdev_device *mdev;
struct parent_device *parent;
struct mdev_type *type;
int ret;
mdev = to_mdev_device(dev);
type = to_mdev_type(mdev->type_kobj);
parent = mdev->parent;
mutex_lock(&parent->lock);
ret = mdev_device_remove_ops(mdev, force_remove);
if (ret) {
mutex_unlock(&parent->lock);
return ret;
}
mdev_remove_sysfs_files(dev, type);
device_unregister(dev);
mutex_unlock(&parent->lock);
mdev_put_parent(parent);
return ret;
}
static int __init mdev_init(void)
{
int ret;
ret = mdev_bus_register();
/*
* Attempt to load known vfio_mdev. This gives us a working environment
* without the user needing to explicitly load vfio_mdev driver.
*/
if (!ret)
request_module_nowait("vfio_mdev");
return ret;
}
static void __exit mdev_exit(void)
{
if (mdev_bus_compat_class)
class_compat_unregister(mdev_bus_compat_class);
mdev_bus_unregister();
}
module_init(mdev_init)
module_exit(mdev_exit)
MODULE_VERSION(DRIVER_VERSION);
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR(DRIVER_AUTHOR);
MODULE_DESCRIPTION(DRIVER_DESC);
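The parent lifetime in this file rests on mdev_get_parent() and mdev_put_parent(), which wrap the kernel's kref helpers and free the parent from a release callback located with container_of(). The following userspace mock sketches that pattern. It is not kernel code: the counter is a plain non-atomic int, and all mock_* names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Minimal, non-atomic stand-in for the kernel's struct kref. */
struct mock_kref {
	int count;
};

struct mock_parent {
	int id;
	struct mock_kref ref;
};

int mock_released;	/* counts release calls, for demonstration only */

#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Release callback: recover the enclosing object from its kref and free it. */
void mock_release_parent(struct mock_kref *kref)
{
	struct mock_parent *parent =
		mock_container_of(kref, struct mock_parent, ref);

	mock_released++;
	free(parent);
}

void mock_kref_init(struct mock_kref *k) { k->count = 1; }
void mock_kref_get(struct mock_kref *k) { k->count++; }

/* Returns 1 if this put dropped the last reference and called release. */
int mock_kref_put(struct mock_kref *k, void (*release)(struct mock_kref *))
{
	if (--k->count == 0) {
		release(k);
		return 1;
	}
	return 0;
}
```

In the mock, as in the file above, the object must not be touched after the final put, because the release callback has already freed it.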


@ -0,0 +1,119 @@
/*
* MDEV driver
*
* Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
* Author: Neo Jia <cjia@nvidia.com>
* Kirti Wankhede <kwankhede@nvidia.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/device.h>
#include <linux/iommu.h>
#include <linux/mdev.h>
#include "mdev_private.h"
static int mdev_attach_iommu(struct mdev_device *mdev)
{
int ret;
struct iommu_group *group;
group = iommu_group_alloc();
if (IS_ERR(group))
return PTR_ERR(group);
ret = iommu_group_add_device(group, &mdev->dev);
if (!ret)
dev_info(&mdev->dev, "MDEV: group_id = %d\n",
iommu_group_id(group));
iommu_group_put(group);
return ret;
}
static void mdev_detach_iommu(struct mdev_device *mdev)
{
iommu_group_remove_device(&mdev->dev);
dev_info(&mdev->dev, "MDEV: detaching iommu\n");
}
static int mdev_probe(struct device *dev)
{
struct mdev_driver *drv = to_mdev_driver(dev->driver);
struct mdev_device *mdev = to_mdev_device(dev);
int ret;
ret = mdev_attach_iommu(mdev);
if (ret)
return ret;
if (drv && drv->probe) {
ret = drv->probe(dev);
if (ret)
mdev_detach_iommu(mdev);
}
return ret;
}
static int mdev_remove(struct device *dev)
{
struct mdev_driver *drv = to_mdev_driver(dev->driver);
struct mdev_device *mdev = to_mdev_device(dev);
if (drv && drv->remove)
drv->remove(dev);
mdev_detach_iommu(mdev);
return 0;
}
struct bus_type mdev_bus_type = {
.name = "mdev",
.probe = mdev_probe,
.remove = mdev_remove,
};
EXPORT_SYMBOL_GPL(mdev_bus_type);
/**
* mdev_register_driver - register a new MDEV driver
* @drv: the driver to register
* @owner: module owner of driver to be registered
*
* Returns a negative value on error, otherwise 0.
**/
int mdev_register_driver(struct mdev_driver *drv, struct module *owner)
{
/* initialize common driver fields */
drv->driver.name = drv->name;
drv->driver.bus = &mdev_bus_type;
drv->driver.owner = owner;
/* register with core */
return driver_register(&drv->driver);
}
EXPORT_SYMBOL(mdev_register_driver);
/*
* mdev_unregister_driver - unregister MDEV driver
* @drv: the driver to unregister
*/
void mdev_unregister_driver(struct mdev_driver *drv)
{
driver_unregister(&drv->driver);
}
EXPORT_SYMBOL(mdev_unregister_driver);
int mdev_bus_register(void)
{
return bus_register(&mdev_bus_type);
}
void mdev_bus_unregister(void)
{
bus_unregister(&mdev_bus_type);
}


@ -0,0 +1,41 @@
/*
* Mediated device internal definitions
*
* Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
* Author: Neo Jia <cjia@nvidia.com>
* Kirti Wankhede <kwankhede@nvidia.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#ifndef MDEV_PRIVATE_H
#define MDEV_PRIVATE_H
int mdev_bus_register(void);
void mdev_bus_unregister(void);
struct mdev_type {
struct kobject kobj;
struct kobject *devices_kobj;
struct parent_device *parent;
struct list_head next;
struct attribute_group *group;
};
#define to_mdev_type_attr(_attr) \
container_of(_attr, struct mdev_type_attribute, attr)
#define to_mdev_type(_kobj) \
container_of(_kobj, struct mdev_type, kobj)
int parent_create_sysfs_files(struct parent_device *parent);
void parent_remove_sysfs_files(struct parent_device *parent);
int mdev_create_sysfs_files(struct device *dev, struct mdev_type *type);
void mdev_remove_sysfs_files(struct device *dev, struct mdev_type *type);
int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid);
int mdev_device_remove(struct device *dev, bool force_remove);
#endif /* MDEV_PRIVATE_H */
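The `to_mdev_type()` and `to_mdev_type_attr()` helpers above rely on `container_of()` to recover the enclosing structure from a pointer to one of its members. The following is a minimal userspace sketch of that idea, not the kernel macro itself: `kobj_stub`, `type_stub` and `to_type_stub()` are hypothetical stand-ins, and the simplified `container_of()` here omits the type checking the kernel version performs.

```c
#include <stddef.h>

/* Simplified container_of(): step back from a member to its container. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct kobject. */
struct kobj_stub {
	const char *name;
};

/* Hypothetical stand-in for struct mdev_type: the kobject is embedded. */
struct type_stub {
	int id;
	struct kobj_stub kobj;
};

/* Mirrors the shape of to_mdev_type(): embedded kobject -> container. */
static struct type_stub *to_type_stub(struct kobj_stub *kobj)
{
	return container_of(kobj, struct type_stub, kobj);
}
```

Because the kobject is embedded rather than pointed to, sysfs callbacks that only receive a `struct kobject *` can still reach the per-type data without a separate lookup table.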


@ -0,0 +1,286 @@
/*
* File attributes for Mediated devices
*
* Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
* Author: Neo Jia <cjia@nvidia.com>
* Kirti Wankhede <kwankhede@nvidia.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/sysfs.h>
#include <linux/ctype.h>
#include <linux/device.h>
#include <linux/slab.h>
#include <linux/uuid.h>
#include <linux/mdev.h>
#include "mdev_private.h"
/* Static functions */
static ssize_t mdev_type_attr_show(struct kobject *kobj,
struct attribute *__attr, char *buf)
{
struct mdev_type_attribute *attr = to_mdev_type_attr(__attr);
struct mdev_type *type = to_mdev_type(kobj);
ssize_t ret = -EIO;
if (attr->show)
ret = attr->show(kobj, type->parent->dev, buf);
return ret;
}
static ssize_t mdev_type_attr_store(struct kobject *kobj,
struct attribute *__attr,
const char *buf, size_t count)
{
struct mdev_type_attribute *attr = to_mdev_type_attr(__attr);
struct mdev_type *type = to_mdev_type(kobj);
ssize_t ret = -EIO;
if (attr->store)
ret = attr->store(&type->kobj, type->parent->dev, buf, count);
return ret;
}
static const struct sysfs_ops mdev_type_sysfs_ops = {
.show = mdev_type_attr_show,
.store = mdev_type_attr_store,
};
static ssize_t create_store(struct kobject *kobj, struct device *dev,
const char *buf, size_t count)
{
char *str;
uuid_le uuid;
int ret;
if ((count < UUID_STRING_LEN) || (count > UUID_STRING_LEN + 1))
return -EINVAL;
str = kstrndup(buf, count, GFP_KERNEL);
if (!str)
return -ENOMEM;
ret = uuid_le_to_bin(str, &uuid);
kfree(str);
if (ret)
return ret;
ret = mdev_device_create(kobj, dev, uuid);
if (ret)
return ret;
return count;
}
MDEV_TYPE_ATTR_WO(create);
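`create_store()` accepts a write only when `count` is `UUID_STRING_LEN` or `UUID_STRING_LEN + 1`, presumably so that a UUID followed by a single trailing byte (such as the newline a shell `echo` appends) is still accepted. A minimal userspace sketch of that bounds check follows. `uuid_write_ok()` is a hypothetical helper, and both the explicit newline test and the character-by-character format check are assumptions of this sketch rather than a copy of `uuid_le_to_bin()`.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

#define UUID_STRING_LEN 36

/* Returns true if buf[0..count) looks like a UUID write to "create". */
static bool uuid_write_ok(const char *buf, size_t count)
{
	size_t i;

	/* Same bounds as create_store(): the string, optionally plus one byte. */
	if (count < UUID_STRING_LEN || count > UUID_STRING_LEN + 1)
		return false;

	/* Assumption of this sketch: the extra byte must be a newline. */
	if (count == UUID_STRING_LEN + 1 && buf[UUID_STRING_LEN] != '\n')
		return false;

	/* 8-4-4-4-12 hex digits separated by hyphens. */
	for (i = 0; i < UUID_STRING_LEN; i++) {
		if (i == 8 || i == 13 || i == 18 || i == 23) {
			if (buf[i] != '-')
				return false;
		} else if (!isxdigit((unsigned char)buf[i])) {
			return false;
		}
	}
	return true;
}
```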
static void mdev_type_release(struct kobject *kobj)
{
struct mdev_type *type = to_mdev_type(kobj);
pr_debug("Releasing group %s\n", kobj->name);
kfree(type);
}
static struct kobj_type mdev_type_ktype = {
.sysfs_ops = &mdev_type_sysfs_ops,
.release = mdev_type_release,
};
struct mdev_type *add_mdev_supported_type(struct parent_device *parent,
struct attribute_group *group)
{
struct mdev_type *type;
int ret;
if (!group->name) {
pr_err("%s: Type name empty!\n", __func__);
return ERR_PTR(-EINVAL);
}
type = kzalloc(sizeof(*type), GFP_KERNEL);
if (!type)
return ERR_PTR(-ENOMEM);
type->kobj.kset = parent->mdev_types_kset;
ret = kobject_init_and_add(&type->kobj, &mdev_type_ktype, NULL,
"%s-%s", dev_driver_string(parent->dev),
group->name);
if (ret) {
kfree(type);
return ERR_PTR(ret);
}
ret = sysfs_create_file(&type->kobj, &mdev_type_attr_create.attr);
if (ret)
goto attr_create_failed;
type->devices_kobj = kobject_create_and_add("devices", &type->kobj);
if (!type->devices_kobj) {
ret = -ENOMEM;
goto attr_devices_failed;
}
ret = sysfs_create_files(&type->kobj,
(const struct attribute **)group->attrs);
if (ret) {
ret = -ENOMEM;
goto attrs_failed;
}
type->group = group;
type->parent = parent;
return type;
attrs_failed:
kobject_put(type->devices_kobj);
attr_devices_failed:
sysfs_remove_file(&type->kobj, &mdev_type_attr_create.attr);
attr_create_failed:
kobject_del(&type->kobj);
kobject_put(&type->kobj);
return ERR_PTR(ret);
}
static void remove_mdev_supported_type(struct mdev_type *type)
{
sysfs_remove_files(&type->kobj,
(const struct attribute **)type->group->attrs);
kobject_put(type->devices_kobj);
sysfs_remove_file(&type->kobj, &mdev_type_attr_create.attr);
kobject_del(&type->kobj);
kobject_put(&type->kobj);
}
static int add_mdev_supported_type_groups(struct parent_device *parent)
{
int i;
for (i = 0; parent->ops->supported_type_groups[i]; i++) {
struct mdev_type *type;
type = add_mdev_supported_type(parent,
parent->ops->supported_type_groups[i]);
if (IS_ERR(type)) {
struct mdev_type *ltype, *tmp;
list_for_each_entry_safe(ltype, tmp, &parent->type_list,
next) {
list_del(&ltype->next);
remove_mdev_supported_type(ltype);
}
return PTR_ERR(type);
}
list_add(&type->next, &parent->type_list);
}
return 0;
}
/* mdev sysfs functions */
void parent_remove_sysfs_files(struct parent_device *parent)
{
struct mdev_type *type, *tmp;
list_for_each_entry_safe(type, tmp, &parent->type_list, next) {
list_del(&type->next);
remove_mdev_supported_type(type);
}
sysfs_remove_groups(&parent->dev->kobj, parent->ops->dev_attr_groups);
kset_unregister(parent->mdev_types_kset);
}
int parent_create_sysfs_files(struct parent_device *parent)
{
int ret;
parent->mdev_types_kset = kset_create_and_add("mdev_supported_types",
NULL, &parent->dev->kobj);
if (!parent->mdev_types_kset)
return -ENOMEM;
INIT_LIST_HEAD(&parent->type_list);
ret = sysfs_create_groups(&parent->dev->kobj,
parent->ops->dev_attr_groups);
if (ret)
goto create_err;
ret = add_mdev_supported_type_groups(parent);
if (ret)
sysfs_remove_groups(&parent->dev->kobj,
parent->ops->dev_attr_groups);
else
return ret;
create_err:
kset_unregister(parent->mdev_types_kset);
return ret;
}
static ssize_t remove_store(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
unsigned long val;
if (kstrtoul(buf, 0, &val) < 0)
return -EINVAL;
if (val && device_remove_file_self(dev, attr)) {
int ret;
ret = mdev_device_remove(dev, false);
if (ret) {
device_create_file(dev, attr);
return ret;
}
}
return count;
}
static DEVICE_ATTR_WO(remove);
static const struct attribute *mdev_device_attrs[] = {
&dev_attr_remove.attr,
NULL,
};
int mdev_create_sysfs_files(struct device *dev, struct mdev_type *type)
{
int ret;
ret = sysfs_create_files(&dev->kobj, mdev_device_attrs);
if (ret)
return ret;
ret = sysfs_create_link(type->devices_kobj, &dev->kobj, dev_name(dev));
if (ret)
goto device_link_failed;
ret = sysfs_create_link(&dev->kobj, &type->kobj, "mdev_type");
if (ret)
goto type_link_failed;
return ret;
type_link_failed:
sysfs_remove_link(type->devices_kobj, dev_name(dev));
device_link_failed:
sysfs_remove_files(&dev->kobj, mdev_device_attrs);
return ret;
}
void mdev_remove_sysfs_files(struct device *dev, struct mdev_type *type)
{
sysfs_remove_link(&dev->kobj, "mdev_type");
sysfs_remove_link(type->devices_kobj, dev_name(dev));
sysfs_remove_files(&dev->kobj, mdev_device_attrs);
}
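`mdev_create_sysfs_files()` acquires three sysfs resources in order and unwinds them in reverse through `goto` labels, and `mdev_remove_sysfs_files()` releases them in the same reverse order. The userspace sketch below illustrates that pattern only. `res_state`, `create_all()`, `remove_all()` and the `fail_at` failure injection are hypothetical and stand in for the real sysfs calls.

```c
#include <stdbool.h>

/* Records which hypothetical resources are currently held. */
struct res_state {
	bool files;
	bool device_link;
	bool type_link;
};

/*
 * Acquire three resources in order. fail_at (1..3) injects a failure at
 * that step; 0 means every step succeeds. Returns 0 or -1.
 */
static int create_all(struct res_state *s, int fail_at)
{
	if (fail_at == 1)
		return -1;
	s->files = true;

	if (fail_at == 2)
		goto device_link_failed;
	s->device_link = true;

	if (fail_at == 3)
		goto type_link_failed;
	s->type_link = true;
	return 0;

	/* Each label undoes only what was acquired before the failing step. */
type_link_failed:
	s->device_link = false;
device_link_failed:
	s->files = false;
	return -1;
}

/* Full teardown, in the reverse of the acquisition order. */
static void remove_all(struct res_state *s)
{
	s->type_link = false;
	s->device_link = false;
	s->files = false;
}
```

Falling through the labels means a failure at step N releases exactly the resources from steps N-1 down to 1, so the caller never sees a half-created device.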


@ -0,0 +1,148 @@
/*
* VFIO based driver for Mediated device
*
* Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
* Author: Neo Jia <cjia@nvidia.com>
* Kirti Wankhede <kwankhede@nvidia.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/init.h>
#include <linux/module.h>
#include <linux/device.h>
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/vfio.h>
#include <linux/mdev.h>
#include "mdev_private.h"
#define DRIVER_VERSION "0.1"
#define DRIVER_AUTHOR "NVIDIA Corporation"
#define DRIVER_DESC "VFIO based driver for Mediated device"
static int vfio_mdev_open(void *device_data)
{
struct mdev_device *mdev = device_data;
struct parent_device *parent = mdev->parent;
int ret;
if (unlikely(!parent->ops->open))
return -EINVAL;
if (!try_module_get(THIS_MODULE))
return -ENODEV;
ret = parent->ops->open(mdev);
if (ret)
module_put(THIS_MODULE);
return ret;
}
static void vfio_mdev_release(void *device_data)
{
struct mdev_device *mdev = device_data;
struct parent_device *parent = mdev->parent;
if (likely(parent->ops->release))
parent->ops->release(mdev);
module_put(THIS_MODULE);
}
static long vfio_mdev_unlocked_ioctl(void *device_data,
unsigned int cmd, unsigned long arg)
{
struct mdev_device *mdev = device_data;
struct parent_device *parent = mdev->parent;
if (unlikely(!parent->ops->ioctl))
return -EINVAL;
return parent->ops->ioctl(mdev, cmd, arg);
}
static ssize_t vfio_mdev_read(void *device_data, char __user *buf,
size_t count, loff_t *ppos)
{
struct mdev_device *mdev = device_data;
struct parent_device *parent = mdev->parent;
if (unlikely(!parent->ops->read))
return -EINVAL;
return parent->ops->read(mdev, buf, count, ppos);
}
static ssize_t vfio_mdev_write(void *device_data, const char __user *buf,
size_t count, loff_t *ppos)
{
struct mdev_device *mdev = device_data;
struct parent_device *parent = mdev->parent;
if (unlikely(!parent->ops->write))
return -EINVAL;
return parent->ops->write(mdev, buf, count, ppos);
}
static int vfio_mdev_mmap(void *device_data, struct vm_area_struct *vma)
{
struct mdev_device *mdev = device_data;
struct parent_device *parent = mdev->parent;
if (unlikely(!parent->ops->mmap))
return -EINVAL;
return parent->ops->mmap(mdev, vma);
}
static const struct vfio_device_ops vfio_mdev_dev_ops = {
.name = "vfio-mdev",
.open = vfio_mdev_open,
.release = vfio_mdev_release,
.ioctl = vfio_mdev_unlocked_ioctl,
.read = vfio_mdev_read,
.write = vfio_mdev_write,
.mmap = vfio_mdev_mmap,
};
int vfio_mdev_probe(struct device *dev)
{
struct mdev_device *mdev = to_mdev_device(dev);
return vfio_add_group_dev(dev, &vfio_mdev_dev_ops, mdev);
}
void vfio_mdev_remove(struct device *dev)
{
vfio_del_group_dev(dev);
}
struct mdev_driver vfio_mdev_driver = {
.name = "vfio_mdev",
.probe = vfio_mdev_probe,
.remove = vfio_mdev_remove,
};
static int __init vfio_mdev_init(void)
{
return mdev_register_driver(&vfio_mdev_driver, THIS_MODULE);
}
static void __exit vfio_mdev_exit(void)
{
mdev_unregister_driver(&vfio_mdev_driver);
}
module_init(vfio_mdev_init)
module_exit(vfio_mdev_exit)
MODULE_VERSION(DRIVER_VERSION);
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR(DRIVER_AUTHOR);
MODULE_DESCRIPTION(DRIVER_DESC);
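Each `vfio_mdev_*` callback above follows the same shape: look up the parent's ops table, return `-EINVAL` if the vendor driver did not supply that callback, otherwise forward the call. A minimal userspace sketch of this mediation pattern is shown below. `dev_stub`, `ops_stub`, `reg_read()` and `read_only_ops` are hypothetical names, and `long` replaces `ssize_t` to keep the example portable.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct dev_stub;

/* Hypothetical, reduced version of a vendor ops table. */
struct ops_stub {
	long (*read)(struct dev_stub *dev, char *buf, size_t count);
	long (*write)(struct dev_stub *dev, const char *buf, size_t count);
};

struct dev_stub {
	const struct ops_stub *ops;
	char reg[16];		/* pretend device register space */
};

/* Forward to the vendor callback, or fail if it was not provided. */
static long dev_read(struct dev_stub *dev, char *buf, size_t count)
{
	if (!dev->ops->read)
		return -EINVAL;
	return dev->ops->read(dev, buf, count);
}

static long dev_write(struct dev_stub *dev, const char *buf, size_t count)
{
	if (!dev->ops->write)
		return -EINVAL;
	return dev->ops->write(dev, buf, count);
}

/* Example vendor callback: copy out of the emulated register space. */
static long reg_read(struct dev_stub *dev, char *buf, size_t count)
{
	if (count > sizeof(dev->reg))
		count = sizeof(dev->reg);
	memcpy(buf, dev->reg, count);
	return (long)count;
}

/* A vendor that implements read but not write. */
static const struct ops_stub read_only_ops = { .read = reg_read };
```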


@ -558,10 +558,9 @@ static int vfio_pci_for_each_slot_or_bus(struct pci_dev *pdev,
static int msix_sparse_mmap_cap(struct vfio_pci_device *vdev,
struct vfio_info_cap *caps)
{
struct vfio_info_cap_header *header;
struct vfio_region_info_cap_sparse_mmap *sparse;
size_t end, size;
int nr_areas = 2, i = 0;
int nr_areas = 2, i = 0, ret;
end = pci_resource_len(vdev->pdev, vdev->msix_bar);
@ -572,13 +571,10 @@ static int msix_sparse_mmap_cap(struct vfio_pci_device *vdev,
size = sizeof(*sparse) + (nr_areas * sizeof(*sparse->areas));
header = vfio_info_cap_add(caps, size,
VFIO_REGION_INFO_CAP_SPARSE_MMAP, 1);
if (IS_ERR(header))
return PTR_ERR(header);
sparse = kzalloc(size, GFP_KERNEL);
if (!sparse)
return -ENOMEM;
sparse = container_of(header,
struct vfio_region_info_cap_sparse_mmap, header);
sparse->nr_areas = nr_areas;
if (vdev->msix_offset & PAGE_MASK) {
@ -594,26 +590,11 @@ static int msix_sparse_mmap_cap(struct vfio_pci_device *vdev,
i++;
}
return 0;
}
ret = vfio_info_add_capability(caps, VFIO_REGION_INFO_CAP_SPARSE_MMAP,
sparse);
kfree(sparse);
static int region_type_cap(struct vfio_pci_device *vdev,
struct vfio_info_cap *caps,
unsigned int type, unsigned int subtype)
{
struct vfio_info_cap_header *header;
struct vfio_region_info_cap_type *cap;
header = vfio_info_cap_add(caps, sizeof(*cap),
VFIO_REGION_INFO_CAP_TYPE, 1);
if (IS_ERR(header))
return PTR_ERR(header);
cap = container_of(header, struct vfio_region_info_cap_type, header);
cap->type = type;
cap->subtype = subtype;
return 0;
return ret;
}
int vfio_pci_register_dev_region(struct vfio_pci_device *vdev,
@ -752,6 +733,9 @@ static long vfio_pci_ioctl(void *device_data,
break;
default:
{
struct vfio_region_info_cap_type cap_type;
if (info.index >=
VFIO_PCI_NUM_REGIONS + vdev->num_regions)
return -EINVAL;
@ -762,11 +746,16 @@ static long vfio_pci_ioctl(void *device_data,
info.size = vdev->region[i].size;
info.flags = vdev->region[i].flags;
ret = region_type_cap(vdev, &caps,
vdev->region[i].type,
vdev->region[i].subtype);
cap_type.type = vdev->region[i].type;
cap_type.subtype = vdev->region[i].subtype;
ret = vfio_info_add_capability(&caps,
VFIO_REGION_INFO_CAP_TYPE,
&cap_type);
if (ret)
return ret;
}
}
if (caps.size) {
@ -829,45 +818,25 @@ static long vfio_pci_ioctl(void *device_data,
} else if (cmd == VFIO_DEVICE_SET_IRQS) {
struct vfio_irq_set hdr;
size_t size;
u8 *data = NULL;
int max, ret = 0;
size_t data_size = 0;
minsz = offsetofend(struct vfio_irq_set, count);
if (copy_from_user(&hdr, (void __user *)arg, minsz))
return -EFAULT;
if (hdr.argsz < minsz || hdr.index >= VFIO_PCI_NUM_IRQS ||
hdr.count >= (U32_MAX - hdr.start) ||
hdr.flags & ~(VFIO_IRQ_SET_DATA_TYPE_MASK |
VFIO_IRQ_SET_ACTION_TYPE_MASK))
return -EINVAL;
max = vfio_pci_get_irq_count(vdev, hdr.index);
if (hdr.start >= max || hdr.start + hdr.count > max)
return -EINVAL;
switch (hdr.flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
case VFIO_IRQ_SET_DATA_NONE:
size = 0;
break;
case VFIO_IRQ_SET_DATA_BOOL:
size = sizeof(uint8_t);
break;
case VFIO_IRQ_SET_DATA_EVENTFD:
size = sizeof(int32_t);
break;
default:
return -EINVAL;
}
if (size) {
if (hdr.argsz - minsz < hdr.count * size)
return -EINVAL;
ret = vfio_set_irqs_validate_and_prepare(&hdr, max,
VFIO_PCI_NUM_IRQS, &data_size);
if (ret)
return ret;
if (data_size) {
data = memdup_user((void __user *)(arg + minsz),
hdr.count * size);
data_size);
if (IS_ERR(data))
return PTR_ERR(data);
}


@ -152,7 +152,7 @@ static int vfio_user_config_read(struct pci_dev *pdev, int offset,
*val = cpu_to_le32(tmp_val);
return pcibios_err_to_errno(ret);
return ret;
}
static int vfio_user_config_write(struct pci_dev *pdev, int offset,
@ -173,7 +173,7 @@ static int vfio_user_config_write(struct pci_dev *pdev, int offset,
break;
}
return pcibios_err_to_errno(ret);
return ret;
}
static int vfio_default_config_read(struct vfio_pci_device *vdev, int pos,
@ -257,7 +257,7 @@ static int vfio_direct_config_read(struct vfio_pci_device *vdev, int pos,
ret = vfio_user_config_read(vdev->pdev, pos, val, count);
if (ret)
return pcibios_err_to_errno(ret);
return ret;
if (pos >= PCI_CFG_SPACE_SIZE) { /* Extended cap header mangling */
if (offset < 4)
@ -295,7 +295,7 @@ static int vfio_raw_config_read(struct vfio_pci_device *vdev, int pos,
ret = vfio_user_config_read(vdev->pdev, pos, val, count);
if (ret)
return pcibios_err_to_errno(ret);
return ret;
return count;
}
@ -1089,7 +1089,7 @@ static int vfio_msi_config_write(struct vfio_pci_device *vdev, int pos,
start + PCI_MSI_FLAGS,
flags);
if (ret)
return pcibios_err_to_errno(ret);
return ret;
}
return count;


@ -364,36 +364,21 @@ static long vfio_platform_ioctl(void *device_data,
struct vfio_irq_set hdr;
u8 *data = NULL;
int ret = 0;
size_t data_size = 0;
minsz = offsetofend(struct vfio_irq_set, count);
if (copy_from_user(&hdr, (void __user *)arg, minsz))
return -EFAULT;
if (hdr.argsz < minsz)
return -EINVAL;
ret = vfio_set_irqs_validate_and_prepare(&hdr, vdev->num_irqs,
vdev->num_irqs, &data_size);
if (ret)
return ret;
if (hdr.index >= vdev->num_irqs)
return -EINVAL;
if (hdr.flags & ~(VFIO_IRQ_SET_DATA_TYPE_MASK |
VFIO_IRQ_SET_ACTION_TYPE_MASK))
return -EINVAL;
if (!(hdr.flags & VFIO_IRQ_SET_DATA_NONE)) {
size_t size;
if (hdr.flags & VFIO_IRQ_SET_DATA_BOOL)
size = sizeof(uint8_t);
else if (hdr.flags & VFIO_IRQ_SET_DATA_EVENTFD)
size = sizeof(int32_t);
else
return -EINVAL;
if (hdr.argsz - minsz < size)
return -EINVAL;
data = memdup_user((void __user *)(arg + minsz), size);
if (data_size) {
data = memdup_user((void __user *)(arg + minsz),
data_size);
if (IS_ERR(data))
return PTR_ERR(data);
}


@ -86,6 +86,8 @@ struct vfio_group {
struct mutex unbound_lock;
atomic_t opened;
bool noiommu;
struct kvm *kvm;
struct blocking_notifier_head notifier;
};
struct vfio_device {
@ -339,6 +341,7 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group)
#ifdef CONFIG_VFIO_NOIOMMU
group->noiommu = (iommu_group_get_iommudata(iommu_group) == &noiommu);
#endif
BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
group->nb.notifier_call = vfio_iommu_group_notifier;
@ -480,6 +483,21 @@ static struct vfio_group *vfio_group_get_from_minor(int minor)
return group;
}
static struct vfio_group *vfio_group_get_from_dev(struct device *dev)
{
struct iommu_group *iommu_group;
struct vfio_group *group;
iommu_group = iommu_group_get(dev);
if (!iommu_group)
return NULL;
group = vfio_group_get_from_iommu(iommu_group);
iommu_group_put(iommu_group);
return group;
}
/**
* Device objects - create, release, get, put, search
*/
@ -811,16 +829,10 @@ EXPORT_SYMBOL_GPL(vfio_add_group_dev);
*/
struct vfio_device *vfio_device_get_from_dev(struct device *dev)
{
struct iommu_group *iommu_group;
struct vfio_group *group;
struct vfio_device *device;
iommu_group = iommu_group_get(dev);
if (!iommu_group)
return NULL;
group = vfio_group_get_from_iommu(iommu_group);
iommu_group_put(iommu_group);
group = vfio_group_get_from_dev(dev);
if (!group)
return NULL;
@ -1376,6 +1388,23 @@ static bool vfio_group_viable(struct vfio_group *group)
group, vfio_dev_viable) == 0);
}
static int vfio_group_add_container_user(struct vfio_group *group)
{
if (!atomic_inc_not_zero(&group->container_users))
return -EINVAL;
if (group->noiommu) {
atomic_dec(&group->container_users);
return -EPERM;
}
if (!group->container->iommu_driver || !vfio_group_viable(group)) {
atomic_dec(&group->container_users);
return -EINVAL;
}
return 0;
}
static const struct file_operations vfio_device_fops;
static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
@ -1555,6 +1584,9 @@ static int vfio_group_fops_release(struct inode *inode, struct file *filep)
filep->private_data = NULL;
/* Did any user fail to unregister? */
WARN_ON(group->notifier.head);
vfio_group_try_dissolve_container(group);
atomic_dec(&group->opened);
@ -1685,23 +1717,14 @@ static const struct file_operations vfio_device_fops = {
struct vfio_group *vfio_group_get_external_user(struct file *filep)
{
struct vfio_group *group = filep->private_data;
int ret;
if (filep->f_op != &vfio_group_fops)
return ERR_PTR(-EINVAL);
if (!atomic_inc_not_zero(&group->container_users))
return ERR_PTR(-EINVAL);
if (group->noiommu) {
atomic_dec(&group->container_users);
return ERR_PTR(-EPERM);
}
if (!group->container->iommu_driver ||
!vfio_group_viable(group)) {
atomic_dec(&group->container_users);
return ERR_PTR(-EINVAL);
}
ret = vfio_group_add_container_user(group);
if (ret)
return ERR_PTR(ret);
vfio_group_get(group);
@ -1763,7 +1786,7 @@ struct vfio_info_cap_header *vfio_info_cap_add(struct vfio_info_cap *caps,
header->version = version;
/* Add to the end of the capability chain */
for (tmp = caps->buf; tmp->next; tmp = (void *)tmp + tmp->next)
for (tmp = buf; tmp->next; tmp = buf + tmp->next)
; /* nothing */
tmp->next = caps->size;
@ -1776,11 +1799,403 @@ EXPORT_SYMBOL_GPL(vfio_info_cap_add);
void vfio_info_cap_shift(struct vfio_info_cap *caps, size_t offset)
{
struct vfio_info_cap_header *tmp;
void *buf = (void *)caps->buf;
for (tmp = caps->buf; tmp->next; tmp = (void *)tmp + tmp->next - offset)
for (tmp = buf; tmp->next; tmp = buf + tmp->next - offset)
tmp->next += offset;
}
EXPORT_SYMBOL_GPL(vfio_info_cap_shift);
EXPORT_SYMBOL(vfio_info_cap_shift);
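The chain-walk change in `vfio_info_cap_add()` and `vfio_info_cap_shift()` matters because `next` holds an offset from the start of the capability buffer, not from the current header. Stepping with `(void *)tmp + tmp->next` happens to work for the first hop (where `tmp` is the buffer start) and then goes wrong on later hops. The userspace sketch below illustrates the corrected walk. `cap_header`, `cap_chain`, `cap_at()`, `cap_add()` and `cap_count()` are hypothetical, and a fixed array replaces the kernel's `krealloc()`-grown buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct vfio_info_cap_header. */
struct cap_header {
	uint16_t id;
	uint16_t version;
	uint32_t next;	/* offset of next cap from buffer start, 0 = end */
};

#define CHAIN_BYTES 256

struct cap_chain {
	/* Backing store in header-sized units so every header is aligned. */
	struct cap_header slots[CHAIN_BYTES / sizeof(struct cap_header)];
	size_t size;	/* bytes in use; must start zeroed */
};

/* Header located at byte offset off from the start of the buffer. */
static struct cap_header *cap_at(struct cap_chain *c, size_t off)
{
	return &c->slots[off / sizeof(struct cap_header)];
}

/* Append a capability of size bytes (a multiple of the header size). */
static struct cap_header *cap_add(struct cap_chain *c, size_t size, uint16_t id)
{
	struct cap_header *hdr, *tmp;

	if (size < sizeof(*hdr) || size % sizeof(*hdr) ||
	    size > CHAIN_BYTES - c->size)
		return NULL;

	hdr = cap_at(c, c->size);
	memset(hdr, 0, size);
	hdr->id = id;
	hdr->version = 1;

	/* Walk to the tail: every hop is relative to the buffer start. */
	for (tmp = cap_at(c, 0); tmp->next; tmp = cap_at(c, tmp->next))
		; /* nothing */

	tmp->next = (uint32_t)c->size;
	c->size += size;
	return hdr;
}

/* Count the capabilities reachable from the head of the chain. */
static int cap_count(struct cap_chain *c)
{
	struct cap_header *tmp;
	int n = 1;

	if (!c->size)
		return 0;
	for (tmp = cap_at(c, 0); tmp->next; tmp = cap_at(c, tmp->next))
		n++;
	return n;
}
```

With four capabilities of 8, 16, 24 and 8 bytes, the stored `next` offsets are 8, 24, 48 and 0. A header-relative walk would jump from offset 8 to offset 32, which is in the middle of the third capability.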
static int sparse_mmap_cap(struct vfio_info_cap *caps, void *cap_type)
{
struct vfio_info_cap_header *header;
struct vfio_region_info_cap_sparse_mmap *sparse_cap, *sparse = cap_type;
size_t size;
size = sizeof(*sparse) + sparse->nr_areas * sizeof(*sparse->areas);
header = vfio_info_cap_add(caps, size,
VFIO_REGION_INFO_CAP_SPARSE_MMAP, 1);
if (IS_ERR(header))
return PTR_ERR(header);
sparse_cap = container_of(header,
struct vfio_region_info_cap_sparse_mmap, header);
sparse_cap->nr_areas = sparse->nr_areas;
memcpy(sparse_cap->areas, sparse->areas,
sparse->nr_areas * sizeof(*sparse->areas));
return 0;
}
static int region_type_cap(struct vfio_info_cap *caps, void *cap_type)
{
struct vfio_info_cap_header *header;
struct vfio_region_info_cap_type *type_cap, *cap = cap_type;
header = vfio_info_cap_add(caps, sizeof(*cap),
VFIO_REGION_INFO_CAP_TYPE, 1);
if (IS_ERR(header))
return PTR_ERR(header);
type_cap = container_of(header, struct vfio_region_info_cap_type,
header);
type_cap->type = cap->type;
type_cap->subtype = cap->subtype;
return 0;
}
int vfio_info_add_capability(struct vfio_info_cap *caps, int cap_type_id,
void *cap_type)
{
int ret = -EINVAL;
if (!cap_type)
return 0;
switch (cap_type_id) {
case VFIO_REGION_INFO_CAP_SPARSE_MMAP:
ret = sparse_mmap_cap(caps, cap_type);
break;
case VFIO_REGION_INFO_CAP_TYPE:
ret = region_type_cap(caps, cap_type);
break;
}
return ret;
}
EXPORT_SYMBOL(vfio_info_add_capability);
int vfio_set_irqs_validate_and_prepare(struct vfio_irq_set *hdr, int num_irqs,
int max_irq_type, size_t *data_size)
{
unsigned long minsz;
size_t size;
minsz = offsetofend(struct vfio_irq_set, count);
if ((hdr->argsz < minsz) || (hdr->index >= max_irq_type) ||
(hdr->count >= (U32_MAX - hdr->start)) ||
(hdr->flags & ~(VFIO_IRQ_SET_DATA_TYPE_MASK |
VFIO_IRQ_SET_ACTION_TYPE_MASK)))
return -EINVAL;
if (data_size)
*data_size = 0;
if (hdr->start >= num_irqs || hdr->start + hdr->count > num_irqs)
return -EINVAL;
switch (hdr->flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
case VFIO_IRQ_SET_DATA_NONE:
size = 0;
break;
case VFIO_IRQ_SET_DATA_BOOL:
size = sizeof(uint8_t);
break;
case VFIO_IRQ_SET_DATA_EVENTFD:
size = sizeof(int32_t);
break;
default:
return -EINVAL;
}
if (size) {
if (hdr->argsz - minsz < hdr->count * size)
return -EINVAL;
if (!data_size)
return -EINVAL;
*data_size = hdr->count * size;
}
return 0;
}
EXPORT_SYMBOL(vfio_set_irqs_validate_and_prepare);
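`vfio_set_irqs_validate_and_prepare()` centralizes the header checks that the PCI and platform ioctl paths previously open-coded. As a rough userspace illustration of the same checks, the sketch below mirrors its logic. `irq_set_hdr`, `irq_set_validate()` and the `IRQ_*` macros are local, hypothetical names, and the flag bit positions are assumed to match the VFIO uapi header.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the flag layout; bit positions are an assumption. */
#define IRQ_DATA_NONE		(1u << 0)
#define IRQ_DATA_BOOL		(1u << 1)
#define IRQ_DATA_EVENTFD	(1u << 2)
#define IRQ_ACTION_MASK		(1u << 3)
#define IRQ_ACTION_UNMASK	(1u << 4)
#define IRQ_ACTION_TRIGGER	(1u << 5)
#define IRQ_DATA_TYPE_MASK \
	(IRQ_DATA_NONE | IRQ_DATA_BOOL | IRQ_DATA_EVENTFD)
#define IRQ_ACTION_TYPE_MASK \
	(IRQ_ACTION_MASK | IRQ_ACTION_UNMASK | IRQ_ACTION_TRIGGER)

/* Fixed part of the request; the payload follows it in user memory. */
struct irq_set_hdr {
	uint32_t argsz, flags, index, start, count;
};

/* Returns 0 and sets *data_size to the payload length, or -EINVAL. */
static int irq_set_validate(const struct irq_set_hdr *hdr, uint32_t num_irqs,
			    uint32_t max_irq_type, size_t *data_size)
{
	const size_t minsz = sizeof(*hdr);
	size_t size;

	if (hdr->argsz < minsz || hdr->index >= max_irq_type ||
	    hdr->count >= (UINT32_MAX - hdr->start) ||
	    (hdr->flags & ~(IRQ_DATA_TYPE_MASK | IRQ_ACTION_TYPE_MASK)))
		return -EINVAL;

	if (data_size)
		*data_size = 0;

	/* The requested range must lie inside the IRQs of this index. */
	if (hdr->start >= num_irqs || hdr->start + hdr->count > num_irqs)
		return -EINVAL;

	/* Exactly one data type bit selects the per-IRQ element size. */
	switch (hdr->flags & IRQ_DATA_TYPE_MASK) {
	case IRQ_DATA_NONE:
		size = 0;
		break;
	case IRQ_DATA_BOOL:
		size = sizeof(uint8_t);
		break;
	case IRQ_DATA_EVENTFD:
		size = sizeof(int32_t);
		break;
	default:
		return -EINVAL;
	}

	if (size) {
		/* The caller-declared argsz must cover the whole payload. */
		if (hdr->argsz - minsz < (size_t)hdr->count * size)
			return -EINVAL;
		if (!data_size)
			return -EINVAL;
		*data_size = (size_t)hdr->count * size;
	}
	return 0;
}
```

Returning the payload length through `data_size` lets the caller pass it straight to `memdup_user()`, as the updated PCI and platform ioctl paths do.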
/*
* Pin a set of guest PFNs and return their associated host PFNs for local
* domain only.
* @dev [in] : device
* @user_pfn [in]: array of user/guest PFNs to be pinned.
* @npage [in] : count of elements in user_pfn array. This count should not
* be greater than VFIO_PIN_PAGES_MAX_ENTRIES.
* @prot [in] : protection flags
* @phys_pfn[out]: array of host PFNs
* Return error or number of pages pinned.
*/
int vfio_pin_pages(struct device *dev, unsigned long *user_pfn, int npage,
int prot, unsigned long *phys_pfn)
{
struct vfio_container *container;
struct vfio_group *group;
struct vfio_iommu_driver *driver;
int ret;
if (!dev || !user_pfn || !phys_pfn || !npage)
return -EINVAL;
if (npage > VFIO_PIN_PAGES_MAX_ENTRIES)
return -E2BIG;
group = vfio_group_get_from_dev(dev);
if (!group)
return -ENODEV;
ret = vfio_group_add_container_user(group);
if (ret)
goto err_pin_pages;
container = group->container;
down_read(&container->group_lock);
driver = container->iommu_driver;
if (likely(driver && driver->ops->pin_pages))
ret = driver->ops->pin_pages(container->iommu_data, user_pfn,
npage, prot, phys_pfn);
else
ret = -ENOTTY;
up_read(&container->group_lock);
vfio_group_try_dissolve_container(group);
err_pin_pages:
vfio_group_put(group);
return ret;
}
EXPORT_SYMBOL(vfio_pin_pages);
/*
* Unpin set of host PFNs for local domain only.
* @dev [in] : device
* @user_pfn [in]: array of user/guest PFNs to be unpinned. Number of user/guest
* PFNs should not be greater than VFIO_PIN_PAGES_MAX_ENTRIES.
* @npage [in] : count of elements in user_pfn array. This count should not
* be greater than VFIO_PIN_PAGES_MAX_ENTRIES.
* Return error or number of pages unpinned.
*/
int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn, int npage)
{
struct vfio_container *container;
struct vfio_group *group;
struct vfio_iommu_driver *driver;
int ret;
if (!dev || !user_pfn || !npage)
return -EINVAL;
if (npage > VFIO_PIN_PAGES_MAX_ENTRIES)
return -E2BIG;
group = vfio_group_get_from_dev(dev);
if (!group)
return -ENODEV;
ret = vfio_group_add_container_user(group);
if (ret)
goto err_unpin_pages;
container = group->container;
down_read(&container->group_lock);
driver = container->iommu_driver;
if (likely(driver && driver->ops->unpin_pages))
ret = driver->ops->unpin_pages(container->iommu_data, user_pfn,
npage);
else
ret = -ENOTTY;
up_read(&container->group_lock);
vfio_group_try_dissolve_container(group);
err_unpin_pages:
vfio_group_put(group);
return ret;
}
EXPORT_SYMBOL(vfio_unpin_pages);
static int vfio_register_iommu_notifier(struct vfio_group *group,
unsigned long *events,
struct notifier_block *nb)
{
struct vfio_container *container;
struct vfio_iommu_driver *driver;
int ret;
ret = vfio_group_add_container_user(group);
if (ret)
return -EINVAL;
container = group->container;
down_read(&container->group_lock);
driver = container->iommu_driver;
if (likely(driver && driver->ops->register_notifier))
ret = driver->ops->register_notifier(container->iommu_data,
events, nb);
else
ret = -ENOTTY;
up_read(&container->group_lock);
vfio_group_try_dissolve_container(group);
return ret;
}
static int vfio_unregister_iommu_notifier(struct vfio_group *group,
struct notifier_block *nb)
{
struct vfio_container *container;
struct vfio_iommu_driver *driver;
int ret;
ret = vfio_group_add_container_user(group);
if (ret)
return -EINVAL;
container = group->container;
down_read(&container->group_lock);
driver = container->iommu_driver;
if (likely(driver && driver->ops->unregister_notifier))
ret = driver->ops->unregister_notifier(container->iommu_data,
nb);
else
ret = -ENOTTY;
up_read(&container->group_lock);
vfio_group_try_dissolve_container(group);
return ret;
}
void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm)
{
group->kvm = kvm;
blocking_notifier_call_chain(&group->notifier,
VFIO_GROUP_NOTIFY_SET_KVM, kvm);
}
EXPORT_SYMBOL_GPL(vfio_group_set_kvm);
static int vfio_register_group_notifier(struct vfio_group *group,
unsigned long *events,
struct notifier_block *nb)
{
struct vfio_container *container;
int ret;
bool set_kvm = false;
if (*events & VFIO_GROUP_NOTIFY_SET_KVM)
set_kvm = true;
/* clear known events */
*events &= ~VFIO_GROUP_NOTIFY_SET_KVM;
/* refuse to continue if still events remaining */
if (*events)
return -EINVAL;
ret = vfio_group_add_container_user(group);
if (ret)
return -EINVAL;
container = group->container;
down_read(&container->group_lock);
ret = blocking_notifier_chain_register(&group->notifier, nb);
/*
* The kvm may already have been attached to the vfio_group, so
* replay the event once upon registration.
*/
if (!ret && set_kvm && group->kvm)
blocking_notifier_call_chain(&group->notifier,
VFIO_GROUP_NOTIFY_SET_KVM, group->kvm);
up_read(&container->group_lock);
vfio_group_try_dissolve_container(group);
return ret;
}
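`vfio_register_group_notifier()` replays `VFIO_GROUP_NOTIFY_SET_KVM` at registration time so that a listener registering after KVM was attached still learns the current `struct kvm` pointer. The userspace sketch below shows that replay-on-register idea. `group_stub`, `nb_stub`, `group_register()`, `group_set_kvm()` and `count_calls()` are hypothetical, and a small fixed array stands in for the kernel's blocking notifier chain.

```c
#include <stddef.h>

#define MAX_NB 4

typedef void (*notify_fn)(void *ctx, void *kvm);

struct nb_stub {
	notify_fn fn;
	void *ctx;
};

struct group_stub {
	void *kvm;			/* last value published, NULL if none */
	struct nb_stub chain[MAX_NB];
	size_t count;
};

/* Deliver the event to every registered listener. */
static void call_chain(struct group_stub *g, void *kvm)
{
	size_t i;

	for (i = 0; i < g->count; i++)
		g->chain[i].fn(g->chain[i].ctx, kvm);
}

/* Publish a new kvm pointer, as vfio_group_set_kvm() does. */
static void group_set_kvm(struct group_stub *g, void *kvm)
{
	g->kvm = kvm;
	call_chain(g, kvm);
}

/* Register a listener and replay the current state if there is one. */
static int group_register(struct group_stub *g, notify_fn fn, void *ctx)
{
	if (!fn || g->count == MAX_NB)
		return -1;

	g->chain[g->count].fn = fn;
	g->chain[g->count].ctx = ctx;
	g->count++;

	/* As in the diff, the replay goes through the whole chain. */
	if (g->kvm)
		call_chain(g, g->kvm);
	return 0;
}

/* Example listener: count how many times it has been notified. */
static void count_calls(void *ctx, void *kvm)
{
	(void)kvm;
	++*(int *)ctx;
}
```

Because the replay uses the whole chain, listeners that registered earlier see the event again, so a callback written against this scheme should tolerate repeated notifications carrying the same pointer.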
static int vfio_unregister_group_notifier(struct vfio_group *group,
struct notifier_block *nb)
{
struct vfio_container *container;
int ret;
ret = vfio_group_add_container_user(group);
if (ret)
return -EINVAL;
container = group->container;
down_read(&container->group_lock);
ret = blocking_notifier_chain_unregister(&group->notifier, nb);
up_read(&container->group_lock);
vfio_group_try_dissolve_container(group);
return ret;
}
int vfio_register_notifier(struct device *dev, enum vfio_notify_type type,
unsigned long *events, struct notifier_block *nb)
{
struct vfio_group *group;
int ret;
if (!dev || !nb || !events || (*events == 0))
return -EINVAL;
group = vfio_group_get_from_dev(dev);
if (!group)
return -ENODEV;
switch (type) {
case VFIO_IOMMU_NOTIFY:
ret = vfio_register_iommu_notifier(group, events, nb);
break;
case VFIO_GROUP_NOTIFY:
ret = vfio_register_group_notifier(group, events, nb);
break;
default:
ret = -EINVAL;
}
vfio_group_put(group);
return ret;
}
EXPORT_SYMBOL(vfio_register_notifier);
int vfio_unregister_notifier(struct device *dev, enum vfio_notify_type type,
struct notifier_block *nb)
{
struct vfio_group *group;
int ret;
if (!dev || !nb)
return -EINVAL;
group = vfio_group_get_from_dev(dev);
if (!group)
return -ENODEV;
switch (type) {
case VFIO_IOMMU_NOTIFY:
ret = vfio_unregister_iommu_notifier(group, nb);
break;
case VFIO_GROUP_NOTIFY:
ret = vfio_unregister_group_notifier(group, nb);
break;
default:
ret = -EINVAL;
}
vfio_group_put(group);
return ret;
}
EXPORT_SYMBOL(vfio_unregister_notifier);
/**
* Module/class support

File diff suppressed because it is too large

include/linux/mdev.h Normal file

@ -0,0 +1,168 @@
/*
* Mediated device definition
*
* Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
* Author: Neo Jia <cjia@nvidia.com>
* Kirti Wankhede <kwankhede@nvidia.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#ifndef MDEV_H
#define MDEV_H
/* Parent device */
struct parent_device {
struct device *dev;
const struct parent_ops *ops;
/* internal */
struct kref ref;
struct mutex lock;
struct list_head next;
struct kset *mdev_types_kset;
struct list_head type_list;
};
/* Mediated device */
struct mdev_device {
struct device dev;
struct parent_device *parent;
uuid_le uuid;
void *driver_data;
/* internal */
struct kref ref;
struct list_head next;
struct kobject *type_kobj;
};
/**
* struct parent_ops - Structure to be registered for each parent device to
* register the device to mdev module.
*
* @owner: The module owner.
* @dev_attr_groups: Attributes of the parent device.
* @mdev_attr_groups: Attributes of the mediated device.
* @supported_type_groups: Attributes to define supported types. It is mandatory
* to provide supported types.
* @create: Called to allocate basic resources in parent device's
* driver for a particular mediated device. It is
* mandatory to provide create ops.
* @kobj: kobject of type for which 'create' is called.
* @mdev: mdev_device structure of the mediated device
* that is being created
* Returns integer: success (0) or error (< 0)
* @remove: Called to free resources in parent device's driver for a
* mediated device. It is mandatory to provide 'remove'
* ops.
* @mdev: mdev_device device structure which is being
* destroyed
* Returns integer: success (0) or error (< 0)
* @open: Open mediated device.
* @mdev: mediated device.
* Returns integer: success (0) or error (< 0)
* @release: release mediated device
* @mdev: mediated device.
* @read: Read emulation callback
* @mdev: mediated device structure
* @buf: read buffer
* @count: number of bytes to read
* @ppos: address.
* Returns number of bytes read on success or error.
* @write: Write emulation callback
* @mdev: mediated device structure
* @buf: write buffer
* @count: number of bytes to be written
* @ppos: address.
* Returns number of bytes written on success or error.
* @ioctl: IOCTL callback
* @mdev: mediated device structure
* @cmd: ioctl command
* @arg: arguments to ioctl
* @mmap: mmap callback
* @mdev: mediated device structure
* @vma: vma structure
* Parent devices that support mediated devices should be registered with the
* mdev module using a parent_ops structure.
**/
struct parent_ops {
struct module *owner;
const struct attribute_group **dev_attr_groups;
const struct attribute_group **mdev_attr_groups;
struct attribute_group **supported_type_groups;
int (*create)(struct kobject *kobj, struct mdev_device *mdev);
int (*remove)(struct mdev_device *mdev);
int (*open)(struct mdev_device *mdev);
void (*release)(struct mdev_device *mdev);
ssize_t (*read)(struct mdev_device *mdev, char __user *buf,
size_t count, loff_t *ppos);
ssize_t (*write)(struct mdev_device *mdev, const char __user *buf,
size_t count, loff_t *ppos);
ssize_t (*ioctl)(struct mdev_device *mdev, unsigned int cmd,
unsigned long arg);
int (*mmap)(struct mdev_device *mdev, struct vm_area_struct *vma);
};
/* interface for exporting mdev supported type attributes */
struct mdev_type_attribute {
struct attribute attr;
ssize_t (*show)(struct kobject *kobj, struct device *dev, char *buf);
ssize_t (*store)(struct kobject *kobj, struct device *dev,
const char *buf, size_t count);
};
#define MDEV_TYPE_ATTR(_name, _mode, _show, _store) \
struct mdev_type_attribute mdev_type_attr_##_name = \
__ATTR(_name, _mode, _show, _store)
#define MDEV_TYPE_ATTR_RW(_name) \
struct mdev_type_attribute mdev_type_attr_##_name = __ATTR_RW(_name)
#define MDEV_TYPE_ATTR_RO(_name) \
struct mdev_type_attribute mdev_type_attr_##_name = __ATTR_RO(_name)
#define MDEV_TYPE_ATTR_WO(_name) \
struct mdev_type_attribute mdev_type_attr_##_name = __ATTR_WO(_name)
/**
* struct mdev_driver - Mediated device driver
* @name: driver name
* @probe: called when new device created
* @remove: called when device removed
* @driver: device driver structure
*
**/
struct mdev_driver {
const char *name;
int (*probe)(struct device *dev);
void (*remove)(struct device *dev);
struct device_driver driver;
};
#define to_mdev_driver(drv) container_of(drv, struct mdev_driver, driver)
#define to_mdev_device(dev) container_of(dev, struct mdev_device, dev)
static inline void *mdev_get_drvdata(struct mdev_device *mdev)
{
return mdev->driver_data;
}
static inline void mdev_set_drvdata(struct mdev_device *mdev, void *data)
{
mdev->driver_data = data;
}
extern struct bus_type mdev_bus_type;
#define dev_is_mdev(d) ((d)->bus == &mdev_bus_type)
extern int mdev_register_device(struct device *dev,
const struct parent_ops *ops);
extern void mdev_unregister_device(struct device *dev);
extern int mdev_register_driver(struct mdev_driver *drv, struct module *owner);
extern void mdev_unregister_driver(struct mdev_driver *drv);
#endif /* MDEV_H */
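
The header above makes `create` and `remove` mandatory and provides `mdev_get_drvdata()`/`mdev_set_drvdata()` for per-device vendor state. A minimal userspace sketch of how those pieces could fit together follows. The structures are cut down to the fields used here, `demo_state` and the `demo_*` callbacks are hypothetical, and `calloc`/`free` stand in for kernel allocation; this is not how any particular vendor driver does it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins: only the fields this sketch touches. */
struct kobject { const char *name; };
struct mdev_device { void *driver_data; };

static inline void *mdev_get_drvdata(struct mdev_device *mdev)
{
	return mdev->driver_data;
}

static inline void mdev_set_drvdata(struct mdev_device *mdev, void *data)
{
	mdev->driver_data = data;
}

/* Subset of struct parent_ops: just the two mandatory callbacks. */
struct parent_ops {
	int (*create)(struct kobject *kobj, struct mdev_device *mdev);
	int (*remove)(struct mdev_device *mdev);
};

/* Hypothetical per-mdev vendor state. */
struct demo_state { int nr_ports; };

/* Allocate vendor state and attach it to the new mediated device. */
static int demo_create(struct kobject *kobj, struct mdev_device *mdev)
{
	struct demo_state *st;

	(void)kobj;	/* a real driver would derive the type from kobj */
	st = calloc(1, sizeof(*st));
	if (!st)
		return -ENOMEM;
	st->nr_ports = 1;
	mdev_set_drvdata(mdev, st);
	return 0;
}

/* Release what demo_create() attached and clear the pointer. */
static int demo_remove(struct mdev_device *mdev)
{
	free(mdev_get_drvdata(mdev));
	mdev_set_drvdata(mdev, NULL);
	return 0;
}

static const struct parent_ops demo_ops = {
	.create = demo_create,
	.remove = demo_remove,
};
```

In the kernel the table would be handed to `mdev_register_device(dev, &demo_ops)` and would also need `supported_type_groups`, which this sketch leaves out.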


@@ -75,7 +75,16 @@ struct vfio_iommu_driver_ops {
struct iommu_group *group);
void (*detach_group)(void *iommu_data,
struct iommu_group *group);
int (*pin_pages)(void *iommu_data, unsigned long *user_pfn,
int npage, int prot,
unsigned long *phys_pfn);
int (*unpin_pages)(void *iommu_data,
unsigned long *user_pfn, int npage);
int (*register_notifier)(void *iommu_data,
unsigned long *events,
struct notifier_block *nb);
int (*unregister_notifier)(void *iommu_data,
struct notifier_block *nb);
};
extern int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);
@@ -92,6 +101,36 @@ extern int vfio_external_user_iommu_id(struct vfio_group *group);
extern long vfio_external_check_extension(struct vfio_group *group,
unsigned long arg);
#define VFIO_PIN_PAGES_MAX_ENTRIES (PAGE_SIZE/sizeof(unsigned long))
extern int vfio_pin_pages(struct device *dev, unsigned long *user_pfn,
int npage, int prot, unsigned long *phys_pfn);
extern int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn,
int npage);
/* each type has independent events */
enum vfio_notify_type {
VFIO_IOMMU_NOTIFY = 0,
VFIO_GROUP_NOTIFY = 1,
};
/* events for VFIO_IOMMU_NOTIFY */
#define VFIO_IOMMU_NOTIFY_DMA_UNMAP BIT(0)
/* events for VFIO_GROUP_NOTIFY */
#define VFIO_GROUP_NOTIFY_SET_KVM BIT(0)
extern int vfio_register_notifier(struct device *dev,
enum vfio_notify_type type,
unsigned long *required_events,
struct notifier_block *nb);
extern int vfio_unregister_notifier(struct device *dev,
enum vfio_notify_type type,
struct notifier_block *nb);
struct kvm;
extern void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm);
/*
* Sub-module helpers
*/
@@ -103,6 +142,13 @@ extern struct vfio_info_cap_header *vfio_info_cap_add(
struct vfio_info_cap *caps, size_t size, u16 id, u16 version);
extern void vfio_info_cap_shift(struct vfio_info_cap *caps, size_t offset);
extern int vfio_info_add_capability(struct vfio_info_cap *caps,
int cap_type_id, void *cap_type);
extern int vfio_set_irqs_validate_and_prepare(struct vfio_irq_set *hdr,
int num_irqs, int max_irq_type,
size_t *data_size);
struct pci_dev;
#ifdef CONFIG_EEH
extern void vfio_spapr_pci_eeh_open(struct pci_dev *pdev);
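
`VFIO_PIN_PAGES_MAX_ENTRIES` in the hunk above caps one pin request at `PAGE_SIZE / sizeof(unsigned long)` entries, so a caller with a larger range has to split it. The sketch below shows one way to batch such a request in userspace. It assumes 4 KiB pages, uses a hypothetical callback type in place of `vfio_pin_pages()` (without the `dev` argument), and the mock's `-E2BIG` rejection of oversized requests is an assumption of the mock.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */
#define VFIO_PIN_PAGES_MAX_ENTRIES (DEMO_PAGE_SIZE / sizeof(unsigned long))

/* Hypothetical callback with the shape of vfio_pin_pages() minus 'dev'. */
typedef int (*demo_pin_fn)(unsigned long *user_pfn, int npage, int prot,
			   unsigned long *phys_pfn);

static int demo_pin_calls;	/* number of successful mock calls */

/* Mock backend: fake translation, refuses requests above the cap. */
static int demo_mock_pin(unsigned long *user_pfn, int npage, int prot,
			 unsigned long *phys_pfn)
{
	int i;

	(void)prot;
	if (npage <= 0)
		return -EINVAL;
	if ((size_t)npage > VFIO_PIN_PAGES_MAX_ENTRIES)
		return -E2BIG;
	for (i = 0; i < npage; i++)
		phys_pfn[i] = user_pfn[i] + 0x1000;
	demo_pin_calls++;
	return npage;	/* number of pages pinned */
}

/*
 * Pin 'total' pages in batches no larger than the cap. On return, *pinned
 * holds how many pages were pinned, so a caller can unpin them after an
 * error. Returns 0, or the negative error from the failing batch.
 */
int demo_pin_batched(demo_pin_fn pin, unsigned long *user_pfn, size_t total,
		     int prot, unsigned long *phys_pfn, size_t *pinned)
{
	size_t done = 0;
	int ret = 0;

	while (done < total) {
		size_t n = total - done;

		if (n > VFIO_PIN_PAGES_MAX_ENTRIES)
			n = VFIO_PIN_PAGES_MAX_ENTRIES;
		ret = pin(user_pfn + done, (int)n, prot, phys_pfn + done);
		if (ret <= 0)
			break;	/* error, or no progress */
		done += (size_t)ret;
	}

	*pinned = done;
	return ret < 0 ? ret : 0;
}
```

Reporting the partial count through `*pinned` is a design choice of this sketch; it keeps cleanup in the caller's hands rather than hiding an unpin inside the helper.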


@@ -203,6 +203,16 @@ struct vfio_device_info {
};
#define VFIO_DEVICE_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 7)
/*
* Vendor driver using Mediated device framework should provide device_api
* attribute in supported type attribute groups. Device API string should be one
* of the following corresponding to device flags in vfio_device_info structure.
*/
#define VFIO_DEVICE_API_PCI_STRING "vfio-pci"
#define VFIO_DEVICE_API_PLATFORM_STRING "vfio-platform"
#define VFIO_DEVICE_API_AMBA_STRING "vfio-amba"
/**
* VFIO_DEVICE_GET_REGION_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 8,
* struct vfio_region_info)


@@ -0,0 +1,13 @@
#
# Makefile for mtty.c file
#
KERNEL_DIR:=/lib/modules/$(shell uname -r)/build
obj-m:=mtty.o
modules clean modules_install:
	$(MAKE) -C $(KERNEL_DIR) SUBDIRS=$(PWD) $@
default: modules
module: modules

samples/vfio-mdev/mtty.c

File diff suppressed because it is too large


@@ -60,6 +60,19 @@ static void kvm_vfio_group_put_external_user(struct vfio_group *vfio_group)
symbol_put(vfio_group_put_external_user);
}
static void kvm_vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm)
{
	void (*fn)(struct vfio_group *, struct kvm *);

	fn = symbol_get(vfio_group_set_kvm);
	if (!fn)
		return;

	fn(group, kvm);

	symbol_put(vfio_group_set_kvm);
}
static bool kvm_vfio_group_is_coherent(struct vfio_group *vfio_group)
{
long (*fn)(struct vfio_group *, unsigned long);
@@ -159,6 +172,8 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg)
mutex_unlock(&kv->lock);
kvm_vfio_group_set_kvm(vfio_group, dev->kvm);
kvm_vfio_update_coherency(dev);
return 0;
@@ -196,6 +211,8 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg)
mutex_unlock(&kv->lock);
kvm_vfio_group_set_kvm(vfio_group, NULL);
kvm_vfio_group_put_external_user(vfio_group);
kvm_vfio_update_coherency(dev);
@@ -240,6 +257,7 @@ static void kvm_vfio_destroy(struct kvm_device *dev)
struct kvm_vfio_group *kvg, *tmp;
list_for_each_entry_safe(kvg, tmp, &kv->group_list, node) {
kvm_vfio_group_set_kvm(kvg->vfio_group, NULL);
kvm_vfio_group_put_external_user(kvg->vfio_group);
list_del(&kvg->node);
kfree(kvg);
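
These hunks have KVM call `vfio_group_set_kvm()` with the `struct kvm` pointer when a group is added, and with `NULL` when it is removed or the device is destroyed. The body of `vfio_group_set_kvm()` is not shown in this excerpt, so the sketch below rests on an assumption: that it records the pointer and forwards it to registered group notifiers with `VFIO_GROUP_NOTIFY_SET_KVM`. The hand-rolled `demo_*` chain is a userspace stand-in for the kernel notifier API, and all `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define BIT(n) (1UL << (n))
#define VFIO_GROUP_NOTIFY_SET_KVM BIT(0)	/* as defined in this series */

struct kvm { int id; };	/* opaque to VFIO; given a body only for the demo */

/* Hypothetical singly linked notifier chain. */
struct demo_notifier {
	int (*call)(struct demo_notifier *nb, unsigned long action, void *data);
	struct demo_notifier *next;
};

struct demo_group {
	struct kvm *kvm;
	struct demo_notifier *chain;
};

void demo_group_register(struct demo_group *g, struct demo_notifier *nb)
{
	nb->next = g->chain;
	g->chain = nb;
}

/* Assumed behaviour: store the pointer, then tell every listener. */
void demo_group_set_kvm(struct demo_group *g, struct kvm *kvm)
{
	struct demo_notifier *nb;

	g->kvm = kvm;
	for (nb = g->chain; nb; nb = nb->next)
		nb->call(nb, VFIO_GROUP_NOTIFY_SET_KVM, kvm);
}

/* Hypothetical vendor driver state with an embedded notifier. */
struct demo_vendor {
	struct demo_notifier nb;
	struct kvm *kvm;
	int events;
};

static int demo_vendor_notify(struct demo_notifier *nb, unsigned long action,
			      void *data)
{
	/* container_of-style recovery of the enclosing structure */
	struct demo_vendor *v = (struct demo_vendor *)
		((char *)nb - offsetof(struct demo_vendor, nb));

	if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
		v->kvm = data;	/* NULL means the association was dropped */
		v->events++;
	}
	return 0;
}
```

A listener has to treat the `NULL` case as a teardown signal and stop using any cached `struct kvm` pointer from that point on.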