openkylin/qemu - qemu - 红山开源项目托管

Commit Graph

Author	SHA1	Message	Date
David Gibson	185181f883	spapr_pci: Allow VFIO devices to work on the normal PCI host bridge The core VFIO infrastructure more or less allows VFIO devices to work on any normal guest PCI host bridge (PHB) without extra logic. However, the "spapr-pci-host-bridge" device (as opposed to the special "spapr-pci-vfio-host-bridge" device) breaks this by using a partially KVM accelerated implementation of the guest kernel IOMMU which won't work with VFIO devices, without additional kernel support. This patch allows VFIO devices to work on the spapr-pci-host-bridge, by having it switch off KVM TCE acceleration when a VFIO device is added to the PHB (either on startup, or by hotplug). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com>	2015-10-23 10:38:10 +11:00
David Gibson	f93caaac36	spapr_pci: Allow PCI host bridge DMA window to be configured At present the PCI host bridge (PHB) for the pseries machine type has a fixed DMA window from 0..1GB (in PCI address space) which is mapped to real memory via the PAPR paravirtualized IOMMU. For better support of VFIO devices, we're going to want to allow for different configurations of the DMA window. Eventually we'll want to allow the guest itself to reconfigure the window via the PAPR dynamic DMA window interface, but as a preliminary this patch allows the user to reconfigure the window with new properties on the PHB device. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com>	2015-10-23 10:38:10 +11:00
Gavin Shan	47445c80fb	sPAPR: Revert don't enable EEH on emulated PCI devices This reverts commit `7cb18007` ("sPAPR: Don't enable EEH on emulated PCI devices") as rtas_ibm_set_eeh_option() isn't the right place to check if there has the corresponding PCI device for the input address, which can be PE address, not PCI device address. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2015-09-23 10:51:11 +10:00
Bharata B Rao	7a36ae7a9f	spapr: Support hotplug by specifying DRC count Support hotplug identifier type RTAS_LOG_V6_HP_ID_DRC_COUNT that allows hotplugging of DRCs by specifying the DRC count. While we are here, rename spapr_hotplug_req_add_event() to spapr_hotplug_req_add_by_index() spapr_hotplug_req_remove_event() to spapr_hotplug_req_remove_by_index() so that they match with spapr_hotplug_req_add_by_count(). Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2015-09-23 10:51:11 +10:00
Michael Roth	a8ad731a00	spapr_pci: fix device tree props for MSI/MSI-X PAPR requires ibm,req#msi and ibm,req#msi-x to be present in the device node to define the number of msi/msi-x interrupts the device supports, respectively. Currently we have ibm,req#msi-x hardcoded to a non-sensical constant that happens to be 2, and are missing ibm,req#msi entirely. The result of that is that msi-x capable devices get limited to 2 msi-x interrupts (which can impact performance), and msi-only devices likely wouldn't work at all. Additionally, if devices expect a minimum that exceeds 2, the guest driver may fail to load entirely. SLOF still owns the generation of these properties at boot-time (although other device properties have since been offloaded to QEMU), but for hotplugged devices we rely on the values generated by QEMU and thus hit the limitations above. Fix this by generating these properties in QEMU as expected by guests. In the future it may make sense to modify SLOF to pass through these values directly as we do with other props since we're duplicating SLOF code. Cc: qemu-ppc@nongnu.org Cc: qemu-stable@nongnu.org Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2015-09-23 10:51:10 +10:00
Gavin Shan	a14aa92b20	sPAPR: Introduce rtas_ldq() This introduces rtas_ldq() to load 64-bits parameter from continuous two 4-bytes memory chunk of RTAS parameter buffer, to simplify the code. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2015-09-23 10:51:09 +10:00
Sam Bobroff	b359bd6a42	spapr: Make ibm, change-msi respect 3 return values Currently, rtas_ibm_change_msi() always returns four values even if less are specified. Correct this by only returning the fourth parameter if it was requested. This is specified by PAPR. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2015-09-23 10:51:09 +10:00
Markus Armbruster	012aef0734	maint: avoid useless "if (foo) free(foo)" pattern My Coccinelle semantic patch finds a few more, because it also fixes up the equally pointless conditional if (foo) { free(foo); foo = NULL; } Result (feel free to squash it into your patch): Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2015-09-11 10:21:38 +03:00
Gavin Shan	7cb180079e	sPAPR: Don't enable EEH on emulated PCI devices There might have emulated PCI devices, together with VFIO PCI devices under one PHB. The EEH capability shouldn't enabled on emulated PCI devices. The patch returns error when enabling EEH capability on emulated PCI devices by RTAS call "ibm,set-eeh-option". Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-07-07 17:44:53 +02:00
Nikunj A Dadhania	e634b89c6e	spapr_pci: drop redundant args in spapr_[populate, create]_pci_child_dt * phb_index is not being used and if required can be obtained from sphb * use helper to get drc_index in spapr_populate_pci_child_dt() * Check if drc_index is zero Suggested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-07-07 17:44:53 +02:00
Nikunj A Dadhania	16b0ea1d85	spapr_pci: populate ibm,loc-code Each hardware instance has a platform unique location code. The OF device tree that describes a part of a hardware entity must include the “ibm,loc-code” property with a value that represents the location code for that hardware entity. Populate ibm,loc-code. 1) PCI passthru devices need to identify with its own ibm,loc-code available on the host. In failure cases use: vfio_<name>:<phb-index>:<bus>:<slot>.<fn> 2) Emulated devices encode as following: qemu_<name>:<phb-index>:<bus>:<slot>.<fn> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-07-07 17:44:53 +02:00
Nikunj A Dadhania	1d2d974244	spapr_pci: enumerate and add PCI device tree All the PCI enumeration and device node creation was off-loaded to SLOF. With PCI hotplug support, code needed to be added to add device node. This creates multiple copy of the code one in SLOF and other in hotplug code. To unify this, the patch adds the pci device node creation in Qemu. For backward compatibility, a flag "qemu,phb-enumerated" is added to the phb, suggesting to SLOF to not do device node creation. Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> [ Squashed Michael's drc_index changes ] Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-07-07 17:44:52 +02:00
Markus Armbruster	708414f03c	Revert "hw/ppc/spapr_pci.c: Avoid functions not in glib 2.12 (g_hash_table_iter_*)" Since we now require GLib 2.22+ (commit `f40685c`), we don't have to work around lack of g_hash_table_iter_init() & friends anymore. This reverts commit `f8833a37c0`. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-07-07 17:44:51 +02:00
Nikunj A Dadhania	9b7d9284c3	spapr_pci: set device node unit address as hex Device node names should encode the unit address as hex, while the code was encodind it as integers. Also, use FDT_NAME_MAX macro for allocating and composing the name. Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-07-07 17:44:51 +02:00
Nikunj A Dadhania	4a7c347415	spapr_pci: encode class code including Prog IF register Current code missed the Prog IF register. All Class Code, Subclass, and Prog IF registers are needed to identify the accurate device type. For example: USB controllers use the PROG IF for denoting: USB FullSpeed, HighSpeed or SuperSpeed. Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-07-07 17:44:50 +02:00
Nikunj A Dadhania	72187935b4	spapr_pci: encode missing 64-bit memory address space The properties reg/assigned-resources need to encode 64-bit memory address space as part of phys.hi dword. 00 if configuration space 01 if IO region, 10 if 32-bit MEM region 11 if 64-bit MEM region Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-07-07 17:44:50 +02:00
David Gibson	28e0204254	spapr: Merge sPAPREnvironment into sPAPRMachineState The code for -machine pseries maintains a global sPAPREnvironment structure which keeps track of general state information about the guest platform. This predates the existence of the MachineState structure, but performs basically the same function. Now that we have the generic MachineState, fold sPAPREnvironment into sPAPRMachineState, the pseries specific subclass of MachineState. This is mostly a matter of search and replace, although a few places which relied on the global spapr variable are changed to find the structure via qdev_get_machine(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-07-07 17:44:50 +02:00
Markus Armbruster	c6bd8c706a	qerror: Clean up QERR_ macros to expand into a single string These macros expand into error class enumeration constant, comma, string. Unclean. Has been that way since commit `13f59ae`. The error class is always ERROR_CLASS_GENERIC_ERROR since the previous commit. Clean up as follows: * Prepend every use of a QERR_ macro by ERROR_CLASS_GENERIC_ERROR, and delete it from the QERR_ macro. No change after preprocessing. * Rewrite error_set(ERROR_CLASS_GENERIC_ERROR, ...) into error_setg(...). Again, no change after preprocessing. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>	2015-06-22 18:20:40 +02:00
Tyrel Datwyler	c5bc152bc3	spapr_pci: emit hotplug add/remove events during hotplug This uses extension of existing EPOW interrupt/event mechanism to notify userspace tools like librtas/drmgr to handle in-guest configuration/cleanup operations in response to device_add/device_del. Userspace tools that don't implement this extension will need to be run manually in response/advance of device_add/device_del, respectively. Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-06-03 23:56:55 +02:00
Michael Roth	7454c7af91	spapr_pci: enable basic hotplug operations This enables hotplug of PCI devices to a PHB. Upon hotplug we generate the OF-nodes required by PAPR specification and IEEE 1275-1994 "PCI Bus Binding to Open Firmware" for the device. We associate the corresponding FDT for these nodes with the DRC corresponding to the slot, which will be fetched via ibm,configure-connector RTAS calls by the guest as described by PAPR specification. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-06-03 23:56:54 +02:00
Michael Roth	62083979b0	spapr_pci: create DRConnectors for each PCI slot during PHB realize These will be used to support hotplug/unplug of PCI devices to the PCI bus associated with a particular PHB. We also set up device-tree properties in each PHBs initial FDT to describe the DRCs associated with them. This advertises to guests that each PHB is DR-capable device with physical hotpluggable slots, each managed by the corresponding DRC. This is necessary for allowing hotplugging of devices to it later via bus rescan or guest rpaphp hotplug module. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-06-03 23:56:54 +02:00
Michael Roth	7619c7b00c	spapr_pci: add dynamic-reconfiguration option for spapr-pci-host-bridge This option enables/disables PCI hotplug for a particular PHB. Also add machine compatibility code to disable it by default for machine types prior to pseries-2.4. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [agraf: move commas for compat fields] Signed-off-by: Alexander Graf <agraf@suse.de>	2015-06-03 23:56:54 +02:00
Alexey Kardashevskiy	ccf9ff8527	spapr_pci: Rework device-tree rendering This replaces object_child_foreach() and callback with existing SPAPR_PCI_LIOBN() and spapr_tce_find_by_liobn() to make the code easier to read. This is a mechanical patch so no behaviour change is expected. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-06-03 23:56:51 +02:00
Alexey Kardashevskiy	46c5874e9c	spapr_pci: Make find_phb()/find_dev() public This makes find_phb()/find_dev() public and changed its names to spapr_pci_find_phb()/spapr_pci_find_dev() as they are going to be used from other parts of QEMU such as VFIO DDW (dynamic DMA window) or VFIO PCI error injection or VFIO EEH handling - in all these cases there are RTAS calls which are addressed to BUID+config_addr in IEEE1275 format. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-06-03 23:56:51 +02:00
Alexey Kardashevskiy	3e1a01cb55	spapr_pci: Define default DMA window size as a macro This gets rid of a magic constant describing the default DMA window size for an emulated PHB. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-06-03 23:56:50 +02:00
Alexey Kardashevskiy	c8545818b3	spapr_pci: Introduce a liobn number generating macros We are going to have multiple DMA windows per PHB and we want them to migrate so we need a predictable way of assigning LIOBNs. This introduces a macro which makes up a LIOBN from fixed prefix, PHB index (unique PHB id) and window number. This introduces a SPAPR_PCI_DMA_WINDOW_NUM() to know the window number from LIOBN. It is used to distinguish the default 32bit windows from dynamic windows and avoid picking default DMA window properties from a wrong TCE table. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-06-03 23:56:50 +02:00
David Gibson	421b1b27f6	spapr_pci: Fix unsafe signed/unsigned comparisons spapr_pci.c contains a number of expressions of the form (uval == -1) or (uval != -1), where 'uval' is an unsigned value. This mostly works in practice, because as long as the width of uval is greater or equal than that of (int), the -1 will be promoted to the unsigned type, which is the expected outcome. However, at least for the cases where uval is uint32_t, this would break on platforms where sizeof(int) > 4 (and a few such do exist), because then the uint32_t value would be promoted to the larger int type, and never be equal to -1. This patch fixes these errors. The fixes for the (uint32_t) cases are necessary as described above. I've made similar fixes to (uint64_t) and (hwaddr) cases. Those are strictly theoretical, since I don't know of any platforms where sizeof(int) > 8, but hey, it's not that hard so we might as well be strictly C standard compliant. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-06-03 23:56:50 +02:00
Gavin Shan	ee954280da	sPAPR: Implement EEH RTAS calls The emulation for EEH RTAS requests from guest isn't covered by QEMU yet and the patch implements them. The patch defines constants used by EEH RTAS calls and adds callbacks sPAPRPHBClass::{eeh_set_option, eeh_get_state, eeh_reset, eeh_configure}, which are going to be used as follows: * RTAS calls are received in spapr_pci.c, sanity check is done there. * RTAS handlers handle what they can. If there is something it cannot handle and the corresponding sPAPRPHBClass callback is defined, it is called. * Those callbacks are only implemented for VFIO now. They do ioctl() to the IOMMU container fd to complete the calls. Error codes from that ioctl() are transferred back to the guest. [aik: defined RTAS tokens for EEH RTAS calls] Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-03-09 15:00:08 +01:00
David Gibson	eefaccc02b	pseries: Switch VGA endian on H_SET_MODE When the guest switches the interrupt endian mode, which essentially means a global machine endian switch, we want to change the VGA framebuffer endian mode as well in order to be backward compatible with existing guests who don't know about the new endian control register. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-03-09 15:00:03 +01:00
Alexey Kardashevskiy	b194df478a	spapr-pci: Enable huge BARs At the moment sPAPR only supports 512MB window for MMIO BARs. However modern devices might want bigger 64bit BARs. This extends MMIO window from 512MB to 62GB (aligned to SPAPR_PCI_WINDOW_SPACING) and advertises it in 2 records in the PHB "ranges" property. 32bit gets the space from SPAPR_PCI_MEM_WIN_BUS_OFFSET till the end of 4GB, 64bit gets the rest of the space. If no space is left, 64bit range is not advertised. The MMIO space size is set to old value of 0x20000000 by default for pseries machines older than 2.3. The approach changes the device tree which is a guest visible change, however it won't break migration as: 1. we do not support migration to older QEMU versions 2. migration to newer QEMU will migrate the device tree as well and since the new layout only extends the old one and does not change address mappigns, no breakage is expected here too. SLOF change is required to utilize this extension. Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-03-09 14:59:54 +01:00
David Gibson	3e4ac96871	pseries: Limit PCI host bridge "index" value pseries guests can have large numbers of PCI host bridges. To avoid the user having to specify a number of different configuration values for every one, the device supports an "index" property which is a shorthand setting the various window and configuration addresses from a predefined sensible set. There are some problems with the details at present: * The "index" propery is signed, but negative values will create PCI windows below where we expect, potentially colliding with other devices * No limit is imposed on the "index" property and large values can translate to extremely large window addresses. With PCI passthrough in particular this can mean we exceed various mapping and physical address limits causing the guest host bridge to not work in strange ways. This patch addresses this, by making "index" unsigned, and imposing a limit. Currently the limit allows indices from 0..255 which is probably enough host bridges for the time being. It's fairly easy to extend if we discover we need more. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-03-09 14:59:54 +01:00
Peter Maydell	f8833a37c0	hw/ppc/spapr_pci.c: Avoid functions not in glib 2.12 (g_hash_table_iter_) The g_hash_table_iter_ functions for iterating through a hash table are not present in glib 2.12, which is our current minimum requirement. Rewrite the code to use g_hash_table_foreach() instead. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-11-04 23:26:13 +01:00
Greg Kurz	8c46f7ec85	spapr_pci: map the MSI window in each PHB On sPAPR, virtio devices are connected to the PCI bus and use MSI-X. Commit `cc943c36fa` has modified MSI-X so that writes are made using the bus master address space and follow the IOMMU path. Unfortunately, the IOMMU address space address space does not have an MSI window: the notification is silently dropped in unassigned_mem_write instead of reaching the guest... The most visible effect is that all virtio devices are non-functional on sPAPR since then. :( This patch does the following: 1) map the MSI window into the IOMMU address space for each PHB - since each PHB instantiates its own IOMMU address space, we can safely map the window at a fixed address (SPAPR_PCI_MSI_WINDOW) - no real need to keep the MSI window setup in a separate function, the spapr_pci_msi_init() code moves to spapr_phb_realize(). 2) kill the global MSI window as it is not needed in the end Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:53 +02:00
Alexey Kardashevskiy	3242052248	spapr_pci: Fix config space corruption When disabling MSI/MSIX via "ibm,change-msi" RTAS call, no check was made if MSI or MSIX is actually supported and the MSI message was reset unconditionally. If this happened on a device which does not support MSI (but does support MSIX, otherwise "ibm,change-msi" would not be called), this device would have PCIDevice::msi_cap field (MSI capability offset) set to zero and writing a vector would actually clear PCI status. This clears MSI message only if MSI or MSIX is present on a device. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:52 +02:00
Alexey Kardashevskiy	9a321e9234	spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB Currently SPAPR PHB keeps track of all allocated MSI (here and below MSI stands for both MSI and MSIX) interrupt because XICS used to be unable to reuse interrupts. This is a problem for dynamic MSI reconfiguration which happens when guest reloads a driver or performs PCI hotplug. Another problem is that the existing implementation can enable MSI on 32 devices maximum (SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that. This makes use of new XICS ability to reuse interrupts. This reorganizes MSI information storage in sPAPRPHBState. Instead of static array of 32 descriptors (one per a PCI function), this patch adds a GHashTable when @config_addr is a key and (first_irq, num) pair is a value. GHashTable can dynamically grow and shrink so the initial limit of 32 devices is gone. This changes migration stream as @msi_table was a static array while new @msi_devs is a dynamic hash table. This adds temporary array which is used for migration, it is populated in "spapr_pci"::pre_save() callback and expanded into the hash table in post_load() callback. Since the destination side does not know the number of MSI-enabled devices in advance and cannot pre-allocate the temporary array to receive migration state, this makes use of new VMSTATE_STRUCT_VARRAY_ALLOC macro which allocates the array automatically. This resets the MSI configuration space when interrupts are released by the ibm,change-msi RTAS call. This fixed traces to be more informative. This changes vmstate_spapr_pci_msi name from "...lsi" to "...msi" which was incorrect by accident. As the internal representation changed, thus bumps migration version number. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [agraf: drop g_malloc_n usage] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:27 +02:00
Alexey Kardashevskiy	bee763dbfb	spapr: Move interrupt allocator to xics The current allocator returns IRQ numbers from a pool and does not support IRQs reuse in any form as it did not keep track of what it previously returned, it only keeps the last returned IRQ. Some use cases such as PCI hot(un)plug may require IRQ release and reallocation. This moves an allocator from SPAPR to XICS. This switches IRQ users to use new API. This uses LSI/MSI flags to know if interrupt is allocated. The interrupt release function will be posted as a separate patch. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:26 +02:00
Alexey Kardashevskiy	9bb62a0702	spapr_iommu: Make in-kernel TCE table optional POWER KVM supports an KVM_CAP_SPAPR_TCE capability which allows allocating TCE tables in the host kernel memory and handle H_PUT_TCE requests targeted to specific LIOBN (logical bus number) right in the host without switching to QEMU. At the moment this is used for emulated devices only and the handler only puts TCE to the table. If the in-kernel H_PUT_TCE handler finds a LIOBN and corresponding table, it will put a TCE to the table and complete hypercall execution. The user space will not be notified. Upcoming VFIO support is going to use the same sPAPRTCETable device class so KVM_CAP_SPAPR_TCE is going to be used as well. That means that TCE tables for VFIO are going to be allocated in the host as well. However VFIO operates with real IOMMU tables and simple copying of a TCE to the real hardware TCE table will not work as guest physical to host physical address translation is requited. So until the host kernel gets VFIO support for H_PUT_TCE, we better not to register VFIO's TCE in the host. This adds a place holder for KVM_CAP_SPAPR_TCE_VFIO capability. It is not in upstream yet and being discussed so now it is always false which means that in-kernel VFIO acceleration is not supported. This adds a bool @vfio_accel flag to the sPAPRTCETable device telling that sPAPRTCETable should not try allocating TCE table in the host kernel for VFIO. The flag is false now as at the moment there is no VFIO. This adds an vfio_accel parameter to spapr_tce_new_table(), the semantic is the same. Since there is only emulated PCI and VIO now, the flag is set to false. Upcoming VFIO support will set it to true. This is a preparation patch so no change in behaviour is expected Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy	3a3b8502e6	spapr: Fix RTAS token numbers At the moment spapr_rtas_register() allocates a new token number for every new RTAS callback so numbers are not fixed and depend on the number of supported RTAS handlers and the exact order of spapr_rtas_register() calls. These tokens are copied into the device tree and remain the same during the guest lifetime. When we start another guest to receive a migration, it calls spapr_rtas_register() as well. If the number of RTAS handlers or their order is different in QEMU on source and destination sides, the "/rtas" node in the device tree will differ. Since migration overwrites the device tree (as it overwrites the entire RAM), the actual RTAS config on the destination side gets broken. This defines global contant values for every RTAS token which QEMU is using today. This changes spapr_rtas_register() to accept a token number instead of allocating one. This changes all users of spapr_rtas_register(). This changes XICS-KVM not to cache tokens registered with KVM as they constant now. This makes TOKEN_BASE global as RTAS_XXX use TOKEN_BASE as a base. TOKEN_MAX is moved and renamed too and its value is changed to the last token + 1. Boundary checks for token values are adjusted. This reserves token numbers for "os-term" handlers and PCI hotplug which we are working on. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:22 +02:00
Badari Pulavarty	9dbae97723	spapr_pci: Advertise MSI quota Hotplug of multiple disks fails due to MSI vector quota check. Number of MSI vectors default to 8 allowing only 4 devices. This happens on RHEL6.5 guest. RHEL7 and SLES11 guests fallback to INTX. One way to workaround the issue is to increase total MSIs, so that MSI quota check allows us to hotplug multiple disks. This sets the quota to the maximum number of interupts XICS has which is 1024 now (XICS_IRQS). This moves XICS_IRQS from spapr.c to xics.h for wider visibility. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> [aik: put XICS_IRQS=1024 instead of 64i, fixed endianness and size] Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:46 +02:00
Alexey Kardashevskiy	1b8eceee28	spapr_iommu: Introduce bus_offset in sPAPRTCETable This adds @bus_offset into sPAPRTCETable to tell where TCE table starts from. It is set to 0 for emulated devices. Dynamic DMA windows will use other offset. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy	650f33adbd	spapr_iommu: Introduce page_shift in sPAPRTCETable At the moment only 4K pages are supported by sPAPRTCETable. Since sPAPR spec allows other page sizes and we are going to implement them, we need page size to be configrable. This adds @page_shift into sPAPRTCETable and replaces SPAPR_TCE_PAGE_SHIFT with it where it is possible. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy	523e7b8ab8	spapr_iommu: Get rid of window_size in sPAPRTCETable This removes window_size as it is basically a copy of nb_table shifted by SPAPR_TCE_PAGE_SHIFT. As new dynamic DMA windows are going to support windows as big as the entire RAM and this number will be bigger that 32 capacity, we will have to do something about @window_size anyway and removal seems to be the right way to go. This removes dma_window_start/dma_window_size from sPAPRPHBState as they are no longer used. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy	e28c16f61f	spapr_pci: Allow multiple TCE tables per PHB At the moment sPAPRPHBState contains a @tcet pointer to the only TCE table. However sPAPR spec allows having more than one DMA window. Since the TCE object is already a child of SPAPR PHB object, there is no need to keep an additional pointer to it in sPAPRPHBState so remove it. This changes the way sPAPRPHBState::reset performs reset of sPAPRTCETable objects. This changes the default DMA window properties calculation. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy	cca7fad576	spapr_pci: spapr_iommu: Make DMA window a subregion Currently the default DMA window is represented by a single MemoryRegion. However there can be more than just one window so we need a "root" memory region to be separated from the actual DMA window(s). This introduces a "root" IOMMU memory region and adds a subregion for the default DMA 32bit window. Following patches will add other subregion(s). This initializes a default DMA window subregion size to the guest RAM size as this window can be switched into "bypass" mode which implements direct DMA mapping. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy	da6ccee418	spapr_pci: Introduce a finish_realize() callback The spapr-pci PHB initializes IOMMU for emulated devices only. The upcoming VFIO support will do it different. However both emulated and VFIO PHB types share most of the initialization code. For the type specific things a new finish_realize() callback is introduced. This introduces sPAPRPHBClass derived from PCIHostBridgeClass and adds the callback pointer. This implements finish_realize() for emulated devices. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [agraf: Fix compilation] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy	28668b5f31	spapr_pci: fix MSI limit At the moment XICS does not support interrupts reuse so sPAPR PHB implements this. sPAPRPHBState holds array of 32 spapr_pci_msi to describe PCI config address, first MSI and number of MSIs. Once allocated for a device, QEMU tries reusing this config until the number of MSIs changes. Existing SPAPR guests call ibm,change-msi in a loop until the handler returns the requested number of vectors. Recently introduced check for the maximum number of MSI/MSIX vectors supported by a device only works for a device which is new for PHB's MSI cache. If it is already there, the check is not performed which leads to new IRQ block allocation. This happens during PCI hotplug even when the user hot plug the same device which he just hot unplugged. This moves the check earlier. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:32 +02:00
Alexey Kardashevskiy	b26696b519	spapr_pci: Fix number of returned vectors in ibm, change-msi Current guest kernels try allocating as many vectors as the quota is. For example, in the case of virtio-net (which has just 3 vectors) the guest requests 4 vectors (that is the quota in the test) and the existing ibm,change-msi handler returns 4. But before it returns, it calls msix_set_message() in a loop and corrupts memory behind the end of msix_table. This limits the number of vectors returned by ibm,change-msi to the maximum supported by the actual device. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: qemu-stable@nongnu.org [agraf: squash in bugfix from aik] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:27 +02:00
Greg Kurz	fabe9ee113	spapr-pci: remove io ports workaround In the past, IO space could not be mapped into the memory address space so we introduced a workaround for that. Nowadays it does not look necessary so we can remove the workaround and make sPAPR PCI configuration simplier. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:27 +02:00
Juan Quintela	3aff6c2fea	savevm: Remove all the unneeded version_minimum_id_old (ppc) After previous Peter patch, they are redundant. This way we don't assign them except when needed. Once there, there were lots of case where the ".fields" indentation was wrong: .fields = (VMStateField []) { and .fields = (VMStateField []) { Change all the combinations to: .fields = (VMStateField[]){ The biggest problem (appart from aesthetics) was that checkpatch complained when we copy&pasted the code from one place to another. Signed-off-by: Juan Quintela <quintela@redhat.com> Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2014-06-16 04:55:26 +02:00
Peter Maydell	be86c53c05	PowerPC queue for 2.0-rc0 * QEMUMachine include cleanup * SLOF update * XICS reset fix * sPAPR PCI host bridge refactorings -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJTIR0LAAoJEPou0S0+fgE/4rMP/ReyYRPSXFFu2PXt2werCYcA P5Agynib/O5LXTBgZCOSkTrRYBE2+5bq3Zp3iQe7gkLrpbRjHJLulwBpFW3rlVQO 5IWxMoFC1eW+PE9k+V0hRdTFXMfVxvvHam3g77XVGb4lm/thwHkQ2RYa7z1V4zCP hbj/gVdZA7pzXQZfLnOU6illaVCBjYeajEle5sagLlN+DANFSm3G/BdYWPTtxKxq MLiMEdaZJkBG/NG3zocyuCsWeF+oLyh8fB4IQUjORq7x8v749DgprQjFUy3YMrB5 Ph5CQXyVyyiZSXmfbdINMpQPxdPaVm2rtBD9ol7p+mXtc6UOS3uzbFQ75XXiG401 crqDpc74iXYpxmwtSs9FwD8TFqh5n8NFGEN8U3MOkhcNQ3zdjnlbVRMG1aqFIUne SbQeWC5Qg3MI7UruStq9n3l/3SL3KJSRRrT11ndO/4UoHT6PZfBKQ4eCD8XiHmPu /p0ahfnl8/UvMyTWVJhBvLt5G6+v8aumKJR/47jhrPnqLFCk4yUDbCdT5a5KSDjt ZauyfGlIQcORXKRbVw+DRgSQjeGLoQJMtOVqDXnKwS8j8cuhF2JebBCVTPHHjC72 UTRYk5/tLjcZCLD7Vscpd4uRiuz69xJu/MCvwqUwSZ3TssuoW3Vl0/giLyajMxTU RLn0uLaFguDX+DoSNIAy =icAj -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/afaerber/tags/ppc-for-2.0' into staging PowerPC queue for 2.0-rc0 * QEMUMachine include cleanup * SLOF update * XICS reset fix * sPAPR PCI host bridge refactorings # gpg: Signature made Thu 13 Mar 2014 02:50:51 GMT using RSA key ID 3E7E013F # gpg: Good signature from "Andreas Färber <afaerber@suse.de>" # gpg: aka "Andreas Färber <afaerber@suse.com>" * remotes/afaerber/tags/ppc-for-2.0: spapr-pci: Convert fprintf() to error_report() spapr-pci: Convert to QOM realize xics-kvm: Fix reset function pseries: Update SLOF firmware image to qemu-slof-20140304 Move QEMUMachine typedef to qemu/typedefs.h Revert "KVM: Split QEMUMachine typedef into separate header" Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2014-03-13 13:19:46 +00:00

1 2

76 Commits