openkylin/qemu - qemu - 红山开源项目托管

Commit Graph

Author	SHA1	Message	Date
Alistair Francis	f936eac808	linux-user: Add host dependency for RISC-V 32-bit Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <76f8f9383a766dbcade883e897dec8cfef669799.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Alistair Francis	6d06fdd10e	elf.h: Add the RISCV ELF magic numbers Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <02fc0b3a733f5f08eb396bee5afd3d327941f0c9.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Peter Maydell	9b2e891ec5	RDMA queue * Add support for RDMA MAD * Various fixes for the pvrdma backend -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJcHgWkAAoJEDbUwPDPL+RtuQcIAJk6BYbi/d5EG9rE3WmM3kOp 1oh49tA3ahHPApSwc7J69M+j1MMELCUUzU/HUsd1DTn+uR219s5KO7O11f5pgRko KX+4kdWdRTumu2s51bR3yz3Alq1KjhtX8lSGchSCB/aV6o16Tt03HJcZegyeWtw1 BgKkuuFz7lKzXw6tW3Q/F1GzYNRjHAizx5q6c2PI2lxpQ39jiFD0WQa5TnPGaW5E dQ0og+aIUKNDQxcX48PeW0Rv1aRRS8GNmkO8L7dYh1x4gZkFLd9SNQu55T+WtULW drwb887ROKv5OTnJ8l9c4BQ/eAYBegFcbGcXJl+Zr2ueGk+Rup5IVWOq/hhjOuw= =Wcc9 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/marcel/tags/rdma-pull-request' into staging RDMA queue * Add support for RDMA MAD * Various fixes for the pvrdma backend # gpg: Signature made Sat 22 Dec 2018 09:36:36 GMT # gpg: using RSA key 36D4C0F0CF2FE46D # gpg: Good signature from "Marcel Apfelbaum <marcel.apfelbaum@zoho.com>" # gpg: aka "Marcel Apfelbaum <marcel@redhat.com>" # gpg: aka "Marcel Apfelbaum <marcel.apfelbaum@gmail.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: B1C6 3A57 F92E 08F2 640F 31F5 36D4 C0F0 CF2F E46D * remotes/marcel/tags/rdma-pull-request: (31 commits) pvrdma: check return value from pvrdma_idx_ring_has_ routines rdma: remove unused VENDOR_ERR_NO_SGE macro pvrdma: release ring object in case of an error pvrdma: check number of pages when creating rings pvrdma: add uar_read routine rdma: check num_sge does not exceed MAX_SGE pvrdma: release device resources in case of an error docs: Update pvrdma device documentation hw/rdma: Do not call rdma_backend_del_gid on an empty gid hw/rdma: Do not use bitmap_zero_extend to free bitmap hw/pvrdma: Clean device's resource when system is shutdown vl: Introduce shutdown_notifiers hw/rdma: Remove unneeded code that handles more that one port hw/pvrdma: Fill error code in command's response hw/pvrdma: Fill all CQE fields hw/pvrdma: Make device state depend on Ethernet function state hw/rdma: Initialize node_guid from vmxnet3 mac address hw/pvrdma: Make sure PCI function 0 is vmxnet3 vmxnet3: Move some definitions to header file hw/pvrdma: Add support to allow guest to configure GID table ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-12-22 11:25:31 +00:00
Prasad J Pandit	f1e2e38ee0	pvrdma: check return value from pvrdma_idx_ring_has_ routines pvrdma_idx_ring_has_[data/space] routines also return invalid index PVRDMA_INVALID_IDX[=-1], if ring has no data/space. Check return value from these routines to avoid plausible infinite loops. Reported-by: Li Qiang <liq3ea@163.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:57 +02:00
Prasad J Pandit	7be3a21325	rdma: remove unused VENDOR_ERR_NO_SGE macro With commit 4481985c (rdma: check num_sge does not exceed MAX_SGE) macro VENDOR_ERR_NO_SGE is no longer in use - delete it. Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:57 +02:00
Prasad J Pandit	509f57c98e	pvrdma: release ring object in case of an error create_cq and create_qp routines allocate ring object, but it's not released in case of an error, leading to memory leakage. Reported-by: Li Qiang <liq3ea@163.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:57 +02:00
Prasad J Pandit	2c858ce5da	pvrdma: check number of pages when creating rings When creating CQ/QP rings, an object can have up to PVRDMA_MAX_FAST_REG_PAGES 8 pages. Check 'npages' parameter to avoid excessive memory allocation or a null dereference. Reported-by: Li Qiang <liq3ea@163.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:57 +02:00
Prasad J Pandit	2aa86456fb	pvrdma: add uar_read routine Define skeleton 'uar_read' routine. Avoid NULL dereference. Reported-by: Li Qiang <liq3ea@163.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:57 +02:00
Prasad J Pandit	0e68373cc2	rdma: check num_sge does not exceed MAX_SGE rdma back-end has scatter/gather array ibv_sge[MAX_SGE=4] set to have 4 elements. A guest could send a 'PvrdmaSqWqe' ring element with 'num_sge' set to > MAX_SGE, which may lead to OOB access issue. Add check to avoid it. Reported-by: Saar Amar <saaramar5@gmail.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:57 +02:00
Prasad J Pandit	cce648613b	pvrdma: release device resources in case of an error If during pvrdma device initialisation an error occurs, pvrdma_realize() does not release memory resources, leading to memory leakage. Reported-by: Li Qiang <liq3ea@163.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Message-Id: <20181212175817.815-1-ppandit@redhat.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:57 +02:00
Yuval Shaia	46b69a8824	docs: Update pvrdma device documentation Interface with the device is changed with the addition of support for MAD packets. Adjust documentation accordingly. While there fix a minor mistake which may lead to think that there is a relation between using RXE on host and the compatibility with bare-metal peers. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum<marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:57 +02:00
Yuval Shaia	305fd2ba06	hw/rdma: Do not call rdma_backend_del_gid on an empty gid When device goes down the function fini_ports loops over all entries in gid table regardless of the fact whether entry is valid or not. In case that entry is not valid we'd like to skip from any further processing in backend device. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:57 +02:00
Yuval Shaia	9a3053d2e8	hw/rdma: Do not use bitmap_zero_extend to free bitmap bitmap_zero_extend is designed to work for extending, not for shrinking. Using g_free instead. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:57 +02:00
Yuval Shaia	ffa65d97fc	hw/pvrdma: Clean device's resource when system is shutdown In order to clean some external resources such as GIDs, QPs etc, register to receive notification when VM is shutdown. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:57 +02:00
Yuval Shaia	2dadd75385	vl: Introduce shutdown_notifiers Notifier will be used for signaling shutdown event to inform system is shutdown. This will allow devices and other component to run some cleanup code needed before VM is shutdown. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	14c74f7207	hw/rdma: Remove unneeded code that handles more that one port Device supports only one port, let's remove a dead code that handles more than one port. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	091782171f	hw/pvrdma: Fill error code in command's response Driver checks error code let's set it. In addition, for code simplification purposes, set response's fields ack, response and err outside of the scope of command handlers. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	eaac01005d	hw/pvrdma: Fill all CQE fields Add ability to pass specific WC attributes to CQE such as GRH_BIT flag. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	e976ebc87c	hw/pvrdma: Make device state depend on Ethernet function state User should be able to control the device by changing Ethernet function state so if user runs 'ifconfig ens3 down' the PVRDMA function should be down as well. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	028c3f93d6	hw/rdma: Initialize node_guid from vmxnet3 mac address node_guid should be set once device is load. Make node_guid be GID format (32 bit) of PCI function 0 vmxnet3 device's MAC. A new function was added to do the conversion. So for example the MAC 56:b6:44:e9:62:dc will be converted to GID 54b6:44ff:fee9:62dc. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	d961ead16e	hw/pvrdma: Make sure PCI function 0 is vmxnet3 Guest driver enforces it, we should also. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	317639aafd	vmxnet3: Move some definitions to header file pvrdma setup requires vmxnet3 device on PCI function 0 and PVRDMA device on PCI function 1. pvrdma device needs to access vmxnet3 device object for several reasons: 1. Make sure PCI function 0 is vmxnet3. 2. To monitor vmxnet3 device state. 3. To configure node_guid accoring to vmxnet3 device's MAC address. To be able to access vmxnet3 device the definition of VMXNET3State is moved to a new header file. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Dmitry Fleytman <dmitry.fleytman@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	2b05705dc8	hw/pvrdma: Add support to allow guest to configure GID table The control over the RDMA device's GID table is done by updating the device's Ethernet function addresses. Usually the first GID entry is determined by the MAC address, the second by the first IPv6 address and the third by the IPv4 address. Other entries can be added by adding more IP addresses. The opposite is the same, i.e. whenever an address is removed, the corresponding GID entry is removed. The process is done by the network and RDMA stacks. Whenever an address is added the ib_core driver is notified and calls the device driver add_gid function which in turn update the device. To support this in pvrdma device we need to hook into the create_bind and destroy_bind HW commands triggered by pvrdma driver in guest. Whenever a change is made to the pvrdma port's GID table a special QMP message is sent to be processed by libvirt to update the address of the backend Ethernet device. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum<marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	4a5c9903f3	qapi: Define new QMP message for pvrdma pvrdma requires that the same GID attached to it will be attached to the backend device in the host. A new QMP messages is defined so pvrdma device can broadcast any change made to its GID table. This event is captured by libvirt which in turn will update the GID table in the backend device. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	1625bb13da	hw/pvrdma: Set the correct opcode for send completion opcode for WC should be set by the device and not taken from work element. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	2bff59e699	hw/pvrdma: Set the correct opcode for recv completion The function pvrdma_post_cqe populates CQE entry with opcode from the given completion element. For receive operation value was not set. Fix it by setting it to IBV_WC_RECV. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	536692ca1d	hw/pvrdma: Make default pkey 0xFFFF Commit `6e7dba23af` ("hw/pvrdma: Make default pkey 0xFFFF") exports default pkey as external definition but omit the change from 0x7FFF to 0xFFFF. Fixes: `6e7dba23af` ("hw/pvrdma: Make default pkey 0xFFFF") Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	f00c48caab	hw/pvrdma: Make function reset_device return void This function cannot fail - fix it to return void Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	605ec1663b	hw/rdma: Add support for MAD packets MAD (Management Datagram) packets are widely used by various modules both in kernel and in user space for example the rdma_* API which is used to create and maintain "connection" layer on top of RDMA uses several types of MAD packets. For more information please refer to chapter 13.4 in Volume 1 Architecture Specification, Release 1.1 available here: https://www.infinibandta.org/ibta-specifications-download/ To support MAD packets the device uses an external utility (contrib/rdmacm-mux) to relay packets from and to the guest driver. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum<marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	305bdd7a57	hw/rdma: Abort send-op if fail to create addr handler Function create_ah might return NULL, let's exit with an error. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	46462cb161	hw/rdma: Return qpn 1 if ibqp is NULL Device is not supporting QP0, only QP1. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	4082e533f6	hw/rdma: Add ability to force notification without re-arm Upon completion of incoming packet the device pushes CQE to driver's RX ring and notify the driver (msix). While for data-path incoming packets the driver needs the ability to control whether it wished to receive interrupts or not, for control-path packets such as incoming MAD the driver needs to be notified anyway, it even do not need to re-arm the notification bit. Enhance the notification field to support this. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	a5d2f6f877	contrib/rdmacm-mux: Add implementation of RDMA User MAD multiplexer RDMA MAD kernel module (ibcm) disallow more than one MAD-agent for a given MAD class. This does not go hand-by-hand with qemu pvrdma device's requirements where each VM is MAD agent. Fix it by adding implementation of RDMA MAD multiplexer service which on one hand register as a sole MAD agent with the kernel module and on the other hand gives service to more than one VM. Design Overview: Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> ---------------- A server process is registered to UMAD framework (for this to work the rdma_cm kernel module needs to be unloaded) and creates a unix socket to listen to incoming request from clients. A client process (such as QEMU) connects to this unix socket and registers with its own GID. TX: ---- When client needs to send rdma_cm MAD message it construct it the same way as without this multiplexer, i.e. creates a umad packet but this time it writes its content to the socket instead of calling umad_send(). The server, upon receiving such a message fetch local_comm_id from it so a context for this session can be maintain and relay the message to UMAD layer by calling umad_send(). RX: ---- The server creates a worker thread to process incoming rdma_cm MAD messages. When an incoming message arrived (umad_recv()) the server, depending on the message type (attr_id) looks for target client by either searching in gid->fd table or in local_comm_id->fd table. With the extracted fd the server relays to incoming message to the client. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Yuval Shaia	dee2e53c86	hw/pvrdma: Check the correct return value Return value of 0 means ok, we want to free the memory only in case of error. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Message-Id: <20181025061700.17050-1-yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum<marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2018-12-22 11:09:56 +02:00
Palmer Dabbelt	7b91ae7d79	MAINTAINERS: Mark RISC-V as Supported There's at least two of us that are paid to work on this. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>	2018-12-21 07:57:15 -08:00
Peter Maydell	891ff9f4a3	ppc patch queue 2018-12-21 This pull request supersedes the one from 2018-12-13. This is a revised first ppc pull request for qemu-4.0. Highlights are: * Most of the code for the POWER9 "XIVE" interrupt controller (not complete yet, but we're getting there) * A number of g_new vs. g_malloc cleanups * Some IRQ wiring cleanups * A fix for how we advertise NUMA nodes to the guest for pseries -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAlwce1QACgkQbDjKyiDZ s5JqlhAAhES3+UoNHa/eTf/o2OOgIgZ3A2FVP1kGQbb3Q415hxx1blpYkBDk6sUg POEgwbj8QMvJ8npOICb2NnHNsAhKRfgGxx/lqVgxLPTDdwGBq7Jr1lfhyX4D99WD C2oLtJWvQrA7yIsDzurMjJpFvw8SYSogppom4lqE5667pm6U0j7JggFJkwIo+VAj jzl706vvB6/EL3PHZ8eCzsxT2oRpxxMStE3lJ1JPpKc60mFb5gkXMk3hura1L8Ez t4NEN9I4ePivXlh6YYDp7Pv5l9JSzKV7Uu8xrYeMdz33e0jUWyoERrdhM51mI4s1 WGoQm6eL6p0jngAUPYtAdIGC6ZGaCMT5rkoDZ4K+us94kVvdqzWjQyRNp84GpQq0 Z/sxJaTSK2DZMnQL3LE19upk7XkB5uBgnjs5T5FcFia7bIDG3p8MY4VwIM4dRum9 WuirEUJRKg28eTTnuK9NQX2+MEnrRWc/FSNaBLjxrijD4C4jHogXTpssNprYnkV7 HgkQ2MaidcnNLftfOUeBr0aTx+rGqtUB56Xas1UK+WqykKVfRdZ6hnbRg30FJvQ/ X4SIc/QZcLcA78C/SvCuXa1uqWqlrZMhN2e5r+eiEXaFYyriWgMYe9w+mXKbrTsb ZPkbax6xz1esqOZ15ytCTneQONyhMXy5iFDfb0khO4DO0uXq4dA= =IN1s -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.0-20181221' into staging ppc patch queue 2018-12-21 This pull request supersedes the one from 2018-12-13. This is a revised first ppc pull request for qemu-4.0. Highlights are: * Most of the code for the POWER9 "XIVE" interrupt controller (not complete yet, but we're getting there) * A number of g_new vs. g_malloc cleanups * Some IRQ wiring cleanups * A fix for how we advertise NUMA nodes to the guest for pseries # gpg: Signature made Fri 21 Dec 2018 05:34:12 GMT # gpg: using RSA key 6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-4.0-20181221: (40 commits) MAINTAINERS: PPC: add a XIVE section spapr: change default CPU type to POWER9 spapr: introduce an 'ic-mode' machine option spapr: add an extra OV5 field to the sPAPR IRQ backend spapr: add a 'reset' method to the sPAPR IRQ backend spapr: extend the sPAPR IRQ backend for XICS migration spapr: allocate the interrupt thread context under the CPU core spapr: add device tree support for the XIVE exploitation mode spapr: add hcalls support for the XIVE exploitation interrupt mode spapr: introduce a new machine IRQ backend for XIVE spapr-iommu: Always advertise the maximum possible DMA window size spapr/xive: use the VCPU id as a NVT identifier spapr/xive: introduce a XIVE interrupt controller ppc/xive: notify the CPU when the interrupt priority is more privileged ppc/xive: introduce a simplified XIVE presenter ppc/xive: introduce the XIVE interrupt thread context ppc/xive: add support for the END Event State Buffers Changes requirement for "vsubsbs" instruction spapr: export and rename the xics_max_server_number() routine spapr: introduce a spapr_irq_init() routine ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-12-21 15:49:59 +00:00
Peter Maydell	15763776bf	pci, pc, virtio: fixes, features VTD fixes IR and split irqchip are now the default for Q35 ACPI refactoring hotplug refactoring new names for virtio devices multiple pcie link width/speeds PCI fixes Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJcG967AAoJECgfDbjSjVRpUlMH/1wy8WW7Br/4JxlWUPsfZTqZ 0Lg2n59wuFzRVS+gLotp6bUaJGR9xn9fKjI1wfD59oVrDTKyauuw82v0OityEs33 ZquFecuJvP6k5N40DkqA9YJjKSw7nUmLrsyrD0t2H43npikP2aD9f6yootrr3oVV nPwBvyT9ykIBQc7IzsHDiLw3EPdIf+9IR7+l+PLptzV0zK43Jfwi/nzHIQq00UMz eLM/ejQLIx4BZJnYGS0Cy6v3K7cS3k45LHDY0cGc0id2jHFN2vdQyHCF9I1ps72q pSlhMaLEwvQSYyre6iFTG5uuvyIPWv3LOkaBEwMSA5B/HXuEb2RKHThYzS9dc68= =OwW7 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging pci, pc, virtio: fixes, features VTD fixes IR and split irqchip are now the default for Q35 ACPI refactoring hotplug refactoring new names for virtio devices multiple pcie link width/speeds PCI fixes Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Thu 20 Dec 2018 18:26:03 GMT # gpg: using RSA key 281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (44 commits) x86-iommu: turn on IR by default if proper x86-iommu: switch intr_supported to OnOffAuto type q35: set split kernel irqchip as default pci: Adjust PCI config limit based on bus topology spapr_pci: perform unplug via the hotplug handler pci/shpc: perform unplug via the hotplug handler pci: Reuse pci-bridge hotplug handler handlers for pcie-pci-bridge pci/pcie: perform unplug via the hotplug handler pci/pcihp: perform unplug via the hotplug handler pci/pcihp: overwrite hotplug handler recursively from the start pci/pcihp: perform check for bus capability in pre_plug handler s390x/pci: rename hotplug handler callbacks pci/shpc: rename hotplug handler callbacks pci/pcie: rename hotplug handler callbacks hw/i386: Remove deprecated machines pc-0.10 and pc-0.11 hw: acpi: Remove AcpiRsdpDescriptor and fix tests hw: acpi: Export and share the ARM RSDP build hw: arm: Support both legacy and current RSDP build hw: arm: Convert the RSDP build to the buid_append_foo() API hw: arm: Carry RSDP specific data through AcpiRsdpData ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-12-21 14:06:01 +00:00
Cédric Le Goater	b62c6e1237	MAINTAINERS: PPC: add a XIVE section Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2018-12-21 09:40:43 +11:00
Cédric Le Goater	34a6b015a9	spapr: change default CPU type to POWER9 Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2018-12-21 09:40:43 +11:00
Cédric Le Goater	3ba3d0bc33	spapr: introduce an 'ic-mode' machine option This option is used to select the interrupt controller mode (XICS or XIVE) with which the machine will operate. XICS being the default mode for now. When running a machine with the XIVE interrupt mode backend, the guest OS is required to have support for the XIVE exploitation mode. In the case of legacy OS, the mode selected by CAS should be XICS and the OS should fail to boot. However, QEMU could possibly detect it, terminate the boot process and reset to stop in the SLOF firmware. This is not yet handled. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2018-12-21 09:40:43 +11:00
Cédric Le Goater	db592b5b16	spapr: add an extra OV5 field to the sPAPR IRQ backend The interrupt modes supported by the hypervisor are advertised to the guest with new bits definitions of the option vector 5 of property "ibm,arch-vec-5-platform-support. The byte 23 bits 0-1 of the OV5 are defined as follow : 0b00 PAPR 2.7 and earlier (Legacy systems) 0b01 XIVE Exploitation mode only 0b10 Either available If the client/guest selects the XIVE interrupt mode, it informs the hypervisor by returning the value 0b01 in byte 23 bits 0-1. A 0b00 value indicates the use of the XICS interrupt mode (Legacy systems). The sPAPR IRQ backend is extended with these definitions and the values are directly used to populate the "ibm,arch-vec-5-platform-support" property. The interrupt mode is advertised under TCG and under KVM. Although a KVM XIVE device is not yet available, the machine can still operate with kernel_irqchip=off. However, we apply a restriction on the CPU which is required to be a POWER9 when a XIVE interrupt controller is in use. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2018-12-21 09:40:43 +11:00
Cédric Le Goater	b2e2247716	spapr: add a 'reset' method to the sPAPR IRQ backend For the time being, the XIVE reset handler updates the OS CAM line of the vCPU as it is done under a real hypervisor when a vCPU is scheduled to run on a HW thread. This will let the XIVE presenter engine find a match among the NVTs dispatched on the HW threads. This handler will become even more useful when we introduce the machine supporting both interrupt modes, XIVE and XICS. In this machine, the interrupt mode is chosen by the CAS negotiation process and activated after a reset. Signed-off-by: Cédric Le Goater <clg@kaod.org> [dwg: Fix style nits] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2018-12-21 09:40:35 +11:00
Cédric Le Goater	1c53b06c03	spapr: extend the sPAPR IRQ backend for XICS migration Introduce a new sPAPR IRQ handler to handle resend after migration when the machine is using a KVM XICS interrupt controller model. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2018-12-21 09:39:13 +11:00
Cédric Le Goater	1a937ad7e7	spapr: allocate the interrupt thread context under the CPU core Each interrupt mode has its own specific interrupt presenter object, that we store under the CPU object, one for XICS and one for XIVE. Extend the sPAPR IRQ backend with a new handler to support them both. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2018-12-21 09:39:13 +11:00
Cédric Le Goater	6e21de4a50	spapr: add device tree support for the XIVE exploitation mode The XIVE interface for the guest is described in the device tree under the "interrupt-controller" node. A couple of new properties are specific to XIVE : - "reg" contains the base address and size of the thread interrupt managnement areas (TIMA), for the User level and for the Guest OS level. Only the Guest OS level is taken into account today. - "ibm,xive-eq-sizes" the size of the event queues. One cell per size supported, contains log2 of size, in ascending order. - "ibm,xive-lisn-ranges" the IRQ interrupt number ranges assigned to the guest for the IPIs. and also under the root node : - "ibm,plat-res-int-priorities" contains a list of priorities that the hypervisor has reserved for its own use. OPAL uses the priority 7 queue to automatically escalate interrupts for all other queues (DD2.X POWER9). So only priorities [0..6] are allowed for the guest. Extend the sPAPR IRQ backend with a new handler to populate the DT with the appropriate "interrupt-controller" node. Signed-off-by: Cédric Le Goater <clg@kaod.org> [dwg: Fix style nits] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2018-12-21 09:39:07 +11:00
Cédric Le Goater	23bcd5eb9a	spapr: add hcalls support for the XIVE exploitation interrupt mode The different XIVE virtualization structures (sources and event queues) are configured with a set of Hypervisor calls : - H_INT_GET_SOURCE_INFO used to obtain the address of the MMIO page of the Event State Buffer (ESB) entry associated with the source. - H_INT_SET_SOURCE_CONFIG assigns a source to a "target". - H_INT_GET_SOURCE_CONFIG determines which "target" and "priority" is assigned to a source - H_INT_GET_QUEUE_INFO returns the address of the notification management page associated with the specified "target" and "priority". - H_INT_SET_QUEUE_CONFIG sets or resets the event queue for a given "target" and "priority". It is also used to set the notification configuration associated with the queue, only unconditional notification is supported for the moment. Reset is performed with a queue size of 0 and queueing is disabled in that case. - H_INT_GET_QUEUE_CONFIG returns the queue settings for a given "target" and "priority". - H_INT_RESET resets all of the guest's internal interrupt structures to their initial state, losing all configuration set via the hcalls H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG. - H_INT_SYNC issue a synchronisation on a source to make sure all notifications have reached their queue. Calls that still need to be addressed : H_INT_SET_OS_REPORTING_LINE H_INT_GET_OS_REPORTING_LINE See the code for more documentation on each hcall. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [dwg: Folded in fix for field accessors] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2018-12-21 09:37:38 +11:00
Cédric Le Goater	dcc345b61e	spapr: introduce a new machine IRQ backend for XIVE The XIVE IRQ backend uses the same layout as the new XICS backend but covers the full range of the IRQ number space. The IRQ numbers for the CPU IPIs are allocated at the bottom of this space, below 4K, to preserve compatibility with XICS which does not use that range. This should be enough given that the maximum number of CPUs is 1024 for the sPAPR machine under QEMU. For the record, the biggest POWER8 or POWER9 system has a maximum of 1536 HW threads (16 sockets, 192 cores, SMT8). Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2018-12-21 09:37:38 +11:00
Alexey Kardashevskiy	8994e91e96	spapr-iommu: Always advertise the maximum possible DMA window size When deciding about the huge DMA window, the typical Linux pseries guest uses the maximum allowed RAM size as the upper limit. We did the same on QEMU side to match that logic. Now we are going to support a GPU RAM pass through which is not available at the guest boot time as it requires the guest driver interaction. As the result, the guest requests a smaller window than it should. Therefore the guest needs to be patched to understand this new memory and so does QEMU. Instead of reimplementing here whatever solution we choose for the guest, this advertises the biggest possible window size limited by 32 bit (as defined by LoPAPR). Since the window size has to be power-of-two (the create rtas call receives a window shift, not a size), this uses 0x8000.0000 as the maximum number of TCEs possible (rather than 32bit maximum of 0xffff.ffff). This is safe as: 1. The guest visible emulated table is allocated in KVM (actual pages are allocated in page fault handler) and QEMU (actual pages are allocated when updated); 2. The hardware table (and corresponding userspace address table) supports sparse allocation and also checks for locked_vm limit so it is unable to cause the host any damage. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2018-12-21 09:37:38 +11:00
Cédric Le Goater	0cddee8d48	spapr/xive: use the VCPU id as a NVT identifier The IVPE scans the O/S CAM line of the XIVE thread interrupt contexts to find a matching Notification Virtual Target (NVT) among the NVTs dispatched on the HW processor threads. On a real system, the thread interrupt contexts are updated by the hypervisor when a Virtual Processor is scheduled to run on a HW thread. Under QEMU, the model will emulate the same behavior by hardwiring the NVT identifier in the thread context registers at reset. The NVT identifier used by the sPAPRXive model is the VCPU id. The END identifier is also derived from the VCPU id. A set of helpers doing the conversion between identifiers are provided for the hcalls configuring the sources and the ENDs. The model does not need a NVT table but the XiveRouter NVT operations are provided to perform some extra checks in the routing algorithm. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2018-12-21 09:37:38 +11:00
Cédric Le Goater	3aa597f650	spapr/xive: introduce a XIVE interrupt controller sPAPRXive models the XIVE interrupt controller of the sPAPR machine. It inherits from the XiveRouter and provisions storage for the routing tables : - Event Assignment Structure (EAS) - Event Notification Descriptor (END) The sPAPRXive model incorporates an internal XiveSource for the IPIs and for the interrupts of the virtual devices of the guest. This model is consistent with XIVE architecture which also incorporates an internal IVSE for IPIs and accelerator interrupts in the IVRE sub-engine. The sPAPRXive model exports two memory regions, one for the ESB trigger and management pages used to control the sources and one for the TIMA pages. They are mapped by default at the addresses found on chip 0 of a baremetal system. This is also consistent with the XIVE architecture which defines a Virtualization Controller BAR for the internal IVSE ESB pages and a Thread Managment BAR for the TIMA. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [dwg: Fold in field accessor fixes] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2018-12-21 09:37:38 +11:00

... 3 4 5 6 7 ...

65881 Commits All Branches Search

65881 Commits

All Branches