Commit Graph

117 Commits

Author SHA1 Message Date
Cédric Le Goater 0bbf14a095 ppc/spapr: add a POWER10 CPU model
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20200507073855.2485680-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-05-27 15:27:29 +10:00
Markus Armbruster b69c3c21a5 qdev: Unrealize must not fail
Devices may have component devices and buses.

Device realization may fail.  Realization is recursive: a device's
realize() method realizes its components, and device_set_realized()
realizes its buses (which should in turn realize the devices on that
bus, except bus_set_realized() doesn't implement that, yet).

When realization of a component or bus fails, we need to roll back:
unrealize everything we realized so far.  If any of these unrealizes
failed, the device would be left in an inconsistent state.  Must not
happen.

device_set_realized() lets it happen: it ignores errors in the roll
back code starting at label child_realize_fail.

Since realization is recursive, unrealization must be recursive, too.
But how could a partly failed unrealize be rolled back?  We'd have to
re-realize, which can fail.  This design is fundamentally broken.

device_set_realized() does not roll back at all.  Instead, it keeps
unrealizing, ignoring further errors.

It can screw up even for a device with no buses: if the lone
dc->unrealize() fails, it still unregisters vmstate, and calls
listeners' unrealize() callback.

bus_set_realized() does not roll back either.  Instead, it stops
unrealizing.

Fortunately, no unrealize method can fail, as we'll see below.

To fix the design error, drop parameter @errp from all the unrealize
methods.

Any unrealize method that uses @errp now needs an update.  This leads
us to unrealize() methods that can fail.  Merely passing it to another
unrealize method cannot cause failure, though.  Here are the ones that
do other things with @errp:

* virtio_serial_device_unrealize()

  Fails when qbus_set_hotplug_handler() fails, but still does all the
  other work.  On failure, the device would stay realized with its
  resources completely gone.  Oops.  Can't happen, because
  qbus_set_hotplug_handler() can't actually fail here.  Pass
  &error_abort to qbus_set_hotplug_handler() instead.

* hw/ppc/spapr_drc.c's unrealize()

  Fails when object_property_del() fails, but all the other work is
  already done.  On failure, the device would stay realized with its
  vmstate registration gone.  Oops.  Can't happen, because
  object_property_del() can't actually fail here.  Pass &error_abort
  to object_property_del() instead.

* spapr_phb_unrealize()

  Fails and bails out when remove_drcs() fails, but other work is
  already done.  On failure, the device would stay realized with some
  of its resources gone.  Oops.  remove_drcs() fails only when
  chassis_from_bus()'s object_property_get_uint() fails, and it can't
  here.  Pass &error_abort to remove_drcs() instead.

Therefore, no unrealize method can fail before this patch.

device_set_realized()'s recursive unrealization via bus uses
object_property_set_bool().  Can't drop @errp there, so pass
&error_abort.

We similarly unrealize with object_property_set_bool() elsewhere,
always ignoring errors.  Pass &error_abort instead.

Several unrealize methods no longer handle errors from other unrealize
methods: virtio_9p_device_unrealize(),
virtio_input_device_unrealize(), scsi_qdev_unrealize(), ...
Much of the deleted error handling looks wrong anyway.

One unrealize methods no longer ignore such errors:
usb_ehci_pci_exit().

Several realize methods no longer ignore errors when rolling back:
v9fs_device_realize_common(), pci_qdev_unrealize(),
spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(),
virtio_device_realize().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-17-armbru@redhat.com>
2020-05-15 07:08:14 +02:00
Markus Armbruster d2623129a7 qom: Drop parameter @errp of object_property_add() & friends
The only way object_property_add() can fail is when a property with
the same name already exists.  Since our property names are all
hardcoded, failure is a programming error, and the appropriate way to
handle it is passing &error_abort.

Same for its variants, except for object_property_add_child(), which
additionally fails when the child already has a parent.  Parentage is
also under program control, so this is a programming error, too.

We have a bit over 500 callers.  Almost half of them pass
&error_abort, slightly fewer ignore errors, one test case handles
errors, and the remaining few callers pass them to their own callers.

The previous few commits demonstrated once again that ignoring
programming errors is a bad idea.

Of the few ones that pass on errors, several violate the Error API.
The Error ** argument must be NULL, &error_abort, &error_fatal, or a
pointer to a variable containing NULL.  Passing an argument of the
latter kind twice without clearing it in between is wrong: if the
first call sets an error, it no longer points to NULL for the second
call.  ich9_pm_add_properties(), sparc32_ledma_realize(),
sparc32_dma_realize(), xilinx_axidma_realize(), xilinx_enet_realize()
are wrong that way.

When the one appropriate choice of argument is &error_abort, letting
users pick the argument is a bad idea.

Drop parameter @errp and assert the preconditions instead.

There's one exception to "duplicate property name is a programming
error": the way object_property_add() implements the magic (and
undocumented) "automatic arrayification".  Don't drop @errp there.
Instead, rename object_property_add() to object_property_try_add(),
and add the obvious wrapper object_property_add().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-15-armbru@redhat.com>
[Two semantic rebase conflicts resolved]
2020-05-15 07:07:58 +02:00
Alexey Kardashevskiy 395a20d3cc ppc/spapr: Move GPRs setup to one place
At the moment "pseries" starts in SLOF which only expects the FDT blob
pointer in r3. As we are going to introduce a OpenFirmware support in
QEMU, we will be booting OF clients directly and these expect a stack
pointer in r1, Linux looks at r3/r4 for the initramdisk location
(although vmlinux can find this from the device tree but zImage from
distro kernels cannot).

This extends spapr_cpu_set_entry_state() to take more registers. This
should cause no behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20200310050733.29805-2-aik@ozlabs.ru>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-03-17 15:08:50 +11:00
David Gibson e8b1144e73 spapr, ppc: Remove VPM0/RMLS hacks for POWER9
For the "pseries" machine, we use "virtual hypervisor" mode where we
only model the CPU in non-hypervisor privileged mode.  This means that
we need guest physical addresses within the modelled cpu to be treated
as absolute physical addresses.

We used to do that by clearing LPCR[VPM0] and setting LPCR[RMLS] to a high
limit so that the old offset based translation for guest mode applied,
which does what we need.  However, POWER9 has removed support for that
translation mode, which meant we had some ugly hacks to keep it working.

We now explicitly handle this sort of translation for virtual hypervisor
mode, so the hacks aren't necessary.  We don't need to set VPM0 and RMLS
from the machine type code - they're now ignored in vhyp mode.  On the cpu
side we don't need to allow LPCR[RMLS] to be set on POWER9 in vhyp mode -
that was only there to allow the hack on the machine side.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2020-03-17 09:41:15 +11:00
Marc-André Lureau 4f67d30b5e qdev: set properties with device_class_set_props()
The following patch will need to handle properties registration during
class_init time. Let's use a device_class_set_props() setter.

spatch --macro-file scripts/cocci-macro-file.h  --sp-file
./scripts/coccinelle/qdev-set-props.cocci --keep-comments --in-place
--dir .

@@
typedef DeviceClass;
DeviceClass *d;
expression val;
@@
- d->props = val
+ device_class_set_props(d, val)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20200110153039.1379601-20-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 20:59:15 +01:00
Greg Kurz 0990ce6a2e ppc: Add intc_destroy() handlers to SpaprInterruptController/PnvChip
SpaprInterruptControllerClass and PnvChipClass have an intc_create() method
that calls the appropriate routine, ie. icp_create() or xive_tctx_create(),
to establish the link between the VCPU and the presenter component of the
interrupt controller during realize.

There aren't any symmetrical call to be called when the VCPU gets unrealized
though. It is assumed that object_unparent() is the only thing to do.

This is questionable because the parenting logic around the CPU and
presenter objects is really an implementation detail of the interrupt
controller. It shouldn't be open-coded in the machine code.

Fix this by adding an intc_destroy() method that undoes what was done in
intc_create(). Also NULLify the presenter pointers to avoid having
stale pointers around. This will allow to reliably check if a vCPU has
a valid presenter.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157192724208.3146912.7254684777515287626.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2019-11-18 11:49:11 +01:00
Cédric Le Goater d49e8a9b46 ppc: Reset the interrupt presenter from the CPU reset handler
On the sPAPR machine and PowerNV machine, the interrupt presenters are
created by a machine handler at the core level and are reset
independently. This is not consistent and it raises issues when it
comes to handle hot-plugged CPUs. In that case, the presenters are not
reset. This is less of an issue in XICS, although a zero MFFR could
be a concern, but in XIVE, the OS CAM line is not set and this breaks
the presenting algorithm. The current code has workarounds which need
a global cleanup.

Extend the sPAPR IRQ backend and the PowerNV Chip class with a new
cpu_intc_reset() handler called by the CPU reset handler and remove
the XiveTCTX reset handler which is now redundant.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191022163812.330-6-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-24 13:33:45 +11:00
Greg Kurz d1f2b4691a spapr_cpu_core: Implement DeviceClass::reset
Since vCPUs aren't plugged into a bus, we manually register a reset
handler for each vCPU. We also call this handler at realize time
to ensure hot plugged vCPUs are reset before being exposed to the
guest. This results in vCPUs being reset twice at machine reset.
It doesn't break anything but it is slightly suboptimal and above
all confusing.

The hotplug path in device_set_realized() already knows how to reset
a hotplugged device if the device reset handler is present. Implement
one for sPAPR CPU cores that resets all vCPUs under a core.

While here rename spapr_cpu_reset() to spapr_reset_vcpu() for
consistency with spapr_realize_vcpu() and spapr_unrealize_vcpu().

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[clg: add documentation on the reset helper usage ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191022163812.330-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-24 13:32:33 +11:00
Cédric Le Goater 90f8db52bb spapr: move CPU reset after presenter creation
This change prepares ground for future changes which will reset the
interrupt presenter in the reset handler of the sPAPR and PowerNV
cores.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191022163812.330-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-24 13:32:33 +11:00
David Gibson ebd6be089b spapr, xics, xive: Move cpu_intc_create from SpaprIrq to SpaprInterruptController
This method essentially represents code which belongs to the interrupt
controller, but needs to be called on all possible intcs, rather than
just the currently active one.  The "dual" version therefore calls
into the xics and xive versions confusingly.

Handle this more directly, by making it instead a method on the intc
backend, and always calling it on every backend that exists.

While we're there, streamline the error reporting a bit.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24 09:36:55 +11:00
Greg Kurz b1e8156743 spapr: Set compat mode in spapr_core_plug()
A recent change in spapr_machine_reset() showed that resetting the compat
mode in spapr_machine_reset() for the boot vCPU and in spapr_cpu_reset()
for all other vCPUs was fragile. The fix was thus to reset the compat mode
for all vCPUs in spapr_machine_reset(), but we still have to propagate
it to hot-plugged CPUs. This is still performed from spapr_cpu_reset(),
hence resulting in ppc_set_compat() being called twice for every vCPU at
machine reset. Apart from wasting cycles, which isn't really an issue
during machine reset, this seems to indicate that spapr_cpu_reset() isn't
the best place to set the compat mode.

A natural candidate for CPU-hotplug specific code is spapr_core_plug().
Also, it sits in the same file as spapr_machine_reset() : this makes
it easier for someone who wants to know when the compat PVR is set.

Call ppc_set_compat() from there. This doesn't need to be done for
initial vCPUs since the compat PVR is 0 and spapr_machine_reset() sets
the appropriate value later. No need to do this on manually added vCPUS
on the destination QEMU during migration since the compat PVR is
part of the migrated vCPU state. Both conditions can be checked with
spapr_drc_hotplugged().

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156701285312.499757.7807417667750711711.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-29 09:46:07 +10:00
Laurent Vivier ce03a193e1 pseries: Fix compat_pvr on reset
If we a migrate P8 machine to a P9 machine, the migration fails on
destination with:

  error while loading state for instance 0x1 of device 'cpu'
  load of migration failed: Operation not permitted

This is caused because the compat_pvr field is only present for the first
CPU.
Originally, spapr_machine_reset() calls ppc_set_compat() to set the value
max_compat_pvr for the first cpu and this was propagated to all CPUs by
spapr_cpu_reset().  Now, as spapr_cpu_reset() is called before that, the
value is not propagated to all CPUs and the migration fails.

To fix that, propagate the new value to all CPUs in spapr_machine_reset().

Fixes: 25c9780d38 ("spapr: Reset CAS & IRQ subsystem after devices")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20190826090812.19080-1-lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-29 09:46:07 +10:00
Markus Armbruster 12e9493df9 Include hw/boards.h a bit less
hw/boards.h pulls in almost 60 headers.  The less we include it into
headers, the better.  As a first step, drop superfluous inclusions,
and downgrade some more to what's actually needed.  Gets rid of just
one inclusion into a header.

Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-23-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
2019-08-16 13:31:53 +02:00
Markus Armbruster a27bd6c779 Include hw/qdev-properties.h less
In my "build everything" tree, changing hw/qdev-properties.h triggers
a recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

Many places including hw/qdev-properties.h (directly or via hw/qdev.h)
actually need only hw/qdev-core.h.  Include hw/qdev-core.h there
instead.

hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h
and hw/qdev-properties.h, which in turn includes hw/qdev-core.h.
Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h.

While there, delete a few superfluous inclusions of hw/qdev-core.h.

Touching hw/qdev-properties.h now recompiles some 1200 objects.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190812052359.30071-22-armbru@redhat.com>
2019-08-16 13:31:53 +02:00
Markus Armbruster d645427057 Include migration/vmstate.h less
In my "build everything" tree, changing migration/vmstate.h triggers a
recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

hw/hw.h supposedly includes it for convenience.  Several other headers
include it just to get VMStateDescription.  The previous commit made
that unnecessary.

Include migration/vmstate.h only where it's still needed.  Touching it
now recompiles only some 1600 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-16-armbru@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster 71e8a91585 Include sysemu/reset.h a lot less
In my "build everything" tree, changing sysemu/reset.h triggers a
recompile of some 2600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

The main culprit is hw/hw.h, which supposedly includes it for
convenience.

Include sysemu/reset.h only where it's needed.  Touching it now
recompiles less than 200 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-9-armbru@redhat.com>
2019-08-16 13:31:52 +02:00
Suraj Jitindar Singh 70de096748 target/ppc: Set PSSCR_EC on cpu halt to prevent spurious wakeup
The processor stop status and control register (PSSCR) is used to
control the power saving facilities of the thread. The exit criterion
bit (EC) is used to specify whether the thread should be woken by any
interrupt (EC == 0) or only an interrupt enabled in the LPCR to wake the
thread (EC == 1).

The rtas facilities start-cpu and self-stop are used to transition a
vcpu between the stopped and running states. When a vcpu is stopped it
may only be started again by the start-cpu rtas call.

Currently a vcpu in the stopped state will start again whenever an
interrupt comes along due to PSSCR_EC being cleared, and while this is
architecturally correct for a hardware thread, a vcpu is expected to
only be woken by calling start-cpu. This means when performing a reboot
on a tcg machine that the secondary threads will restart while the
primary is still in slof, this is unsupported and causes call traces
like:

SLOF **********************************************************************
QEMU Starting
 Build Date = Jan 14 2019 18:00:39
 FW Version = git-a5b428e1c1eae703
 Press "s" to enter Open Firmware.

qemu: fatal: Trying to deliver HV exception (MSR) 70 with no HV support

NIP 6d61676963313230   LR 000000003dbe0308 CTR 6d61676963313233 XER 0000000000000000 CPU#1
MSR 0000000000000000 HID0 0000000000000000  HF 0000000000000000 iidx 3 didx 3
TB 00000026 115746031956 DECR 18446744073326238463
GPR00 000000003dbe0308 000000003e669fe0 000000003dc10700 0000000000000003
GPR04 000000003dc62198 000000003dc62178 000000003dc0ea48 0000000000000030
GPR08 000000003dc621a8 0000000000000018 000000003e466008 000000003dc50700
GPR12 c00000000093a4e0 c00000003ffff300 c00000003e533f90 0000000000000000
GPR16 0000000000000000 0000000000000000 000000003e466010 000000003dc0b040
GPR20 0000000000008000 000000000000f003 0000000000000006 000000003e66a050
GPR24 000000003dc06400 000000003dc0ae70 0000000000000003 000000000000f001
GPR28 000000003e66a060 ffffffffffffffff 6d61676963313233 0000000000000028
CR 28000222  [ E  L  -  -  -  E  E  E  ]             RES ffffffffffffffff
FPR00 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR08 0000000000000000 0000000000000000 0000000000000000 00000000311825e0
FPR12 00000000311825e0 0000000000000000 0000000000000000 0000000000000000
FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPSCR 0000000000000000
 SRR0 000000003dbe06b0  SRR1 0000000000080000    PVR 00000000004e1200 VRSAVE 0000000000000000
SPRG0 000000003dbe0308 SPRG1 000000003e669fe0  SPRG2 00000000000000d8  SPRG3 000000003dbe0308
SPRG4 0000000000000000 SPRG5 0000000000000000  SPRG6 0000000000000000  SPRG7 0000000000000000
HSRR0 6d61676963313230 HSRR1 0000000000000000
 CFAR 000000003dbe3e64
 LPCR 0000000004020008
 PTCR 0000000000000000   DAR 0000000000000000  DSISR 0000000000000000
Aborted (core dumped)

To fix this, set the PSSCR_EC bit when a vcpu is stopped to disable it
from coming back online until the start-cpu rtas call is made.

Fixes: 21c0d66a9c ("target/ppc: Fix support for "STOP light" states on POWER9")

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190516005744.24366-1-sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:45 +10:00
David Gibson ce2918cbc3 spapr: Use CamelCase properly
The qemu coding standard is to use CamelCase for type and structure names,
and the pseries code follows that... sort of.  There are quite a lot of
places where we bend the rules in order to preserve the capitalization of
internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR".

That was a bad idea - it frequently leads to names ending up with hard to
read clusters of capital letters, and means they don't catch the eye as
type identifiers, which is kind of the point of the CamelCase convention in
the first place.

In short, keeping type identifiers look like CamelCase is more important
than preserving standard capitalization of internal "words".  So, this
patch renames a heap of spapr internal type names to a more standard
CamelCase.

In addition to case changes, we also make some other identifier renames:
  VIOsPAPR* -> SpaprVio*
    The reverse word ordering was only ever used to mitigate the capital
    cluster, so revert to the natural ordering.
  VIOsPAPRVTYDevice -> SpaprVioVty
  VIOsPAPRVLANDevice -> SpaprVioVlan
    Brevity, since the "Device" didn't add useful information
  sPAPRDRConnector -> SpaprDrc
  sPAPRDRConnectorClass -> SpaprDrcClass
    Brevity, and makes it clearer this is the same thing as a "DRC"
    mentioned in many other places in the code

This is 100% a mechanical search-and-replace patch.  It will, however,
conflict with essentially any and all outstanding patches touching the
spapr code.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12 14:33:05 +11:00
Cédric Le Goater a28b9a5a8d spapr: move the interrupt presenters under machine_data
Next step is to remove them from under the PowerPCCPU

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-04 18:44:18 +11:00
Cédric Le Goater 3ff73aa241 ppc: replace the 'Object *intc' by a 'ICPState *icp' pointer under the CPU
Now that the 'intc' pointer is only used by the XICS interrupt mode,
let's make things clear and use a XICS type and name.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-01-09 09:28:14 +11:00
Cédric Le Goater 129dbe6926 ppc/xive: introduce a XiveTCTX pointer under PowerPCCPU
which will be used by the machine only when the XIVE interrupt mode is
in use.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-01-09 09:28:14 +11:00
Cédric Le Goater 8fa1f4ef38 spapr: modify the prototype of the cpu_intc_create() method
Today, the interrupt presenter is linked to a CPU using the
cpu_intc_create() method of the sPAPR IRQ backend. The resulting
object is assigned to the PowerPCCPU 'intc' pointer whatever the
interrupt mode, XICS or XIVE.

To support the 'dual' interrupt mode, we will need to distinguish
between the two presenter objects and for that, we plan to introduce a
second interrupt presenter object pointer under the PowerPCCPU. The
modifications below move the assignment of the presenter object under
the cpu_intc_create() method to prepare ground for the future changes.

Both sPAPR and PowerNV machines are impacted.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-01-09 09:28:14 +11:00
Cédric Le Goater 3ba3d0bc33 spapr: introduce an 'ic-mode' machine option
This option is used to select the interrupt controller mode (XICS or
XIVE) with which the machine will operate. XICS being the default
mode for now.

When running a machine with the XIVE interrupt mode backend, the guest
OS is required to have support for the XIVE exploitation mode. In the
case of legacy OS, the mode selected by CAS should be XICS and the OS
should fail to boot. However, QEMU could possibly detect it, terminate
the boot process and reset to stop in the SLOF firmware. This is not
yet handled.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21 09:40:43 +11:00
Cédric Le Goater 1a937ad7e7 spapr: allocate the interrupt thread context under the CPU core
Each interrupt mode has its own specific interrupt presenter object,
that we store under the CPU object, one for XICS and one for XIVE.

Extend the sPAPR IRQ backend with a new handler to support them both.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21 09:39:13 +11:00
Nikunj A Dadhania a84f71793a target/ppc/kvm: set vcpu as online/offline
Set the newly added register(KVM_REG_PPC_ONLINE) to indicate if the vcpu is
online(1) or offline(0)

KVM will use this information to set the RWMR register, which controls the PURR
and SPURR accumulation.

CC: paulus@samba.org
Signed-off-by: Nikunj A Dadhania <nikunj@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-09-05 16:06:19 +10:00
Jose Ricardo Ziviani b12a4efb76 Fix a deadlock case in the CPU hotplug flow
We need to set cs->halted to 1 before calling ppc_set_compat. The reason
is that ppc_set_compat kicks up the new thread created to manage the
hotplugged KVM virtual CPU and the code drives directly to KVM_RUN
ioctl. When cs->halted is 1, the code:

int kvm_cpu_exec(CPUState *cpu)
...
     if (kvm_arch_process_async_events(cpu)) {
         atomic_set(&cpu->exit_request, 0);
         return EXCP_HLT;
     }
...

returns before it reaches KVM_RUN, giving time to the main thread to
finish its job. Otherwise we can fall in a deadlock because the KVM
thread will issue the KVM_RUN ioctl while the main thread is setting up
KVM registers. Depending on how these jobs are scheduled we'll end up
freezing QEMU.

The following output shows kvm_vcpu_ioctl sleeping because it cannot get
the mutex and never will.
PS: kvm_vcpu_ioctl was triggered kvm_set_one_reg - compat_pvr.

STATE: TASK_UNINTERRUPTIBLE|TASK_WAKEKILL

PID: 61564  TASK: c000003e981e0780  CPU: 48  COMMAND: "qemu-system-ppc"
 #0 [c000003e982679a0] __schedule at c000000000b10a44
 #1 [c000003e98267a60] schedule at c000000000b113a8
 #2 [c000003e98267a90] schedule_preempt_disabled at c000000000b11910
 #3 [c000003e98267ab0] __mutex_lock at c000000000b132ec
 #4 [c000003e98267bc0] kvm_vcpu_ioctl at c00800000ea03140 [kvm]
 #5 [c000003e98267d20] do_vfs_ioctl at c000000000407d30
 #6 [c000003e98267dc0] ksys_ioctl at c000000000408674
 #7 [c000003e98267e10] sys_ioctl at c0000000004086f8
 #8 [c000003e98267e30] system_call at c00000000000b488

crash> struct -x kvm.vcpus 0xc000003da0000000
vcpus = {0xc000003db4880000, 0xc000003d52b80000, 0xc0000039e9c80000, 0xc000003d0e200000, 0xc000003d58280000, 0x0, 0x0, ...}

crash> struct -x kvm_vcpu.mutex.owner 0xc000003d58280000
  mutex.owner = {
    counter = 0xc000003a23a5c881 <- flag 1: waiters
  },

crash> bt 0xc000003a23a5c880
PID: 61579  TASK: c000003a23a5c880  CPU: 9   COMMAND: "CPU 4/KVM"
(active)

crash> struct -x kvm_vcpu.mutex.wait_list 0xc000003d58280000
  mutex.wait_list = {
    next = 0xc000003e98267b10,
    prev = 0xc000003e98267b10
  },

crash> struct -x mutex_waiter.task 0xc000003e98267b10
  task = 0xc000003e981e0780

The following command-line was used to reproduce the problem (note: gdb
and trace can change the results).

 $ qemu-ppc/build/ppc64-softmmu/qemu-system-ppc64 -cpu host \
     -enable-kvm -m 4096 \
     -smp 4,maxcpus=8,sockets=1,cores=2,threads=4 \
     -display none -nographic \
     -drive file=disk1.qcow2,format=qcow2
 ...
 (qemu) device_add host-spapr-cpu-core,core-id=4
[no interaction is possible after it, only SIGKILL to take the terminal
back]

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-09-03 11:46:43 +10:00
Cédric Le Goater ef01ed9d19 spapr: introduce a IRQ controller backend to the machine
This proposal moves all the related IRQ routines of the sPAPR machine
behind a sPAPR IRQ backend interface 'spapr_irq' to prepare for future
changes. First of which will be to increase the size of the IRQ number
space, then, will follow a new backend for the POWER9 XIVE IRQ controller.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21 14:28:45 +10:00
Bharata B Rao cc71c7760e spapr_cpu_core: vmstate_[un]register per-CPU data from (un)realizefn
VMStateDescription vmstate_spapr_cpu_state was added by commit
b94020268e (spapr_cpu_core: migrate per-CPU data) to migrate per-CPU
data with the required vmstate registration and unregistration calls.
However the unregistration is being done only from vcpu creation error path
and not from CPU delete path.

This causes migration to fail with the following error if migration is
attempted after a CPU unplug like this:
Unknown savevm section or instance 'spapr_cpu' 16
Additionally this leaves the source VM unresponsive after migration failure.

Fix this by ensuring the vmstate_unregister happens during CPU removal.
Fixing this becomes easier when vmstate (un)registration calls are moved to
vcpu (un)realize functions which is what this patch does.

Fixes: https://bugs.launchpad.net/qemu/+bug/1785972
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21 11:09:34 +10:00
David Gibson e5ca28ecab spapr: Don't rewrite mmu capabilities in KVM mode
Currently during KVM initialization on POWER, kvm_fixup_page_sizes()
rewrites a bunch of information in the cpu state to reflect the
capabilities of the host MMU and KVM.  This overwrites the information
that's already there reflecting how the TCG implementation of the MMU will
operate.

This means that we can get guest-visibly different behaviour between KVM
and TCG (and between different KVM implementations).  That's bad.  It also
prevents migration between KVM and TCG.

The pseries machine type now has filtering of the pagesizes it allows the
guest to use which means it can present a consistent model of the MMU
across all accelerators.

So, we can now replace kvm_fixup_page_sizes() with kvm_check_mmu() which
merely verifies that the expected cpu model can be faithfully handled by
KVM, rather than updating the cpu model to match KVM.

We call kvm_check_mmu() from the spapr cpu reset code.  This is a hack:
conceptually it makes more sense where fixup_page_sizes() was - in the KVM
cpu init path.  However, doing that would require moving the platform's
pagesize filtering much earlier, which would require a lot of work making
further adjustments.  There wouldn't be a lot of concrete point to doing
that, since the only KVM implementation which has the awkward MMU
restrictions is KVM HV, which can only work with an spapr guest anyway.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2018-06-22 14:19:07 +10:00
David Gibson e2e4f64118 spapr: Add cpu_apply hook to capabilities
spapr capabilities have an apply hook to actually activate (or deactivate)
the feature in the system at reset time.  However, a number of capabilities
affect the setup of cpus, and need to be applied to each of them -
including hotplugged cpus for extra complication.  To make this simpler,
add an optional cpu_apply hook that is called from spapr_cpu_reset().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2018-06-21 21:22:53 +10:00
Greg Kurz 7f9fe3f02d spapr_cpu_core: migrate VPA related state
QEMU implements the "Shared Processor LPAR" (SPLPAR) option, which allows
the hypervisor to time-slice a physical processor into multiple virtual
processor. The intent is to allow more guests to run, and to optimize
processor utilization.

The guest OS can cede idle VCPUs, so that their processing capacity may
be used by other VCPUs, with the H_CEDE hcall. The guest OS can also
optimize spinlocks, by confering the time-slice of a spinning VCPU to the
spinlock holder if it's currently notrunning, with the H_CONFER hcall.

Both hcalls depend on a "Virtual Processor Area" (VPA) to be registered
by the guest OS, generally during early boot. Other per-VCPU areas can
be registered: the "SLB Shadow Buffer" which allows a more efficient
dispatching of VCPUs, and the "Dispatch Trace Log Buffer" (DTL) which
is used to compute time stolen by the hypervisor. Both DTL and SLB Shadow
areas depend on the VPA to be registered.

The VPA/SLB Shadow/DTL are state that QEMU should migrate, but this doesn't
happen, for no apparent reason other than it was just never coded. This
causes the features listed above to stop working after migration, and it
breaks the logic of the H_REGISTER_VPA hcall in the destination.

The VPA is set at the guest request, ie, we don't have to migrate
it before the guest has actually set it. This patch hence adds an
"spapr_cpu/vpa" subsection to the recently introduced per-CPU machine
data migration stream.

Since DTL and SLB Shadow are optional and both depend on VPA, they get
their own subsections "spapr_cpu/vpa/slb_shadow" and "spapr_cpu/vpa/dtl"
hanging from the "spapr_cpu/vpa" subsection.

Note that this won't break migration to older QEMUs. Is is already handled
by only registering the vmstate handler for per-CPU data with newer machine
types.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-21 21:22:53 +10:00
Greg Kurz b94020268e spapr_cpu_core: migrate per-CPU data
A per-CPU machine data pointer was recently added to PowerPCCPU. The
motivation is to to hide platform specific details from the core CPU
code. This per-CPU data can hold state which is relevant to the guest
though, eg, Virtual Processor Areas, and we should migrate this state.

This patch adds the plumbing so that we can migrate the per-CPU data
for PAPR guests. We only do this for newer machine types for the sake
of backward compatibility. No state is migrated for the moment: the
vmstate_spapr_cpu_state structure will be populated by subsequent
patches.

Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Fix some trivial spelling and spacing errors]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-21 21:22:53 +10:00
David Gibson 7388efafc2 target/ppc, spapr: Move VPA information to machine_data
CPUPPCState currently contains a number of fields containing the state of
the VPA.  The VPA is a PAPR specific concept covering several guest/host
shared memory areas used to communicate some information with the
hypervisor.

As a PAPR concept this is really machine specific information, although it
is per-cpu, so it doesn't really belong in the core CPU state structure.

There's also other information that's per-cpu, but platform/machine
specific.  So create a (void *)machine_data in PowerPCCPU which can be
used by the machine to locate per-cpu data.  Intialization, lifetime and
cleanup of machine_data is entirely up to the machine type.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
2018-06-16 16:32:50 +10:00
Greg Kurz d9f0e34cb7 spapr_cpu_core: introduce spapr_create_vcpu()
This moves some code out from spapr_cpu_core_realize() for clarity. No
functional change.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-16 16:32:33 +10:00
Greg Kurz 9986ddec4c spapr_cpu_core: add missing rollback on realization path
The spapr_realize_vcpu() function doesn't rollback in case of error.
This isn't a problem with coldplugged CPUs because the machine won't
start and QEMU will exit. Hotplug is a different story though: the
CPU thread is started under object_property_set_bool() and it assumes
it can access the CPU object.

If icp_create() fails, we return an error without unregistering the
reset handler for this CPU, and we let the underlying QEMU thread for
this CPU alive. Since spapr_cpu_core_realize() doesn't care to unrealize
already realized CPUs either, but happily frees all of them anyway, the
CPU thread crashes instantly:

(qemu) device_add host-spapr-cpu-core,core-id=1,id=gku
GKU: failing icp_create (cpu 0x11497fd0)
                             ^^^^^^^^^^
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffee3feaa0 (LWP 24725)]
0x00000000104c8374 in object_dynamic_cast_assert (obj=0x11497fd0,
                                                  ^^^^^^^^^^^^^^
                                             pointer to the CPU object
623         trace_object_dynamic_cast_assert(obj ? obj->class->type->name
(gdb) p obj->class->type
$1 = (Type) 0x0
(gdb) p * obj
$2 = {class = 0x10ea9c10, free = 0x11244620,
                                 ^^^^^^^^^^
                              should be g_free
(gdb) p g_free
$3 = {<text variable, no debug info>} 0x7ffff282bef0 <g_free>

obj is a dangling pointer to the CPU that was just destroyed in
spapr_cpu_core_realize().

This patch adds proper rollback to both spapr_realize_vcpu() and
spapr_cpu_core_realize().

Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Fixed a conflict due to a change in my tree]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-16 16:32:33 +10:00
Greg Kurz 27607c1cdc spapr_cpu_core: fix potential leak in spapr_cpu_core_realize()
Commit 94ad93bd97 (QEMU 2.12) switched to instantiate CPUs separately
but it missed to adapt the error path accordingly. If something fails in
the CPU creation loop, then the CPU object that was just created is leaked.

The error paths in this function are a bit obfuscated, and adding
yet another label to free this CPU object makes it worse. We should
move the block of the loop to a separate function, with a proper
rollback path, but this is a bigger cleanup.

For now, let's just fix the bug by adding the missing calls to
object_unref(). This will allow easier backport to older QEMU
versions.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-16 16:32:33 +10:00
Greg Kurz dbb3e8d5da spapr_cpu_core: convert last snprintf() to g_strdup_printf()
Because this is the preferred practice in QEMU.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-16 16:32:33 +10:00
David Gibson b1d40d6e09 spapr: Clean up cpu realize/unrealize paths
spapr_cpu_init() and spapr_cpu_destroy() are only called from the spapr
cpu core realize/unrealize paths, and really can only be called from there.

Those are all short functions, so fold the pairs together for simplicity.
While we're there rename some functions and change some parameter types
for brevity and clarity.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-06-16 16:32:33 +10:00
David Gibson 47a9b55154 spapr: Clean up handling of LPCR power-saving exit bits
To prevent spurious wakeups on cpus that are supposed to be disabled, we
need to clear the LPCR bits which control certain wakeup events.
spapr_cpu_reset() has separate cases here for boot and non-boot (initially
inactive) cpus.  rtas_start_cpu() then turns the LPCR bits on when the
non-boot cpus are activated.

But explicit checks against first_cpu are not how we usually do things:
instead spapr_cpu_reset() generally sets things up for non-boot (inactive)
cpus, then spapr_machine_reset() and/or rtas_start_cpu() override as
necessary.

So, do that instead.  Because the LPCR activation is identical for boot
cpus and non-boot cpus just activated with rtas_start_cpu() we can put the
code common in spapr_cpu_set_entry_state().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
2018-05-04 15:00:37 +10:00
David Gibson da20aed12a spapr: Move PAPR mode cpu setup fully to spapr code
cpu_ppc_set_papr() does several things:
    1) it sets up the virtual hypervisor interface
    2) it prevents the cpu from ever entering hypervisor mode
    3) it tells KVM that we're emulating a cpu in PAPR mode
and 4) it configures the LPCR and AMOR (hypervisor privileged registers)
       so that TCG will behave correctly for PAPR guests, without
       attempting to emulate the cpu in hypervisor mode

(1) & (2) make sense for any virtual hypervisor (if another one ever
exists).

(3) belongs more properly in the machine type specific to a PAPR guest, so
move it to spapr_cpu_init().  While we're at it, remove an ugly test on
kvm_enabled() by making kvmppc_set_papr() a safe no-op on non-KVM.

(4) also belongs more properly in the machine type specific code.  (4) is
done by mangling the default values of the SPRs, so that they will be set
correctly at reset time.  Manipulating usually-static parameters of the cpu
model like this is kind of ugly, especially since the values used really
have more to do with the platform than the cpu.

The spapr code already has places for PAPR specific initializations of
register state in spapr_cpu_reset(), so move this handling there.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
2018-05-04 15:00:37 +10:00
David Gibson 84369f639e spapr: Make a helper to set up cpu entry point state
Under PAPR, only the boot CPU is active when the system starts.  Other cpus
must be explicitly activated using an RTAS call.  The entry state for the
boot and secondary cpus isn't identical, but it has some things in common.
We're going to add a bit more common setup later, too, so to simplify
make a helper which sets up the common entry state for both boot and
secondary cpu threads.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-05-04 15:00:37 +10:00
David Gibson 88f42c6773 spapr: Set compatibility mode before the rest of spapr_cpu_reset()
Although the order doesn't really matter at the moment, it's possible
other initializastions could depend on the compatiblity mode, so make sure
we set it first in spapr_cpu_reset().

While we're at it drop the test against first_cpu.  Setting the compat mode
to the value it already has is redundant, but harmless, so we might as well
make a small simplification to the code.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-04-27 18:05:23 +10:00
Greg Kurz 648edb6475 spapr: move VCPU calculation to core machine code
The VCPU ids are currently computed and assigned to each individual
CPU threads in spapr_cpu_core_realize(). But the numbering logic
of VCPU ids is actually a machine-level concept, and many places
in hw/ppc/spapr.c also have to compute VCPU ids out of CPU indexes.

The current formula used in spapr_cpu_core_realize() is:

    vcpu_id = (cc->core_id * spapr->vsmt / smp_threads) + i

where:

    cc->core_id is a multiple of smp_threads
    cpu_index = cc->core_id + i
    0 <= i < smp_threads

So we have:

    cpu_index % smp_threads == i
    cc->core_id / smp_threads == cpu_index / smp_threads

hence:

    vcpu_id =
        (cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads;

This formula was used before VSMT at the time VCPU ids where computed
at the target emulation level. It has the advantage of being useable
to derive a VPCU id out of a CPU index only. It is fitted for all the
places where the machine code has to compute a VCPU id.

This patch introduces an accessor to set the VCPU id in a PowerPCCPU object
using the above formula. It is a first step to consolidate all the VCPU id
logic in a single place.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-02-16 12:14:26 +11:00
Greg Kurz 9012a53f06 spapr: fix device tree properties when using compatibility mode
Commit 51f84465dd changed the compatility mode setting logic:
- machine reset only sets compatibility mode for the boot CPU
- compatibility mode is set for other CPUs when they are put online
  by the guest with the "start-cpu" RTAS call

This causes a regression for machines started with max-compat-cpu:
the device tree nodes related to secondary CPU cores contain wrong
"cpu-version" and "ibm,pa-features" values, as shown below.

Guest started on a POWER8 host with:
     -smp cores=2 -machine pseries,max-cpu-compat=compat7

                        ibm,pa-features = [18 00 f6 3f c7 c0 80 f0 80 00
 00 00 00 00 00 00 00 00 80 00 80 00 80 00 00 00];
                        cpu-version = <0x4d0200>;

                               ^^^
                        second CPU core

                        ibm,pa-features = <0x600f63f 0xc70080c0>;
                        cpu-version = <0xf000003>;

                               ^^^
                          boot CPU core

The second core is advertised in raw POWER8 mode. This happens because
CAS assumes all CPUs to have the same compatibility mode. Since the
boot CPU already has the requested compatibility mode, the CAS code
does not set it for the secondary one, and exposes the bogus device
tree properties in in the CAS response to the guest.

A similar situation is observed when hot-plugging a CPU core. The
related device tree properties are generated and exposed to guest
with the "ibm,configure-connector" RTAS before "start-cpu" is called.
The CPU core is advertised to the guest in raw mode as well.

It both cases, it boils down to the fact that "start-cpu" happens too
late. This can be fixed globally by propagating the compatibility mode
of the boot CPU to the other CPUs during reset.  For this to work, the
compatibility mode of the boot CPU must be set before the machine code
actually resets all CPUs.

It is not needed to set the compatibility mode in "start-cpu" anymore,
so the code is dropped.

Fixes: 51f84465dd
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-20 17:15:05 +11:00
Philippe Mathieu-Daudé e9808d0969 hw: use "qemu/osdep.h" as first #include in source files
applied using ./scripts/clean-includes

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-12-18 17:07:02 +03:00
Cédric Le Goater ed0c37eedf ppc/xics: assign of the CPU 'intc' pointer under the core
The 'intc' pointer of the CPU references the interrupt presenter in
the XICS interrupt mode. When the XIVE interrupt mode is available and
activated, the machine will need to reassign this pointer to reflect
the change.

Moving this assignment under the realize routine of the CPU will ease
the process when the interrupt mode is toggled.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15 09:49:24 +11:00
Cédric Le Goater 4f7a47beeb ppc/xics: introduce an icp_create() helper
The sPAPR and the PowerNV core objects create the interrupt presenter
object of the CPUs in a very similar way. Let's provide a common
routine in which we use the presenter 'type' as a child identifier.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15 09:49:24 +11:00
Cédric Le Goater d6322252b3 spapr/rtas: fix reboot of a a SMP TCG guest
Just like for hot unplug CPUs, when a guest is rebooted, the secondary
CPUs can be awaken by the decrementer and start entering SLOF at the
same time the boot CPU is.

To be safe, let's disable on the secondaries all the exceptions which
can cause an exit while the CPU is in power-saving mode.

Based on previous work from Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15 09:49:24 +11:00
Greg Kurz 94ad93bd97 spapr_cpu_core: instantiate CPUs separately
The current code assumes that only the CPU core object holds a
reference on each individual CPU object, and happily frees their
allocated memory when the core is unrealized. This is dangerous
as some other code can legitimely keep a pointer to a CPU if it
calls object_ref(), but it would end up with a dangling pointer.

Let's allocate all CPUs with object_new() and let QOM free them
when their reference count reaches zero. This greatly simplify the
code as we don't have to fiddle with the instance size anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15 09:49:23 +11:00