/*
 * PowerPC implementation of KVM hooks
 *
 * Copyright IBM Corp. 2007
 * Copyright (C) 2011 Freescale Semiconductor, Inc.
 *
 * Authors:
 *  Jerone Young <jyoung5@us.ibm.com>
 *  Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
 *  Hollis Blanchard <hollisb@us.ibm.com>
 *
 * This work is licensed under the terms of the GNU GPL, version 2 or later.
 * See the COPYING file in the top-level directory.
 *
 */

#include "qemu/osdep.h"
#include <dirent.h>
#include <sys/ioctl.h>
ppc64: Rudimentary Support for extra page sizes on server CPUs
More recent Power server chips (i.e. based on the 64 bit hash MMU)
support more than just the traditional 4k and 16M page sizes. This
can get quite complicated, because which page sizes are supported,
which combinations are supported within an MMU segment and how these
page sizes are encoded both in the SLB entry and the hash PTE can vary
depending on the CPU model (they are not specified by the
architecture). In addition the firmware or hypervisor may not permit
use of certain page sizes, for various reasons. Whether various page
sizes are supported on KVM, for example, depends on whether the PR or
HV variant of KVM is in use, and on the page size of the memory
backing the guest's RAM.
This patch adds information to the CPUState and cpu defs to describe
the supported page sizes and encodings. Since TCG does not yet
support any extended page sizes, we just set this to NULL in the
static CPU definitions, expanding this to the default 4k and 16M page
sizes when we initialize the cpu state. When using KVM, however, we
instead determine available page sizes using the new
KVM_PPC_GET_SMMU_INFO call. For old kernels without that call, we use
some defaults, with some guesswork which should do the right thing for
existing HV and PR implementations. The fallback might not be correct
for future versions, but that's ok, because they'll have
KVM_PPC_GET_SMMU_INFO.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-06-19 03:56:25 +08:00
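As a rough, standalone sketch of the idea in the commit message above (not QEMU's actual code): a table of supported page sizes starts empty, is expanded to the traditional 4k and 16M defaults, and a page size is then accepted only if it is in the table and, when the hypervisor ties page sizes to the backing store, no larger than the backing page size. All `sketch_*` names and `SKETCH_*` constants are hypothetical simplifications; the real layout is the kernel's `struct kvm_ppc_smmu_info`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_SIZES 8       /* hypothetical table capacity */
#define SKETCH_SLB_L     0x100u  /* hypothetical stand-in for the SLB large-page bit */

/* One base page size usable for a segment (simplified, hypothetical). */
struct sketch_seg_page_size {
    uint32_t page_shift;  /* log2 of the page size; 0 marks an unused slot */
    uint32_t slb_enc;     /* encoding placed in the SLB entry */
};

struct sketch_smmu_info {
    bool sizes_real;      /* true: sizes are limited by the backing page size */
    struct sketch_seg_page_size sps[SKETCH_MAX_SIZES];
};

/* Expand an empty table to the default 4k and 16M page sizes. */
static void sketch_set_defaults(struct sketch_smmu_info *info)
{
    assert(info != NULL);
    memset(info, 0, sizeof(*info));
    info->sps[0].page_shift = 12;            /* 4k */
    info->sps[0].slb_enc = 0;
    info->sps[1].page_shift = 24;            /* 16M */
    info->sps[1].slb_enc = SKETCH_SLB_L;
}

/* Can the guest use pages of 2^shift bytes, given the backing page size? */
static bool sketch_page_usable(const struct sketch_smmu_info *info,
                               uint32_t shift, uint32_t backing_shift)
{
    if (shift == 0) {
        return false;                        /* 0 only marks unused slots */
    }
    for (int i = 0; i < SKETCH_MAX_SIZES; i++) {
        if (info->sps[i].page_shift == shift) {
            /* Listed size: allowed unless restricted by the backing store */
            return !info->sizes_real || shift <= backing_shift;
        }
    }
    return false;                            /* size not in the table */
}
```

With `sizes_real` clear (the PR-like case) 16M pages are usable even on 4k backing memory; with it set (the HV-like case) they are not.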
#include <sys/vfs.h>

#include <linux/kvm.h>

#include "qemu-common.h"
#include "qemu/error-report.h"
#include "cpu.h"
#include "qemu/timer.h"
#include "sysemu/sysemu.h"
#include "sysemu/hw_accel.h"
ppc: Disable huge page support if it is not available for main RAM
On powerpc, we must only signal huge page support to the guest if
all memory areas are capable of supporting huge pages. The commit
2d103aae8765 ("fix hugepage support when using memory-backend-file")
already fixed the case when the user specified the mem-path property
for NUMA memory nodes instead of using the global "-mem-path" option.
However, there is one more case where it currently can go wrong.
When specifying additional memory DIMMs without using NUMA, e.g.
qemu-system-ppc64 -enable-kvm ... -m 1G,slots=2,maxmem=2G \
-device pc-dimm,id=dimm-mem1,memdev=mem1 -object \
memory-backend-file,policy=default,mem-path=/...,size=1G,id=mem1
the code in getrampagesize() currently assumes that huge pages
are possible since they are enabled for the mem1 object. But
since the main RAM is not backed by a huge page filesystem,
the guest Linux kernel then crashes very quickly after being
started. So in case the we've got "normal" memory without NUMA
and without the global "-mem-path" option, we must not announce
huge pages to the guest. Since this is likely a mis-configuration
by the user, also spill out a message in this case.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-22 16:50:05 +08:00
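The rule stated in the commit message above, that huge pages may only be announced when every memory area (including main RAM) supports them, amounts to taking the smallest backing page size. A minimal sketch under that reading follows; `sketch_guest_max_page_size` and its parameters are hypothetical and are not QEMU's `getrampagesize()`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: the largest page size that is safe to announce to
 * the guest is the smallest page size among main RAM and all additional
 * memory backends (for example hugepage-backed DIMMs).
 */
static long sketch_guest_max_page_size(long main_ram_pagesize,
                                       const long *backend_pagesizes,
                                       size_t n_backends)
{
    long best = main_ram_pagesize;

    assert(main_ram_pagesize > 0);
    for (size_t i = 0; i < n_backends; i++) {
        if (backend_pagesizes[i] < best) {
            best = backend_pagesizes[i];
        }
    }
    return best;
}
```

In the failing configuration from the commit message, main RAM uses normal pages and only the DIMM is hugepage-backed, so the result stays at the normal page size.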
#include "sysemu/numa.h"
#include "kvm_ppc.h"
#include "sysemu/cpus.h"
#include "sysemu/device_tree.h"
#include "mmu-hash64.h"

#include "hw/sysbus.h"
#include "hw/ppc/spapr.h"
#include "hw/ppc/spapr_vio.h"
#include "hw/ppc/spapr_cpu_core.h"
#include "hw/ppc/ppc.h"
#include "sysemu/watchdog.h"
#include "trace.h"
#include "exec/gdbstub.h"
#include "exec/memattrs.h"
#include "sysemu/hostmem.h"
#include "qemu/cutils.h"
#if defined(TARGET_PPC64)
#include "hw/ppc/spapr_cpu_core.h"
#endif

//#define DEBUG_KVM

#ifdef DEBUG_KVM
#define DPRINTF(fmt, ...) \
    do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
#else
#define DPRINTF(fmt, ...) \
    do { } while (0)
#endif

#define PROC_DEVTREE_CPU      "/proc/device-tree/cpus/"

const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
    KVM_CAP_LAST_INFO
};

static int cap_interrupt_unset = false;
static int cap_interrupt_level = false;
static int cap_segstate;
static int cap_booke_sregs;
static int cap_ppc_smt;
static int cap_ppc_rma;
static int cap_spapr_tce;
static int cap_spapr_multitce;
static int cap_spapr_vfio;
static int cap_hior;
static int cap_one_reg;
static int cap_epr;
static int cap_ppc_watchdog;
static int cap_papr;
static int cap_htab_fd;
static int cap_fixup_hcalls;
static int cap_htm;             /* Hardware transactional memory support */

static uint32_t debug_inst_opcode;

/* XXX We have a race condition where we actually have a level triggered
 *     interrupt, but the infrastructure can't expose that yet, so the guest
 *     takes but ignores it, goes to sleep and never gets notified that there's
 *     still an interrupt pending.
 *
 *     As a quick workaround, let's just wake up again 20 ms after we injected
 *     an interrupt. That way we can ensure that we're always reinjecting
 *     interrupts in case the guest swallowed them.
 */
static QEMUTimer *idle_timer;

static void kvm_kick_cpu(void *opaque)
{
    PowerPCCPU *cpu = opaque;

    qemu_cpu_kick(CPU(cpu));
}

/* Check whether we are running with KVM-PR (instead of KVM-HV).  This
 * should only be used for fallback tests - generally we should use
 * explicit capabilities for the features we want, rather than
 * assuming what is/isn't available depending on the KVM variant. */
static bool kvmppc_is_pr(KVMState *ks)
{
    /* Assume KVM-PR if the GET_PVINFO capability is available */
    return kvm_check_extension(ks, KVM_CAP_PPC_GET_PVINFO) != 0;
}

static int kvm_ppc_register_host_cpu_type(void);

int kvm_arch_init(MachineState *ms, KVMState *s)
{
    cap_interrupt_unset = kvm_check_extension(s, KVM_CAP_PPC_UNSET_IRQ);
    cap_interrupt_level = kvm_check_extension(s, KVM_CAP_PPC_IRQ_LEVEL);
    cap_segstate = kvm_check_extension(s, KVM_CAP_PPC_SEGSTATE);
    cap_booke_sregs = kvm_check_extension(s, KVM_CAP_PPC_BOOKE_SREGS);
    cap_ppc_smt = kvm_check_extension(s, KVM_CAP_PPC_SMT);
    cap_ppc_rma = kvm_check_extension(s, KVM_CAP_PPC_RMA);
    cap_spapr_tce = kvm_check_extension(s, KVM_CAP_SPAPR_TCE);
    cap_spapr_multitce = kvm_check_extension(s, KVM_CAP_SPAPR_MULTITCE);
    cap_spapr_vfio = false;
    cap_one_reg = kvm_check_extension(s, KVM_CAP_ONE_REG);
    cap_hior = kvm_check_extension(s, KVM_CAP_PPC_HIOR);
    cap_epr = kvm_check_extension(s, KVM_CAP_PPC_EPR);
    cap_ppc_watchdog = kvm_check_extension(s, KVM_CAP_PPC_BOOKE_WATCHDOG);
    /* Note: we don't set cap_papr here, because this capability is
     * only activated after this by kvmppc_set_papr() */
    cap_htab_fd = kvm_check_extension(s, KVM_CAP_PPC_HTAB_FD);
    cap_fixup_hcalls = kvm_check_extension(s, KVM_CAP_PPC_FIXUP_HCALL);
    cap_htm = kvm_vm_check_extension(s, KVM_CAP_PPC_HTM);

    if (!cap_interrupt_level) {
        fprintf(stderr, "KVM: Couldn't find level irq capability. Expect the "
                        "VM to stall at times!\n");
    }

    kvm_ppc_register_host_cpu_type();

    return 0;
}

int kvm_arch_irqchip_create(MachineState *ms, KVMState *s)
{
    return 0;
}

static int kvm_arch_sync_sregs(PowerPCCPU *cpu)
{
    CPUPPCState *cenv = &cpu->env;
    CPUState *cs = CPU(cpu);
    struct kvm_sregs sregs;
    int ret;

    if (cenv->excp_model == POWERPC_EXCP_BOOKE) {
        /* What we're really trying to say is "if we're on BookE, we use
           the native PVR for now". This is the only sane way to check
           it though, so we potentially confuse users that they can run
           BookE guests on BookS. Let's hope nobody dares enough :) */
        return 0;
    } else {
        if (!cap_segstate) {
            fprintf(stderr, "kvm error: missing PVR setting capability\n");
            return -ENOSYS;
        }
    }

    ret = kvm_vcpu_ioctl(cs, KVM_GET_SREGS, &sregs);
    if (ret) {
        return ret;
    }

    sregs.pvr = cenv->spr[SPR_PVR];
    return kvm_vcpu_ioctl(cs, KVM_SET_SREGS, &sregs);
}

/* Set up a shared TLB array with KVM */
static int kvm_booke206_tlb_init(PowerPCCPU *cpu)
{
    CPUPPCState *env = &cpu->env;
    CPUState *cs = CPU(cpu);
    struct kvm_book3e_206_tlb_params params = {};
    struct kvm_config_tlb cfg = {};
    unsigned int entries = 0;
    int ret, i;

    if (!kvm_enabled() ||
        !kvm_check_extension(cs->kvm_state, KVM_CAP_SW_TLB)) {
        return 0;
    }

    assert(ARRAY_SIZE(params.tlb_sizes) == BOOKE206_MAX_TLBN);

    for (i = 0; i < BOOKE206_MAX_TLBN; i++) {
        params.tlb_sizes[i] = booke206_tlb_size(env, i);
        params.tlb_ways[i] = booke206_tlb_ways(env, i);
        entries += params.tlb_sizes[i];
    }

    assert(entries == env->nb_tlb);
    assert(sizeof(struct kvm_book3e_206_tlb_entry) == sizeof(ppcmas_tlb_t));

    env->tlb_dirty = true;

    cfg.array = (uintptr_t)env->tlb.tlbm;
    cfg.array_len = sizeof(ppcmas_tlb_t) * entries;
    cfg.params = (uintptr_t)&params;
    cfg.mmu_type = KVM_MMU_FSL_BOOKE_NOHV;

    ret = kvm_vcpu_enable_cap(cs, KVM_CAP_SW_TLB, 0, (uintptr_t)&cfg);
    if (ret < 0) {
        fprintf(stderr, "%s: couldn't enable KVM_CAP_SW_TLB: %s\n",
                __func__, strerror(-ret));
        return ret;
    }

    env->kvm_sw_tlb = true;
    return 0;
}

#if defined(TARGET_PPC64)
static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
                                       struct kvm_ppc_smmu_info *info)
{
    CPUPPCState *env = &cpu->env;
    CPUState *cs = CPU(cpu);

    memset(info, 0, sizeof(*info));

    /* We don't have the new KVM_PPC_GET_SMMU_INFO ioctl, so
     * need to "guess" what the supported page sizes are.
     *
     * For that to work we make a few assumptions:
     *
     * - Check whether we are running "PR" KVM which only supports 4K
     *   and 16M pages, but supports them regardless of the backing
     *   store characteristics. We also don't support 1T segments.
     *
     *   This is safe as if HV KVM ever supports that capability or PR
     *   KVM grows support for more page/segment sizes, those versions
     *   will have implemented KVM_CAP_PPC_GET_SMMU_INFO and thus we
     *   will not hit this fallback
     *
     * - Else we are running HV KVM. This means we only support page
     *   sizes that fit in the backing store. Additionally we only
     *   advertise 64K pages if the processor is ARCH 2.06 and we assume
     *   P7 encodings for the SLB and hash table. Here too, we assume
     *   support for any newer processor will mean a kernel that
     *   implements KVM_CAP_PPC_GET_SMMU_INFO and thus doesn't hit
     *   this fallback.
     */
    if (kvmppc_is_pr(cs->kvm_state)) {
        /* No flags */
        info->flags = 0;
        info->slb_size = 64;

        /* Standard 4k base page size segment */
        info->sps[0].page_shift = 12;
        info->sps[0].slb_enc = 0;
        info->sps[0].enc[0].page_shift = 12;
        info->sps[0].enc[0].pte_enc = 0;

        /* Standard 16M large page size segment */
        info->sps[1].page_shift = 24;
        info->sps[1].slb_enc = SLB_VSID_L;
        info->sps[1].enc[0].page_shift = 24;
        info->sps[1].enc[0].pte_enc = 0;
    } else {
        int i = 0;

        /* HV KVM has backing store size restrictions */
        info->flags = KVM_PPC_PAGE_SIZES_REAL;

        if (env->mmu_model & POWERPC_MMU_1TSEG) {
            info->flags |= KVM_PPC_1T_SEGMENTS;
        }

        if (env->mmu_model == POWERPC_MMU_2_06 ||
            env->mmu_model == POWERPC_MMU_2_07) {
            info->slb_size = 32;
        } else {
            info->slb_size = 64;
        }

        /* Standard 4k base page size segment */
        info->sps[i].page_shift = 12;
        info->sps[i].slb_enc = 0;
        info->sps[i].enc[0].page_shift = 12;
        info->sps[i].enc[0].pte_enc = 0;
        i++;

        /* 64K on MMU 2.06 and later */
        if (env->mmu_model == POWERPC_MMU_2_06 ||
            env->mmu_model == POWERPC_MMU_2_07) {
            info->sps[i].page_shift = 16;
            info->sps[i].slb_enc = 0x110;
            info->sps[i].enc[0].page_shift = 16;
            info->sps[i].enc[0].pte_enc = 1;
            i++;
        }

        /* Standard 16M large page size segment */
        info->sps[i].page_shift = 24;
        info->sps[i].slb_enc = SLB_VSID_L;
        info->sps[i].enc[0].page_shift = 24;
        info->sps[i].enc[0].pte_enc = 0;
    }
}

static void kvm_get_smmu_info(PowerPCCPU *cpu, struct kvm_ppc_smmu_info *info)
{
    CPUState *cs = CPU(cpu);
2012-06-19 03:56:25 +08:00
    int ret;

2012-12-01 12:35:08 +08:00
    if (kvm_check_extension(cs->kvm_state, KVM_CAP_PPC_GET_SMMU_INFO)) {
        ret = kvm_vm_ioctl(cs->kvm_state, KVM_PPC_GET_SMMU_INFO, info);
2012-06-19 03:56:25 +08:00
        if (ret == 0) {
            return;
        }
    }

2012-12-01 12:35:08 +08:00
    kvm_get_fallback_smmu_info(cpu, info);
2012-06-19 03:56:25 +08:00
}

2015-07-03 04:46:14 +08:00
static long gethugepagesize(const char *mem_path)
2012-06-19 03:56:25 +08:00
{
    struct statfs fs;
    int ret;

    do {
        ret = statfs(mem_path, &fs);
    } while (ret != 0 && errno == EINTR);

    if (ret != 0) {
        fprintf(stderr, "Couldn't statfs() memory path: %s\n",
                strerror(errno));
        exit(1);
    }

#define HUGETLBFS_MAGIC 0x958458f6

    if (fs.f_type != HUGETLBFS_MAGIC) {
        /* Explicit mempath, but it's ordinary pages */
        return getpagesize();
    }

    /* It's hugepage, return the huge page size */
    return fs.f_bsize;
}

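The check performed by gethugepagesize() can be tried outside QEMU with a small standalone sketch. This is an illustration under stated assumptions, not QEMU code: backing_page_size() is a hypothetical name, it returns -1 instead of calling exit() when statfs() fails, it uses sysconf(_SC_PAGESIZE) where the original calls getpagesize(), and it assumes a Linux host.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/vfs.h>   /* statfs(), Linux-specific */
#include <unistd.h>    /* sysconf() */

#define HUGETLBFS_MAGIC 0x958458f6

/*
 * Hypothetical helper (not part of QEMU): report the page size backing
 * `path`, or -1 if statfs() fails.
 */
static long backing_page_size(const char *path)
{
    struct statfs fs;
    int ret;

    assert(path != NULL);

    /* Retry if a signal interrupts the call. */
    do {
        ret = statfs(path, &fs);
    } while (ret != 0 && errno == EINTR);

    if (ret != 0) {
        return -1;
    }

    if (fs.f_type != HUGETLBFS_MAGIC) {
        /* Ordinary filesystem: fall back to the base page size. */
        return sysconf(_SC_PAGESIZE);
    }

    /* hugetlbfs reports its huge page size as the block size. */
    return fs.f_bsize;
}
```

Calling backing_page_size() on a hugetlbfs mount point should yield the huge page size of that mount, while any other path yields the base page size.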
2016-03-16 02:34:16 +08:00
/*
 * FIXME TOCTTOU: this iterates over memory backends' mem-path, which
 * may or may not name the same files / on the same filesystem now as
 * when we actually open and map them.  Iterate over the file
 * descriptors instead, and use qemu_fd_getpagesize().
 */
2015-07-03 04:46:14 +08:00
static int find_max_supported_pagesize(Object *obj, void *opaque)
{
    char *mem_path;
    long *hpsize_min = opaque;

    if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
        mem_path = object_property_get_str(obj, "mem-path", NULL);
        if (mem_path) {
            long hpsize = gethugepagesize(mem_path);
            if (hpsize < *hpsize_min) {
                *hpsize_min = hpsize;
            }
        } else {
            *hpsize_min = getpagesize();
        }
    }

    return 0;
}

static long getrampagesize(void)
{
    long hpsize = LONG_MAX;
ppc: Huge page detection mechanism fixes - Episode III
After already fixing two issues with the huge page detection mechanism
(see commit 159d2e39a860 and 86b50f2e1bef), Greg Kurz noticed another
case that caused the guest to crash where QEMU announces huge pages
though they should not be available for the guest:
qemu-system-ppc64 -enable-kvm ... -mem-path /dev/hugepages \
-m 1G,slots=4,maxmem=32G \
-object memory-backend-ram,policy=default,size=1G,id=mem-mem1 \
-device pc-dimm,id=dimm-mem1,memdev=mem-mem1 -smp 2 \
-numa node,nodeid=0 -numa node,nodeid=1
That means if there is a global mem-path option, we still have
to look at the memory-backend objects that have been specified
additionally and return their minimum page size if that value
is smaller than the page size of the main memory.
Reported-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-18 21:19:04 +08:00
    long mainrampagesize;
2015-07-03 04:46:14 +08:00
    Object *memdev_root;

    if (mem_path) {
2016-07-18 21:19:04 +08:00
        mainrampagesize = gethugepagesize(mem_path);
    } else {
        mainrampagesize = getpagesize();
2015-07-03 04:46:14 +08:00
    }

    /* it's possible we have memory-backend objects with
     * hugepage-backed RAM. these may get mapped into system
     * address space via -numa parameters or memory hotplug
     * hooks. we want to take these into account, but we
     * also want to make sure these supported hugepage
     * sizes are applicable across the entire range of memory
     * we may boot from, so we take the min across all
     * backends, and assume normal pages in cases where a
     * backend isn't backed by hugepages.
     */
    memdev_root = object_resolve_path("/objects", NULL);
2016-07-18 21:19:04 +08:00
    if (memdev_root) {
        object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
2015-07-03 04:46:14 +08:00
    }
2016-07-18 21:19:04 +08:00
    if (hpsize == LONG_MAX) {
        /* No additional memory regions found ==> Report main RAM page size */
        return mainrampagesize;
ppc: Disable huge page support if it is not available for main RAM
On powerpc, we must only signal huge page support to the guest if
all memory areas are capable of supporting huge pages. The commit
2d103aae8765 ("fix hugepage support when using memory-backend-file")
already fixed the case when the user specified the mem-path property
for NUMA memory nodes instead of using the global "-mem-path" option.
However, there is one more case where it currently can go wrong.
When specifying additional memory DIMMs without using NUMA, e.g.
qemu-system-ppc64 -enable-kvm ... -m 1G,slots=2,maxmem=2G \
-device pc-dimm,id=dimm-mem1,memdev=mem1 -object \
memory-backend-file,policy=default,mem-path=/...,size=1G,id=mem1
the code in getrampagesize() currently assumes that huge pages
are possible since they are enabled for the mem1 object. But
since the main RAM is not backed by a huge page filesystem,
the guest Linux kernel then crashes very quickly after being
started. So in case we've got "normal" memory without NUMA
and without the global "-mem-path" option, we must not announce
huge pages to the guest. Since this is likely a mis-configuration
by the user, also spill out a message in this case.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-22 16:50:05 +08:00
    }

ppc: Yet another fix for the huge page support detection mechanism
Commit 86b50f2e1bef ("Disable huge page support if it is not available
for main RAM") already made sure that huge page support is not announced
to the guest if the normal RAM of non-NUMA configurations is not backed
by a huge page filesystem. However, there is one more case that can go
wrong: NUMA is enabled, but the RAM of the NUMA nodes is not configured
with huge page support (and only the memory of a DIMM is configured with
it). When QEMU is started with the following command line for example,
the Linux guest currently crashes because it is trying to use huge pages
on a memory region that does not support huge pages:
qemu-system-ppc64 -enable-kvm ... -m 1G,slots=4,maxmem=32G -object \
memory-backend-file,policy=default,mem-path=/hugepages,size=1G,id=mem-mem1 \
-device pc-dimm,id=dimm-mem1,memdev=mem-mem1 -smp 2 \
-numa node,nodeid=0 -numa node,nodeid=1
To fix this issue, we've got to make sure to disable huge page support,
too, when there is a NUMA node that is not using a memory backend with
huge page support.
Fixes: 86b50f2e1befc33407bdfeb6f45f7b0d2439a740
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-15 16:10:25 +08:00
    /* If NUMA is disabled or the NUMA nodes are not backed with a
2016-07-18 21:19:04 +08:00
     * memory-backend, then there is at least one node using "normal" RAM,
     * so if its page size is smaller we have got to report that size instead.
2016-07-15 16:10:25 +08:00
     */
2016-07-18 21:19:04 +08:00
    if (hpsize > mainrampagesize &&
        (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL)) {
2016-06-22 16:50:05 +08:00
        static bool warned;
        if (!warned) {
            error_report("Huge page support disabled (n/a for main memory).");
            warned = true;
        }
2016-07-18 21:19:04 +08:00
        return mainrampagesize;
2016-06-22 16:50:05 +08:00
    }

    return hpsize;
2015-07-03 04:46:14 +08:00
}

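The three huge page commit messages above describe one decision: take the minimum page size across all memory backends, then fall back to the main RAM page size when that is smaller and main RAM is not itself a backend. The sketch below models that decision as a pure function so the described configurations can be checked by hand. It is a simplified illustration, not QEMU code: effective_ram_page_size() and its parameters are hypothetical, and the caller is assumed to pass the base page size for any backend without a mem-path.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, simplified model of the decision in getrampagesize().
 *   main_size         page size of main RAM
 *   backend_sizes[]   page size of each memory backend (base page size
 *                     for backends without a mem-path)
 *   node0_has_memdev  true when NUMA is enabled and node 0 uses a backend
 */
static long effective_ram_page_size(long main_size,
                                    const long *backend_sizes, size_t n,
                                    bool node0_has_memdev)
{
    long hpsize = LONG_MAX;
    size_t i;

    assert(n == 0 || backend_sizes != NULL);

    /* Minimum across all backends. */
    for (i = 0; i < n; i++) {
        if (backend_sizes[i] < hpsize) {
            hpsize = backend_sizes[i];
        }
    }

    /* No backends at all: main RAM decides. */
    if (hpsize == LONG_MAX) {
        return main_size;
    }

    /* Main RAM is "normal" memory with a smaller page size. */
    if (hpsize > main_size && !node0_has_memdev) {
        return main_size;
    }

    return hpsize;
}
```

With a 4 KiB main RAM and a single 16 MiB hugepage DIMM backend the function returns 4 KiB, and with hugepage main RAM plus a memory-backend-ram DIMM it also returns 4 KiB, matching the two crash scenarios in the commit messages.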
2012-06-19 03:56:25 +08:00
static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
{
    if (!(flags & KVM_PPC_PAGE_SIZES_REAL)) {
        return true;
    }

    return (1ul << shift) <= rampgsize;
}

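To make the filter in kvm_valid_page_size() concrete, here is a standalone sketch with a few worked cases. It is an illustration only: page_size_usable() is a hypothetical name, and SKETCH_PAGE_SIZES_REAL is a local stand-in for KVM_PPC_PAGE_SIZES_REAL, whose real definition comes from the kernel's KVM headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for KVM_PPC_PAGE_SIZES_REAL (assumption for this sketch). */
#define SKETCH_PAGE_SIZES_REAL 0x1u

/*
 * A guest page size of (1 << shift) bytes is usable if the flag is
 * clear, or if it does not exceed the page size backing guest RAM.
 */
static bool page_size_usable(uint32_t flags, long ram_page_size,
                             uint32_t shift)
{
    assert(ram_page_size > 0 && shift < 63);

    if (!(flags & SKETCH_PAGE_SIZES_REAL)) {
        return true;
    }

    return (UINT64_C(1) << shift) <= (uint64_t)ram_page_size;
}
```

For example, with the flag set and RAM backed by 64 KiB pages, a 64 KiB guest page (shift 16) passes and a 16 MiB guest page (shift 24) is rejected; with the flag clear, every size passes.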
2012-12-01 12:35:08 +08:00
static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
2012-06-19 03:56:25 +08:00
{
    static struct kvm_ppc_smmu_info smmu_info;
    static bool has_smmu_info;
2012-12-01 12:35:08 +08:00
    CPUPPCState *env = &cpu->env;
2012-06-19 03:56:25 +08:00
    long rampagesize;
    int iq, ik, jq, jk;
2016-09-21 17:42:15 +08:00
    bool has_64k_pages = false;
2012-06-19 03:56:25 +08:00

    /* We only handle page sizes for 64-bit server guests for now */
    if (!(env->mmu_model & POWERPC_MMU_64)) {
        return;
    }

    /* Collect MMU info from kernel if not already */
    if (!has_smmu_info) {
2012-12-01 12:35:08 +08:00
        kvm_get_smmu_info(cpu, &smmu_info);
2012-06-19 03:56:25 +08:00
        has_smmu_info = true;
    }

    rampagesize = getrampagesize();

    /* Convert to QEMU form */
    memset(&env->sps, 0, sizeof(env->sps));

2015-10-22 15:30:59 +08:00
|
|
|
/* If we have HV KVM, we need to forbid CI large pages if our
|
|
|
|
* host page size is smaller than 64K.
|
|
|
|
*/
|
|
|
|
if (smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL) {
|
|
|
|
env->ci_large_pages = getpagesize() >= 0x10000;
|
|
|
|
}
|
|
|
|
|
2014-05-12 00:37:00 +08:00
|
|
|
/*
|
|
|
|
* XXX This loop should be an entry wide AND of the capabilities that
|
|
|
|
* the selected CPU has with the capabilities that KVM supports.
|
|
|
|
*/
|
2012-06-19 03:56:25 +08:00
|
|
|
for (ik = iq = 0; ik < KVM_PPC_PAGE_SIZES_MAX_SZ; ik++) {
|
|
|
|
struct ppc_one_seg_page_size *qsps = &env->sps.sps[iq];
|
|
|
|
struct kvm_ppc_one_seg_page_size *ksps = &smmu_info.sps[ik];
|
|
|
|
|
|
|
|
if (!kvm_valid_page_size(smmu_info.flags, rampagesize,
|
|
|
|
ksps->page_shift)) {
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
qsps->page_shift = ksps->page_shift;
|
|
|
|
qsps->slb_enc = ksps->slb_enc;
|
|
|
|
for (jk = jq = 0; jk < KVM_PPC_PAGE_SIZES_MAX_SZ; jk++) {
|
|
|
|
if (!kvm_valid_page_size(smmu_info.flags, rampagesize,
|
|
|
|
ksps->enc[jk].page_shift)) {
|
|
|
|
continue;
|
|
|
|
}
|
2016-09-21 17:42:15 +08:00
|
|
|
if (ksps->enc[jk].page_shift == 16) {
|
|
|
|
has_64k_pages = true;
|
|
|
|
}
|
2012-06-19 03:56:25 +08:00
|
|
|
qsps->enc[jq].page_shift = ksps->enc[jk].page_shift;
|
|
|
|
qsps->enc[jq].pte_enc = ksps->enc[jk].pte_enc;
|
|
|
|
if (++jq >= PPC_PAGE_SIZES_MAX_SZ) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if (++iq >= PPC_PAGE_SIZES_MAX_SZ) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
env->slb_nr = smmu_info.slb_size;
|
2014-05-12 00:37:00 +08:00
|
|
|
if (!(smmu_info.flags & KVM_PPC_1T_SEGMENTS)) {
|
2012-06-19 03:56:25 +08:00
|
|
|
env->mmu_model &= ~POWERPC_MMU_1TSEG;
|
|
|
|
}
|
2016-09-21 17:42:15 +08:00
|
|
|
if (!has_64k_pages) {
|
|
|
|
env->mmu_model &= ~POWERPC_MMU_64K;
|
|
|
|
}
|
2012-06-19 03:56:25 +08:00
|
|
|
}
|
|
|
|
#else /* defined (TARGET_PPC64) */
|
|
|
|
|
2012-12-01 12:35:08 +08:00
|
|
|
static inline void kvm_fixup_page_sizes(PowerPCCPU *cpu)
|
2012-06-19 03:56:25 +08:00
|
|
|
{
|
|
|
|
}
|
|
|
|
|
|
|
|
#endif /* !defined (TARGET_PPC64) */
|
|
|
|
|
2013-01-23 04:25:01 +08:00
|
|
|
unsigned long kvm_arch_vcpu_id(CPUState *cpu)
|
|
|
|
{
|
2014-02-01 22:45:52 +08:00
|
|
|
return ppc_get_vcpu_dt_id(POWERPC_CPU(cpu));
|
2013-01-23 04:25:01 +08:00
|
|
|
}
|
|
|
|
|
2014-07-14 17:15:37 +08:00
|
|
|
/* e500 supports 2 h/w breakpoints and 2 watchpoints.
|
|
|
|
* book3s supports only 1 watchpoint, so array size
|
|
|
|
* of 4 is sufficient for now.
|
|
|
|
*/
|
|
|
|
#define MAX_HW_BKPTS 4
|
|
|
|
|
|
|
|
static struct HWBreakpoint {
|
|
|
|
target_ulong addr;
|
|
|
|
int type;
|
|
|
|
} hw_debug_points[MAX_HW_BKPTS];
|
|
|
|
|
|
|
|
static CPUWatchpoint hw_watchpoint;
|
|
|
|
|
|
|
|
/* By default, no breakpoints or watchpoints are supported */
|
|
|
|
static int max_hw_breakpoint;
|
|
|
|
static int max_hw_watchpoint;
|
|
|
|
static int nb_hw_breakpoint;
|
|
|
|
static int nb_hw_watchpoint;
|
|
|
|
|
|
|
|
static void kvmppc_hw_debug_points_init(CPUPPCState *cenv)
|
|
|
|
{
|
|
|
|
if (cenv->excp_model == POWERPC_EXCP_BOOKE) {
|
|
|
|
max_hw_breakpoint = 2;
|
|
|
|
max_hw_watchpoint = 2;
|
|
|
|
}
|
|
|
|
|
|
|
|
if ((max_hw_breakpoint + max_hw_watchpoint) > MAX_HW_BKPTS) {
|
|
|
|
fprintf(stderr, "Error initializing h/w breakpoints\n");
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2012-10-31 13:57:49 +08:00
|
|
|
int kvm_arch_init_vcpu(CPUState *cs)
|
2011-04-12 07:34:34 +08:00
|
|
|
{
|
2012-10-31 13:57:49 +08:00
|
|
|
PowerPCCPU *cpu = POWERPC_CPU(cs);
|
|
|
|
CPUPPCState *cenv = &cpu->env;
|
2011-04-12 07:34:34 +08:00
|
|
|
int ret;
|
|
|
|
|
2012-06-19 03:56:25 +08:00
|
|
|
/* Gather server mmu info from KVM and update the CPU state */
|
2012-12-01 12:35:08 +08:00
|
|
|
kvm_fixup_page_sizes(cpu);
|
2012-06-19 03:56:25 +08:00
|
|
|
|
|
|
|
/* Synchronize sregs with kvm */
|
2012-10-31 13:06:49 +08:00
|
|
|
ret = kvm_arch_sync_sregs(cpu);
|
2011-04-12 07:34:34 +08:00
|
|
|
if (ret) {
|
2016-02-19 05:01:56 +08:00
|
|
|
if (ret == -EINVAL) {
|
|
|
|
error_report("Register sync failed... If you're using kvm-hv.ko,"
|
|
|
|
" only \"-cpu host\" is possible");
|
|
|
|
}
|
2011-04-12 07:34:34 +08:00
|
|
|
return ret;
|
|
|
|
}
|
2009-07-17 19:51:43 +08:00
|
|
|
|
2013-08-21 23:03:08 +08:00
|
|
|
idle_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, kvm_kick_cpu, cpu);
|
2010-04-19 05:10:17 +08:00
|
|
|
|
2011-08-31 19:26:56 +08:00
|
|
|
switch (cenv->mmu_model) {
|
|
|
|
case POWERPC_MMU_BOOKE206:
|
2016-09-29 18:48:07 +08:00
|
|
|
/* This target supports access to KVM's guest TLB */
|
2012-10-31 13:06:49 +08:00
|
|
|
ret = kvm_booke206_tlb_init(cpu);
|
2011-08-31 19:26:56 +08:00
|
|
|
break;
|
2016-09-29 18:48:07 +08:00
|
|
|
case POWERPC_MMU_2_07:
|
|
|
|
if (!cap_htm && !kvmppc_is_pr(cs->kvm_state)) {
|
|
|
|
/* KVM-HV has transactional memory on POWER8 also without the
|
|
|
|
* KVM_CAP_PPC_HTM extension, so enable it here instead. */
|
|
|
|
cap_htm = true;
|
|
|
|
}
|
|
|
|
break;
|
2011-08-31 19:26:56 +08:00
|
|
|
default:
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
2014-07-14 17:15:35 +08:00
|
|
|
kvm_get_one_reg(cs, KVM_REG_PPC_DEBUG_INST, &debug_inst_opcode);
|
2014-07-14 17:15:37 +08:00
|
|
|
kvmppc_hw_debug_points_init(cenv);
|
2014-07-14 17:15:35 +08:00
|
|
|
|
2009-07-17 19:51:43 +08:00
|
|
|
return ret;
|
2008-12-16 18:43:58 +08:00
|
|
|
}
|
|
|
|
|
2012-10-31 13:06:49 +08:00
|
|
|
static void kvm_sw_tlb_put(PowerPCCPU *cpu)
|
2011-08-31 19:26:56 +08:00
|
|
|
{
|
2012-10-31 13:06:49 +08:00
|
|
|
CPUPPCState *env = &cpu->env;
|
|
|
|
CPUState *cs = CPU(cpu);
|
2011-08-31 19:26:56 +08:00
|
|
|
struct kvm_dirty_tlb dirty_tlb;
|
|
|
|
unsigned char *bitmap;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
if (!env->kvm_sw_tlb) {
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
bitmap = g_malloc((env->nb_tlb + 7) / 8);
|
|
|
|
memset(bitmap, 0xFF, (env->nb_tlb + 7) / 8);
|
|
|
|
|
|
|
|
dirty_tlb.bitmap = (uintptr_t)bitmap;
|
|
|
|
dirty_tlb.num_dirty = env->nb_tlb;
|
|
|
|
|
2012-10-31 13:06:49 +08:00
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_DIRTY_TLB, &dirty_tlb);
|
2011-08-31 19:26:56 +08:00
|
|
|
if (ret) {
|
|
|
|
fprintf(stderr, "%s: KVM_DIRTY_TLB: %s\n",
|
|
|
|
__func__, strerror(-ret));
|
|
|
|
}
|
|
|
|
|
|
|
|
g_free(bitmap);
|
|
|
|
}
|
|
|
|
|
2013-02-21 00:41:50 +08:00
|
|
|
static void kvm_get_one_spr(CPUState *cs, uint64_t id, int spr)
|
|
|
|
{
|
|
|
|
PowerPCCPU *cpu = POWERPC_CPU(cs);
|
|
|
|
CPUPPCState *env = &cpu->env;
|
|
|
|
union {
|
|
|
|
uint32_t u32;
|
|
|
|
uint64_t u64;
|
|
|
|
} val;
|
|
|
|
struct kvm_one_reg reg = {
|
|
|
|
.id = id,
|
|
|
|
.addr = (uintptr_t) &val,
|
|
|
|
};
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
|
|
|
|
if (ret != 0) {
|
2014-02-04 12:12:34 +08:00
|
|
|
trace_kvm_failed_spr_get(spr, strerror(errno));
|
2013-02-21 00:41:50 +08:00
|
|
|
} else {
|
|
|
|
switch (id & KVM_REG_SIZE_MASK) {
|
|
|
|
case KVM_REG_SIZE_U32:
|
|
|
|
env->spr[spr] = val.u32;
|
|
|
|
break;
|
|
|
|
|
|
|
|
case KVM_REG_SIZE_U64:
|
|
|
|
env->spr[spr] = val.u64;
|
|
|
|
break;
|
|
|
|
|
|
|
|
default:
|
|
|
|
/* Don't handle this size yet */
|
|
|
|
abort();
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static void kvm_put_one_spr(CPUState *cs, uint64_t id, int spr)
|
|
|
|
{
|
|
|
|
PowerPCCPU *cpu = POWERPC_CPU(cs);
|
|
|
|
CPUPPCState *env = &cpu->env;
|
|
|
|
union {
|
|
|
|
uint32_t u32;
|
|
|
|
uint64_t u64;
|
|
|
|
} val;
|
|
|
|
struct kvm_one_reg reg = {
|
|
|
|
.id = id,
|
|
|
|
.addr = (uintptr_t) &val,
|
|
|
|
};
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
switch (id & KVM_REG_SIZE_MASK) {
|
|
|
|
case KVM_REG_SIZE_U32:
|
|
|
|
val.u32 = env->spr[spr];
|
|
|
|
break;
|
|
|
|
|
|
|
|
case KVM_REG_SIZE_U64:
|
|
|
|
val.u64 = env->spr[spr];
|
|
|
|
break;
|
|
|
|
|
|
|
|
default:
|
|
|
|
/* Don't handle this size yet */
|
|
|
|
abort();
|
|
|
|
}
|
|
|
|
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
|
|
|
|
if (ret != 0) {
|
2014-02-04 12:12:34 +08:00
|
|
|
trace_kvm_failed_spr_set(spr, strerror(errno));
|
2013-02-21 00:41:50 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-02-21 00:41:51 +08:00
|
|
|
static int kvm_put_fp(CPUState *cs)
|
|
|
|
{
|
|
|
|
PowerPCCPU *cpu = POWERPC_CPU(cs);
|
|
|
|
CPUPPCState *env = &cpu->env;
|
|
|
|
struct kvm_one_reg reg;
|
|
|
|
int i;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
if (env->insns_flags & PPC_FLOAT) {
|
|
|
|
uint64_t fpscr = env->fpscr;
|
|
|
|
bool vsx = !!(env->insns_flags2 & PPC2_VSX);
|
|
|
|
|
|
|
|
reg.id = KVM_REG_PPC_FPSCR;
|
|
|
|
reg.addr = (uintptr_t)&fpscr;
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
|
|
|
|
if (ret < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Unable to set FPSCR to KVM: %s\n", strerror(errno));
|
2013-02-21 00:41:51 +08:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
for (i = 0; i < 32; i++) {
|
|
|
|
uint64_t vsr[2];
|
|
|
|
|
2016-01-15 23:00:12 +08:00
|
|
|
#ifdef HOST_WORDS_BIGENDIAN
|
2013-02-21 00:41:51 +08:00
|
|
|
vsr[0] = float64_val(env->fpr[i]);
|
|
|
|
vsr[1] = env->vsr[i];
|
2016-01-15 23:00:12 +08:00
|
|
|
#else
|
|
|
|
vsr[0] = env->vsr[i];
|
|
|
|
vsr[1] = float64_val(env->fpr[i]);
|
|
|
|
#endif
|
2013-02-21 00:41:51 +08:00
|
|
|
reg.addr = (uintptr_t) &vsr;
|
|
|
|
reg.id = vsx ? KVM_REG_PPC_VSR(i) : KVM_REG_PPC_FPR(i);
|
|
|
|
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
|
|
|
|
if (ret < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Unable to set %s%d to KVM: %s\n", vsx ? "VSR" : "FPR",
|
2013-02-21 00:41:51 +08:00
|
|
|
i, strerror(errno));
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if (env->insns_flags & PPC_ALTIVEC) {
|
|
|
|
reg.id = KVM_REG_PPC_VSCR;
|
|
|
|
reg.addr = (uintptr_t)&env->vscr;
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
|
|
|
|
if (ret < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Unable to set VSCR to KVM: %s\n", strerror(errno));
|
2013-02-21 00:41:51 +08:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
for (i = 0; i < 32; i++) {
|
|
|
|
reg.id = KVM_REG_PPC_VR(i);
|
|
|
|
reg.addr = (uintptr_t)&env->avr[i];
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
|
|
|
|
if (ret < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Unable to set VR%d to KVM: %s\n", i, strerror(errno));
|
2013-02-21 00:41:51 +08:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int kvm_get_fp(CPUState *cs)
|
|
|
|
{
|
|
|
|
PowerPCCPU *cpu = POWERPC_CPU(cs);
|
|
|
|
CPUPPCState *env = &cpu->env;
|
|
|
|
struct kvm_one_reg reg;
|
|
|
|
int i;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
if (env->insns_flags & PPC_FLOAT) {
|
|
|
|
uint64_t fpscr;
|
|
|
|
bool vsx = !!(env->insns_flags2 & PPC2_VSX);
|
|
|
|
|
|
|
|
reg.id = KVM_REG_PPC_FPSCR;
|
|
|
|
reg.addr = (uintptr_t)&fpscr;
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
|
|
|
|
if (ret < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Unable to get FPSCR from KVM: %s\n", strerror(errno));
|
2013-02-21 00:41:51 +08:00
|
|
|
return ret;
|
|
|
|
} else {
|
|
|
|
env->fpscr = fpscr;
|
|
|
|
}
|
|
|
|
|
|
|
|
for (i = 0; i < 32; i++) {
|
|
|
|
uint64_t vsr[2];
|
|
|
|
|
|
|
|
reg.addr = (uintptr_t) &vsr;
|
|
|
|
reg.id = vsx ? KVM_REG_PPC_VSR(i) : KVM_REG_PPC_FPR(i);
|
|
|
|
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
|
|
|
|
if (ret < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Unable to get %s%d from KVM: %s\n",
|
2013-02-21 00:41:51 +08:00
|
|
|
vsx ? "VSR" : "FPR", i, strerror(errno));
|
|
|
|
return ret;
|
|
|
|
} else {
|
2016-01-15 23:00:12 +08:00
|
|
|
#ifdef HOST_WORDS_BIGENDIAN
|
2013-02-21 00:41:51 +08:00
|
|
|
env->fpr[i] = vsr[0];
|
|
|
|
if (vsx) {
|
|
|
|
env->vsr[i] = vsr[1];
|
|
|
|
}
|
2016-01-15 23:00:12 +08:00
|
|
|
#else
|
|
|
|
env->fpr[i] = vsr[1];
|
|
|
|
if (vsx) {
|
|
|
|
env->vsr[i] = vsr[0];
|
|
|
|
}
|
|
|
|
#endif
|
2013-02-21 00:41:51 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if (env->insns_flags & PPC_ALTIVEC) {
|
|
|
|
reg.id = KVM_REG_PPC_VSCR;
|
|
|
|
reg.addr = (uintptr_t)&env->vscr;
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
|
|
|
|
if (ret < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Unable to get VSCR from KVM: %s\n", strerror(errno));
|
2013-02-21 00:41:51 +08:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
for (i = 0; i < 32; i++) {
|
|
|
|
reg.id = KVM_REG_PPC_VR(i);
|
|
|
|
reg.addr = (uintptr_t)&env->avr[i];
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
|
|
|
|
if (ret < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Unable to get VR%d from KVM: %s\n",
|
2013-02-21 00:41:51 +08:00
|
|
|
i, strerror(errno));
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2013-04-08 03:08:22 +08:00
|
|
|
#if defined(TARGET_PPC64)
|
|
|
|
static int kvm_get_vpa(CPUState *cs)
|
|
|
|
{
|
|
|
|
PowerPCCPU *cpu = POWERPC_CPU(cs);
|
|
|
|
CPUPPCState *env = &cpu->env;
|
|
|
|
struct kvm_one_reg reg;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
reg.id = KVM_REG_PPC_VPA_ADDR;
|
|
|
|
reg.addr = (uintptr_t)&env->vpa_addr;
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
|
|
|
|
if (ret < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Unable to get VPA address from KVM: %s\n", strerror(errno));
|
2013-04-08 03:08:22 +08:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
assert((uintptr_t)&env->slb_shadow_size
|
|
|
|
== ((uintptr_t)&env->slb_shadow_addr + 8));
|
|
|
|
reg.id = KVM_REG_PPC_VPA_SLB;
|
|
|
|
reg.addr = (uintptr_t)&env->slb_shadow_addr;
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
|
|
|
|
if (ret < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Unable to get SLB shadow state from KVM: %s\n",
|
2013-04-08 03:08:22 +08:00
|
|
|
strerror(errno));
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
assert((uintptr_t)&env->dtl_size == ((uintptr_t)&env->dtl_addr + 8));
|
|
|
|
reg.id = KVM_REG_PPC_VPA_DTL;
|
|
|
|
reg.addr = (uintptr_t)&env->dtl_addr;
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
|
|
|
|
if (ret < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Unable to get dispatch trace log state from KVM: %s\n",
|
2013-04-08 03:08:22 +08:00
|
|
|
strerror(errno));
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int kvm_put_vpa(CPUState *cs)
|
|
|
|
{
|
|
|
|
PowerPCCPU *cpu = POWERPC_CPU(cs);
|
|
|
|
CPUPPCState *env = &cpu->env;
|
|
|
|
struct kvm_one_reg reg;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* SLB shadow or DTL can't be registered unless a master VPA is
|
|
|
|
* registered. That means when restoring state, if a VPA *is*
|
|
|
|
* registered, we need to set that up first. If not, we need to
|
|
|
|
* deregister the others before deregistering the master VPA */
|
|
|
|
assert(env->vpa_addr || !(env->slb_shadow_addr || env->dtl_addr));
|
|
|
|
|
|
|
|
if (env->vpa_addr) {
|
|
|
|
reg.id = KVM_REG_PPC_VPA_ADDR;
|
|
|
|
reg.addr = (uintptr_t)&env->vpa_addr;
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
|
|
|
|
if (ret < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Unable to set VPA address to KVM: %s\n", strerror(errno));
|
2013-04-08 03:08:22 +08:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
assert((uintptr_t)&env->slb_shadow_size
|
|
|
|
== ((uintptr_t)&env->slb_shadow_addr + 8));
|
|
|
|
reg.id = KVM_REG_PPC_VPA_SLB;
|
|
|
|
reg.addr = (uintptr_t)&env->slb_shadow_addr;
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
|
|
|
|
if (ret < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Unable to set SLB shadow state to KVM: %s\n", strerror(errno));
|
2013-04-08 03:08:22 +08:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
assert((uintptr_t)&env->dtl_size == ((uintptr_t)&env->dtl_addr + 8));
|
|
|
|
reg.id = KVM_REG_PPC_VPA_DTL;
|
|
|
|
reg.addr = (uintptr_t)&env->dtl_addr;
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
|
|
|
|
if (ret < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Unable to set dispatch trace log state to KVM: %s\n",
|
2013-04-08 03:08:22 +08:00
|
|
|
strerror(errno));
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!env->vpa_addr) {
|
|
|
|
reg.id = KVM_REG_PPC_VPA_ADDR;
|
|
|
|
reg.addr = (uintptr_t)&env->vpa_addr;
|
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
|
|
|
|
if (ret < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Unable to set VPA address to KVM: %s\n", strerror(errno));
|
2013-04-08 03:08:22 +08:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
#endif /* TARGET_PPC64 */
|
|
|
|
|
2016-03-08 08:33:46 +08:00
|
|
|
int kvmppc_put_books_sregs(PowerPCCPU *cpu)
|
2016-03-09 08:58:33 +08:00
|
|
|
{
|
|
|
|
CPUPPCState *env = &cpu->env;
|
|
|
|
struct kvm_sregs sregs;
|
|
|
|
int i;
|
|
|
|
|
|
|
|
sregs.pvr = env->spr[SPR_PVR];
|
|
|
|
|
|
|
|
sregs.u.s.sdr1 = env->spr[SPR_SDR1];
|
|
|
|
|
|
|
|
/* Sync SLB */
|
|
|
|
#ifdef TARGET_PPC64
|
|
|
|
for (i = 0; i < ARRAY_SIZE(env->slb); i++) {
|
|
|
|
sregs.u.s.ppc64.slb[i].slbe = env->slb[i].esid;
|
|
|
|
if (env->slb[i].esid & SLB_ESID_V) {
|
|
|
|
sregs.u.s.ppc64.slb[i].slbe |= i;
|
|
|
|
}
|
|
|
|
sregs.u.s.ppc64.slb[i].slbv = env->slb[i].vsid;
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
|
|
|
/* Sync SRs */
|
|
|
|
for (i = 0; i < 16; i++) {
|
|
|
|
sregs.u.s.ppc32.sr[i] = env->sr[i];
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Sync BATs */
|
|
|
|
for (i = 0; i < 8; i++) {
|
|
|
|
/* Beware. We have to swap upper and lower bits here */
|
|
|
|
sregs.u.s.ppc32.dbat[i] = ((uint64_t)env->DBAT[0][i] << 32)
|
|
|
|
| env->DBAT[1][i];
|
|
|
|
sregs.u.s.ppc32.ibat[i] = ((uint64_t)env->IBAT[0][i] << 32)
|
|
|
|
| env->IBAT[1][i];
|
|
|
|
}
|
|
|
|
|
|
|
|
return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_SREGS, &sregs);
|
|
|
|
}
|
|
|
|
|
2012-10-31 13:57:49 +08:00
|
|
|
int kvm_arch_put_registers(CPUState *cs, int level)
|
2008-12-16 18:43:58 +08:00
|
|
|
{
|
2012-10-31 13:57:49 +08:00
|
|
|
PowerPCCPU *cpu = POWERPC_CPU(cs);
|
|
|
|
CPUPPCState *env = &cpu->env;
|
2008-12-16 18:43:58 +08:00
|
|
|
struct kvm_regs regs;
|
|
|
|
int ret;
|
|
|
|
int i;
|
|
|
|
|
2012-10-31 13:06:49 +08:00
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_GET_REGS, &regs);
|
|
|
|
if (ret < 0) {
|
2008-12-16 18:43:58 +08:00
|
|
|
return ret;
|
2012-10-31 13:06:49 +08:00
|
|
|
}
|
2008-12-16 18:43:58 +08:00
|
|
|
|
|
|
|
regs.ctr = env->ctr;
|
|
|
|
regs.lr = env->lr;
|
2013-02-20 15:52:13 +08:00
|
|
|
regs.xer = cpu_read_xer(env);
|
2008-12-16 18:43:58 +08:00
|
|
|
regs.msr = env->msr;
|
|
|
|
regs.pc = env->nip;
|
|
|
|
|
|
|
|
regs.srr0 = env->spr[SPR_SRR0];
|
|
|
|
regs.srr1 = env->spr[SPR_SRR1];
|
|
|
|
|
|
|
|
regs.sprg0 = env->spr[SPR_SPRG0];
|
|
|
|
regs.sprg1 = env->spr[SPR_SPRG1];
|
|
|
|
regs.sprg2 = env->spr[SPR_SPRG2];
|
|
|
|
regs.sprg3 = env->spr[SPR_SPRG3];
|
|
|
|
regs.sprg4 = env->spr[SPR_SPRG4];
|
|
|
|
regs.sprg5 = env->spr[SPR_SPRG5];
|
|
|
|
regs.sprg6 = env->spr[SPR_SPRG6];
|
|
|
|
regs.sprg7 = env->spr[SPR_SPRG7];
|
|
|
|
|
2011-04-30 06:10:23 +08:00
|
|
|
regs.pid = env->spr[SPR_BOOKE_PID];
|
|
|
|
|
2008-12-16 18:43:58 +08:00
|
|
|
for (i = 0; i < 32; i++)
|
|
|
|
regs.gpr[i] = env->gpr[i];
|
|
|
|
|
2013-06-15 09:51:51 +08:00
|
|
|
regs.cr = 0;
|
|
|
|
for (i = 0; i < 8; i++) {
|
|
|
|
regs.cr |= (env->crf[i] & 15) << (4 * (7 - i));
|
|
|
|
}
|
|
|
|
|
2012-10-31 13:06:49 +08:00
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_SET_REGS, &regs);
|
2008-12-16 18:43:58 +08:00
|
|
|
if (ret < 0)
|
|
|
|
return ret;
|
|
|
|
|
2013-02-21 00:41:51 +08:00
|
|
|
kvm_put_fp(cs);
|
|
|
|
|
2011-08-31 19:26:56 +08:00
|
|
|
if (env->tlb_dirty) {
|
2012-10-31 13:06:49 +08:00
|
|
|
kvm_sw_tlb_put(cpu);
|
2011-08-31 19:26:56 +08:00
|
|
|
env->tlb_dirty = false;
|
|
|
|
}
|
|
|
|
|
2012-09-13 00:57:09 +08:00
|
|
|
if (cap_segstate && (level >= KVM_PUT_RESET_STATE)) {
|
2016-03-09 08:58:33 +08:00
|
|
|
ret = kvmppc_put_books_sregs(cpu);
|
|
|
|
if (ret < 0) {
|
2012-09-13 00:57:09 +08:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if (cap_hior && (level >= KVM_PUT_RESET_STATE)) {
|
2013-02-21 00:41:50 +08:00
|
|
|
kvm_put_one_spr(cs, KVM_REG_PPC_HIOR, SPR_HIOR);
|
|
|
|
}
|
2012-09-13 00:57:09 +08:00
|
|
|
|
2013-02-21 00:41:50 +08:00
|
|
|
if (cap_one_reg) {
|
|
|
|
int i;
|
|
|
|
|
|
|
|
/* We deliberately ignore errors here, for kernels which have
|
|
|
|
* the ONE_REG calls, but don't support the specific
|
|
|
|
* registers, there's a reasonable chance things will still
|
|
|
|
* work, at least until we try to migrate. */
|
|
|
|
for (i = 0; i < 1024; i++) {
|
|
|
|
uint64_t id = env->spr_cb[i].one_reg_id;
|
|
|
|
|
|
|
|
if (id != 0) {
|
|
|
|
kvm_put_one_spr(cs, id, i);
|
|
|
|
}
|
2012-09-13 00:57:09 +08:00
|
|
|
}
|
2013-04-08 03:08:22 +08:00
|
|
|
|
|
|
|
#ifdef TARGET_PPC64
|
2014-06-04 20:51:00 +08:00
|
|
|
if (msr_ts) {
|
|
|
|
for (i = 0; i < ARRAY_SIZE(env->tm_gpr); i++) {
|
|
|
|
kvm_set_one_reg(cs, KVM_REG_PPC_TM_GPR(i), &env->tm_gpr[i]);
|
|
|
|
}
|
|
|
|
for (i = 0; i < ARRAY_SIZE(env->tm_vsr); i++) {
|
|
|
|
kvm_set_one_reg(cs, KVM_REG_PPC_TM_VSR(i), &env->tm_vsr[i]);
|
|
|
|
}
|
|
|
|
kvm_set_one_reg(cs, KVM_REG_PPC_TM_CR, &env->tm_cr);
|
|
|
|
kvm_set_one_reg(cs, KVM_REG_PPC_TM_LR, &env->tm_lr);
|
|
|
|
kvm_set_one_reg(cs, KVM_REG_PPC_TM_CTR, &env->tm_ctr);
|
|
|
|
kvm_set_one_reg(cs, KVM_REG_PPC_TM_FPSCR, &env->tm_fpscr);
|
|
|
|
kvm_set_one_reg(cs, KVM_REG_PPC_TM_AMR, &env->tm_amr);
|
|
|
|
kvm_set_one_reg(cs, KVM_REG_PPC_TM_PPR, &env->tm_ppr);
|
|
|
|
kvm_set_one_reg(cs, KVM_REG_PPC_TM_VRSAVE, &env->tm_vrsave);
|
|
|
|
kvm_set_one_reg(cs, KVM_REG_PPC_TM_VSCR, &env->tm_vscr);
|
|
|
|
kvm_set_one_reg(cs, KVM_REG_PPC_TM_DSCR, &env->tm_dscr);
|
|
|
|
kvm_set_one_reg(cs, KVM_REG_PPC_TM_TAR, &env->tm_tar);
|
|
|
|
}
|
|
|
|
|
2013-04-08 03:08:22 +08:00
|
|
|
if (cap_papr) {
|
|
|
|
if (kvm_put_vpa(cs) < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Warning: Unable to set VPA information to KVM\n");
|
2013-04-08 03:08:22 +08:00
|
|
|
}
|
|
|
|
}
|
2014-05-01 18:37:09 +08:00
|
|
|
|
|
|
|
kvm_set_one_reg(cs, KVM_REG_PPC_TB_OFFSET, &env->tb_env->tb_offset);
|
2013-04-08 03:08:22 +08:00
|
|
|
#endif /* TARGET_PPC64 */
|
2012-09-13 00:57:09 +08:00
|
|
|
}
|
|
|
|
|
2008-12-16 18:43:58 +08:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2014-07-14 17:15:36 +08:00
|
|
|
static void kvm_sync_excp(CPUPPCState *env, int vector, int ivor)
|
|
|
|
{
|
|
|
|
env->excp_vectors[vector] = env->spr[ivor] + env->spr[SPR_BOOKE_IVPR];
|
|
|
|
}
|
|
|
|
|
2016-03-09 08:58:33 +08:00
|
|
|
static int kvmppc_get_booke_sregs(PowerPCCPU *cpu)
|
|
|
|
{
|
|
|
|
CPUPPCState *env = &cpu->env;
|
|
|
|
struct kvm_sregs sregs;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_SREGS, &sregs);
|
|
|
|
if (ret < 0) {
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (sregs.u.e.features & KVM_SREGS_E_BASE) {
|
|
|
|
env->spr[SPR_BOOKE_CSRR0] = sregs.u.e.csrr0;
|
|
|
|
env->spr[SPR_BOOKE_CSRR1] = sregs.u.e.csrr1;
|
|
|
|
env->spr[SPR_BOOKE_ESR] = sregs.u.e.esr;
|
|
|
|
env->spr[SPR_BOOKE_DEAR] = sregs.u.e.dear;
|
|
|
|
env->spr[SPR_BOOKE_MCSR] = sregs.u.e.mcsr;
|
|
|
|
env->spr[SPR_BOOKE_TSR] = sregs.u.e.tsr;
|
|
|
|
env->spr[SPR_BOOKE_TCR] = sregs.u.e.tcr;
|
|
|
|
env->spr[SPR_DECR] = sregs.u.e.dec;
|
|
|
|
env->spr[SPR_TBL] = sregs.u.e.tb & 0xffffffff;
|
|
|
|
env->spr[SPR_TBU] = sregs.u.e.tb >> 32;
|
|
|
|
env->spr[SPR_VRSAVE] = sregs.u.e.vrsave;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (sregs.u.e.features & KVM_SREGS_E_ARCH206) {
|
|
|
|
env->spr[SPR_BOOKE_PIR] = sregs.u.e.pir;
|
|
|
|
env->spr[SPR_BOOKE_MCSRR0] = sregs.u.e.mcsrr0;
|
|
|
|
env->spr[SPR_BOOKE_MCSRR1] = sregs.u.e.mcsrr1;
|
|
|
|
env->spr[SPR_BOOKE_DECAR] = sregs.u.e.decar;
|
|
|
|
env->spr[SPR_BOOKE_IVPR] = sregs.u.e.ivpr;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (sregs.u.e.features & KVM_SREGS_E_64) {
|
|
|
|
env->spr[SPR_BOOKE_EPCR] = sregs.u.e.epcr;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (sregs.u.e.features & KVM_SREGS_E_SPRG8) {
|
|
|
|
env->spr[SPR_BOOKE_SPRG8] = sregs.u.e.sprg8;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (sregs.u.e.features & KVM_SREGS_E_IVOR) {
|
|
|
|
env->spr[SPR_BOOKE_IVOR0] = sregs.u.e.ivor_low[0];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_CRITICAL, SPR_BOOKE_IVOR0);
|
|
|
|
env->spr[SPR_BOOKE_IVOR1] = sregs.u.e.ivor_low[1];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_MCHECK, SPR_BOOKE_IVOR1);
|
|
|
|
env->spr[SPR_BOOKE_IVOR2] = sregs.u.e.ivor_low[2];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_DSI, SPR_BOOKE_IVOR2);
|
|
|
|
env->spr[SPR_BOOKE_IVOR3] = sregs.u.e.ivor_low[3];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_ISI, SPR_BOOKE_IVOR3);
|
|
|
|
env->spr[SPR_BOOKE_IVOR4] = sregs.u.e.ivor_low[4];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_EXTERNAL, SPR_BOOKE_IVOR4);
|
|
|
|
env->spr[SPR_BOOKE_IVOR5] = sregs.u.e.ivor_low[5];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_ALIGN, SPR_BOOKE_IVOR5);
|
|
|
|
env->spr[SPR_BOOKE_IVOR6] = sregs.u.e.ivor_low[6];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_PROGRAM, SPR_BOOKE_IVOR6);
|
|
|
|
env->spr[SPR_BOOKE_IVOR7] = sregs.u.e.ivor_low[7];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_FPU, SPR_BOOKE_IVOR7);
|
|
|
|
env->spr[SPR_BOOKE_IVOR8] = sregs.u.e.ivor_low[8];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_SYSCALL, SPR_BOOKE_IVOR8);
|
|
|
|
env->spr[SPR_BOOKE_IVOR9] = sregs.u.e.ivor_low[9];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_APU, SPR_BOOKE_IVOR9);
|
|
|
|
env->spr[SPR_BOOKE_IVOR10] = sregs.u.e.ivor_low[10];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_DECR, SPR_BOOKE_IVOR10);
|
|
|
|
env->spr[SPR_BOOKE_IVOR11] = sregs.u.e.ivor_low[11];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_FIT, SPR_BOOKE_IVOR11);
|
|
|
|
env->spr[SPR_BOOKE_IVOR12] = sregs.u.e.ivor_low[12];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_WDT, SPR_BOOKE_IVOR12);
|
|
|
|
env->spr[SPR_BOOKE_IVOR13] = sregs.u.e.ivor_low[13];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_DTLB, SPR_BOOKE_IVOR13);
|
|
|
|
env->spr[SPR_BOOKE_IVOR14] = sregs.u.e.ivor_low[14];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_ITLB, SPR_BOOKE_IVOR14);
|
|
|
|
env->spr[SPR_BOOKE_IVOR15] = sregs.u.e.ivor_low[15];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_DEBUG, SPR_BOOKE_IVOR15);
|
|
|
|
|
|
|
|
if (sregs.u.e.features & KVM_SREGS_E_SPE) {
|
|
|
|
env->spr[SPR_BOOKE_IVOR32] = sregs.u.e.ivor_high[0];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_SPEU, SPR_BOOKE_IVOR32);
|
|
|
|
env->spr[SPR_BOOKE_IVOR33] = sregs.u.e.ivor_high[1];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_EFPDI, SPR_BOOKE_IVOR33);
|
|
|
|
env->spr[SPR_BOOKE_IVOR34] = sregs.u.e.ivor_high[2];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_EFPRI, SPR_BOOKE_IVOR34);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (sregs.u.e.features & KVM_SREGS_E_PM) {
|
|
|
|
env->spr[SPR_BOOKE_IVOR35] = sregs.u.e.ivor_high[3];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_EPERFM, SPR_BOOKE_IVOR35);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (sregs.u.e.features & KVM_SREGS_E_PC) {
|
|
|
|
env->spr[SPR_BOOKE_IVOR36] = sregs.u.e.ivor_high[4];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_DOORI, SPR_BOOKE_IVOR36);
|
|
|
|
env->spr[SPR_BOOKE_IVOR37] = sregs.u.e.ivor_high[5];
|
|
|
|
kvm_sync_excp(env, POWERPC_EXCP_DOORCI, SPR_BOOKE_IVOR37);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if (sregs.u.e.features & KVM_SREGS_E_ARCH206_MMU) {
|
|
|
|
env->spr[SPR_BOOKE_MAS0] = sregs.u.e.mas0;
|
|
|
|
env->spr[SPR_BOOKE_MAS1] = sregs.u.e.mas1;
|
|
|
|
env->spr[SPR_BOOKE_MAS2] = sregs.u.e.mas2;
|
|
|
|
env->spr[SPR_BOOKE_MAS3] = sregs.u.e.mas7_3 & 0xffffffff;
|
|
|
|
env->spr[SPR_BOOKE_MAS4] = sregs.u.e.mas4;
|
|
|
|
env->spr[SPR_BOOKE_MAS6] = sregs.u.e.mas6;
|
|
|
|
env->spr[SPR_BOOKE_MAS7] = sregs.u.e.mas7_3 >> 32;
|
|
|
|
env->spr[SPR_MMUCFG] = sregs.u.e.mmucfg;
|
|
|
|
env->spr[SPR_BOOKE_TLB0CFG] = sregs.u.e.tlbcfg[0];
|
|
|
|
env->spr[SPR_BOOKE_TLB1CFG] = sregs.u.e.tlbcfg[1];
|
|
|
|
}
|
|
|
|
|
|
|
|
if (sregs.u.e.features & KVM_SREGS_EXP) {
|
|
|
|
env->spr[SPR_BOOKE_EPR] = sregs.u.e.epr;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (sregs.u.e.features & KVM_SREGS_E_PD) {
|
|
|
|
env->spr[SPR_BOOKE_EPLC] = sregs.u.e.eplc;
|
|
|
|
env->spr[SPR_BOOKE_EPSC] = sregs.u.e.epsc;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (sregs.u.e.impl_id == KVM_SREGS_E_IMPL_FSL) {
|
|
|
|
env->spr[SPR_E500_SVR] = sregs.u.e.impl.fsl.svr;
|
|
|
|
env->spr[SPR_Exxx_MCAR] = sregs.u.e.impl.fsl.mcar;
|
|
|
|
env->spr[SPR_HID0] = sregs.u.e.impl.fsl.hid0;
|
|
|
|
|
|
|
|
if (sregs.u.e.impl.fsl.features & KVM_SREGS_E_FSL_PIDn) {
|
|
|
|
env->spr[SPR_BOOKE_PID1] = sregs.u.e.impl.fsl.pid1;
|
|
|
|
env->spr[SPR_BOOKE_PID2] = sregs.u.e.impl.fsl.pid2;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
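The BookE sync above stores the 64-bit time base and the combined MAS7_3 value as pairs of 32-bit SPRs (TBL/TBU, MAS3/MAS7). A minimal sketch of that split and its inverse follows. The helper names `lo32`, `hi32`, `join64` and `split_selfcheck` are hypothetical and not part of QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Low half: what the code above stores into SPR_TBL or SPR_BOOKE_MAS3. */
static uint32_t lo32(uint64_t v)
{
    return (uint32_t)(v & 0xffffffff);
}

/* High half: what the code above stores into SPR_TBU or SPR_BOOKE_MAS7. */
static uint32_t hi32(uint64_t v)
{
    return (uint32_t)(v >> 32);
}

/* Inverse operation, as a put-side path would need (assumed, not shown above). */
static uint64_t join64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Splitting and rejoining must give back the original value. */
static void split_selfcheck(void)
{
    uint64_t mas7_3 = 0x0000000f12345000ULL;

    assert(join64(hi32(mas7_3), lo32(mas7_3)) == mas7_3);
}
```

The cast to `uint64_t` before the left shift matters: shifting a 32-bit value by 32 would be undefined behaviour in C.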
|
|
|
|
|
|
|
|
static int kvmppc_get_books_sregs(PowerPCCPU *cpu)
|
|
|
|
{
|
|
|
|
CPUPPCState *env = &cpu->env;
|
|
|
|
struct kvm_sregs sregs;
|
|
|
|
int ret;
|
|
|
|
int i;
|
|
|
|
|
|
|
|
ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_SREGS, &sregs);
|
|
|
|
if (ret < 0) {
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!env->external_htab) {
|
|
|
|
ppc_store_sdr1(env, sregs.u.s.sdr1);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Sync SLB */
|
|
|
|
#ifdef TARGET_PPC64
|
|
|
|
/*
|
|
|
|
* The packed SLB array we get from KVM_GET_SREGS only contains
|
|
|
|
* information about valid entries. So we flush our internal copy
|
|
|
|
* to get rid of stale ones, then put all valid SLB entries back
|
|
|
|
* in.
|
|
|
|
*/
|
|
|
|
memset(env->slb, 0, sizeof(env->slb));
|
|
|
|
for (i = 0; i < ARRAY_SIZE(env->slb); i++) {
|
|
|
|
target_ulong rb = sregs.u.s.ppc64.slb[i].slbe;
|
|
|
|
target_ulong rs = sregs.u.s.ppc64.slb[i].slbv;
|
|
|
|
/*
|
|
|
|
* Only restore valid entries
|
|
|
|
*/
|
|
|
|
if (rb & SLB_ESID_V) {
|
|
|
|
ppc_store_slb(cpu, rb & 0xfff, rb & ~0xfffULL, rs);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
|
|
|
/* Sync SRs */
|
|
|
|
for (i = 0; i < 16; i++) {
|
|
|
|
env->sr[i] = sregs.u.s.ppc32.sr[i];
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Sync BATs */
|
|
|
|
for (i = 0; i < 8; i++) {
|
|
|
|
env->DBAT[0][i] = sregs.u.s.ppc32.dbat[i] & 0xffffffff;
|
|
|
|
env->DBAT[1][i] = sregs.u.s.ppc32.dbat[i] >> 32;
|
|
|
|
env->IBAT[0][i] = sregs.u.s.ppc32.ibat[i] & 0xffffffff;
|
|
|
|
env->IBAT[1][i] = sregs.u.s.ppc32.ibat[i] >> 32;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2012-10-31 13:57:49 +08:00
|
|
|
int kvm_arch_get_registers(CPUState *cs)
|
2008-12-16 18:43:58 +08:00
|
|
|
{
|
2012-10-31 13:57:49 +08:00
|
|
|
PowerPCCPU *cpu = POWERPC_CPU(cs);
|
|
|
|
CPUPPCState *env = &cpu->env;
|
2008-12-16 18:43:58 +08:00
|
|
|
struct kvm_regs regs;
|
2011-04-30 06:10:23 +08:00
|
|
|
uint32_t cr;
|
2010-11-25 15:20:46 +08:00
|
|
|
int i, ret;
|
2008-12-16 18:43:58 +08:00
|
|
|
|
2012-10-31 13:06:49 +08:00
|
|
|
ret = kvm_vcpu_ioctl(cs, KVM_GET_REGS, &regs);
|
2008-12-16 18:43:58 +08:00
|
|
|
if (ret < 0)
|
|
|
|
return ret;
|
|
|
|
|
2011-04-30 06:10:23 +08:00
|
|
|
cr = regs.cr;
|
|
|
|
for (i = 7; i >= 0; i--) {
|
|
|
|
env->crf[i] = cr & 15;
|
|
|
|
cr >>= 4;
|
|
|
|
}
|
2009-12-03 06:19:47 +08:00
|
|
|
|
2008-12-16 18:43:58 +08:00
|
|
|
env->ctr = regs.ctr;
|
|
|
|
env->lr = regs.lr;
|
2013-02-20 15:52:13 +08:00
|
|
|
cpu_write_xer(env, regs.xer);
|
2008-12-16 18:43:58 +08:00
|
|
|
env->msr = regs.msr;
|
|
|
|
env->nip = regs.pc;
|
|
|
|
|
|
|
|
env->spr[SPR_SRR0] = regs.srr0;
|
|
|
|
env->spr[SPR_SRR1] = regs.srr1;
|
|
|
|
|
|
|
|
env->spr[SPR_SPRG0] = regs.sprg0;
|
|
|
|
env->spr[SPR_SPRG1] = regs.sprg1;
|
|
|
|
env->spr[SPR_SPRG2] = regs.sprg2;
|
|
|
|
env->spr[SPR_SPRG3] = regs.sprg3;
|
|
|
|
env->spr[SPR_SPRG4] = regs.sprg4;
|
|
|
|
env->spr[SPR_SPRG5] = regs.sprg5;
|
|
|
|
env->spr[SPR_SPRG6] = regs.sprg6;
|
|
|
|
env->spr[SPR_SPRG7] = regs.sprg7;
|
|
|
|
|
2011-04-30 06:10:23 +08:00
|
|
|
env->spr[SPR_BOOKE_PID] = regs.pid;
|
|
|
|
|
2008-12-16 18:43:58 +08:00
|
|
|
for (i = 0; i < 32; i++)
|
|
|
|
env->gpr[i] = regs.gpr[i];
|
|
|
|
|
2013-02-21 00:41:51 +08:00
|
|
|
kvm_get_fp(cs);
|
|
|
|
|
2011-04-30 06:10:23 +08:00
|
|
|
if (cap_booke_sregs) {
|
2016-03-09 08:58:33 +08:00
|
|
|
ret = kvmppc_get_booke_sregs(cpu);
|
2011-04-30 06:10:23 +08:00
|
|
|
if (ret < 0) {
|
|
|
|
return ret;
|
|
|
|
}
|
2011-05-25 21:04:42 +08:00
|
|
|
}
|
2011-04-30 06:10:23 +08:00
|
|
|
|
|
|
|
if (cap_segstate) {
|
2016-03-09 08:58:33 +08:00
|
|
|
ret = kvmppc_get_books_sregs(cpu);
|
2011-04-30 06:10:23 +08:00
|
|
|
if (ret < 0) {
|
|
|
|
return ret;
|
|
|
|
}
|
2011-05-25 21:04:42 +08:00
|
|
|
}
|
2009-12-03 06:19:47 +08:00
|
|
|
|
2013-02-21 00:41:50 +08:00
|
|
|
if (cap_hior) {
|
|
|
|
kvm_get_one_spr(cs, KVM_REG_PPC_HIOR, SPR_HIOR);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (cap_one_reg) {
|
|
|
|
int i;
|
|
|
|
|
|
|
|
/* We deliberately ignore errors here: for kernels which have
|
|
|
|
* the ONE_REG calls, but don't support the specific
|
|
|
|
* registers, there's a reasonable chance things will still
|
|
|
|
* work, at least until we try to migrate. */
|
|
|
|
for (i = 0; i < 1024; i++) {
|
|
|
|
uint64_t id = env->spr_cb[i].one_reg_id;
|
|
|
|
|
|
|
|
if (id != 0) {
|
|
|
|
kvm_get_one_spr(cs, id, i);
|
|
|
|
}
|
|
|
|
}
|
2013-04-08 03:08:22 +08:00
|
|
|
|
|
|
|
#ifdef TARGET_PPC64
|
2014-06-04 20:51:00 +08:00
|
|
|
if (msr_ts) {
|
|
|
|
for (i = 0; i < ARRAY_SIZE(env->tm_gpr); i++) {
|
|
|
|
kvm_get_one_reg(cs, KVM_REG_PPC_TM_GPR(i), &env->tm_gpr[i]);
|
|
|
|
}
|
|
|
|
for (i = 0; i < ARRAY_SIZE(env->tm_vsr); i++) {
|
|
|
|
kvm_get_one_reg(cs, KVM_REG_PPC_TM_VSR(i), &env->tm_vsr[i]);
|
|
|
|
}
|
|
|
|
kvm_get_one_reg(cs, KVM_REG_PPC_TM_CR, &env->tm_cr);
|
|
|
|
kvm_get_one_reg(cs, KVM_REG_PPC_TM_LR, &env->tm_lr);
|
|
|
|
kvm_get_one_reg(cs, KVM_REG_PPC_TM_CTR, &env->tm_ctr);
|
|
|
|
kvm_get_one_reg(cs, KVM_REG_PPC_TM_FPSCR, &env->tm_fpscr);
|
|
|
|
kvm_get_one_reg(cs, KVM_REG_PPC_TM_AMR, &env->tm_amr);
|
|
|
|
kvm_get_one_reg(cs, KVM_REG_PPC_TM_PPR, &env->tm_ppr);
|
|
|
|
kvm_get_one_reg(cs, KVM_REG_PPC_TM_VRSAVE, &env->tm_vrsave);
|
|
|
|
kvm_get_one_reg(cs, KVM_REG_PPC_TM_VSCR, &env->tm_vscr);
|
|
|
|
kvm_get_one_reg(cs, KVM_REG_PPC_TM_DSCR, &env->tm_dscr);
|
|
|
|
kvm_get_one_reg(cs, KVM_REG_PPC_TM_TAR, &env->tm_tar);
|
|
|
|
}
|
|
|
|
|
2013-04-08 03:08:22 +08:00
|
|
|
if (cap_papr) {
|
|
|
|
if (kvm_get_vpa(cs) < 0) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("Warning: Unable to get VPA information from KVM\n");
|
2013-04-08 03:08:22 +08:00
|
|
|
}
|
|
|
|
}
|
2014-05-01 18:37:09 +08:00
|
|
|
|
|
|
|
kvm_get_one_reg(cs, KVM_REG_PPC_TB_OFFSET, &env->tb_env->tb_offset);
|
2013-04-08 03:08:22 +08:00
|
|
|
#endif
|
2013-02-21 00:41:50 +08:00
|
|
|
}
|
|
|
|
|
2008-12-16 18:43:58 +08:00
|
|
|
return 0;
|
|
|
|
}
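The register sync code keeps the condition register as eight separate 4-bit fields in `env->crf[]`, while KVM exchanges it as one 32-bit word with field 0 in the most significant nibble. The sketch below isolates that packing and unpacking so the round trip can be checked on its own. The function names `pack_cr`, `unpack_cr` and `cr_selfcheck` are hypothetical; only the loop bodies mirror the code above.

```c
#include <assert.h>
#include <stdint.h>

/* Pack eight 4-bit CR fields into one word; field 0 ends up in bits 31..28. */
static uint32_t pack_cr(const uint32_t crf[8])
{
    uint32_t cr = 0;

    for (int i = 0; i < 8; i++) {
        cr |= (crf[i] & 15) << (4 * (7 - i));
    }
    return cr;
}

/* Unpack by peeling nibbles off the low end, so field 7 is filled first. */
static void unpack_cr(uint32_t cr, uint32_t crf[8])
{
    for (int i = 7; i >= 0; i--) {
        crf[i] = cr & 15;
        cr >>= 4;
    }
}

/* Packing followed by unpacking must reproduce every field. */
static void cr_selfcheck(void)
{
    const uint32_t in[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    uint32_t out[8];

    unpack_cr(pack_cr(in), out);
    for (int i = 0; i < 8; i++) {
        assert(out[i] == in[i]);
    }
}
```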
|
|
|
|
|
2012-10-31 13:06:49 +08:00
|
|
|
int kvmppc_set_interrupt(PowerPCCPU *cpu, int irq, int level)
|
2010-08-30 19:49:15 +08:00
|
|
|
{
|
|
|
|
unsigned virq = level ? KVM_INTERRUPT_SET_LEVEL : KVM_INTERRUPT_UNSET;
|
|
|
|
|
|
|
|
if (irq != PPC_INTERRUPT_EXT) {
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!kvm_enabled() || !cap_interrupt_unset || !cap_interrupt_level) {
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2012-10-31 13:06:49 +08:00
|
|
|
kvm_vcpu_ioctl(CPU(cpu), KVM_INTERRUPT, &virq);
|
2010-08-30 19:49:15 +08:00
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2009-07-17 19:51:46 +08:00
|
|
|
#if defined(TARGET_PPCEMB)
|
|
|
|
#define PPC_INPUT_INT PPC40x_INPUT_INT
|
|
|
|
#elif defined(TARGET_PPC64)
|
|
|
|
#define PPC_INPUT_INT PPC970_INPUT_INT
|
|
|
|
#else
|
|
|
|
#define PPC_INPUT_INT PPC6xx_INPUT_INT
|
|
|
|
#endif
|
|
|
|
|
2012-10-31 13:57:49 +08:00
|
|
|
void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
|
2008-12-16 18:43:58 +08:00
|
|
|
{
|
2012-10-31 13:57:49 +08:00
|
|
|
PowerPCCPU *cpu = POWERPC_CPU(cs);
|
|
|
|
CPUPPCState *env = &cpu->env;
|
2008-12-16 18:43:58 +08:00
|
|
|
int r;
|
|
|
|
unsigned irq;
|
|
|
|
|
2015-06-19 00:47:23 +08:00
|
|
|
qemu_mutex_lock_iothread();
|
|
|
|
|
2012-04-07 15:23:39 +08:00
|
|
|
/* PowerPC QEMU tracks the various core input pins (interrupt, critical
|
2008-12-16 18:43:58 +08:00
|
|
|
* interrupt, reset, etc) in PPC-specific env->irq_input_state. */
|
2010-08-30 19:49:15 +08:00
|
|
|
if (!cap_interrupt_level &&
|
|
|
|
run->ready_for_interrupt_injection &&
|
2013-01-18 01:51:17 +08:00
|
|
|
(cs->interrupt_request & CPU_INTERRUPT_HARD) &&
|
2009-07-17 19:51:46 +08:00
|
|
|
(env->irq_input_state & (1<<PPC_INPUT_INT)))
|
2008-12-16 18:43:58 +08:00
|
|
|
{
|
|
|
|
/* For now KVM disregards the 'irq' argument. However, in the
|
|
|
|
* future KVM could cache it in-kernel to avoid a heavyweight exit
|
|
|
|
* when reading the UIC.
|
|
|
|
*/
|
2010-08-30 19:49:15 +08:00
|
|
|
irq = KVM_INTERRUPT_SET;
|
2008-12-16 18:43:58 +08:00
|
|
|
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("injected interrupt %d\n", irq);
|
2012-10-31 13:06:49 +08:00
|
|
|
r = kvm_vcpu_ioctl(cs, KVM_INTERRUPT, &irq);
|
2012-12-17 13:18:02 +08:00
|
|
|
if (r < 0) {
|
|
|
|
printf("cpu %d fail inject %x\n", cs->cpu_index, irq);
|
|
|
|
}
|
2010-04-19 05:10:17 +08:00
|
|
|
|
|
|
|
/* Always wake up soon in case the interrupt was level based */
|
2013-08-21 23:03:08 +08:00
|
|
|
timer_mod(idle_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
|
2016-03-22 00:02:30 +08:00
|
|
|
(NANOSECONDS_PER_SECOND / 50));
|
2008-12-16 18:43:58 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
/* We don't know if there are more interrupts pending after this. However,
|
|
|
|
* the guest will return to userspace in the course of handling this one
|
|
|
|
* anyway, so we will get a chance to deliver the rest. */
|
2015-06-19 00:47:23 +08:00
|
|
|
|
|
|
|
qemu_mutex_unlock_iothread();
|
2008-12-16 18:43:58 +08:00
|
|
|
}
|
|
|
|
|
2015-04-08 19:30:58 +08:00
|
|
|
MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
|
2008-12-16 18:43:58 +08:00
|
|
|
{
|
2015-04-08 19:30:58 +08:00
|
|
|
return MEMTXATTRS_UNSPECIFIED;
|
2008-12-16 18:43:58 +08:00
|
|
|
}
|
|
|
|
|
2012-10-31 13:57:49 +08:00
|
|
|
int kvm_arch_process_async_events(CPUState *cs)
|
2010-05-04 20:45:27 +08:00
|
|
|
{
|
2013-01-18 01:51:17 +08:00
|
|
|
return cs->halted;
|
2010-05-04 20:45:27 +08:00
|
|
|
}
|
|
|
|
|
2013-01-18 01:51:17 +08:00
|
|
|
static int kvmppc_handle_halt(PowerPCCPU *cpu)
|
2008-12-16 18:43:58 +08:00
|
|
|
{
|
2013-01-18 01:51:17 +08:00
|
|
|
CPUState *cs = CPU(cpu);
|
|
|
|
CPUPPCState *env = &cpu->env;
|
|
|
|
|
|
|
|
if (!(cs->interrupt_request & CPU_INTERRUPT_HARD) && (msr_ee)) {
|
|
|
|
cs->halted = 1;
|
2013-08-26 14:31:06 +08:00
|
|
|
cs->exception_index = EXCP_HLT;
|
2008-12-16 18:43:58 +08:00
|
|
|
}
|
|
|
|
|
2011-03-15 19:26:28 +08:00
|
|
|
return 0;
|
2008-12-16 18:43:58 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
/* map dcr access to existing qemu dcr emulation */
|
2012-03-14 08:38:22 +08:00
|
|
|
static int kvmppc_handle_dcr_read(CPUPPCState *env, uint32_t dcrn, uint32_t *data)
|
2008-12-16 18:43:58 +08:00
|
|
|
{
|
|
|
|
if (ppc_dcr_read(env->dcr_env, dcrn, data) < 0)
|
|
|
|
fprintf(stderr, "Read from unhandled DCR (0x%x)\n", dcrn);
|
|
|
|
|
2011-03-15 19:26:28 +08:00
|
|
|
return 0;
|
2008-12-16 18:43:58 +08:00
|
|
|
}
|
|
|
|
|
2012-03-14 08:38:22 +08:00
|
|
|
static int kvmppc_handle_dcr_write(CPUPPCState *env, uint32_t dcrn, uint32_t data)
|
2008-12-16 18:43:58 +08:00
|
|
|
{
|
|
|
|
if (ppc_dcr_write(env->dcr_env, dcrn, data) < 0)
|
|
|
|
fprintf(stderr, "Write to unhandled DCR (0x%x)\n", dcrn);
|
|
|
|
|
2011-03-15 19:26:28 +08:00
|
|
|
return 0;
|
2008-12-16 18:43:58 +08:00
|
|
|
}
|
|
|
|
|
2014-07-14 17:15:38 +08:00
|
|
|
int kvm_arch_insert_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp)
|
|
|
|
{
|
|
|
|
/* Mixed endian case is not handled */
|
|
|
|
uint32_t sc = debug_inst_opcode;
|
|
|
|
|
|
|
|
if (cpu_memory_rw_debug(cs, bp->pc, (uint8_t *)&bp->saved_insn,
|
|
|
|
sizeof(sc), 0) ||
|
|
|
|
cpu_memory_rw_debug(cs, bp->pc, (uint8_t *)&sc, sizeof(sc), 1)) {
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
int kvm_arch_remove_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp)
|
|
|
|
{
|
|
|
|
uint32_t sc;
|
|
|
|
|
|
|
|
if (cpu_memory_rw_debug(cs, bp->pc, (uint8_t *)&sc, sizeof(sc), 0) ||
|
|
|
|
sc != debug_inst_opcode ||
|
|
|
|
cpu_memory_rw_debug(cs, bp->pc, (uint8_t *)&bp->saved_insn,
|
|
|
|
sizeof(sc), 1)) {
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
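The two software breakpoint hooks above follow a save, patch, verify, restore pattern: insertion saves the original instruction and overwrites it with the debug opcode, and removal first checks that the debug opcode is still in place before writing the saved instruction back. The sketch below shows the same pattern on a plain byte buffer instead of guest memory. The names `struct sw_bp`, `bp_insert` and `bp_remove`, and the opcode passed in as `trap`, are stand-ins chosen for illustration and are not QEMU's.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct kvm_sw_breakpoint. */
struct sw_bp {
    size_t pc;            /* offset of the patched instruction in mem */
    uint32_t saved_insn;  /* original instruction, kept for removal */
};

/* Save the instruction at bp->pc, then overwrite it with the trap opcode. */
static int bp_insert(uint8_t *mem, size_t mem_len, struct sw_bp *bp,
                     uint32_t trap)
{
    assert(mem && bp);
    if (bp->pc > mem_len || mem_len - bp->pc < sizeof(trap)) {
        return -1;
    }
    memcpy(&bp->saved_insn, mem + bp->pc, sizeof(trap));
    memcpy(mem + bp->pc, &trap, sizeof(trap));
    return 0;
}

/* Restore the saved instruction, but only if the trap is still there. */
static int bp_remove(uint8_t *mem, size_t mem_len, const struct sw_bp *bp,
                     uint32_t trap)
{
    uint32_t cur;

    assert(mem && bp);
    if (bp->pc > mem_len || mem_len - bp->pc < sizeof(cur)) {
        return -1;
    }
    memcpy(&cur, mem + bp->pc, sizeof(cur));
    if (cur != trap) {
        return -1;
    }
    memcpy(mem + bp->pc, &bp->saved_insn, sizeof(cur));
    return 0;
}
```

Like the original, this sketch copies the opcode in host byte order and so does not handle a mixed-endian host and guest.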
|
|
|
|
|
2014-07-14 17:15:37 +08:00
|
|
|
static int find_hw_breakpoint(target_ulong addr, int type)
|
|
|
|
{
|
|
|
|
int n;
|
|
|
|
|
|
|
|
assert((nb_hw_breakpoint + nb_hw_watchpoint)
|
|
|
|
<= ARRAY_SIZE(hw_debug_points));
|
|
|
|
|
|
|
|
for (n = 0; n < nb_hw_breakpoint + nb_hw_watchpoint; n++) {
|
|
|
|
if (hw_debug_points[n].addr == addr &&
|
|
|
|
hw_debug_points[n].type == type) {
|
|
|
|
return n;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int find_hw_watchpoint(target_ulong addr, int *flag)
|
|
|
|
{
|
|
|
|
int n;
|
|
|
|
|
|
|
|
n = find_hw_breakpoint(addr, GDB_WATCHPOINT_ACCESS);
|
|
|
|
if (n >= 0) {
|
|
|
|
*flag = BP_MEM_ACCESS;
|
|
|
|
return n;
|
|
|
|
}
|
|
|
|
|
|
|
|
n = find_hw_breakpoint(addr, GDB_WATCHPOINT_WRITE);
|
|
|
|
if (n >= 0) {
|
|
|
|
*flag = BP_MEM_WRITE;
|
|
|
|
return n;
|
|
|
|
}
|
|
|
|
|
|
|
|
n = find_hw_breakpoint(addr, GDB_WATCHPOINT_READ);
|
|
|
|
if (n >= 0) {
|
|
|
|
*flag = BP_MEM_READ;
|
|
|
|
return n;
|
|
|
|
}
|
|
|
|
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
int kvm_arch_insert_hw_breakpoint(target_ulong addr,
|
|
|
|
target_ulong len, int type)
|
|
|
|
{
|
|
|
|
if ((nb_hw_breakpoint + nb_hw_watchpoint) >= ARRAY_SIZE(hw_debug_points)) {
|
|
|
|
return -ENOBUFS;
|
|
|
|
}
|
|
|
|
|
|
|
|
hw_debug_points[nb_hw_breakpoint + nb_hw_watchpoint].addr = addr;
|
|
|
|
hw_debug_points[nb_hw_breakpoint + nb_hw_watchpoint].type = type;
|
|
|
|
|
|
|
|
switch (type) {
|
|
|
|
case GDB_BREAKPOINT_HW:
|
|
|
|
if (nb_hw_breakpoint >= max_hw_breakpoint) {
|
|
|
|
return -ENOBUFS;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (find_hw_breakpoint(addr, type) >= 0) {
|
|
|
|
return -EEXIST;
|
|
|
|
}
|
|
|
|
|
|
|
|
nb_hw_breakpoint++;
|
|
|
|
break;
|
|
|
|
|
|
|
|
case GDB_WATCHPOINT_WRITE:
|
|
|
|
case GDB_WATCHPOINT_READ:
|
|
|
|
case GDB_WATCHPOINT_ACCESS:
|
|
|
|
if (nb_hw_watchpoint >= max_hw_watchpoint) {
|
|
|
|
return -ENOBUFS;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (find_hw_breakpoint(addr, type) >= 0) {
|
|
|
|
return -EEXIST;
|
|
|
|
}
|
|
|
|
|
|
|
|
nb_hw_watchpoint++;
|
|
|
|
break;
|
|
|
|
|
|
|
|
default:
|
|
|
|
return -ENOSYS;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
int kvm_arch_remove_hw_breakpoint(target_ulong addr,
|
|
|
|
target_ulong len, int type)
|
|
|
|
{
|
|
|
|
int n;
|
|
|
|
|
|
|
|
n = find_hw_breakpoint(addr, type);
|
|
|
|
if (n < 0) {
|
|
|
|
return -ENOENT;
|
|
|
|
}
|
|
|
|
|
|
|
|
switch (type) {
|
|
|
|
case GDB_BREAKPOINT_HW:
|
|
|
|
nb_hw_breakpoint--;
|
|
|
|
break;
|
|
|
|
|
|
|
|
case GDB_WATCHPOINT_WRITE:
|
|
|
|
case GDB_WATCHPOINT_READ:
|
|
|
|
case GDB_WATCHPOINT_ACCESS:
|
|
|
|
nb_hw_watchpoint--;
|
|
|
|
break;
|
|
|
|
|
|
|
|
default:
|
|
|
|
return -ENOSYS;
|
|
|
|
}
|
|
|
|
hw_debug_points[n] = hw_debug_points[nb_hw_breakpoint + nb_hw_watchpoint];
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
void kvm_arch_remove_all_hw_breakpoints(void)
|
|
|
|
{
|
|
|
|
nb_hw_breakpoint = nb_hw_watchpoint = 0;
|
|
|
|
}
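The hardware debug point bookkeeping above keeps breakpoints and watchpoints in one dense array: new entries are appended at the end, and removal copies the last entry into the freed slot so the array never has holes. A minimal sketch of that append and swap-with-last scheme follows, using a single counter instead of the separate breakpoint and watchpoint counters. `MAX_POINTS`, `struct point`, `points`, `nb_points` and the three functions are hypothetical names. Unlike the original, the sketch checks for duplicates before writing the new slot.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_POINTS 4

struct point {
    uint64_t addr;
    int type;
};

static struct point points[MAX_POINTS];
static int nb_points;

/* Linear search over the used part of the table; -1 if not present. */
static int find_point(uint64_t addr, int type)
{
    assert(nb_points <= MAX_POINTS);
    for (int n = 0; n < nb_points; n++) {
        if (points[n].addr == addr && points[n].type == type) {
            return n;
        }
    }
    return -1;
}

/* Append a new entry, rejecting a full table and duplicates. */
static int insert_point(uint64_t addr, int type)
{
    if (nb_points >= MAX_POINTS) {
        return -ENOBUFS;
    }
    if (find_point(addr, type) >= 0) {
        return -EEXIST;
    }
    points[nb_points].addr = addr;
    points[nb_points].type = type;
    nb_points++;
    return 0;
}

/* Remove by moving the last entry into the freed slot. */
static int remove_point(uint64_t addr, int type)
{
    int n = find_point(addr, type);

    if (n < 0) {
        return -ENOENT;
    }
    nb_points--;
    points[n] = points[nb_points];
    return 0;
}
```

Removal is constant time after the lookup, at the cost of not preserving insertion order, which the lookup by address and type does not depend on.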
|
|
|
|
|
2014-07-14 17:15:38 +08:00
|
|
|
void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg)
|
|
|
|
{
|
2014-07-14 17:15:37 +08:00
|
|
|
int n;
|
|
|
|
|
2014-07-14 17:15:38 +08:00
|
|
|
/* Software Breakpoint updates */
|
|
|
|
if (kvm_sw_breakpoints_active(cs)) {
|
|
|
|
dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP;
|
|
|
|
}
|
2014-07-14 17:15:37 +08:00
|
|
|
|
|
|
|
assert((nb_hw_breakpoint + nb_hw_watchpoint)
|
|
|
|
<= ARRAY_SIZE(hw_debug_points));
|
|
|
|
assert((nb_hw_breakpoint + nb_hw_watchpoint) <= ARRAY_SIZE(dbg->arch.bp));
|
|
|
|
|
|
|
|
if (nb_hw_breakpoint + nb_hw_watchpoint > 0) {
|
|
|
|
dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_HW_BP;
|
|
|
|
memset(dbg->arch.bp, 0, sizeof(dbg->arch.bp));
|
|
|
|
for (n = 0; n < nb_hw_breakpoint + nb_hw_watchpoint; n++) {
|
|
|
|
switch (hw_debug_points[n].type) {
|
|
|
|
case GDB_BREAKPOINT_HW:
|
|
|
|
dbg->arch.bp[n].type = KVMPPC_DEBUG_BREAKPOINT;
|
|
|
|
break;
|
|
|
|
case GDB_WATCHPOINT_WRITE:
|
|
|
|
dbg->arch.bp[n].type = KVMPPC_DEBUG_WATCH_WRITE;
|
|
|
|
break;
|
|
|
|
case GDB_WATCHPOINT_READ:
|
|
|
|
dbg->arch.bp[n].type = KVMPPC_DEBUG_WATCH_READ;
|
|
|
|
break;
|
|
|
|
case GDB_WATCHPOINT_ACCESS:
|
|
|
|
dbg->arch.bp[n].type = KVMPPC_DEBUG_WATCH_WRITE |
|
|
|
|
KVMPPC_DEBUG_WATCH_READ;
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
cpu_abort(cs, "Unsupported breakpoint type\n");
|
|
|
|
}
|
|
|
|
dbg->arch.bp[n].addr = hw_debug_points[n].addr;
|
|
|
|
}
|
|
|
|
}
|
2014-07-14 17:15:38 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
static int kvm_handle_debug(PowerPCCPU *cpu, struct kvm_run *run)
|
|
|
|
{
|
|
|
|
CPUState *cs = CPU(cpu);
|
|
|
|
CPUPPCState *env = &cpu->env;
|
|
|
|
struct kvm_debug_exit_arch *arch_info = &run->debug.arch;
|
|
|
|
int handle = 0;
|
2014-07-14 17:15:37 +08:00
|
|
|
int n;
|
|
|
|
int flag = 0;
|
2014-07-14 17:15:38 +08:00
|
|
|
|
2014-07-14 17:15:37 +08:00
|
|
|
if (cs->singlestep_enabled) {
|
|
|
|
handle = 1;
|
|
|
|
} else if (arch_info->status) {
|
|
|
|
if (nb_hw_breakpoint + nb_hw_watchpoint > 0) {
|
|
|
|
if (arch_info->status & KVMPPC_DEBUG_BREAKPOINT) {
|
|
|
|
n = find_hw_breakpoint(arch_info->address, GDB_BREAKPOINT_HW);
|
|
|
|
if (n >= 0) {
|
|
|
|
handle = 1;
|
|
|
|
}
|
|
|
|
} else if (arch_info->status & (KVMPPC_DEBUG_WATCH_READ |
|
|
|
|
KVMPPC_DEBUG_WATCH_WRITE)) {
|
|
|
|
n = find_hw_watchpoint(arch_info->address, &flag);
|
|
|
|
if (n >= 0) {
|
|
|
|
handle = 1;
|
|
|
|
cs->watchpoint_hit = &hw_watchpoint;
|
|
|
|
hw_watchpoint.vaddr = hw_debug_points[n].addr;
|
|
|
|
hw_watchpoint.flags = flag;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
} else if (kvm_find_sw_breakpoint(cs, arch_info->address)) {
|
2014-07-14 17:15:38 +08:00
|
|
|
handle = 1;
|
|
|
|
} else {
|
|
|
|
/* QEMU is not able to handle debug exception, so inject
|
|
|
|
* program exception to guest;
|
|
|
|
* Yes, a program exception, NOT a debug exception!
|
2014-07-14 17:15:37 +08:00
|
|
|
* When QEMU is using debug resources then debug exception must
|
|
|
|
* be always set. To achieve this we set MSR_DE and also set
|
|
|
|
* MSRP_DEP so guest cannot change MSR_DE.
|
|
|
|
* When emulating debug resource for guest we want guest
|
|
|
|
* to control MSR_DE (enable/disable debug interrupt on need).
|
|
|
|
* Supporting both configurations is NOT possible.
|
|
|
|
* So the result is that we cannot share debug resources
|
|
|
|
* between QEMU and Guest on BOOKE architecture.
|
|
|
|
* In the current design QEMU gets the priority over guest,
|
|
|
|
* this means that if QEMU is using debug resources then guest
|
|
|
|
* cannot use them;
|
2014-07-14 17:15:38 +08:00
|
|
|
* For software breakpoint QEMU uses a privileged instruction;
|
|
|
|
* So there cannot be any reason that we are here for guest
|
|
|
|
* set debug exception, only possibility is guest executed a
|
|
|
|
* privileged / illegal instruction and that's why we are
|
|
|
|
* injecting a program interrupt.
|
|
|
|
*/
|
|
|
|
|
|
|
|
cpu_synchronize_state(cs);
|
|
|
|
/* env->nip is PC, so increment this by 4 to use
|
|
|
|
* ppc_cpu_do_interrupt(), which sets srr0 = env->nip - 4.
|
|
|
|
*/
|
|
|
|
env->nip += 4;
|
|
|
|
cs->exception_index = POWERPC_EXCP_PROGRAM;
|
|
|
|
env->error_code = POWERPC_EXCP_INVAL;
|
|
|
|
ppc_cpu_do_interrupt(cs);
|
|
|
|
}
|
|
|
|
|
|
|
|
return handle;
|
|
|
|
}
|
|
|
|
|
2012-10-31 13:57:49 +08:00
|
|
|
int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
|
2008-12-16 18:43:58 +08:00
|
|
|
{
|
2012-10-31 13:57:49 +08:00
|
|
|
PowerPCCPU *cpu = POWERPC_CPU(cs);
|
|
|
|
CPUPPCState *env = &cpu->env;
|
2011-03-15 19:26:28 +08:00
|
|
|
int ret;
|
2008-12-16 18:43:58 +08:00
|
|
|
|
2015-06-19 00:47:23 +08:00
|
|
|
qemu_mutex_lock_iothread();
|
|
|
|
|
2008-12-16 18:43:58 +08:00
|
|
|
switch (run->exit_reason) {
|
|
|
|
case KVM_EXIT_DCR:
|
|
|
|
if (run->dcr.is_write) {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("handle dcr write\n");
|
2008-12-16 18:43:58 +08:00
|
|
|
ret = kvmppc_handle_dcr_write(env, run->dcr.dcrn, run->dcr.data);
|
|
|
|
} else {
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("handle dcr read\n");
|
2008-12-16 18:43:58 +08:00
|
|
|
ret = kvmppc_handle_dcr_read(env, run->dcr.dcrn, &run->dcr.data);
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
case KVM_EXIT_HLT:
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("handle halt\n");
|
2013-01-18 01:51:17 +08:00
|
|
|
ret = kvmppc_handle_halt(cpu);
|
2008-12-16 18:43:58 +08:00
|
|
|
break;
|
target-ppc: Remove CONFIG_PSERIES dependency in kvm.c
target-ppc/kvm.c has an #ifdef on CONFIG_PSERIES, for the handling of
KVM exits due to a PAPR hypercall from the guest. However, since commit
e4c8b28cde12d01ada8fe869567dc5717a2dfcb7 "ppc: express FDT dependency of
pSeries and e500 boards via default-configs/", this hasn't worked properly.
That patch altered the configuration setup so that although CONFIG_PSERIES
is visible from the Makefiles, it is not visible from C files. This broke
the pseries machine when KVM is in use.
This patch makes a quick and dirty fix, by removing the CONFIG_PSERIES
dependency, replacing it with TARGET_PPC64 (since removing it entirely
leads to type mismatch errors). Technically this breaks the build when
configured with --disable-fdt, since that disables CONFIG_PSERIES on
TARGET_PPC64. However, it turns out the build was already broken in that
case, so this fixes pseries kvm without breaking anything extra. I'm
looking into how to fix that build breakage, but I don't think that need
delay applying this patch.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-03-13 23:53:27 +08:00
|
|
|
#if defined(TARGET_PPC64)
|
2011-08-09 23:57:37 +08:00
|
|
|
case KVM_EXIT_PAPR_HCALL:
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("handle PAPR hypercall\n");
|
2012-10-31 13:57:49 +08:00
|
|
|
run->papr_hcall.ret = spapr_hypercall(cpu,
|
2012-05-03 12:13:14 +08:00
|
|
|
run->papr_hcall.nr,
|
2011-08-09 23:57:37 +08:00
|
|
|
run->papr_hcall.args);
|
2012-08-07 02:44:45 +08:00
|
|
|
ret = 0;
|
2011-08-09 23:57:37 +08:00
|
|
|
break;
|
|
|
|
#endif
|
2013-01-17 18:54:38 +08:00
|
|
|
case KVM_EXIT_EPR:
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("handle epr\n");
|
2014-02-14 16:15:21 +08:00
|
|
|
run->epr.epr = ldl_phys(cs->as, env->mpic_iack);
|
2013-01-17 18:54:38 +08:00
|
|
|
ret = 0;
|
|
|
|
break;
|
2013-02-25 02:16:21 +08:00
|
|
|
case KVM_EXIT_WATCHDOG:
|
2013-07-29 20:16:38 +08:00
|
|
|
DPRINTF("handle watchdog expiry\n");
|
2013-02-25 02:16:21 +08:00
|
|
|
watchdog_perform_action();
|
|
|
|
ret = 0;
|
|
|
|
break;
|
|
|
|
|
2014-07-14 17:15:38 +08:00
|
|
|
case KVM_EXIT_DEBUG:
|
|
|
|
DPRINTF("handle debug exception\n");
|
|
|
|
if (kvm_handle_debug(cpu, run)) {
|
|
|
|
ret = EXCP_DEBUG;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
/* re-enter, this exception was guest-internal */
|
|
|
|
ret = 0;
|
|
|
|
break;
|
|
|
|
|
2011-01-22 04:48:06 +08:00
|
|
|
default:
|
|
|
|
fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
|
|
|
|
ret = -1;
|
|
|
|
break;
|
2008-12-16 18:43:58 +08:00
|
|
|
}
|
|
|
|
|
2015-06-19 00:47:23 +08:00
|
|
|
qemu_mutex_unlock_iothread();
|
2008-12-16 18:43:58 +08:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2013-02-25 02:16:21 +08:00
|
|
|
int kvmppc_or_tsr_bits(PowerPCCPU *cpu, uint32_t tsr_bits)
|
|
|
|
{
|
|
|
|
CPUState *cs = CPU(cpu);
|
|
|
|
uint32_t bits = tsr_bits;
|
|
|
|
struct kvm_one_reg reg = {
|
|
|
|
.id = KVM_REG_PPC_OR_TSR,
|
|
|
|
.addr = (uintptr_t) &bits,
|
|
|
|
};
|
|
|
|
|
|
|
|
return kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
|
|
|
|
}
|
|
|
|
|
|
|
|
int kvmppc_clear_tsr_bits(PowerPCCPU *cpu, uint32_t tsr_bits)
|
|
|
|
{
|
|
|
|
|
|
|
|
CPUState *cs = CPU(cpu);
|
|
|
|
uint32_t bits = tsr_bits;
|
|
|
|
struct kvm_one_reg reg = {
|
|
|
|
.id = KVM_REG_PPC_CLEAR_TSR,
|
|
|
|
.addr = (uintptr_t) &bits,
|
|
|
|
};
|
|
|
|
|
|
|
|
return kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
|
|
|
|
}
|
|
|
|
|
|
|
|
int kvmppc_set_tcr(PowerPCCPU *cpu)
|
|
|
|
{
|
|
|
|
CPUState *cs = CPU(cpu);
|
|
|
|
CPUPPCState *env = &cpu->env;
|
|
|
|
uint32_t tcr = env->spr[SPR_BOOKE_TCR];
|
|
|
|
|
|
|
|
struct kvm_one_reg reg = {
|
|
|
|
.id = KVM_REG_PPC_TCR,
|
|
|
|
.addr = (uintptr_t) &tcr,
|
|
|
|
};
|
|
|
|
|
|
|
|
return kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
|
|
|
|
}
|
|
|
|
|
|
|
|
int kvmppc_booke_watchdog_enable(PowerPCCPU *cpu)
|
|
|
|
{
|
|
|
|
CPUState *cs = CPU(cpu);
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
if (!kvm_enabled()) {
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!cap_ppc_watchdog) {
|
|
|
|
printf("warning: KVM does not support watchdog\n");
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
2014-04-09 23:21:57 +08:00
|
|
|
ret = kvm_vcpu_enable_cap(cs, KVM_CAP_PPC_BOOKE_WATCHDOG, 0);
|
2013-02-25 02:16:21 +08:00
|
|
|
if (ret < 0) {
|
|
|
|
fprintf(stderr, "%s: couldn't enable KVM_CAP_PPC_BOOKE_WATCHDOG: %s\n",
|
|
|
|
__func__, strerror(-ret));
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2010-02-10 00:37:05 +08:00
|
|
|
static int read_cpuinfo(const char *field, char *value, int len)
|
|
|
|
{
|
|
|
|
FILE *f;
|
|
|
|
int ret = -1;
|
|
|
|
int field_len = strlen(field);
|
|
|
|
char line[512];
|
|
|
|
|
|
|
|
f = fopen("/proc/cpuinfo", "r");
|
|
|
|
if (!f) {
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
do {
|
2014-07-09 18:38:37 +08:00
|
|
|
if (!fgets(line, sizeof(line), f)) {
|
2010-02-10 00:37:05 +08:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
if (!strncmp(line, field, field_len)) {
|
2012-10-04 19:09:52 +08:00
|
|
|
pstrcpy(value, len, line);
|
2010-02-10 00:37:05 +08:00
|
|
|
ret = 0;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
} while(*line);
|
|
|
|
|
|
|
|
fclose(f);
|
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
uint32_t kvmppc_get_tbfreq(void)
|
|
|
|
{
|
|
|
|
char line[512];
|
|
|
|
char *ns;
|
2016-03-22 00:02:30 +08:00
|
|
|
uint32_t retval = NANOSECONDS_PER_SECOND;
|
2010-02-10 00:37:05 +08:00
|
|
|
|
|
|
|
if (read_cpuinfo("timebase", line, sizeof(line))) {
|
|
|
|
return retval;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!(ns = strchr(line, ':'))) {
|
|
|
|
return retval;
|
|
|
|
}
|
|
|
|
|
|
|
|
ns++;
|
|
|
|
|
2015-09-25 16:37:58 +08:00
|
|
|
return atoi(ns);
|
2010-02-10 00:37:05 +08:00
|
|
|
}
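The timebase lookup above boils down to finding the `:` separator in a `/proc/cpuinfo` line and converting what follows. A minimal sketch of that step, assuming a line of the form `timebase : 512000000`; the helper name `parse_cpuinfo_value` and the explicit fallback parameter are hypothetical and not part of this file:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: extract the number after the ':' in a
 * /proc/cpuinfo style line. Returns fallback if there is no ':'. */
static uint32_t parse_cpuinfo_value(const char *line, uint32_t fallback)
{
    const char *sep;

    assert(line != NULL);
    sep = strchr(line, ':');
    if (!sep) {
        return fallback;
    }
    /* strtoul() skips the whitespace that follows the colon */
    return (uint32_t)strtoul(sep + 1, NULL, 10);
}
```

Unlike `atoi()`, `strtoul()` has defined behaviour for values above `INT_MAX`, which is the only reason it is used in this sketch.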
|
2010-05-10 16:21:34 +08:00
|
|
|
|
2014-07-09 18:38:37 +08:00
|
|
|
bool kvmppc_get_host_serial(char **value)
|
|
|
|
{
|
|
|
|
return g_file_get_contents("/proc/device-tree/system-id", value, NULL,
|
|
|
|
NULL);
|
|
|
|
}
|
|
|
|
|
|
|
|
bool kvmppc_get_host_model(char **value)
|
|
|
|
{
|
|
|
|
return g_file_get_contents("/proc/device-tree/model", value, NULL, NULL);
|
|
|
|
}
|
|
|
|
|
2011-07-21 08:29:15 +08:00
|
|
|
/* Try to find a device tree node for a CPU with clock-frequency property */
|
|
|
|
static int kvmppc_find_cpu_dt(char *buf, int buf_len)
|
|
|
|
{
|
|
|
|
struct dirent *dirp;
|
|
|
|
DIR *dp;
|
|
|
|
|
|
|
|
if ((dp = opendir(PROC_DEVTREE_CPU)) == NULL) {
|
|
|
|
printf("Can't open directory " PROC_DEVTREE_CPU "\n");
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
buf[0] = '\0';
|
|
|
|
while ((dirp = readdir(dp)) != NULL) {
|
|
|
|
FILE *f;
|
|
|
|
snprintf(buf, buf_len, "%s%s/clock-frequency", PROC_DEVTREE_CPU,
|
|
|
|
dirp->d_name);
|
|
|
|
f = fopen(buf, "r");
|
|
|
|
if (f) {
|
|
|
|
snprintf(buf, buf_len, "%s%s", PROC_DEVTREE_CPU, dirp->d_name);
|
|
|
|
fclose(f);
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
buf[0] = '\0';
|
|
|
|
}
|
|
|
|
closedir(dp);
|
|
|
|
if (buf[0] == '\0') {
|
|
|
|
printf("Unknown host!\n");
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2015-11-14 10:13:07 +08:00
|
|
|
static uint64_t kvmppc_read_int_dt(const char *filename)
|
2011-07-21 08:29:15 +08:00
|
|
|
{
|
2011-10-11 02:31:00 +08:00
|
|
|
union {
|
|
|
|
uint32_t v32;
|
|
|
|
uint64_t v64;
|
|
|
|
} u;
|
2011-07-21 08:29:15 +08:00
|
|
|
FILE *f;
|
|
|
|
int len;
|
|
|
|
|
2015-11-14 10:13:07 +08:00
|
|
|
f = fopen(filename, "rb");
|
2011-07-21 08:29:15 +08:00
|
|
|
if (!f) {
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
2011-10-11 02:31:00 +08:00
|
|
|
len = fread(&u, 1, sizeof(u), f);
|
2011-07-21 08:29:15 +08:00
|
|
|
fclose(f);
|
|
|
|
switch (len) {
|
2011-10-11 02:31:00 +08:00
|
|
|
case 4:
|
|
|
|
/* property is a 32-bit quantity */
|
|
|
|
return be32_to_cpu(u.v32);
|
|
|
|
case 8:
|
|
|
|
return be64_to_cpu(u.v64);
|
2011-07-21 08:29:15 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
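Device tree properties are stored big-endian, and the reader above accepts only 4-byte or 8-byte values. The same decoding can be sketched without the QEMU byte-swap helpers; `decode_be_property` is a hypothetical name used only for illustration, and it works on a buffer rather than a file:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a big-endian integer property of
 * 4 or 8 bytes. Any other length yields 0, mirroring the
 * "format not understood" case of the function above. */
static uint64_t decode_be_property(const uint8_t *buf, size_t len)
{
    uint64_t val = 0;
    size_t i;

    assert(buf != NULL);
    if (len != 4 && len != 8) {
        return 0;
    }
    for (i = 0; i < len; i++) {
        /* most significant byte comes first */
        val = (val << 8) | buf[i];
    }
    return val;
}
```

Assembling the value byte by byte gives the same result on little-endian and big-endian hosts, so no host-endianness check is needed.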
|
|
|
|
|
2015-11-14 10:13:07 +08:00
|
|
|
/* Read a CPU node property from the host device tree that's a single
|
|
|
|
* integer (32-bit or 64-bit). Returns -1 if the CPU node or the
|
|
|
|
* property can't be found or opened, and 0 if the property's
|
|
|
|
* length is not understood */
|
|
|
|
static uint64_t kvmppc_read_int_cpu_dt(const char *propname)
|
|
|
|
{
|
|
|
|
char buf[PATH_MAX], *tmp;
|
|
|
|
uint64_t val;
|
|
|
|
|
|
|
|
if (kvmppc_find_cpu_dt(buf, sizeof(buf))) {
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
tmp = g_strdup_printf("%s/%s", buf, propname);
|
|
|
|
val = kvmppc_read_int_dt(tmp);
|
|
|
|
g_free(tmp);
|
|
|
|
|
|
|
|
return val;
|
|
|
|
}
|
|
|
|
|
2011-10-11 02:31:00 +08:00
|
|
|
uint64_t kvmppc_get_clockfreq(void)
|
|
|
|
{
|
|
|
|
return kvmppc_read_int_cpu_dt("clock-frequency");
|
|
|
|
}
|
|
|
|
|
pseries: Add device tree properties for VMX/VSX and DFP under kvm
Sufficiently recent PAPR specifications define properties "ibm,vmx"
and "ibm,dfp" on the CPU node which advertise whether the VMX vector
extensions (or the later VSX version) and/or the Decimal Floating
Point operations from IBM's recent POWER CPUs are available.
Currently we do not put these in the guest device tree and the guest
kernel will consequently assume they are not available. This is good,
because they are not supported under TCG. VMX is similar enough to
Altivec that it might be trivial to support, but VSX and DFP would
both require significant work to support in TCG.
However, when running under kvm on a host which supports these
instructions, there's no reason not to let the guest use them. This
patch, therefore, checks for the relevant support on the host CPU
and, if present, advertises them to the guest as well.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-10-11 02:31:01 +08:00
|
|
|
uint32_t kvmppc_get_vmx(void)
|
|
|
|
{
|
|
|
|
return kvmppc_read_int_cpu_dt("ibm,vmx");
|
|
|
|
}
|
|
|
|
|
|
|
|
uint32_t kvmppc_get_dfp(void)
|
|
|
|
{
|
|
|
|
return kvmppc_read_int_cpu_dt("ibm,dfp");
|
|
|
|
}
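The commit note above says the host's "ibm,vmx" and "ibm,dfp" values decide what is advertised to the guest. Later in this file the values are interpreted as: `vmx > 0` enables Altivec, `vmx > 1` also enables VSX, a nonzero `dfp` enables DFP, and -1 means "unknown, leave the defaults alone". A minimal sketch of that mapping follows; the `FEAT_*` bits and the function name are hypothetical stand-ins, not the real `PPC_ALTIVEC`, `PPC2_VSX` and `PPC2_DFP` flags:

```c
#include <stdint.h>

/* Hypothetical feature bits, for illustration only */
#define FEAT_ALTIVEC (1u << 0)
#define FEAT_VSX     (1u << 1)
#define FEAT_DFP     (1u << 2)

/* Set or clear one flag, like alter_insns() in this file */
static uint32_t set_flag(uint32_t word, uint32_t flag, int on)
{
    return on ? (word | flag) : (word & ~flag);
}

/* Apply the host values to a default feature word. A value of
 * UINT32_MAX (-1) means the property could not be read, so the
 * corresponding defaults stay untouched. */
static uint32_t apply_host_features(uint32_t feats, uint32_t vmx, uint32_t dfp)
{
    if (vmx != UINT32_MAX) {
        feats = set_flag(feats, FEAT_ALTIVEC, vmx > 0);
        feats = set_flag(feats, FEAT_VSX, vmx > 1);
    }
    if (dfp != UINT32_MAX) {
        feats = set_flag(feats, FEAT_DFP, dfp != 0);
    }
    return feats;
}
```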
|
|
|
|
|
2013-01-03 20:37:02 +08:00
|
|
|
static int kvmppc_get_pvinfo(CPUPPCState *env, struct kvm_ppc_pvinfo *pvinfo)
|
|
|
|
{
|
|
|
|
PowerPCCPU *cpu = ppc_env_get_cpu(env);
|
|
|
|
CPUState *cs = CPU(cpu);
|
|
|
|
|
2014-07-15 01:17:35 +08:00
|
|
|
if (kvm_vm_check_extension(cs->kvm_state, KVM_CAP_PPC_GET_PVINFO) &&
|
2013-01-03 20:37:02 +08:00
|
|
|
!kvm_vm_ioctl(cs->kvm_state, KVM_PPC_GET_PVINFO, pvinfo)) {
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
|
|
|
int kvmppc_get_hasidle(CPUPPCState *env)
|
|
|
|
{
|
|
|
|
struct kvm_ppc_pvinfo pvinfo;
|
|
|
|
|
|
|
|
if (!kvmppc_get_pvinfo(env, &pvinfo) &&
|
|
|
|
(pvinfo.flags & KVM_PPC_PVINFO_FLAGS_EV_IDLE)) {
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2012-03-14 08:38:22 +08:00
|
|
|
int kvmppc_get_hypercall(CPUPPCState *env, uint8_t *buf, int buf_len)
|
2010-08-03 21:22:42 +08:00
|
|
|
{
|
|
|
|
uint32_t *hc = (uint32_t*)buf;
|
|
|
|
struct kvm_ppc_pvinfo pvinfo;
|
|
|
|
|
2013-01-03 20:37:02 +08:00
|
|
|
if (!kvmppc_get_pvinfo(env, &pvinfo)) {
|
2010-08-03 21:22:42 +08:00
|
|
|
memcpy(buf, pvinfo.hcall, buf_len);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2014-06-11 18:19:03 +08:00
|
|
|
* Fallback to always fail hypercalls regardless of endianness:
|
2010-08-03 21:22:42 +08:00
|
|
|
*
|
2014-06-11 18:19:03 +08:00
|
|
|
* tdi 0,r0,72 (becomes b .+8 in wrong endian, nop in good endian)
|
2010-08-03 21:22:42 +08:00
|
|
|
* li r3, -1
|
2014-06-11 18:19:03 +08:00
|
|
|
* b .+8 (becomes nop in wrong endian)
|
|
|
|
* bswap32(li r3, -1)
|
2010-08-03 21:22:42 +08:00
|
|
|
*/
|
|
|
|
|
2014-06-11 18:19:03 +08:00
|
|
|
hc[0] = cpu_to_be32(0x08000048);
|
|
|
|
hc[1] = cpu_to_be32(0x3860ffff);
|
|
|
|
hc[2] = cpu_to_be32(0x48000008);
|
|
|
|
hc[3] = cpu_to_be32(bswap32(0x3860ffff));
|
2010-08-03 21:22:42 +08:00
|
|
|
|
2016-03-21 10:14:02 +08:00
|
|
|
return 1;
|
2010-08-03 21:22:42 +08:00
|
|
|
}
|
|
|
|
|
2015-05-07 13:33:59 +08:00
|
|
|
static inline int kvmppc_enable_hcall(KVMState *s, target_ulong hcall)
|
|
|
|
{
|
|
|
|
return kvm_vm_enable_cap(s, KVM_CAP_PPC_ENABLE_HCALL, 0, hcall, 1);
|
|
|
|
}
|
|
|
|
|
|
|
|
void kvmppc_enable_logical_ci_hcalls(void)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* FIXME: it would be nice if we could detect the cases where
|
|
|
|
* we're using a device which requires the in kernel
|
|
|
|
* implementation of these hcalls, but the kernel lacks them and
|
|
|
|
* produce a warning.
|
|
|
|
*/
|
|
|
|
kvmppc_enable_hcall(kvm_state, H_LOGICAL_CI_LOAD);
|
|
|
|
kvmppc_enable_hcall(kvm_state, H_LOGICAL_CI_STORE);
|
|
|
|
}
|
|
|
|
|
2015-09-08 09:25:13 +08:00
|
|
|
void kvmppc_enable_set_mode_hcall(void)
|
|
|
|
{
|
|
|
|
kvmppc_enable_hcall(kvm_state, H_SET_MODE);
|
|
|
|
}
|
|
|
|
|
2016-08-30 09:02:47 +08:00
|
|
|
void kvmppc_enable_clear_ref_mod_hcalls(void)
|
|
|
|
{
|
|
|
|
kvmppc_enable_hcall(kvm_state, H_CLEAR_REF);
|
|
|
|
kvmppc_enable_hcall(kvm_state, H_CLEAR_MOD);
|
|
|
|
}
|
|
|
|
|
2012-10-31 13:06:49 +08:00
|
|
|
void kvmppc_set_papr(PowerPCCPU *cpu)
|
2011-08-09 23:57:37 +08:00
|
|
|
{
|
2012-10-31 13:06:49 +08:00
|
|
|
CPUState *cs = CPU(cpu);
|
2011-08-09 23:57:37 +08:00
|
|
|
int ret;
|
|
|
|
|
2014-04-09 23:21:57 +08:00
|
|
|
ret = kvm_vcpu_enable_cap(cs, KVM_CAP_PPC_PAPR, 0);
|
2011-08-09 23:57:37 +08:00
|
|
|
if (ret) {
|
2016-02-19 05:01:38 +08:00
|
|
|
error_report("This vCPU type or KVM version does not support PAPR");
|
|
|
|
exit(1);
|
2011-09-15 03:38:45 +08:00
|
|
|
}
|
2013-04-08 03:08:22 +08:00
|
|
|
|
|
|
|
/* Update the capability flag so we sync the right information
|
|
|
|
* with kvm */
|
|
|
|
cap_papr = 1;
|
2011-08-09 23:57:37 +08:00
|
|
|
}
|
|
|
|
|
2014-05-23 10:26:58 +08:00
|
|
|
int kvmppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version)
|
|
|
|
{
|
|
|
|
return kvm_set_one_reg(CPU(cpu), KVM_REG_PPC_ARCH_COMPAT, &cpu_version);
|
|
|
|
}
|
|
|
|
|
2013-01-17 18:54:38 +08:00
|
|
|
void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int mpic_proxy)
|
|
|
|
{
|
|
|
|
CPUState *cs = CPU(cpu);
|
|
|
|
int ret;
|
|
|
|
|
2014-04-09 23:21:57 +08:00
|
|
|
ret = kvm_vcpu_enable_cap(cs, KVM_CAP_PPC_EPR, 0, mpic_proxy);
|
2013-01-17 18:54:38 +08:00
|
|
|
if (ret && mpic_proxy) {
|
2016-02-19 05:01:38 +08:00
|
|
|
error_report("This KVM version does not support EPR");
|
|
|
|
exit(1);
|
2013-01-17 18:54:38 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2011-09-30 05:39:10 +08:00
|
|
|
int kvmppc_smt_threads(void)
|
|
|
|
{
|
|
|
|
return cap_ppc_smt ? cap_ppc_smt : 1;
|
|
|
|
}
|
|
|
|
|
2012-09-13 00:57:12 +08:00
|
|
|
#ifdef TARGET_PPC64
|
2014-07-10 23:03:41 +08:00
|
|
|
off_t kvmppc_alloc_rma(void **rma)
|
2011-09-30 05:39:11 +08:00
|
|
|
{
|
|
|
|
off_t size;
|
|
|
|
int fd;
|
|
|
|
struct kvm_allocate_rma ret;
|
|
|
|
|
|
|
|
/* If cap_ppc_rma == 0, contiguous RMA allocation is not supported
|
|
|
|
* if cap_ppc_rma == 1, contiguous RMA allocation is supported, but
|
|
|
|
* not necessary on this hardware
|
|
|
|
* if cap_ppc_rma == 2, contiguous RMA allocation is needed on this hardware
|
|
|
|
*
|
|
|
|
* FIXME: We should allow the user to force contiguous RMA
|
|
|
|
* allocation in the cap_ppc_rma==1 case.
|
|
|
|
*/
|
|
|
|
if (cap_ppc_rma < 2) {
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
fd = kvm_vm_ioctl(kvm_state, KVM_ALLOCATE_RMA, &ret);
|
|
|
|
if (fd < 0) {
|
|
|
|
fprintf(stderr, "KVM: Error on KVM_ALLOCATE_RMA: %s\n",
|
|
|
|
strerror(errno));
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
size = MIN(ret.rma_size, 256ul << 20);
|
|
|
|
|
2014-07-10 23:03:41 +08:00
|
|
|
*rma = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
|
|
|
|
if (*rma == MAP_FAILED) {
|
2011-09-30 05:39:11 +08:00
|
|
|
fprintf(stderr, "KVM: Error mapping RMA: %s\n", strerror(errno));
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
return size;
|
|
|
|
}
|
|
|
|
|
2012-09-13 00:57:12 +08:00
|
|
|
uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift)
|
|
|
|
{
|
2013-04-08 03:08:18 +08:00
|
|
|
struct kvm_ppc_smmu_info info;
|
|
|
|
long rampagesize, best_page_shift;
|
|
|
|
int i;
|
|
|
|
|
2012-09-13 00:57:12 +08:00
|
|
|
if (cap_ppc_rma >= 2) {
|
|
|
|
return current_size;
|
|
|
|
}
|
2013-04-08 03:08:18 +08:00
|
|
|
|
|
|
|
/* Find the largest hardware supported page size that's less than
|
|
|
|
* or equal to the (logical) backing page size of guest RAM */
|
2013-05-30 04:29:20 +08:00
|
|
|
kvm_get_smmu_info(POWERPC_CPU(first_cpu), &info);
|
2013-04-08 03:08:18 +08:00
|
|
|
rampagesize = getrampagesize();
|
|
|
|
best_page_shift = 0;
|
|
|
|
|
|
|
|
for (i = 0; i < KVM_PPC_PAGE_SIZES_MAX_SZ; i++) {
|
|
|
|
struct kvm_ppc_one_seg_page_size *sps = &info.sps[i];
|
|
|
|
|
|
|
|
if (!sps->page_shift) {
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
if ((sps->page_shift > best_page_shift)
|
|
|
|
&& ((1UL << sps->page_shift) <= rampagesize)) {
|
|
|
|
best_page_shift = sps->page_shift;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2012-09-13 00:57:12 +08:00
|
|
|
return MIN(current_size,
|
2013-04-08 03:08:18 +08:00
|
|
|
1ULL << (best_page_shift + hash_shift - 7));
|
2012-09-13 00:57:12 +08:00
|
|
|
}
|
|
|
|
#endif
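The limit computed at the end of `kvmppc_rma_size()` is `min(current_size, 2^(best_page_shift + hash_shift - 7))`. A worked sketch of just that arithmetic, with a hypothetical function name and the page shift passed in directly instead of being queried from KVM:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-alone version of the final MIN() above */
static uint64_t rma_limit(uint64_t current_size, unsigned best_page_shift,
                          unsigned hash_shift)
{
    unsigned shift = best_page_shift + hash_shift;
    uint64_t cap;

    /* keep the shift amount within the range of a 64-bit value */
    assert(shift >= 7 && shift - 7 < 64);
    cap = 1ULL << (shift - 7);
    return current_size < cap ? current_size : cap;
}
```

For example, 4 KiB backing pages (shift 12) with a 16 MiB hash table (shift 24) give a cap of 2^29 bytes, that is 512 MiB, while 16 MiB pages (shift 24) with the same hash table raise the cap to 2^41 bytes.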
|
|
|
|
|
2014-05-27 13:36:30 +08:00
|
|
|
bool kvmppc_spapr_use_multitce(void)
|
|
|
|
{
|
|
|
|
return cap_spapr_multitce;
|
|
|
|
}
|
|
|
|
|
2014-06-10 13:39:21 +08:00
|
|
|
void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t window_size, int *pfd,
|
2015-09-30 11:42:55 +08:00
|
|
|
bool need_vfio)
|
2011-09-30 05:39:12 +08:00
|
|
|
{
|
|
|
|
struct kvm_create_spapr_tce args = {
|
|
|
|
.liobn = liobn,
|
|
|
|
.window_size = window_size,
|
|
|
|
};
|
|
|
|
long len;
|
|
|
|
int fd;
|
|
|
|
void *table;
|
|
|
|
|
pseries: Don't try to munmap() a malloc()ed TCE table
For the pseries machine, TCE (IOMMU) tables can either be directly
malloc()ed in qemu or, when running on a KVM which supports it, mmap()ed
from a KVM ioctl. The latter option is used when available, because it
allows the (frequent bottleneck) H_PUT_TCE hypercall to be KVM accelerated.
However, even when KVM is present, TCE acceleration is not always possible.
Only KVM HV supports this ioctl(), not KVM PR, or the kernel could run out
of contiguous memory to allocate the new table. In this case we need to
fall back on the malloc()ed table.
When a device is removed, and we need to remove the TCE table, we need to
either munmap() or free() the table as appropriate for how it was
allocated. The code is supposed to do that, but we buggily fail to
initialize the tcet->fd variable in the malloc() case, which is used as a
flag to determine which is the right choice.
This patch fixes the bug, and cleans up error messages relating to this
path while we're at it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-02-28 01:18:07 +08:00
|
|
|
/* Must set fd to -1 so we don't try to munmap when called for
|
|
|
|
* destroying the table, which the upper layers -will- do
|
|
|
|
*/
|
|
|
|
*pfd = -1;
|
2015-09-30 11:42:55 +08:00
|
|
|
if (!cap_spapr_tce || (need_vfio && !cap_spapr_vfio)) {
|
2011-09-30 05:39:12 +08:00
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
fd = kvm_vm_ioctl(kvm_state, KVM_CREATE_SPAPR_TCE, &args);
|
|
|
|
if (fd < 0) {
|
2012-02-28 01:18:07 +08:00
|
|
|
fprintf(stderr, "KVM: Failed to create TCE table for liobn 0x%x\n",
|
|
|
|
liobn);
|
2011-09-30 05:39:12 +08:00
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2013-07-19 03:32:58 +08:00
|
|
|
len = (window_size / SPAPR_TCE_PAGE_SIZE) * sizeof(uint64_t);
|
2011-09-30 05:39:12 +08:00
|
|
|
/* FIXME: round this up to page size */
|
|
|
|
|
2011-10-27 23:56:31 +08:00
|
|
|
table = mmap(NULL, len, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
|
2011-09-30 05:39:12 +08:00
|
|
|
if (table == MAP_FAILED) {
|
pseries: Don't try to munmap() a malloc()ed TCE table
For the pseries machine, TCE (IOMMU) tables can either be directly
malloc()ed in qemu or, when running on a KVM which supports it, mmap()ed
from a KVM ioctl. The latter option is used when available, because it
allows the (frequent bottlenext) H_PUT_TCE hypercall to be KVM accelerated.
However, even when KVM is persent, TCE acceleration is not always possible.
Only KVM HV supports this ioctl(), not KVM PR, or the kernel could run out
of contiguous memory to allocate the new table. In this case we need to
fall back on the malloc()ed table.
When a device is removed, and we need to remove the TCE table, we need to
either munmap() or free() the table as appropriate for how it was
allocated. The code is supposed to do that, but we buggily fail to
initialize the tcet->fd variable in the malloc() case, which is used as a
flag to determine which is the right choice.
This patch fixes the bug, and cleans up error messages relating to this
path while we're at it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-02-28 01:18:07 +08:00
|
|
|
fprintf(stderr, "KVM: Failed to map TCE table for liobn 0x%x\n",
|
|
|
|
liobn);
|
2011-09-30 05:39:12 +08:00
|
|
|
close(fd);
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
*pfd = fd;
|
|
|
|
return table;
|
|
|
|
}
|
|
|
|
|
2014-05-27 13:36:35 +08:00
|
|
|
int kvmppc_remove_spapr_tce(void *table, int fd, uint32_t nb_table)
|
2011-09-30 05:39:12 +08:00
|
|
|
{
|
|
|
|
long len;
|
|
|
|
|
|
|
|
if (fd < 0) {
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
2014-05-27 13:36:35 +08:00
|
|
|
len = nb_table * sizeof(uint64_t);
|
2011-09-30 05:39:12 +08:00
|
|
|
if ((munmap(table, len) < 0) ||
|
|
|
|
(close(fd) < 0)) {
|
pseries: Don't try to munmap() a malloc()ed TCE table
For the pseries machine, TCE (IOMMU) tables can either be directly
malloc()ed in qemu or, when running on a KVM which supports it, mmap()ed
from a KVM ioctl. The latter option is used when available, because it
allows the (frequent bottlenext) H_PUT_TCE hypercall to be KVM accelerated.
However, even when KVM is persent, TCE acceleration is not always possible.
Only KVM HV supports this ioctl(), not KVM PR, or the kernel could run out
of contiguous memory to allocate the new table. In this case we need to
fall back on the malloc()ed table.
When a device is removed, and we need to remove the TCE table, we need to
either munmap() or free() the table as appropriate for how it was
allocated. The code is supposed to do that, but we buggily fail to
initialize the tcet->fd variable in the malloc() case, which is used as a
flag to determine which is the right choice.
This patch fixes the bug, and cleans up error messages relating to this
path while we're at it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-02-28 01:18:07 +08:00
|
|
|
fprintf(stderr, "KVM: Unexpected error removing TCE table: %s\n",
|
|
|
|
strerror(errno));
|
2011-09-30 05:39:12 +08:00
|
|
|
/* Leak the table */
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
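The two TCE functions above rely on the file descriptor doubling as a flag: `-1` means the table was not mapped from KVM, so the caller must `free()` it rather than `munmap()` it. A minimal sketch of that convention, assuming a hypothetical `struct tce_table` and helper names that do not exist in QEMU:

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one TCE table */
struct tce_table {
    void *table;
    size_t len;
    int fd;      /* >= 0: mmap()ed from KVM, -1: malloc()ed */
};

/* Fallback allocation; fd is set to -1 so release picks free() */
static int tce_table_alloc_fallback(struct tce_table *t, size_t len)
{
    assert(t != NULL);
    t->table = calloc(1, len);
    t->len = len;
    t->fd = -1;
    return t->table ? 0 : -1;
}

/* Returns 1 if the table was unmapped, 0 if it was freed */
static int tce_table_release(struct tce_table *t)
{
    int was_mapped;

    assert(t != NULL);
    was_mapped = t->fd >= 0;
    if (was_mapped) {
        munmap(t->table, t->len);
        close(t->fd);
    } else {
        free(t->table);
    }
    t->table = NULL;
    t->fd = -1;
    return was_mapped;
}
```

Initializing `fd` in every allocation path is the point of the sketch: leaving it unset in the fallback path is the kind of bug the commit note describes.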
|
|
|
|
|
2012-09-13 00:57:12 +08:00
|
|
|
int kvmppc_reset_htab(int shift_hint)
|
|
|
|
{
|
|
|
|
uint32_t shift = shift_hint;
|
|
|
|
|
2012-09-20 05:08:42 +08:00
|
|
|
if (!kvm_enabled()) {
|
|
|
|
/* Full emulation, tell caller to allocate htab itself */
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
if (kvm_check_extension(kvm_state, KVM_CAP_PPC_ALLOC_HTAB)) {
|
2012-09-13 00:57:12 +08:00
|
|
|
int ret;
|
|
|
|
ret = kvm_vm_ioctl(kvm_state, KVM_PPC_ALLOCATE_HTAB, &shift);
|
2012-09-20 05:08:42 +08:00
|
|
|
if (ret == -ENOTTY) {
|
|
|
|
/* At least some versions of PR KVM advertise the
|
|
|
|
* capability, but don't implement the ioctl(). Oops.
|
|
|
|
* Return 0 so that we allocate the htab in qemu, as is
|
|
|
|
* correct for PR. */
|
|
|
|
return 0;
|
|
|
|
} else if (ret < 0) {
|
2012-09-13 00:57:12 +08:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
return shift;
|
|
|
|
}
|
|
|
|
|
2012-09-20 05:08:42 +08:00
|
|
|
/* We have a kernel that predates the htab reset calls. For PR
|
|
|
|
* KVM, we need to allocate the htab ourselves, for an HV KVM of
|
2016-09-29 18:48:06 +08:00
|
|
|
* this era, it has allocated a 16MB fixed size hash table already. */
|
|
|
|
if (kvmppc_is_pr(kvm_state)) {
|
2012-09-20 05:08:42 +08:00
|
|
|
/* PR - tell caller to allocate htab */
|
|
|
|
return 0;
|
|
|
|
} else {
|
|
|
|
/* HV - assume 16MB kernel allocated htab */
|
|
|
|
return 24;
|
|
|
|
}
|
2012-09-13 00:57:12 +08:00
|
|
|
}
|
|
|
|
|
2011-10-13 06:40:32 +08:00
|
|
|
static inline uint32_t mfpvr(void)
|
|
|
|
{
|
|
|
|
uint32_t pvr;
|
|
|
|
|
|
|
|
asm ("mfpvr %0"
|
|
|
|
: "=r"(pvr));
|
|
|
|
return pvr;
|
|
|
|
}
|
|
|
|
|
2011-10-18 02:15:41 +08:00
|
|
|
static void alter_insns(uint64_t *word, uint64_t flags, bool on)
|
|
|
|
{
|
|
|
|
if (on) {
|
|
|
|
*word |= flags;
|
|
|
|
} else {
|
|
|
|
*word &= ~flags;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-01-06 16:31:30 +08:00
|
|
|
static void kvmppc_host_cpu_initfn(Object *obj)
|
2011-10-13 06:40:32 +08:00
|
|
|
{
|
2013-01-06 16:31:30 +08:00
|
|
|
assert(kvm_enabled());
|
|
|
|
}
|
|
|
|
|
|
|
|
static void kvmppc_host_cpu_class_init(ObjectClass *oc, void *data)
|
|
|
|
{
|
qdev: Protect device-list-properties against broken devices
Several devices don't survive object_unref(object_new(T)): they crash
or hang during cleanup, or they leave dangling pointers behind.
This breaks at least device-list-properties, because
qmp_device_list_properties() needs to create a device to find its
properties. Broken in commit f4eb32b "qmp: show QOM properties in
device-list-properties", v2.1. Example reproducer:
$ qemu-system-aarch64 -nodefaults -display none -machine none -S -qmp stdio
{"QMP": {"version": {"qemu": {"micro": 50, "minor": 4, "major": 2}, "package": ""}, "capabilities": []}}
{ "execute": "qmp_capabilities" }
{"return": {}}
{ "execute": "device-list-properties", "arguments": { "typename": "pxa2xx-pcmcia" } }
qemu-system-aarch64: /home/armbru/work/qemu/memory.c:1307: memory_region_finalize: Assertion `((&mr->subregions)->tqh_first == ((void *)0))' failed.
Aborted (core dumped)
[Exit 134 (SIGABRT)]
Unfortunately, I can't fix the problems in these devices right now.
Instead, add DeviceClass member cannot_destroy_with_object_finalize_yet
to mark them:
* Hang during cleanup (didn't debug, so I can't say why):
"realview_pci", "versatile_pci".
* Dangling pointer in cpus: most CPUs, plus "allwinner-a10", "digic",
"fsl,imx25", "fsl,imx31", "xlnx,zynqmp", because they create such
CPUs
* Assert kvm_enabled(): "host-x86_64-cpu", "host-i386-cpu",
"host-powerpc64-cpu", "host-embedded-powerpc-cpu",
"host-powerpc-cpu" (the powerpc ones can't currently reach the
assertion, because the CPUs are only registered when KVM is enabled,
but the assertion is arguably in the wrong place all the same)
Make qmp_device_list_properties() fail cleanly when the device is so
marked. This improves device-list-properties from "crashes, hangs or
leaves dangling pointers behind" to "fails". Not a complete fix, just
a better-than-nothing work-around. In the above reproducer,
device-list-properties now fails with "Can't list properties of device
'pxa2xx-pcmcia'".
This also protects -device FOO,help, which uses the same machinery
since commit ef52358 "qdev-monitor: include QOM properties in -device
FOO, help output", v2.2. Example reproducer:
$ qemu-system-aarch64 -machine none -device pxa2xx-pcmcia,help
Before:
qemu-system-aarch64: .../memory.c:1307: memory_region_finalize: Assertion `((&mr->subregions)->tqh_first == ((void *)0))' failed.
After:
Can't list properties of device 'pxa2xx-pcmcia'
Cc: "Andreas Färber" <afaerber@suse.de>
Cc: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Anthony Green <green@moxielogic.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Jia Liu <proljc@gmail.com>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Walle <michael@walle.cc>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: qemu-ppc@nongnu.org
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1443689999-12182-10-git-send-email-armbru@redhat.com>
2015-10-01 16:59:58 +08:00
|
|
|
DeviceClass *dc = DEVICE_CLASS(oc);
|
2013-01-06 16:31:30 +08:00
|
|
|
PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
|
2011-10-18 02:15:41 +08:00
|
|
|
uint32_t vmx = kvmppc_get_vmx();
|
|
|
|
uint32_t dfp = kvmppc_get_dfp();
|
2013-04-08 03:08:19 +08:00
|
|
|
uint32_t dcache_size = kvmppc_read_int_cpu_dt("d-cache-size");
|
|
|
|
uint32_t icache_size = kvmppc_read_int_cpu_dt("i-cache-size");
|
2011-10-13 06:40:32 +08:00
|
|
|
|
2013-02-18 07:16:41 +08:00
|
|
|
/* Now fix up the class with information we can query from the host */
|
2013-09-27 16:05:03 +08:00
|
|
|
pcc->pvr = mfpvr();
|
2011-10-18 02:15:41 +08:00
|
|
|
|
2011-10-25 02:43:22 +08:00
|
|
|
if (vmx != -1) {
|
|
|
|
/* Only override when we know what the host supports */
|
2013-02-18 07:16:41 +08:00
|
|
|
alter_insns(&pcc->insns_flags, PPC_ALTIVEC, vmx > 0);
|
|
|
|
alter_insns(&pcc->insns_flags2, PPC2_VSX, vmx > 1);
|
2011-10-25 02:43:22 +08:00
|
|
|
}
|
|
|
|
if (dfp != -1) {
|
|
|
|
/* Only override when we know what the host supports */
|
2013-02-18 07:16:41 +08:00
|
|
|
alter_insns(&pcc->insns_flags2, PPC2_DFP, dfp);
|
2011-10-25 02:43:22 +08:00
|
|
|
}
|
2013-04-08 03:08:19 +08:00
|
|
|
|
|
|
|
if (dcache_size != -1) {
|
|
|
|
pcc->l1_dcache_size = dcache_size;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (icache_size != -1) {
|
|
|
|
pcc->l1_icache_size = icache_size;
|
|
|
|
}
|
2015-10-01 16:59:58 +08:00
|
|
|
|
|
|
|
/* Reason: kvmppc_host_cpu_initfn() dies when !kvm_enabled() */
|
|
|
|
dc->cannot_destroy_with_object_finalize_yet = true;
|
2011-10-13 06:40:32 +08:00
|
|
|
}
|
|
|
|
|
2013-03-30 14:40:49 +08:00
|
|
|
bool kvmppc_has_cap_epr(void)
|
|
|
|
{
|
|
|
|
return cap_epr;
|
|
|
|
}
|
|
|
|
|
2014-02-21 01:52:24 +08:00
|
|
|
bool kvmppc_has_cap_htab_fd(void)
|
|
|
|
{
|
|
|
|
return cap_htab_fd;
|
|
|
|
}
|
|
|
|
|
2014-06-04 18:14:08 +08:00
|
|
|
bool kvmppc_has_cap_fixup_hcalls(void)
|
|
|
|
{
|
|
|
|
return cap_fixup_hcalls;
|
|
|
|
}
|
|
|
|
|
2016-09-28 19:16:30 +08:00
|
|
|
bool kvmppc_has_cap_htm(void)
|
|
|
|
{
|
|
|
|
return cap_htm;
|
|
|
|
}
|
|
|
|
|
2014-04-12 01:34:25 +08:00
|
|
|
static PowerPCCPUClass *ppc_cpu_get_family_class(PowerPCCPUClass *pcc)
|
|
|
|
{
|
|
|
|
ObjectClass *oc = OBJECT_CLASS(pcc);
|
|
|
|
|
|
|
|
while (oc && !object_class_is_abstract(oc)) {
|
|
|
|
oc = object_class_get_parent(oc);
|
|
|
|
}
|
|
|
|
assert(oc);
|
|
|
|
|
|
|
|
return POWERPC_CPU_CLASS(oc);
|
|
|
|
}
|
|
|
|
|
2016-06-07 23:39:38 +08:00
|
|
|
PowerPCCPUClass *kvm_ppc_get_host_cpu_class(void)
|
|
|
|
{
|
|
|
|
uint32_t host_pvr = mfpvr();
|
|
|
|
PowerPCCPUClass *pvr_pcc;
|
|
|
|
|
|
|
|
pvr_pcc = ppc_cpu_class_by_pvr(host_pvr);
|
|
|
|
if (pvr_pcc == NULL) {
|
|
|
|
pvr_pcc = ppc_cpu_class_by_pvr_mask(host_pvr);
|
|
|
|
}
|
|
|
|
|
|
|
|
return pvr_pcc;
|
|
|
|
}
|
|
|
|
|
2013-02-23 19:22:12 +08:00
|
|
|
static int kvm_ppc_register_host_cpu_type(void)
|
|
|
|
{
|
|
|
|
TypeInfo type_info = {
|
|
|
|
.name = TYPE_HOST_POWERPC_CPU,
|
|
|
|
.instance_init = kvmppc_host_cpu_initfn,
|
|
|
|
.class_init = kvmppc_host_cpu_class_init,
|
|
|
|
};
|
|
|
|
PowerPCCPUClass *pvr_pcc;
|
2014-04-12 01:34:25 +08:00
|
|
|
DeviceClass *dc;
|
2013-02-23 19:22:12 +08:00
|
|
|
|
2016-06-07 23:39:38 +08:00
|
|
|
pvr_pcc = kvm_ppc_get_host_cpu_class();
|
2013-02-23 19:22:12 +08:00
|
|
|
if (pvr_pcc == NULL) {
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
type_info.parent = object_class_get_name(OBJECT_CLASS(pvr_pcc));
|
|
|
|
type_register(&type_info);
|
2014-04-12 01:34:25 +08:00
|
|
|
|
2016-08-10 01:00:00 +08:00
|
|
|
/* Register generic family CPU class for a family */
|
|
|
|
pvr_pcc = ppc_cpu_get_family_class(pvr_pcc);
|
|
|
|
dc = DEVICE_CLASS(pvr_pcc);
|
|
|
|
type_info.parent = object_class_get_name(OBJECT_CLASS(pvr_pcc));
|
|
|
|
type_info.name = g_strdup_printf("%s-"TYPE_POWERPC_CPU, dc->desc);
|
|
|
|
type_register(&type_info);
g_free((void *)type_info.name);
|
|
|
|
|
2016-06-10 08:59:01 +08:00
|
|
|
#if defined(TARGET_PPC64)
|
|
|
|
type_info.name = g_strdup_printf("%s-"TYPE_SPAPR_CPU_CORE, "host");
|
|
|
|
type_info.parent = TYPE_SPAPR_CPU_CORE;
|
2016-09-12 15:57:20 +08:00
|
|
|
type_info.instance_size = sizeof(sPAPRCPUCore);
|
|
|
|
type_info.instance_init = NULL;
|
|
|
|
type_info.class_init = spapr_cpu_core_class_init;
|
|
|
|
type_info.class_data = (void *) "host";
|
2016-06-10 08:59:01 +08:00
|
|
|
type_register(&type_info);
|
|
|
|
g_free((void *)type_info.name);
|
2016-08-10 01:00:01 +08:00
|
|
|
|
|
|
|
/* Register generic spapr CPU family class for current host CPU type */
|
|
|
|
type_info.name = g_strdup_printf("%s-"TYPE_SPAPR_CPU_CORE, dc->desc);
|
2016-09-12 15:57:20 +08:00
|
|
|
type_info.class_data = (void *) dc->desc;
|
2016-08-10 01:00:01 +08:00
|
|
|
type_register(&type_info);
|
|
|
|
g_free((void *)type_info.name);
|
2016-06-10 08:59:01 +08:00
|
|
|
#endif
|
|
|
|
|
2013-02-23 19:22:12 +08:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2013-09-26 14:18:35 +08:00
|
|
|
int kvmppc_define_rtas_kernel_token(uint32_t token, const char *function)
|
|
|
|
{
|
|
|
|
struct kvm_rtas_token_args args = {
|
|
|
|
.token = token,
|
|
|
|
};
|
|
|
|
|
|
|
|
if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_RTAS)) {
|
|
|
|
return -ENOENT;
|
|
|
|
}
|
|
|
|
|
|
|
|
strncpy(args.name, function, sizeof(args.name) - 1);
|
|
|
|
|
|
|
|
return kvm_vm_ioctl(kvm_state, KVM_PPC_RTAS_DEFINE_TOKEN, &args);
|
|
|
|
}
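strncpy() NUL-terminates the destination only when the source is shorter than the size it is given, so a fixed-size name field needs its last byte reserved. A small sketch of a bounded copy that always terminates, assuming a hypothetical argument structure (`struct toy_token_args` and `toy_set_name` are illustrative, not the kernel's `struct kvm_rtas_token_args`):

```c
#include <string.h>

/* Hypothetical stand-in for an ioctl argument with a fixed-size name. */
struct toy_token_args {
    char name[8];
    unsigned int token;
};

/* Copy at most sizeof(name) - 1 bytes and force a terminating NUL. */
static void toy_set_name(struct toy_token_args *args, const char *function)
{
    strncpy(args->name, function, sizeof(args->name) - 1);
    args->name[sizeof(args->name) - 1] = '\0';
}
```

An over-long name is silently truncated here; whether truncation or an error return is the better policy depends on the caller.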
|
2012-04-04 13:02:05 +08:00
|
|
|
|
2013-07-19 03:33:03 +08:00
|
|
|
int kvmppc_get_htab_fd(bool write)
|
|
|
|
{
|
|
|
|
struct kvm_get_htab_fd s = {
|
|
|
|
.flags = write ? KVM_GET_HTAB_WRITE : 0,
|
|
|
|
.start_index = 0,
|
|
|
|
};
|
|
|
|
|
|
|
|
if (!cap_htab_fd) {
|
|
|
|
fprintf(stderr, "KVM version doesn't support saving the hash table\n");
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
return kvm_vm_ioctl(kvm_state, KVM_PPC_GET_HTAB_FD, &s);
|
|
|
|
}
|
|
|
|
|
|
|
|
int kvmppc_save_htab(QEMUFile *f, int fd, size_t bufsize, int64_t max_ns)
|
|
|
|
{
|
2013-08-21 23:03:08 +08:00
|
|
|
int64_t starttime = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
|
2013-07-19 03:33:03 +08:00
|
|
|
uint8_t buf[bufsize];
|
|
|
|
ssize_t rc;
|
|
|
|
|
|
|
|
do {
|
|
|
|
rc = read(fd, buf, bufsize);
|
|
|
|
if (rc < 0) {
|
|
|
|
fprintf(stderr, "Error reading data from KVM HTAB fd: %s\n",
|
|
|
|
strerror(errno));
|
|
|
|
return rc;
|
|
|
|
} else if (rc) {
|
2014-11-03 23:14:50 +08:00
|
|
|
uint8_t *buffer = buf;
|
|
|
|
ssize_t n = rc;
|
|
|
|
while (n) {
|
|
|
|
struct kvm_get_htab_header *head =
|
|
|
|
(struct kvm_get_htab_header *) buffer;
|
|
|
|
size_t chunksize = sizeof(*head) +
|
|
|
|
HASH_PTE_SIZE_64 * head->n_valid;
|
|
|
|
|
|
|
|
qemu_put_be32(f, head->index);
|
|
|
|
qemu_put_be16(f, head->n_valid);
|
|
|
|
qemu_put_be16(f, head->n_invalid);
|
|
|
|
qemu_put_buffer(f, (void *)(head + 1),
|
|
|
|
HASH_PTE_SIZE_64 * head->n_valid);
|
|
|
|
|
|
|
|
buffer += chunksize;
|
|
|
|
n -= chunksize;
|
|
|
|
}
|
2013-07-19 03:33:03 +08:00
|
|
|
}
|
|
|
|
} while ((rc != 0)
|
|
|
|
&& ((max_ns < 0)
|
2013-08-21 23:03:08 +08:00
|
|
|
|| ((qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - starttime) < max_ns)));
|
2013-07-19 03:33:03 +08:00
|
|
|
|
|
|
|
return (rc == 0) ? 1 : 0;
|
|
|
|
}
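Each read from the HTAB fd returns a sequence of chunks, where a chunk is a header followed by `n_valid` hash PTEs. A minimal sketch of walking such a buffer, assuming an 8-byte header with the three fields used above and 16 bytes per PTE (`struct toy_htab_header`, `TOY_HPTE_SIZE`, and `toy_count_chunks` are hypothetical names; the bounds checks are an addition that the loop above does not perform):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_HPTE_SIZE 16 /* assumed bytes per 64-bit hash PTE */

/* Hypothetical mirror of the header fields the loop above reads. */
struct toy_htab_header {
    uint32_t index;
    uint16_t n_valid;
    uint16_t n_invalid;
};

/* Count the chunks in buf; return -1 if a chunk would overrun len. */
static int toy_count_chunks(const uint8_t *buf, size_t len)
{
    int chunks = 0;

    while (len > 0) {
        struct toy_htab_header head;
        size_t chunksize;

        if (len < sizeof(head)) {
            return -1;
        }
        /* memcpy avoids an unaligned access through a cast pointer */
        memcpy(&head, buf, sizeof(head));
        chunksize = sizeof(head) + (size_t)TOY_HPTE_SIZE * head.n_valid;
        if (chunksize > len) {
            return -1;
        }
        buf += chunksize;
        len -= chunksize;
        chunks++;
    }
    return chunks;
}
```

Only valid entries carry payload bytes; invalid entries are represented purely by the `n_invalid` count in the header.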
|
|
|
|
|
|
|
|
int kvmppc_load_htab_chunk(QEMUFile *f, int fd, uint32_t index,
|
|
|
|
uint16_t n_valid, uint16_t n_invalid)
|
|
|
|
{
|
|
|
|
struct kvm_get_htab_header *buf;
|
|
|
|
size_t chunksize = sizeof(*buf) + n_valid*HASH_PTE_SIZE_64;
|
|
|
|
ssize_t rc;
|
|
|
|
|
|
|
|
buf = alloca(chunksize);
|
|
|
|
buf->index = index;
|
|
|
|
buf->n_valid = n_valid;
|
|
|
|
buf->n_invalid = n_invalid;
|
|
|
|
|
|
|
|
qemu_get_buffer(f, (void *)(buf + 1), HASH_PTE_SIZE_64*n_valid);
|
|
|
|
|
|
|
|
rc = write(fd, buf, chunksize);
|
|
|
|
if (rc < 0) {
|
|
|
|
fprintf(stderr, "Error writing KVM hash table: %s\n",
|
|
|
|
strerror(errno));
|
|
|
|
return rc;
|
|
|
|
}
|
|
|
|
if (rc != chunksize) {
|
|
|
|
/* We should never get a short write on a single chunk */
|
|
|
|
fprintf(stderr, "Short write while restoring KVM hash table\n");
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2012-10-31 13:57:49 +08:00
|
|
|
bool kvm_arch_stop_on_emulation_error(CPUState *cpu)
|
2010-05-10 16:21:34 +08:00
|
|
|
{
|
|
|
|
return true;
|
|
|
|
}
|
2011-02-02 05:15:51 +08:00
|
|
|
|
2012-10-31 13:57:49 +08:00
|
|
|
int kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr)
|
2011-02-02 05:15:51 +08:00
|
|
|
{
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
|
|
|
int kvm_arch_on_sigbus(int code, void *addr)
|
|
|
|
{
|
|
|
|
return 1;
|
|
|
|
}
|
2013-06-12 15:26:54 +08:00
|
|
|
|
|
|
|
void kvm_arch_init_irq_routing(KVMState *s)
|
|
|
|
{
|
|
|
|
}
|
2013-12-11 21:15:34 +08:00
|
|
|
|
2014-02-21 01:52:24 +08:00
|
|
|
struct kvm_get_htab_buf {
|
|
|
|
struct kvm_get_htab_header header;
|
|
|
|
/*
|
|
|
|
* We require one extra byte for read
|
|
|
|
*/
|
|
|
|
target_ulong hpte[(HPTES_PER_GROUP * 2) + 1];
|
|
|
|
};
|
|
|
|
|
|
|
|
uint64_t kvmppc_hash64_read_pteg(PowerPCCPU *cpu, target_ulong pte_index)
|
|
|
|
{
|
|
|
|
int htab_fd;
|
|
|
|
struct kvm_get_htab_fd ghf;
|
|
|
|
struct kvm_get_htab_buf *hpte_buf;
|
|
|
|
|
|
|
|
ghf.flags = 0;
|
|
|
|
ghf.start_index = pte_index;
|
|
|
|
htab_fd = kvm_vm_ioctl(kvm_state, KVM_PPC_GET_HTAB_FD, &ghf);
|
|
|
|
if (htab_fd < 0) {
|
|
|
|
goto error_out;
|
|
|
|
}
|
|
|
|
|
|
|
|
hpte_buf = g_malloc0(sizeof(*hpte_buf));
|
|
|
|
/*
|
|
|
|
* Read the hpte group
|
|
|
|
*/
|
|
|
|
if (read(htab_fd, hpte_buf, sizeof(*hpte_buf)) < 0) {
|
|
|
|
goto out_close;
|
|
|
|
}
|
|
|
|
|
|
|
|
close(htab_fd);
|
|
|
|
return (uint64_t)(uintptr_t) hpte_buf->hpte;
|
|
|
|
|
|
|
|
out_close:
|
|
|
|
g_free(hpte_buf);
|
|
|
|
close(htab_fd);
|
|
|
|
error_out:
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
void kvmppc_hash64_free_pteg(uint64_t token)
|
|
|
|
{
|
|
|
|
struct kvm_get_htab_buf *htab_buf;
|
|
|
|
|
|
|
|
htab_buf = container_of((void *)(uintptr_t) token, struct kvm_get_htab_buf,
|
|
|
|
hpte);
|
|
|
|
g_free(htab_buf);
|
|
|
|
return;
|
|
|
|
}
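The read and free functions above pass the buffer around as an integer token that points at the `hpte` member, and recover the enclosing allocation with container_of(). A self-contained sketch of that round trip, assuming simplified structures (`struct toy_buf`, `toy_container_of`, `toy_make_token`, and `toy_from_token` are hypothetical names):

```c
#include <stddef.h>
#include <stdint.h>

struct toy_header {
    uint32_t index;
    uint16_t n_valid;
    uint16_t n_invalid;
};

/* Hypothetical stand-in for the buffer: a header followed by PTE words. */
struct toy_buf {
    struct toy_header header;
    uint64_t hpte[17];
};

/* Step back from a member pointer to the start of the enclosing struct. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The token is the address of the hpte array, not of the buffer itself. */
static uint64_t toy_make_token(struct toy_buf *buf)
{
    return (uint64_t)(uintptr_t)buf->hpte;
}

static struct toy_buf *toy_from_token(uint64_t token)
{
    return toy_container_of((void *)(uintptr_t)token, struct toy_buf, hpte);
}
```

Handing out the member address lets callers index the PTE words directly, while the owner can still find the allocation it has to free.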
|
2014-02-21 01:52:38 +08:00
|
|
|
|
|
|
|
void kvmppc_hash64_write_pte(CPUPPCState *env, target_ulong pte_index,
|
|
|
|
target_ulong pte0, target_ulong pte1)
|
|
|
|
{
|
|
|
|
int htab_fd;
|
|
|
|
struct kvm_get_htab_fd ghf;
|
|
|
|
struct kvm_get_htab_buf hpte_buf;
|
|
|
|
|
|
|
|
ghf.flags = 0;
|
|
|
|
ghf.start_index = 0; /* Ignored */
|
|
|
|
htab_fd = kvm_vm_ioctl(kvm_state, KVM_PPC_GET_HTAB_FD, &ghf);
|
|
|
|
if (htab_fd < 0) {
|
|
|
|
goto error_out;
|
|
|
|
}
|
|
|
|
|
|
|
|
hpte_buf.header.n_valid = 1;
|
|
|
|
hpte_buf.header.n_invalid = 0;
|
|
|
|
hpte_buf.header.index = pte_index;
|
|
|
|
hpte_buf.hpte[0] = pte0;
|
|
|
|
hpte_buf.hpte[1] = pte1;
|
|
|
|
/*
|
|
|
|
* Write the hpte entry.
|
|
|
|
* CAUTION: write() has the warn_unused_result attribute. Hence we
|
|
|
|
* need to check the return value, even though we do nothing.
|
|
|
|
*/
|
|
|
|
if (write(htab_fd, &hpte_buf, sizeof(hpte_buf)) < 0) {
|
|
|
|
goto out_close;
|
|
|
|
}
|
|
|
|
|
|
|
|
out_close:
|
|
|
|
close(htab_fd);
|
|
|
|
return;
|
|
|
|
|
|
|
|
error_out:
|
|
|
|
return;
|
|
|
|
}
|
2015-01-09 16:04:40 +08:00
|
|
|
|
|
|
|
int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
|
2015-10-15 21:44:52 +08:00
|
|
|
uint64_t address, uint32_t data, PCIDevice *dev)
|
2015-01-09 16:04:40 +08:00
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
2015-06-02 21:56:23 +08:00
|
|
|
|
2016-07-14 13:56:31 +08:00
|
|
|
int kvm_arch_add_msi_route_post(struct kvm_irq_routing_entry *route,
|
|
|
|
int vector, PCIDevice *dev)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
int kvm_arch_release_virq_post(int virq)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2015-06-02 21:56:23 +08:00
|
|
|
int kvm_arch_msi_data_to_gsi(uint32_t data)
|
|
|
|
{
|
|
|
|
return data & 0xffff;
|
|
|
|
}
|
ppc/spapr: Implement H_RANDOM hypercall in QEMU
The PAPR interface defines a hypercall to pass high-quality
hardware generated random numbers to guests. Recent kernels can
already provide this hypercall to the guest if the right hardware
random number generator is available. But in case the user wants
to use another source like EGD, or QEMU is running with an older
kernel, we should also have this call in QEMU, so that guests that
do not support virtio-rng yet can get good random numbers, too.
This patch now adds a new pseudo-device to QEMU that either
directly provides this hypercall to the guest or is able to
enable the in-kernel hypercall if available. The in-kernel
hypercall can be enabled with the use-kvm property, e.g.:
qemu-system-ppc64 -device spapr-rng,use-kvm=true
For handling the hypercall in QEMU instead, a "RngBackend" is
required since the hypercall should provide "good" random data
instead of pseudo-random (like from a "simple" library function
like rand() or g_random_int()). Since there are multiple RngBackends
available, the user must select an appropriate back-end via the
"rng" property of the device, e.g.:
qemu-system-ppc64 -object rng-random,filename=/dev/hwrng,id=gid0 \
-device spapr-rng,rng=gid0 ...
See http://wiki.qemu-project.org/Features-Done/VirtIORNG for
other examples of specifying RngBackends.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-17 16:49:41 +08:00
|
|
|
|
|
|
|
int kvmppc_enable_hwrng(void)
|
|
|
|
{
|
|
|
|
if (!kvm_enabled() || !kvm_check_extension(kvm_state, KVM_CAP_PPC_HWRNG)) {
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
return kvmppc_enable_hcall(kvm_state, H_RANDOM);
|
|
|
|
}
|