Commit Graph

73 Commits

Author SHA1 Message Date
Linus Torvalds b9085bcbf5 Fairly small update, but there are some interesting new features.
Common: Optional support for adding a small amount of polling on each HLT
 instruction executed in the guest (or equivalent for other architectures).
 This can improve latency up to 50% on some scenarios (e.g. O_DSYNC writes
 or TCP_RR netperf tests).  This also has to be enabled manually for now,
 but the plan is to auto-tune this in the future.
 
 ARM/ARM64: the highlights are support for GICv3 emulation and dirty page
 tracking
 
 s390: several optimizations and bugfixes.  Also a first: a feature
 exposed by KVM (UUID and long guest name in /proc/sysinfo) before
 it is available in IBM's hypervisor! :)
 
 MIPS: Bugfixes.
 
 x86: Support for PML (page modification logging, a new feature in
 Broadwell Xeons that speeds up dirty page tracking), nested virtualization
 improvements (nested APICv---a nice optimization), usual round of emulation
 fixes.  There is also a new option to reduce latency of the TSC deadline
 timer in the guest; this needs to be tuned manually.
 
 Some commits are common between this pull and Catalin's; I see you
 have already included his tree.
 
 ARM has other conflicts where functions are added in the same place
 by 3.19-rc and 3.20 patches.  These are not large though, and entirely
 within KVM.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJU28rkAAoJEL/70l94x66DXqQH/1TDOfJIjW7P2kb0Sw7Fy1wi
 cEX1KO/VFxAqc8R0E/0Wb55CXyPjQJM6xBXuFr5cUDaIjQ8ULSktL4pEwXyyv/s5
 DBDkN65mriry2w5VuEaRLVcuX9Wy+tqLQXWNkEySfyb4uhZChWWHvKEcgw5SqCyg
 NlpeHurYESIoNyov3jWqvBjr4OmaQENyv7t2c6q5ErIgG02V+iCux5QGbphM2IC9
 LFtPKxoqhfeB2xFxTOIt8HJiXrZNwflsTejIlCl/NSEiDVLLxxHCxK2tWK/tUXMn
 JfLD9ytXBWtNMwInvtFm4fPmDouv2VDyR0xnK2db+/axsJZnbxqjGu1um4Dqbak=
 =7gdx
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM update from Paolo Bonzini:
 "Fairly small update, but there are some interesting new features.

  Common:
     Optional support for adding a small amount of polling on each HLT
     instruction executed in the guest (or equivalent for other
     architectures).  This can improve latency up to 50% on some
     scenarios (e.g. O_DSYNC writes or TCP_RR netperf tests).  This
     also has to be enabled manually for now, but the plan is to
     auto-tune this in the future.

  ARM/ARM64:
     The highlights are support for GICv3 emulation and dirty page
     tracking

  s390:
     Several optimizations and bugfixes.  Also a first: a feature
     exposed by KVM (UUID and long guest name in /proc/sysinfo) before
     it is available in IBM's hypervisor! :)

  MIPS:
     Bugfixes.

  x86:
     Support for PML (page modification logging, a new feature in
     Broadwell Xeons that speeds up dirty page tracking), nested
     virtualization improvements (nested APICv---a nice optimization),
     usual round of emulation fixes.

     There is also a new option to reduce latency of the TSC deadline
     timer in the guest; this needs to be tuned manually.

     Some commits are common between this pull and Catalin's; I see you
     have already included his tree.

  Powerpc:
     Nothing yet.

     The KVM/PPC changes will come in through the PPC maintainers,
     because I haven't received them yet and I might end up being
     offline for some part of next week"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits)
  KVM: ia64: drop kvm.h from installed user headers
  KVM: x86: fix build with !CONFIG_SMP
  KVM: x86: emulate: correct page fault error code for NoWrite instructions
  KVM: Disable compat ioctl for s390
  KVM: s390: add cpu model support
  KVM: s390: use facilities and cpu_id per KVM
  KVM: s390/CPACF: Choose crypto control block format
  s390/kernel: Update /proc/sysinfo file with Extended Name and UUID
  KVM: s390: reenable LPP facility
  KVM: s390: floating irqs: fix user triggerable endless loop
  kvm: add halt_poll_ns module parameter
  kvm: remove KVM_MMIO_SIZE
  KVM: MIPS: Don't leak FPU/DSP to guest
  KVM: MIPS: Disable HTW while in guest
  KVM: nVMX: Enable nested posted interrupt processing
  KVM: nVMX: Enable nested virtual interrupt delivery
  KVM: nVMX: Enable nested apic register virtualization
  KVM: nVMX: Make nested control MSRs per-cpu
  KVM: nVMX: Enable nested virtualize x2apic mode
  KVM: nVMX: Prepare for using hardware MSR bitmap
  ...
2015-02-13 09:55:09 -08:00
Paolo Bonzini f781951299 kvm: add halt_poll_ns module parameter
This patch introduces a new module parameter for the KVM module; when it
is present, KVM attempts a bit of polling on every HLT before scheduling
itself out via kvm_vcpu_block.

This parameter helps a lot for latency-bound workloads---in particular
I tested it with O_DSYNC writes with a battery-backed disk in the host.
In this case, writes are fast (because the data doesn't have to go all
the way to the platters) but they cannot be merged by either the host or
the guest.  KVM's performance here is usually around 30% of bare metal,
or 50% if you use cache=directsync or cache=writethrough (these
parameters avoid that the guest sends pointless flush requests, and
at the same time they are not slow because of the battery-backed cache).
The bad performance happens because on every halt the host CPU decides
to halt itself too.  When the interrupt comes, the vCPU thread is then
migrated to a new physical CPU, and in general the latency is horrible
because the vCPU thread has to be scheduled back in.

With this patch performance reaches 60-65% of bare metal and, more
important, 99% of what you get if you use idle=poll in the guest.  This
means that the tunable gets rid of this particular bottleneck, and more
work can be done to improve performance in the kernel or QEMU.

Of course there is some price to pay; every time an otherwise idle vCPUs
is interrupted by an interrupt, it will poll unnecessarily and thus
impose a little load on the host.  The above results were obtained with
a mostly random value of the parameter (500000), and the load was around
1.5-2.5% CPU usage on one of the host's core for each idle guest vCPU.

The patch also adds a new stat, /sys/kernel/debug/kvm/halt_successful_poll,
that can be used to tune the parameter.  It counts how many HLT
instructions received an interrupt during the polling period; each
successful poll avoids that Linux schedules the VCPU thread out and back
in, and may also avoid a likely trip to C1 and back for the physical CPU.

While the VM is idle, a Linux 4 VCPU VM halts around 10 times per second.
Of these halts, almost all are failed polls.  During the benchmark,
instead, basically all halts end within the polling period, except a more
or less constant stream of 50 per second coming from vCPUs that are not
running the benchmark.  The wasted time is thus very low.  Things may
be slightly different for Windows VMs, which have a ~10 ms timer tick.

The effect is also visible on Marcelo's recently-introduced latency
test for the TSC deadline timer.  Though of course a non-RT kernel has
awful latency bounds, the latency of the timer is around 8000-10000 clock
cycles compared to 20000-120000 without setting halt_poll_ns.  For the TSC
deadline timer, thus, the effect is both a smaller average latency and
a smaller variance.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-06 13:08:37 +01:00
James Hogan f798217dfd KVM: MIPS: Don't leak FPU/DSP to guest
The FPU and DSP are enabled via the CP0 Status CU1 and MX bits by
kvm_mips_set_c0_status() on a guest exit, presumably in case there is
active state that needs saving if pre-emption occurs. However neither of
these bits are cleared again when returning to the guest.

This effectively gives the guest access to the FPU/DSP hardware after
the first guest exit even though it is not aware of its presence,
allowing FP instructions in guest user code to intermittently actually
execute instead of trapping into the guest OS for emulation. It will
then read & manipulate the hardware FP registers which technically
belong to the user process (e.g. QEMU), or are stale from another user
process. It can also crash the guest OS by causing an FP exception, for
which a guest exception handler won't have been registered.

First lets save and disable the FPU (and MSA) state with lose_fpu(1)
before entering the guest. This simplifies the problem, especially for
when guest FPU/MSA support is added in the future, and prevents FR=1 FPU
state being live when the FR bit gets cleared for the guest, which
according to the architecture causes the contents of the FPU and vector
registers to become UNPREDICTABLE.

We can then safely remove the enabling of the FPU in
kvm_mips_set_c0_status(), since there should never be any active FPU or
MSA state to save at pre-emption, which should plug the FPU leak.

DSP state is always live rather than being lazily restored, so for that
it is simpler to just clear the MX bit again when re-entering the guest.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # v3.10+: 044f0f03eca0: MIPS: KVM: Deliver guest interrupts
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-04 18:32:14 +01:00
James Hogan c4c6f2cad9 KVM: MIPS: Disable HTW while in guest
Ensure any hardware page table walker (HTW) is disabled while in KVM
guest mode, as KVM doesn't yet set up hardware page table walking for
guest mappings so the wrong mappings would get loaded, resulting in the
guest hanging or crashing once it reaches userland.

The HTW is disabled and re-enabled around the call to
__kvm_mips_vcpu_run() which does the initial switch into guest mode and
the final switch out of guest context. Additionally it is enabled for
the duration of guest exits (i.e. kvm_mips_handle_exit()), getting
disabled again before returning back to guest or host.

In all cases the HTW is only disabled in normal kernel mode while
interrupts are disabled, so that the HTW doesn't get left disabled if
the process is preempted.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # v3.17+
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-04 16:23:33 +01:00
Dominik Dingel 31928aa586 KVM: remove unneeded return value of vcpu_postcreate
The return value of kvm_arch_vcpu_postcreate is not checked in its
caller.  This is okay, because only x86 provides vcpu_postcreate right
now and it could only fail if vcpu_load failed.  But that is not
possible during KVM_CREATE_VCPU (kvm_arch_vcpu_load is void, too), so
just get rid of the unchecked return value.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-01-23 13:24:52 +01:00
Pranith Kumar 83fe27ea53 rcu: Make SRCU optional by using CONFIG_SRCU
SRCU is not necessary to be compiled by default in all cases. For tinification
efforts not compiling SRCU unless necessary is desirable.

The current patch tries to make compiling SRCU optional by introducing a new
Kconfig option CONFIG_SRCU which is selected when any of the components making
use of SRCU are selected.

If we do not select CONFIG_SRCU, srcu.o will not be compiled at all.

   text    data     bss     dec     hex filename
   2007       0       0    2007     7d7 kernel/rcu/srcu.o

Size of arch/powerpc/boot/zImage changes from

   text    data     bss     dec     hex filename
 831552   64180   23944  919676   e087c arch/powerpc/boot/zImage : before
 829504   64180   23952  917636   e0084 arch/powerpc/boot/zImage : after

so the savings are about ~2000 bytes.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: resolve conflict due to removal of arch/ia64/kvm/Kconfig. ]
2015-01-06 11:04:29 -08:00
Radim Krčmář 13a34e067e KVM: remove garbage arg to *hardware_{en,dis}able
In the beggining was on_each_cpu(), which required an unused argument to
kvm_arch_ops.hardware_{en,dis}able, but this was soon forgotten.

Remove unnecessary arguments that stem from this.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-29 16:35:55 +02:00
Radim Krčmář 0865e636ae KVM: static inline empty kvm_arch functions
Using static inline is going to save few bytes and cycles.
For example on powerpc, the difference is 700 B after stripping.
(5 kB before)

This patch also deals with two overlooked empty functions:
kvm_arch_flush_shadow was not removed from arch/mips/kvm/mips.c
  2df72e9bc KVM: split kvm_arch_flush_shadow
and kvm_arch_sched_in never made it into arch/ia64/kvm/kvm-ia64.c.
  e790d9ef6 KVM: add kvm_arch_sched_in

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-29 16:35:55 +02:00
Radim Krčmář e790d9ef64 KVM: add kvm_arch_sched_in
Introduce preempt notifiers for architecture specific code.
Advantage over creating a new notifier in every arch is slightly simpler
code and guaranteed call order with respect to kvm_sched_in.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-21 18:45:21 +02:00
Paolo Bonzini cc568ead3c Patch queue for ppc - 2014-08-01
Highlights in this release include:
 
   - BookE: Rework instruction fetch, not racy anymore now
   - BookE HV: Fix ONE_REG accessors for some in-hardware registers
   - Book3S: Good number of LE host fixes, enable HV on LE
   - Book3S: Some misc bug fixes
   - Book3S HV: Add in-guest debug support
   - Book3S HV: Preload cache lines on context switch
   - Remove 440 support
 
 Alexander Graf (31):
       KVM: PPC: Book3s PR: Disable AIL mode with OPAL
       KVM: PPC: Book3s HV: Fix tlbie compile error
       KVM: PPC: Book3S PR: Handle hyp doorbell exits
       KVM: PPC: Book3S PR: Fix ABIv2 on LE
       KVM: PPC: Book3S PR: Fix sparse endian checks
       PPC: Add asm helpers for BE 32bit load/store
       KVM: PPC: Book3S HV: Make HTAB code LE host aware
       KVM: PPC: Book3S HV: Access guest VPA in BE
       KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE
       KVM: PPC: Book3S HV: Access XICS in BE
       KVM: PPC: Book3S HV: Fix ABIv2 on LE
       KVM: PPC: Book3S HV: Enable for little endian hosts
       KVM: PPC: Book3S: Move vcore definition to end of kvm_arch struct
       KVM: PPC: Deflect page write faults properly in kvmppc_st
       KVM: PPC: Book3S: Stop PTE lookup on write errors
       KVM: PPC: Book3S: Add hack for split real mode
       KVM: PPC: Book3S: Make magic page properly 4k mappable
       KVM: PPC: Remove 440 support
       KVM: Rename and add argument to check_extension
       KVM: Allow KVM_CHECK_EXTENSION on the vm fd
       KVM: PPC: Book3S: Provide different CAPs based on HV or PR mode
       KVM: PPC: Implement kvmppc_xlate for all targets
       KVM: PPC: Move kvmppc_ld/st to common code
       KVM: PPC: Remove kvmppc_bad_hva()
       KVM: PPC: Use kvm_read_guest in kvmppc_ld
       KVM: PPC: Handle magic page in kvmppc_ld/st
       KVM: PPC: Separate loadstore emulation from priv emulation
       KVM: PPC: Expose helper functions for data/inst faults
       KVM: PPC: Remove DCR handling
       KVM: PPC: HV: Remove generic instruction emulation
       KVM: PPC: PR: Handle FSCR feature deselects
 
 Alexey Kardashevskiy (1):
       KVM: PPC: Book3S: Fix LPCR one_reg interface
 
 Aneesh Kumar K.V (4):
       KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation
       KVM: PPC: BOOK3S: PR: Emulate virtual timebase register
       KVM: PPC: BOOK3S: PR: Emulate instruction counter
       KVM: PPC: BOOK3S: HV: Update compute_tlbie_rb to handle 16MB base page
 
 Anton Blanchard (2):
       KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue
       KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC()
 
 Bharat Bhushan (10):
       kvm: ppc: bookehv: Added wrapper macros for shadow registers
       kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1
       kvm: ppc: booke: Use the shared struct helpers of SPRN_DEAR
       kvm: ppc: booke: Add shared struct helpers of SPRN_ESR
       kvm: ppc: booke: Use the shared struct helpers for SPRN_SPRG0-7
       kvm: ppc: Add SPRN_EPR get helper function
       kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry exit
       KVM: PPC: Booke-hv: Add one reg interface for SPRG9
       KVM: PPC: Remove comment saying SPRG1 is used for vcpu pointer
       KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr
 
 Michael Neuling (1):
       KVM: PPC: Book3S HV: Add H_SET_MODE hcall handling
 
 Mihai Caraman (8):
       KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
       KVM: PPC: e500: Fix default tlb for victim hint
       KVM: PPC: e500: Emulate power management control SPR
       KVM: PPC: e500mc: Revert "add load inst fixup"
       KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1
       KVM: PPC: Book3s: Remove kvmppc_read_inst() function
       KVM: PPC: Allow kvmppc_get_last_inst() to fail
       KVM: PPC: Bookehv: Get vcpu's last instruction for emulation
 
 Paul Mackerras (4):
       KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handling
       KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabled
       KVM: PPC: Book3S PR: Take SRCU read lock around RTAS kvm_read_guest() call
       KVM: PPC: Book3S: Make kvmppc_ld return a more accurate error indication
 
 Stewart Smith (2):
       Split out struct kvmppc_vcore creation to separate function
       Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJT21skAAoJECszeR4D/txgeFEP/AzJopN7s//W33CfyBqURHXp
 XALCyAw+S67gtcaTZbxomcG1xuT8Lj9WEw28iz3rCtAnJwIxsY63xrI1nXMzTaI2
 p1rC0ai5Qy+nlEbd6L78spZy/Nzh8DFYGWx78iUSO1mYD8xywJwtoiBA539pwp8j
 8N+mgn61Hwhv31bKtsZlmzXymVr/jbTp5LVuxsBLJwD2lgT49g+4uBnX2cG/iXkg
 Rzbh7LxoNNXrSPI8sYmTWu/81aeXteeX70ja6DHuV5dWLNTuAXJrh5EUfeAZqBrV
 aYcLWUYmIyB87txNmt6ZGVar2p3jr2Xhb9mKx+EN4dbehblanLc1PUqlHd0q3dKc
 Nt60ByqpZn+qDAK86dShSZLEe+GT3lovvE76CqVXD4Er+OUEkc9JoxhN1cof/Gb0
 o6uwZ2isXHRdGoZx5vb4s3UTOlwZGtoL/CyY/HD/ujYDSURkCGbxLj3kkecSY8ut
 QdDAWsC15BwsHtKLr5Zwjp2w+0eGq2QJgfvO0zqWFiz9k33SCBCUpwluFeqh27Hi
 aR5Wir3j+MIw9G8XlYlDJWYfi0h/SZ4G7hh7jSu26NBNBzQsDa8ow/cLzdMhdUwH
 OYSaeqVk5wiRb9to1uq1NQWPA0uRAx3BSjjvr9MCGRqmvn+FV5nj637YWUT+53Hi
 aSvg/U2npghLPPG2cihu
 =JuLr
 -----END PGP SIGNATURE-----

Merge tag 'signed-kvm-ppc-next' of git://github.com/agraf/linux-2.6 into kvm

Patch queue for ppc - 2014-08-01

Highlights in this release include:

  - BookE: Rework instruction fetch, not racy anymore now
  - BookE HV: Fix ONE_REG accessors for some in-hardware registers
  - Book3S: Good number of LE host fixes, enable HV on LE
  - Book3S: Some misc bug fixes
  - Book3S HV: Add in-guest debug support
  - Book3S HV: Preload cache lines on context switch
  - Remove 440 support

Alexander Graf (31):
      KVM: PPC: Book3s PR: Disable AIL mode with OPAL
      KVM: PPC: Book3s HV: Fix tlbie compile error
      KVM: PPC: Book3S PR: Handle hyp doorbell exits
      KVM: PPC: Book3S PR: Fix ABIv2 on LE
      KVM: PPC: Book3S PR: Fix sparse endian checks
      PPC: Add asm helpers for BE 32bit load/store
      KVM: PPC: Book3S HV: Make HTAB code LE host aware
      KVM: PPC: Book3S HV: Access guest VPA in BE
      KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE
      KVM: PPC: Book3S HV: Access XICS in BE
      KVM: PPC: Book3S HV: Fix ABIv2 on LE
      KVM: PPC: Book3S HV: Enable for little endian hosts
      KVM: PPC: Book3S: Move vcore definition to end of kvm_arch struct
      KVM: PPC: Deflect page write faults properly in kvmppc_st
      KVM: PPC: Book3S: Stop PTE lookup on write errors
      KVM: PPC: Book3S: Add hack for split real mode
      KVM: PPC: Book3S: Make magic page properly 4k mappable
      KVM: PPC: Remove 440 support
      KVM: Rename and add argument to check_extension
      KVM: Allow KVM_CHECK_EXTENSION on the vm fd
      KVM: PPC: Book3S: Provide different CAPs based on HV or PR mode
      KVM: PPC: Implement kvmppc_xlate for all targets
      KVM: PPC: Move kvmppc_ld/st to common code
      KVM: PPC: Remove kvmppc_bad_hva()
      KVM: PPC: Use kvm_read_guest in kvmppc_ld
      KVM: PPC: Handle magic page in kvmppc_ld/st
      KVM: PPC: Separate loadstore emulation from priv emulation
      KVM: PPC: Expose helper functions for data/inst faults
      KVM: PPC: Remove DCR handling
      KVM: PPC: HV: Remove generic instruction emulation
      KVM: PPC: PR: Handle FSCR feature deselects

Alexey Kardashevskiy (1):
      KVM: PPC: Book3S: Fix LPCR one_reg interface

Aneesh Kumar K.V (4):
      KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation
      KVM: PPC: BOOK3S: PR: Emulate virtual timebase register
      KVM: PPC: BOOK3S: PR: Emulate instruction counter
      KVM: PPC: BOOK3S: HV: Update compute_tlbie_rb to handle 16MB base page

Anton Blanchard (2):
      KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue
      KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC()

Bharat Bhushan (10):
      kvm: ppc: bookehv: Added wrapper macros for shadow registers
      kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1
      kvm: ppc: booke: Use the shared struct helpers of SPRN_DEAR
      kvm: ppc: booke: Add shared struct helpers of SPRN_ESR
      kvm: ppc: booke: Use the shared struct helpers for SPRN_SPRG0-7
      kvm: ppc: Add SPRN_EPR get helper function
      kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry exit
      KVM: PPC: Booke-hv: Add one reg interface for SPRG9
      KVM: PPC: Remove comment saying SPRG1 is used for vcpu pointer
      KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr

Michael Neuling (1):
      KVM: PPC: Book3S HV: Add H_SET_MODE hcall handling

Mihai Caraman (8):
      KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
      KVM: PPC: e500: Fix default tlb for victim hint
      KVM: PPC: e500: Emulate power management control SPR
      KVM: PPC: e500mc: Revert "add load inst fixup"
      KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1
      KVM: PPC: Book3s: Remove kvmppc_read_inst() function
      KVM: PPC: Allow kvmppc_get_last_inst() to fail
      KVM: PPC: Bookehv: Get vcpu's last instruction for emulation

Paul Mackerras (4):
      KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handling
      KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabled
      KVM: PPC: Book3S PR: Take SRCU read lock around RTAS kvm_read_guest() call
      KVM: PPC: Book3S: Make kvmppc_ld return a more accurate error indication

Stewart Smith (2):
      Split out struct kvmppc_vcore creation to separate function
      Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8

Conflicts:
	Documentation/virtual/kvm/api.txt
2014-08-05 09:58:11 +02:00
Linus Torvalds 8533ce7271 These are the x86, MIPS and s390 changes; PPC and ARM will come in a
few days.
 
 MIPS and s390 have little going on this release; just bugfixes, some
 small, some larger.
 
 The highlights for x86 are nested VMX improvements (Jan Kiszka), optimizations
 for old processor (up to Nehalem, by me and Bandan Das), and a lot of x86
 emulator bugfixes (Nadav Amit).
 
 Stephen Rothwell reported a trivial conflict with the tracing branch.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJT300XAAoJEBvWZb6bTYby3V8QAJz+XyajnhJ8wH55Vxczz22L
 i2gtUGmBLhEXsBcaVKO4BBfek88lLzg0SGLjfW5wCMQmKtxVlrwTCXNkBoPGjapd
 NwHtWkMKym44PDhRovn7zkSumkxC43uFIBR/ebrhP6Bvhh9s+MnkQUxfw9ILB+YV
 EeKyEG8sSgxFCciuHbp3mIXpDcO6r/ldy6I7009OdyhLoMY+Kvmk7kRe9wtAivdg
 CGJi60QvGOn2RGRPOCEtF6UWr8Ae8fe1t84o0hkXPv/j3jtabzAatXKJa4dYNbIs
 7Mp4NQpxaGV6rq3WCYVeZRxGs+UReGDAS3Il4Z8C9eTOTooSfxdVr8acpM8PY6I8
 UmLT6ECLGycc4ELXrETtR+QLmiXACyJqyVxz4aiLV3kWSWfamKD3hBeQK9NizNcE
 VoPDl+PyISvR1tW4KstBuzfUWAEXi+gO78cqqFr/VW6cl7HKpA1DFQaPfGkYKDae
 2CPwcLwI5/M6RtSgkyXTkEqNZLc2BjldqSeM1lmWjhZVW56X2iqePUL46Vab3Yvt
 U+sELtwEE560NLN3hbaHUsLR1tcUix5w8vTzcXPxgoHQBszHCcAZTWd1XHulr64F
 rp/cangqtkPKcu5j1mNhQs38oLjHI1MUsbQrqFoD4tmHjQ75iXHRFzYGoIVKXyHG
 AnGbQzJzBcdAANhm3LW0
 =UXxV
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM changes from Paolo Bonzini:
 "These are the x86, MIPS and s390 changes; PPC and ARM will come in a
  few days.

  MIPS and s390 have little going on this release; just bugfixes, some
  small, some larger.

  The highlights for x86 are nested VMX improvements (Jan Kiszka),
  optimizations for old processor (up to Nehalem, by me and Bandan Das),
  and a lot of x86 emulator bugfixes (Nadav Amit).

  Stephen Rothwell reported a trivial conflict with the tracing branch"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (104 commits)
  x86/kvm: Resolve shadow warnings in macro expansion
  KVM: s390: rework broken SIGP STOP interrupt handling
  KVM: x86: always exit on EOIs for interrupts listed in the IOAPIC redir table
  KVM: vmx: remove duplicate vmx_mpx_supported() prototype
  KVM: s390: Fix memory leak on busy SIGP stop
  x86/kvm: Resolve shadow warning from min macro
  kvm: Resolve missing-field-initializers warnings
  Replace NR_VMX_MSR with its definition
  KVM: x86: Assertions to check no overrun in MSR lists
  KVM: x86: set rflags.rf during fault injection
  KVM: x86: Setting rflags.rf during rep-string emulation
  KVM: x86: DR6/7.RTM cannot be written
  KVM: nVMX: clean up nested_release_vmcs12 and code around it
  KVM: nVMX: fix lifetime issues for vmcs02
  KVM: x86: Defining missing x86 vectors
  KVM: x86: emulator injects #DB when RFLAGS.RF is set
  KVM: x86: Cleanup of rflags.rf cleaning
  KVM: x86: Clear rflags.rf on emulated instructions
  KVM: x86: popf emulation should not change RF
  KVM: x86: Clearing rflags.rf upon skipped emulated instruction
  ...
2014-08-04 12:16:46 -07:00
Alexander Graf 784aa3d7fb KVM: Rename and add argument to check_extension
In preparation to make the check_extension function available to VM scope
we add a struct kvm * argument to the function header and rename the function
accordingly. It will still be called from the /dev/kvm fd, but with a NULL
argument for struct kvm *.

Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-28 15:23:17 +02:00
Deng-Cheng Zhu ab76228a48 MIPS: KVM: Remove dead code of TLB index error in kvm_mips_emul_tlbwr()
It's impossible to fall into the error handling of the TLB index after
being masked by (KVM_MIPS_GUEST_TLB_SIZE - 1). Remove the dead code.

Reported-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 16:52:04 +02:00
Deng-Cheng Zhu db2fb7f202 MIPS: KVM: Skip memory cleaning in kvm_mips_commpage_init()
The commpage is allocated using kzalloc(), so there's no need of cleaning
the memory of the kvm_mips_commpage struct and its internal mips_coproc.

Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 16:52:03 +02:00
Deng-Cheng Zhu d7d5b05faf MIPS: KVM: Rename files to remove the prefix "kvm_" and "kvm_mips_"
Since all the files are in arch/mips/kvm/, there's no need of the prefixes
"kvm_" and "kvm_mips_".

Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 16:52:03 +02:00
Deng-Cheng Zhu b045c40620 MIPS: KVM: Remove unneeded volatile
The keyword volatile for idx in the TLB functions is unnecessary.

Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 16:52:03 +02:00
Deng-Cheng Zhu d98403a525 MIPS: KVM: Simplify functions by removing redundancy
No logic changes inside.

Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 16:52:02 +02:00
Deng-Cheng Zhu 6ad78a5c75 MIPS: KVM: Use KVM internal logger
Replace printks with kvm_[err|info|debug].

Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 16:52:02 +02:00
Deng-Cheng Zhu d116e812f9 MIPS: KVM: Reformat code and comments
No logic changes inside.

Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-30 16:52:01 +02:00
Deng-Cheng Zhu 8c9eb041cf MIPS: KVM: Fix memory leak on VCPU
kvm_arch_vcpu_free() is called in 2 code paths:

1) kvm_vm_ioctl()
       kvm_vm_ioctl_create_vcpu()
           kvm_arch_vcpu_destroy()
               kvm_arch_vcpu_free()
2) kvm_put_kvm()
       kvm_destroy_vm()
           kvm_arch_destroy_vm()
               kvm_mips_free_vcpus()
                   kvm_arch_vcpu_free()

Neither of the paths handles VCPU free. We need to do it in
kvm_arch_vcpu_free() corresponding to the memory allocation in
kvm_arch_vcpu_create().

Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Cc: stable@vger.kernel.org
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-25 14:17:03 +02:00
James Hogan ee1a725f44 MIPS: KVM: Remove redundant semicolon
Remove extra semicolon in kvm_arch_vcpu_dump_regs().

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:05:59 +02:00
James Hogan c6c0a6637f MIPS: KVM: Remove redundant NULL checks before kfree()
The kfree() function already NULL checks the parameter so remove the
redundant NULL checks before kfree() calls in arch/mips/kvm/.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:05:46 +02:00
James Hogan 6e95bfd267 MIPS: KVM: Quieten kvm_info() logging
The logging from MIPS KVM is fairly noisy with kvm_info() in places
where it shouldn't be, such as on VM creation and migration to a
different CPU. Replace these kvm_info() calls with kvm_debug().

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:05:37 +02:00
James Hogan d5c704d525 MIPS: KVM: Remove ifdef DEBUG around kvm_debug
kvm_debug() uses pr_debug() which is already compiled out in the absence
of a DEBUG define, so remove the unnecessary ifdef DEBUG lines around
kvm_debug() calls which are littered around arch/mips/kvm/.

As well as generally cleaning up, this prevents future bit-rot due to
DEBUG not being commonly used.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:05:28 +02:00
James Hogan 3d65483371 MIPS: KVM: Fix kvm_debug bit-rottage
Fix build errors when DEBUG is defined in arch/mips/kvm/.
 - The DEBUG code in kvm_mips_handle_tlbmod() was missing some variables.
 - The DEBUG code in kvm_mips_host_tlb_write() was conditional on an
   undefined "debug" variable.
 - The DEBUG code in kvm_mips_host_tlb_inv() accessed asid_map directly
   rather than using kvm_mips_get_user_asid(). Also fixed brace
   placement.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:05:20 +02:00
James Hogan 0fae34f464 MIPS: KVM: Make kvm_mips_comparecount_{func,wakeup} static
The kvm_mips_comparecount_func() and kvm_mips_comparecount_wakeup()
functions are only used within arch/mips/kvm/kvm_mips.c, so make them
static.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:04:59 +02:00
James Hogan f74a8e224e MIPS: KVM: Add count frequency KVM register
Expose the KVM guest CP0_Count frequency to userland via a new
KVM_REG_MIPS_COUNT_HZ register accessible with the KVM_{GET,SET}_ONE_REG
ioctls.

When the frequency is altered the bias is adjusted such that the guest
CP0_Count doesn't jump discontinuously or lose any timer interrupts.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:02:54 +02:00
James Hogan f82393426a MIPS: KVM: Add master disable count interface
Expose two new virtual registers to userland via the
KVM_{GET,SET}_ONE_REG ioctls.

KVM_REG_MIPS_COUNT_CTL is for timer configuration fields and just
contains a master disable count bit. This can be used by userland to
freeze the timer in order to read a consistent state from the timer
count value and timer interrupt pending bit. This cannot be done with
the CP0_Cause.DC bit because the timer interrupt pending bit (TI) is
also in CP0_Cause so it would be impossible to stop the timer without
also risking a race with an hrtimer interrupt and having to explicitly
check whether an interrupt should have occurred.

When the timer is re-enabled it resumes without losing time, i.e. the
CP0_Count value jumps to what it would have been had the timer not been
disabled, which would also be impossible to do from userland with
CP0_Cause.DC. The timer interrupt also cannot be lost, i.e. if a timer
interrupt would have occurred had the timer not been disabled it is
queued when the timer is re-enabled.

This works by storing the nanosecond monotonic time when the master
disable is set, and using it for various operations instead of the
current monotonic time (e.g. when recalculating the bias when the
CP0_Count is set), until the master disable is cleared again, i.e. the
timer state is read/written as it would have been at that time. This
state is exposed to userland via the read-only KVM_REG_MIPS_COUNT_RESUME
virtual register so that userland can determine the exact time the
master disable took effect.

This should allow userland to atomically save the state of the timer,
and later restore it.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:02:45 +02:00
James Hogan e30492bbe9 MIPS: KVM: Rewrite count/compare timer emulation
Previously the emulation of the CPU timer was just enough to get a Linux
guest running but some shortcuts were taken:
 - The guest timer interrupt was hard coded to always happen every 10 ms
   rather than being timed to when CP0_Count would match CP0_Compare.
 - The guest's CP0_Count register was based on the host's CP0_Count
   register. This isn't very portable and fails on cores without a
   CP_Count register implemented such as Ingenic XBurst. It also meant
   that the guest's CP0_Cause.DC bit to disable the CP0_Count register
   took no effect.
 - The guest's CP0_Count register was emulated by just dividing the
   host's CP0_Count register by 4. This resulted in continuity problems
   when used as a clock source, since when the host CP0_Count overflows
   from 0x7fffffff to 0x80000000, the guest CP0_Count transitions
   discontinuously from 0x1fffffff to 0xe0000000.

Therefore rewrite & fix emulation of the guest timer based on the
monotonic kernel time (i.e. ktime_get()). Internally a 32-bit count_bias
value is added to the frequency scaled nanosecond monotonic time to get
the guest's CP0_Count. The frequency of the timer is initialised to
100MHz and cannot yet be changed, but a later patch will allow the
frequency to be configured via the KVM_{GET,SET}_ONE_REG ioctl
interface.

The timer can now be stopped via the CP0_Cause.DC bit (by the guest or
via the KVM_SET_ONE_REG ioctl interface), at which point the current
CP0_Count is stored and can be read directly. When it is restarted the
bias is recalculated such that the CP0_Count value is continuous.

Due to the nature of hrtimer interrupts any read of the guest's
CP0_Count register while it is running triggers a check for whether the
hrtimer has expired, so that the guest/userland cannot observe the
CP0_Count passing CP0_Compare without queuing a timer interrupt. This is
also taken advantage of when stopping the timer to ensure that a pending
timer interrupt is queued.

This replaces the implementation of:
 - Guest read of CP0_Count
 - Guest write of CP0_Count
 - Guest write of CP0_Compare
 - Guest write of CP0_Cause
 - Guest read of HWR 2 (CC) with RDHWR
 - Host read of CP0_Count via KVM_GET_ONE_REG ioctl interface
 - Host write of CP0_Count via KVM_SET_ONE_REG ioctl interface
 - Host write of CP0_Compare via KVM_SET_ONE_REG ioctl interface
 - Host write of CP0_Cause via KVM_SET_ONE_REG ioctl interface

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:01:48 +02:00
James Hogan 3a0ba77408 MIPS: KVM: Migrate hrtimer to follow VCPU
When a VCPU is scheduled in on a different CPU, refresh the hrtimer used
for emulating count/compare so that it gets migrated to the same CPU.

This should prevent a timer interrupt occurring on a different CPU to
where the guest it relates to is running, which would cause the guest
timer interrupt not to be delivered until after the next guest exit.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:01:33 +02:00
James Hogan 044f0f03ec MIPS: KVM: Deliver guest interrupts after local_irq_disable()
When about to run the guest, deliver guest interrupts after disabling
host interrupts. This should prevent an hrtimer interrupt from being
handled after delivering guest interrupts, and therefore not delivering
the guest timer interrupt until after the next guest exit.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:01:10 +02:00
James Hogan 16fd5c1de4 MIPS: KVM: Add CP0_HWREna KVM register access
Implement KVM_{GET,SET}_ONE_REG ioctl based access to the guest CP0
HWREna register. This is so that userland can save and restore its
value so that RDHWR instructions don't have to be emulated by the guest.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:01:03 +02:00
James Hogan 7767b7d2f7 MIPS: KVM: Add CP0_UserLocal KVM register access
Implement KVM_{GET,SET}_ONE_REG ioctl based access to the guest CP0
UserLocal register. This is so that userland can save and restore its
value.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:00:57 +02:00
James Hogan f8be02daca MIPS: KVM: Add CP0_Count/Compare KVM register access
Implement KVM_{GET,SET}_ONE_REG ioctl based access to the guest CP0
Count and Compare registers. These registers are special in that writing
to them has side effects (adjusting the time until the next timer
interrupt) and reading of Count depends on the time. Therefore add a
couple of callbacks so that different implementations (trap & emulate or
VZ) can implement them differently depending on what the hardware
provides.

The trap & emulate versions mostly duplicate what happens when a T&E
guest reads or writes these registers, so it inherits the same
limitations which can be fixed in later patches.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:00:44 +02:00
James Hogan 48a3c4e4cd MIPS: KVM: Move KVM_{GET,SET}_ONE_REG definitions into kvm_host.h
Move the KVM_{GET,SET}_ONE_REG MIPS register id definitions out of
kvm_mips.c to kvm_host.h so that they can be shared between multiple
source files. This allows register access to be indirected depending on
the underlying implementation (trap & emulate or VZ).

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:00:35 +02:00
James Hogan fb6df0cdf0 MIPS: KVM: Add CP0_EPC KVM register access
Contrary to the comment, the guest CP0_EPC register cannot be set via
kvm_regs, since it is distinct from the guest PC. Add the EPC register
to the KVM_{GET,SET}_ONE_REG ioctl interface.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:00:26 +02:00
James Hogan b5dfc6c106 MIPS: KVM: Use tlb_write_random
When MIPS KVM needs to write a TLB entry for the guest it reads the
CP0_Random register, uses it to generate the CP_Index, and writes the
TLB entry using the TLBWI instruction (tlb_write_indexed()).

However there's an instruction for that, TLBWR (tlb_write_random()) so
use that instead.

This happens to also fix an issue with Ingenic XBurst cores where the
same TLB entry is replaced each time preventing forward progress on
stores due to alternating between TLB load misses for the instruction
fetch and TLB store misses.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 13:00:02 +02:00
James Hogan facaaec1a7 MIPS: KVM: Use local_flush_icache_range to fix RI on XBurst
MIPS KVM uses mips32_SyncICache to synchronise the icache with the
dcache after dynamically modifying guest instructions or writing guest
exception vector. However this uses rdhwr to get the SYNCI step, which
causes a reserved instruction exception on Ingenic XBurst cores.

It would seem to make more sense to use local_flush_icache_range()
instead which does the same thing but is more portable.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 12:59:54 +02:00
James Hogan 7006e2dfda MIPS: KVM: Allocate at least 16KB for exception handlers
Each MIPS KVM guest has its own copy of the KVM exception vector. This
contains the TLB refill exception handler at offset 0x000, the general
exception handler at offset 0x180, and interrupt exception handlers at
offset 0x200 in case Cause_IV=1. A common handler is copied to offset
0x2000 and offset 0x3000 is used for temporarily storing k1 during entry
from guest.

However the amount of memory allocated for this purpose is calculated as
0x200 rounded up to the next page boundary, which is insufficient if 4KB
pages are in use. This can lead to the common handler at offset 0x2000
being overwritten and infinitely recursive exceptions on the next exit
from the guest.

Increase the minimum size from 0x200 to 0x4000 to cover the full use of
the page.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 12:59:13 +02:00
Deng-Cheng Zhu 356d4c2040 MIPS: KVM: remove the stale memory alias support function unalias_gfn
The memory alias support has been removed since a1f4d39500 (KVM: Remove
memory alias support). So remove unalias_gfn from the MIPS port.

Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-29 21:38:21 +02:00
James Hogan 36c9549460 MIPS: KVM: Remove dead code in CP0 emulation
The code to check whether rd > MIPS_CP0_DESAVE is dead code, since
MIPS_CP0_DESAVE = 31 and rd is already masked with 0x1f. Remove it.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-03-19 17:01:50 +01:00
James Hogan 26f4f3b578 MIPS: KVM: Consult HWREna before emulating RDHWR
The ability to read hardware registers from userland with the RDHWR
instruction should depend upon the corresponding bit of the HWREna
register being set, otherwise a reserved instruction exception should be
generated.

However KVM's current emulation ignores the guest's HWREna and always
emulates RDHWR instructions even if the guest OS has disallowed them.

Therefore rework the RDHWR emulation code to check for privilege or the
corresponding bit in the guest HWREna bit. Also remove the #if 0 case
for the UserLocal register. I presume it was there for debug purposes
but it seems unnecessary now that the guest can control whether it
causes a guest exception.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-03-19 17:01:43 +01:00
James Hogan 1550567936 MIPS: KVM: Pass reserved instruction exceptions to guest
Previously a reserved instruction exception while in guest code would
cause a KVM internal error if kvm_mips_handle_ri() didn't recognise the
instruction (including a RDHWR from an unrecognised hardware register).

However the guest OS should really have the opportunity to catch the
exception so that it can take the appropriate actions such as sending a
SIGILL to the guest user process or emulating the instruction itself.

Therefore in these cases emulate a guest RI exception and only return
EMULATE_FAIL if that fails, being careful to revert the PC first in case
the exception occurred in a branch delay slot in which case the PC will
already point to the branch target.

Also turn the printk messages relating to these cases into kvm_debug
messages so that they aren't usually visible.

This allows crashme to run in the guest without killing the entire VM.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-03-19 17:01:34 +01:00
Paul Gortmaker 3b2663ca84 mips: delete non-required instances of include <linux/init.h>
None of these files are actually using any __init type directives
and hence don't need to include <linux/init.h>.  Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: John Crispin <blogic@openwrt.org>
Patchwork: http://patchwork.linux-mips.org/patch/6320/
2014-01-24 22:39:56 +01:00
James Hogan 08596b0a75 MIPS: KVM: remove shadow_tlb code
The kvm_mips_init_shadow_tlb() function is called from
kvm_arch_vcpu_init() and initialises entries 0 to
current_cpu_data.tlbsize-1 of the virtual cpu's shadow_tlb[64] array.

However newer cores with FTLBs can have a tlbsize > 64, for example the
ProAptiv I'm testing on has a total tlbsize of 576. This causes
kvm_mips_init_shadow_tlb() to overflow the shadow_tlb[64] array and
overwrite the comparecount_timer among other things, causing a lock up
when starting a KVM guest.

Aside from kvm_mips_init_shadow_tlb() which only initialises it, the
shadow_tlb[64] array is only actually used by the following functions:
 - kvm_shadow_tlb_put() & kvm_shadow_tlb_load()
     These are never called. The only call sites are #if 0'd out.
 - kvm_mips_dump_shadow_tlbs()
     This is never called.

It was originally added for trap & emulate, but turned out to be
unnecessary so it was disabled.

So instead of fixing the shadow_tlb initialisation code, lets just
remove the shadow_tlb[64] array and the above functions entirely. The
only functional change here is the removal of broken shadow_tlb
initialisation. The rest just deletes dead code.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Gleb Natapov <gleb@redhat.com>
Cc: kvm@vger.kernel.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Crispin <blogic@openwrt.org>
Patchwork: http://patchwork.linux-mips.org/patch/6384/
2014-01-24 22:39:56 +01:00
James Hogan e36059e508 MIPS: KVM: use common EHINV aware UNIQUE_ENTRYHI
When KVM is enabled and TLB invalidation is supported,
kvm_mips_flush_host_tlb() can cause a machine check exception due to
multiple matching TLB entries. This can occur on shutdown even when KVM
hasn't been actively used.

Commit adb78de9eae8 (MIPS: mm: Move UNIQUE_ENTRYHI macro to a header
file) created a common UNIQUE_ENTRYHI in asm/tlb.h but it didn't update
the copy of UNIQUE_ENTRYHI in kvm_tlb.c to use it.

Commit 36b175451399 (MIPS: tlb: Set the EHINV bit for TLBINVF cores when
invalidating the TLB) later added TLB invalidation (EHINV) support to
the common UNIQUE_ENTRYHI.

Therefore make kvm_tlb.c use the EHINV aware UNIQUE_ENTRYHI
implementation in asm/tlb.h too.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Gleb Natapov <gleb@redhat.com>
Cc: kvm@vger.kernel.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Reviewed-by: Markos Chandras <markos.chandras@imgtec.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Crispin <blogic@openwrt.org>
Patchwork: http://patchwork.linux-mips.org/patch/6383/
2014-01-24 22:39:56 +01:00
Aneesh Kumar K.V 5587027ce9 kvm: Add struct kvm arg to memslot APIs
We will use that in the later patch to find the kvm ops handler

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-17 15:49:23 +02:00
Linus Torvalds ae7a835cc5 Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Gleb Natapov:
 "The highlights of the release are nested EPT and pv-ticketlocks
  support (hypervisor part, guest part, which is most of the code, goes
  through tip tree).  Apart of that there are many fixes for all arches"

Fix up semantic conflicts as discussed in the pull request thread..

* 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (88 commits)
  ARM: KVM: Add newlines to panic strings
  ARM: KVM: Work around older compiler bug
  ARM: KVM: Simplify tracepoint text
  ARM: KVM: Fix kvm_set_pte assignment
  ARM: KVM: vgic: Bump VGIC_NR_IRQS to 256
  ARM: KVM: Bugfix: vgic_bytemap_get_reg per cpu regs
  ARM: KVM: vgic: fix GICD_ICFGRn access
  ARM: KVM: vgic: simplify vgic_get_target_reg
  KVM: MMU: remove unused parameter
  KVM: PPC: Book3S PR: Rework kvmppc_mmu_book3s_64_xlate()
  KVM: PPC: Book3S PR: Make instruction fetch fallback work for system calls
  KVM: PPC: Book3S PR: Don't corrupt guest state when kernel uses VMX
  KVM: x86: update masterclock when kvmclock_offset is calculated (v2)
  KVM: PPC: Book3S: Fix compile error in XICS emulation
  KVM: PPC: Book3S PR: return appropriate error when allocation fails
  arch: powerpc: kvm: add signed type cast for comparation
  KVM: x86: add comments where MMIO does not return to the emulator
  KVM: vmx: count exits to userspace during invalid guest emulation
  KVM: rename __kvm_io_bus_sort_cmp to kvm_io_bus_cmp
  kvm: optimize away THP checks in kvm_is_mmio_pfn()
  ...
2013-09-04 18:15:06 -07:00
David Daney ea69f28ddf mips/kvm: Make kvm_locore.S 64-bit buildable/safe.
We need to use more of the Macros in asm.h to allow kvm_locore.S to
build in a 64-bit kernel.

For 32-bit there is no change in the generated object code.

Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-08-26 12:30:49 +03:00
David Daney bb48c2fc64 mips/kvm: Cleanup .push/.pop directives in kvm_locore.S
There are:
	.set	push
	.set	noreorder
	.set	noat
	 .
	 .
	 .
	.set	pop

Sequences all over the place in this file, but in some places the
final ".set pop" is erroneously converted to ".set push", so none of
these really do what they appear to.

Clean up the whole mess by moving ".set noreorder", ".set noat" to the
top, and get rid of everything else.

Generated object code is unchanged.

Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-08-26 12:30:39 +03:00