Commit Graph

47587 Commits

Author SHA1 Message Date
Takuya Yoshikawa 4482b06c04 KVM: ia64: fix dirty_log_lock spin_lock section not to include get_dirty_log()
kvm_get_dirty_log() calls copy_to_user(). So we need to narrow the
dirty_log_lock spin_lock section not to include this.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:15 +03:00
Alexander Graf 4d29bdbf12 KVM: PPC: Make BAT only guest segments work
When a guest sets its SR entry to invalid, we may still find a
corresponding entry in a BAT. So we need to make sure we're not
faulting on invalid SR entries, but instead just claim them to be
BAT resolved.

This resolves breakage experienced when using libogc based guests.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:14 +03:00
Alexander Graf 3b249157c0 KVM: PPC: Use kernel hash function
The linux kernel already provides a hash function. Let's reuse that
instead of reinventing the wheel!

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:13 +03:00
Alexander Graf a576f7a294 KVM: PPC: Remove obsolete kvmppc_mmu_find_pte
Initially we had to search for pte entries to invalidate them. Since
the logic has improved since then, we can just get rid of the search
function.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:12 +03:00
Sheng Yang 6c3f604117 KVM: x86: Enable AVX for guest
Enable Intel(R) Advanced Vector Extension(AVX) for guest.

The detection of AVX feature includes OSXSAVE bit testing. When OSXSAVE bit is
not set, even if AVX is supported, the AVX instruction would result in UD as
well. So we're safe to expose AVX bits to guest directly.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:10 +03:00
Avi Kivity 7ac77099ce KVM: Prevent internal slots from being COWed
If a process with a memory slot is COWed, the page will change its address
(despite having an elevated reference count).  This breaks internal memory
slots which have their physical addresses loaded into vmcs registers (see
the APIC access memory slot).

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:08 +03:00
Avi Kivity a8eeb04a44 KVM: Add mini-API for vcpu->requests
Makes it a little more readable and hackable.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:05 +03:00
Avi Kivity 36633f32ba KVM: i8259: simplify pic_irq_request() calling sequence
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:04 +03:00
Avi Kivity 073d46133a KVM: i8259: reduce excessive abstraction for pic_irq_request()
Part of the i8259 code pretends it isn't part of kvm, but we know better.
Reduce excessive abstraction, eliminating callbacks and void pointers.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:03 +03:00
Avi Kivity b74a07beed KVM: Remove kernel-allocated memory regions
Equivalent (and better) functionality is provided by user-allocated memory
regions.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:01 +03:00
Avi Kivity a1f4d39500 KVM: Remove memory alias support
As advertised in feature-removal-schedule.txt.  Equivalent support is provided
by overlapping memory regions.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:00 +03:00
Christian Borntraeger fc34531db3 KVM: s390: Don't exit SIE on SIGP sense running
Newer (guest) kernels use sigp sense running in their spinlock
implementation to check if the other cpu is running before yielding
the processor. This revealed some wrong guest settings, causing
unnecessary exits for every sigp sense running.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:59 +03:00
Christian Borntraeger 971eb77f87 KVM: s390: Fix build failure due to centralized vcpu locking patches
This patch fixes
ERROR: "__kvm_s390_vcpu_store_status" [arch/s390/kvm/kvm.ko] undefined!

triggered by
commit 3268c56840dcee78c3e928336550f4e1861504c4 (kvm.git)
Author: Avi Kivity <avi@redhat.com>
Date:   Thu May 13 12:21:46 2010 +0300
    KVM: s390: Centrally lock arch specific vcpu ioctls

Reported-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:58 +03:00
Avi Kivity d1ac91d8a2 KVM: Consolidate load/save temporary buffer allocation and freeing
Instead of three temporary variables and three free calls, have one temporary
variable (with four names) and one free call.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:57 +03:00
Avi Kivity a1a005f36e KVM: Fix xsave and xcr save/restore memory leak
We allocate temporary kernel buffers for these structures, but never free them.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:56 +03:00
Wei Yongjun 7d5993d63f KVM: x86 emulator: fix group3 instruction decoding
Group 3 instruction with ModRM reg field as 001 is
defined as test instruction under AMD arch, and
emulate_grp3() is ready for emulate it, so fix the
decoding.

static inline int emulate_grp3(...)
{
	...
	switch (c->modrm_reg) {
	case 0 ... 1:   /* test */
		emulate_2op_SrcV("test", c->src, c->dst, ctxt->eflags);
	...
}

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:55 +03:00
Asias He 6045be5dea KVM: PPC: fix uninitialized variable warning in kvm_ppc_core_deliver_interrupts
Fixes:
arch/powerpc/kvm/booke.c: In function 'kvmppc_core_deliver_interrupts':
arch/powerpc/kvm/booke.c:147: warning: 'msr_mask' may be used uninitialized in this function

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:53 +03:00
Chris Lalancette e7dca5c0eb KVM: x86: Allow any LAPIC to accept PIC interrupts
If the guest wants to accept timer interrupts on a CPU other
than the BSP, we need to remove this gate.

Signed-off-by: Chris Lalancette <clalance@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:50 +03:00
Chris Lalancette 33572ac0ad KVM: x86: Introduce a workqueue to deliver PIT timer interrupts
We really want to "kvm_set_irq" during the hrtimer callback,
but that is risky because that is during interrupt context.
Instead, offload the work to a workqueue, which is a bit safer
and should provide most of the same functionality.

Signed-off-by: Chris Lalancette <clalance@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:49 +03:00
Wei Yongjun c37eda1384 KVM: x86 emulator: fix pusha instruction emulation
emulate pusha instruction only writeback the last
EDI register, but the other registers which need
to be writeback is ignored. This patch fixed it.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:48 +03:00
Zachary Amsden bd371396b3 KVM: x86: fix -DDEBUG oops
Fix a slight error with assertion in local APIC code.

Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:46 +03:00
Xiao Guangrong 1047df1fb6 KVM: MMU: don't walk every parent pages while mark unsync
While we mark the parent's unsync_child_bitmap, if the parent is already
unsynced, it no need walk it's parent, it can reduce some unnecessary
workload

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:45 +03:00
Xiao Guangrong 7a8f1a74e4 KVM: MMU: clear unsync_child_bitmap completely
In current code, some page's unsync_child_bitmap is not cleared completely
in mmu_sync_children(), for example, if two PDPEs shard one PDT, one of
PDPE's unsync_child_bitmap is not cleared.

Currently, it not harm anything just little overload, but it's the prepare
work for the later patch

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:44 +03:00
Xiao Guangrong ebdea638df KVM: MMU: cleanup for __mmu_unsync_walk()
Decrease sp->unsync_children after clear unsync_child_bitmap bit

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:43 +03:00
Xiao Guangrong be71e061d1 KVM: MMU: don't mark pte notrap if it's just sync transient
If the sync-sp just sync transient, don't mark its pte notrap

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:42 +03:00
Xiao Guangrong f918b44352 KVM: MMU: avoid double write protected in sync page path
The sync page is already write protected in mmu_sync_children(), don't
write protected it again

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:41 +03:00
Xiao Guangrong cb83cad2e7 KVM: MMU: cleanup for dirty page judgment
Using wrap function to cleanup page dirty judgment

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:39 +03:00
Xiao Guangrong ac3cd03cca KVM: MMU: rename 'page' and 'shadow_page' to 'sp'
Rename 'page' and 'shadow_page' to 'sp' to better fit the context

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:38 +03:00
Sheng Yang 2d5b5a6655 KVM: x86: XSAVE/XRSTOR live migration support
This patch enable save/restore of xsave state.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:37 +03:00
Denis Kirjanov 69b61833f7 KVM: PPC: fix build warning in kvm_arch_vcpu_ioctl_run
Fix compile warning:
  CC [M]  arch/powerpc/kvm/powerpc.o
  arch/powerpc/kvm/powerpc.c: In function 'kvm_arch_vcpu_ioctl_run':
  arch/powerpc/kvm/powerpc.c:290: warning: 'gpr' may be used uninitialized in this function
  arch/powerpc/kvm/powerpc.c:290: note: 'gpr' was declared here

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:36 +03:00
Avi Kivity 2390218b6a KVM: Fix mov cr3 #GP at wrong instruction
On Intel, we call skip_emulated_instruction() even if we injected a #GP,
resulting in the #GP pointing at the wrong address.

Fix by injecting the exception and skipping the instruction at the same place,
so we can do just one or the other.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:35 +03:00
Avi Kivity a83b29c6ad KVM: Fix mov cr4 #GP at wrong instruction
On Intel, we call skip_emulated_instruction() even if we injected a #GP,
resulting in the #GP pointing at the wrong address.

Fix by injecting the exception and skipping the instruction at the same place,
so we can do just one or the other.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:34 +03:00
Avi Kivity 49a9b07edc KVM: Fix mov cr0 #GP at wrong instruction
On Intel, we call skip_emulated_instruction() even if we injected a #GP,
resulting in the #GP pointing at the wrong address.

Fix by injecting the exception and skipping the instruction at the same place,
so we can do just one or the other.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:46:32 +03:00
Dexuan Cui 2acf923e38 KVM: VMX: Enable XSAVE/XRSTOR for guest
This patch enable guest to use XSAVE/XRSTOR instructions.

We assume that host_xcr0 would use all possible bits that OS supported.

And we loaded xcr0 in the same way we handled fpu - do it as late as we can.

Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:31 +03:00
Avi Kivity f495c6e5e8 KVM: VMX: Fix incorrect rcu deref in rmode_tss_base()
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:30 +03:00
Andi Kleen a24e809902 KVM: Fix unused but set warnings
No real bugs in this one.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:29 +03:00
Xiao Guangrong 3b5d132186 KVM: MMU: delay local tlb flush
delay local tlb flush until enter guest moden, it can reduce vpid flush
frequency and reduce remote tlb flush IPI(if KVM_REQ_TLB_FLUSH bit is
already set, IPI is not sent)

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:26 +03:00
Xiao Guangrong 5304efde6a KVM: MMU: use wrapper function to flush local tlb
Use kvm_mmu_flush_tlb() function instead of calling
kvm_x86_ops->tlb_flush(vcpu) directly.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:25 +03:00
Xiao Guangrong 4f78fd08e9 KVM: MMU: remove unnecessary remote tlb flush
This remote tlb flush is no necessary since we have synced while
sp is zapped

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:24 +03:00
Xiao Guangrong 4b9d3a0451 KVM: VMX: fix rcu usage warning in init_rmode()
fix:

[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
include/linux/kvm_host.h:258 invoked rcu_dereference_check() without protection!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 1
1 lock held by qemu-system-x86/3796:
 #0:  (&vcpu->mutex){+.+.+.}, at: [<ffffffffa0217fd8>] vcpu_load+0x1a/0x66 [kvm]

stack backtrace:
Pid: 3796, comm: qemu-system-x86 Not tainted 2.6.34 #25
Call Trace:
 [<ffffffff81070ed1>] lockdep_rcu_dereference+0x9d/0xa5
 [<ffffffffa0214fdf>] gfn_to_memslot_unaliased+0x65/0xa0 [kvm]
 [<ffffffffa0216139>] gfn_to_hva+0x22/0x4c [kvm]
 [<ffffffffa0216217>] kvm_write_guest_page+0x2a/0x7f [kvm]
 [<ffffffffa0216286>] kvm_clear_guest_page+0x1a/0x1c [kvm]
 [<ffffffffa0278239>] init_rmode+0x3b/0x180 [kvm_intel]
 [<ffffffffa02786ce>] vmx_set_cr0+0x350/0x4d3 [kvm_intel]
 [<ffffffffa02274ff>] kvm_arch_vcpu_ioctl_set_sregs+0x122/0x31a [kvm]
 [<ffffffffa021859c>] kvm_vcpu_ioctl+0x578/0xa3d [kvm]
 [<ffffffff8106624c>] ? cpu_clock+0x2d/0x40
 [<ffffffff810f7d86>] ? fget_light+0x244/0x28e
 [<ffffffff810709b9>] ? trace_hardirqs_off_caller+0x1f/0x10e
 [<ffffffff8110501b>] vfs_ioctl+0x32/0xa6
 [<ffffffff81105597>] do_vfs_ioctl+0x47f/0x4b8
 [<ffffffff813ae654>] ? sub_preempt_count+0xa3/0xb7
 [<ffffffff810f7da8>] ? fget_light+0x266/0x28e
 [<ffffffff810f7c53>] ? fget_light+0x111/0x28e
 [<ffffffff81105617>] sys_ioctl+0x47/0x6a
 [<ffffffff81002c1b>] system_call_fastpath+0x16/0x1b

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:23 +03:00
Gui Jianfeng 1760dd4939 KVM: VMX: rename vpid_sync_vcpu_all() to vpid_sync_vcpu_single()
The name "pid_sync_vcpu_all" isn't appropriate since it just affect
a single vpid, so rename it to vpid_sync_vcpu_single().

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:16 +03:00
Gui Jianfeng b9d762fa79 KVM: VMX: Add all-context INVVPID type support
Add all-context INVVPID type support.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:46:04 +03:00
Xiao Guangrong 0671a8e75d KVM: MMU: reduce remote tlb flush in kvm_mmu_pte_write()
collect remote tlb flush in kvm_mmu_pte_write() path

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:39:28 +03:00
Xiao Guangrong f41d335a02 KVM: MMU: traverse sp hlish safely
Now, we can safely to traverse sp hlish

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:39:28 +03:00
Xiao Guangrong d98ba05365 KVM: MMU: gather remote tlb flush which occurs during page zapped
Using kvm_mmu_prepare_zap_page() and kvm_mmu_zap_page() instead of
kvm_mmu_zap_page() that can reduce remote tlb flush IPI

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:39:27 +03:00
Xiao Guangrong 103ad25a86 KVM: MMU: don't get free page number in the loop
In the later patch, we will modify sp's zapping way like below:

	kvm_mmu_prepare_zap_page A
	kvm_mmu_prepare_zap_page B
	kvm_mmu_prepare_zap_page C
	....
	kvm_mmu_commit_zap_page

[ zaped multiple sps only need to call kvm_mmu_commit_zap_page once ]

In __kvm_mmu_free_some_pages() function, the free page number is
getted form 'vcpu->kvm->arch.n_free_mmu_pages' in loop, it will
hinders us to apply kvm_mmu_prepare_zap_page() and kvm_mmu_commit_zap_page()
since kvm_mmu_prepare_zap_page() not free sp.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:39:27 +03:00
Xiao Guangrong 7775834a23 KVM: MMU: split the operations of kvm_mmu_zap_page()
Using kvm_mmu_prepare_zap_page() and kvm_mmu_commit_zap_page() to
split kvm_mmu_zap_page() function, then we can:

- traverse hlist safely
- easily to gather remote tlb flush which occurs during page zapped

Those feature can be used in the later patches

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:39:27 +03:00
Xiao Guangrong 7ae680eb2d KVM: MMU: introduce some macros to cleanup hlist traverseing
Introduce for_each_gfn_sp() and for_each_gfn_indirect_valid_sp() to
cleanup hlist traverseing

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:39:27 +03:00
Xiao Guangrong 03116aa57e KVM: MMU: skip invalid sp when unprotect page
In kvm_mmu_unprotect_page(), the invalid sp can be skipped

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:39:26 +03:00
Gui Jianfeng 518c8aee5c KVM: VMX: Make sure single type invvpid is supported before issuing invvpid instruction
According to SDM, we need check whether single-context INVVPID type is supported
before issuing invvpid instruction.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Reviewed-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:39:26 +03:00