linux

Commit Graph

Author	SHA1	Message	Date
Joerg Roedel	d4f8cf664e	KVM: MMU: Propagate the right fault back to the guest after gva_to_gpa This patch implements logic to make sure that either a page-fault/page-fault-vmexit or a nested-page-fault-vmexit is propagated back to the guest. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:40 +02:00
Joerg Roedel	02f59dc9f1	KVM: MMU: Introduce init_kvm_nested_mmu() This patch introduces the init_kvm_nested_mmu() function which is used to re-initialize the nested mmu when the l2 guest changes its paging mode. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:39 +02:00
Joerg Roedel	3d06b8bfd4	KVM: MMU: Introduce kvm_read_nested_guest_page() This patch introduces the kvm_read_guest_page_x86 function which reads from the physical memory of the guest. If the guest is running in guest-mode itself with nested paging enabled it will read from the guest's guest physical memory instead. The patch also changes changes the code to use this function where it is necessary. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:38 +02:00
Joerg Roedel	2329d46d21	KVM: MMU: Make walk_addr_generic capable for two-level walking This patch uses kvm_read_guest_page_tdp to make the walk_addr_generic functions suitable for two-level page table walking. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:38 +02:00
Joerg Roedel	ec92fe44e7	KVM: X86: Add kvm_read_guest_page_mmu function This patch adds a function which can read from the guests physical memory or from the guest's guest physical memory. This will be used in the two-dimensional page table walker. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:37 +02:00
Joerg Roedel	6539e738f6	KVM: MMU: Implement nested gva_to_gpa functions This patch adds the functions to do a nested l2_gva to l1_gpa page table walk. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:36 +02:00
Joerg Roedel	14dfe855f9	KVM: X86: Introduce pointer to mmu context used for gva_to_gpa This patch introduces the walk_mmu pointer which points to the mmu-context currently used for gva_to_gpa translations. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:35 +02:00
Joerg Roedel	c30a358d33	KVM: MMU: Add infrastructure for two-level page walker This patch introduces a mmu-callback to translate gpa addresses in the walk_addr code. This is later used to translate l2_gpa addresses into l1_gpa addresses. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:34 +02:00
Joerg Roedel	1e301feb07	KVM: MMU: Introduce generic walk_addr function This is the first patch in the series towards a generic walk_addr implementation which could walk two-dimensional page tables in the end. In this first step the walk_addr function is renamed into walk_addr_generic which takes a mmu context as an additional parameter. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:33 +02:00
Joerg Roedel	8df25a328a	KVM: MMU: Track page fault data in struct vcpu This patch introduces a struct with two new fields in vcpu_arch for x86: * fault.address * fault.error_code This will be used to correctly propagate page faults back into the guest when we could have either an ordinary page fault or a nested page fault. In the case of a nested page fault the fault-address is different from the original address that should be walked. So we need to keep track about the real fault-address. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:33 +02:00
Joerg Roedel	3241f22da8	KVM: MMU: Let is_rsvd_bits_set take mmu context instead of vcpu This patch changes is_rsvd_bits_set() function prototype to take only a kvm_mmu context instead of a full vcpu. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:32 +02:00
Joerg Roedel	52fde8df7d	KVM: MMU: Introduce kvm_init_shadow_mmu helper function Some logic of the init_kvm_softmmu function is required to build the Nested Nested Paging context. So factor the required logic into a seperate function and export it. Also make the whole init path suitable for more than one mmu context. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:32 +02:00
Joerg Roedel	cb659db8a7	KVM: MMU: Introduce inject_page_fault function pointer This patch introduces an inject_page_fault function pointer into struct kvm_mmu which will be used to inject a page fault. This will be used later when Nested Nested Paging is implemented. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:31 +02:00
Joerg Roedel	5777ed340d	KVM: MMU: Introduce get_cr3 function pointer This function pointer in the MMU context is required to implement Nested Nested Paging. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:31 +02:00
Joerg Roedel	1c97f0a04c	KVM: X86: Introduce a tdp_set_cr3 function This patch introduces a special set_tdp_cr3 function pointer in kvm_x86_ops which is only used for tpd enabled mmu contexts. This allows to remove some hacks from svm code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:30 +02:00
Joerg Roedel	f43addd461	KVM: MMU: Make set_cr3 a function pointer in kvm_mmu This is necessary to implement Nested Nested Paging. As a side effect this allows some cleanups in the SVM nested paging code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:29 +02:00
Joerg Roedel	c5a78f2b64	KVM: MMU: Make tdp_enabled a mmu-context parameter This patch changes the tdp_enabled flag from its global meaning to the mmu-context and renames it to direct_map there. This is necessary for Nested SVM with emulation of Nested Paging where we need an extra MMU context to shadow the Nested Nested Page Table. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:28 +02:00
Joerg Roedel	957446afce	KVM: MMU: Check for root_level instead of long mode The walk_addr function checks for !is_long_mode in its 64 bit version. But what is meant here is a check for pae paging. Change the condition to really check for pae paging so that it also works with nested nested paging. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:27 +02:00
Jes Sorensen	7b91409822	KVM: x86: Emulate MSR_EBC_FREQUENCY_ID Some operating systems store data about the host processor at the time of installation, and when booted on a more uptodate cpu tries to read MSR_EBC_FREQUENCY_ID. This has been found with XP. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:52:27 +02:00
Jes Sorensen	b9a52c4b78	x86: Define MSR_EBC_FREQUENCY_ID Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:52:26 +02:00
Roedel, Joerg	b75f4eb341	KVM: SVM: Clean up rip handling in vmrun emulation This patch changes the rip handling in the vmrun emulation path from using next_rip to the generic kvm register access functions. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:52:25 +02:00
Joerg Roedel	cda0008299	KVM: SVM: Restore correct registers after sel_cr0 intercept emulation This patch implements restoring of the correct rip, rsp, and rax after the svm emulation in KVM injected a selective_cr0 write intercept into the guest hypervisor. The problem was that the vmexit is emulated in the instruction emulation which later commits the registers right after the write-cr0 instruction. So the l1 guest will continue to run with the l2 rip, rsp and rax resulting in unpredictable behavior. This patch is not the final word, it is just an easy patch to fix the issue. The real fix will be done when the instruction emulator is made aware of nested virtualization. Until this is done this patch fixes the issue and provides an easy way to fix this in -stable too. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:52:24 +02:00
Joerg Roedel	f87f928882	KVM: MMU: Fix 32 bit legacy paging with NPT This patch fixes 32 bit legacy paging with NPT enabled. The mmu_check_root call on the top-level of the loop causes root_gfn to take values (in the tdp_enabled path) which are outside of guest memory. So the mmu_check_root call fails at some point in the loop interation causing the guest to tiple-fault. This patch changes the mmu_check_root calls to the places where they are really necessary. As a side-effect it introduces a check for the root of a pae page table too. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:52:23 +02:00
Alexander Graf	26e673c300	KVM: PPC: Move of include to __KERNEL__ section We have to protect the include for linux/of.h by __KERNEL__ so it doesn't accidently get referenced outside. This patch fixes this and makes the tree compile again. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:23 +02:00
Alexander Graf	344941beb9	KVM: PPC: Fix compile error in e500_tlb.c The e500_tlb.c file didn't compile for me due to the following error: arch/powerpc/kvm/e500_tlb.c: In function ‘kvmppc_e500_shadow_map’: arch/powerpc/kvm/e500_tlb.c:300: error: format ‘%lx’ expects type ‘long unsigned int’, but argument 2 has type ‘gfn_t’ So let's explicitly cast the argument to make printk happy. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:22 +02:00
Kyle Moffett	21e537ba14	KVM: PPC: e500_tlb: Fix a minor copy-paste tracing bug The kvmppc_e500_stlbe_invalidate() function was trying to pass too many parameters to trace_kvm_stlb_inval(). This appears to be a bad copy-paste from a call to trace_kvm_stlb_write(). Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:21 +02:00
Alexander Graf	c5335f1765	KVM: PPC: Implement level interrupts for BookE BookE also wants to support level based interrupts, so let's implement all the necessary logic there. We need to trick a bit here because the irqprios are 1:1 assigned to architecture defined values. But since there is some space left there, we can just pick a random one and move it later on - it's internal anyways. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:20 +02:00
Alexander Graf	7b4203e8cb	KVM: PPC: Expose level based interrupt cap Now that we have all the level interrupt magic in place, let's expose the capability to user space, so it can make use of it! Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:19 +02:00
Alexander Graf	17bd158006	KVM: PPC: Implement Level interrupts on Book3S The current interrupt logic is just completely broken. We get a notification from user space, telling us that an interrupt is there. But then user space expects us that we just acknowledge an interrupt once we deliver it to the guest. This is not how real hardware works though. On real hardware, the interrupt controller pulls the external interrupt line until it gets notified that the interrupt was received. So in reality we have two events: pulling and letting go of the interrupt line. To maintain backwards compatibility, I added a new request for the pulling part. The letting go part was implemented earlier already. With this in place, we can now finally start guests that do not randomly stall and stop to work at random times. This patch implements above logic for Book3S. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:19 +02:00
Alexander Graf	591bd8e7b4	KVM: PPC: Enable napping only for Book3s_64 Before I incorrectly enabled napping also for BookE, which would result in needless dcache flushes. Since we only need to force enable napping on Book3s_64 because it doesn't go into MSR_POW otherwise, we can just #ifdef that code to this particular platform. Reported-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:19 +02:00
Hollis Blanchard	ebc65874e9	KVM: PPC: allow ppc440gp to pass the compatibility check Match only the first part of cur_cpu_spec->platform. 440GP (the first 440 processor) is identified by the string "ppc440gp", while all later 440 processors use simply "ppc440". Signed-off-by: Hollis Blanchard <hollis_blanchard@mentor.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:18 +02:00
Hollis Blanchard	0b3bafc8e5	KVM: PPC: fix compilation of "dump tlbs" debug function Missing local variable. Signed-off-by: Hollis Blanchard <hollis_blanchard@mentor.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:17 +02:00
Hollis Blanchard	082decf29a	KVM: PPC: initialize IVORs in addition to IVPR Developers can now tell at a glace the exact type of the premature interrupt, instead of just knowing that there was some premature interrupt. Signed-off-by: Hollis Blanchard <hollis_blanchard@mentor.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:17 +02:00
Alexander Graf	296c19d0b4	KVM: PPC: Don't put MSR_POW in MSR On Book3S a mtmsr with the MSR_POW bit set indicates that the OS is in idle and only needs to be waked up on the next interrupt. Now, unfortunately we let that bit slip into the stored MSR value which is not what the real CPU does, so that we ended up executing code like this: r = mfmsr(); /* r containts MSR_POW */ mtmsr(r \| MSR_EE); This obviously breaks, as we're going into idle mode in code sections that don't expect to be idling. This patch masks MSR_POW out of the stored MSR value on wakeup, making guests happy again. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:16 +02:00
Alexander Graf	8b6db3bc96	KVM: PPC: Implement correct SID mapping on Book3s_32 Up until now we were doing segment mappings wrong on Book3s_32. For Book3s_64 we were using a trick where we know that a single mmu_context gives us 16 bits of context ids. The mm system on Book3s_32 instead uses a clever algorithm to distribute VSIDs across the available range, so a context id really only gives us 16 available VSIDs. To keep at least a few guest processes in the SID shadow, let's map a number of contexts that we can use as VSID pool. This makes the code be actually correct and shouldn't hurt performance too much. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:15 +02:00
Alexander Graf	ad0873763a	KVM: PPC: Force enable nap on KVM There are some heuristics in the PPC power management code that try to find out if the particular hardware we're running on supports proper power management or just hangs the machine when going into nap mode. Since we know that KVM is safe with nap, let's force enable it in the PV code once we're certain that we are on a KVM VM. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:15 +02:00
Alexander Graf	df08bd1026	KVM: PPC: Make PV mtmsrd L=1 work with r30 and r31 We had an arbitrary limitation in mtmsrd L=1 that kept us from using r30 and r31 as input registers. Let's get rid of that and get more potential speedups! Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:14 +02:00
Alexander Graf	9ee18b1e08	KVM: PPC: Update int_pending also on dequeue When having a decrementor interrupt pending, the dequeuing happens manually through an mtdec instruction. This instruction simply calls dequeue on that interrupt, so the int_pending hint doesn't get updated. This patch enables updating the int_pending hint also on dequeue, thus correctly enabling guests to stay in guest contexts more often. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:14 +02:00
Alexander Graf	512ba59ed9	KVM: PPC: Make PV mtmsr work with r30 and r31 So far we've been restricting ourselves to r0-r29 as registers an mtmsr instruction could use. This was bad, as there are some code paths in Linux actually using r30. So let's instead handle all registers gracefully and get rid of that stupid limitation Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:13 +02:00
Alexander Graf	cbe487fac7	KVM: PPC: Add mtsrin PV code This is the guest side of the mtsr acceleration. Using this a guest can now call mtsrin with almost no overhead as long as it ensures that it only uses it with (MSR_IR\|MSR_DR) == 0. Linux does that, so we're good. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:12 +02:00
Alexander Graf	df1bfa25d8	KVM: PPC: Put segment registers in shared page Now that the actual mtsr doesn't do anything anymore, we can move the sr contents over to the shared page, so a guest can directly read and write its sr contents from guest context. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:11 +02:00
Alexander Graf	8e8651783f	KVM: PPC: Interpret SR registers on demand Right now we're examining the contents of Book3s_32's segment registers when the register is written and put the interpreted contents into a struct. There are two reasons this is bad. For starters, the struct has worse real-time performance, as it occupies more ram. But the more important part is that with segment registers being interpreted from their raw values, we can put them in the shared page, allowing guests to mess with them directly. This patch makes the internal representation of SRs be u32s. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:11 +02:00
Alexander Graf	c1c88e2fa1	KVM: PPC: Move BAT handling code into spr handler The current approach duplicates the spr->bat finding logic and makes it harder to reuse the actually used variables. So let's move everything down to the spr handler. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:10 +02:00
Alexander Graf	7508e16c9f	KVM: PPC: Add feature bitmap for magic page We will soon add SR PV support to the shared page, so we need some infrastructure that allows the guest to query for features KVM exports. This patch adds a second return value to the magic mapping that indicated to the guest which features are available. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:09 +02:00
Alexander Graf	cb24c50826	KVM: PPC: Remove unused define The define VSID_ALL is unused. Let's remove it. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:08 +02:00
Alexander Graf	b9877ce299	KVM: PPC: Revert "KVM: PPC: Use kernel hash function" It turns out the in-kernel hash function is sub-optimal for our subtle hash inputs where every bit is significant. So let's revert to the original hash functions. This reverts commit 05340ab4f9a6626f7a2e8f9fe5397c61d494f445. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:08 +02:00
Alexander Graf	928d78be54	KVM: PPC: Move slb debugging to tracepoints This patch moves debugging printks for shadow SLB debugging over to tracepoints. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:07 +02:00
Alexander Graf	e7c1d14e3b	KVM: PPC: Make invalidation code more reliable There is a race condition in the pte invalidation code path where we can't be sure if a pte was invalidated already. So let's move the spin lock around to get rid of the race. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:06 +02:00
Alexander Graf	2e602847d9	KVM: PPC: Don't flush PTEs on NX/RO hit When hitting a no-execute or read-only data/inst storage interrupt we were flushing the respective PTE so we're sure it gets properly overwritten next. According to the spec, this is unnecessary though. The guest issues a tlbie anyways, so we're safe to just keep the PTE around and have it manually removed from the guest, saving us a flush. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:06 +02:00
Alexander Graf	4cb6b7ea0c	KVM: PPC: Preload magic page when in kernel mode When the guest jumps into kernel mode and has the magic page mapped, theres a very high chance that it will also use it. So let's detect that scenario and map the segment accordingly. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:05 +02:00
Alexander Graf	c60b4cf701	KVM: PPC: Add tracepoints for generic spte flushes The different ways of flusing shadow ptes have their own debug prints which use stupid old printk. Let's move them to tracepoints, making them easier available, faster and possible to activate on demand Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:04 +02:00
Alexander Graf	c22c31963b	KVM: PPC: Fix sid map search after flush After a flush the sid map contained lots of entries with 0 for their gvsid and hvsid value. Unfortunately, 0 can be a real value the guest searches for when looking up a vsid so it would incorrectly find the host's 0 hvsid mapping which doesn't belong to our sid space. So let's also check for the valid bit that indicated that the sid we're looking at actually contains useful data. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:03 +02:00
Alexander Graf	8696ee4312	KVM: PPC: Move pte invalidate debug code to tracepoint This patch moves the SPTE flush debug printk over to tracepoints. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:03 +02:00
Alexander Graf	4c4eea7769	KVM: PPC: Add tracepoint for generic mmu map This patch moves the generic mmu map debugging over to tracepoints. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:02 +02:00
Alexander Graf	82fdee7bce	KVM: PPC: Move book3s_64 mmu map debug print to trace point This patch moves Book3s MMU debugging over to tracepoints. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:01 +02:00
Alexander Graf	bed1ed9860	KVM: PPC: Move EXIT_DEBUG partially to tracepoints We have a debug printk on every exit that is usually #ifdef'ed out. Using tracepoints makes a lot more sense here though, as they can be dynamically enabled. This patch converts the most commonly used debug printks of EXIT_DEBUG to tracepoints. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:00 +02:00
Takuya Yoshikawa	55438cc751	KVM: ia64: define kvm_lapic_enabled() to fix a compile error The following patch commit 57ce1659316f4ca298919649f9b1b55862ac3826 KVM: x86: In DM_LOWEST, only deliver interrupts to vcpus with enabled LAPIC's ignored the fact that kvm_irq_delivery_to_apic() was also used by ia64. We define kvm_lapic_enabled() to fix a compile error caused by this. This will have the same effect as reverting the problematic patch for ia64. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:00 +02:00
Xiao Guangrong	30644b902c	KVM: MMU: lower the aduit frequency The audit is very high overhead, so we need lower the frequency to assure the guest is running. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:59 +02:00
Xiao Guangrong	eb2591865a	KVM: MMU: improve spte audit Both audit_mappings() and audit_sptes_have_rmaps() need to walk vcpu's page table, so we can do these checking in a spte walking Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:58 +02:00
Xiao Guangrong	49edf87806	KVM: MMU: improve active sp audit Both audit_rmap() and audit_write_protection() need to walk all active sp, so we can do these checking in a sp walking Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:57 +02:00
Xiao Guangrong	2f4f337248	KVM: MMU: move audit to a separate file Move the audit code from arch/x86/kvm/mmu.c to arch/x86/kvm/mmu_audit.c Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:57 +02:00
Xiao Guangrong	8b1fe17cc7	KVM: MMU: support disable/enable mmu audit dynamicly Add a r/w module parameter named 'mmu_audit', it can control audit enable/disable: enable: echo 1 > /sys/module/kvm/parameters/mmu_audit disable: echo 0 > /sys/module/kvm/parameters/mmu_audit This patch not change the logic Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:56 +02:00
Jes Sorensen	84e0cefa8d	KVM: Fix guest kernel crash on MSR_K7_CLK_CTL MSR_K7_CLK_CTL is a no longer documented MSR, which is only relevant on said old AMD CPU models. This change returns the expected value, which the Linux kernel is expecting to avoid writing back the MSR, plus it ignores all writes to the MSR. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:55 +02:00
Avi Kivity	9ed049c3b6	KVM: i8259: Make ICW1 conform to spec ICW is not a full reset, instead it resets a limited number of registers in the PIC. Change ICW1 emulation to only reset those registers. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:54 +02:00
Avi Kivity	7d9ddaedd8	KVM: x86 emulator: clean up control flow in x86_emulate_insn() x86_emulate_insn() is full of things like if (rc != X86EMUL_CONTINUE) goto done; break; consolidate all of those at the end of the switch statement. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:54 +02:00
Avi Kivity	a4d4a7c188	KVM: x86 emulator: fix group 11 decoding for reg != 0 These are all undefined. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:53 +02:00
Avi Kivity	b9eac5f4d1	KVM: x86 emulator: use single stage decoding for mov instructions Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:52 +02:00
Avi Kivity	e90aa41e6c	KVM: Don't save/restore MSR_IA32_PERF_STATUS It is read/only; restoring it only results in annoying messages. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:51 +02:00
Marcelo Tosatti	eaa48512ba	KVM: SVM: init_vmcb should reset vcpu->efer Otherwise EFER_LMA bit is retained across a SIPI reset. Fixes guest cpu onlining. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:51 +02:00
Marcelo Tosatti	678041ad9d	KVM: SVM: reset mmu context in init_vmcb Since commit `aad827034e` no mmu reinitialization is performed via init_vmcb. Zero vcpu->arch.cr0 and pass the reset value as a parameter to kvm_set_cr0. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:50 +02:00
Avi Kivity	c41a15dd46	KVM: Fix pio trace direction out = write, in = read, not the other way round. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:49 +02:00
Xiao Guangrong	8e0e8afa82	KVM: MMU: remove count_rmaps() Nothing is checked in count_rmaps(), so remove it Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:49 +02:00
Xiao Guangrong	365fb3fdf6	KVM: MMU: rewrite audit_mappings_page() function There is a bugs in this function, we call gfn_to_pfn() and kvm_mmu_gva_to_gpa_read() in atomic context(kvm_mmu_audit() is called under the spinlock(mmu_lock)'s protection). This patch fix it by: - introduce gfn_to_pfn_atomic instead of gfn_to_pfn - get the mapping gfn from kvm_mmu_page_get_gfn() And it adds 'notrap' ptes check in unsync/direct sps Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:48 +02:00
Xiao Guangrong	bc32ce2152	KVM: MMU: fix wrong not write protected sp report The audit code reports some sp not write protected in current code, it's just the bug in audit_write_protection(), since: - the invalid sp not need write protected - using uninitialize local variable('gfn') - call kvm_mmu_audit() out of mmu_lock's protection Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:47 +02:00
Xiao Guangrong	0beb8d6604	KVM: MMU: check rmap for every spte The read-only spte also has reverse mapping, so fix the code to check them, also modify the function name to fit its doing Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:46 +02:00
Xiao Guangrong	9ad17b1001	KVM: MMU: fix compile warning in audit code fix: arch/x86/kvm/mmu.c: In function ‘kvm_mmu_unprotect_page’: arch/x86/kvm/mmu.c:1741: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 3 has type ‘gfn_t’ arch/x86/kvm/mmu.c:1745: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 3 has type ‘gfn_t’ arch/x86/kvm/mmu.c: In function ‘mmu_unshadow’: arch/x86/kvm/mmu.c:1761: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 3 has type ‘gfn_t’ arch/x86/kvm/mmu.c: In function ‘set_spte’: arch/x86/kvm/mmu.c:2005: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 3 has type ‘gfn_t’ arch/x86/kvm/mmu.c: In function ‘mmu_set_spte’: arch/x86/kvm/mmu.c:2033: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 7 has type ‘gfn_t’ Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:46 +02:00
Jason Wang	23e7a7944f	KVM: pit: Do not check pending pit timer in vcpu thread Pit interrupt injection was done by workqueue, so no need to check pending pit timer in vcpu thread which could lead unnecessary unblocking of vcpu. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:45 +02:00
Alexander Graf	989044ee0f	KVM: PPC: Fix CONFIG_KVM_GUEST && !CONFIG_KVM case When CONFIG_KVM_GUEST is selected, but CONFIG_KVM is not, we were missing some defines in asm-offsets.c and included too many headers at other places. This patch makes above configuration work. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:44 +02:00
Avi Kivity	6230f7fc04	KVM: x86 emulator: simplify ALU opcode block decode further The ALU opcode block is very regular; introduce D6ALU() to define decode flags for 6 instructions at a time. Suggested by Paolo Bonzini. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:43 +02:00
Avi Kivity	217fc9cfca	KVM: Fix build error due to 64-bit division in nsec_to_cycles() Use do_div() instead. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:43 +02:00
Avi Kivity	34d1f4905e	KVM: x86 emulator: trap and propagate #DE from DIV and IDIV Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:42 +02:00
Avi Kivity	f6b3597bde	KVM: x86 emulator: add macros for executing instructions that may trap Like DIV and IDIV. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:41 +02:00
Avi Kivity	739ae40606	KVM: x86 emulator: simplify instruction decode flags for opcodes 0F 00-FF Use the new byte/word dual opcode decode. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:41 +02:00
Avi Kivity	d269e3961a	KVM: x86 emulator: simplify instruction decode flags for opcodes E0-FF Use the new byte/word dual opcode decode. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:40 +02:00
Avi Kivity	d2c6c7adb1	KVM: x86 emulator: simplify instruction decode flags for opcodes C0-DF Use the new byte/word dual opcode decode. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:39 +02:00
Avi Kivity	50748613d1	KVM: x86 emulator: simplify instruction decode flags for opcodes A0-AF Use the new byte/word dual opcode decode. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:38 +02:00
Avi Kivity	76e8e68d44	KVM: x86 emulator: simplify instruction decode flags for opcodes 80-8F Use the new byte/word dual opcode decode. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:38 +02:00
Avi Kivity	48fe67b5f7	KVM: x86 emulator: simplify string instruction decode flags Use the new byte/word dual opcode decode. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:37 +02:00
Avi Kivity	5315fbb223	KVM: x86 emulator: simplify ALU block (opcodes 00-3F) decode flags Use the new byte/word dual opcode decode. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:36 +02:00
Avi Kivity	8d8f4e9f66	KVM: x86 emulator: support byte/word opcode pairs Many x86 instructions come in byte and word variants distinguished with bit 0 of the opcode. Add macros to aid in defining them. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:35 +02:00
Avi Kivity	081bca0e6b	KVM: x86 emulator: refuse SrcMemFAddr (e.g. LDS) with register operand SrcMemFAddr is not defined with the modrm operand designating a register instead of a memory address. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:35 +02:00
Gleb Natapov	d2ddd1c483	KVM: x86 emulator: get rid of "restart" in emulation context. x86_emulate_insn() will return 1 if instruction can be restarted without re-entering a guest. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:34 +02:00
Gleb Natapov	3e2f65d57a	KVM: x86 emulator: move string instruction completion check into separate function Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:33 +02:00
Gleb Natapov	6e2fb2cadd	KVM: x86 emulator: Rename variable that shadows another local variable. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:32 +02:00
Wei Yongjun	cc4feed57f	KVM: x86 emulator: add CALL FAR instruction emulation (opcode 9a) Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:31 +02:00
Alexander Graf	a3c321c6e2	KVM: S390: Export kvm_virtio.h As suggested by Christian, we should expose headers to user space with information that might be valuable there. The s390 virtio interface is one of those cases. It defines an ABI between hypervisor and guest, so it should be exposed to user space. Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:30 +02:00
Alexander Graf	cefa33e2f8	KVM: S390: Add virtio hotplug add support The one big missing feature in s390-virtio was hotplugging. This is no more. This patch implements hotplug add support, so you can on the fly add new devices in the guest. Keep in mind that this needs a patch for qemu to actually leverage the functionality. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:29 +02:00
Alexander Graf	fc678d67fe	KVM: S390: take a full byte as ext_param indicator Currenty the ext_param field only distinguishes between "config change" and "vring interrupt". We can do a lot more with it though, so let's enable a full byte of possible values and constants to #defines while at it. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:29 +02:00
Xiao Guangrong	189be38db3	KVM: MMU: combine guest pte read between fetch and pte prefetch Combine guest pte read between guest pte check in the fetch path and pte prefetch Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:28 +02:00
Xiao Guangrong	957ed9effd	KVM: MMU: prefetch ptes when intercepted guest #PF Support prefetch ptes when intercept guest #PF, avoid to #PF by later access If we meet any failure in the prefetch path, we will exit it and not try other ptes to avoid become heavy path Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:27 +02:00

1 2 3 4 5 ...

50806 Commits