Commit Graph

35613 Commits

Author SHA1 Message Date
Boaz Harrosh 6f9a35e2da [SCSI] bidirectional command support
At the block level bidi request uses req->next_rq pointer for a second
bidi_read request.
At Scsi-midlayer a second scsi_data_buffer structure is used for the
bidi_read part. This bidi scsi_data_buffer is put on
request->next_rq->special. Struct scsi_cmnd is not changed.

- Define scsi_bidi_cmnd() to return true if it is a bidi request and a
  second sgtable was allocated.

- Define scsi_in()/scsi_out() to return the in or out scsi_data_buffer
  from this command This API is to isolate users from the mechanics of
  bidi.

- Define scsi_end_bidi_request() to do what scsi_end_request() does but
  for a bidi request. This is necessary because bidi commands are a bit
  tricky here. (See comments in body)

- scsi_release_buffers() will also release the bidi_read scsi_data_buffer

- scsi_io_completion() on bidi commands will now call
  scsi_end_bidi_request() and return.

- The previous work done in scsi_init_io() is now done in a new
  scsi_init_sgtable() (which is 99% identical to old scsi_init_io())
  The new scsi_init_io() will call the above twice if needed also for
  the bidi_read command. Only at this point is a command bidi.

- In scsi_error.c at scsi_eh_prep/restore_cmnd() make sure bidi-lld is not
  confused by a get-sense command that looks like bidi. This is done
  by puting NULL at request->next_rq, and restoring.

[jejb: update to sg_table and resolve conflicts
also update to blk-end-request and resolve conflicts]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:41 -06:00
Boaz Harrosh 30b0c37b27 [SCSI] implement scsi_data_buffer
In preparation for bidi we abstract all IO members of scsi_cmnd,
that will need to duplicate, into a substructure.

- Group all IO members of scsi_cmnd into a scsi_data_buffer
  structure.
- Adjust accessors to new members.
- scsi_{alloc,free}_sgtable receive a scsi_data_buffer instead of
  scsi_cmnd. And work on it.
- Adjust scsi_init_io() and  scsi_release_buffers() for above
  change.
- Fix other parts of scsi_lib/scsi.c to members migration. Use
  accessors where appropriate.

- fix Documentation about scsi_cmnd in scsi_host.h

- scsi_error.c
  * Changed needed members of struct scsi_eh_save.
  * Careful considerations in scsi_eh_prep/restore_cmnd.

- sd.c and sr.c
  * sd and sr would adjust IO size to align on device's block
    size so code needs to change once we move to scsi_data_buff
    implementation.
  * Convert code to use scsi_for_each_sg
  * Use data accessors where appropriate.

- tgt: convert libsrp to use scsi_data_buffer

- isd200: This driver still bangs on scsi_cmnd IO members,
  so need changing

[jejb: rebased on top of sg_table patches fixed up conflicts
and used the synergy to eliminate use_sg and sg_count]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:40 -06:00
Boaz Harrosh bb52d82f45 [SCSI] tgt: use scsi_init_io instead of scsi_alloc_sgtable
If we export scsi_init_io()/scsi_release_buffers() instead of
scsi_{alloc,free}_sgtable() from scsi_lib than tgt code is much more
insulated from scsi_lib changes. As a bonus it will also gain bidi
capability when it comes.

[jejb: rebase on to sg_table and fix up rejections]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:40 -06:00
FUJITA Tomonori 03e7925d07 [SCSI] aic7xxx: fix warnings with CONFIG_PM disabled
CC [M]  drivers/scsi/aic7xxx/aic7xxx_osm_pci.o
drivers/scsi/aic7xxx/aic7xxx_osm_pci.c:148: warning: 'ahc_linux_pci_dev_suspend' defined but not used
drivers/scsi/aic7xxx/aic7xxx_osm_pci.c:166: warning: 'ahc_linux_pci_dev_resume' defined but not used

This moves aic7xxx_pci_driver struct, removes some forward declarations,
and adds some ifdef CONFIG_PM.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:40 -06:00
FUJITA Tomonori 67eb63364e [SCSI] aic79xx: fix warnings with CONFIG_PM disabled
CC [M]  drivers/scsi/aic7xxx/aic79xx_osm_pci.o
drivers/scsi/aic7xxx/aic79xx_osm_pci.c:101: warning: 'ahd_linux_pci_dev_suspend' defined but not used
drivers/scsi/aic7xxx/aic79xx_osm_pci.c:121: warning: 'ahd_linux_pci_dev_resume' defined but not used

This moves aic79xx_pci_driver struct, removes some forward
declarations, and adds some ifdef CONFIG_PM.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:40 -06:00
David Milburn 969ceffb66 [SCSI] aic7xxx: fix ahc_done check SCB_ACTIVE for tagged transactions
The driver only needs to check the SCB_ACTIVE flag if the SCB is not
in the untagged queue.

If the driver is in error recovery, you may end panic'ing on a TUR
that is in the untagged queue.

Attempting to queue an ABORT message
CDB: 0x0 0x0 0x0 0x0 0x0 0x0
SCB 3 done'd twice

This patch is included in Adaptec's 6.3.11 driver on their website.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:39 -06:00
Thomas Bogendoerfer 2adbfa333a [SCSI] sgiwd93: use cached memory access to make driver work on IP28
SGI IP28 machines would need special treatment (enable adding addtional
wait states) when accessing memory uncached. To avoid this pain I
changed the driver to use only cached access to memory.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:39 -06:00
FUJITA Tomonori 9d058ecfd4 [SCSI] zfcp: fix sense_buffer access bug
The commit de25deb180 changed
scsi_cmnd.sense_buffer from a static array to a dynamically allocated
buffer. We can't access to sense_buffer in '&cmd->sense_buffer' way.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:39 -06:00
FUJITA Tomonori 149d6bafc4 [SCSI] ncr53c8xx: fix sense_buffer access bug
The commit de25deb180 changed
scsi_cmnd.sense_buffer from a static array to a dynamically allocated
buffer. We can't access to sense_buffer in '&cmd->sense_buffer' way.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:39 -06:00
FUJITA Tomonori c1c9ce52c8 [SCSI] aic79xx: fix sense_buffer access bug
The commit de25deb180 changed
scsi_cmnd.sense_buffer from a static array to a dynamically allocated
buffer. We can't access to sense_buffer in '&cmd->sense_buffer' way.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:39 -06:00
FUJITA Tomonori c372f4a82f [SCSI] hptiop: fix sense_buffer access bug
&cmnd->sense_buffer now zeroes the wrong thing.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:38 -06:00
Nathan Lynch de15c2017c [SCSI] sym53c8xx: fix bad memset argument in sym_set_cam_result_error
On a big powerpc box I got the following oops with 2.6.24-git2:

sym0: <1010-66> rev 0x1 at pci 0000:d0:01.0 irq 215
sym0: No NVRAM, ID 7, Fast-80, LVD, parity checking
sym0: SCSI BUS has been reset.
scsi0 : sym-2.2.3
 target0:0:8: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 31)
scsi 0:0:8:0: Direct-Access     IBM      ST318305LC       C509 PQ: 0
ANSI: 3
 target0:0:8: tagged command queuing enabled, command queue depth 16.
 target0:0:8: Beginning Domain Validation
 target0:0:8: asynchronous
 target0:0:8: wide asynchronous
 target0:0:8: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31)
 target0:0:8: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31)
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc000000000038460
cpu 0x25: Vector: 300 (Data Access) at [c00000000f567840]
    pc: c000000000038460: .memcpy+0x60/0x280
    lr: d000000000050280: .sym_set_cam_result_error+0xfc/0x1e0 [sym53c8xx]
    sp: c00000000f567ac0
   msr: 8000000000009032
   dar: 0
 dsisr: 42000000
  current = 0xc000006d1e0af0a0
  paca    = 0xc0000000004afc00
    pid   = 0, comm = swapper
enter ? for help
[link register   ] d000000000050280
.sym_set_cam_result_error+0xfc/0x1e0 [sym53c8xx]
[c00000000f567ac0] c00000000f567b80 (unreliable)
[c00000000f567b80] d0000000000552b8 .sym_complete_error+0x12c/0x1bc [sym53c8xx]
[c00000000f567c20] d0000000000561a4 .sym_int_sir+0xaa4/0x1718 [sym53c8xx]
[c00000000f567d00] d000000000057e8c .sym_interrupt+0x4e4/0x6ec [sym53c8xx]
[c00000000f567dc0] d00000000004fdf4 .sym53c8xx_intr+0x6c/0xdc [sym53c8xx]
[c00000000f567e50] c0000000000a83e0 .handle_IRQ_event+0x7c/0xec
[c00000000f567ef0] c0000000000aa344 .handle_fasteoi_irq+0x130/0x1f0
[c00000000f567f90] c00000000002a538 .call_handle_irq+0x1c/0x2c
[c000004d5e0b3a90] c00000000000c320 .do_IRQ+0x108/0x1d0
[c000004d5e0b3b20] c000000000004790 hardware_interrupt_entry+0x18/0x1c

The memset() in sym_set_cam_result_error() would appear to be trashing
the scsi_cmnd struct instead of clearing sense_buffer.

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:38 -06:00
Avi Kivity 0fce5623ba KVM: Move drivers/kvm/* to virt/kvm/
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 18:01:18 +02:00
Avi Kivity edf884172e KVM: Move arch dependent files to new directory arch/x86/kvm/
This paves the way for multiple architecture support.  Note that while
ioapic.c could potentially be shared with ia64, it is also moved.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 18:01:18 +02:00
Ryan Harper 9584bf2c93 KVM: VMX: Add printk_ratelimit in vmx_intr_assist
Add printk_ratelimit check in front of printk.  This prevents spamming
of the message during 32-bit ubuntu 6.06server install.  Previously, it
would hang during the partition formatting stage.

Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:58:10 +02:00
Zhang Xiantao 0711456c0d KVM: Portability: Move kvm_vm_stat to x86.h
This patch moves kvm_vm_stat to x86.h, and every arch
can define its own kvm_vm_stat in $arch.h

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:58:10 +02:00
Zhang Xiantao bfc6d222bd KVM: Portability: Move round_robin_prev_vcpu and tss_addr to kvm_arch
This patches moves two fields round_robin_prev_vcpu and tss to kvm_arch.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:58:10 +02:00
Zhang Xiantao d7deeeb02c KVM: Portability: move vpic and vioapic to kvm_arch
This patches moves two fields vpid and vioapic to kvm_arch

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:58:10 +02:00
Zhang Xiantao f05e70ac03 KVM: Portability: Move mmu-related fields to kvm_arch
This patches moves mmu-related fields to kvm_arch.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:58:10 +02:00
Zhang Xiantao d69fb81f05 KVM: Portability: Move memslot aliases to new struct kvm_arch
This patches create kvm_arch to hold arch-specific kvm fileds
and moves fields naliases and aliases to kvm_arch.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:58:10 +02:00
Zhang Xiantao 77b4c255af KVM: Portability: Move kvm_vcpu_stat to x86.h
This patches moves kvm_vcpu_stat to x86.h, so every
arch can define its own kvm_vcpu_stat structure.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:58:10 +02:00
Zhang Xiantao d17fbbf738 KVM: Portability: Expand the KVM_VCPU_COMM in kvm_vcpu structure.
This patches removes KVM_COMM macro, original it is hold
kvm_vcpu common fields.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:58:09 +02:00
Zhang Xiantao d657a98e3c KVM: Portability: Move kvm_vcpu definition back to kvm.h
This patches moves kvm_vcpu definition to kvm.h, and finally
kvm.h includes x86.h.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:58:09 +02:00
Zhang Xiantao 1d737c8a68 KVM: Portability: Split mmu-related static inline functions to mmu.h
Since these functions need to know the details of kvm or kvm_vcpu structure,
it can't be put in x86.h.  Create mmu.h to hold them.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:58:09 +02:00
Zhang Xiantao ad312c7c79 KVM: Portability: Introduce kvm_vcpu_arch
Move all the architecture-specific fields in kvm_vcpu into a new struct
kvm_vcpu_arch.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:58:09 +02:00
Zhang Xiantao 682c59a3f3 KVM: Portability: Move kvm{pic,ioapic} accesors to x86 specific code
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:22 +02:00
Marcelo Tosatti 2bacc55c7c KVM: MMU: emulated cmpxchg8b should be atomic on i386
Emulate cmpxchg8b atomically on i386. This is required to avoid a guest
pte walker from seeing a splitted write.

[avi: make it compile]

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:22 +02:00
Joerg Roedel 62b9abaaf8 KVM: SVM: support writing 0 to K8 performance counter control registers
This lets SVM ignore writes of the value 0 to the performance counter control
registers.  Thus enabling them will still fail in the guest, but a write of 0
which keeps them disabled is accepted.  This is required to boot Windows
Vista 64bit.

[avi: avoid fall-thru in switch statement]

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:22 +02:00
Joerg Roedel 722f6ecbcf KVM: LAPIC: minor debugging compile fix
This patch fixes a compile error of the LAPIC code with APIC debugging enabled.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:21 +02:00
Marcelo Tosatti 7819026eef KVM: MMU: Fix SMP shadow instantiation race
There is a race where VCPU0 is shadowing a pagetable entry while VCPU1
is updating it, which results in a stale shadow copy.

Fix that by comparing the contents of the cached guest pte with the
current guest pte after write-protecting the guest pagetable.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:21 +02:00
Joerg Roedel 1d07543414 KVM: SVM: Exit to userspace if write to cr8 and not using in-kernel apic
With this patch KVM on SVM will exit to userspace if the guest writes to CR8
and the in-kernel APIC is disabled.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:21 +02:00
Avi Kivity e833240f3c KVM: MMU: Use mmu_set_spte() for real-mode shadows
In addition to removing some duplicated code, this also handles the unlikely
case of real-mode code updating a guest page table.  This can happen when
one vcpu (in real mode) touches a second vcpu's (in protected mode) page
tables, or if a vcpu switches to real mode, touches page tables, and switches
back.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:21 +02:00
Avi Kivity bc750ba860 KVM: MMU: Adjust mmu_set_spte() debug code for gpte removal
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:21 +02:00
Avi Kivity 1c4f1fd6d5 KVM: MMU: Move set_pte() into guest paging mode independent code
As set_pte() no longer references either a gpte or the guest walker, we can
move it out of paging mode dependent code (which compiles twice and is
generally nasty).

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:21 +02:00
Avi Kivity 2fbf4cf13f KVM: MMU: Remove walker argument to set_pte()
Unused.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:21 +02:00
Avi Kivity e3f9550422 KVM: MMU: Pass pte dirty flag to set_pte() instead of calculating it on-site
This allows us to remove its dependency on pt_element_t.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:20 +02:00
Avi Kivity b4ab019ce7 KVM: MMU: No need to pick up nx bit from guest pte
We already set it according to cumulative access permissions.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:20 +02:00
Avi Kivity 41074d07c7 KVM: MMU: Fix inherited permissions for emulated guest pte updates
When we emulate a guest pte write, we fail to apply the correct inherited
permissions from the parent ptes.  Now that we store inherited permissions
in the shadow page, we can use that to update the pte permissions correctly.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:20 +02:00
Avi Kivity bedbe4ee86 KVM: MMU: Move pte access calculation into a helper function
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:20 +02:00
Avi Kivity 8d87a03aea KVM: MMU: Set nx bit correctly on shadow ptes
While the page table walker correctly generates a guest page fault
if a guest tries to execute a non-executable page, the shadow code does
not mark it non-executable.  This means that if a guest accesses an nx
page first with a read access, then subsequent code fetch accesses will
succeed.

Fix by setting the nx bit on shadow ptes.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:20 +02:00
Avi Kivity fe135d2ceb KVM: MMU: Simplify calculation of pte access
The nx bit is awkwardly placed in the 63rd bit position; furthermore it
has a reversed meaning compared to the other bits, which means we can't use
a bitwise and to calculate compounded access masks.

So, we simplify things by creating a new 3-bit exec/write/user access word,
and doing all calculations in that.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:20 +02:00
Marcelo Tosatti b3e4e63fd9 KVM: MMU: Use cmpxchg for pte updates on walk_addr()
In preparation for multi-threaded guest pte walking, use cmpxchg()
when updating guest pte's. This guarantees that the assignment of the
dirty bit can't be lost if two CPU's are faulting the same address
simultaneously.

[avi: fix kunmap_atomic() parameters]

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:19 +02:00
Avi Kivity 80a8119ca3 KVM: SVM: Trap access to the cr8 register
Later we may be able to use the virtual tpr feature, but for now,
just trap it.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:19 +02:00
Avi Kivity 6e3d5dfbad KVM: x86 emulator: Fix stack instructions on 64-bit mode
Stack instructions are always 64-bit on 64-bit mode; many of the
emulated stack instructions did not take that into account.  Fix by
adding a 'Stack' bitflag and setting the operand size appropriately
during the decode stage (except for 'push r/m', which is in a group
with a few other instructions, so it gets its own treatment).

This fixes random crashes on Vista x64.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:19 +02:00
Joerg Roedel 152ff9be2e KVM: SVM: Emulate read/write access to cr8
This patch adds code to emulate the access to the cr8 register to the x86
instruction emulator in kvm.  This is needed on svm, where there is no
hardware decode for control register access.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:19 +02:00
Avi Kivity e5314067f6 KVM: VMX: Avoid exit when setting cr8 if the local apic is in the kernel
With apic in userspace, we must exit to userspace after a cr8 write in order
to update the tpr.  But if the apic is in the kernel, the exit is unnecessary.

Noticed by Joerg Roedel.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:19 +02:00
Avi Kivity e934c9c1c8 KVM: x86 emulator: fix eflags preparation for emulation
We prepare eflags for the emulated instruction, then clobber it with an 'andl'.
Fix by popping eflags as the last thing in the sequence.

Patch taken from Xen (16143:959b4b92b6bf)

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:19 +02:00
Avi Kivity 7ee5d940f5 KVM: Use generalized exception queue for injecting #UD
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:18 +02:00
Avi Kivity c1a5d4f990 KVM: Replace #GP injection by the generalized exception queue
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:18 +02:00
Avi Kivity c3c91fee51 KVM: Replace page fault injection by the generalized exception queue
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:18 +02:00
Avi Kivity 298101da2f KVM: Generalize exception injection mechanism
Instead of each subarch doing its own thing, add an API for queuing an
injection, and manage failed exception injection centerally (i.e., if
an inject failed due to a shadow page fault, we need to requeue it).

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:18 +02:00
Marcelo Tosatti 4bf8ed8dd2 KVM: MMU: Remove unused prev_shadow_ent variable from fetch()
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:18 +02:00
npiggin@suse.de e4a533a416 KVM: Convert KVM from ->nopage() to ->fault()
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: kvm-devel@lists.sourceforge.net
Cc: avi@qumranet.com
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:18 +02:00
Hollis Blanchard 53e0aa7b65 KVM: Portability: Create kvm_arch_vcpu_runnable() function
This abstracts the detail of x86 hlt and INIT modes into a function.

Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:17 +02:00
Hollis Blanchard e01a1b570f KVM: Portability: Stop including x86-specific headers in kvm_main.c
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:17 +02:00
Hollis Blanchard e2174021cf KVM: Portability: Move IO device definitions to its own header file
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:17 +02:00
Hollis Blanchard d77a39d982 KVM: Portability: Move address types to their own header file
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:17 +02:00
Zhang Xiantao b1fd3d30ba KVM: Extend ioapic code to support iosapic
iosapic supports an additional mmio EOI register compared to ioapic.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:17 +02:00
Zhang Xiantao 0c7ac28d3d KVM: Replace dest_Lowest_Prio and dest_Fixed with self-defined macros
Change
  dest_Loest_Prio -> IOAPIC_LOWEST_PRIORITY
  dest_Fixed -> IOAPIC_FIXED

the original names are x86 specific, while the ioapic code will be reused
for ia64.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:17 +02:00
Zhang Xiantao 8be5453f95 KVM: Replace kvm_lapic with kvm_vcpu in ioapic/lapic interface
This patch replaces lapic structure with kvm_vcpu in ioapic.c, making ioapic
independent of the local apic, as required by ia64.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:17 +02:00
Carlo Marcelo Arenas Belon 2b5203ee68 KVM: SVM: Remove KVM specific defines for MSR_EFER
This patch removes the KVM specific defines for MSR_EFER that were being used
in the svm support file and migrates all references to use instead the ones
from the kernel headers that are used everywhere else and that have the same
values.

Signed-off-by: Carlo Marcelo Arenas Belon <carenas@sajinet.com.pe>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:16 +02:00
Avi Kivity fb56dbb31c KVM: Export include/linux/kvm.h only if $ARCH actually supports KVM
Currently, make headers_check barfs due to <asm/kvm.h>, which <linux/kvm.h>
includes, not existing.  Rather than add a zillion <asm/kvm.h>s, export kvm.h
only if the arch actually supports it.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:16 +02:00
Zhang Xiantao d230878471 KVM: Correct kvm_init() error paths not freeing bad_pge.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:16 +02:00
Zhang Xiantao f77bc6a420 KVM: Portability: Move KVM_INTERRUPT vcpu ioctl to x86.c
Other archs doesn't need it.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:16 +02:00
Avi Kivity 018a98db74 KVM: x86 emulator: unify four switch statements into two
Unify the special instruction switch with the regular instruction switch,
and the two byte special instruction switch with the regular two byte
instruction switch.  That makes it much easier to find an instruction or
the place an instruction needs to be added in.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:16 +02:00
Avi Kivity 111de5d60c KVM: x86 emulator: unify two switches
The rep prefix cleanup left two switch () statements next to each other.
Unify them.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:16 +02:00
Avi Kivity b9fa9d6bc6 KVM: x86 emulator: Move rep processing before instruction execution
Currently rep processing is handled somewhere in the middle of instruction
processing.  Move it to a sensible place.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:15 +02:00
Guillaume Thouvenin d7e5117a25 KVM: x86 emulator: cmps instruction
Add emulation for the cmps instruction.  This lets OpenBSD boot on kvm.

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:15 +02:00
Sheng Yang e8d8d7fe88 KVM: x86 emulator: Rename 'cr2' to 'memop'
Previous patches have removed the dependency on cr2; we can now stop passing
it to the emulator and rename uses to 'memop'.

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:15 +02:00
Izik Eidus 448353caea KVM: MMU: mark pages that were inserted to the shadow pages table as accessed
Mark guest pages as accessed when removed from the shadow page tables for
better lru processing.

Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:15 +02:00
Avi Kivity eb9774f0d6 KVM: Remove misleading check for mmio during event injection
mmio was already handled in kvm_arch_vcpu_ioctl_run(), so no need to check
again.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:15 +02:00
Avi Kivity f21b8bf4cc KVM: x86 emulator: address size and operand size overrides are sticky
Current implementation is to toggle, which is incorrect.  Patch ported from
corresponding Xen code.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:14 +02:00
Guillaume Thouvenin 90e0a28f6b KVM: x86 emulator: Make a distinction between repeat prefixes F3 and F2
cmps and scas instructions accept repeat prefixes F3 and F2. So in
order to emulate those prefixed instructions we need to be able to know
if prefixes are REP/REPE/REPZ or REPNE/REPNZ. Currently kvm doesn't make
this distinction. This patch introduces this distinction.

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:14 +02:00
Zhang Xiantao e9f85cde99 KVM: Portability: Move unalias_gfn to arch dependent file
Non-x86 archs don't need this mechanism. Move it to arch, and
keep its interface in common.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:14 +02:00
Sheng Yang 83ff3b9d4a KVM: VMX: Remove the secondary execute control dependency on irqchip
The state of SECONDARY_VM_EXEC_CONTROL shouldn't depend on in-kernel IRQ chip,
this patch fix this.

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:14 +02:00
Dan Kenigsberg 0771671749 KVM: Enhance guest cpuid management
The current cpuid management suffers from several problems, which inhibit
passing through the host feature set to the guest:

 - No way to tell which features the host supports

  While some features can be supported with no changes to kvm, others
  need explicit support.  That means kvm needs to vet the feature set
  before it is passed to the guest.

 - No support for indexed or stateful cpuid entries

  Some cpuid entries depend on ecx as well as on eax, or on internal
  state in the processor (running cpuid multiple times with the same
  input returns different output).  The current cpuid machinery only
  supports keying on eax.

 - No support for save/restore/migrate

  The internal state above needs to be exposed to userspace so it can
  be saved or migrated.

This patch adds extended cpuid support by means of three new ioctls:

 - KVM_GET_SUPPORTED_CPUID: get all cpuid entries the host (and kvm)
   supports

 - KVM_SET_CPUID2: sets the vcpu's cpuid table

 - KVM_GET_CPUID2: gets the vcpu's cpuid table, including hidden state

[avi: fix original KVM_SET_CPUID not removing nx on non-nx hosts as it did
      before]

Signed-off-by: Dan Kenigsberg <danken@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:13 +02:00
Avi Kivity 6d4e4c4fca KVM: Disallow fork() and similar games when using a VM
We don't want the meaning of guest userspace changing under our feet.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:13 +02:00
Avi Kivity 76c35c6e99 KVM: MMU: Rename 'release_page'
Rename the awkwardly named variable.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:12 +02:00
Avi Kivity 4db3531487 KVM: MMU: Rename variables of type 'struct kvm_mmu_page *'
These are traditionally named 'page', but even more traditionally, that name
is reserved for variables that point to a 'struct page'.  Rename them to 'sp'
(for "shadow page").

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:12 +02:00
Avi Kivity 1d28f5f4a4 KVM: Remove gpa_to_hpa()
Converting last uses along the way.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:12 +02:00
Avi Kivity 0d81f2966a KVM: MMU: Remove gva_to_hpa()
No longer used.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:11 +02:00
Avi Kivity 3f3e7124f6 KVM: MMU: Simplify nonpaging_map()
Instead of passing an hpa, pass a regular struct page.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:11 +02:00
Avi Kivity 1755fbcc66 KVM: MMU: Introduce gfn_to_gpa()
Converting a frame number to an address is tricky since the data type changes
size.  Introduce a function to do it.  This fixes an actual bug when
accessing guest ptes.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:11 +02:00
Avi Kivity 38c335f1f5 KVM: MMU: Adjust page_header_update_slot() to accept a gfn instead of a gpa
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:11 +02:00
Avi Kivity 230c9a8f23 KVM: MMU: Merge set_pte() and set_pte_common()
Since set_pte() is now the only caller of set_pte_common(), merge the two
functions.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:11 +02:00
Avi Kivity 050e64992f KVM: MMU: Remove set_pde()
It is now identical to set_pte().

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:11 +02:00
Avi Kivity 4e542370c7 KVM: MMU: Remove extra gaddr parameter from set_pte_common()
Similar information is available in the gfn parameter, so use that.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:11 +02:00
Avi Kivity da928521b7 KVM: MMU: Move pse36 handling to the guest walker
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:10 +02:00
Avi Kivity 5fb07ddb18 KVM: MMU: Introduce and use gpte_to_gfn()
Instead of repretitively open-coding this.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:10 +02:00
Izik Eidus b238f7bc2d KVM: MMU: Code cleanup
Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:10 +02:00
Avi Kivity d835dfecd0 KVM: Don't bother the mmu if cr3 load doesn't change cr3
If the guest requests just a tlb flush, don't take the vm lock and
drop the mmu context pointlessly.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:10 +02:00
Avi Kivity 79539cec0c KVM: MMU: Avoid unnecessary remote tlb flushes when guest updates a pte
If all we're doing is increasing permissions on a pte (typical for demand
paging), then there's not need to flush remote tlbs.  Worst case they'll
get a spurious page fault.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:10 +02:00
Avi Kivity 0f74a24c59 KVM: Add statistic for remote tlb flushes
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:10 +02:00
Avi Kivity e5a4c8cad9 KVM: MMU: Implement guest page fault bypass for nonpae
I spent an hour worrying why I see so many guest page faults on FC6 i386.
Turns out bypass wasn't implemented for nonpae.  Implement it so it doesn't
happen again.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:09 +02:00
Avi Kivity 26e5215fdc KVM: Split vcpu creation to avoid vcpu_load() before preemption setup
Split kvm_arch_vcpu_create() into kvm_arch_vcpu_create() and
kvm_arch_vcpu_setup(), enabling preemption notification between the two.
This mean that we can now do vcpu_load() within kvm_arch_vcpu_setup().

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:09 +02:00
Zhang Xiantao 0de10343b3 KVM: Portability: Split kvm_set_memory_region() to have an arch callout
Moving !user_alloc case to kvm_arch to avoid unnecessary
code logic in non-x86 platform.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:09 +02:00
Zhang Xiantao 3ad82a7e87 KVM: Recalculate mmu pages needed for every memory region change
Instead of incrementally changing the mmu cache size for every memory slot
operation, recalculate it from scratch.  This is simpler and safer.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:09 +02:00
Avi Kivity 6226686954 KVM: x86 emulator: prefetch up to 15 bytes of the instruction executed
Instead of fetching one byte at a time, prefetch 15 bytes (or until the next
page boundary) to avoid guest page table walks.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:09 +02:00
Avi Kivity 93a0039c8d KVM: x86 emulator: retire ->write_std()
Theoretically used to acccess memory known to be ordinary RAM, it was
never implemented.  It is questionable whether it is possible to implement
it correctly.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:09 +02:00
Izik Eidus b4231d6180 KVM: MMU: Selectively set PageDirty when releasing guest memory
Improve dirty bit setting for pages that kvm release, until now every page
that we released we marked dirty, from now only pages that have potential
to get dirty we mark dirty.

Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:09 +02:00
Izik Eidus 2065b3727e KVM: MMU: Fix potential memory leak with smp real-mode
When we map a page, we check whether some other vcpu mapped it for us and if
so, bail out.  But we should decrease the refcount on the page as we do so.

Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:08 +02:00
Hollis Blanchard 7faa8f6fcc KVM: Move misplaced comment
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:07 +02:00
Hollis Blanchard d40ccc6246 KVM: Correct consistent typo: "destory" -> "destroy"
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:07 +02:00
Hollis Blanchard 00fc9f5ae5 KVM: Remove unused "rmap_overflow" variable
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:07 +02:00
Avi Kivity 971535ff65 KVM: MMU: Remove unused variable
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:06 +02:00
Izik Eidus 3e021bf505 KVM: Simplify kvm_clear_guest_page()
Use kvm_write_guest_page() with empty_zero_page, instead of doing
kmap and memset.

Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:06 +02:00
Izik Eidus ec8d4eaefc KVM: MMU: Change guest pte access to kvm_{read,write}_guest()
Things are simpler and more regular this way.

Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:06 +02:00
Jan Kiszka 15b00f32d5 KVM: VMX: Force seg.base == (seg.sel << 4) in real mode
Ensure that segment.base == segment.selector << 4 when entering the real
mode on Intel so that the CPU will not bark at us.  This fixes some old
protected mode demo from http://www.x86.org/articles/pmbasics/tspec_a1_doc.htm.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:06 +02:00
Zhang Xiantao 54f1585a8d KVM: Portability: Move some function declarations to x86.h
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:06 +02:00
Zhang Xiantao ec6d273deb KVM: Move some static inline functions out from kvm.h into x86.h
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:06 +02:00
Zhang Xiantao 2b3ccfa0c5 KVM: Portability: Move vcpu regs enumeration definition to x86.h
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:05 +02:00
Zhang Xiantao ea4a5ff80c KVM: Portability: Move struct kvm_x86_ops definition to x86.h
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:05 +02:00
Zhang Xiantao cd6e8f87ef KVM: Portability: Move some macro definitions from kvm.h to x86.h
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:05 +02:00
Zhang Xiantao 56c6d28a9a KVM: Portability: MMU initialization and teardown split
Move out kvm_mmu init and exit functionality from kvm_main.c

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:05 +02:00
Zhang Xiantao 5bb064dcde KVM: Portability: Move kvm_vcpu_ioctl_get_dirty_log to arch-specific file
Meanwhile keep the interface in common, and leave as more logic in common
as possible.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:05 +02:00
Amit Shah 9327fd1195 KVM: Make unloading of FPU state when putting vcpu arch-independent
Instead of having each architecture do it individually, we
do this in the arch-independent code (just x86 as of now).

[avi: add svm to the mix, which was added to mainline during the
 2.6.24-rc process]

Signed-off-by: Amit Shah <amit.shah@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:05 +02:00
Avi Kivity 4cee576493 KVM: MMU: Add some mmu statistics
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:04 +02:00
Avi Kivity ba1389b7a0 KVM: Extend stats support for VM stats
This is in addition to the current virtual cpu statistics.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:04 +02:00
Avi Kivity f2b5756bb3 KVM: Add instruction emulation statistics 2008-01-30 17:53:04 +02:00
Avi Kivity f096ed8588 KVM: Add fpu_reload counter
Measure the number of times we switch the fpu state.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:04 +02:00
Avi Kivity e1beb1d37c KVM: Replace 'light_exits' stat with 'host_state_reload'
This is a little more accurate (since it counts actual reloads, not potential
reloads), and reverses the sense of the statistic to measure a bad event like
most of the other stats (e.g. we want to minimize all counters).

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:04 +02:00
Zhang Xiantao d19a9cd275 KVM: Portability: Add two hooks to handle kvm_create and destroy vm
Add two arch hooks to handle kvm_create_vm and kvm destroy_vm. Now, just
put io_bus init and destory in common.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:04 +02:00
Zhang Xiantao a16b043cc9 KVM: Remove __init attributes for kvm_init_debug and kvm_init_msr_list
Since their callers are not declared with __init.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:04 +02:00
Joe Perches 56919c5c97 KVM: Remove ptr comparisons to 0
Fix sparse warnings "Using plain integer as NULL pointer"

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:03 +02:00
Zhang Xiantao 8b0067913d KVM: Portability: Make kvm_vcpu_ioctl_translate arch dependent
Move kvm_vcpu_ioctl_translate to arch, since mmu would be put under arch.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:03 +02:00
Avi Kivity e08aa78ae5 KVM: VMX: Consolidate register usage in vmx_vcpu_run()
We pass vcpu, vmx->fail, and vmx->launched to assembly code, but all three
are fields within vmx.  Consolidate by only passing in vmx and offsets for
the rest.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:03 +02:00
Zhang Xiantao 018d00d2fe KVM: Portability: move KVM_CHECK_EXTENSION
Make KVM_CHECK_EXTENSION code into a function, all archs can define its
capability independently.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:03 +02:00
Sheng Yang a7e6c88a78 KVM: x86 emulator: modify 'lods', and 'stos' not to depend on CR2
The current 'lods' and 'stos' is depending on incoming CR2 rather than decode
memory address from registers.

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:03 +02:00
Zhang Xiantao f8c16bbaa9 KVM: Portability: Move x86 specific code from kvm_init() to kvm_arch()
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:03 +02:00
Zhang Xiantao cb498ea2ce KVM: Portability: Combine kvm_init and kvm_init_x86
Will be called once arch module registers itself.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:02 +02:00
Zhang Xiantao e9b11c1755 KVM: Portability: Add vcpu and hardware management arch hooks
Add the following hooks:

  void decache_vcpus_on_cpu(int cpu);
  int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu);
  void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu);
  void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu);
  void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
  void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu);
  struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id);
  void kvm_arch_vcpu_destory(struct kvm_vcpu *vcpu);
  int kvm_arch_vcpu_reset(struct kvm_vcpu *vcpu);
  void kvm_arch_hardware_enable(void *garbage);
  void kvm_arch_hardware_disable(void *garbage);
  int kvm_arch_hardware_setup(void);
  void kvm_arch_hardware_unsetup(void);
  void kvm_arch_check_processor_compat(void *rtn);

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:02 +02:00
Zhang Xiantao 97896d04a1 KVM: Portability: Move kvm_x86_ops to x86.c
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:02 +02:00
Zhang Xiantao d825ed0a97 KVM: Portability: Move some includes to x86.c
Move some includes to x86.c from kvm_main.c, since the related functions
have been moved to x86.c

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:02 +02:00
Izik Eidus e0506bcba5 KVM: Change kvm_{read,write}_guest() to use copy_{from,to}_user()
This changes kvm_write_guest_page/kvm_read_guest_page to use
copy_to_user/read_from_user, as a result we get better speed
and better dirty bit tracking.

Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:02 +02:00
Izik Eidus 539cb6608c KVM: introduce gfn_to_hva()
Convert a guest frame number to the corresponding host virtual address.

Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:01 +02:00
Izik Eidus f9d46eb0e4 KVM: add kvm_is_error_hva()
Check for the "error hva", an address outside the user address space that
signals a bad gfn.

Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:01 +02:00
Avi Kivity 1a6f4d7fbd KVM: Simplify CPU_TASKS_FROZEN cpu notifier handling
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:01 +02:00
Izik Eidus 906e608b05 KVM: x86 emulator: remove 8 bytes operands emulator for call near instruction
it is removed beacuse it isnt supported on a real host

Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:01 +02:00
Eddie Dong e5edaa01c4 KVM: VMX: wbinvd exiting
Add wbinvd VM Exit support to prepare for pass-through
device cache emulation and also enhance real time
responsiveness.

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:01 +02:00
Eddie Dong 8a70cc3d0f KVM: VMX: Comment VMX primary/secondary exec ctl definitions
Add comments for secondary/primary Processor-Based VM-execution controls.

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:01 +02:00
Avi Kivity 9c8cba3761 KVM: Fix faults during injection of real-mode interrupts
If vmx fails to inject a real-mode interrupt while fetching the interrupt
redirection table, it fails to record this in the vectoring information
field.  So we detect this condition and do it ourselves.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:01 +02:00
Avi Kivity 1155f76a81 KVM: VMX: Read & store IDT_VECTORING_INFO_FIELD
We'll want to write to it in order to fix real-mode irq injection problems,
but it is a read-only field.  Storing it in a variable solves that issue.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:00 +02:00
Avi Kivity 9c5623e3e4 KVM: VMX: Use vmx to inject real-mode interrupts
Instead of injecting real-mode interrupts by writing the interrupt frame into
guest memory, abuse vmx by injecting a software interrupt.  We need to
pretend the software interrupt instruction had a length > 0, so we have to
adjust rip backward.

This lets us not to mess with writing guest memory, which is complex and also
sleeps.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:00 +02:00
Dor Laor 12264760e4 KVM: Add make_page_dirty() to kvm_clear_guest_page()
Every write access to guest pages should be tracked.

Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:00 +02:00
Hollis Blanchard b6c7a5dccf KVM: Portability: Move x86 vcpu ioctl handlers to x86.c
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:00 +02:00
Hollis Blanchard d075206073 KVM: Portability: Move x86 FPU handling to x86.c
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:00 +02:00
Hollis Blanchard 8776e5194f KVM: Portability: Move x86 instruction emulation code to x86.c
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:53:00 +02:00
Hollis Blanchard 417bc3041f KVM: Portability: Make exported debugfs data architecture-specific
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:52:59 +02:00
Avi Kivity 1c73ef6650 KVM: x86 emulator: Hoist modrm and abs decoding into separate functions
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:52:59 +02:00
Uri Lublin 3b6fff198c KVM: Make mark_page_dirty() work for aliased pages too.
Recommended by Izik Eidus.

Signed-off-by: Uri Lublin <uril@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30 17:52:59 +02:00