openkylin/qemu - qemu - Hongshan Open Source Project Hosting

Commit Graph

Author	SHA1	Message	Date
Emilio G. Cota	37b995f6e7	target-i386: remove helper_lock() It's been superseded by the atomic helpers. The use of the atomic helpers provides a significant performance and scalability improvement. Below is the result of running the atomic_add-test microbenchmark with: $ x86_64-linux-user/qemu-x86_64 tests/atomic_add-bench -o 5000000 -r $r -n $n , where $n is the number of threads and $r is the allowed range for the additions. The scenarios measured are: - atomic: implements x86' ADDL with the atomic_add helper (i.e. this patchset) - cmpxchg: implement x86' ADDL with a TCG loop using the cmpxchg helper - master: before this patchset Results sorted in ascending range, i.e. descending degree of contention. Y axis is Throughput in Mops/s. Tests are run on an AMD machine with 64 Opteron 6376 cores. [ASCII plots omitted: atomic_add-bench, 5000000 ops/thread, throughput (Mops/s) against number of threads (0 to 60) for the atomic, cmpxchg and master series, one plot each for the ranges [0,1], [0,2], [0,8], [0,128] and [0,1024]] hi-res: http://imgur.com/a/fMRmq For master I stopped measuring master after 8 threads, because there is little point in measuring the well-known performance collapse of a contended lock. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1467054136-10430-21-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-10-26 08:29:01 -07:00
Emilio G. Cota	ae03f8de45	target-i386: emulate LOCK'ed cmpxchg using cmpxchg helpers The diff here is uglier than necessary. All this does is to turn FOO into: if (s->prefix & PREFIX_LOCK) { BAR } else { FOO } where FOO is the original implementation of an unlocked cmpxchg. [rth: Adjust unlocked cmpxchg to use movcond instead of branches. Adjust helpers to use atomic helpers.] Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1467054136-10430-6-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-10-26 08:29:01 -07:00
Paolo Bonzini	0f70ed4759	target-i386: implement PKE for TCG Tested with kvm-unit-tests. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-03-24 14:01:08 +01:00
Richard Henderson	07929f2ab2	target-i386: Implement FSGSBASE Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-15 14:50:00 +11:00
Richard Henderson	7d117ce81e	target-i386: Clear bndregs during legacy near jumps Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-15 14:50:00 +11:00
Richard Henderson	bdd87b3b59	target-i386: Implement BNDLDX, BNDSTX Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-15 14:50:00 +11:00
Richard Henderson	523e28d761	target-i386: Implement BNDCL, BNDCU, BNDCN Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-15 14:50:00 +11:00
Richard Henderson	7f0b7141b4	target-i386: Perform set/reset_inhibit_irq inline With helpers that can be reused for other things. Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-13 07:59:59 +11:00
Richard Henderson	c9cfe8f9fb	target-i386: Implement XSAVEOPT Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-13 07:59:59 +11:00
Richard Henderson	19dc85dba2	target-i386: Add XSAVE extension This includes XSAVE, XRSTOR, XGETBV, XSETBV, which are all related, as well as the associated cpuid bits. Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-13 07:59:59 +11:00
Richard Henderson	64dbaff09b	target-i386: Split fxsave/fxrstor implementation We will be able to reuse these pieces for XSAVE/XRSTOR. Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-13 07:59:59 +11:00
Richard Henderson	743e398e2f	target-i386: Rewrite gen_enter inline Use gen_lea_v_seg for centralized segment base knowledge. Unify code across 32- and 64-bit. Fix note about "must save state" before using the out-of-line helpers. Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1450379966-28198-8-git-send-email-rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-02-09 15:46:54 +01:00
Richard Henderson	d005233923	target-i386: Check CR4[DE] for processing DR4/DR5 Introduce helper_get_dr so that we don't have to put CR4[DE] into the scarce HFLAGS resource. At the same time, rename helper_movl_drN_T0 to helper_set_dr and set the helper flags. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2015-10-23 12:59:27 -02:00
Eduardo Habkost	5223a9423c	target-i386: Handle I/O breakpoints Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2015-10-23 12:59:27 -02:00
Pavel Dovgalyuk	100ec09919	target-i386: exception handling for seg_helper functions This patch fixes exception handling for seg_helper functions. Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-09-15 12:31:59 -07:00
Paolo Bonzini	3f7d846486	target-i386: Use correct memory attributes for ioport accesses In order to do this, stop using the cpu_in/out helpers, and instead access address_space_io directly. cpu_in* and cpu_out* remain for usage in the monitor, in qtest, and in Xen. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-05 17:10:00 +02:00
Richard Henderson	2ef6175aa7	tcg: Invert the inclusion of helper.h Rather than include helper.h with N values of GEN_HELPER, include a secondary file that sets up the macros to include helper.h. This minimizes the files that must be rebuilt when changing the macros for file N. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:54 -07:00
Paolo Bonzini	81f3053b77	target-i386: yield to another VCPU on PAUSE After commit `b1bbfe7` (aio / timers: On timer modification, qemu_notify or aio_notify, 2013-08-21) FreeBSD guests report a huge slowdown. The problem shows up as soon as FreeBSD turns on its periodic (~1 ms) tick, but the timers are only the trigger for a pre-existing problem. Before the offending patch, setting a timer did a timer_settime system call. After, setting the timer exits the event loop (which uses poll) and reenters it with a new deadline. This does not cause any slowdown; the difference is between one system call (timer_settime) and a signal delivery (SIGALRM) before the patch, and two system calls afterwards (write to a pipe or eventfd + calling poll again when re-entering the event loop). Unfortunately, the exit/enter causes the main loop to grab the iothread lock, which in turn kicks the VCPU thread out of execution. This causes TCG to execute the next VCPU in its round-robin scheduling of VCPUs. When the second VCPU is mostly unused, FreeBSD runs a "pause" instruction in its idle loop which only burns cycles without any progress. As soon as the timer tick expires, the first VCPU runs the interrupt handler but very soon it sets it again---and QEMU then goes back to doing nothing in the second VCPU. The fix is to make the pause instruction do "cpu_loop_exit". Cc: Richard Henderson <rth@twiddle.net> Reported-by: Luigi Rizzo <rizzo@iet.unipi.it> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-id: 1384948442-24217-1-git-send-email-pbonzini@redhat.com Signed-off-by: Anthony Liguori <aliguori@amazon.com>	2013-11-21 07:55:45 -08:00
Richard Henderson	a4bcea3d67	target-i386: Use mulu2 and muls2 These correspond very closely to the insns that we're emulating. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-27 19:06:28 +00:00
Richard Henderson	321c535105	target-i386: Implement tzcnt and fix lzcnt We weren't computing flags for lzcnt at all. At the same time, adjust the implementation of bsf/bsr to avoid the local branch, using movcond instead. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-02-19 23:05:18 -08:00
Richard Henderson	f1300734cb	target-i386: Use clz/ctz for bsf/bsr helpers And mark the helpers as NO_RWG_SE. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-02-19 23:05:18 -08:00
Richard Henderson	0592f74a75	target-i386: Implement PDEP, PEXT Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-02-18 15:52:32 -08:00
Richard Henderson	5f1f4b1771	target-i386: Implement MULX Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-02-18 15:52:32 -08:00
Richard Henderson	988c3eb0d6	target-i386: Use CC_SRC2 for ADC and SBB Add another slot in ENV and store two of the three inputs. This lets us do less work when carry-out is not needed, and avoids the unpredictable CC_OP after translating these insns. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-02-18 15:39:09 -08:00
Richard Henderson	db9f259772	target-i386: Make helper_cc_compute_{all,c} const Pass the data in explicitly, rather than indirectly via env. This avoids all sorts of unnecessary register spillage. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-02-18 15:25:55 -08:00
Paolo Bonzini	022c62cbbc	exec: move include files to include/exec/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:31:31 +01:00
Aurelien Jarno	95b638a292	target-i386: rename helper flags Rename helper flags to the new ones. This is purely a mechanical change, it's possible to use better flags by looking at the helpers. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:23 +01:00
H. Peter Anvin	a9321a4d49	x86: Implement SMEP and SMAP This patch implements Supervisor Mode Execution Prevention (SMEP) and Supervisor Mode Access Prevention (SMAP) for x86. The purpose of the patch, obviously, is to help kernel developers debug the support for those features. A fair bit of the code relates to the handling of CPUID features. The CPUID code probably would get greatly simplified if all the feature bit words were unified into a single vector object, but in the interest of producing a minimal patch for SMEP/SMAP, and because I had very limited time for this project, I followed the existing style. [ v2: don't change the definition of the qemu64 CPU shorthand, since that breaks loading old snapshots. Per Anthony Liguori this can be fixed once the CPU feature set is snapshot. Change the coding style slightly to conform to checkpatch.pl. ] Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2012-10-01 08:04:22 -05:00
Blue Swirl	92fc4b586f	x86: switch to AREG0 free mode Add an explicit CPUX86State parameter instead of relying on AREG0. Remove temporary wrappers and switch to AREG0 free mode. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-08-14 19:01:26 +00:00
Blue Swirl	2999a0b200	x86: avoid AREG0 in segmentation helpers Add an explicit CPUX86State parameter instead of relying on AREG0. Rename remains of op_helper.c to seg_helper.c. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-08-14 19:01:26 +00:00
Blue Swirl	4a7443be52	x86: avoid AREG0 for misc helpers Add an explicit CPUX86State parameter instead of relying on AREG0. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-08-14 19:01:26 +00:00
Blue Swirl	608badfc66	x86: avoid AREG0 for SMM helpers Add an explicit CPUX86State parameter instead of relying on AREG0. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-08-14 19:01:25 +00:00
Blue Swirl	052e80d5e0	x86: avoid AREG0 for SVM helpers Add an explicit CPUX86State parameter instead of relying on AREG0. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-08-14 19:01:25 +00:00
Blue Swirl	7923057bae	x86: avoid AREG0 for integer helpers Add an explicit CPUX86State parameter instead of relying on AREG0. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-08-14 19:01:25 +00:00
Blue Swirl	f0967a1add	x86: avoid AREG0 for condition code helpers Add an explicit CPUX86State parameter instead of relying on AREG0. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-08-14 19:01:25 +00:00
Blue Swirl	d3eb5eaeb5	x86: avoid AREG0 for FPU helpers Make FPU helpers take a parameter for CPUState instead of relying on global env. Introduce temporary wrappers for FPU load and store ops. Remove wrappers for non-AREG0 code. Don't call unconverted helpers directly. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-08-14 19:01:25 +00:00
Blue Swirl	77b2bc2c09	x86: avoid AREG0 for exceptions Add an explicit CPUX86State parameter instead of relying on AREG0. Merge raise_exception_env() to raise_exception(), likewise with raise_exception_err_env() and raise_exception_err(). Introduce cpu_svm_check_intercept_param() and cpu_vmexit() as wrappers. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-06-28 20:28:08 +00:00
Aurelien Jarno	2355c16e74	target-i386: fix SSE rounding and flush to zero SSE rounding and flush to zero control has never been implemented. However given that softfloat-native was using a single state for FPU and SSE and given that glibc is setting both FPU and SSE state in fesetround(), this was working correctly up to the switch to softfloat. Fix that by adding an update_sse_status() function similar to update_fpu_status(), and calling it on write to mxcsr. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-01-11 09:55:28 +01:00
Andre Przywara	31501a714b	target-i386: implement lzcnt emulation lzcnt is an instruction added with AMD Phenom/Barcelona that returns the number of leading zero bits in a word. As this is similar to the "bsr" instruction, reuse the existing code. There need to be some more changes, though, as lzcnt always returns a valid value (as opposed to bsr, which has a special case when the operand is 0). lzcnt is guarded by the ABM CPUID bit (Fn8000_0001:ECX_5). Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2009-10-23 17:10:36 +02:00
Andre Przywara	1b050077d2	target-i386: add RDTSCP support RDTSCP reads the time stamp counter and atomically also the content of a 32-bit MSR, which can be freely set by the OS. This allows CPU local data to be queried by userspace. Linux uses this to allow a fast implementation of the getcpu() syscall, which uses the vsyscall page to avoid a context switch. AMD CPUs since K8RevF and Intel CPUs since Nehalem support this instruction. RDTSCP is guarded by the RDTSCP CPUID bit (Fn8000_0001:EDX[27]). Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2009-10-04 14:46:34 +02:00
Jan Kiszka	a23978077b	x86: Add support for resume flag Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>	2009-05-22 10:50:37 -05:00
pbrook	a7812ae412	TCG variable type checking. Signed-off-by: Paul Brook <paul@codesourcery.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5729 c046a42c-6fe2-441c-8c8c-71466251a162	2008-11-17 14:43:54 +00:00
balrog	2436b61a6b	SYSENTER/SYSEXIT IA-32e implementation (Alexander Graf). On Intel CPUs, sysenter and sysexit are valid in 64-bit mode. This patch makes both 64-bit aware and enables them for Intel CPUs. Add cpu save/load for 64-bit wide sysenter variables. Signed-off-by: Alexander Graf <agraf@suse.de> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5318 c046a42c-6fe2-441c-8c8c-71466251a162	2008-09-25 18:16:18 +00:00
blueswir1	79383c9c08	Fix some warnings that would be generated by gcc -Wredundant-decls git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5115 c046a42c-6fe2-441c-8c8c-71466251a162	2008-08-30 09:51:20 +00:00
bellard	94451178b6	HLT, MWAIT and MONITOR insn fixes (initial patch by Alexander Graf) git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4746 c046a42c-6fe2-441c-8c8c-71466251a162	2008-06-18 09:32:32 +00:00
bellard	db620f46a8	reworked SVM interrupt handling logic - fixed vmrun EIP saved value - reworked cr8 handling - added CPUState.hflags2 git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4662 c046a42c-6fe2-441c-8c8c-71466251a162	2008-06-04 17:02:19 +00:00
bellard	914178d34b	32 bit SVM fixes - INVLPG and INVLPGA updates git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4660 c046a42c-6fe2-441c-8c8c-71466251a162	2008-06-04 13:53:05 +00:00
bellard	872929aa59	SVM rework git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4605 c046a42c-6fe2-441c-8c8c-71466251a162	2008-05-28 16:16:54 +00:00
bellard	437a88a51c	proper helper definition registering (all targets must do that) git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4530 c046a42c-6fe2-441c-8c8c-71466251a162	2008-05-22 16:11:04 +00:00
bellard	1b9d9ebb8a	cmpxchg8b fix - added cmpxchg16b git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4522 c046a42c-6fe2-441c-8c8c-71466251a162	2008-05-22 09:52:38 +00:00


60 Commits
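
Note on commit 37b995f6e7: its message contrasts an "atomic" scenario (ADDL done through a single atomic-add helper) with a "cmpxchg" scenario (ADDL done as a retry loop around compare-and-swap). The sketch below illustrates the difference between those two styles using plain C11 atomics. It is not QEMU code: the function names add_fetch_add and add_cmpxchg_loop are hypothetical, and the real helpers operate on guest memory through TCG rather than on a host variable.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Style 1 (hypothetical name): a single atomic read-modify-write.
 * Returns the value the counter held before the addition. */
static uint32_t add_fetch_add(_Atomic uint32_t *p, uint32_t v)
{
    return atomic_fetch_add(p, v);
}

/* Style 2 (hypothetical name): a retry loop around compare-and-swap.
 * Under contention the swap can fail and the loop runs again, which is
 * one plausible reason a cmpxchg loop scales worse than a direct add. */
static uint32_t add_cmpxchg_loop(_Atomic uint32_t *p, uint32_t v)
{
    uint32_t old = atomic_load(p);

    /* On failure, atomic_compare_exchange_weak stores the current value
     * of *p into 'old', so 'old + v' is recomputed on the next attempt. */
    while (!atomic_compare_exchange_weak(p, &old, old + v)) {
        /* retry with the refreshed 'old' */
    }
    return old;
}
```

Both functions leave the counter in the same state when called from one thread; they differ only in how much work is repeated when several threads target the same location.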