Commit Graph

1644 Commits

Author SHA1 Message Date
Philippe Mathieu-Daudé 04ef33052c tcg/tci: do not use ldst label (never implemented)
changed in 659ef5cbb8, this fixes building with --enable-tcg-interpreter:

/home/travis/build/qemu/qemu/tcg/tcg.c:116:14: error: ‘tcg_out_ldst_finalize’ used but never defined [-Werror]
 static bool tcg_out_ldst_finalize(TCGContext *s);
              ^

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20170911022839.23231-1-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-09-11 19:24:05 +01:00
Richard Henderson 53c89efd02 tcg/ppc: Use constant pool for movi
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson 77bfc7c0b4 tcg/ppc: Look for shifted constants
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson 5964fca8a1 tcg/ppc: Change TCG_REG_RA to TCG_REG_TB
At this point the conversion is a wash.  Loading of TB+ofs is
smaller, but the actual return address from exit_tb is larger.
There are a few more insns required to transition between TBs.

But the expectation is that accesses to the constant pool will
on the whole be smaller.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson afe74dbd6a tcg/arm: Use constant pool for call
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson 880ad9626c tcg/arm: Use constant pool for movi
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson 2a8ab93c6b tcg/arm: Extract INSN_NOP
We'll want this for tcg_out_nop_fill.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson 1507061637 tcg/arm: Code rearrangement
Move constants before all of the functions.
Move tcg_out_<format> functions before all
of the others.  No functional change.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson 95ede84f4d tcg/arm: Tighten tlb indexing offset test
We are not going to use ldrd for loading the comparator
for 32-bit guests, so don't limit cmp_off to 8 bits then.
This eliminates one insn in the tlb load for some guests.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson 647ab96aaf tcg/arm: Improve tlb load for armv7
Use UBFX to avoid limitation on CPU_TLB_BITS.  Since we're dropping
the initial shift, we need to replace the page masking.  We can use
MOVW+BIC to do this without shifting.  The result is the same size
as the armv6 path with one less conditional instruction.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson e9823b4c33 tcg/sparc: Use constant pool for movi
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson ab20bdc116 tcg/sparc: Introduce TCG_REG_TB
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson 55129955e9 tcg/aarch64: Use constant pool for movi
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson a534bb15f3 tcg/s390: Use constant pool for cmpi
Also use CHI/CGHI for 16-bit signed constants.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson 5bf67a9217 tcg/s390: Use constant pool for xori
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson 4046d9ca04 tcg/s390: Use constant pool for ori
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson bdcd5d1926 tcg/s390: Use constant pool for andi
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson 28eef8aaec tcg/s390: Use constant pool for movi
Split out maybe_out_small_movi for use with other operations
that want to add to the constant pool.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson e692a3492d tcg/s390: Fix sign of patch_reloc addend
We were passing in -2 instead of +2, but then ignoring
the actual contents of addend in the calculation.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson 829e1376d9 tcg/s390: Introduce TCG_REG_TB
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson 4e45f23943 tcg/i386: Store out-of-range call targets in constant pool
Already it saves 2 bytes per call, but also the constant pool
entry may well be shared across multiple calls.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson 57a269469d tcg: Infrastructure for managing constant pools
A new shared header tcg-pool.inc.c adds new_pool_label,
for registering a tcg_target_ulong to be emitted after
the generated code, plus relocation data to install a
pointer to the data.

A new pointer is added to the TCGContext, so that we
dump the constant pool as data, not code.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson 659ef5cbb8 tcg: Rearrange ldst label tracking
Dispense with TCGBackendData, as it has never been used for more than
holding a single pointer.  Use a define in the cpu/tcg-target.h to
signal requirement for TCGLabelQemuLdst, so that we can drop the no-op
tcg-be-null.h stubs.  Rename tcg-be-ldst.h to tcg-ldst.inc.c.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson a858339336 tcg: Move USE_DIRECT_JUMP discriminator to tcg/cpu/tcg-target.h
Replace the USE_DIRECT_JUMP ifdef with a TCG_TARGET_HAS_direct_jump
boolean test.  Replace the tb_set_jmp_target1 ifdef with an unconditional
function tb_target_set_jmp_target.

While we're touching all backends, add a parameter for tb->tc_ptr;
we're going to need it shortly for some backends.

Move tb_set_jmp_target and tb_add_jump from exec-all.h to cpu-exec.c.

This opens the possibility for TCG_TARGET_HAS_direct_jump to be
a runtime decision -- based on host cpu capabilities, the size of
code_gen_buffer, or a future debugging switch.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:34 -07:00
Richard Henderson cda4a338c4 tcg/tci: Add TCG_TARGET_DEFAULT_MO
Missed being added as part of 71650df7b0.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-09-07 18:57:34 +01:00
Richard Henderson 4609190b5f tcg/s390: Use slbgr for setcond le and leu
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 07:24:46 -07:00
Richard Henderson 7af525af01 tcg/s390: Use load-on-condition-2 facility
This allows LOAD HALFWORD IMMEDIATE ON CONDITION,
eliminating one insn in some common cases.

Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 07:24:41 -07:00
Richard Henderson c2097136ad tcg/s390: Use distinct-operands facility
This allows using a 3-operand insn form for some arithmetic,
logicals and shifts.

Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 07:24:38 -07:00
Richard Henderson e42349cbd6 tcg/s390: Merge ori+xori facilities check to tcg_target_op_def
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 07:24:35 -07:00
Richard Henderson ba18b07dc6 tcg/s390: Merge add2i facilities check to tcg_target_op_def
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 07:24:33 -07:00
Richard Henderson a8f0269e9e tcg/s390: Merge muli facilities check to tcg_target_op_def
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 07:24:31 -07:00
Richard Henderson 07952d9570 tcg/s390: Merge cmpi facilities check to tcg_target_op_def
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 07:24:28 -07:00
Richard Henderson 9b5500b697 tcg/s390: Fully convert tcg_target_op_def
Use a switch instead of searching a table.

Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 07:22:24 -07:00
Pranith Kumar b32dc3370a tcg: Implement implicit ordering semantics
Currently, we cannot use mttcg for running strong memory model guests
on weak memory model hosts due to missing ordering semantics.

We implicitly generate fence instructions for stronger guests if an
ordering mismatch is detected. We generate fences only for the orders
for which fence instructions are necessary, for example a fence is not
necessary between a store and a subsequent load on x86 since its
absence in the guest binary tells that ordering need not be
ensured. Also note that if we find multiple subsequent fence
instructions in the generated IR, we combine them in the TCG
optimization pass.

This patch allows us to boot an x86 guest on ARM64 hosts using mttcg.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170829063313.10237-4-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-09-05 13:41:46 -07:00
Pranith Kumar 71650df7b0 tcg: Add tcg target default memory ordering
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170829063313.10237-3-bobby.prani@gmail.com>
[rth: Dropped ia64 hunk]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-09-05 12:56:40 -07:00
Richard Henderson a46c1244a0 tcg: Remove support for ia64 as host
We threatened to remove ia64 as host in v2.9.0.  Its time has now come.

There are still some usages of defined(__ia64__) throughout the source
code that would be triggered if one were to enable TCI on an ia64 host.
Leave those alone for now.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-09-05 12:39:25 -07:00
Richard Henderson 13aaef678e tcg: Increase minimum alignment from tcg_malloc to 8
For a 64-bit ILP32 host, aligning to sizeof(long) is not enough.
Guess the minimum for any host is 8, as that covers uint64_t.
Qemu doesn't use a host long double or host vectors, except in
extremely limited circumstances.

Fixes a bus error for a sparc v8plus host.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-08-03 11:00:30 -07:00
Richard Henderson ca671de8af tcg/arm: Fix runtime overalignment test
Patch 85aa80813d changed the IF emitting the TST instruction,
but failed to change the ?: converting CMP to CMPEQ, so the
result of the TST is ignored.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-08-03 10:56:44 -07:00
Philippe Mathieu-Daudé b208ac07ea docs: fix broken paths to docs/devel/atomics.txt
With the move of some docs/ to docs/devel/ on ac06724a71,
a couple of references were not updated.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-07-31 13:12:47 +03:00
Richard Henderson 5dd8990841 util: Introduce include/qemu/cpuid.h
Clang 3.9 passes the CONFIG_AVX2_OPT configure test.  However, the
supplied <cpuid.h> does not contain the bit_AVX2 define that we use
when detecting whether the routine can be enabled.

Introduce a qemu-specific header that uses the compiler's definition
of __cpuid et al, but supplies any missing bit_* definitions needed.
This avoids introducing any extra ifdefs to util/bufferiszero.c, and
allows quite a few to be removed from tcg/i386/tcg-target.inc.c.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20170719044018.18063-1-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-24 12:42:55 +01:00
Philippe Mathieu-Daudé 797ed66d29 tcg/tci: enable bswap16_i64
Altough correctly implemented, bswap16_i64() never got tested/executed so the
safety TODO() statement was never removed.

Since it got now tested the TODO() can be removed.

while running Alex Bennée's image aarch64-linux-3.15rc2-buildroot.img:

Trace 0x7fa1904b0890 [0: ffffffc00036cd04]
----------------
IN:
0xffffffc00036cd24:  5ac00694      rev16 w20, w20

OP:
 ---- ffffffc00036cd24 0000000000000000 0000000000000000
 ext32u_i64 tmp3,x20
 ext16u_i64 tmp2,tmp3
 bswap16_i64 x20,tmp2
 movi_i64 tmp4,$0x10
 shr_i64 tmp2,tmp3,tmp4
 ext16u_i64 tmp2,tmp2
 bswap16_i64 tmp2,tmp2
 deposit_i64 x20,x20,tmp2,$0x10,$0x10

Linking TBs 0x7fa1904b0890 [ffffffc00036cd04] index 0 -> 0x7fa1904b0aa0 [ffffffc00036cd24]
Trace 0x7fa1904b0aa0 [0: ffffffc00036cd24]
TODO qemu/tci.c:1049: tcg_qemu_tb_exec()
qemu/tci.c:1049: tcg fatal error
Aborted

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jaroslaw Pelczar <j.pelczar@samsung.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20170718045540.16322-11-f4bug@amsat.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-07-19 14:45:16 -07:00
Jiang Biao 4df9cac57f tcg/mips: reserve a register for the guest_base.
Reserve a register for the guest_base using ppc code for reference.
By doing so, we do not have to recompute it for every memory load.

Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1499677934-2249-1-git-send-email-jiang.biao2@zte.com.cn>
2017-07-19 14:45:15 -07:00
Lluís Vilanova 61a67f71dd exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state
Every vCPU now uses a separate set of TBs for each set of dynamic
tracing event state values. Each set of TBs can be used by any number of
vCPUs to maximize TB reuse when vCPUs have the same tracing state.

This feature is later used by tracetool to optimize tracing of guest
code events.

The maximum number of TB sets is defined as 2^E, where E is the number
of events that have the 'vcpu' property (their state is stored in
CPUState->trace_dstate).

For this to work, a change on the dynamic tracing state of a vCPU will
force it to flush its virtual TB cache (which is only indexed by
address), and fall back to the physical TB cache (which now contains the
vCPU's dynamic tracing state as part of the hashing function).

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-id: 149915775266.6295.10060144081246467690.stgit@frigg.lan
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-07-17 13:11:05 +01:00
Jiang Biao 8b8d768f19 tcg/mips: Bugfix for crash when running program with qemu-i386.
When running a helloworld program with qemu-i386 in linux-user
mode on Loongson 3A3000, it will crash. This patch fix the bug.

Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Message-Id: <1499669979-25904-1-git-send-email-jiang.biao2@zte.com.cn>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-07-09 21:11:38 -10:00
Pranith Kumar 2acee8b2b5 tcg/aarch64: Enable indirect jump path using LDR (literal)
This patch enables the indirect jump path using an LDR (literal)
instruction. It will be interesting to test and see which performs
better among the two paths.

CC: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170630143614.31059-3-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-07-09 21:10:23 -10:00
Pranith Kumar b68686bd4b tcg/aarch64: Use ADRP+ADD to compute target address
We use ADRP+ADD to compute the target address for goto_tb. This patch
introduces the NOP instruction which is used to align the above
instruction pair so that we can use one atomic instruction to patch
the destination offsets.

CC: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170630143614.31059-2-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-07-09 21:10:23 -10:00
Pranith Kumar 23b7aa1d2a tcg/aarch64: Introduce and use long branch to register
We can use a branch to register instruction for exit_tb for offsets
greater than 128MB.

CC: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170630143614.31059-1-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-07-09 21:10:23 -10:00
Paolo Bonzini beeaef55e4 tcg: move tb_lock out of translate-all.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 16:01:16 +02:00
Peter Maydell db7a99cdc1 Queued TCG patches
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZSBP2AAoJEK0ScMxN0CebnyMH/1ZiDhYiqCD7PYfk4/Y7Db+h
 MNKNozrWKyChWQp1RzwWqcBaIzbuMZkDYn8dfS419PNtFRNoYtHjhYvjSTfcrxS0
 U8dGOoqQUHCr/jlyIDUE4y5+aFA9R/1Ih5IQv+QCi5QNXcfeST8zcYF+ImuikP6C
 7heIc7dE9kXdA8ycWJ39kYErHK9qEJbvDx6dxMPmb4cM36U239Zb9so985TXULlQ
 LoHrDpOCBzCbsICBE8iP2RKDvcwENIx21Dwv+9gW/NqR+nRdKcxhTjKEodkS8gl/
 UxMxM/TjIPQOLLUhdck5DFgIgBgQWHRqPMJKqt466I0JlXvSpifmWxckWzslXLc=
 =R+em
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20170619' into staging

Queued TCG patches

# gpg: Signature made Mon 19 Jun 2017 19:12:06 BST
# gpg:                using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC  16A4 AD12 70CC 4DD0 279B

* remotes/rth/tags/pull-tcg-20170619:
  target/arm: Exit after clearing aarch64 interrupt mask
  target/s390x: Exit after changing PSW mask
  target/alpha: Use tcg_gen_lookup_and_goto_ptr
  tcg: Increase hit rate of lookup_tb_ptr
  tcg/arm: Use ldr (literal) for goto_tb
  tcg/arm: Try pc-relative addresses for movi
  tcg/arm: Remove limit on code buffer size
  tcg/arm: Use indirect branch for goto_tb
  tcg/aarch64: Use ADR in tcg_out_movi
  translate-all: consolidate tb init in tb_gen_code
  tcg: allocate TB structs before the corresponding translated code
  util: add cacheinfo

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-22 10:25:03 +01:00
Richard Henderson 308714e6bc tcg/arm: Use ldr (literal) for goto_tb
The new placement of the TB means that we can use one insn
to load the goto_tb destination directly from the TB.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-19 11:10:59 -07:00
Richard Henderson 9c39b94f14 tcg/arm: Try pc-relative addresses for movi
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-19 11:10:59 -07:00
Richard Henderson 3fb53fb4d1 tcg/arm: Use indirect branch for goto_tb
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-19 11:10:59 -07:00
Richard Henderson cc74d332ff tcg/aarch64: Use ADR in tcg_out_movi
The new placement of the TB means that we can use one insn
to load the return value for exit_tb returning the TB pointer.

Tested-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-19 11:10:59 -07:00
Emilio G. Cota 6e3b2bfd6a tcg: allocate TB structs before the corresponding translated code
Allocating an arbitrarily-sized array of tbs results in either
(a) a lot of memory wasted or (b) unnecessary flushes of the code
cache when we run out of TB structs in the array.

An obvious solution would be to just malloc a TB struct when needed,
and keep the TB array as an array of pointers (recall that tb_find_pc()
needs the TB array to run in O(log n)).

Perhaps a better solution, which is implemented in this patch, is to
allocate TB's right before the translated code they describe. This
results in some memory waste due to padding to have code and TBs in
separate cache lines--for instance, I measured 4.7% of padding in the
used portion of code_gen_buffer when booting aarch64 Linux on a
host with 64-byte cache lines. However, it can allow for optimizations
in some host architectures, since TCG backends could safely assume that
the TB and the corresponding translated code are very close to each
other in memory. See this message by rth for a detailed explanation:

  https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05172.html
  Subject: Re: GSoC 2017 Proposal: TCG performance enhancements
  Message-ID: <1e67644b-4b30-887e-d329-1848e94c9484@twiddle.net>

Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1496790745-314-3-git-send-email-cota@braap.org>
[rth: Simplify the arithmetic in tcg_tb_alloc]
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-19 11:10:59 -07:00
Emilio G. Cota b255b2c8a5 util: add cacheinfo
Add helpers to gather cache info from the host at init-time.

For now, only export the host's I/D cache line sizes, which we
will use to improve cache locality to avoid false sharing.

Suggested-by: Richard Henderson <rth@twiddle.net>
Suggested-by: Geert Martin Ijewski <gm.ijewski@web.de>
Tested-by:    Geert Martin Ijewski <gm.ijewski@web.de>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1496794624-4083-1-git-send-email-cota@braap.org>
[rth: Move all implementations from tcg/ppc/]
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-19 11:10:59 -07:00
Yang Zhong 244f144134 tcg: move tcg backend files into accel/tcg/
move tcg-runtime.c, translate-all.(ch) and translate-common.c into
accel/tcg/ subdirectory and updated related trace-events file.

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <1496383606-18060-4-git-send-email-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-15 11:04:06 +02:00
Aurelien Jarno 5786e0683c tcg/mips: implement goto_ptr
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170430145254.25616-2-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson 085c648bef tcg/arm: Implement goto_ptr
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson 702a947484 tcg/arm: Clarify tcg_out_bx for arm4 host
In theory this would re-enable usage of QEMU on an armv4 host.
Whether this is worthwhile is debatable -- we've been unconditionally
issuing the armv5t BX instruction in the prologue since 2011 without
complaint.  Possibly we should simply require an armv6 host.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson 46644483ca tcg/s390: Implement goto_ptr
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson 38f81dc593 tcg/sparc: Implement goto_ptr
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson b19f0c2e7d tcg/aarch64: Implement goto_ptr
Measurements:

                      SPECint06 (test set), x86_64-linux-user. Host: APM 64-bit ARMv8 (Atlas/A57) @ 2.4 GHz

 1.45x +-+-------------------------------------------------------------------------------------------------------------+-+
       |                                      *****                                                                      |
       |      +++                             *   *                                                    +goto-ptr         |
  1.4x +-+...*****............................*...*....................................................................+-+
       |     *+++*                            *   *                            +++                                       |
 1.35x +-+...*...*............................*...*...........................*****....................................+-+
       |     *   *                            *   *                           *+++*                                      |
       |     *   *                            *   *                           *   *                                      |
  1.3x +-+...*...*............................*...*...........................*...*....................................+-+
       |     *   *                            *   *                           *   *                                      |
       |     *   *                            *   *                           *   *                    *****             |
 1.25x +-+...*...*...........*****............*...*...........................*...*............*****...*...*...........+-+
       |     *   *           *   *            *   *                           *   *            *+++*   *   *             |
  1.2x +-+...*...*...........*...*............*...*...........................*...*............*...*...*...*...........+-+
       |     *   *           *   *            *   *                           *   *            *   *   *   *             |
       |     *   *           *   *            *   *                           *   *            *   *   *   *   *****     |
 1.15x +-+...*...*...........*...*............*...*...........................*...*............*...*...*...*...*...*...+-+
       |     *   *           *   *            *   *                           *   *    +++     *   *   *   *   *   *     |
       |     *   *           *   *            *   *                           *   *   *****    *   *   *   *   *   *     |
  1.1x +-+...*...*...........*...*....*****...*...*...*****...................*...*...*...*....*...*...*...*...*...*...+-+
       |     *   *           *   *    *   *   *   *   *   *                   *   *   *   *    *   *   *   *   *   *     |
 1.05x +-+...*...*...........*...*....*...*...*...*...*...*...................*...*...*...*....*...*...*...*...*...*...+-+
       |     *   *   *****   *   *    *   *   *   *   *   *                   *   *   *   *    *   *   *   *   *   *     |
       |     *   *   *   *   *   *    *   *   *   *   *   *   *****   *****   *   *   *   *    *   *   *   *   *   *     |
    1x +-+---*****---*****---*****----*****---*****---*****---*****---*****---*****---*****----*****---*****---*****---+-+
          astar   bzip2     gcc    gobmk h264ref   hmmlibquantum     mcf omnetpperlbench    sjenxalancbmk   hmean
  png: http://imgur.com/en9HE8L

Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson 0c240785a8 tcg/ppc: Implement goto_ptr
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Emilio G. Cota 5cb4ef80f6 tcg/i386: implement goto_ptr
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1493263764-18657-6-git-send-email-cota@braap.org>
[rth: Reuse goto_ptr epilogue for exit_tb 0.]
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Emilio G. Cota cedbcb0152 tcg: Introduce goto_ptr opcode and tcg_gen_lookup_and_goto_ptr
Instead of exporting goto_ptr directly to TCG frontends, export
tcg_gen_lookup_and_goto_ptr(), which calls goto_ptr with the pointer
returned by the lookup_tb_ptr() helper. This is the only use case
we have for goto_ptr and lookup_tb_ptr, so having this function is
very convenient. Furthermore, it trivially allows us to avoid calling
the lookup helper if goto_ptr is not implemented by the backend.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1493263764-18657-2-git-send-email-cota@braap.org>
Message-Id: <1493263764-18657-3-git-send-email-cota@braap.org>
Message-Id: <1493263764-18657-4-git-send-email-cota@braap.org>
Message-Id: <1493263764-18657-5-git-send-email-cota@braap.org>
[rth: Squashed 4 related commits.]
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Aurelien Jarno 2f5a5f5774 tcg/mips: fix field extraction opcode
The "msb" argument should correspond to (len - 1).

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-06 12:48:53 +02:00
Richard Henderson 79b1af9062 tcg: Initialize return value after exit_atomic
Users of tcg_gen_atomic_cmpxchg and do_atomic_op rightfully utilize
the output.  Even though this code is dead, it gets translated, and
without the initialization we encounter a tcg_error.

Reported-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Tested-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-04-26 19:26:11 +02:00
Peter Maydell fa54abb8c2 Drop QEMU_GNUC_PREREQ() checks for gcc older than 4.1
We already require gcc 4.1 or newer (for the atomic
support), so the fallback codepaths for older gcc
versions than that are now dead code and we can
just delete them.

NB: clang reports itself as gcc 4.2 (regardless of
clang version), so clang won't be using the fallbacks
either.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2017-04-20 18:33:33 +01:00
Peter Maydell 5c32be5baf tcg/sparc: Zero extend address argument to ld/st helpers
The C store helper functions take the address argument as a
target_ulong type; if this is 32 bit but the host is 64 bit
then the SPARC calling convention requires that the caller
must zero extend the value. We weren't doing this, which
meant we could pass values to the caller with high bits set
and QEMU would crash if it was compiled with optimizations.
In particular, the i386 BIOS would not start.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1490871151-29029-3-git-send-email-peter.maydell@linaro.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-04-03 12:59:37 +01:00
Peter Maydell 709a340d67 tcg/sparc: Zero extend data argument to store helpers
The C store helper functions take the data argument as a uint8_t,
uint16_t, etc depending on the store size. The SPARC calling
convention requires that data types smaller than the register
size must be extended by the caller. We weren't doing this,
which meant that if QEMU was compiled with optimizations enabled
we could end up storing incorrect values to guest memory.
(In particular the i386 guest BIOS would crash on startup.)

Add code to the trampolines that call the store helpers to
do the zero extension as required.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1490871151-29029-2-git-send-email-peter.maydell@linaro.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-04-03 12:59:14 +01:00
Paolo Bonzini 30f3dda24b Merge branch 'icount-update' into HEAD
Merge the original development branch due to breakage caused by the
MTTCG merge.

Conflicts:
	cpu-exec.c
	translate-common.c

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:39:18 +01:00
Pranith Kumar dc1eccd661 aarch64: Change ext type to TCGType to fix warnings
To fix the following warnings:

In file included from /users/pranith/qemu/tcg/tcg.c:255:
/users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:879:24: warning: implicit conversion from enumeration type 'TCGMemOp' (aka 'enum TCGMemOp') to different enumeration type 'TCGType' (aka 'enum TCGType')
      [-Wenum-conversion]
        tcg_out_cmp(s, ext, a, b, b_const);
        ~~~~~~~~~~~    ^~~
/users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:893:36: warning: implicit conversion from enumeration type 'TCGMemOp' (aka 'enum TCGMemOp') to different enumeration type 'TCGType' (aka 'enum TCGType')
      [-Wenum-conversion]
        tcg_out_insn(s, 3201, CBZ, ext, a, offset);
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
/users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:389:65: note: expanded from macro 'tcg_out_insn'
    glue(tcg_out_insn_,FMT)(S, glue(glue(glue(I,FMT),_),OP), ## __VA_ARGS__)
                                                                ^
/users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:895:37: warning: implicit conversion from enumeration type 'TCGMemOp' (aka 'enum TCGMemOp') to different enumeration type 'TCGType' (aka 'enum TCGType')
      [-Wenum-conversion]
        tcg_out_insn(s, 3201, CBNZ, ext, a, offset);
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
/users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:389:65: note: expanded from macro 'tcg_out_insn'
    glue(tcg_out_insn_,FMT)(S, glue(glue(glue(I,FMT),_),OP), ## __VA_ARGS__)
                                                                ^
/users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:1610:27: warning: implicit conversion from enumeration type 'TCGType' (aka 'enum TCGType') to different enumeration type 'TCGMemOp' (aka 'enum TCGMemOp')
      [-Wenum-conversion]
        tcg_out_brcond(s, ext, a2, a0, a1, const_args[1], arg_label(args[3]));
        ~~~~~~~~~~~~~~    ^~~

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170217154311.13920-1-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-03-01 08:28:06 +11:00
Alex Bennée ca759f9e38 tcg: enable MTTCG by default for ARM on x86 hosts
This enables the multi-threaded system emulation by default for ARMv7
and ARMv8 guests using the x86_64 TCG backend. This is because on the
guest side:

  - The ARM translate.c/translate-64.c have been converted to
    - use MTTCG safe atomic primitives
    - emit the appropriate barrier ops
  - The ARM machine has been updated to
    - hold the BQL when modifying shared cross-vCPU state
    - defer powerctl changes to async safe work

All the host backends support the barrier and atomic primitives but
need to provide same-or-better support for normal load/store
operations.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Pranith Kumar <bobby.prani@gmail.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2017-02-24 10:32:46 +00:00
KONRAD Frederic 8d4e9146b3 tcg: add options for enabling MTTCG
We know there will be cases where MTTCG won't work until additional work
is done in the front/back ends to support. It will however be useful to
be able to turn it on.

As a result MTTCG will default to off unless the combination is
supported. However the user can turn it on for the sake of testing.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
[AJB: move to -accel tcg,thread=multi|single, defaults]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-02-24 10:32:45 +00:00
Alex Bennée 2093714314 tcg: move TCG_MO/BAR types into own file
We'll be using the memory ordering definitions to define values for
both the host and guest. To avoid fighting with circular header
dependencies just move these types into their own minimal header.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-02-24 10:32:45 +00:00
Paolo Bonzini 1aab16c28a cpu-exec: unify icount_decr and tcg_exit_req
The icount interrupt flag and tcg_exit_req serve almost the same
purpose, let's make them completely the same.

The former TB_EXIT_REQUESTED and TB_EXIT_ICOUNT_EXPIRED cases are
unified, since we can distinguish them from the value of the
interrupt flag.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-22 14:56:34 +01:00
Stefan Weil 77e217d1bf tci: Remove invalid assertions
tb_jmp_insn_offset and tb_jmp_reset_offset are pointers
and cannot be used with ARRAY_SIZE.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20170202195601.11286-1-sw@weilnetz.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-03 11:38:55 +00:00
Richard Henderson 39f099ec9d tcg/i386: Always use TZCNT when available
I think this is cleaner than sometimes using BSF.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-17 12:02:08 -08:00
Richard Henderson 9bf38308f6 Revert "tcg/i386: Rely on undefined/undocumented behaviour of BSF/BSR"
This reverts commit 4ac7691073.

This fixes
  http://lists.nongnu.org/archive/html/qemu-devel/2017-01/msg03062.html

While I think we could get away with relying on the undocumented
behaviour, the tcg constraint system isn't powerful enough to
properly describe the required (non-)overlap conditions.

Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-17 11:59:13 -08:00
Richard Henderson 8cf9a3d3f7 tcg/aarch64: Fix tcg_out_movi
There were some patterns, like 0x0000_ffff_ffff_00ff, for which we
would select to begin a multi-insn sequence with MOVN, but would
fail to set the 0x0000 lane back from 0xffff.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161207180727.6286-3-rth@twiddle.net>
2017-01-13 11:47:29 -08:00
Richard Henderson b1eb20da62 tcg/aarch64: Fix addsub2 for 0+C
When al == xzr, we cannot use addi/subi because that encodes xsp.
Force a zero into the temp register for that (rare) case.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161207180727.6286-2-rth@twiddle.net>
2017-01-13 11:46:27 -08:00
Richard Henderson a32b6ae897 tcg/s390: Fix merge error with facilities
The variable was renamed s390_facilities.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-13 09:30:40 -08:00
Richard Henderson 993508e43e tcg/i386: Handle ctpop opcode
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:49:59 -08:00
Richard Henderson 33e75fb9c8 tcg/ppc: Handle ctpop opcode
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:49:59 -08:00
Richard Henderson 14e99210f6 tcg: Use ctpop to generate ctz if needed
Particularly when andc is also available, this is two insns
shorter than using clz to compute ctz.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:49:59 -08:00
Richard Henderson a768e4e992 tcg: Add opcode for ctpop
The number of actual invocations of ctpop itself does not warrent
an opcode, but it is very helpful for POWER7 to use in generating
an expansion for ctz.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:48:56 -08:00
Richard Henderson 086920c2c8 tcg: Add helpers for clrsb
The number of actual invocations does not warrent an opcode,
and the backends generating it.  But at least we can eliminate
redundant helpers.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:47:48 -08:00
Richard Henderson 4ac7691073 tcg/i386: Rely on undefined/undocumented behaviour of BSF/BSR
The ISA manual documents the output is undefined if the input was zero.

However, we document in target-i386 that the behavior of real silicon
is to preserve the contents of the output register.  We also mention
that there are real applications that depend on this.  That this is
baked into silicon is mentioned as a potential cause for some false
sharing behaviour wrt lzcnt/tzcnt.

Taking advantage of this allows us to save 2 insns in the normal case,
and 4 insns for i686 emulating a 64-bit clz.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:47:48 -08:00
Richard Henderson bbf25f90ba tcg/i386: Handle ctz and clz opcodes
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:47:48 -08:00
Richard Henderson 6a5aed4bdc tcg/i386: Allow bmi2 shiftx to have non-matching operands
Previously we could not have different constraints for different ISA levels,
which prevented us from eliding the matching constraint for shifts.

We do now have to make sure that the operands match for constant shifts.
We can also handle some small left shifts via lea.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:47:48 -08:00
Richard Henderson 42d5b51492 tcg/i386: Hoist common arguments in tcg_out_op
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:47:48 -08:00
Richard Henderson cd26449a50 tcg/i386: Fuly convert tcg_target_op_def
Use a switch instead of searching a table.  Share constraints between
32-bit and 64-bit, when at all possible.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:47:48 -08:00
Richard Henderson ce411066f4 tcg/s390: Handle clz opcode
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:47:48 -08:00
Richard Henderson 2a1d9d41ae tcg/mips: Handle clz opcode
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:47:48 -08:00
Richard Henderson cc0fec8a4d tcg/arm: Handle ctz and clz opcodes
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:06:11 -08:00
Richard Henderson 53c76c1990 tcg/aarch64: Handle ctz and clz opcodes
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:06:11 -08:00
Richard Henderson d0b07481fa tcg/ppc: Handle ctz and clz opcodes
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:06:11 -08:00
Richard Henderson 0e28d0063b tcg: Add clz and ctz opcodes
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:06:11 -08:00
Richard Henderson 17280ff4a5 tcg: Allow an operand to be matching or a constant
This allows an output operand to match an input operand
only when the input operand needs a register.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:06:11 -08:00
Richard Henderson 069ea736b5 tcg: Pass the opcode width to target_parse_constraint
This will let us choose how to interpret a given constraint
depending on whether the opcode is 32- or 64-bit.  Which will
let us share more constraint combinations between opcodes.

At the same time, change the interface to return the advanced
pointer instead of passing it in/out by reference.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:06:11 -08:00
Richard Henderson f69d277ece tcg: Transition flat op_defs array to a target callback
This will allow the target to tailor the constraints to the
auto-detected ISA extensions.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:06:11 -08:00
Richard Henderson 82790a8709 tcg: Add markup for output requires new register
This is the same concept as, and same markup as, the
early clobber markup in gcc.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:06:11 -08:00
Richard Henderson 333b21b809 tcg/optimize: Fold movcond 0/1 into setcond
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:06:10 -08:00
Richard Henderson 752b1be947 tcg/s390: Support deposit into zero
Since we can no longer use matching constraints, this does
mean we must handle that data movement by hand.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:06:10 -08:00
Richard Henderson b0bf5fe82d tcg/s390: Implement field extraction opcodes
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:06:10 -08:00
Richard Henderson b2c98d9d39 tcg/s390: Expose host facilities to tcg-target.h
This lets us expose facilities to TCG_TARGET_HAS_* defines
directly, rather than hiding behind function calls.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:06:10 -08:00
Richard Henderson c05021c3c8 tcg/ppc: Implement field extraction opcodes
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:03:59 -08:00
Richard Henderson befbb3ced5 tcg/mips: Implement field extraction opcodes
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:03:59 -08:00
Richard Henderson 78fdbfb946 tcg/i386: Implement field extraction opcodes
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 07:59:11 -08:00
Richard Henderson ec903af184 tcg/arm: Implement field extraction opcodes
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 07:59:11 -08:00
Richard Henderson 40b2ccb156 tcg/arm: Move isa detection to tcg-target.h
This allows us to use this detection within the TCG_TARGET_HAS_*
macros, instead of requiring a function call into tcg-target.inc.c.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 07:59:11 -08:00
Richard Henderson e2179f94a1 tcg/aarch64: Implement field extraction opcodes
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 07:59:11 -08:00
Richard Henderson 07cc68d528 tcg: Add deposit_z expander
While we don't require a new opcode, it is handy to have an expander
that knows the first source is zero.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 07:59:11 -08:00
Richard Henderson 0d0d309df0 tcg: Minor adjustments to deposit expanders
Assert that len is not 0.

Since we have asserted that ofs + len <= N, a later
check for len == N implies that ofs == 0.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 07:59:11 -08:00
Richard Henderson 7ec8bab3de tcg: Add field extraction primitives
Adds tcg_gen_extract_* and tcg_gen_sextract_* for extraction of
fixed position bitfields, much like we already have for deposit.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 07:59:11 -08:00
Jin Guojie f0d703314e tcg-mips: Adjust qemu_ld/st for mips64
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-11-git-send-email-jinguojie@loongson.cn>
2017-01-06 10:09:10 -08:00
Jin Guojie 999b941633 tcg-mips: Adjust calling conventions for mips64
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-10-git-send-email-jinguojie@loongson.cn>
2017-01-06 10:03:54 -08:00
Jin Guojie 98d690761a tcg-mips: Add tcg unwind info
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-9-git-send-email-jinguojie@loongson.cn>
2017-01-06 10:03:54 -08:00
Jin Guojie 0973b1cff8 tcg-mips: Adjust prologue for mips64
Take stack frame parameters out from the function body.

Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-8-git-send-email-jinguojie@loongson.cn>
2017-01-06 10:03:54 -08:00
Jin Guojie 32b69707df tcg-mips: Adjust load/store functions for mips64
tcg_out_ldst: using a generic ALIAS_PADD to avoid ifdefs
tcg_out_ld: generates LD or LW
tcg_out_st: generates SD or SW

Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-7-git-send-email-jinguojie@loongson.cn>
2017-01-06 10:03:54 -08:00
Jin Guojie 2294d05dab tcg-mips: Adjust move functions for mips64
tcg_out_mov: using OPC_OR as most mips assemblers do;
tcg_out_movi: extended to 64-bit immediate.

Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-6-git-send-email-jinguojie@loongson.cn>
2017-01-06 10:03:54 -08:00
Jin Guojie 7f54eaa3b7 tcg-mips: Add bswap32u and bswap64
Without the mips32r2 instructions to perform swapping, bswap is quite large,
dominating the size of each reverse-endian qemu_ld/qemu_st operation.

Create two subroutines in the prologue block.  The subroutines require extra
reserved registers (TCG_TMP[2, 3]).  Using these within qemu_ld means that
we need not place additional restrictions on the qemu_ld outputs.

Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-5-git-send-email-jinguojie@loongson.cn>
2017-01-06 10:03:54 -08:00
Jin Guojie 0119b1927d tcg-mips: Support 64-bit opcodes
Bulk patch adding 64-bit opcodes into tcg_out_op.  Note that
mips64 is as yet neither complete nor enabled.

Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-4-git-send-email-jinguojie@loongson.cn>
2017-01-06 10:03:54 -08:00
Jin Guojie 57a701fc2b tcg-mips: Add mips64 opcodes
Since the mips manual tables are in octal, reorg all of the opcodes
into that format for clarity.  Note that the 64-bit opcodes are as
yet unused.

Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-3-git-send-email-jinguojie@loongson.cn>
2017-01-06 10:03:54 -08:00
Jin Guojie bb08afe9f0 tcg-mips: Move bswap code to a subroutine
Without the mips32r2 instructions to perform swapping, bswap is quite large,
dominating the size of each reverse-endian qemu_ld/qemu_st operation.

Create a subroutine in the prologue block.  The subroutine requires extra
reserved registers (TCG_TMP[2, 3]).  Using these within qemu_ld means that
we need not place additional restrictions on the qemu_ld outputs.

Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-2-git-send-email-jinguojie@loongson.cn>
2017-01-06 10:03:54 -08:00
Richard Henderson e45d4ef6e3 tcg/s390: Remove 'R' constraint
Since R0 is reserved, we don't need a special case constraint.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-12-23 19:38:27 -08:00
Richard Henderson 65839b56b9 tcg/s390: Fix setcond expansion
We can't use LOAD AND TEST for unsigned data and then expect to
extract the result with ADD LOGICAL WITH CARRY.  Fall through to
using COMPARE LOGICAL IMMEDIATE instead.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-12-23 19:38:27 -08:00
Joseph Myers 3ff91d7e85 tcg: correct 32-bit tcg_gen_ld8s_i64 sign-extension
The version of tcg_gen_ld8s_i64 for 32-bit systems does a load into
the low part of the return value - then attempts a sign extension into
the high part, but wrongly sets the high part to a sign extension of
itself rather than of the low part.  This results in TCG internal
errors from the use of the uninitialized high part (in some GCC tests
of AArch64 NEON shift intrinsics, in particular).  This patch corrects
the sign-extension logic, making it match other functions such as
tcg_gen_ld16s_i64.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.20.1610272333560.22353@digraph.polyomino.org.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-11-01 10:30:45 -06:00
Peter Maydell a40d4701bc tcg/tcg.h: Improve documentation of TCGv_i32 etc types
The typedefs we use for the TCGv_i32, TCGv_i64 and TCGv_ptr
types are somewhat confusing, because we define them as
pointers to structs, but the structs themselves are never
defined. Explain in the comments a bit more clearly why
this is OK and what is going on under the hood.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1477067922-26202-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-11-01 10:30:45 -06:00
Richard Henderson 5087abfb7d tcg: Add tcg_gen_mulsu2_{i32,i64,tl}
This multiply has one signed input and one unsigned input,
producing the full double-width result.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1475011433-24456-2-git-send-email-rth@twiddle.net>
2016-11-01 10:30:45 -06:00
Richard Henderson 1ee73216f4 log: Add locking to large logging blocks
Reuse the existing locking provided by stdio to keep in_asm, cpu,
op, op_opt, op_ind, and out_asm as contiguous blocks.

While it isn't possible to interleave e.g. in_asm or op_opt logs
because of the TB lock protecting all code generation, it is
possible to interleave cpu logs, or to interleave a cpu dump with
an out_asm dump.

For mingw32, we appear to have no viable solution for this.  The locking
functions are not properly exported from the system runtime library.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-11-01 10:29:03 -06:00
Paolo Bonzini 7d7500d998 tcg: comment on which functions have to be called with tb_lock held
softmmu requires more functions to be thread-safe, because translation
blocks can be invalidated from e.g. notdirty callbacks.  Probably the
same holds for user-mode emulation, it's just that no one has ever
tried to produce a coherent locking there.

This patch will guide the introduction of more tb_lock and tb_unlock
calls for system emulation.

Note that after this patch some (most) of the mentioned functions are
still called outside tb_lock/tb_unlock.  The next one will rectify this.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-7-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 10:51:16 +01:00
Richard Henderson 91682118aa tcg: Emit barriers with parallel_cpus
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Richard Henderson df79b996a7 tcg: Add CONFIG_ATOMIC64
Allow qemu to build on 32-bit hosts without 64-bit atomic ops.

Even if we only allow 32-bit hosts to multi-thread emulate 32-bit
guests, we still need some way to handle the 32-bit guest using a
64-bit atomic operation.  Do so by dropping back to single-step.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Richard Henderson 7ebee43ee3 tcg: Add atomic128 helpers
Force the use of cmpxchg16b on x86_64.

Wikipedia suggests that only very old AMD64 (circa 2004) did not have
this instruction.  Further, it's required by Windows 8 so no new cpus
will ever omit it.

If we truely care about these, then we could check this at startup time
and then avoid executing paths that use it.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Richard Henderson c482cb117c tcg: Add atomic helpers
Add all of cmpxchg, op_fetch, fetch_op, and xchg.
Handle both endian-ness, and sizes up to 8.
Handle expanding non-atomically, when emulating in serial.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Richard Henderson fdbc2b5722 tcg: Add EXCP_ATOMIC
When we cannot emulate an atomic operation within a parallel
context, this exception allows us to stop the world and try
again in a serial context.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Paolo Bonzini 0fe4fca4e1 tcg: try sti when moving a constant into a dead memory temp
This comes from free from unifying tcg_reg_alloc_mov and
tcg_reg_alloc_movi's handling of TEMP_VAL_CONST.  It triggers
often on moves to cc_dst, such as the following translation
of "sub $0x3c,%esp":

  before:                          after:
  subl   $0x3c,%ebp                subl   $0x3c,%ebp
  movl   %ebp,0x10(%r14)           movl   %ebp,0x10(%r14)
  movl   $0x3c,%ebx                movl   $0x3c,0x2c(%r14)
  movl   %ebx,0x2c(%r14)

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1473945360-13663-1-git-send-email-pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:19 +02:00
Paolo Bonzini bf28a69eeb qemu-tech: move text from qemu-tech to tcg/README
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-07 10:05:18 +02:00
Alex Bennée 550276ae0a tcg/optimize: move default return out of if statement
This is to appease sanitizer builds which complain that:

  "error: control reaches end of non-void function"

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20160930213106.20186-5-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-04 10:00:25 +02:00
Richard Henderson ebb90a005d tcg/i386: Extend TARGET_PAGE_MASK to the proper type
TARGET_PAGE_MASK, as defined, has type "int".  We need to extend
that to the proper target width before oring in an "unsigned".

Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-09-20 11:45:30 -07:00
Pranith Kumar 34f939218c tcg: Optimize fence instructions
This commit optimizes fence instructions.  Two optimizations are
currently implemented: (1) unnecessary duplicate fence instructions,
and (2) merging weaker fences into a stronger fence.

[rth: Merge tcg_optimize_mb back into tcg_optimize, so that we only
loop over the opcode stream once.  Merge "unrelated" weaker barriers
into one stronger barrier.]

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20160823134825.32578-1-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-09-16 08:12:12 -07:00
Pranith Kumar a1e69e2f81 tcg/tci: Add support for fence
Cc: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20160714202026.9727-11-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-09-16 08:12:12 -07:00
Pranith Kumar f8f03b3707 tcg/sparc: Add support for fence
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20160714202026.9727-10-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-09-16 08:12:11 -07:00
Pranith Kumar c9314d610e tcg/s390: Add support for fence
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20160714202026.9727-9-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-09-16 08:12:11 -07:00
Pranith Kumar 7b4af5ee8a tcg/ppc: Add support for fence
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20160714202026.9727-8-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-09-16 08:12:11 -07:00
Pranith Kumar 6f0b99104a tcg/mips: Add support for fence
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20160714202026.9727-7-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-09-16 08:12:11 -07:00
Pranith Kumar 5bbadbdfd6 tcg/ia64: Add support for fence
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20160714202026.9727-6-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-09-16 08:12:11 -07:00
Pranith Kumar 40f191ab82 tcg/arm: Add support for fence
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20160714202026.9727-5-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-09-16 08:12:11 -07:00
Pranith Kumar c7a59c2a92 tcg/aarch64: Add support for fence
Cc: Claudio Fontana <claudio.fontana@gmail.com>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20160714202026.9727-4-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-09-16 08:12:11 -07:00