Commit Graph

61897 Commits

Author SHA1 Message Date
Richard Henderson 9f75462065 tcg: Reduce max TB opcode count
Also, assert that we don't overflow any of two different offsets into
the TB. Both unwind and goto_tb both record a uint16_t for later use.

This fixes an arm-softmmu test case utilizing NEON in which there is
a TB generated that runs to 7800 opcodes, and compiles to 96k on an
x86_64 host.  This overflows the 16-bit offset in which we record the
goto_tb reset offset.  Because of that overflow, we install a jump
destination that goes to neverland.  Boom.

With this reduced op count, the same TB compiles to about 48k for
aarch64, ppc64le, and x86_64 hosts, and neither assertion fires.

Cc: qemu-stable@nongnu.org
Reported-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 09:39:53 -10:00
Emilio G. Cota 0ac20318ce tcg: remove tb_lock
Use mmap_lock in user-mode to protect TCG state and the page descriptors.
In !user-mode, each vCPU has its own TCG state, so no locks needed.
Per-page locks are used to protect the page descriptors.

Per-TB locks are used in both modes to protect TB jumps.

Some notes:

- tb_lock is removed from notdirty_mem_write by passing a
  locked page_collection to tb_invalidate_phys_page_fast.

- tcg_tb_lookup/remove/insert/etc have their own internal lock(s),
  so there is no need to further serialize access to them.

- do_tb_flush is run in a safe async context, meaning no other
  vCPU threads are running. Therefore acquiring mmap_lock there
  is just to please tools such as thread sanitizer.

- Not visible in the diff, but tb_invalidate_phys_page already
  has an assert_memory_lock.

- cpu_io_recompile is !user-only, so no mmap_lock there.

- Added mmap_unlock()'s before all siglongjmp's that could
  be called in user-mode while mmap_lock is held.
  + Added an assert for !have_mmap_lock() after returning from
    the longjmp in cpu_exec, just like we do in cpu_exec_step_atomic.

Performance numbers before/after:

Host: AMD Opteron(tm) Processor 6376

                 ubuntu 17.04 ppc64 bootup+shutdown time

  700 +-+--+----+------+------------+-----------+------------*--+-+
      |    +    +      +            +           +           *B    |
      |         before ***B***                            ** *    |
      |tb lock removal ###D###                         ***        |
  600 +-+                                           ***         +-+
      |                                           **         #    |
      |                                        *B*          #D    |
      |                                     *** *         ##      |
  500 +-+                                ***           ###      +-+
      |                             * ***           ###           |
      |                            *B*          # ##              |
      |                          ** *          #D#                |
  400 +-+                      **            ##                 +-+
      |                      **           ###                     |
      |                    **           ##                        |
      |                  **         # ##                          |
  300 +-+  *           B*          #D#                          +-+
      |    B         ***        ###                               |
      |    *       **       ####                                  |
      |     *   ***      ###                                      |
  200 +-+   B  *B     #D#                                       +-+
      |     #B* *   ## #                                          |
      |     #*    ##                                              |
      |    + D##D#     +            +           +            +    |
  100 +-+--+----+------+------------+-----------+------------+--+-+
           1    8      16      Guest CPUs       48           64
  png: https://imgur.com/HwmBHXe

              debian jessie aarch64 bootup+shutdown time

  90 +-+--+-----+-----+------------+------------+------------+--+-+
     |    +     +     +            +            +            +    |
     |         before ***B***                                B    |
  80 +tb lock removal ###D###                              **D  +-+
     |                                                   **###    |
     |                                                 **##       |
  70 +-+                                             ** #       +-+
     |                                             ** ##          |
     |                                           **  #            |
  60 +-+                                       *B  ##           +-+
     |                                       **  ##               |
     |                                    ***  #D                 |
  50 +-+                               ***   ##                 +-+
     |                             * **   ###                     |
     |                           **B*  ###                        |
  40 +-+                     ****  # ##                         +-+
     |                   ****     #D#                             |
     |             ***B**      ###                                |
  30 +-+    B***B**        ####                                 +-+
     |    B *   *     # ###                                       |
     |     B       ###D#                                          |
  20 +-+   D  ##D##                                             +-+
     |      D#                                                    |
     |    +     +     +            +            +            +    |
  10 +-+--+-----+-----+------------+------------+------------+--+-+
          1     8     16      Guest CPUs        48           64
  png: https://imgur.com/iGpGFtv

The gains are high for 4-8 CPUs. Beyond that point, however, unrelated
lock contention significantly hurts scalability.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 08:18:48 -10:00
Emilio G. Cota 705ad1ff0c translate-all: remove tb_lock mention from cpu_restore_state_from_tb
tb_lock was needed when the function did retranslation. However,
since fca8a500d5 ("tcg: Save insn data and use it in
cpu_restore_state_from_tb") we don't do retranslation.

Get rid of the comment.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 08:18:48 -10:00
Emilio G. Cota b7542f7fe8 cputlb: remove tb_lock from tlb_flush functions
The acquisition of tb_lock was added when the async tlb_flush
was introduced in e3b9ca810 ("cputlb: introduce tlb_flush_* async work.")

tb_lock was there to allow us to do memset() on the tb_jmp_cache's.
However, since f3ced3c592 ("tcg: consistently access cpu->tb_jmp_cache
atomically") all accesses to tb_jmp_cache are atomic, so tb_lock
is not needed here. Get rid of it.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 08:18:48 -10:00
Emilio G. Cota 194125e3eb translate-all: protect TB jumps with a per-destination-TB lock
This applies to both user-mode and !user-mode emulation.

Instead of relying on a global lock, protect the list of incoming
jumps with tb->jmp_lock. This lock also protects tb->cflags,
so update all tb->cflags readers outside tb->jmp_lock to use
atomic reads via tb_cflags().

In order to find the destination TB (and therefore its jmp_lock)
from the origin TB, we introduce tb->jmp_dest[].

I considered not using a linked list of jumps, which simplifies
code and makes the struct smaller. However, it unnecessarily increases
memory usage, which results in a performance decrease. See for
instance these numbers booting+shutting down debian-arm:
                      Time (s)  Rel. err (%)  Abs. err (s)  Rel. slowdown (%)
------------------------------------------------------------------------------
 before                  20.88          0.74      0.154512                 0.
 after                   20.81          0.38      0.079078        -0.33524904
 GTree                   21.02          0.28      0.058856         0.67049808
 GHashTable + xxhash     21.63          1.08      0.233604          3.5919540

Using a hash table or a binary tree to keep track of the jumps
doesn't really pay off, not only due to the increased memory usage,
but also because most TBs have only 0 or 1 jumps to them. The maximum
number of jumps when booting debian-arm that I measured is 35, but
as we can see in the histogram below a TB with that many incoming jumps
is extremely rare; the average TB has 0.80 incoming jumps.

n_jumps: 379208; avg jumps/tb: 0.801099
dist: [0.0,1.0)|▄█▁▁▁▁▁▁▁▁▁▁▁ ▁▁▁▁▁▁ ▁▁▁  ▁▁▁     ▁|[34.0,35.0]

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 08:18:48 -10:00
Emilio G. Cota 95590e24af translate-all: discard TB when tb_link_page returns an existing matching TB
Use the recently-gained QHT feature of returning the matching TB if it
already exists. This allows us to get rid of the lookup we perform
right after acquiring tb_lock.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 08:18:42 -10:00
Emilio G. Cota faa9372c07 translate-all: introduce assert_no_pages_locked
The appended adds assertions to make sure we do not longjmp with page
locks held. Note that user-mode has nothing to check, since page_locks
are !user-mode only.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota 6d9abf85d5 translate-all: add page_locked assertions
This is only compiled under CONFIG_DEBUG_TCG to avoid
bloating the binary.

In user-mode, assert_page_locked is equivalent to assert_mmap_lock.

Note: There are some tb_lock assertions left that will be
removed by later patches.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota 0b5c91f74f translate-all: use per-page locking in !user-mode
Groundwork for supporting parallel TCG generation.

Instead of using a global lock (tb_lock) to protect changes
to pages, use fine-grained, per-page locks in !user-mode.
User-mode stays with mmap_lock.

Sometimes changes need to happen atomically on more than one
page (e.g. when a TB that spans across two pages is
added/invalidated, or when a range of pages is invalidated).
We therefore introduce struct page_collection, which helps
us keep track of a set of pages that have been locked in
the appropriate locking order (i.e. by ascending page index).

This commit first introduces the structs and the function helpers,
to then convert the calling code to use per-page locking. Note
that tb_lock is not removed yet.

While at it, rename tb_alloc_page to tb_page_add, which pairs with
tb_page_remove.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota 45c73de594 translate-all: move tb_invalidate_phys_page_range up in the file
This greatly simplifies next commit's diff.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota ae5486e273 translate-all: work page-by-page in tb_invalidate_phys_range_1
So that we pass a same-page range to tb_invalidate_phys_page_range,
instead of always passing an end address that could be on a different
page.

As discussed with Peter Maydell on the list [1], tb_invalidate_phys_page_range
doesn't actually do much with 'end', which explains why we have never
hit a bug despite going against what the comment on top of
tb_invalidate_phys_page_range requires:

> * Invalidate all TBs which intersect with the target physical address range
> * [start;end[. NOTE: start and end must refer to the *same* physical page.

The appended honours the comment, which avoids confusion.

While at it, rework the loop into a for loop, which is less error prone
(e.g. "continue" won't result in an infinite loop).

[1] https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg09165.html

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota 94da9aec2a translate-all: remove hole in PageDesc
Groundwork for supporting parallel TCG generation.

Move the hole to the end of the struct, so that a u32
field can be added there without bloating the struct.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota 78722ed0b8 translate-all: make l1_map lockless
Groundwork for supporting parallel TCG generation.

We never remove entries from the radix tree, so we can use cmpxchg
to implement lockless insertions.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota 1e05197f24 translate-all: iterate over TBs in a page with PAGE_FOR_EACH_TB
This commit does several things, but to avoid churn I merged them all
into the same commit. To wit:

- Use uintptr_t instead of TranslationBlock * for the list of TBs in a page.
  Just like we did in (c37e6d7e "tcg: Use uintptr_t type for
  jmp_list_{next|first} fields of TB"), the rationale is the same: these
  are tagged pointers, not pointers. So use a more appropriate type.

- Only check the least significant bit of the tagged pointers. Masking
  with 3/~3 is unnecessary and confusing.

- Introduce the TB_FOR_EACH_TAGGED macro, and use it to define
  PAGE_FOR_EACH_TB, which improves readability. Note that
  TB_FOR_EACH_TAGGED will gain another user in a subsequent patch.

- Update tb_page_remove to use PAGE_FOR_EACH_TB. In case there
  is a bug and we attempt to remove a TB that is not in the list, instead
  of segfaulting (since the list is NULL-terminated) we will reach
  g_assert_not_reached().

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota 128ed2278c tcg: move tb_ctx.tb_phys_invalidate_count to tcg_ctx
Thereby making it per-TCGContext. Once we remove tb_lock, this will
avoid an atomic increment every time a TB is invalidated.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota be2cdc5e35 tcg: track TBs with per-region BST's
This paves the way for enabling scalable parallel generation of TCG code.

Instead of tracking TBs with a single binary search tree (BST), use a
BST for each TCG region, protecting it with a lock. This is as scalable
as it gets, since each TCG thread operates on a separate region.

The core of this change is the introduction of struct tcg_region_tree,
which contains a pointer to a GTree and an associated lock to serialize
accesses to it. We then allocate an array of tcg_region_tree's, adding
the appropriate padding to avoid false sharing based on
qemu_dcache_linesize.

Given a tc_ptr, we first find the corresponding region_tree. This
is done by special-casing the first and last regions first, since they
might be of size != region.size; otherwise we just divide the offset
by region.stride. I was worried about this division (several dozen
cycles of latency), but profiling shows that this is not a fast path.
Note that region.stride is not required to be a power of two; it
is only required to be a multiple of the host's page size.

Note that with this design we can also provide consistent snapshots
about all region trees at once; for instance, tcg_tb_foreach
acquires/releases all region_tree locks before/after iterating over them.
For this reason we now drop tb_lock in dump_exec_info().

As an alternative I considered implementing a concurrent BST, but this
can be tricky to get right, offers no consistent snapshots of the BST,
and performance and scalability-wise I don't think it could ever beat
having separate GTrees, given that our workload is insert-mostly (all
concurrent BST designs I've seen focus, understandably, on making
lookups fast, which comes at the expense of convoluted, non-wait-free
insertions/removals).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota 32359d529f qht: return existing entry when qht_insert fails
The meaning of "existing" is now changed to "matches in hash and
ht->cmp result". This is saner than just checking the pointer value.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by:  Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota 61b8cef1d4 qht: require a default comparison function
qht_lookup now uses the default cmp function. qht_lookup_custom is defined
to retain the old behaviour, that is a cmp function is explicitly provided.

qht_insert will gain use of the default cmp in the next patch.

Note that we move qht_lookup_custom's @func to be the last argument,
which makes the new qht_lookup as simple as possible.
Instead of this (i.e. keeping @func 2nd):
0000000000010750 <qht_lookup>:
   10750:       89 d1                   mov    %edx,%ecx
   10752:       48 89 f2                mov    %rsi,%rdx
   10755:       48 8b 77 08             mov    0x8(%rdi),%rsi
   10759:       e9 22 ff ff ff          jmpq   10680 <qht_lookup_custom>
   1075e:       66 90                   xchg   %ax,%ax

We get:
0000000000010740 <qht_lookup>:
   10740:       48 8b 4f 08             mov    0x8(%rdi),%rcx
   10744:       e9 37 ff ff ff          jmpq   10680 <qht_lookup_custom>
   10749:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
John Arbuckle 1019242af1 tcg/i386: Use byte form of xgetbv instruction
The assembler in most versions of Mac OS X is pretty old and does not
support the xgetbv instruction.  To go around this problem, the raw
encoding of the instruction is used instead.

Signed-off-by: John Arbuckle <programmingkidx@gmail.com>
Message-Id: <20180604215102.11002-1-programmingkidx@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Peter Maydell 42747d6abb xilinx-next-2018-06-15.for-upstream
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJbI84PAAoJECnFlngPa8qD3cAIAJomdBvSh8SQP9hCHyjVn0Ks
 vHvcpuB7PvtezXvj8ibaLqbHQtJAL9Wxghsfj8EtQSbbXMw8Ndbb9bcIpBY87W6C
 YtgcJ9sZOw6ya16AmEbti5Qm9GSrzhMRyYhRORqrViOmxJQrbXRboXx73QGog/10
 S6LPBfsQT3LI4V6IOvqKsjPnTVTzHsv3l+IYEGA74W05OxvTehYpIek0g/D0o+Ul
 TgsbQJWbUDaTwyDsn6GXd6buoD0QBJa6aLLuDIbes0dJq7EIi4fTutXvErTHC+wg
 Mzvec6Xg8dSHuoHsQlVzk8MFHnHOL6Dlty1hP4ydhri93GBDCJGND/YZQpZqELw=
 =oKzf
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/edgar/tags/edgar/xilinx-next-2018-06-15.for-upstream' into staging

xilinx-next-2018-06-15.for-upstream

# gpg: Signature made Fri 15 Jun 2018 15:32:47 BST
# gpg:                using RSA key 29C596780F6BCA83
# gpg: Good signature from "Edgar E. Iglesias (Xilinx key) <edgar.iglesias@xilinx.com>"
# gpg:                 aka "Edgar E. Iglesias <edgar.iglesias@gmail.com>"
# Primary key fingerprint: AC44 FEDC 14F7 F1EB EDBF  4151 29C5 9678 0F6B CA83

* remotes/edgar/tags/edgar/xilinx-next-2018-06-15.for-upstream:
  target-microblaze: Rework NOP/zero instruction handling
  target-microblaze: mmu: Correct masking of output addresses

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 17:28:37 +01:00
Peter Maydell 4359255ad3 Block layer patches:
- Fix options that work only with -drive or -blockdev, but not with
   both, because of QDict type confusion
 - rbd: Add options 'auth-client-required' and 'key-secret'
 - Remove deprecated -drive options serial/addr/cyls/heads/secs/trans
 - rbd, iscsi: Remove deprecated 'filename' option
 - Fix 'qemu-img map' crash with unaligned image size
 - Improve QMP documentation for jobs
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbI8sTAAoJEH8JsnLIjy/WeIsP/Ak8ealnjx8GRUfWwi69DX5z
 LPzGXjlFJ1DqVb1SCuc7mptlgQwvHSoeYk6i+2xzJy9WmaAJsT2Mxz+EkviGzlcU
 Xna6t/a9dgMxvHQMKbN1j4Rm3tDJzm2x6ZnBLYlwFhecmkEz4i3TEsceFL7BrN4F
 C21jm+yC2wKPyggKXRRLRP4a1UfuM7OH8OEE3aWCZwEz1+6ucZGs3xsmWJfXxlCK
 adigVec3gNWgQK+gWFlYKXRJKalK50BnZ7+2LlQIcC1kLAx22Wr2t2zmlROa351w
 JEBIQlLkEN8hpB44Xk3Wla3YmwumW8EANIdxkeAXDgYgn8KEpN1nwWyxGUsfoOqz
 WhZ+pT0yerg/kSovKgTyxsF9YoeptJEUmmR0nZX4+xpnctbU6EHuspymMb1Xl95h
 /7lirJbBVb8o9rzcCNOzTc0f9lot9pDglLjxcWq3RqvpJ2iQmxxY2UUidYl9BmRg
 SpkkA9E613bZ9zuK+17FDbnGq/N9h77HSBSVbLgEoajFLXYUtkuc0egi/ZVLWo0W
 wo/XvXTZqy7n9/M/j4yVS+Xv7WotNuyHajk6N73KdL0NodXE776CPxvMlUEHEHB9
 Pizo/fuxmgatjDFf1WU3Qt5pZlU1Pao4Aakcq3z/Ag7ucMK03QQIMBcrJC24+DD2
 pjT5HPIeS3vklkg9Jqdi
 =LxgF
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches:

- Fix options that work only with -drive or -blockdev, but not with
  both, because of QDict type confusion
- rbd: Add options 'auth-client-required' and 'key-secret'
- Remove deprecated -drive options serial/addr/cyls/heads/secs/trans
- rbd, iscsi: Remove deprecated 'filename' option
- Fix 'qemu-img map' crash with unaligned image size
- Improve QMP documentation for jobs

# gpg: Signature made Fri 15 Jun 2018 15:20:03 BST
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (26 commits)
  block: Remove dead deprecation warning code
  block: Remove deprecated -drive option serial
  block: Remove deprecated -drive option addr
  block: Remove deprecated -drive geometry options
  rbd: New parameter key-secret
  rbd: New parameter auth-client-required
  block: Fix -blockdev / blockdev-add for empty objects and arrays
  check-block-qdict: Cover flattening of empty lists and dictionaries
  check-block-qdict: Rename qdict_flatten()'s variables for clarity
  block-qdict: Simplify qdict_is_list() some
  block-qdict: Clean up qdict_crumple() a bit
  block-qdict: Tweak qdict_flatten_qdict(), qdict_flatten_qlist()
  block-qdict: Simplify qdict_flatten_qdict()
  block: Make remaining uses of qobject input visitor more robust
  block: Factor out qobject_input_visitor_new_flat_confused()
  block: Clean up a misuse of qobject_to() in .bdrv_co_create_opts()
  block: Fix -drive for certain non-string scalars
  block: Fix -blockdev for certain non-string scalars
  qobject: Move block-specific qdict code to block-qdict.c
  block: Add block-specific QDict header
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 16:30:27 +01:00
Peter Maydell 81d3864796 target-arm and miscellaneous queue:
* fix KVM state save/restore for GICv3 priority registers for high IRQ numbers
  * hw/arm/mps2-tz: Put ethernet controller behind PPC
  * hw/sh/sh7750: Convert away from old_mmio
  * hw/m68k/mcf5206: Convert away from old_mmio
  * hw/block/pflash_cfi02: Convert away from old_mmio
  * hw/watchdog/wdt_i6300esb: Convert away from old_mmio
  * hw/input/pckbd: Convert away from old_mmio
  * hw/char/parallel: Convert away from old_mmio
  * armv7m: refactor to get rid of armv7m_init() function
  * arm: Don't crash if user tries to use a Cortex-M CPU without an NVIC
  * hw/core/or-irq: Support more than 16 inputs to an OR gate
  * cpu-defs.h: Document CPUIOTLBEntry 'addr' field
  * cputlb: Pass cpu_transaction_failed() the correct physaddr
  * CODING_STYLE: Define our preferred form for multiline comments
  * Add and use new stn_*_p() and ldn_*_p() memory access functions
  * target/arm: More parts of the upcoming SVE support
  * aspeed_scu: Implement RNG register
  * m25p80: add support for two bytes WRSR for Macronix chips
  * exec.c: Handle IOMMUs being in the path of TCG CPU memory accesses
  * target/arm: Allow ARMv6-M Thumb2 instructions
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJbI8wDAAoJEDwlJe0UNgze8acP/i3q3aGcb4b+FwWogktqVC+h
 +eY+T0dQC0ohGjWMvR8jk4h7Gh75zpvH5CO0Oh7u8JqOBTEvPllvCatcqYJ36Vik
 +spqdNr94yW7M5GKMl1+vJdQNv5SejuUg5EdwVdk2bJlObuVIfQDCAAi0DaW443B
 iJaUrIxVP06DeVjPvSG+WVgP+N3BualHURXBdH6h23gZeftWZ3iDMPELPYuLbUOF
 LF1N/w/rtvx0qOzr7VBdsSEdEyFFli5B5Pv4+xiCfgKTBu9cm+ima0NXPPv4Wcaw
 322LQvvQBDEVT3AWP6oPLfXmpNjmCe73G73ncInWiN8pESe8O6T0aJk6vAk6WP5A
 UDGtHGw4H6jCDFn28W/Sl/vHVV6qdd8PFDAlomEBNhy2j4jqMW4xPPu2mMprLV1L
 6aYsENCxgg7kIBmJlMiO710MdLvIBcVL1eWILTG82fvKmb4qwbJ1r4PInGBOxM6s
 LGB14JSvgfsKqlxZPjD6oNXOyxbTdeU72YIBMCUhBD5/8B1zFk8MKofhVy2Z50QA
 yxwjBlJL/g1xLMM0WdED9F7SeZKCjaJK+VZF+5qwvvmA0VL9QPsxXq+djTlF/mkZ
 agu/jS7L7dFdSeionRRKTdc0jXfimoSE9QFj7tpxdh3VehSGRBBp7vTYQAp+Bc9s
 5ghD5oektvRQslM/qLWu
 =McgN
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20180615' into staging

target-arm and miscellaneous queue:
 * fix KVM state save/restore for GICv3 priority registers for high IRQ numbers
 * hw/arm/mps2-tz: Put ethernet controller behind PPC
 * hw/sh/sh7750: Convert away from old_mmio
 * hw/m68k/mcf5206: Convert away from old_mmio
 * hw/block/pflash_cfi02: Convert away from old_mmio
 * hw/watchdog/wdt_i6300esb: Convert away from old_mmio
 * hw/input/pckbd: Convert away from old_mmio
 * hw/char/parallel: Convert away from old_mmio
 * armv7m: refactor to get rid of armv7m_init() function
 * arm: Don't crash if user tries to use a Cortex-M CPU without an NVIC
 * hw/core/or-irq: Support more than 16 inputs to an OR gate
 * cpu-defs.h: Document CPUIOTLBEntry 'addr' field
 * cputlb: Pass cpu_transaction_failed() the correct physaddr
 * CODING_STYLE: Define our preferred form for multiline comments
 * Add and use new stn_*_p() and ldn_*_p() memory access functions
 * target/arm: More parts of the upcoming SVE support
 * aspeed_scu: Implement RNG register
 * m25p80: add support for two bytes WRSR for Macronix chips
 * exec.c: Handle IOMMUs being in the path of TCG CPU memory accesses
 * target/arm: Allow ARMv6-M Thumb2 instructions

# gpg: Signature made Fri 15 Jun 2018 15:24:03 BST
# gpg:                using RSA key 3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20180615: (43 commits)
  target/arm: Allow ARMv6-M Thumb2 instructions
  exec.c: Handle IOMMUs in address_space_translate_for_iotlb()
  iommu: Add IOMMU index argument to translate method
  iommu: Add IOMMU index argument to notifier APIs
  iommu: Add IOMMU index concept to IOMMU API
  m25p80: add support for two bytes WRSR for Macronix chips
  aspeed_scu: Implement RNG register
  target/arm: Implement SVE Floating Point Arithmetic - Unpredicated Group
  target/arm: Implement SVE Integer Wide Immediate - Unpredicated Group
  target/arm: Implement FDUP/DUP
  target/arm: Implement SVE Integer Compare - Scalars Group
  target/arm: Implement SVE Predicate Count Group
  target/arm: Implement SVE Partition Break Group
  target/arm: Implement SVE Integer Compare - Immediate Group
  target/arm: Implement SVE Integer Compare - Vectors Group
  target/arm: Implement SVE Select Vectors Group
  target/arm: Implement SVE vector splice (predicated)
  target/arm: Implement SVE reverse within elements
  target/arm: Implement SVE copy to vector (predicated)
  target/arm: Implement SVE conditionally broadcast/extract element
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:27:48 +01:00
Julia Suvorova 14120108f8 target/arm: Allow ARMv6-M Thumb2 instructions
ARMv6-M supports 6 Thumb2 instructions. This patch checks for these
instructions and allows their execution.
Like Thumb2 cores, ARMv6-M always interprets BL instruction as 32-bit.

This patch is required for future Cortex-M0 support.

Signed-off-by: Julia Suvorova <jusual@mail.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20180612204632.28780-1-jusual@mail.ru
[PMM: move armv6m_insn[] and armv6m_mask[] closer to
 point of use, and mark 'const'. Check for M-and-not-v7
 rather than M-and-6.]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Peter Maydell 1f871c5e6b exec.c: Handle IOMMUs in address_space_translate_for_iotlb()
Currently we don't support board configurations that put an IOMMU
in the path of the CPU's memory transactions, and instead just
assert() if the memory region fonud in address_space_translate_for_iotlb()
is an IOMMUMemoryRegion.

Remove this limitation by having the function handle IOMMUs.
This is mostly straightforward, but we must make sure we have
a notifier registered for every IOMMU that a transaction has
passed through, so that we can flush the TLB appropriately
when any of the IOMMUs change their mappings.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-5-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Peter Maydell 2c91bcf273 iommu: Add IOMMU index argument to translate method
Add an IOMMU index argument to the translate method of
IOMMUs. Since all of our current IOMMU implementations
support only a single IOMMU index, this has no effect
on the behaviour.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-4-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Peter Maydell cb1efcf462 iommu: Add IOMMU index argument to notifier APIs
Add support for multiple IOMMU indexes to the IOMMU notifier APIs.
When initializing a notifier with iommu_notifier_init(), the caller
must pass the IOMMU index that it is interested in. When a change
happens, the IOMMU implementation must pass
memory_region_notify_iommu() the IOMMU index that has changed and
that notifiers must be called for.

IOMMUs which support only a single index don't need to change.
Callers which only really support working with IOMMUs with a single
index can use the result of passing MEMTXATTRS_UNSPECIFIED to
memory_region_iommu_attrs_to_index().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-3-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Peter Maydell 21f402093c iommu: Add IOMMU index concept to IOMMU API
If an IOMMU supports mappings that care about the memory
transaction attributes, then it no longer has a unique
address -> output mapping, but more than one. We can
represent these using an IOMMU index, analogous to TCG's
mmu indexes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-2-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Cédric Le Goater 2151b044fd m25p80: add support for two bytes WRSR for Macronix chips
On Macronix chips, two bytes can written to the WRSR. First byte will
configure the status register and the second the configuration
register. It is important to save the configuration value as it
contains the dummy cycle setting when using dual or quad IO mode.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Joel Stanley acd9575e59 aspeed_scu: Implement RNG register
The ASPEED SoCs contain a single register that returns random data when
read. This models that register so that guests can use it.

The random number data register has a corresponding control register,
however it returns data regardless of the state of the enabled bit, so
the model follows this behaviour.

When the qcrypto call fails we exit as the guest uses the random number
device to feed it's entropy pool, which is used for cryptographic
purposes.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Message-id: 20180613114836.9265-1-joel@jms.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson 29b80469dc target/arm: Implement SVE Floating Point Arithmetic - Unpredicated Group
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-19-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson 6e6a157d68 target/arm: Implement SVE Integer Wide Immediate - Unpredicated Group
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-18-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson ed49196125 target/arm: Implement FDUP/DUP
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson caf1cefc72 target/arm: Implement SVE Integer Compare - Scalars Group
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-16-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson 9ee3a611de target/arm: Implement SVE Predicate Count Group
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-15-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson 35da316f5e target/arm: Implement SVE Partition Break Group
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson 38cadeba0d target/arm: Implement SVE Integer Compare - Immediate Group
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson 757f9cff1b target/arm: Implement SVE Integer Compare - Vectors Group
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson d3fe4a29d7 target/arm: Implement SVE Select Vectors Group
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson b48ff24098 target/arm: Implement SVE vector splice (predicated)
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson dae8fb9019 target/arm: Implement SVE reverse within elements
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson 792a557847 target/arm: Implement SVE copy to vector (predicated)
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson ef23cb726d target/arm: Implement SVE conditionally broadcast/extract element
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson 3ca879aeb3 target/arm: Implement SVE compress active elements
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson 234b48e9c6 target/arm: Implement SVE Permute - Interleaving Group
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson d731d8cb3c target/arm: Implement SVE Permute - Predicates Group
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson 30562ab716 target/arm: Implement SVE Permute - Unpredicated Group
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Richard Henderson 66f2dbd783 target/arm: Extend vec_reg_offset to larger sizes
Rearrange the arithmetic so that we are agnostic about the total size
of the vector and the size of the element.  This will allow us to index
up to the 32nd byte and with 16-byte elements.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 15:23:34 +01:00
Peter Maydell 6d3ede5410 exec.c: Use stn_p() and ldn_p() instead of explicit switches
Now we have stn_p() and ldn_p() we can use them in various
functions in exec.c that used to have their own switch-on-size code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611171007.4165-4-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Peter Maydell 22672c6075 exec.c: Don't accidentally sign-extend 4-byte loads in subpage_read()
In subpage_read() we perform a load of the data into a local buffer
which we then access using ldub_p(), lduw_p(), ldl_p() or ldq_p()
depending on its size, storing the result into the uint64_t *data.
Since ldl_p() returns an 'int', this means that for the 4-byte
case we will sign-extend the data, whereas for 1 and 2 byte
reads we zero-extend it.

This ought not to matter since the caller will likely ignore values in
the high bytes of the data, but add a cast so that we're consistent.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611171007.4165-3-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Peter Maydell afa4f6653d bswap: Add new stn_*_p() and ldn_*_p() memory access functions
There's a common pattern in QEMU where a function needs to perform
a data load or store of an N byte integer in a particular endianness.
At the moment this is handled by doing a switch() on the size and
calling the appropriate ld*_p or st*_p function for each size.

Provide a new family of functions ldn_*_p() and stn_*_p() which
take the size as an argument and do the switch() themselves.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611171007.4165-2-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00