Commit Graph

1047 Commits

Author SHA1 Message Date
Dr. David Alan Gilbert 801110ab22 find_ram_offset: Align ram_addr_t allocation on long boundaries
The dirty bitmaps are built from 'long's and there is fast-path code
for synchronising the case where the RAMBlock is aligned to the start
of a long boundary.  Align the allocation to this boundary
to cause the fast path to be used.

Offsets before change:
11398@1515169675.018566:find_ram_offset size: 0x1e0000 @ 0x8000000
11398@1515169675.020064:find_ram_offset size: 0x20000 @ 0x81e0000
11398@1515169675.020244:find_ram_offset size: 0x20000 @ 0x8200000
11398@1515169675.024343:find_ram_offset size: 0x1000000 @ 0x8220000
11398@1515169675.025154:find_ram_offset size: 0x10000 @ 0x9220000
11398@1515169675.027682:find_ram_offset size: 0x40000 @ 0x9230000
11398@1515169675.032921:find_ram_offset size: 0x200000 @ 0x9270000
11398@1515169675.033307:find_ram_offset size: 0x1000 @ 0x9470000
11398@1515169675.033601:find_ram_offset size: 0x1000 @ 0x9471000

after change:
10923@1515169108.818245:find_ram_offset size: 0x1e0000 @ 0x8000000
10923@1515169108.819410:find_ram_offset size: 0x20000 @ 0x8200000
10923@1515169108.819587:find_ram_offset size: 0x20000 @ 0x8240000
10923@1515169108.823708:find_ram_offset size: 0x1000000 @ 0x8280000
10923@1515169108.824503:find_ram_offset size: 0x10000 @ 0x9280000
10923@1515169108.827093:find_ram_offset size: 0x40000 @ 0x92c0000
10923@1515169108.833045:find_ram_offset size: 0x200000 @ 0x9300000
10923@1515169108.833504:find_ram_offset size: 0x1000 @ 0x9500000
10923@1515169108.833787:find_ram_offset size: 0x1000 @ 0x9540000
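
The "after change" offsets are all multiples of 0x40000, which is what you get from one bitmap word of pages if 'long' is 64 bits and target pages are 4 KiB. The following is a minimal sketch under those two assumptions, not the actual find_ram_offset() code (which searches for gaps between existing blocks); every sketch_ name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for this sketch: 64-bit bitmap words and 4 KiB target pages. */
#define SKETCH_BITS_PER_LONG 64u
#define SKETCH_PAGE_BITS     12u

/* One bitmap word tracks BITS_PER_LONG pages, so blocks start on that boundary. */
static uint64_t sketch_block_alignment(void)
{
    return (uint64_t)SKETCH_BITS_PER_LONG << SKETCH_PAGE_BITS;
}

/* Round 'offset' up to the next multiple of 'align' (must be a power of two). */
static uint64_t sketch_align_up(uint64_t offset, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/*
 * Place a block of 'size' bytes at or after 'candidate'.  Returns the aligned
 * start and stores the first free offset after the block in '*next'.
 */
static uint64_t sketch_place_block(uint64_t candidate, uint64_t size,
                                   uint64_t *next)
{
    uint64_t start = sketch_align_up(candidate, sketch_block_alignment());

    *next = start + size;
    return start;
}
```

Feeding the first four sizes from the trace through sketch_place_block(), starting at 0x8000000, gives the same start offsets as the "after change" listing.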

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180105170138.23357-3-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-01-16 14:54:52 +01:00
Dr. David Alan Gilbert 154cc9ea3b find_ram_offset: Add comments and tracing
Add some comments so I can understand the various nested loops.
Add some tracing so I can see what they're doing.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180105170138.23357-2-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-01-16 14:54:52 +01:00
Peter Maydell 8af36743c2 exec: Don't reuse unassigned_mem_ops for io_mem_rom
We set up the io_mem_rom special memory region using the
unassigned_mem_ops structure; this is then used when a guest tries to
write to ROM.  This is incorrect, because the behaviour of unassigned
memory may be different from that of ROM for writes.  In particular,
on some architectures writing to unassigned memory generates a guest
exception, whereas writing to ROM is generally ignored.  Use a
special readonly_mem_ops for this purpose instead, so writes to
ROM are ignored for all guest CPUs.
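
The difference between the two behaviours can be sketched as two separate ops tables. This is an illustration only: the types and names below are hypothetical and much simpler than QEMU's MemoryRegionOps.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: either the access is fine or the guest faults. */
typedef enum { SKETCH_OK, SKETCH_FAULT } SketchResult;

/* Hypothetical, stripped-down ops table with only a write callback. */
typedef struct SketchMemOps {
    SketchResult (*write)(uint64_t addr, uint64_t val, unsigned size);
} SketchMemOps;

/* Unassigned memory: on some architectures a write raises a guest exception. */
static SketchResult sketch_unassigned_write(uint64_t addr, uint64_t val,
                                            unsigned size)
{
    (void)addr; (void)val; (void)size;
    return SKETCH_FAULT;
}

/* ROM: the write is silently dropped and the guest carries on. */
static SketchResult sketch_readonly_write(uint64_t addr, uint64_t val,
                                          unsigned size)
{
    (void)addr; (void)val; (void)size;
    return SKETCH_OK;
}

static const SketchMemOps sketch_unassigned_ops = {
    .write = sketch_unassigned_write,
};

/* A dedicated table for ROM keeps its behaviour independent of the above. */
static const SketchMemOps sketch_readonly_ops = {
    .write = sketch_readonly_write,
};
```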

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1513187549-2435-2-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-21 09:30:32 +01:00
Peter Xu 87a621d857 cpu: suffix cpu address spaces with cpu index
Rename the cpu address spaces so that their names won't be the same
when there is more than one.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20171123092333.16085-4-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-21 09:30:31 +01:00
Peter Xu 80ceb07a83 cpu: refactor cpu_address_space_init()
Normally we create an address space for that CPU and pass that address
space into the function.  Let's just do it inside to unify address space
creations.  It'll simplify my next patch to rename those address spaces.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20171123092333.16085-3-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-21 09:30:31 +01:00
Philippe Mathieu-Daudé ff676046fb misc: remove duplicated includes
exec: housekeeping (funny since 02d0e09503)

applied using ./scripts/clean-includes

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-12-18 17:07:02 +03:00
Peter Maydell 2726627197 exec.c: Factor out before/after actions for notdirty memory writes
The function notdirty_mem_write() has a sequence of actions
it has to do before and after the actual business of writing
data to host RAM to ensure that dirty flags are correctly
updated and we flush any TCG translations for the region.
We need to do this also in other places that write directly
to host RAM, most notably the TCG atomic helper functions.
Pull out the before and after pieces into their own functions.

We use an API where the prepare function stashes the various
bits of information about the write into a struct for the
complete function to use, because in the calls for the atomic
helpers the place where the complete function will be called
doesn't have the information to hand.
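
A rough sketch of the prepare/complete shape described above, with the write details stashed in a struct between the two calls. All names are hypothetical and the dirty tracking is reduced to a counter; this is not the code added by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct carrying what 'complete' needs to know about the write. */
typedef struct SketchWriteInfo {
    uint64_t addr;
    unsigned size;
    bool active;
} SketchWriteInfo;

/* Stand-in for the dirty bitmap: counts writes that were marked dirty. */
static unsigned sketch_dirty_marks;

/* Before the write: record its details (real code would also flush TBs). */
static void sketch_write_prepare(SketchWriteInfo *info, uint64_t addr,
                                 unsigned size)
{
    info->addr = addr;
    info->size = size;
    info->active = true;
}

/* After the write: update dirty state using only what was stashed. */
static void sketch_write_complete(SketchWriteInfo *info)
{
    assert(info->active);
    sketch_dirty_marks++;
    info->active = false;
}

/* A caller that writes host RAM directly brackets the store with both calls. */
static void sketch_store32(uint8_t *host_ram, uint64_t addr, uint32_t val)
{
    SketchWriteInfo info;

    sketch_write_prepare(&info, addr, sizeof(val));
    memcpy(host_ram + addr, &val, sizeof(val));
    sketch_write_complete(&info);
}
```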

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1511201308-23580-2-git-send-email-peter.maydell@linaro.org
2017-11-21 12:09:25 +00:00
Peter Maydell 62955e101e Miscellaneous bugfixes
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

Miscellaneous bugfixes

# gpg: Signature made Wed 15 Nov 2017 15:27:25 GMT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  fix scripts/update-linux-headers.sh here document
  exec: Do not resolve subpage in mru_section
  util/stats64: Fix min/max comparisons
  cpu-exec: avoid cpu_exec_nocache infinite loop with record/replay
  cpu-exec: don't overwrite exception_index
  vhost-user-scsi: add missing virtqueue_size param
  target-i386: adds PV_TLB_FLUSH CPUID feature bit
  thread-posix: fix qemu_rec_mutex_trylock macro
  Makefile: simpler/faster "make help"
  ioapic/tracing: Remove last DPRINTFs
  Enable 8-byte wide MMIO for 16550 serial devices

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-11-16 14:42:54 +00:00
Paolo Bonzini 07c114bbf3 exec: Do not resolve subpage in mru_section
This fixes a crash caused by picking the wrong memory region in
address_space_lookup_region seen with client code accessing a device
model that uses alias memory regions.  The expensive part of
address_space_lookup_region anyway is phys_page_find; performance-wise
it is okay to repeat the subsequent subpage lookup.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20171114225941.072707456B5@zero.eik.bme.hu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-11-15 15:11:16 +01:00
Emilio G. Cota 2dda635410 qom: move CPUClass.tcg_initialize to a global
55c3cee ("qom: Introduce CPUClass.tcg_initialize", 2017-10-24)
introduces a per-CPUClass bool that we check so that the target CPU
is initialized for TCG only once. This works well except when
we end up creating more than one CPUClass, in which case we end
up incorrectly initializing TCG more than once, i.e. once for
each CPUClass.

This can be replicated with:
  $ aarch64-softmmu/qemu-system-aarch64 -machine xlnx-zcu102 -smp 6 \
      -global driver=xlnx,,zynqmp,property=has_rpu,value=on
In this case the class name of the "RPUs" is prefixed by "cortex-r5-",
whereas the "regular" CPUs are prefixed by "cortex-a53-". This
results in two CPUClass instances being created.

Fix it by introducing a static variable, so that only the first
target CPU being initialized will initialize the target-dependent
part of TCG, regardless of CPUClass instances.
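
The idea of the fix can be sketched as follows. The struct and function names are hypothetical, and the sketch ignores locking; it only shows why a function-level static survives across classes where a per-class flag does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, cut-down class with just the hook that matters here. */
typedef struct SketchCPUClass {
    const char *name;
    void (*tcg_initialize)(void);
} SketchCPUClass;

/* Counts how often the target-dependent initialisation actually ran. */
static int sketch_init_calls;

static void sketch_target_tcg_init(void)
{
    sketch_init_calls++;
}

/* Called once per CPU being realized, whatever its class. */
static void sketch_cpu_realize(const SketchCPUClass *cc)
{
    /* One flag for the whole program, not one per class. */
    static bool tcg_target_initialized;

    if (!tcg_target_initialized) {
        tcg_target_initialized = true;
        cc->tcg_initialize();
    }
}
```

With a per-class flag, realizing a "cortex-a53" CPU and then a "cortex-r5" CPU would run the hook twice; with the static it runs once.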

Fixes: 55c3ceef61
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1510343626-25861-2-git-send-email-cota@braap.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-11-13 13:55:25 +00:00
Richard Henderson 9b990ee5a3 tcg: Add CPUState cflags_next_tb
We were generating code during tb_invalidate_phys_page_range,
check_watchpoint, cpu_io_recompile, and (seemingly) discarding
the TB, assuming that it would magically be picked up during
the next iteration through the cpu_exec loop.

Instead, record the desired cflags in CPUState so that we request
the proper TB so that there is no more magic.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24 13:53:41 -07:00
Emilio G. Cota 4e2ca83e71 tcg: define CF_PARALLEL and use it for TB hashing along with CF_COUNT_MASK
This will enable us to decouple code translation from the value
of parallel_cpus at any given time. It will also help us minimize
TB flushes when generating code via EXCP_ATOMIC.

Note that the declaration of parallel_cpus is brought to exec-all.h
to be able to define there the "curr_cflags" inline.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24 13:53:41 -07:00
Richard Henderson 55c3ceef61 qom: Introduce CPUClass.tcg_initialize
Move target cpu tcg initialization to common code,
called from cpu_exec_realizefn.

Acked-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24 22:00:13 +02:00
Paolo Bonzini 306526b5de watch_mem_write: implement 8-byte accesses
Aligned 8-byte memory writes by a 64-bit target on a 64-bit host should
always turn into atomic 8-byte writes on the host, however a write
watchpoint would end up tearing the 8-byte write into two 4-byte
writes in access_with_adjusted_size().

Reported-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-18 10:15:00 +02:00
Andrew Baumann ad52878f97 notdirty_mem_write: implement 8-byte accesses
Aligned 8-byte memory writes by a 64-bit target on a 64-bit host should
always turn into atomic 8-byte writes on the host, however if we missed
in the softmmu, and the TLB line was marked as not dirty, then we
would end up tearing the 8-byte write into two 4-byte writes in
access_with_adjusted_size().

Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-Id: <20171013181913.7556-1-Andrew.Baumann@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-18 10:15:00 +02:00
Peter Xu 076a93d797 exec: simplify address_space_get_iotlb_entry
This patch lets address_space_get_iotlb_entry() use the newly
introduced page_mask parameter in flatview_do_translate(). Then we
can be sure the IOTLB is aligned to the page mask, and we should
support huge pages properly again, which broke when a764040 was introduced.

Fixes: a764040 ("exec: abstract address_space_do_translate()")
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20171010094247.10173-3-maxime.coquelin@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-12 12:10:38 +02:00
Peter Xu d5e5fafd11 exec: add page_mask for flatview_do_translate
The function was originally used for flatview_space_translate(), where what
we care about most is the (xlat, plen) range. For iotlb requests, however, we
don't really care about "plen", but about the size of the page that "xlat" is
located on, and plen cannot carry this information.

A simple example to show why "plen" is not good for IOTLB translations:

E.g., for huge pages, it is possible that guest mapped 1G huge page on
device side that used this GPA range:

  0x100000000 - 0x13fffffff

Then let's say we want to translate one IOVA that finally mapped to GPA
0x13ffffe00 (which is located on this 1G huge page). Then here we'll
get:

  (xlat, plen) = (0x13ffffe00, 0x200)

So the IOTLB would be only covering a very small range since from
"plen" (which is 0x200 bytes) we cannot tell the size of the page.

Actually we can really know that this is a huge page - we just throw the
information away in flatview_do_translate().

This patch introduced "page_mask" optional parameter to capture that
page mask info. Also, I made "plen" an optional parameter as well, with
some comments for the whole function.

No functional change yet.
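
To make the huge page example concrete, here is a small sketch of how an IOTLB entry could be widened once the page size is known. The struct, the function, and the choice to pass the page size rather than a mask are assumptions made for this illustration; they are not the signature of flatview_do_translate().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical IOTLB entry: base of the translated page plus its offset mask. */
typedef struct SketchIOTLBEntry {
    uint64_t translated_addr;
    uint64_t addr_mask;
} SketchIOTLBEntry;

/*
 * Build an entry covering the whole page that 'xlat' lies on.
 * 'page_size' must be a power of two (e.g. 1 GiB for the example above).
 */
static SketchIOTLBEntry sketch_iotlb_entry(uint64_t xlat, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    uint64_t low_bits = page_size - 1;
    SketchIOTLBEntry entry = {
        .translated_addr = xlat & ~low_bits, /* start of the page */
        .addr_mask = low_bits,               /* offsets valid within it */
    };

    return entry;
}
```

For GPA 0x13ffffe00 on a 1 GiB page this yields a base of 0x100000000 and a mask of 0x3fffffff, i.e. the full 0x100000000 - 0x13fffffff range instead of the last 0x200 bytes.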

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Message-Id: <20171010094247.10173-2-maxime.coquelin@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-12 12:10:38 +02:00
Emilio G. Cota 3637cf58f9 util: move qemu_real_host_page_size/mask to osdep.h
These only depend on the host and therefore belong in the common
osdep, not in a target-dependent object.

While at it, query the host during an init constructor, which guarantees
the page size will be well-defined throughout the execution of the program.
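
A minimal sketch of the init-constructor approach, assuming a POSIX host and a compiler that supports __attribute__((constructor)) (GCC or Clang). The sketch_ names are hypothetical stand-ins for the real variables.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical stand-ins for qemu_real_host_page_size/mask. */
static uintptr_t sketch_host_page_size;
static intptr_t sketch_host_page_mask;

/* Runs before main(), so every later reader sees initialised values. */
__attribute__((constructor))
static void sketch_init_host_page_size(void)
{
    long size = sysconf(_SC_PAGESIZE);

    assert(size > 0);
    sketch_host_page_size = (uintptr_t)size;
    /* Negating a power of two gives a mask with the low bits cleared. */
    sketch_host_page_mask = -(intptr_t)sketch_host_page_size;
}
```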

Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-10 09:45:00 -07:00
Alexey Kardashevskiy 5e8fd947e2 memory: Rework "info mtree" to print flat views and dispatch trees
This adds a new "-d" switch to "info mtree" to print dispatch tree
internals.

This changes the way "-f" is handled: it now prints flat views and
associated address spaces.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-15-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:38 +02:00
Alexey Kardashevskiy 8629d3fcb7 memory: Rename mem_begin/mem_commit/mem_add helpers
This renames some helpers to reflect better what they do.

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-9-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy 9950322a59 memory: Cleanup after switching to FlatView
We store AddressSpaceDispatch* in FlatView anyway so there is no need
to carry it from mem_add() to register_subpage/register_multipage.

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-8-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy 166206845f memory: Switch memory from using AddressSpace to FlatView
FlatView's will be shared between AddressSpace's and subpage_t
and MemoryRegionSection cannot store AS anymore, hence this change.

In particular, for:

 typedef struct subpage_t {
     MemoryRegion iomem;
-    AddressSpace *as;
+    FlatView *fv;
     hwaddr base;
     uint16_t sub_section[];
 } subpage_t;

  struct MemoryRegionSection {
     MemoryRegion *mr;
-    AddressSpace *address_space;
+    FlatView *fv;
     hwaddr offset_within_region;
     Int128 size;
     hwaddr offset_within_address_space;
     bool readonly;
 };

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-7-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy c775252378 memory: Remove AddressSpace pointer from AddressSpaceDispatch
AS in ASD is only used to pass AS from mem_begin() to register_subpage()
to store it in MemoryRegionSection, we can do this directly now.

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-6-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy 66a6df1dc6 memory: Move AddressSpaceDispatch from AddressSpace to FlatView
As we are going to share FlatView's between AddressSpace's,
and AddressSpaceDispatch is a structure to perform quick lookup
in FlatView, this moves ASD to FlatView.

After previously open coded ASD rendering, we can also remove
as->next_dispatch as the new FlatView pointer is stored
on a stack and set to an AS atomically.

flatview_destroy() is executed under RCU instead of
address_space_dispatch_free() now.

This makes mem_begin/mem_commit work with ASD and mem_add with FV
as later on mem_add will be taking FV as an argument anyway.

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-5-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy 9a62e24f45 memory: Open code FlatView rendering
We are going to share FlatView's between AddressSpace's and per-AS
memory listeners won't suit the purpose anymore so open code
the dispatch tree rendering.

Since there is a good chance that dispatch_listener was the only
listener, this avoids address_space_update_topology_pass() if there are
no registered listeners; this should improve startup time.

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-3-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy e76bb18f7e exec: Explicitly export target AS from address_space_translate_internal
This adds an AS** parameter to address_space_do_translate()
to make it easier for the next patch to share FlatViews.

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-2-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:37 +02:00
David Hildenbrand 7f579e272f exec,dump,i386,ppc,s390x: don't include exec/cpu-all.h explicitly
All but a handful of files include exec/cpu-all.h via cpu.h only.
As these files already include cpu.h, let's just drop the additional
include.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170913132417.24384-2-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-09-19 18:21:33 +02:00
Anthony PERARD f5aa69bdc3 exec: Add lock parameter to qemu_ram_ptr_length
Commit 04bf2526ce (exec: use
qemu_ram_ptr_length to access guest ram) started using qemu_ram_ptr_length
instead of qemu_map_ram_ptr, but when used with Xen, the behavior of
the two functions is different. They both call xen_map_cache, but one with
"lock", meaning the mapping of guest memory is never released
implicitly, and the second one without, which means the mapping can be
released later, when needed.

In the context of address_space_{read,write}_continue, the pointers to those
mappings should not be locked because they are used immediately and never
used again.

The lock parameter makes it explicit in which context qemu_ram_ptr_length
is called.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20170726165326.10327-1-anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-08-01 17:27:33 +02:00
Fam Zheng c7e002c55a cpu: Convert to DEFINE_PROP_LINK
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170714021509.23681-20-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:43 +02:00
Alexey Kardashevskiy 1221a47467 memory/iommu: introduce IOMMUMemoryRegionClass
This finishes QOM'fication of IOMMUMemoryRegion by introducing
an IOMMUMemoryRegionClass. This also provides a fastpath analog for
IOMMU_MEMORY_REGION_GET_CLASS().

This makes IOMMUMemoryRegion an abstract class.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170711035620.4232-3-aik@ozlabs.ru>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:41 +02:00
Alexey Kardashevskiy 3df9d74806 memory/iommu: QOM'fy IOMMU MemoryRegion
This defines a new QOM object - IOMMUMemoryRegion - with MemoryRegion
as a parent.

This moves IOMMU-related fields from MR to IOMMU MR. However to avoid
dynamic QOM casting in fast path (address_space_translate, etc),
this adds an @is_iommu boolean flag to MR and provides new helper to
do simple cast to IOMMU MR - memory_region_get_iommu. The flag
is set in the instance init callback. This defines
memory_region_is_iommu as memory_region_get_iommu()!=NULL.
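
The flag-based cast can be sketched like this. The structs are hypothetical and heavily reduced, and the sketch leaves out alias handling; it only shows the cheap check replacing a dynamic type lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parent type: carries the flag set at instance init. */
typedef struct SketchMemoryRegion {
    bool is_iommu;
} SketchMemoryRegion;

/* Hypothetical child type: the parent must be the first member. */
typedef struct SketchIOMMUMemoryRegion {
    SketchMemoryRegion parent_obj;
    int iommu_state;
} SketchIOMMUMemoryRegion;

/* Fast-path cast: one flag test instead of a dynamic QOM cast. */
static SketchIOMMUMemoryRegion *sketch_get_iommu(SketchMemoryRegion *mr)
{
    return mr->is_iommu ? (SketchIOMMUMemoryRegion *)mr : NULL;
}

/* "Is it an IOMMU region?" is then defined in terms of the cast helper. */
static bool sketch_is_iommu(SketchMemoryRegion *mr)
{
    return sketch_get_iommu(mr) != NULL;
}
```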

This switches MemoryRegion to IOMMUMemoryRegion in most places except
the ones where MemoryRegion may be an alias.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20170711035620.4232-2-aik@ozlabs.ru>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:41 +02:00
Pranith Kumar 406bc339b0 Revert "exec.c: Fix breakpoint invalidation race"
Now that we have proper locking after MTTCG patches have landed, we
can revert the commit.  This reverts commit

a9353fe897.

CC: Peter Maydell <peter.maydell@linaro.org>
CC: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170712215143.19594-1-bobby.prani@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 11:05:19 +02:00
Prasad J Pandit 04bf2526ce exec: use qemu_ram_ptr_length to access guest ram
When accessing guest's ram block during DMA operation, use
'qemu_ram_ptr_length' to get ram block pointer. It ensures
that DMA operation of given length is possible, and avoids
any OOB memory access situations.

Reported-by: Alex <broscutamaker@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20170712123840.29328-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 11:04:34 +02:00
Paolo Bonzini 5aa1ef71b4 exec: elide calls to tb_lock and tb_unlock
Adding assertions fixes link errors.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 16:01:16 +02:00
Yang Zhong a0be0c585f tcg: move page_size_init() function
translate-all.c will be disabled if tcg is disabled in the build,
so page_size_init() function and related variables will be moved
to exec.c file.

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 16:00:12 +02:00
Marc-André Lureau 38b3362dd1 exec: split qemu_ram_alloc_from_file()
Add qemu_ram_alloc_from_fd(), which can be used to allocate ramblock from
fd only.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170602141229.15326-4-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-15 11:04:04 +02:00
Marc-André Lureau 8d37b030fe exec: split file_ram_alloc()
Move the file opening part into a separate function, file_ram_open(). This
allows for reuse of file_ram_alloc() with a given fd.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170602141229.15326-3-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-15 11:04:04 +02:00
Marc-André Lureau e45e7ae281 exec: check kvm mmu notifiers earlier
Move kvm mmu notifiers check before calling file_ram_alloc(), with the
other xen precondition. (file_ram_alloc() will be reused in other cases
than -mem-path).

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170602141229.15326-2-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-15 11:04:04 +02:00
Peter Xu 003a0cf2cd exec: simplify phys_page_find() params
It really only plays with the dispatchers, so the parameter list does
not need that complexity. This helps for readability at least.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494838260-30439-2-git-send-email-peterx@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:36 +02:00
Peter Xu bf55b7afce memory: tune last param of iommu_ops.translate()
This patch converts the old "is_write" bool into IOMMUAccessFlags. The
difference is that "is_write" can only express either read/write, but
sometimes what we really want is "none" here (neither read nor write).
Replay is a good example - during replay, we should not check any RW
permission bits since that's not an actual IO at all.
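
A sketch of why a flags type is more expressive than a bool here. The enum mirrors the none/read/write/read-write idea from the message with hypothetical SKETCH_ names, and the permission check is an assumption made for illustration, not the translate() callback itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical access flags: two bits, so "none" is representable. */
typedef enum {
    SKETCH_IOMMU_NONE = 0,
    SKETCH_IOMMU_RO   = 1,
    SKETCH_IOMMU_WO   = 2,
    SKETCH_IOMMU_RW   = 3,
} SketchIOMMUAccessFlags;

/*
 * Is an access of kind 'wanted' allowed by a mapping with 'perm'?
 * With wanted == NONE (e.g. replay) no permission bits are checked.
 */
static bool sketch_access_allowed(SketchIOMMUAccessFlags perm,
                                  SketchIOMMUAccessFlags wanted)
{
    return (perm & wanted) == wanted;
}
```

A bool "is_write" can only ask for RO or WO; the NONE case always passes the check, which matches the replay use described above.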

CC: Paolo Bonzini <pbonzini@redhat.com>
CC: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Juan Quintela 46d702b106 migration: Make savevm.c target independent
It only needed TARGET_PAGE_SIZE/BITS/BITS_MIN values, so just export
them from exec.h

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-18 19:21:00 +02:00
Juan Quintela 51180423a2 exec: Create include for target_page_size()
That is the only function that we need from exec.c, and this avoids
having to include the whole sysemu.h just for it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

---

/me learns to be less sloppy with copyright notices
thanks Dave
2017-05-18 19:20:59 +02:00
Stefan Hajnoczi 56821559f0 HMP pull
Merge remote-tracking branch 'dgilbert/tags/pull-hmp-20170517' into staging

HMP pull

# gpg: Signature made Wed 17 May 2017 07:03:39 PM BST
# gpg:                using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* dgilbert/tags/pull-hmp-20170517:
  ramblock: add new hmp command "info ramblock"
  utils: provide size_to_str()
  ramblock: add RAMBLOCK_FOREACH()

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-18 13:36:15 +01:00
Stefan Hajnoczi adb354dd1e pci, virtio, vhost: fixes
A bunch of fixes that missed the release.
Most notably we are reverting shpc back to enabled by default state
as guests use that as an indicator that hotplug is supported
(even though it's unused). Unfortunately we can't fix this
on the stable branch since that would break migration.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Merge remote-tracking branch 'mst/tags/for_upstream' into staging

# gpg: Signature made Wed 17 May 2017 10:42:06 PM BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* mst/tags/for_upstream:
  exec: abstract address_space_do_translate()
  pci: deassert intx when pci device unrealize
  virtio: allow broken device to notify guest
  Revert "hw/pci: disable pci-bridge's shpc by default"
  acpi-defs: clean up open brace usage
  ACPI: don't call acpi_pcihp_device_plug_cb on xen
  iommu: Don't crash if machine is not PC_MACHINE
  pc: add 2.10 machine type
  pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot
  libvhost-user: fix crash when rings aren't ready
  hw/virtio: fix vhost user fails to startup when MQ
  hw/arm/virt: generate 64-bit addressable ACPI objects
  hw/acpi-defs: replace leading X with x_ in FADT field names

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-18 10:01:08 +01:00
Peter Xu a764040cc8 exec: abstract address_space_do_translate()
This function is an abstraction helper for address_space_translate() and
address_space_get_iotlb_entry(). It does the lookup of address into
memory region section, then does proper IOMMU translation if necessary.
Refactor the two existing functions to use it.

This fixes vhost when IOMMU is disabled by guest.

Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-18 00:35:15 +03:00
Peter Xu be9b23c4a5 ramblock: add new hmp command "info ramblock"
To dump information about ramblocks. It looks like:

(qemu) info ramblock
              Block Name    PSize              Offset               Used              Total
            /objects/mem    2 MiB  0x0000000000000000 0x0000000080000000 0x0000000080000000
                vga.vram    4 KiB  0x0000000080060000 0x0000000001000000 0x0000000001000000
    /rom@etc/acpi/tables    4 KiB  0x00000000810b0000 0x0000000000020000 0x0000000000200000
                 pc.bios    4 KiB  0x0000000080000000 0x0000000000040000 0x0000000000040000
  0000:00:03.0/e1000.rom    4 KiB  0x0000000081070000 0x0000000000040000 0x0000000000040000
                  pc.rom    4 KiB  0x0000000080040000 0x0000000000020000 0x0000000000020000
    0000:00:02.0/vga.rom    4 KiB  0x0000000081060000 0x0000000000010000 0x0000000000010000
   /rom@etc/table-loader    4 KiB  0x00000000812b0000 0x0000000000001000 0x0000000000001000
      /rom@etc/acpi/rsdp    4 KiB  0x00000000812b1000 0x0000000000001000 0x0000000000001000

RAMBlocks are an internal detail of the QEMU implementation, and this
command is intended mainly for QEMU developers working on RAM-related
code. It is not suitable for the QMP interface, so only an HMP command
is provided.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494562661-9063-4-git-send-email-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-17 17:31:16 +01:00
Peter Xu 99e15582de ramblock: add RAMBLOCK_FOREACH()
This simplifies the iterators.
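
As a rough illustration of how an iterator macro of this kind tidies up
callers, here is a minimal sketch. The Block type, block_list_head and
BLOCK_FOREACH are hypothetical stand-ins invented for this example; they
are not QEMU's RAMBlock, ram_list or RAMBLOCK_FOREACH definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for a RAM block; not QEMU's real RAMBlock. */
typedef struct Block {
    size_t used_length;
    struct Block *next;
} Block;

/* Hypothetical list head, standing in for a global block list. */
static Block *block_list_head;

/* Iterator macro: hides the list head and the link field from callers. */
#define BLOCK_FOREACH(block) \
    for ((block) = block_list_head; (block) != NULL; (block) = (block)->next)

/* Sum the used length of every block by walking the list with the macro. */
static size_t total_used_length(void)
{
    Block *block;
    size_t total = 0;

    BLOCK_FOREACH(block) {
        total += block->used_length;
    }
    return total;
}
```

Callers only name the loop variable, so a later change to how the list is
stored or protected would touch the macro rather than every loop.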

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494562661-9063-2-git-send-email-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-17 17:30:37 +01:00
Stefano Stabellini 1ff7c5986a xen/mapcache: store dma information in revmapcache entries for debugging
The Xen mapcache can create long-term mappings, which are called
"locked" mappings. The third parameter of the xen_map_cache call
specifies whether a mapping is a "locked" mapping.

From the QEMU point of view there are two kinds of long-term mappings:

[a] device memory mappings, such as option roms and video memory
[b] dma mappings, created by dma_memory_map & friends

After certain operations, ballooning a VM in particular, Xen asks QEMU
kindly to destroy all mappings. However, [a] mappings are certainly
present and cannot be removed. That's not a problem as they are not
affected by ballooning. The *real* problem is that if there are any
mappings of type [b], any outstanding dma operations could fail. This is
a known shortcoming. In other words, when Xen asks QEMU to destroy all
mappings, it is an error if any [b] mappings exist.

However today we have no way of distinguishing [a] from [b]. Because of
that, we cannot even print a decent warning.

This patch introduces a new "dma" bool field to MapCacheRev entries, to
remember if a given mapping is for dma or is a long term device memory
mapping. When xen_invalidate_map_cache is called, we print a warning if
any [b] mappings exist. We ignore [a] mappings.

Mappings created by qemu_map_ram_ptr are assumed to be [a], while
mappings created by address_space_map->qemu_ram_ptr_length are assumed
to be [b].

The goal of the patch is to make debugging and system understanding
easier.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2017-05-16 11:49:09 -07:00
Gerd Hoffmann 8deaf12ca1 memory: add support getting and using a dirty bitmap copy.
This patch adds support for getting and using a local copy of the dirty
bitmap.

memory_region_snapshot_and_clear_dirty() will create a snapshot of the
dirty bitmap for the specified range, clear the dirty bitmap and return
the copy.  The returned bitmap can be a bit larger than requested: the
range is expanded so the code can copy unsigned longs from the bitmap
and avoid atomic bit update operations.

memory_region_snapshot_get_dirty() will return the dirty status of
pages, pretty much like memory_region_get_dirty(), but using the copy
returned by memory_region_snapshot_and_clear_dirty().
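
The range expansion described above can be sketched as follows. This is
only an illustration of rounding a page range out to whole unsigned
longs; expand_to_long_boundaries() and its parameters are hypothetical
names, not functions from the patch.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bitmap bits stored in one unsigned long on this host. */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: widen the page range [first, last) so that both
 * ends fall on unsigned long boundaries of the bitmap. The snapshot can
 * then be copied one long at a time, with no per-bit operations.
 */
static void expand_to_long_boundaries(size_t first, size_t last,
                                      size_t *out_first, size_t *out_last)
{
    /* Round the start down to the previous long boundary. */
    *out_first = first - (first % BITS_PER_LONG);
    /* Round the end up to the next long boundary. */
    *out_last = (last + BITS_PER_LONG - 1) / BITS_PER_LONG * BITS_PER_LONG;
}
```

The widened range never extends more than one long's worth of pages
beyond either end of the request, which is why the returned bitmap is
only "a bit larger".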

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-3-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Juan Quintela 66103a5796 ram: Remove migration_bitmap_extend()
We have disabled memory hotplug, so we don't need to handle
migration_bitmap there.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
2017-04-21 12:25:40 +02:00
Juan Quintela b8c4899398 ram: rename last_ram_offset() to last_ram_pages()
We always use it as pages anyways.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:40 +02:00
Juan Quintela 20afaed98b ram: Rename qemu_target_page_bits() to qemu_target_page_size()
It was used as a size in all cases except one.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:39 +02:00
Paolo Bonzini 90c4fe5fc5 exec: revert MemoryRegionCache
MemoryRegionCache did not know about virtio support for IOMMUs (because the
two features were developed at the same time).  Revert MemoryRegionCache
to "normal" address_space_* operations for 2.9, as it is simpler than
undoing the virtio patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-03 13:41:53 +02:00
Dr. David Alan Gilbert 463a4ac23b RAMBlocks: qemu_ram_is_shared
Provide a helper to say whether a RAMBlock was created as a
shared mapping.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-03-16 09:00:58 +01:00
Christian Borntraeger 79ca7a1b89 exec: add cpu_synchronize_state to cpu_memory_rw_debug
I sometimes got "Cannot access memory" when using the x command
on the monitor. Turns out that the cpu env did contain stale data
(e.g. wrong control register content for page table origin).
We must synchronize the state of the CPU before walking the page
tables. A similar issue happens with a remote gdb, so let's
do the cpu_synchronize_state in cpu_memory_rw_debug.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <1488896348-13560-1-git-send-email-borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:36 +01:00
Jitendra Kolhe 1e356fc14b mem-prealloc: reduce large guest start-up and migration time.
Using the "-mem-prealloc" option for a large guest leads to higher guest
start-up and migration time. This is because with "-mem-prealloc"
QEMU tries to map every guest page (create address translations) and
make sure the pages are available during runtime. virsh/libvirt seems
to use "-mem-prealloc" by default when the guest is configured to use
huge pages. The patch maps all guest pages concurrently by spawning
multiple threads. The change is currently limited to the QEMU library
functions on POSIX-compliant hosts, as we are not sure whether the
problem exists on win32. Below are some stats with the "-mem-prealloc"
option for a guest configured to use huge pages.

------------------------------------------------------------------------
Idle Guest      | Start-up time | Migration time
------------------------------------------------------------------------
Guest stats with 2M HugePage usage - single threaded (existing code)
------------------------------------------------------------------------
64 Core - 4TB   | 54m11.796s    | 75m43.843s
64 Core - 1TB   | 8m56.576s     | 14m29.049s
64 Core - 256GB | 2m11.245s     | 3m26.598s
------------------------------------------------------------------------
Guest stats with 2M HugePage usage - map guest pages using 8 threads
------------------------------------------------------------------------
64 Core - 4TB   | 5m1.027s      | 34m10.565s
64 Core - 1TB   | 1m10.366s     | 8m28.188s
64 Core - 256GB | 0m19.040s     | 2m10.148s
-----------------------------------------------------------------------
Guest stats with 2M HugePage usage - map guest pages using 16 threads
-----------------------------------------------------------------------
64 Core - 4TB   | 1m58.970s     | 31m43.400s
64 Core - 1TB   | 0m39.885s     | 7m55.289s
64 Core - 256GB | 0m11.960s     | 2m0.135s
-----------------------------------------------------------------------

Changed in v2:
 - modify number of memset threads spawned to min(smp_cpus, 16).
 - removed 64GB memory restriction for spawning memset threads.

Changed in v3:
 - limit number of threads spawned based on
   min(sysconf(_SC_NPROCESSORS_ONLN), 16, smp_cpus)
 - implement memset thread specific siglongjmp in SIGBUS signal_handler.

Changed in v4:
 - remove sigsetjmp/siglongjmp and SIGBUS unblock/block for main thread
   as main thread no longer touches any pages.
 - simplify code by returning memset_thread_failed status from
   touch_all_pages.
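
The v3 thread limit, min(sysconf(_SC_NPROCESSORS_ONLN), 16, smp_cpus),
could be computed along these lines. This is a sketch under assumptions:
the function names and the MAX_MEM_PREALLOC_THREADS constant are
hypothetical, and falling back to one thread when sysconf() fails is a
choice made for this example, not something stated in the changelog.

```c
#include <unistd.h>

/* Upper bound on worker threads, taken from the changelog above. */
#define MAX_MEM_PREALLOC_THREADS 16

/*
 * Hypothetical helper: pick the thread count as the minimum of the
 * online host CPUs, the fixed cap, and the guest's CPU count.
 */
static int prealloc_thread_count_from(long host_cpus, int smp_cpus)
{
    long n = MAX_MEM_PREALLOC_THREADS;

    if (host_cpus < 1) {
        host_cpus = 1;      /* assumption: treat a sysconf() error as 1 CPU */
    }
    if (host_cpus < n) {
        n = host_cpus;
    }
    if (smp_cpus < n) {
        n = smp_cpus;
    }
    if (n < 1) {
        n = 1;              /* always use at least one thread */
    }
    return (int)n;
}

/* Same calculation, querying the host for its online CPU count. */
static int prealloc_thread_count(int smp_cpus)
{
    return prealloc_thread_count_from(sysconf(_SC_NPROCESSORS_ONLN), smp_cpus);
}
```

Splitting out the pure calculation keeps the host query in one place and
makes the min() logic easy to check in isolation.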

Signed-off-by: Jitendra Kolhe <jitendra.kolhe@hpe.com>
Message-Id: <1487907103-32350-1-git-send-email-jitendra.kolhe@hpe.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:36 +01:00
Alexey Kardashevskiy 9c60766887 exec, kvm, target-ppc: Move getrampagesize() to common code
getrampagesize() returns the largest supported page size and is mainly
used to know if huge pages are enabled.

However, it is implemented in target-ppc/kvm.c and is not available
in TCG or other architectures.

This renames and moves gethugepagesize() to mmap-alloc.c where
fd-based analog of it is already implemented. This renames and moves
getrampagesize() to exec.c as it seems to be the common place for
helpers like this.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
Dr. David Alan Gilbert 67f11b5c23 postcopy: Record largest page size
Record the largest page size in use; we'll need it soon for allocating
temporary buffers.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-7-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28 11:30:23 +00:00
Dr. David Alan Gilbert e2fa71f527 postcopy: enhance ram_block_discard_range for hugepages
Unfortunately madvise(MADV_DONTNEED) doesn't work on hugetlbfs-backed
memory, so use fallocate(FALLOC_FL_PUNCH_HOLE) instead.
qemu_fd_getpagesize only sets the page size based off a file
if the file is from hugetlbfs.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-6-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28 11:30:23 +00:00
Dr. David Alan Gilbert d3a5038c46 exec: ram_block_discard_range
Create ram_block_discard_range in exec.c to replace
postcopy_ram_discard_range and most of ram_discard_range.

Those two routines are a bit of a weird combination, and
ram_discard_range is about to get more complex for hugepages.
It's OS dependent code (so shouldn't be in migration/ram.c) but
it needs quite a bit of the innards of RAMBlock so doesn't belong in
the os*.c.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-5-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28 11:30:23 +00:00
Jan Kiszka 8d04fb55de tcg: drop global lock during TCG code execution
This finally allows TCG to benefit from the iothread introduction: Drop
the global mutex while running pure TCG CPU code. Reacquire the lock
when entering MMIO or PIO emulation, or when leaving the TCG loop.

We have to revert a few optimizations for the current TCG threading
model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not
kicking it in qemu_cpu_kick. We also need to disable RAM block
reordering until we have a more efficient locking mechanism at hand.

Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here.
These numbers demonstrate where we gain something:

20338 jan       20   0  331m  75m 6904 R   99  0.9   0:50.95 qemu-system-arm
20337 jan       20   0  331m  75m 6904 S   20  0.9   0:26.50 qemu-system-arm

The guest CPU was fully loaded, but the iothread could still run mostly
independently on a second core. Without the patch we don't get beyond

32206 jan       20   0  330m  73m 7036 R   82  0.9   1:06.00 qemu-system-arm
32204 jan       20   0  330m  73m 7036 S   21  0.9   0:17.03 qemu-system-arm

We don't benefit significantly, though, when the guest is not fully
loading a host CPU.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com>
[FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex]
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
[EGC: fixed iothread lock for cpu-exec IRQ handling]
Signed-off-by: Emilio G. Cota <cota@braap.org>
[AJB: -smp single-threaded fix, clean commit msg, BQL fixes]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
[PM: target-arm changes]
Acked-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-24 10:32:45 +00:00
Paolo Bonzini 91047df38d exec: make address_space_cache_destroy idempotent
Clear cache->mr so that address_space_cache_destroy does nothing
the second time it is called.
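
The idempotency pattern described here can be sketched as below. The
Region and Cache types and the cache_init()/cache_destroy() functions
are hypothetical simplifications made up for this example; they are not
the actual MemoryRegionCache code.

```c
#include <stddef.h>

/* Hypothetical reference-counted region. */
typedef struct Region {
    int refcount;
} Region;

/* Hypothetical cache that holds one reference to a region. */
typedef struct Cache {
    Region *mr;
} Cache;

/* Take a reference on the region and remember it in the cache. */
static void cache_init(Cache *cache, Region *mr)
{
    mr->refcount++;
    cache->mr = mr;
}

/* Drop the reference once; later calls see mr == NULL and do nothing. */
static void cache_destroy(Cache *cache)
{
    if (cache->mr == NULL) {
        return;
    }
    cache->mr->refcount--;
    cache->mr = NULL;   /* clearing the pointer makes destroy idempotent */
}
```

With the pointer cleared, error paths can call the destroy function
unconditionally without tracking whether it has already run.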

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Julian Brown 4061200059 arm: Correctly handle watchpoints for BE32 CPUs
In BE32 mode, sub-word size watchpoints can fail to trigger because the
address of the access is adjusted in the opcode helpers before being
compared with the watchpoint registers.  This patch reverses the address
adjustment before performing the comparison with the help of a new CPUClass
hook.
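
To illustrate why the adjustment can be reversed before comparing, here
is a small sketch. It assumes the BE32 adjustment is an XOR of the low
address bits (3 for byte accesses, 2 for halfword accesses); that
assumption, and both function names, are for illustration only and are
not taken from the patch text.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed BE32 adjustment: sub-word accesses have their low address
 * bits XORed. XOR is its own inverse, so applying the same function
 * again recovers the address the guest actually used.
 */
static uint64_t be32_adjust_address(uint64_t addr, unsigned len)
{
    if (len == 1) {
        return addr ^ 3;
    }
    if (len == 2) {
        return addr ^ 2;
    }
    return addr;    /* word-sized and larger accesses are unchanged */
}

/* Hypothetical check: undo the adjustment, then compare with the watchpoint. */
static bool watchpoint_matches(uint64_t adjusted_addr, unsigned len,
                               uint64_t watch_addr)
{
    return be32_adjust_address(adjusted_addr, len) == watch_addr;
}
```

Comparing the adjusted address directly would miss a byte watchpoint at
the guest address, which matches the failure described above.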

This version of the patch augments and tidies up comments a little.

Signed-off-by: Julian Brown <julian@codesourcery.com>
Message-id: caaf64ffc72f6ae183015337b7afdbd4b8989cb6.1484929304.git.julian@codesourcery.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-07 18:29:59 +00:00
Daniel P. Berrange 0ab8ed18a6 trace: switch to modular code generation for sub-directories
Introduce rules in the top level Makefile that are able to generate
trace.[ch] files in every subdirectory which has a trace-events file.

The top level directory is handled specially, so instead of creating
trace.h, it creates trace-root.h. This allows sub-directories to
include the top level trace-root.h file, without ambiguity with respect
to the trace.h file in the current sub-dir.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170125161417.31949-7-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-01-31 17:11:18 +00:00
Ladi Prosek 6da67de680 memory: don't sign-extend 32-bit writes
ldl_p has a signed return type so assigning it to uint64_t implicitly
sign-extends the value. This results in devices with min_access_size = 8
seeing unexpected values passed to their write handlers.

Example: guest performs a 32-bit write of 0x80000000 to an mmio region
and the handler receives 0xFFFFFFFF80000000 in its value argument.
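
The effect can be reproduced in isolation with a short sketch.
load32_signed() is a hypothetical stand-in for a 32-bit load helper
with a signed return type; it is not QEMU's ldl_p.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-bit load with a signed return type. */
static int load32_signed(const void *p)
{
    int32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/* Buggy widening: the signed int is sign-extended to 64 bits. */
static uint64_t widen_buggy(const void *p)
{
    return load32_signed(p);
}

/* Fixed widening: cast to uint32_t first so the upper bits stay zero. */
static uint64_t widen_fixed(const void *p)
{
    return (uint32_t)load32_signed(p);
}
```

For a stored value of 0x80000000, widen_buggy() yields
0xFFFFFFFF80000000 while widen_fixed() yields 0x0000000080000000.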

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-Id: <1485440557-10384-1-git-send-email-lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-27 18:08:00 +01:00
Peter Maydell 598cf1c805 * QOM interface fix (Eduardo)
* RTC fixes (Gaohuai, Igor)
 * Memory leak fixes (Li Qiang, me)
 * Ctrl-a b regression (Marc-André)
 * Stubs cleanups and fixes (Leif, me)
 * hxtool tweak (me)
 * HAX support (Vincent)
 * QemuThread, exec.c and SCSI fixes (Roman, Xinhua, me)
 * PC_COMPAT_2_8 fix (Marcelo)
 * stronger bitmap assertions (Peter)

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

# gpg: Signature made Fri 20 Jan 2017 12:49:01 GMT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (35 commits)
  pc.h: move x-mach-use-reliable-get-clock compat entry to PC_COMPAT_2_8
  bitmap: assert that start and nr are non negative
  Revert "win32: don't run subprocess tests on Mingw32 platform"
  hax: add Darwin support
  Plumb the HAXM-based hardware acceleration support
  target/i386: Add Intel HAX files
  kvm: move cpu synchronization code
  KVM: PPC: eliminate unnecessary duplicate constants
  ramblock-notifier: new
  char: fix ctrl-a b not working
  exec: Add missing rcu_read_unlock
  x86: ioapic: fix fail migration when irqchip=split
  x86: ioapic: dump version for "info ioapic"
  x86: ioapic: add traces for ioapic
  hxtool: emit Texinfo headings as @subsection
  qemu-thread: fix qemu_thread_set_name() race in qemu_thread_create()
  serial: fix memory leak in serial exit
  scsi-block: fix direction of BYTCHK test for VERIFY commands
  pc: fix crash in rtc_set_memory() if initial cpu is marked as hotplugged
  acpi: filter based on CONFIG_ACPI_X86 rather than TARGET
  ...

# Conflicts:
#	include/hw/i386/pc.h
2017-01-20 16:42:07 +00:00
Paolo Bonzini 0987d735a3 ramblock-notifier: new
This adds a notify interface of ram block additions and removals.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-16 17:52:35 +01:00
Roman Kapl 5ad4a2b75f exec: Add missing rcu_read_unlock
rcu_read_unlock was not called if the address_space_access_valid result is
negative.

This caused (at least) a problem when qemu on PPC/E500+TAP failed to terminate
properly and instead got stuck in a deadlock.
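
The class of bug fixed here, an early return that skips the unlock, can
be sketched with a mock lock. Everything below is hypothetical: the
depth counter stands in for an RCU read-side critical section and
access_valid() for the validity check; none of it is QEMU code.

```c
#include <stdbool.h>

/* Mock read-side lock: counts how many sections are currently open. */
static int read_lock_depth;

static void mock_read_lock(void)
{
    read_lock_depth++;
}

static void mock_read_unlock(void)
{
    read_lock_depth--;
}

/* Hypothetical validity check standing in for the real one. */
static bool access_valid(unsigned long addr)
{
    return addr < 0x1000;
}

/* Every exit path goes through the single unlock at the "out" label. */
static bool do_access(unsigned long addr)
{
    bool ok = false;

    mock_read_lock();
    if (!access_valid(addr)) {
        goto out;           /* the failure path must still unlock */
    }
    ok = true;
out:
    mock_read_unlock();
    return ok;
}
```

Funnelling both outcomes through one unlock keeps the lock depth
balanced, whatever the validity check returns.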

Signed-off-by: Roman Kapl <rka@sysgo.com>
Message-Id: <20170109110921.4931-1-rka@sysgo.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-16 17:52:35 +01:00
Alex Bennée d10eb08f5d cputlb: drop flush_global flag from tlb_flush
We have never had the concept of global TLB entries which would avoid
the flush so we never actually use this flag. Drop it and make clear
that tlb_flush is the sledge-hammer it has always been.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
[DG: ppc portions]
Acked-by: David Gibson <david@gibson.dropbear.id.au>
2017-01-13 14:24:37 +00:00
Jason Wang 052c8fa998 exec: introduce address_space_get_iotlb_entry()
This patch introduces a helper to query the iotlb entry for a
possible iova. This will be used by later device IOTLB API to enable
the capability for a dataplane (e.g vhost) to query the IOTLB.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 05:56:58 +02:00
Paolo Bonzini 1f4e496e1f exec: introduce MemoryRegionCache
Device models often have to perform multiple accesses to a single
memory region that is known in advance, but would like to use "DMA-style"
functions instead of address_space_map/unmap.  This can happen
for example when the data has to undergo endianness conversion.
Introduce a new data structure to cache the result of
address_space_translate without forcing usage of a host address
like address_space_map does.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:23 +01:00
Paolo Bonzini 715c31ec8e exec: introduce address_space_extend_translation
This extracts the common part of address_space_map and
address_space_cache_init into a new function.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:23 +01:00
Paolo Bonzini 0ce265ffef exec: introduce memory_ldst.inc.c
Templatize the address_space_* and *_phys functions, so that we can add
similar functions in the next patch that work with a lightweight,
cache-like version of address_space_map/unmap.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:23 +01:00
Paolo Bonzini 2651efe7f5 exec: optimize remaining address_space_* cases
Do them right before the next patch generalizes them into a multi-included
file.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:22 +01:00
Peter Maydell a9353fe897 exec.c: Fix breakpoint invalidation race
A bug (1647683) was reported showing a crash when removing
breakpoints.  The reproducer was bisected to 3359baad when tb_flush
was finally made thread safe.  In MTTCG the locking in
breakpoint_invalidate would have prevented any problems, but
currently tb_lock() is a NOP for system emulation.

The race is between a tb_flush from the gdbstub and the
tb_invalidate_phys_addr() in breakpoint_invalidate().

Ideally we'd have actual locking here; for the moment the
simple fix is to do a full tb_flush() for a bp invalidate,
since that is thread-safe even if no lock is taken.

Reported-by: Julian Brown <julian@codesourcery.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1481047629-7763-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-12-06 20:21:46 +00:00
Stefan Hajnoczi 199a5bde46 * NBD bugfix (Changlong)
* NBD write zeroes support (Eric)
 * Memory backend fixes (Haozhong)
 * Atomics fix (Alex)
 * New AVX512 features (Luwei)
 * "make check" logging fix (Paolo)
 * Chardev refactoring fallout (Paolo)
 * Small checkpatch improvements (Paolo, Jeff)

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

# gpg: Signature made Wed 02 Nov 2016 08:31:11 AM GMT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (30 commits)
  main-loop: Suppress I/O thread warning under qtest
  docs/rcu.txt: Fix minor typo
  vl: exit qemu on guest panic if -no-shutdown is not set
  checkpatch: allow spaces before parenthesis for 'coroutine_fn'
  x86: add AVX512_4VNNIW and AVX512_4FMAPS features
  slirp: fix CharDriver breakage
  qemu-char: do not forward events through the mux until QEMU has started
  nbd: Implement NBD_CMD_WRITE_ZEROES on client
  nbd: Implement NBD_CMD_WRITE_ZEROES on server
  nbd: Improve server handling of shutdown requests
  nbd: Refactor conversion to errno to silence checkpatch
  nbd: Support shorter handshake
  nbd: Less allocation during NBD_OPT_LIST
  nbd: Let client skip portions of server reply
  nbd: Let server know when client gives up negotiation
  nbd: Share common option-sending code in client
  nbd: Send message along with server NBD_REP_ERR errors
  nbd: Share common reply-sending code in server
  nbd: Rename struct nbd_request and nbd_reply
  nbd: Rename NbdClientSession to NBDClientSession
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-03 16:32:30 +00:00
Haozhong Zhang 1775f111ea exec.c: check memory backend file size with 'size' option
If the memory backend file is not large enough to hold the required 'size',
QEMU will report an error and exit.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20161027042300.5929-3-haozhong.zhang@intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20161102010551.2723-1-haozhong.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-02 09:28:51 +01:00
Richard Henderson 1ee73216f4 log: Add locking to large logging blocks
Reuse the existing locking provided by stdio to keep in_asm, cpu,
op, op_opt, op_ind, and out_asm as contiguous blocks.

While it isn't possible to interleave e.g. in_asm or op_opt logs
because of the TB lock protecting all code generation, it is
possible to interleave cpu logs, or to interleave a cpu dump with
an out_asm dump.

For mingw32, we appear to have no viable solution for this.  The locking
functions are not properly exported from the system runtime library.
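
The stdio locking referred to here is the POSIX flockfile()/funlockfile()
pair. A minimal sketch of keeping a multi-line log block contiguous
follows; log_block() is a hypothetical helper written for this example,
not a function from the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/*
 * Hypothetical helper: write several lines as one contiguous block.
 * Holding the FILE lock across the loop stops other threads that use
 * the same stream from interleaving their output with this block.
 * Returns the number of lines written successfully.
 */
static int log_block(FILE *f, const char *const *lines, int count)
{
    int written = 0;

    flockfile(f);
    for (int i = 0; i < count; i++) {
        if (fputs(lines[i], f) != EOF && fputc('\n', f) != EOF) {
            written++;
        }
    }
    funlockfile(f);
    return written;
}
```

The stdio calls inside the locked region take the same lock
recursively, so they remain safe to call while it is held.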

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-11-01 10:29:03 -06:00
Haozhong Zhang d6af99c9f8 exec.c: do not truncate non-empty memory backend file
For '-object memory-backend-file,mem-path=foo,size=xyz', if the size of
file 'foo' does not match the given size 'xyz', the current QEMU will
truncate the file to the given size, which may corrupt the existing data
in that file. To avoid such data corruption, this patch disables
truncating non-empty backend files.
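
A minimal sketch of the policy described above, assuming that "empty"
means a file size of zero: resize the backing file only when it holds no
data yet. size_backing_file() is a hypothetical helper for illustration,
not the code from exec.c.

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: set an empty backing file to the requested size,
 * but leave a non-empty file alone so existing data is never cut off.
 * Returns 0 on success and -1 on error.
 */
static int size_backing_file(int fd, off_t size)
{
    struct stat st;

    if (fstat(fd, &st) < 0) {
        return -1;
    }
    if (st.st_size == 0) {
        return ftruncate(fd, size);     /* new file: safe to size it */
    }
    return 0;                           /* non-empty: do not truncate */
}
```

A caller would still need to compare the existing size with the
requested one separately; this sketch only covers the decision not to
truncate.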

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20161027042300.5929-2-haozhong.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-01 16:06:57 +01:00
Alex Bennée f35e44e764 exec.c: ensure all AddressSpaceDispatch updates under RCU
The memory_dispatch field is meant to be protected by RCU so we should
use the correct primitives when accessing it. This race was flagged up
by the ThreadSanitizer.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20161021153418.21571-1-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-01 16:06:57 +01:00
Alex Bennée ba051fb5e5 tcg: move locking for tb_invalidate_phys_page_range up
In the linux-user case all things that involve 'l1_map' and PageDesc
tweaks are protected by the memory lock (mmap_lock). For SoftMMU mode
we previously relied on single threaded behaviour, with MTTCG we now use
the tb_lock().

As a result we need to do a little re-factoring and push the taking of
this lock up the call tree. This requires a slightly different entry for
the SoftMMU and user-mode cases from tb_invalidate_phys_range.

This also means user-mode breakpoint insertion needs to take two locks
but it hadn't taken any previously so this is an improvement.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20161027151030.20863-20-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 15:00:25 +01:00
KONRAD Frederic a5e998262f tcg: protect translation related stuff with tb_lock.
This protects all translation related work with tb_lock() to ensure
thread safety. This effectively serialises all code generation. In
addition to the code generation we also take the lock for TB
invalidation. This has a knock on effect of meaning tb_lock() is held
for modification of the SoftMMU TLB by non-self threads which will be
used in later patches.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-Id: <1439220437-23957-8-git-send-email-fred.konrad@greensocs.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[AJB: moved into tree, clean-up history]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-10-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 10:51:16 +01:00
Richard Henderson 258dfaaad0 exec: Avoid direct references to Int128 parts
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Peter Maydell c43e853afe x86 and CPU queue, 2016-10-24
x2APIC support to APIC code, cpu_exec_init() refactor on all
 architectures, and other x86 changes.

Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging

# gpg: Signature made Mon 24 Oct 2016 20:51:14 BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-pull-request:
  exec: call cpu_exec_exit() from a CPU unrealize common function
  exec: move cpu_exec_init() calls to realize functions
  exec: split cpu_exec_init()
  pc: q35: Bump max_cpus to 288
  pc: Require IRQ remapping and EIM if there could be x2APIC CPUs
  pc: Add 'etc/boot-cpus' fw_cfg file for machine with more than 255 CPUs
  Increase MAX_CPUMASK_BITS from 255 to 288
  pc: Clarify FW_CFG_MAX_CPUS usage comment
  pc: kvm_apic: Pass APIC ID depending on xAPIC/x2APIC mode
  pc: apic_common: Reset APIC ID to initial ID when switching into x2APIC mode
  pc: apic_common: Restore APIC ID to initial ID on reset
  pc: apic_common: Extend APIC ID property to 32bit
  pc: Leave max apic_id_limit only in legacy cpu hotplug code
  acpi: cphp: Force switch to modern cpu hotplug if APIC ID > 254
  pc: acpi: x2APIC support for SRAT table
  pc: acpi: x2APIC support for MADT table and _MAT method

Conflicts:
	target-arm/cpu.c

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-25 10:25:27 +01:00
Laurent Vivier 7bbc124e7e exec: call cpu_exec_exit() from a CPU unrealize common function
As cpu_exec_exit() mirrors the cpu_exec_realizefn(),
rename it as cpu_exec_unrealizefn().

Create and register a cpu_common_unrealizefn() function for
the CPU device class and call cpu_exec_unrealizefn() from
this function.

Remove cpu_exec_exit() from cpu_common_finalize()
(which mirrors init, not realize), and as x86_cpu_unrealizefn()
and ppc_cpu_unrealizefn() overwrite the device class unrealize function,
add a call to a parent_unrealize pointer.
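
The parent_unrealize chaining described above can be sketched roughly as follows. This is a minimal standalone model, not QEMU code: the types `DeviceClass`, `CPUClassSketch` and `Device`, and every function ending in `_sketch`, are hypothetical stand-ins for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the device and CPU classes. */
typedef struct Device Device;
typedef void (*UnrealizeFn)(Device *dev);

typedef struct DeviceClass {
    UnrealizeFn unrealize;          /* hook invoked when a device is unrealized */
} DeviceClass;

typedef struct CPUClassSketch {
    DeviceClass parent;
    UnrealizeFn parent_unrealize;   /* saved copy of the hook being overridden */
} CPUClassSketch;

struct Device {
    int exec_registered;            /* set at realize, cleared by the common unrealize */
    int target_state;               /* target-specific resource */
};

static CPUClassSketch target_class;

/* Common unrealize: undoes what the common realize step did. */
static void cpu_common_unrealize_sketch(Device *dev)
{
    dev->exec_registered = 0;
}

/* Target unrealize: releases target state, then chains to the saved parent hook. */
static void target_cpu_unrealize_sketch(Device *dev)
{
    dev->target_state = 0;
    target_class.parent_unrealize(dev);
}

/* Class init: save the inherited hook before overwriting it. */
static void target_class_init_sketch(void)
{
    target_class.parent.unrealize = cpu_common_unrealize_sketch;  /* inherited */
    target_class.parent_unrealize = target_class.parent.unrealize;
    target_class.parent.unrealize = target_cpu_unrealize_sketch;
}
```

Without the saved pointer, a target that overwrites the class hook would silently skip the common unrealize step.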

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:16 -02:00
Laurent Vivier ce5b1bbf62 exec: move cpu_exec_init() calls to realize functions
Modify all CPUs to call it from XXX_cpu_realizefn() function.

Remove all the cannot_destroy_with_object_finalize_yet as
unsafe references have been moved to cpu_exec_realizefn().
(tested with QOM command provided by commit 4c315c27)

for arm:

Setting of cpu->mp_affinity is moved from arm_cpu_initfn()
to arm_cpu_realizefn() as setting of cpu_index is now done
in cpu_exec_realizefn(). To avoid overwriting a user-defined
value, we set it to an invalid value by default, and update
it in the realize function only if the value is still invalid.
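
A minimal sketch of the "invalid by default, fill in at realize" pattern described for arm. The struct, the sentinel `MP_AFFINITY_INVALID_SKETCH` and the derivation of the affinity from cpu_index are all assumptions made for illustration; the real derivation is not shown in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "the user did not set a value". */
#define MP_AFFINITY_INVALID_SKETCH UINT64_MAX

typedef struct {
    int cpu_index;
    uint64_t mp_affinity;
} ArmCpuSketch;

/* Init: cpu_index is not known yet, so only mark the affinity as unset. */
static void arm_cpu_init_sketch(ArmCpuSketch *cpu)
{
    cpu->cpu_index = -1;
    cpu->mp_affinity = MP_AFFINITY_INVALID_SKETCH;
}

/* Realize: cpu_index is now known; fill in the affinity only if still unset. */
static void arm_cpu_realize_sketch(ArmCpuSketch *cpu, int cpu_index)
{
    cpu->cpu_index = cpu_index;
    if (cpu->mp_affinity == MP_AFFINITY_INVALID_SKETCH) {
        /* Placeholder derivation, not the real mapping. */
        cpu->mp_affinity = (uint64_t)cpu_index;
    }
}
```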

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:16 -02:00
Laurent Vivier 39e329e341 exec: split cpu_exec_init()
Put in cpu_exec_initfn() what initializes the CPU,
and leave in cpu_exec_init() what adds it to the environment.

As cpu_exec_initfn() is called by all XX_cpu_initfn(), call it
directly in cpu_common_initfn().
cpu_exec_init() is now a realize function, it will be renamed
to cpu_exec_realizefn() and moved to the XX_cpu_realizefn()
function in a following patch.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:16 -02:00
Peter Maydell 20bccb82ff cpu: Support a target CPU having a variable page size
Support target CPUs having a page size which isn't known
at compile time. To use this, the CPU implementation should:
 * define TARGET_PAGE_BITS_VARY
 * not define TARGET_PAGE_BITS
 * define TARGET_PAGE_BITS_MIN to the smallest value it
   might possibly want for TARGET_PAGE_BITS
 * call set_preferred_target_page_bits() in its realize
   function to indicate the actual preferred target page
   size for the CPU (and report any error from it)

In CONFIG_USER_ONLY, the CPU implementation should continue
to define TARGET_PAGE_BITS appropriately for the guest
OS page size.

Machines which want to take advantage of having the page
size something larger than TARGET_PAGE_BITS_MIN must
set the MachineClass minimum_page_bits field to a value
which they guarantee will be no greater than the preferred
page size for any CPU they create.

Note that changing the target page size by setting
minimum_page_bits is a migration compatibility break
for that machine.

For debugging purposes, attempts to use TARGET_PAGE_SIZE
before it has been finally confirmed will assert.
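
The negotiation described above might be modelled like this. It is a simplified standalone sketch under stated assumptions, not the actual implementation: all names ending in `_sketch` and the minimum of 10 bits are hypothetical, and the MachineClass minimum_page_bits input is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_BITS_MIN_SKETCH 10          /* hypothetical smallest allowed value */

static int page_bits_sketch;             /* 0 means nothing recorded yet */
static bool page_bits_decided_sketch;    /* true once the size is final */

/* Called from each CPU's realize step; keeps the smallest preferred value.
 * Returns false if the size is already final and would have to shrink. */
static bool set_preferred_page_bits_sketch(int bits)
{
    assert(bits >= PAGE_BITS_MIN_SKETCH);
    if (page_bits_sketch == 0 || bits < page_bits_sketch) {
        if (page_bits_decided_sketch) {
            return false;
        }
        page_bits_sketch = bits;
    }
    return true;
}

/* Called once all CPUs have been created; the value is fixed from here on. */
static void finalize_page_bits_sketch(void)
{
    if (page_bits_sketch == 0) {
        page_bits_sketch = PAGE_BITS_MIN_SKETCH;
    }
    page_bits_decided_sketch = true;
}

/* Using the page size before it is final is a bug, so assert. */
static long page_size_sketch(void)
{
    assert(page_bits_decided_sketch);
    return 1L << page_bits_sketch;
}
```

A CPU realized after finalization can still succeed if its preferred size is not smaller than the decided one; only a request to shrink the page size is reported as an error.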

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-24 16:26:49 +01:00
Vijaya Kumar K 2615fabd42 exec.c: Remove static allocation of sub_section of sub_page
Allocate sub_section dynamically. Remove dependency
on TARGET_PAGE_SIZE to make run-time page size detection
for arm platforms.
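
As an illustration of sizing the table at run time, here is a minimal sketch using a C99 flexible array member, the approach mentioned in the maintainer note on this commit. The struct layout and the names `SubpageSketch` and `subpage_new_sketch` are assumptions, not the real exec.c definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint64_t base;
    size_t nb_entries;
    uint16_t sub_section[];   /* flexible array member, sized at allocation time */
} SubpageSketch;

/* Allocate header and table in one block, with one entry per byte of the page. */
static SubpageSketch *subpage_new_sketch(uint64_t base, size_t page_size)
{
    SubpageSketch *sp = calloc(1, sizeof(*sp) +
                                  page_size * sizeof(sp->sub_section[0]));
    if (sp) {
        sp->base = base;
        sp->nb_entries = page_size;
    }
    return sp;
}
```

Because the table lives in the same allocation as the header, reading an entry needs no extra pointer dereference, and a single free() releases everything.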

Signed-off-by: Vijaya Kumar K <vijayak@cavium.com>
Message-id: 1465808915-4887-3-git-send-email-vijayak@caviumnetworks.com
[PMM: use flexible array member rather than separate malloc
 so we don't need an extra pointer deref when using it]
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-24 16:26:49 +01:00
Haozhong Zhang 8360668e69 exec.c: workaround regression caused by alignment change in d2f39ad
Commit d2f39ad "exec.c: Ensure right alignment also for file backed ram"
added an additional alignment requirement on the size of backend file
besides the previous page size. On x86, the alignment is changed from
4KB in QEMU 2.6 to 2MB in QEMU 2.7.

This change breaks certain usages in QEMU 2.7 on x86, e.g.
    -object memory-backend-file,id=mem1,mem-path=/tmp/,size=$SZ
    -device pc-dimm,id=dimm1,memdev=mem1
where $SZ is a multiple of 4KB but not of 2MB (e.g. 1023M). QEMU 2.7
reports the following error message and aborts:
qemu-system-x86_64: -device pc-dimm,memdev=mem1,id=nv1: backend memory size must be multiple of 0x200000

The same regression may also happen on other platforms, as indicated by
Igor Mammedov. This change is however necessary for s390 according to
the commit message of d2f39ad, so we work around the regression by applying
the change only on s390.
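
The arithmetic behind the failing case can be checked with a small sketch. The helper and macro names are hypothetical; the check itself is the usual power-of-two alignment test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KIB_SKETCH UINT64_C(1024)
#define MIB_SKETCH (1024 * KIB_SKETCH)

/* True if size is a multiple of align; align must be a power of two. */
static bool size_is_aligned_sketch(uint64_t size, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size & (align - 1)) == 0;
}
```

With these definitions, 1023 MiB passes the 4 KiB check but fails the 2 MiB check, because 1023 is odd and leaves a remainder of 1 MiB.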

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reported-by: "Xu, Anthony" <anthony.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:46:11 +02:00
Dr. David Alan Gilbert 863e9621c5 RAMBlocks: Store page size
Store the page size in each RAMBlock, we need it later.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:23:53 +02:00
Marc-André Lureau efee678d6d exec: remove unused compacted argument
Since commit b35ba30f8f when it was introduced, phys_page_compact()
takes an unused compacted argument.

ubsan complains about it when launching qemu-x86_64 without arguments:
qemu/exec.c:310:5: runtime error: variable length array bound evaluates to non-positive value 0

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-10-08 11:25:29 +03:00
Paolo Bonzini 267f685b8b cpus-common: move CPU list management to common code
Add a mutex for the CPU list to system emulation, as it will be used to
manage safe work.  Abstract manipulation of the CPU list in new functions
cpu_list_add and cpu_list_remove.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 11:57:29 +02:00
Cao jin c2cd627ddb kvm-all: drop kvm_setup_guest_memory
kvm_setup_guest_memory only does "madvise to QEMU_MADV_DONTFORK" and
is only called by ram_block_add, which actually is duplicate code.
Bonus: add simple comment for kvm_has_sync_mmu to make life easier.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Message-Id: <1473662096-32598-1-git-send-email-caoj.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-13 19:09:43 +02:00
Igor Mammedov 3b8c1761f0 qtail: clean up direct access to tqe_prev field
Instead of accessing the tqe_prev field directly outside
of queue.h, use macros to check whether an element is in a list,
and make sure that after an element is removed from a list
its tqe_prev field can still be used for the same check.
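
To illustrate the idea, here is a hand-rolled tail queue in which the back pointer doubles as an "is linked" flag. This is a sketch with hypothetical names (`Node`, `List`, `node_in_list` and so on), not the queue.h macros themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Node Node;
struct Node {
    Node *next;
    Node **prev;   /* address of the pointer that points at us; NULL when unlinked */
};

typedef struct {
    Node *first;
    Node **last;   /* address of the last element's next pointer */
} List;

static void list_init(List *l)
{
    l->first = NULL;
    l->last = &l->first;
}

static void list_insert_tail(List *l, Node *n)
{
    n->next = NULL;
    n->prev = l->last;
    *l->last = n;
    l->last = &n->next;
}

static void list_remove(List *l, Node *n)
{
    if (n->next) {
        n->next->prev = n->prev;
    } else {
        l->last = n->prev;
    }
    *n->prev = n->next;
    n->prev = NULL;   /* clearing this is what keeps the in-list check valid */
}

/* The check callers use instead of reading the field directly. */
static bool node_in_list(const Node *n)
{
    return n->prev != NULL;
}
```

The check only works for elements that start out zero-initialized and are removed through list_remove(), which is why removal has to reset the back pointer.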

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1469450832-84343-1-git-send-email-imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-13 19:08:41 +02:00
Igor Mammedov 630eb0faf4 exec: Ensure the only one cpu_index allocation method is used
Make sure that cpu_index auto allocation isn't used in
combination with manual cpu_index assignment. And
disallow out-of-order CPU removal if auto allocation
is in use.

A target that wishes to support out-of-order unplug should
switch to manual cpu_index assignment. The following patch
can be used as an example:
 (pc: init CPUState->cpu_index with index in possible_cpus[])
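
The two rules above can be sketched as a small model. It returns false where real code might assert, and every name in it is hypothetical: the fixed-size array stands in for the CPU list, and `UNASSIGNED_INDEX_SKETCH` for whatever sentinel marks an unset cpu_index.

```c
#include <assert.h>
#include <stdbool.h>

#define UNASSIGNED_INDEX_SKETCH (-1)   /* hypothetical "not set" marker */
#define MAX_CPUS_SKETCH 8

typedef struct {
    int cpu_index;
} CpuSketch;

static CpuSketch *cpu_list_sketch[MAX_CPUS_SKETCH];
static int cpu_count_sketch;
static bool index_auto_assigned_sketch;

/* Add a CPU; refuse a manually indexed CPU once auto allocation was used. */
static bool cpu_list_add_sketch(CpuSketch *cpu)
{
    if (cpu_count_sketch == MAX_CPUS_SKETCH) {
        return false;
    }
    if (cpu->cpu_index == UNASSIGNED_INDEX_SKETCH) {
        index_auto_assigned_sketch = true;
        cpu->cpu_index = cpu_count_sketch;
    } else if (index_auto_assigned_sketch) {
        return false;
    }
    cpu_list_sketch[cpu_count_sketch++] = cpu;
    return true;
}

/* Remove a CPU; with auto allocation only the most recently added one may go. */
static bool cpu_list_remove_sketch(CpuSketch *cpu)
{
    if (cpu_count_sketch == 0) {
        return false;
    }
    if (index_auto_assigned_sketch) {
        if (cpu_list_sketch[cpu_count_sketch - 1] != cpu) {
            return false;
        }
        cpu_count_sketch--;
        return true;
    }
    for (int i = 0; i < cpu_count_sketch; i++) {
        if (cpu_list_sketch[i] == cpu) {
            /* Close the gap left by the removed entry. */
            for (int j = i; j < cpu_count_sketch - 1; j++) {
                cpu_list_sketch[j] = cpu_list_sketch[j + 1];
            }
            cpu_count_sketch--;
            return true;
        }
    }
    return false;
}
```

Auto allocation here derives the index from the current CPU count, so removing anything but the last CPU would let a later addition reuse an index that is still in use; that is the reason for the ordering restriction.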

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-05 16:01:55 -03:00
Igor Mammedov 056b68af77 fix qemu exit on memory hotplug when allocation fails at prealloc time
When adding hostmem backend at runtime, QEMU might exit with error:
  "os_mem_prealloc: Insufficient free host memory pages available to allocate guest RAM"

It happens due to os_mem_prealloc() not handling errors gracefully.

Fix it by passing errp argument so that os_mem_prealloc() could
report error to callers and undo performed allocation when
os_mem_prealloc() fails.
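
The shape of the fix, reporting the failure to the caller and undoing the allocation, might look like the following sketch. It does not use the real os_mem_prealloc() signature or QEMU's Error API; `ErrorSketch`, `prealloc_sketch` and `backend_alloc_sketch` are hypothetical, and the `host_free` parameter merely simulates how much memory the host can provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    const char *msg;   /* NULL while no error has been reported */
} ErrorSketch;

/* Pretend to touch every page; fail instead of exiting the process. */
static bool prealloc_sketch(void *area, size_t size, size_t host_free,
                            ErrorSketch *err)
{
    (void)area;
    if (size > host_free) {
        err->msg = "insufficient free host memory pages";
        return false;
    }
    return true;
}

/* Allocate backend memory; on preallocation failure undo the allocation. */
static void *backend_alloc_sketch(size_t size, size_t host_free,
                                  ErrorSketch *err)
{
    void *area = malloc(size);
    if (!area) {
        err->msg = "allocation failed";
        return NULL;
    }
    if (!prealloc_sketch(area, size, host_free, err)) {
        free(area);   /* roll back so a failed hotplug leaks nothing */
        return NULL;
    }
    return area;
}
```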

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1469008443-72059-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-08-02 12:03:58 +02:00
Igor Mammedov a07f953ef4 exec: Set cpu_index only if it's not been explicitly set
It keeps the legacy behavior for all users that don't care
about a stable cpu_index value, but allows boards that
support device_add/device_del to set a stable cpu_index
that won't depend on the order in which CPUs are created/destroyed.

While at it, simplify cpu_get_free_index(), as the cpu_index
generated by the USER_ONLY and softmmu variants is the same,
since none of the users support cpu-remove so far, except
for the not yet released spapr/x86 device_add/del, which
will be altered by follow-up patches to set a stable
cpu_index manually.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-07-26 15:32:01 -03:00
Igor Mammedov 8b1b835035 exec: Don't use cpu_index to detect if cpu_exec_init()'s been called
Instead, use QTAIL's tqe_prev field, which is always set while a
QTAIL element is in a list, to detect whether the CPU has been
placed in the list by cpu_exec_init().

Fixes a SIGSEGV on the failure path in case cpu_index is assigned
by the board and cpu.realize() fails before cpu_exec_init() is called.

In follow-up patches, cpu_index will be assigned by boards that
support CPU hot(un)plug and need a stable cpu_index that doesn't
depend on the order in which CPUs are created/removed.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reported-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-07-26 15:32:00 -03:00
Igor Mammedov 1bc7e522d9 exec: Reduce CONFIG_USER_ONLY ifdeffenery
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-07-26 15:31:58 -03:00