This region is exported at /machine/smram. It is "empty" if
SMRAME=0 and points to SMRAM if SMRAME=1. The CPU will
enable/disable it as it enters or exits SMRAM.
While touching nearby code, the existing memory region setup was
slightly inconsistent. The smram_region is *disabled* in order to open
SMRAM (because the smram_region shows the low VRAM instead of the RAM
at 0xa0000). Because SMRAM is closed at startup, the smram_region must
be enabled when creating the i440fx or q35 devices.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Different CPUs can be in SMM or not at the same time, thus they
will see different things where the chipset places SMRAM.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-global does not work for drivers that have a dot in their name, such as
cfi.pflash01. This is just a parsing limitation, because such globals
can be declared easily inside a -readconfig file.
To allow this usage, support the full QemuOpts key/value syntax for -global
too, for example "-global driver=cfi.pflash01,property=secure,value=on".
The two formats do not conflict, because the key/value syntax does not have
a period before the first equal sign.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When this property is set, MMIO accesses are only allowed with the
MEMTXATTRS_SECURE attribute. This is used for secure access to UEFI
variables stored in flash.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
An SMI should definitely wake up a processor in halted state!
This lets OVMF boot with SMM on multiprocessor systems, although
it halts very soon after that with a "CpuIndex != BspIndex"
assertion failure.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because the limit field's bits 31:20 is 1, G should be 1.
VMX actually enforces this, let's do it for completeness
in QEMU as well.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU is not blocking NMIs on entry to SMM. Implementing this has to
cover a few corner cases, because:
- NMIs can then be enabled by an IRET instruction and there
is no mechanism to _set_ the "NMIs masked" flag on exit from SMM:
"A special case can occur if an SMI handler nests inside an NMI handler
and then another NMI occurs. [...] When the processor enters SMM while
executing an NMI handler, the processor saves the SMRAM state save map
but does not save the attribute to keep NMI interrupts disabled.
- However, there is some hidden state, because "If NMIs were blocked
before the SMI occurred [and no IRET is executed while in SMM], they
are blocked after execution of RSM." This is represented by the new
HF2_SMM_INSIDE_NMI_MASK bit. If it is zero, NMIs are _unblocked_
on exit from RSM.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In order to do this, stop using the cpu_in*/out* helpers, and instead
access address_space_io directly.
cpu_in* and cpu_out* remain for usage in the monitor, in qtest, and
in Xen.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These include page table walks, SVM accesses and SMM state save accesses.
The bulk of the patch is obtained with
sed -i 's/\(\<[a-z_]*_phys\(_notdirty\)\?\>(cs\)->as,/x86_\1,/'
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While qemu is running in sleep=no mode, a warning will be printed
when no timer deadline is set.
As this mode is intended for getting deterministic virtual time, if no
timer is set on the virtual clock this determinism is broken.
Signed-off-by: Victor CLEMENT <victor.clement@openwide.fr>
Message-Id: <1432912446-9811-4-git-send-email-victor.clement@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The 'sleep' parameter sets the icount_sleep mode, which is enabled by
default. To disable it, add the 'sleep=no' parameter (or 'nosleep') to the
qemu -icount option.
Signed-off-by: Victor CLEMENT <victor.clement@openwide.fr>
Message-Id: <1432912446-9811-3-git-send-email-victor.clement@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When the icount_sleep mode is disabled, the QEMU_VIRTUAL_CLOCK runs at the
maximum possible speed by warping the sleep times of the virtual cpu to the
soonest clock deadline. The virtual clock will be updated only according
the instruction counter.
Signed-off-by: Victor CLEMENT <victor.clement@openwide.fr>
Message-Id: <1432912446-9811-2-git-send-email-victor.clement@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
mr->terminates alone doesn't guarantee that we are looking at a RAM region.
mr->ram_addr also has to be checked, in order to distinguish RAM and I/O
regions.
So, do the following:
1) add a new define RAM_ADDR_INVALID, and test it in the assertions
instead of mr->terminates
2) IOMMU regions were not setting mr->ram_addr to a bogus value, initialize
it in the instance_init function so that the new assertions would fire
for IOMMU regions as well.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The fast path of cpu_physical_memory_sync_dirty_bitmap() directly
manipulates the dirty bitmap. Use atomic_xchg() to make the
test-and-clear atomic.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-7-git-send-email-stefanha@redhat.com>
[Only do xchg on nonzero words. - Paolo]
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The cpu_physical_memory_reset_dirty() function is sometimes used
together with cpu_physical_memory_get_dirty(). This is not atomic since
two separate accesses to the dirty memory bitmap are made.
Turn cpu_physical_memory_reset_dirty() and
cpu_physical_memory_clear_dirty_range_type() into the atomic
cpu_physical_memory_test_and_clear_dirty().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-6-git-send-email-stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The dirty memory bitmap is managed by ram_addr.h and copied to
migration_bitmap[] periodically during live migration.
Move the code to sync the bitmap to ram_addr.h where related code lives.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-5-git-send-email-stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use set_bit_atomic() and bitmap_set_atomic() so that multiple threads
can dirty memory without race conditions.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-4-git-send-email-stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new bitmap_test_and_clear_atomic() function clears a range and
returns whether or not the bits were set.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-3-git-send-email-stefanha@redhat.com>
[Test before xchg; then a full barrier is needed at the end just like
in the previous patch. The barrier can be avoided if we did at least
one xchg. - Paolo]
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use atomic_or() for atomic bitmaps where several threads may set bits at
the same time. This avoids the race condition between threads loading
an element, bitwise ORing, and then storing the element.
When setting all bits in a word we can avoid atomic ops and instead just
use an smp_mb() at the end.
Most bitmap users don't need atomicity so introduce new functions.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-2-git-send-email-stefanha@redhat.com>
[Avoid barrier in the single word case, use full barrier instead of write.
- Paolo]
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
cpu_physical_memory_set_dirty_lebitmap unconditionally syncs the
DIRTY_MEMORY_CODE bitmap. This however is unused unless TCG is
enabled.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Most of the time, not all bitmaps have to be marked as dirty;
do not do anything if the interesting ones are already dirty.
Previously, any clean bitmap would have cause all the bitmaps to be
marked dirty.
In fact, unless running TCG most of the time bitmap operations need
not be done at all, because memory_region_is_logging returns zero.
In this case, skip the call to cpu_physical_memory_range_includes_clean
altogether as well.
With this patch, cpu_physical_memory_set_dirty_range is called
unconditionally, so there need not be anymore a separate call to
xen_modified_memory.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While it is obvious that cpu_physical_memory_get_dirty returns true even if
a single page is dirty, the same is not true for cpu_physical_memory_get_clean;
one would expect that it returns true only if all the pages are clean, but
it actually looks for even one clean page. (By contrast, the caller of that
function, cpu_physical_memory_range_includes_clean, has a good name).
To clarify, rename the function to cpu_physical_memory_all_dirty and return
true if _all_ the pages are dirty. This is the opposite of the previous
meaning, because "all are 1" is the same as "not (any is 0)", so we have to
modify cpu_physical_memory_range_includes_clean as well.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This cuts in half the cost of bitmap operations (which will become more
expensive when made atomic) during migration on non-VRAM regions.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
is_cpu_write_access is only set if tb_invalidate_phys_page_range is called
from tb_invalidate_phys_page_fast, and hence from notdirty_mem_write.
However:
- the code bitmap can be built directly in tb_invalidate_phys_page_fast
(unconditionally, since is_cpu_write_access would always be passed as 1);
- the virtual address is not needed to mark the page as "not containing
code" (dirty code bitmap = 1), so we can also remove that use of
is_cpu_write_access. For calls of tb_invalidate_phys_page_range
that do not come from notdirty_mem_write, the next call to
notdirty_mem_write will notice that the page does not contain code
anymore, and will fix up the TLB entry.
The parameter needs to remain in order to guard accesses to cpu->mem_io_pc.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These days modification of the TLB is done in notdirty_mem_write,
so the virtual address and env pointer as unnecessary.
The new name of the function, tlb_unprotect_code, is consistent with
tlb_protect_code.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove them from the sundry exec-all.h header, since they are only used by
the TCG runtime in exec.c and user-exec.c.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The memory API can now return the exact set of bitmaps that have to
be tracked. Use it instead of the in_migration variable.
In the next patches, we will also use it to set only DIRTY_MEMORY_VGA
or DIRTY_MEMORY_MIGRATION if necessary. This can make a difference
for dataplane, especially after the dirty bitmap is changed to use
more expensive atomic operations.
Of some interest is the change to stl_phys_notdirty. When migration
was introduced, stl_phys_notdirty was changed to effectively behave
as stl_phys during migration. In fact, if one looks at the function as it
was in the beginning (commit 8df1cd0, physical memory access functions,
2005-01-28), at the time the dirty bitmap was the equivalent of
DIRTY_MEMORY_CODE nowadays; hence, the function simply should not touch
the dirty code bits. This patch changes it to do the intended thing.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Invoke xen_modified_memory from cpu_physical_memory_set_dirty_range_nocode;
it is akin to DIRTY_MEMORY_MIGRATION, so set it together with that bitmap.
The remaining call from invalidate_and_set_dirty's "else" branch will go
away soon.
Second, fix the second argument to the function in the
cpu_physical_memory_set_dirty_lebitmap call site. That function is only used
by KVM, but it is better to be clean anyway.
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
One recent example is commit 4cc856f (kvm-all: Sync dirty-bitmap from
kvm before kvm destroy the corresponding dirty_bitmap, 2015-04-02).
Another performance problem is that KVM keeps tracking dirty pages
after a failed live migration, which causes bad performance due to
disallowing huge page mapping.
Thanks to the previous patch, KVM can now stop hooking into
log_global_start/stop. This simplifies the KVM code noticeably.
Reported-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Reported-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The separate handling of DIRTY_MEMORY_MIGRATION, which does not
call log_start/log_stop callbacks when it changes in a region's
dirty logging mask, has caused several bugs.
One recent example is commit 4cc856f (kvm-all: Sync dirty-bitmap from
kvm before kvm destroy the corresponding dirty_bitmap, 2015-04-02).
Another performance problem is that KVM keeps tracking dirty pages
after a failed live migration, which causes bad performance due to
disallowing huge page mapping.
This patch removes the root cause of the problem by reporting
DIRTY_MEMORY_MIGRATION changes via log_start and log_stop.
Note that we now have to rebuild the FlatView when global dirty
logging is enabled or disabled; this ensures that log_start and
log_stop callbacks are invoked.
This will also be used to make the setting of bitmaps conditional.
In general, this patch lets users of the memory API ignore the
global state of dirty logging if they handle dirty logging
generically per region.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is okay if memory is not mapped into the guest but has dirty logging
enabled. When this happens, KVM will not do anything and only accesses
from the host will be logged.
This can be triggered by iofuzz.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
DIRTY_MEMORY_CODE is only needed for TCG. By adding it directly to
mr->dirty_log_mask, we avoid testing for TCG everywhere a region is
checked for the enabled/disabled state of dirty logging.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
dpy_gfx_update_dirty expects DIRTY_MEMORY_VGA logging to be always on,
but that will not be the case soon. Because it computes the memory
region on the fly for every update (with memory_region_find), it cannot
enable/disable logging by itself.
We could always treat updates as invalidations if dirty logging is
not enabled, assuming that the board will enable logging on the
RAM region that includes the framebuffer.
However, the function is unused, so just drop it.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
framebuffer.c expects DIRTY_MEMORY_VGA logging to be always on, but that
will not be the case soon. Because framebuffer.c computes the memory
region on the fly for every update (with memory_region_find), it cannot
enable/disable logging by itself.
Instead, always treat updates as invalidations if dirty logging is
not enabled, assuming that the board will enable logging on the
RAM region that includes the framebuffer.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When the dirty log mask will also cover other bits than DIRTY_MEMORY_VGA,
some listeners may be interested in the overall zero/non-zero value of
the dirty log mask; others may be interested in the value of single bits.
For this reason, always call log_start/log_stop if bits have respectively
appeared or disappeared, and pass the old and new values of the dirty log
mask so that listeners can distinguish the kinds of change.
For example, KVM checks if dirty logging used to be completely disabled
(in log_start) or is now completely disabled (in log_stop). On the
other hand, Xen has to check manually if DIRTY_MEMORY_VGA changed,
since that is the only bit it cares about.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For now memory regions only track DIRTY_MEMORY_VGA individually, but
this will change soon. To support this, split memory_region_is_logging
in two functions: one that returns a given bit from dirty_log_mask,
and one that returns the entire mask. memory_region_is_logging gets an
extra parameter so that the compiler flags misuse.
While VGA-specific users (including the Xen listener!) will want to keep
checking that bit, KVM and vhost check for "any bit except migration"
(because migration is handled via the global start/stop listener
callbacks).
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are strictly speaking only needed for KVM and Xen, but it's still
nice to be consistent.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will be required soon by the memory core.
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Coalescing work on MMIO, not RAM, thus this call has no effect.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
DIRTY_MEMORY_MIGRATION is triggered by memory_global_dirty_log_start
and memory_global_dirty_log_stop, so it cannot be used with
memory_region_set_log.
Specify this in the documentation and assert it.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
make can be invoked in the individual build dirs to build an individual
target or just a single file of a target. e.g.
touch translate-all.c
make -C microblazeel-softmmu translate-all.o
There is however a small bug when using the pixman submodule.
config-host.mak will ref BUILD_DIR for the pixman -I CFLAGS:
grep BUILD_DIR config-host.mak
QEMU_CFLAGS=-I$(SRC_PATH)/pixman/pixman -I$(BUILD_DIR)/pixman/pixman ...
This causes a build failure as -I/pixman/pixman (BUILD_DIR=="") will
not be found.
BUILD_DIR is usually set by the top level Makefile. Just lazy-set it in
Makefile.target to the parent directory.
Granted, this will not work if the pixman submodule is not prebuilt,
but it at least means you can do incremental partial builds once you
have done your initial full build (or attempt) from the top level.
The next step would be refactor make infrastructure to rebuild pixman
on a submake like the one above.
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <1432618686-16077-1-git-send-email-crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
phys_page_set_level is writing zeroes to a struct that has just been
filled in by phys_map_node_alloc. Instead, tell phys_map_node_alloc
whether to fill in the page "as a leaf" or "as a non-leaf".
memcpy is faster than struct assignment, which copies each bitfield
individually. A compiler bug (https://gcc.gnu.org/PR66391), and
small memcpys like this one are special-cased anyway, and optimized
to a register move, so just use the memcpy.
This cuts the cost of phys_page_set_level from 25% to 5% when
booting qboot.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Achieved by:
- Remembering the server fd with a global variable, in order to access
it from nbd_client_closed.
- Checking nbd_can_accept() and updating server_fd handler whenever
client connects or disconnects.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1432032670-15124-3-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On POWER8 systems, KVM checks if VCPU is running on primary threads,
and that secondary threads are offline. If this is not the case,
ioctl() fails with errno set to EBUSY.
QEMU aborts with a non explicit error message:
$ ./qemu-system-ppc64 --nographic -machine pseries,accel=kvm
error: kvm run failed Device or resource busy
To help user to diagnose the problem, this patch adds an informative
error message.
There is no easy way to check if SMT is enabled before starting the VCPU,
and as this case is the only one setting errno to EBUSY, we just check
the errno value to display a message.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <1431976007-20503-1-git-send-email-lvivier@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>