Commit Graph

408 Commits

Author SHA1 Message Date
Richard Henderson 127b2b0863 target/arm: Rename ARMMMUIdx*_S1E3 to ARMMMUIdx*_SE3
This is part of a reorganization to the set of mmu_idx.
The EL3 regime only has a single stage translation, and
is always secure.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-07 14:04:23 +00:00
Richard Henderson fba37aedec target/arm: Rename ARMMMUIdx_S1SE[01] to ARMMMUIdx_SE10_[01]
This is part of a reorganization to the set of mmu_idx.
This emphasizes that they apply to the Secure EL1&0 regime.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-07 14:04:23 +00:00
Richard Henderson 2859d7b590 target/arm: Rename ARMMMUIdx_S1NSE* to ARMMMUIdx_Stage1_E*
This is part of a reorganization to the set of mmu_idx.
The EL1&0 regime is the only one that uses 2-stage translation.
Spelling out Stage avoids confusion with Secure.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-07 14:04:22 +00:00
Richard Henderson 97fa935001 target/arm: Rename ARMMMUIdx_S2NS to ARMMMUIdx_Stage2
The EL1&0 regime is the only one that uses 2-stage translation.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-07 14:04:22 +00:00
Richard Henderson 01b98b6864 target/arm: Rename ARMMMUIdx*_S12NSE* to ARMMMUIdx*_E10_*
This is part of a reorganization to the set of mmu_idx.
This emphasizes that they apply to the EL1&0 regime.

The ultimate goal is

 -- Non-secure regimes:
    ARMMMUIdx_E10_0,
    ARMMMUIdx_E20_0,
    ARMMMUIdx_E10_1,
    ARMMMUIdx_E2,
    ARMMMUIdx_E20_2,

 -- Secure regimes:
    ARMMMUIdx_SE10_0,
    ARMMMUIdx_SE10_1,
    ARMMMUIdx_SE3,

 -- Helper mmu_idx for non-secure EL1&0 stage1 and stage2
    ARMMMUIdx_Stage2,
    ARMMMUIdx_Stage1_E0,
    ARMMMUIdx_Stage1_E1,

The 'S' prefix is reserved for "Secure".  Unless otherwise specified,
each mmu_idx represents all stages of translation.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-07 14:04:22 +00:00
Richard Henderson 527db2be8b target/arm: Simplify tlb_force_broadcast alternatives
Rather than call to a separate function and re-compute any
parameters for the flush, simply use the correct flush
function directly.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-07 14:04:22 +00:00
Richard Henderson 90c19cdf1d target/arm: Split out alle1_tlbmask
No functional change, but unify code sequences.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-07 14:04:22 +00:00
Richard Henderson b7e0730de3 target/arm: Split out vae1_tlbmask
No functional change, but unify code sequences.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-07 14:04:22 +00:00
Richard Henderson 53d1f85608 target/arm: Update CNTVCT_EL0 for VHE
The virtual offset may be 0 depending on EL, E2H and TGE.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-07 14:04:21 +00:00
Richard Henderson ed30da8eee target/arm: Add TTBR1_EL2
At the same time, add writefn to TTBR0_EL2 and TCR_EL2.
A later patch will update any ASID therein.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-07 14:04:21 +00:00
Richard Henderson e2a1a4616c target/arm: Add CONTEXTIDR_EL2
Not all of the breakpoint types are supported, but those that
only examine contextidr are extended to support the new register.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-07 14:04:21 +00:00
Richard Henderson 03c76131bc target/arm: Enable HCR_E2H for VHE
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-07 14:04:21 +00:00
Alex Bennée 4ff5ef9e91 target/arm: only update pc after semihosting completes
Before we introduce blocking semihosting calls we need to ensure we
can restart the system on semi hosting exception. To be able to do
this the EXCP_SEMIHOST operation should be idempotent until it finally
completes. Practically this means ensureing we only update the pc
after the semihosting call has completed.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Keith Packard <keithp@keithp.com>
Tested-by: Keith Packard <keithp@keithp.com>
2020-01-09 11:41:29 +00:00
Alex Bennée b906acbb3a target/arm: remove unused EXCP_SEMIHOST leg
All semihosting exceptions are dealt with earlier in the common code
so we should never get here.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Keith Packard <keithp@keithp.com>
Tested-by: Keith Packard <keithp@keithp.com>
2020-01-09 11:41:29 +00:00
Philippe Mathieu-Daudé 0ee8b24a69 target/arm: Display helpful message when hflags mismatch
Instead of crashing in a confuse way, give some hint to the user
about why we aborted. He might report the issue without having
to use a debugger.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20191209134552.27733-1-philmd@redhat.com
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-12-20 14:03:00 +00:00
Andrew Jeffery 96eec6b2b3 target/arm: Prepare generic timer for per-platform CNTFRQ
The ASPEED AST2600 clocks the generic timer at the rate of HPLL. On
recent firmwares this is at 1125MHz, which is considerably quicker than
the assumed 62.5MHz of the current generic timer implementation. The
delta between the value as read from CNTFRQ and the true rate of the
underlying QEMUTimer leads to sticky behaviour in AST2600 guests.

Add a feature-gated property exposing CNTFRQ for ARM CPUs providing the
generic timer. This allows platforms to configure CNTFRQ (and the
associated QEMUTimer) to the appropriate frequency prior to starting the
guest.

As the platform can now determine the rate of CNTFRQ we're exposed to
limitations of QEMUTimer that didn't previously materialise: In the
course of emulation we need to arbitrarily and accurately convert
between guest ticks and time, but we're constrained by QEMUTimer's use
of an integer scaling factor. The effect is QEMUTimer cannot exactly
capture the period of frequencies that do not cleanly divide
NANOSECONDS_PER_SECOND for scaling ticks to time. As such, provide an
equally inaccurate scaling factor for scaling time to ticks so at least
a self-consistent inverse relationship holds.

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: a22db9325f96e39f76e3c2baddcb712149f46bf2.1576215453.git-series.andrew@aj.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-12-20 14:02:59 +00:00
Andrew Jeffery 7def875482 target/arm: Abstract the generic timer frequency
Prepare for SoCs such as the ASPEED AST2600 whose firmware configures
CNTFRQ to values significantly larger than the static 62.5MHz value
currently derived from GTIMER_SCALE. As the OS potentially derives its
timer periods from the CNTFRQ value the lack of support for running
QEMUTimers at the appropriate rate leads to sticky behaviour in the
guest.

Substitute the GTIMER_SCALE constant with use of a helper to derive the
period from gt_cntfrq_hz stored in struct ARMCPU. Initially set
gt_cntfrq_hz to the frequency associated with GTIMER_SCALE so current
behaviour is maintained.

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 40bd8df043f66e1ccfb3e9482999d099ac72bb2e.1576215453.git-series.andrew@aj.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-12-20 14:02:59 +00:00
Andrew Jeffery 4a0245b625 target/arm: Remove redundant scaling of nexttick
The corner-case codepath was adjusting nexttick such that overflow
wouldn't occur when timer_mod() scaled the value back up. Remove a use
of GTIMER_SCALE and avoid unnecessary operations by calling
timer_mod_ns() directly.

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: f8c680720e3abe55476e6d9cb604ad27fdbeb2e0.1576215453.git-series.andrew@aj.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-12-20 14:02:59 +00:00
Alex Bennée f80741d107 target/arm: ensure we use current exception state after SCR update
A write to the SCR can change the effective EL by droppping the system
from secure to non-secure mode. However if we use a cached current_el
from before the change we'll rebuild the flags incorrectly. To fix
this we introduce the ARM_CP_NEWEL CP flag to indicate the new EL
should be used when recomputing the flags.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191212114734.6962-1-alex.bennee@linaro.org
Cc: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20191209143723.6368-1-alex.bennee@linaro.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-12-16 10:52:58 +00:00
Beata Michalska 0d57b49992 target/arm: Add support for DC CVAP & DC CVADP ins
ARMv8.2 introduced support for Data Cache Clean instructions
to PoP (point-of-persistence) - DC CVAP and PoDP (point-of-deep-persistence)
- DV CVADP. Both specify conceptual points in a memory system where all writes
that are to reach them are considered persistent.
The support provided considers both to be actually the same so there is no
distinction between the two. If none is available (there is no backing store
for given memory) both will result in Data Cache Clean up to the point of
coherency. Otherwise sync for the specified range shall be performed.

Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191121000843.24844-5-beata.michalska@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-12-16 10:46:35 +00:00
Marc Zyngier f96f3d5f09 target/arm: Add support for missing Jazelle system registers
QEMU lacks the minimum Jazelle implementation that is required
by the architecture (everything is RAZ or RAZ/WI). Add it
together with the HCR_EL2.TID0 trapping that goes with it.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191201122018.25808-6-maz@kernel.org
[PMM: moved ARMCPRegInfo array to file scope, marked it
 'static global', moved new condition down in
 register_cp_regs_for_features() to go with other feature
 things rather than up with the v6/v7/v8 stuff]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-12-16 10:46:35 +00:00
Marc Zyngier 5bb0a20b74 target/arm: Handle AArch32 CP15 trapping via HSTR_EL2
HSTR_EL2 offers a way to trap ranges of CP15 system register
accesses to EL2, and it looks like this register is completely
ignored by QEMU.

To avoid adding extra .accessfn filters all over the place (which
would have a direct performance impact), let's add a new TB flag
that gets set whenever HSTR_EL2 is non-zero and that QEMU translates
a context where this trap has a chance to apply, and only generate
the extra access check if the hypervisor is actively using this feature.

Tested with a hand-crafted KVM guest accessing CBAR.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191201122018.25808-5-maz@kernel.org
[PMM: use is_a64(); fix comment syntax]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-12-16 10:46:35 +00:00
Marc Zyngier 93fbc983b2 target/arm: Honor HCR_EL2.TID1 trapping requirements
HCR_EL2.TID1 mandates that access from EL1 to REVIDR_EL1, AIDR_EL1
(and their 32bit equivalents) as well as TCMTR, TLBTR are trapped
to EL2. QEMU ignores it, making it harder for a hypervisor to
virtualize the HW (though to be fair, no known hypervisor actually
cares).

Do the right thing by trapping to EL2 if HCR_EL2.TID1 is set.

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191201122018.25808-3-maz@kernel.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-12-16 10:46:34 +00:00
Marc Zyngier 630fcd4d2b target/arm: Honor HCR_EL2.TID2 trapping requirements
HCR_EL2.TID2 mandates that access from EL1 to CTR_EL0, CCSIDR_EL1,
CCSIDR2_EL1, CLIDR_EL1, CSSELR_EL1 are trapped to EL2, and QEMU
completely ignores it, making it impossible for hypervisors to
virtualize the cache hierarchy.

Do the right thing by trapping to EL2 if HCR_EL2.TID2 is set.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191201122018.25808-2-maz@kernel.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-12-16 10:46:34 +00:00
Marc Zyngier 6a4ef4e5d1 target/arm: Honor HCR_EL2.TID3 trapping requirements
HCR_EL2.TID3 mandates that access from EL1 to a long list of id
registers traps to EL2, and QEMU has so far ignored this requirement.

This breaks (among other things) KVM guests that have PtrAuth enabled,
while the hypervisor doesn't want to expose the feature to its guest.
To achieve this, KVM traps the ID registers (ID_AA64ISAR1_EL1 in this
case), and masks out the unsupported feature.

QEMU not honoring the trap request means that the guest observes
that the feature is present in the HW, starts using it, and dies
a horrible death when KVM injects an UNDEF, because the feature
*really* isn't supported.

Do the right thing by trapping to EL2 if HCR_EL2.TID3 is set.

Note that this change does not include trapping of the MVFR
registers from AArch32 (they are accessed via the VMRS
instruction and need to be handled in a different way).

Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: Will Deacon <will@kernel.org>
Message-id: 20191123115618.29230-1-maz@kernel.org
[PMM: added missing accessfn line for ID_AA4PFR2_EL1_RESERVED;
 changed names of access functions to include _tid3]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-11-26 13:55:37 +00:00
Marc Zyngier 7cf95aed53 target/arm: Fix ISR_EL1 tracking when executing at EL2
The ARMv8 ARM states when executing at EL2, EL3 or Secure EL1,
ISR_EL1 shows the pending status of the physical IRQ, FIQ, or
SError interrupts.

Unfortunately, QEMU's implementation only considers the HCR_EL2
bits, and ignores the current exception level. This means a hypervisor
trying to look at its own interrupt state actually sees the guest
state, which is unexpected and breaks KVM as of Linux 5.3.

Instead, check for the running EL and return the physical bits
if not running in a virtualized context.

Fixes: 636540e9c4
Cc: qemu-stable@nongnu.org
Reported-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Message-id: 20191122135833.28953-1-maz@kernel.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-11-26 13:55:36 +00:00
Richard Henderson 6e553f2a1b target/arm: Merge arm_cpu_vq_map_next_smaller into sole caller
Coverity reports, in sve_zcr_get_valid_len,

"Subtract operation overflows on operands
arm_cpu_vq_map_next_smaller(cpu, start_vq + 1U) and 1U"

First, the aarch32 stub version of arm_cpu_vq_map_next_smaller,
returning 0, does exactly what Coverity reports.  Remove it.

Second, the aarch64 version of arm_cpu_vq_map_next_smaller has
a set of asserts, but they don't cover the case in question.
Further, there is a fair amount of extra arithmetic needed to
convert from the 0-based zcr register, to the 1-base vq form,
to the 0-based bitmap, and back again.  This can be simplified
by leaving the value in the 0-based form.

Finally, use test_bit to simplify the common case, where the
length in the zcr registers is in fact a supported length.

Reported-by: Coverity (CID 1407217)
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 20191118091414.19440-1-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-11-19 13:20:27 +00:00
Andrew Jones 0df9142d27 target/arm/cpu64: max cpu: Introduce sve<N> properties
Introduce cpu properties to give fine control over SVE vector lengths.
We introduce a property for each valid length up to the current
maximum supported, which is 2048-bits. The properties are named, e.g.
sve128, sve256, sve384, sve512, ..., where the number is the number of
bits. See the updates to docs/arm-cpu-features.rst for a description
of the semantics and for example uses.

Note, as sve-max-vq is still present and we'd like to be able to
support qmp_query_cpu_model_expansion with guests launched with e.g.
-cpu max,sve-max-vq=8 on their command lines, then we do allow
sve-max-vq and sve<N> properties to be provided at the same time, but
this is not recommended, and is why sve-max-vq is not mentioned in the
document.  If sve-max-vq is provided then it enables all lengths smaller
than and including the max and disables all lengths larger. It also has
the side-effect that no larger lengths may be enabled and that the max
itself cannot be disabled. Smaller non-power-of-two lengths may,
however, be disabled, e.g. -cpu max,sve-max-vq=4,sve384=off provides a
guest the vector lengths 128, 256, and 512 bits.

This patch has been co-authored with Richard Henderson, who reworked
the target/arm/cpu64.c changes in order to push all the validation and
auto-enabling/disabling steps into the finalizer, resulting in a nice
LOC reduction.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Beata Michalska <beata.michalska@linaro.org>
Message-id: 20191031142734.8590-5-drjones@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-11-01 20:40:59 +00:00
Richard Henderson e979972a6a target/arm: Rely on hflags correct in cpu_get_tb_cpu_state
This is the payoff.

From perf record -g data of ubuntu 18 boot and shutdown:

BEFORE:

-   23.02%     2.82%  qemu-system-aar  [.] helper_lookup_tb_ptr
   - 20.22% helper_lookup_tb_ptr
      + 10.05% tb_htable_lookup
      - 9.13% cpu_get_tb_cpu_state
           3.20% aa64_va_parameters_both
           0.55% fp_exception_el

-   11.66%     4.74%  qemu-system-aar  [.] cpu_get_tb_cpu_state
   - 6.96% cpu_get_tb_cpu_state
        3.63% aa64_va_parameters_both
        0.60% fp_exception_el
        0.53% sve_exception_el

AFTER:

-   16.40%     3.40%  qemu-system-aar  [.] helper_lookup_tb_ptr
   - 13.03% helper_lookup_tb_ptr
      + 11.19% tb_htable_lookup
        0.55% cpu_get_tb_cpu_state

     0.98%     0.71%  qemu-system-aar  [.] cpu_get_tb_cpu_state

     0.87%     0.24%  qemu-system-aar  [.] rebuild_hflags_a64

Before, helper_lookup_tb_ptr is the second hottest function in the
application, consuming almost a quarter of the runtime.  Within the
entire execution, cpu_get_tb_cpu_state consumes about 12%.

After, helper_lookup_tb_ptr has dropped to the fourth hottest function,
with consumption dropping to a sixth of the runtime.  Within the
entire execution, cpu_get_tb_cpu_state has dropped below 1%, and the
supporting function to rebuild hflags also consumes about 1%.

Assertions are retained for --enable-debug-tcg.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-25-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:28 +01:00
Richard Henderson 2e5dcf3628 target/arm: Rebuild hflags at Xscale SCTLR writes
Continue setting, but not relying upon, env->hflags.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-20-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:28 +01:00
Richard Henderson a8a79c7a07 target/arm: Rebuild hflags at EL changes
Begin setting, but not relying upon, env->hflags.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:28 +01:00
Richard Henderson 14f3c58826 target/arm: Add HELPER(rebuild_hflags_{a32, a64, m32})
This functions are given the mode and el state of the cpu
and writes the computed value to env->hflags.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:28 +01:00
Richard Henderson 9b253fe554 target/arm: Hoist store to cs_base in cpu_get_tb_cpu_state
By performing this store early, we avoid having to save and restore
the register holding the address around any function calls.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-15-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:28 +01:00
Richard Henderson 164690b29f target/arm: Split out arm_mmu_idx_el
Avoid calling arm_current_el() twice.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:28 +01:00
Richard Henderson 3d74e2e9ff target/arm: Add arm_rebuild_hflags
This function assumes nothing about the current state of the cpu,
and writes the computed value to env->hflags.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:28 +01:00
Richard Henderson 0a54d68e21 target/arm: Hoist computation of TBFLAG_A32.VFPEN
There are 3 conditions that each enable this flag.  M-profile always
enables; A-profile with EL1 as AA64 always enables.  Both of these
conditions can easily be cached.  The final condition relies on the
FPEXC register which we are not prepared to cache.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:28 +01:00
Richard Henderson 60e12c3776 target/arm: Simplify set of PSTATE_SS in cpu_get_tb_cpu_state
Hoist the variable load for PSTATE into the existing test vs is_a64.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:27 +01:00
Richard Henderson bbad7c62d4 target/arm: Hoist XSCALE_CPAR, VECLEN, VECSTRIDE in cpu_get_tb_cpu_state
We do not need to compute any of these values for M-profile.
Further, XSCALE_CPAR overlaps VECSTRIDE so obviously the two
sets must be mutually exclusive.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:27 +01:00
Richard Henderson 83f4baef3e target/arm: Split out rebuild_hflags_aprofile
Create a function to compute the values of the TBFLAG_ANY bits
that will be cached, and are used by A-profile.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:27 +01:00
Richard Henderson c747224cc3 target/arm: Split out rebuild_hflags_a32
Currently a trivial wrapper for rebuild_hflags_common_32.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:27 +01:00
Richard Henderson 9550d1bd88 target/arm: Reduce tests vs M-profile in cpu_get_tb_cpu_state
Hoist the computation of some TBFLAG_A32 bits that only apply to
M-profile under a single test for ARM_FEATURE_M.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:27 +01:00
Richard Henderson 6e33ced563 target/arm: Split out rebuild_hflags_m32
Create a function to compute the values of the TBFLAG_A32 bits
that will be cached, and are used by M-profile.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:27 +01:00
Richard Henderson 8061a64910 target/arm: Split arm_cpu_data_is_big_endian
Set TBFLAG_ANY.BE_DATA in rebuild_hflags_common_32 and
rebuild_hflags_a64 instead of rebuild_hflags_common, where we do
not need to re-test is_a64() nor re-compute the various inputs.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:27 +01:00
Richard Henderson 43eccfb6ed target/arm: Split out rebuild_hflags_common_32
Create a function to compute the values of the TBFLAG_A32 bits
that will be cached, and are used by all profiles.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:27 +01:00
Richard Henderson d4d7503ac6 target/arm: Split out rebuild_hflags_a64
Create a function to compute the values of the TBFLAG_A64 bits
that will be cached.  For now, the env->hflags variable is not
used, and the results are fed back to cpu_get_tb_cpu_state.

Note that not all BTI related flags are cached, so we have to
test the BTI feature twice -- once for those bits moved out to
rebuild_hflags_a64 and once for those bits that remain in
cpu_get_tb_cpu_state.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:27 +01:00
Richard Henderson fdd1b228c2 target/arm: Split out rebuild_hflags_common
Create a function to compute the values of the TBFLAG_ANY bits
that will be cached.  For now, the env->hflags variable is not
used, and the results are fed back to cpu_get_tb_cpu_state.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:27 +01:00
Alex Bennée ed6e6ba9c4 target/arm: remove run time semihosting checks
Now we do all our checking and use a common EXCP_SEMIHOST for
semihosting operations we can make helper code a lot simpler.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190913151845.12582-5-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-09-27 11:41:31 +01:00
Luc Michel d56974afe9 target/arm: fix CBAR register for AArch64 CPUs
For AArch64 CPUs with a CBAR register, we have two views for it:
  - in AArch64 state, the CBAR_EL1 register (S3_1_C15_C3_0), returns the
    full 64 bits CBAR value
  - in AArch32 state, the CBAR register (cp15, opc1=1, CRn=15, CRm=3, opc2=0)
    returns a 32 bits view such that:
      CBAR = CBAR_EL1[31:18] 0..0 CBAR_EL1[43:32]

This commit fixes the current implementation where:
  - CBAR_EL1 was returning the 32 bits view instead of the full 64 bits
    value,
  - CBAR was returning a truncated 32 bits version of the full 64 bits
    one, instead of the 32 bits view
  - CBAR was declared as cp15, opc1=4, CRn=15, CRm=0, opc2=0, which is
    the CBAR register found in the ARMv7 Cortex-Ax CPUs, but not in
    ARMv8 CPUs.

Signed-off-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20190912110103.1417887-1-luc.michel@greensocs.com
[PMM: Added a comment about the two different kinds of CBAR]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-09-27 11:41:28 +01:00
Peter Maydell 0710b2fa84 target/arm: Take exceptions on ATS instructions when needed
The translation table walk for an ATS instruction can result in
various faults.  In general these are just reported back via the
PAR_EL1 fault status fields, but in some cases the architecture
requires that the fault is turned into an exception:
 * synchronous stage 2 faults of any kind during AT S1E0* and
   AT S1E1* instructions executed from NS EL1 fault to EL2 or EL3
 * synchronous external aborts are taken as Data Abort exceptions

(This is documented in the v8A Arm ARM DDI0487A.e D5.2.11 and
G5.13.4.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20190816125802.25877-3-peter.maydell@linaro.org
2019-09-03 16:20:34 +01:00
Peter Maydell afd7605393 target-arm queue:
* target/arm: generate a custom MIDR for -cpu max
  * hw/misc/zynq_slcr: refactor to use standard register definition
  * Set ENET_BD_BDU in I.MX FEC controller
  * target/arm: Fix routing of singlestep exceptions
  * refactor a32/t32 decoder handling of PC
  * minor optimisations/cleanups of some a32/t32 codegen
  * target/arm/cpu64: Ensure kvm really supports aarch64=off
  * target/arm/cpu: Ensure we can use the pmu with kvm
  * target/arm: Minor cleanups preparatory to KVM SVE support
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAl1WrIsZHHBldGVyLm1h
 eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3tJ/D/9I0ccyciHwuekySUHs+Wq6
 2grX8t6RFzlhA1ULoAaEO4x8uWWGnbiGTeSGM819T3nj1a7neQV12Xe5RRGG0j7n
 aeVseYnZF96oshKPkDSVTcGQisVfmmHIJ0oqx2k1aUGrmyFJlTuLWQBZCCiZKhxA
 zA6YzUbOA2apfi9nun6SbbjysiRD2lp2i9vI79nVlo+ca77v/1sdFUwzg0hRE//X
 IondHeWtCZScmc/GwABv4EdNzQ4Aerfe10v/pOKXEC59rPwEiaiSGBPu6SRUaGWH
 qHlwjVU2+BFGkz9Oy/7+tDTBk6saPi4taZF8SxxiC/QTyNV2ijyKV5iy9KOYAFw7
 E41fhv4+Kch569/SX7fiyAxL0gAS2HGFtegByuQEgjjioOCRugFcX275NXvuW06j
 jfOP/zSD9P39WA0jCJaNj5FdJTcLmIuFxKjBUEX3Cdb+3igIq1BW0ZFd/OOBoo1W
 GHcEmO6tLyx35kigOb3TkayQpkqCoaGCcgzJ0g2Oy06rKwlcci+BfCfc3aG+uSSY
 +TuGjRhpQxQJJt880d7tBqeH9R5FABvQ0TEwGuACylDEZM5bN7BpZxCxCVN/bFG+
 pzvzs/QtOq0FN7LK4L4rbuJui4nBhAyalbiIXQ8ihWQgmMqaYQSK8mXFgSZgizFl
 qATcYIr/q2gL4wHRos3XdA==
 =8BAF
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20190816' into staging

target-arm queue:
 * target/arm: generate a custom MIDR for -cpu max
 * hw/misc/zynq_slcr: refactor to use standard register definition
 * Set ENET_BD_BDU in I.MX FEC controller
 * target/arm: Fix routing of singlestep exceptions
 * refactor a32/t32 decoder handling of PC
 * minor optimisations/cleanups of some a32/t32 codegen
 * target/arm/cpu64: Ensure kvm really supports aarch64=off
 * target/arm/cpu: Ensure we can use the pmu with kvm
 * target/arm: Minor cleanups preparatory to KVM SVE support

# gpg: Signature made Fri 16 Aug 2019 14:15:55 BST
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20190816: (29 commits)
  target/arm: Use tcg_gen_extrh_i64_i32 to extract the high word
  target/arm: Simplify SMMLA, SMMLAR, SMMLS, SMMLSR
  target/arm: Use tcg_gen_rotri_i32 for gen_swap_half
  target/arm: Use ror32 instead of open-coding the operation
  target/arm: Remove redundant shift tests
  target/arm: Use tcg_gen_deposit_i32 for PKHBT, PKHTB
  target/arm: Use tcg_gen_extract_i32 for shifter_out_im
  target/arm/kvm64: Move the get/put of fpsimd registers out
  target/arm/kvm64: Fix error returns
  target/arm/cpu: Use div-round-up to determine predicate register array size
  target/arm/helper: zcr: Add build bug next to value range assumption
  target/arm/cpu: Ensure we can use the pmu with kvm
  target/arm/cpu64: Ensure kvm really supports aarch64=off
  target/arm: Remove helper_double_saturate
  target/arm: Use unallocated_encoding for aarch32
  target/arm: Remove offset argument to gen_exception_bkpt_insn
  target/arm: Replace offset with pc in gen_exception_internal_insn
  target/arm: Replace offset with pc in gen_exception_insn
  target/arm: Replace s->pc with s->base.pc_next
  target/arm: Remove redundant s->pc & ~1
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-08-16 17:21:40 +01:00