- Infrastructure to allow building irqchip drivers as modules
- Consolidation of irqchip ACPI probing
- Removal of the EOI-preflow interrupt handler which was required for
SPARC support and became obsolete after SPARC was converted to
use sparse interrupts.
- Cleanups, fixes and improvements all over the place
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8pDL0THHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoRTFEACYvH2LnSu1GlXB0XtL3+XyV8bWN3Yr
Qfcp9JbIibx65YkJjcyvfBNA6GjXoogMr9vOHeRVnPtOwzl/7n/lnh/43d6+YPot
7UvIjGtpH3E/lF0kJKfuEsM8CX8DcVhn6dV/T+dJ00m69dAVQHNRsVqAi1/iWEeT
9vBBELoJL79BU2g83NQZ7V0UrqiA5QlPYLpbSffliE6UWjG6XTH2CPM5XucuySNQ
es3szxQ55rtPEzqCHVL0YW75vV39bmKZPqoApA/XQDJrp3bgftjdldoTe7YPQfSG
MXAvB+6axPD+mdeag7/XZFC1DcMx8CnistZSJKpdYZe7mQ7iunfeJRhkEzb+DrO1
WdcDcYOm0rLHhPrUZItJdACjuPNmN9pMaK1PbabsivnHVWzMYYKmMwbW+AEsygGW
nnlsZP1Nr61Mo7O8+EKmxDdox4Qjk3lmQl4SdQgUKNKsI5yFYjvt2CfCjWLQJNBa
w7YiLnL9IChXwrvdGqMIoEueUi0pC3gGbZ/bjDbxI4NJxJgEEav49m/prxM2A2Pl
gfNdwlM1xgNydIBgt/jij/a8Lmv555RuZmvDV7QV7fFwaIqt3Qb5cs0Roq+GlzZR
e0wuikGl0r/Bdow62rle7EysbBBGosAYf6K/kaGhd8v/kx2ByDnPPWzOqtxc+K+i
Iw/daEQRsSnWuw==
=KA8b
-----END PGP SIGNATURE-----
Merge tag 'irq-core-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"The usual boring updates from the interrupt subsystem:
- Infrastructure to allow building irqchip drivers as modules
- Consolidation of irqchip ACPI probing
- Removal of the EOI-preflow interrupt handler which was required for
SPARC support and became obsolete after SPARC was converted to use
sparse interrupts.
- Cleanups, fixes and improvements all over the place"
* tag 'irq-core-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
irqchip/loongson-pch-pic: Fix the misused irq flow handler
irqchip/loongson-htvec: Support 8 groups of HT vectors
irqchip/loongson-liointc: Fix misuse of gc->mask_cache
dt-bindings: interrupt-controller: Update Loongson HTVEC description
irqchip/imx-intmux: Fix irqdata regs save in imx_intmux_runtime_suspend()
irqchip/imx-intmux: Implement intmux runtime power management
irqchip/gic-v4.1: Use GFP_ATOMIC flag in allocate_vpe_l1_table()
irqchip: Fix IRQCHIP_PLATFORM_DRIVER_* compilation by including module.h
irqchip/stm32-exti: Map direct event to irq parent
irqchip/mtk-cirq: Convert to a platform driver
irqchip/mtk-sysirq: Convert to a platform driver
irqchip/qcom-pdc: Switch to using IRQCHIP_PLATFORM_DRIVER helper macros
irqchip: Add IRQCHIP_PLATFORM_DRIVER_BEGIN/END and IRQCHIP_MATCH helper macros
irqchip: irq-bcm2836.h: drop a duplicated word
irqchip/gic-v4.1: Ensure accessing the correct RD when writing INVALLR
irqchip/irq-bcm7038-l1: Guard uses of cpu_logical_map
irqchip/gic-v3: Remove unused register definition
irqchip/qcom-pdc: Allow QCOM_PDC to be loadable as a permanent module
genirq: Export irq_chip_retrigger_hierarchy and irq_chip_set_vcpu_affinity_parent
irqdomain: Export irq_domain_update_bus_token
...
John reported that on a RK3288 system the perf per CPU interrupts are all
affine to CPU0 and provided the analysis:
"It looks like what happens is that because the interrupts are not per-CPU
in the hardware, armpmu_request_irq() calls irq_force_affinity() while
the interrupt is deactivated and then request_irq() with IRQF_PERCPU |
IRQF_NOBALANCING.
Now when irq_startup() runs with IRQ_STARTUP_NORMAL, it calls
irq_setup_affinity() which returns early because IRQF_PERCPU and
IRQF_NOBALANCING are set, leaving the interrupt on its original CPU."
This was broken by the recent commit which blocked interrupt affinity
setting in hardware before activation of the interrupt. While this works in
general, it does not work for this particular case. As contrary to the
initial analysis not all interrupt chip drivers implement an activate
callback, the safe cure is to make the deferred interrupt affinity setting
at activation time opt-in.
Implement the necessary core logic and make the two irqchip implementations
for which this is required opt-in. In hindsight this would have been the
right thing to do, but ...
Fixes: baedb87d1b ("genirq/affinity: Handle affinity setting on inactive interrupts correctly")
Reported-by: John Keeping <john@metanate.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/87blk4tzgm.fsf@nanos.tec.linutronix.de
Booting the latest kernel with DEBUG_ATOMIC_SLEEP=y on a GICv4.1 enabled
box, I get the following kernel splat:
[ 0.053766] BUG: sleeping function called from invalid context at mm/slab.h:567
[ 0.053767] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/1
[ 0.053769] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.8.0-rc3+ #23
[ 0.053770] Call trace:
[ 0.053774] dump_backtrace+0x0/0x218
[ 0.053775] show_stack+0x2c/0x38
[ 0.053777] dump_stack+0xc4/0x10c
[ 0.053779] ___might_sleep+0xfc/0x140
[ 0.053780] __might_sleep+0x58/0x90
[ 0.053782] slab_pre_alloc_hook+0x7c/0x90
[ 0.053783] kmem_cache_alloc_trace+0x60/0x2f0
[ 0.053785] its_cpu_init+0x6f4/0xe40
[ 0.053786] gic_starting_cpu+0x24/0x38
[ 0.053788] cpuhp_invoke_callback+0xa0/0x710
[ 0.053789] notify_cpu_starting+0xcc/0xd8
[ 0.053790] secondary_start_kernel+0x148/0x200
# ./scripts/faddr2line vmlinux its_cpu_init+0x6f4/0xe40
its_cpu_init+0x6f4/0xe40:
allocate_vpe_l1_table at drivers/irqchip/irq-gic-v3-its.c:2818
(inlined by) its_cpu_init_lpis at drivers/irqchip/irq-gic-v3-its.c:3138
(inlined by) its_cpu_init at drivers/irqchip/irq-gic-v3-its.c:5166
It turned out that we're allocating memory using GFP_KERNEL (may sleep)
within the CPU hotplug notifier, which is indeed an atomic context. Bad
thing may happen if we're playing on a system with more than a single
CommonLPIAff group. Avoid it by turning this into an atomic allocation.
Fixes: 5e5168461c ("irqchip/gic-v4.1: VPE table (aka GICR_VPROPBASER) allocation")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200630133746.816-1-yuzenghui@huawei.com
The GICv4.1 spec tells us that it's CONSTRAINED UNPREDICTABLE to issue a
register-based invalidation operation for a vPEID not mapped to that RD,
or another RD within the same CommonLPIAff group.
To follow this rule, commit f3a059219b ("irqchip/gic-v4.1: Ensure mutual
exclusion between vPE affinity change and RD access") tried to address the
race between the RD accesses and the vPE affinity change, but somehow
forgot to take GICR_INVALLR into account. Let's take the vpe_lock before
evaluating vpe->col_idx to fix it.
Fixes: f3a059219b ("irqchip/gic-v4.1: Ensure mutual exclusion between vPE affinity change and RD access")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200720092328.708-1-yuzenghui@huawei.com
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl8DWosUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroO8cAf/UskNg8qoLGG17rQwhFpmigSllbiJ
TAyi3tpb1Y0Z2MfYeGkeiEb1L34bS28Cxl929DoqI3hrXy1wDCmsHPB5c3URXrzd
aswvr7pJtQV9iH1ykaS2woFJnOUovMFsFYMhj46yUPoAvdKOZKvuqcduxbogYHFw
YeRhS+1lGfiP2A0j3O/nnNJ0wq+FxKO46G3CgWeqG75+FSL6y/tl0bZJUMKKajQZ
GNaOv/CYCHAfUdvgy0ZitRD8lV8yxng3dYGjm+a52Kmn2ZWiFlxNrnxzHySk16Rn
Lq6MfFOqgrYpoZv7SnsFYnRE05U5bEFQ8BGr22fImQ+ktKDgq+9gv6cKwA==
=+DN/
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Bugfixes and a one-liner patch to silence a sparse warning"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: arm64: Stop clobbering x0 for HVC_SOFT_RESTART
KVM: arm64: PMU: Fix per-CPU access in preemptible context
KVM: VMX: Use KVM_POSSIBLE_CR*_GUEST_BITS to initialize guest/host masks
KVM: x86: Mark CR4.TSD as being possibly owned by the guest
KVM: x86: Inject #GP if guest attempts to toggle CR4.LA57 in 64-bit mode
kvm: use more precise cast and do not drop __user
KVM: x86: bit 8 of non-leaf PDPEs is not reserved
KVM: X86: Fix async pf caused null-ptr-deref
KVM: arm64: vgic-v4: Plug race between non-residency and v4.1 doorbell
KVM: arm64: pvtime: Ensure task delay accounting is enabled
KVM: arm64: Fix kvm_reset_vcpu() return code being incorrect with SVE
KVM: arm64: Annotate hyp NMI-related functions as __always_inline
KVM: s390: reduce number of IO pins to 1
When making a vPE non-resident because it has hit a blocking WFI,
the doorbell can fire at any time after the write to the RD.
Crucially, it can fire right between the write to GICR_VPENDBASER
and the write to the pending_last field in the its_vpe structure.
This means that we would overwrite pending_last with stale data,
and potentially not wakeup until some unrelated event (such as
a timer interrupt) puts the vPE back on the CPU.
GICv4 isn't affected by this as we actively mask the doorbell on
entering the guest, while GICv4.1 automatically manages doorbell
delivery without any hypervisor-driven masking.
Use the vpe_lock to synchronize such update, which solves the
problem altogether.
Fixes: ae699ad348 ("irqchip/gic-v4.1: Move doorbell management to the GICv4 abstraction layer")
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
readx_poll_timeout() can sleep if @sleep_us is specified by the caller,
and is therefore unsafe to be used inside the atomic context, which is
this case when we use it to poll the GICR_VPENDBASER.Dirty bit in
irq_set_vcpu_affinity() callback.
Let's convert to its atomic version instead which helps to get the v4.1
board back to life!
Fixes: 96806229ca ("irqchip/gic-v4.1: Add support for VPENDBASER's Dirty+Valid signaling")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200605052345.1494-1-yuzenghui@huawei.com
When mapping a LPI, the ITS driver picks the first possible
affinity, which is in most cases CPU0, assuming that if
that's not suitable, someone will come and set the affinity
to something more interesting.
It apparently isn't the case, and people complain of poor
performance when many interrupts are glued to the same CPU.
So let's place the interrupts by finding the "least loaded"
CPU (that is, the one that has the fewer LPIs mapped to it).
So called 'managed' interrupts are an interesting case where
the affinity is actually dictated by the kernel itself, and
we should honor this.
Reported-by: John Garry <john.garry@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1575642904-58295-1-git-send-email-john.garry@huawei.com
Link: https://lore.kernel.org/r/20200515165752.121296-3-maz@kernel.org
In order to improve the distribution of LPIs among CPUs, let start by
tracking the number of LPIs assigned to CPUs, both for managed and
non-managed interrupts (as separate counters).
Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20200515165752.121296-2-maz@kernel.org
Although the vSGIs are not directly visible to the host, they still
get moved around by the CPU hotplug, for example. This results in
the kernel moaning on the console, such as:
genirq: irq_chip GICv4.1-sgi did not update eff. affinity mask of irq 38
Updating the effective affinity on set_affinity() fixes it.
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
When a vPE is made resident, the GIC starts parsing the virtual pending
table to deliver pending interrupts. This takes place asynchronously,
and can at times take a long while. Long enough that the vcpu enters
the guest and hits WFI before any interrupt has been signaled yet.
The vcpu then exits, blocks, and now gets a doorbell. Rince, repeat.
In order to avoid the above, a (optional on GICv4, mandatory on v4.1)
feature allows the GIC to feedback to the hypervisor whether it is
done parsing the VPT by clearing the GICR_VPENDBASER.Dirty bit.
The hypervisor can then wait until the GIC is ready before actually
running the vPE.
Plug the detection code as well as polling on vPE schedule. While
at it, tidy-up the kernel message that displays the GICv4 optional
features.
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Now that we have HW-accelerated SGIs being delivered to VPEs, it
becomes required to map the VPEs on all ITSs instead of relying
on the lazy approach that we would use when using the ITS-list
mechanism.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20200304203330.4967-17-maz@kernel.org
Just like for vLPIs, there is some configuration information that cannot
be directly communicated through the normal irqchip API, and we have to
use our good old friend set_vcpu_affinity as a side-band communication
mechanism.
This is used to configure group and priority for a given vSGI.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20200304203330.4967-13-maz@kernel.org
To implement the get/set_irqchip_state callbacks (limited to the
PENDING state), we have to use a particular set of hacks:
- Reading the pending state is done by using a pair of new redistributor
registers (GICR_VSGIR, GICR_VSGIPENDR), which allow the 16 interrupts
state to be retrieved.
- Setting the pending state is done by generating it as we'd otherwise do
for a guest (writing to GITS_SGIR).
- Clearing the pending state is done by emitting a VSGI command with the
"clear" bit set.
This requires some interesting locking though:
- When talking to the redistributor, we must make sure that the VPE
affinity doesn't change, hence taking the VPE lock.
- At the same time, we must ensure that nobody accesses the same
redistributor's GICR_VSGIR registers for a different VPE, which
would corrupt the reading of the pending bits. We thus take the
per-RD spinlock. Much fun.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20200304203330.4967-12-maz@kernel.org
Implement mask/unmask for virtual SGIs by calling into the
configuration helper.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20200304203330.4967-11-maz@kernel.org
The GICv4.1 ITS has yet another new command (VSGI) which allows
a VPE-targeted SGI to be configured (or have its pending state
cleared). Add support for this command and plumb it into the
activate irqdomain callback so that it is ready to be used.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20200304203330.4967-10-maz@kernel.org
Since GICv4.1 has the capability to inject 16 SGIs into each VPE,
and that I'm keen not to invent too many specific interfaces to
manipulate these interrupts, let's pretend that each of these SGIs
is an actual Linux interrupt.
For that matter, let's introduce a minimal irqchip and irqdomain
setup that will get fleshed up in the following patches.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20200304203330.4967-9-maz@kernel.org
There is no special reason to set virtual LPI pending table as
non-shareable. If we choose to hard code the shareability without
probing, Inner-Shareable is likely to be a better choice, as the
VPEs can move around and benefit from having the redistributors
snooping each other's cache, if that's something they can do.
Furthermore, Hisilicon hip08 ends up with unspecified errors when
mixing shareability attributes. So let's move to IS attributes for
the VPT. This has also been tested on D05 and didn't show any
regression.
Signed-off-by: Heyi Guo <guoheyi@huawei.com>
[maz: rewrote commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20191130073849.38378-1-guoheyi@huawei.com
One of the new features of GICv4.1 is to allow virtual SGIs to be
directly signaled to a VPE. For that, the ITS has grown a new
64kB page containing only a single register that is used to
signal a SGI to a given VPE.
Add a second mapping covering this new 64kB range, and take this
opportunity to limit the original mapping to 64kB, which is enough
to cover the span of the ITS registers.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20200304203330.4967-8-maz@kernel.org
Tell KVM that we support v4.1. Nothing uses this information so far.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20200304203330.4967-7-maz@kernel.org
The GICv4.1 spec says that it is CONTRAINED UNPREDICTABLE to write to
any of the GICR_INV{LPI,ALL}R registers if GICR_SYNCR.Busy == 1.
To deal with it, we must ensure that only a single invalidation can
happen at a time for a given redistributor. Add a per-RD lock to that
effect and take it around the invalidation/syncr-read to deal with this.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20200304203330.4967-6-maz@kernel.org
In GICv4.1, we emulate a guest-issued INVALL command by a direct write
to GICR_INVALLR. Before we finish the emulation and go back to guest,
let's make sure the physical invalidate operation is actually completed
and no stale data will be left in redistributor. Per the specification,
this can be achieved by polling the GICR_SYNCR.Busy bit (to zero).
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20200302092145.899-1-yuzenghui@huawei.com
Link: https://lore.kernel.org/r/20200304203330.4967-5-maz@kernel.org
Before GICv4.1, all operations would be serialized with the affinity
changes by virtue of using the same ITS command queue. With v4.1, things
change, as invalidations (and a number of other operations) are issued
using the redistributor MMIO frame.
We must thus make sure that these redistributor accesses cannot race
against aginst the affinity change, or we may end-up talking to the
wrong redistributor.
To ensure this, we expand the irq_to_cpuid() helper to take a spinlock
when the LPI is mapped to a vLPI (a new per-VPE lock) on each operation
that requires mutual exclusion.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20200304203330.4967-4-maz@kernel.org
In a system that is only sparsly populated with CPUs, we can end-up with
redistributors structures that are not initialized. Let's make sure we
don't try and access those when iterating over them (in this case when
checking we have a L2 VPE table).
Fixes: 4e6437f12d ("irqchip/gic-v4.1: Ensure L2 vPE table is allocated at RD level")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20200304203330.4967-3-maz@kernel.org
The GICv3 ITS driver assumes that once it has latched on a page size for
a given BASER register, it can use the same page size as the maximum
page size for all subsequent BASER registers.
Although it worked so far, nothing in the architecture guarantees this,
and Nianyao Tang hit this problem on some undisclosed implementation.
Let's bite the bullet and probe the the supported page size on all BASER
registers before starting to populate the tables. This simplifies the
setup a bit, at the expense of a few additional MMIO accesses.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reported-by: Nianyao Tang <tangnianyao@huawei.com>
Tested-by: Nianyao Tang <tangnianyao@huawei.com>
Link: https://lore.kernel.org/r/1584089195-63897-1-git-send-email-zhangshaokun@hisilicon.com
GICR_SYNCR is a 32bit register, so it is better to access it with
32bit access width, though we have not seen any real problem.
Signed-off-by: Heyi Guo <guoheyi@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200225090023.28020-1-guoheyi@huawei.com
In order to allow the GICv4 code to link properly on 32bit ARM,
make sure we don't use 64bit divisions when it isn't strictly
necessary.
Fixes: 4e6437f12d ("irqchip/gic-v4.1: Ensure L2 vPE table is allocated at RD level")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
V{PEND,PROP}BASER registers are actually located in VLPI_base frame
of the *redistributor*. Rename their accessors to reflect this fact.
No functional changes.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200206075711.1275-7-yuzenghui@huawei.com
"ITS virtual pending table not cleaning" is already complained inside
its_clear_vpend_valid(), there's no need to trigger a WARN_ON again.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200206075711.1275-6-yuzenghui@huawei.com
In GICv4, we will ensure that level2 vPE table memory is allocated
for the specified vpe_id on all v4 ITS, in its_alloc_vpe_table().
This still works well for the typical GICv4.1 implementation, where
the new vPE table is shared between the ITSs and the RDs.
To make it explicit, let us introduce allocate_vpe_l2_table() to
make sure that the L2 tables are allocated on all v4.1 RDs. We're
likely not need to allocate memory in it because the vPE table is
shared and (L2 table is) already allocated at ITS level, except
for the case where the ITS doesn't share anything (say SVPET == 0,
practically unlikely but architecturally allowed).
The implementation of allocate_vpe_l2_table() is mostly copied from
its_alloc_table_entry().
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200206075711.1275-4-yuzenghui@huawei.com
Currently, we will not set vpe_l1_page for the current RD if we can
inherit the vPE configuration table from another RD (or ITS), which
results in an inconsistency between RDs within the same CommonLPIAff
group.
Let's rename it to vpe_l1_base to indicate the base address of the
vPE configuration table of this RD, and set it properly for *all*
v4.1 redistributors.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200206075711.1275-3-yuzenghui@huawei.com
The Size field of GICv4.1 VPROPBASER register indicates number of
pages minus one and together Page_Size and Size control the vPEID
width. Let's respect this requirement of the architecture.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200206075711.1275-2-yuzenghui@huawei.com
It looks like an obvious mistake to use its_mapc_cmd descriptor when
building the INVALL command block. It so far worked by luck because
both its_mapc_cmd.col and its_invall_cmd.col sit at the same offset of
the ITS command descriptor, but we should not rely on it.
Fixes: cc2d3216f5 ("irqchip: GICv3: ITS command queue")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20191202071021.1251-1-yuzenghui@huawei.com
Just like for INVALL, GICv4.1 has grown a VPE-aware INVLPI register.
Let's plumb it in and make use of the DirectLPI code in that case.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-16-maz@kernel.org
Since GICv4.1 gives us a per-VPE doorbell, avoid programming anything
else on VMOVI/VMAPI/VMAPTI and on any other action that would have
otherwise resulted in a per-VLPI doorbell to be programmed.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-15-maz@kernel.org
GICv4.1 redistributors have a VPE-aware INVALL register. Progress!
We can now emulate a guest-requested INVALL without emiting a
VINVALL command.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-14-maz@kernel.org
When descheduling a VPE, special care must be taken to tell the GIC
about whether we want to receive a doorbell or not. This is a
major improvement on GICv4.0, where the doorbell had to be separately
enabled/disabled.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-13-maz@kernel.org
Making a VPE resident on GICv4.1 is pretty simple, as it is just a
single write to the local redistributor. We just need extra information
about which groups to enable, which the KVM code will have to provide.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-12-maz@kernel.org
masking/unmasking doorbells on GICv4.1 relies on a new INVDB command,
which broadcasts the invalidation to all RDs.
Implement the new command as well as the masking callbacks, and plug
the whole thing into the v4.1 VPE irqchip.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-11-maz@kernel.org
Just like for GICv4.0, each VPE has its own doorbell interrupt, and
thus an irqchip that manages them. Since the doorbell management is
quite different on GICv4.1, let's introduce an almost empty irqchip
the will get populated over the next new patches.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-10-maz@kernel.org
With GICv4.1, VMOVP is extended to allow a default doorbell to be
specified, as well as a validity bit for this doorbell. As an added
bonus, VMOVP isn't required anymore of moving a VPE between
redistributors that share the same affinity.
Let's add this support to the VMOVP builder, and make sure we don't
issue the command if we don't really need to.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-9-maz@kernel.org
The infamous VPE proxy device isn't used with GICv4.1 because:
- we can invalidate any LPI from the DirectLPI MMIO interface
- the ITS and redistributors understand the life cycle of
the doorbell, so we don't need to enable/disable it all
the time
So let's escape early from the proxy related functions.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-8-maz@kernel.org
The ITS VMAPP command gains some new fields with GICv4.1:
- a default doorbell, which allows a single doorbell to be used for
all the VLPIs routed to a given VPE
- a pointer to the configuration table (instead of having it in a register
that gets context switched)
- a flag indicating whether this is the first map or the last unmap for
this particular VPE
- a flag indicating whether the pending table is known to be zeroed, or not
Plumb in the new fields in the VMAPP builder, and add the map/unmap
refcounting so that the ITS can do the right thing.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-7-maz@kernel.org
GICv4.1 defines a new VPE table that is potentially shared between
both the ITSs and the redistributors, following complicated affinity
rules.
To make things more confusing, the programming of this table at
the redistributor level is reusing the GICv4.0 GICR_VPROPBASER register
for something completely different.
The code flow is somewhat complexified by the need to respect the
affinities required by the HW, meaning that tables can either be
inherited from a previously discovered ITS or redistributor.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-6-maz@kernel.org
While GICv4.0 mandates 16 bit worth of VPEIDs, GICv4.1 allows smaller
implementations to be built. Add the required glue to dynamically
compute the limit.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-3-maz@kernel.org
When updating an LPI configuration, get_vlpi_map() may be passed a
irq_data structure relative to an ITS domain (the normal case) or one
that is relative to the core GICv3 domain in the case of a GICv4
doorbell.
In the latter case, special care must be take not to dereference
the irq_chip data as an its_dev structure, as that isn't what is
stored there. Instead, check *first* whether the IRQ is forwarded
to a vcpu, and only then try to obtain the vlpi mapping.
Fixes: c1d4d5cd20 ("irqchip/gic-v3-its: Add its_vlpi_map helpers")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20200122085609.658-1-yuzenghui@huawei.com