Check for iommu_gather_ops structures that are only stored in the tlb
field of an io_pgtable_cfg structure. The tlb field is of type
const struct iommu_gather_ops *, so iommu_gather_ops structures
having this property can be declared as const.
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
We can use for_each_set_bit() to simplify the code slightly in the
ARM io-pgtable self tests.
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Pull IOMMU fixes from David Woodhouse:
"Two minor fixes.
The first fixes the assignment of SR-IOV virtual functions to the
correct IOMMU unit, and the second fixes the excessively large (and
physically contiguous) PASID tables used with SVM"
* git://git.infradead.org/intel-iommu:
iommu/vt-d: Fix PASID table allocation
iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions
Somehow I ended up with an off-by-three error in calculating the size of
the PASID and PASID State tables, which triggers allocations failures as
those tables unfortunately have to be physically contiguous.
In fact, even the *correct* maximum size of 8MiB is problematic and is
wont to lead to allocation failures. Since I have extracted a promise
that this *will* be fixed in hardware, I'm happy to limit it on the
current hardware to a maximum of 0x20000 PASIDs, which gives us 1MiB
tables — still not ideal, but better than before.
Reported by Mika Kuoppala <mika.kuoppala@linux.intel.com> and also by
Xunlei Pang <xlpang@redhat.com> who submitted a simpler patch to fix
only the allocation (and not the free) to the "correct" limit... which
was still problematic.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Cc: stable@vger.kernel.org
When searching for a free IOVA range, we optimise the tree traversal
by starting from the cached32_node, instead of the last node, when
limit_pfn is equal to dma_32bit_pfn. However, if limit_pfn happens to
be smaller, then we'll go ahead and start from the top even though
dma_32bit_pfn is still a more suitable upper bound. Since this is
clearly a silly thing to do, adjust the lookup condition appropriately.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
For each subsequent device assigned to the m4u_group after its initial
allocation, we need to take an additional reference. Otherwise, the
caller of iommu_group_get_for_dev() will inadvertently remove the
reference taken by iommu_group_add_device(), and the group will be
freed prematurely if any device is removed.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
For each subsequent device assigned to the m4u_group after its initial
allocation, we need to take an additional reference. Otherwise, the
caller of iommu_group_get_for_dev() will inadvertently remove the
reference taken by iommu_group_add_device(), and the group will be
freed prematurely if any device is removed.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
If acpihid_device_group() finds an existing group for the relevant
devid, it should be taking an additional reference on that group.
Otherwise, the caller of iommu_group_get_for_dev() will inadvertently
remove the reference taken by iommu_group_add_device(), and the group
will be freed prematurely if any device is removed.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When arm_smmu_device_group() finds an existing group due to Stream ID
aliasing, it should be taking an additional reference on that group.
Otherwise, the caller of iommu_group_get_for_dev() will inadvertently
remove the reference taken by iommu_group_add_device(), and the group
will be freed prematurely if any device is removed.
Reported-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
iommu_group_get_for_dev() expects that the IOMMU driver's device_group
callback return a group with a reference held for the given device.
Whilst allocating a new group is fine, and pci_device_group() correctly
handles reusing an existing group, there is no general means for IOMMU
drivers doing their own group lookup to take additional references on an
existing group pointer without having to also store device pointers or
resort to elaborate trickery.
Add an IOMMU-driver-specific function to fill the hole.
Acked-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch uses recently introduced device dependency links to track the
runtime pm state of the master's device. The goal is to let SYSMMU
controller device's runtime PM to follow the runtime PM state of the
respective master's device. This way each SYSMMU controller is active
only when its master's device is active and can properly restore or save
its state instead on runtime PM transition of master's device.
This approach replaces old behavior, when SYSMMU controller was set to
runtime active once after attaching to the master device. In the new
approach SYSMMU controllers no longer prevents respective power domains
to be turned off when master's device is not being used.
This patch reduces total power consumption of idle system, because most
power domains can be finally turned off. For example, on Exynos 4412
based Odroid U3 this patch reduces power consuption from 136mA to 130mA
at 5V (by 4.4%).
The dependency links also enforce proper order of suspending/restoring
devices during system sleep transition, so there is no more need to use
LATE_SYSTEM_SLEEP_PM_OPS-based workaround for ensuring that SYSMMUs are
suspended after their master devices.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch adds runtime pm implementation, which is based on previous
suspend/resume code. SYSMMU controller is now being enabled/disabled mainly
from the runtime pm callbacks. System sleep callbacks relies on generic
pm_runtime_force_suspend/pm_runtime_force_resume helpers. To ensure
internal state consistency, additional lock for runtime pm transitions
was introduced.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch reworks locking in the exynos_iommu_attach/detach_device
functions to ensure that all entries of the sysmmu_drvdata and
exynos_iommu_owner structure are updated under the respective spinlocks,
while runtime pm functions are called without any spinlocks held.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
To avoid possible races, set master device pointer in each SYSMMU
controller once on boot. Suspend/resume callbacks now properly relies on
the configured iommu domain to enable or disable SYSMMU controller.
While changing the code, also update the sleep debug messages and make
them conditional.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Remove remaining leftovers of the ref-count related code in the
__sysmmu_enable/disable functions inline __sysmmu_enable/disable_nocount
to them. Suspend/resume callbacks now checks if master device is set for
given SYSMMU controller instead of relying on the activation count.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
__sysmmu_enable/disable functions were designed to do ref-count based
operations, but current code always calls them only once, so the code for
checking the conditions and invalid conditions can be simply removed
without any influence to the driver operation.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Remove excessive, useless debug about skipping TLB invalidation, which
is a normal situation when more aggressive power management is enabled.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
With the new dma_{map,unmap}_resource() functions added to the DMA API
for the benefit of cases like slave DMA, add suitable implementations to
the arsenal of our generic layer. Since cache maintenance should not be
a concern, these can both be standalone callback implementations without
the need for arch code wrappers.
CC: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch add support for page access protection bits. Till now this
feature was disabled and Exynos SYSMMU always mapped pages as read/write.
Now page access bits are set according to the protection bits provided
in iommu_map(), so Exynos SYSMMU is able to detect incorrect access to
mapped pages. Exynos SYSMMU earlier than v5 doesn't support write-only
mappings, so pages with such protection bits are mapped as read/write.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This will get rid of a lot false positives caused by kmemleak being
unaware of the irq_remap_table. Based on a suggestion from Catalin Marinas.
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Convert DT component matching to use component_match_add_release().
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Our per-device data consists of the M4U instance and firmware-provided
list of LARB IDs, which is a perfect fit for the generic iommu_fwspec
machinery. Use that directly instead of the custom archdata code - while
we can't rely on the of_xlate() mechanism to initialise things until the
32-bit ARM DMA code learns about groups and default domains, it still
results in a reasonable simplification overall.
CC: Honghui Zhang <honghui.zhang@mediatek.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Honghui Zhang <honghui.zhang@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Our per-device data consists of the M4U instance and firmware-provided
list of LARB IDs, which is a perfect fit for the generic iommu_fwspec
machinery. Use that directly as a simpler alternative to the custom
archdata code.
CC: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
It turns out that the disable_dmar_iommu() code-path tried
to get the device_domain_lock recursivly, which will
dead-lock when this code runs on dmar removal. Fix both
code-paths that could lead to the dead-lock.
Fixes: 55d940430a ('iommu/vt-d: Get rid of domain->iommu_lock')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When we iterate a master's config entries, what we generally care
about is the entry's stream map index, rather than the entry index
itself, so it's nice to have the iterator automatically assign the
former from the latter. Unfortunately, booting with KASAN reveals
the oversight that using a simple comma operator results in the
entry index being dereferenced before being checked for validity,
so we always access one element past the end of the fwspec array.
Flip things around so that the check always happens before the index
may be dereferenced.
Fixes: adfec2e709 ("iommu/arm-smmu: Convert to iommu_fwspec")
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We seem to have forgotten to check that iommu_fwspecs actually belong to
us before we go ahead and dereference their private data. Oops.
Fixes: 021bb8420d ("iommu/arm-smmu: Wire up generic configuration support")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We now delay installing our per-bus iommu_ops until we know an SMMU has
successfully probed, as they don't serve much purpose beforehand, and
doing so also avoids fights between multiple IOMMU drivers in a single
kernel. However, the upshot of passing the return value of bus_set_iommu()
back from our probe function is that if there happens to be more than
one SMMUv3 device in a system, the second and subsequent probes will
wind up returning -EBUSY to the driver core and getting torn down again.
Avoid re-setting ops if ours are already installed, so that any genuine
failures stand out.
Fixes: 08d4ca2a67 ("iommu/arm-smmu: Support non-PCI devices with SMMUv3")
CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
CC: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The 32-bit ARM DMA configuration code predates the IOMMU core's default
domain functionality, and instead relies on allocating its own domains
and attaching any devices using the generic IOMMU binding to them.
Unfortunately, it does this relatively early on in the creation of the
device, before we've seen our add_device callback, which leads us to
attempt to operate on a half-configured master.
To avoid a crash, check for this situation on attach, but refuse to
play, as there's nothing we can do. This at least allows VFIO to keep
working for people who update their 32-bit DTs to the generic binding,
albeit with a few (innocuous) warnings from the DMA layer on boot.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The VT-d specification (§8.3.3) says:
‘Virtual Functions’ of a ‘Physical Function’ are under the scope
of the same remapping unit as the ‘Physical Function’.
The BIOS is not required to list all the possible VFs in the scope
tables, and arguably *shouldn't* make any attempt to do so, since there
could be a huge number of them.
This has been broken basically for ever — the VF is never going to match
against a specific unit's scope, so it ends up being assigned to the
INCLUDE_ALL IOMMU. Which was always actually correct by coincidence, but
now we're looking at Root-Complex integrated devices with SR-IOV support
it's going to start being wrong.
Fix it to simply use pci_physfn() before doing the lookup for PCI devices.
Cc: stable@vger.kernel.org
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Including:
* Support for interrupt virtualization in the AMD IOMMU driver.
These patches were shared with the KVM tree and are already
merged through that tree.
* Generic DT-binding support for the ARM-SMMU driver. With this
the driver now makes use of the generic DMA-API code. This
also required some changes outside of the IOMMU code, but
these are acked by the respective maintainers.
* More cleanups and fixes all over the place.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJX/PmUAAoJECvwRC2XARrjyloP/1hymxXC2yXZ4EIBTHSO5X+c
jSJaGTIbQAdQDpllscSNJ0Au43L3vGtJcHo4JqwEERNlwLsU82LH7QJhq+q1La/b
5cPaY5gI3E++qxQt8umuZJAIUQthFYrfGoS5lJc5t5r/d8iVsLWbW4VkR19/1o7A
4/Uz7ETmi9VVy8Hkvumx+PQ0VHJet381KB7ud9LU5Spim0En2AAGwZXLMkmxXd2W
uDQ+O1rlDVc2/ka3+GmfZEml5EASWRqS/MTNoU/ZbQGYWKCWygXbuiqt6gLudWjx
dCR1Knh68b0gN6k/QAj8XY/1gkfmZ3YkfS0AHIMLYTFRT51BuxOrkXrBdkYnWEBv
UirmaiV87SlR1j83yb3ZmjpBPvd2sGWYFDqY1P0riLutjGUS6zycWWs13olvbfbz
SFrH7PT7JPQGYprI1oVn4ihszjN1NZ4+Gj7QBhyFW6FtvqTzmaFVsMOlDIeg1FwR
k8cOzov4NG33Bp4IpsHK8e0/qV6K3oJOiOQgCyQp9kPKK+UWv9v9+HaEA7npJuRV
c+lTE6j3G4LjEoVybkqm8TiPKxTMVNjUjgA3kwB2yNkCQT7hTCNYIAFrtfCYjYdo
B1dnFE7feVqtoimnu2qvkVs59hWlF7Hc3RRHoBxMmO8DLwl9n2OcmoQIeCTsviss
i9aNwC9bzBs+Hd3X/psB
=1hFE
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
- support for interrupt virtualization in the AMD IOMMU driver. These
patches were shared with the KVM tree and are already merged through
that tree.
- generic DT-binding support for the ARM-SMMU driver. With this the
driver now makes use of the generic DMA-API code. This also required
some changes outside of the IOMMU code, but these are acked by the
respective maintainers.
- more cleanups and fixes all over the place.
* tag 'iommu-updates-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (40 commits)
iommu/amd: No need to wait iommu completion if no dte irq entry change
iommu/amd: Free domain id when free a domain of struct dma_ops_domain
iommu/amd: Use standard bitmap operation to set bitmap
iommu/amd: Clean up the cmpxchg64 invocation
iommu/io-pgtable-arm: Check for v7s-incapable systems
iommu/dma: Avoid PCI host bridge windows
iommu/dma: Add support for mapping MSIs
iommu/arm-smmu: Set domain geometry
iommu/arm-smmu: Wire up generic configuration support
Docs: dt: document ARM SMMU generic binding usage
iommu/arm-smmu: Convert to iommu_fwspec
iommu/arm-smmu: Intelligent SMR allocation
iommu/arm-smmu: Add a stream map entry iterator
iommu/arm-smmu: Streamline SMMU data lookups
iommu/arm-smmu: Refactor mmu-masters handling
iommu/arm-smmu: Keep track of S2CR state
iommu/arm-smmu: Consolidate stream map entry state
iommu/arm-smmu: Handle stream IDs more dynamically
iommu/arm-smmu: Set PRIVCFG in stage 1 STEs
iommu/arm-smmu: Support non-PCI devices with SMMUv3
...
All architectures:
Move `make kvmconfig` stubs from x86; use 64 bits for debugfs stats.
ARM:
Important fixes for not using an in-kernel irqchip; handle SError
exceptions and present them to guests if appropriate; proxying of GICV
access at EL2 if guest mappings are unsafe; GICv3 on AArch32 on ARMv8;
preparations for GICv3 save/restore, including ABI docs; cleanups and
a bit of optimizations.
MIPS:
A couple of fixes in preparation for supporting MIPS EVA host kernels;
MIPS SMP host & TLB invalidation fixes.
PPC:
Fix the bug which caused guests to falsely report lockups; other minor
fixes; a small optimization.
s390:
Lazy enablement of runtime instrumentation; up to 255 CPUs for nested
guests; rework of machine check deliver; cleanups and fixes.
x86:
IOMMU part of AMD's AVIC for vmexit-less interrupt delivery; Hyper-V
TSC page; per-vcpu tsc_offset in debugfs; accelerated INS/OUTS in
nVMX; cleanups and fixes.
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJX9iDrAAoJEED/6hsPKofoOPoIAIUlgojkb9l2l1XVDgsXdgQL
sRVhYSVv7/c8sk9vFImrD5ElOPZd+CEAIqFOu45+NM3cNi7gxip9yftUVs7wI5aC
eDZRWm1E4trDZLe54ZM9ThcqZzZZiELVGMfR1+ZndUycybwyWzafpXYsYyaXp3BW
hyHM3qVkoWO3dxBWFwHIoO/AUJrWYkRHEByKyvlC6KPxSdBPSa5c1AQwMCoE0Mo4
K/xUj4gBn9eMelNhg4Oqu/uh49/q+dtdoP2C+sVM8bSdquD+PmIeOhPFIcuGbGFI
B+oRpUhIuntN39gz8wInJ4/GRSeTuR2faNPxMn4E1i1u4LiuJvipcsOjPfe0a18=
=fZRB
-----END PGP SIGNATURE-----
Merge tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Radim Krčmář:
"All architectures:
- move `make kvmconfig` stubs from x86
- use 64 bits for debugfs stats
ARM:
- Important fixes for not using an in-kernel irqchip
- handle SError exceptions and present them to guests if appropriate
- proxying of GICV access at EL2 if guest mappings are unsafe
- GICv3 on AArch32 on ARMv8
- preparations for GICv3 save/restore, including ABI docs
- cleanups and a bit of optimizations
MIPS:
- A couple of fixes in preparation for supporting MIPS EVA host
kernels
- MIPS SMP host & TLB invalidation fixes
PPC:
- Fix the bug which caused guests to falsely report lockups
- other minor fixes
- a small optimization
s390:
- Lazy enablement of runtime instrumentation
- up to 255 CPUs for nested guests
- rework of machine check deliver
- cleanups and fixes
x86:
- IOMMU part of AMD's AVIC for vmexit-less interrupt delivery
- Hyper-V TSC page
- per-vcpu tsc_offset in debugfs
- accelerated INS/OUTS in nVMX
- cleanups and fixes"
* tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (140 commits)
KVM: MIPS: Drop dubious EntryHi optimisation
KVM: MIPS: Invalidate TLB by regenerating ASIDs
KVM: MIPS: Split kernel/user ASID regeneration
KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
KVM: arm/arm64: vgic: Don't flush/sync without a working vgic
KVM: arm64: Require in-kernel irqchip for PMU support
KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register
KVM: PPC: Book3S PR: Support 64kB page size on POWER8E and POWER8NVL
KVM: PPC: Book3S: Remove duplicate setting of the B field in tlbie
KVM: PPC: BookE: Fix a sanity check
KVM: PPC: Book3S HV: Take out virtual core piggybacking code
KVM: PPC: Book3S: Treat VTB as a per-subcore register, not per-thread
ARM: gic-v3: Work around definition of gic_write_bpr1
KVM: nVMX: Fix the NMI IDT-vectoring handling
KVM: VMX: Enable MSR-BASED TPR shadow even if APICv is inactive
KVM: nVMX: Fix reload apic access page warning
kvmconfig: add virtio-gpu to config fragment
config: move x86 kvm_guest.config to a common location
arm64: KVM: Remove duplicating init code for setting VMID
ARM: KVM: Support vgic-v3
...
Pull s390 updates from Martin Schwidefsky:
"The new features and main improvements in this merge for v4.9
- Support for the UBSAN sanitizer
- Set HAVE_EFFICIENT_UNALIGNED_ACCESS, it improves the code in some
places
- Improvements for the in-kernel fpu code, in particular the overhead
for multiple consecutive in kernel fpu users is recuded
- Add a SIMD implementation for the RAID6 gen and xor operations
- Add RAID6 recovery based on the XC instruction
- The PCI DMA flush logic has been improved to increase the speed of
the map / unmap operations
- The time synchronization code has seen some updates
And bug fixes all over the place"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (48 commits)
s390/con3270: fix insufficient space padding
s390/con3270: fix use of uninitialised data
MAINTAINERS: update DASD maintainer
s390/cio: fix accidental interrupt enabling during resume
s390/dasd: add missing \n to end of dev_err messages
s390/config: Enable config options for Docker
s390/dasd: make query host access interruptible
s390/dasd: fix panic during offline processing
s390/dasd: fix hanging offline processing
s390/pci_dma: improve lazy flush for unmap
s390/pci_dma: split dma_update_trans
s390/pci_dma: improve map_sg
s390/pci_dma: simplify dma address calculation
s390/pci_dma: remove dma address range check
iommu/s390: simplify registration of I/O address translation parameters
s390: migrate exception table users off module.h and onto extable.h
s390: export header for CLP ioctl
s390/vmur: fix irq pointer dereference in int handler
s390/dasd: add missing KOBJ_CHANGE event for unformatted devices
s390: enable UBSAN
...
When a new function is attached to an iommu domain we need to register
I/O address translation parameters. Since commit
69eea95c ("s390/pci_dma: fix DMA table corruption with > 4 TB main memory")
start_dma and end_dma correctly describe the range of usable I/O addresses.
Simplify the code by using these values directly.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This is a clean up. In get_irq_table() only if DTE entry is changed
iommu_completion_wait() need be called. Otherwise no need to do it.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The current code missed freeing domain id when free a domain of
struct dma_ops_domain.
Signed-off-by: Baoquan He <bhe@redhat.com>
Fixes: ec487d1a11 ('x86, AMD IOMMU: add domain allocation and deallocation functions')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Change it as it's designed for and keep it consistent with other
places.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
On machines with no 32-bit addressable RAM whatsoever, we shouldn't
even touch the v7s format as it's never going to work.
Fixes: e5fc9753b1 ("iommu/io-pgtable: Add ARMv7 short descriptor support")
Reported-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
With our DMA ops enabled for PCI devices, we should avoid allocating
IOVAs which a host bridge might misinterpret as peer-to-peer DMA and
lead to faults, corruption or other badness. To be safe, punch out holes
for all of the relevant host bridge's windows when initialising a DMA
domain for a PCI device.
CC: Marek Szyprowski <m.szyprowski@samsung.com>
CC: Inki Dae <inki.dae@samsung.com>
Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When an MSI doorbell is located downstream of an IOMMU, attaching
devices to a DMA ops domain and switching on translation leads to a rude
shock when their attempt to write to the physical address returned by
the irqchip driver faults (or worse, writes into some already-mapped
buffer) and no interrupt is forthcoming.
Address this by adding a hook for relevant irqchip drivers to call from
their compose_msi_msg() callback, to swizzle the physical address with
an appropriatly-mapped IOVA for any device attached to one of our DMA
ops domains.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
For non-aperture-based IOMMUs, the domain geometry seems to have become
the de-facto way of indicating the input address space size. That is
quite a useful thing from the users' perspective, so let's do the same.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
With everything else now in place, fill in an of_xlate callback and the
appropriate registration to plumb into the generic configuration
machinery, and watch everything just work.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In the final step of preparation for full generic configuration support,
swap our fixed-size master_cfg for the generic iommu_fwspec. For the
legacy DT bindings, the driver simply gets to act as its own 'firmware'.
Farewell, arbitrary MAX_MASTER_STREAMIDS!
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Stream Match Registers are one of the more awkward parts of the SMMUv2
architecture; there are typically never enough to assign one to each
stream ID in the system, and configuring them such that a single ID
matches multiple entries is catastrophically bad - at best, every
transaction raises a global fault; at worst, they go *somewhere*.
To address the former issue, we can mask ID bits such that a single
register may be used to match multiple IDs belonging to the same device
or group, but doing so also heightens the risk of the latter problem
(which can be nasty to debug).
Tackle both problems at once by replacing the simple bitmap allocator
with something much cleverer. Now that we have convenient in-memory
representations of the stream mapping table, it becomes straightforward
to properly validate new SMR entries against the current state, opening
the door to arbitrary masking and SMR sharing.
Another feature which falls out of this is that with IDs shared by
separate devices being automatically accounted for, simply associating a
group pointer with the S2CR offers appropriate group allocation almost
for free, so hook that up in the process.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
We iterate over the SMEs associated with a master config quite a lot in
various places, and are about to do so even more. Let's wrap the idiom
in a handy iterator macro before the repetition gets out of hand.
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Simplify things somewhat by stashing our arm_smmu_device instance in
drvdata, so that it's readily available to our driver model callbacks.
Then we can excise the private list entirely, since the driver core
already has a perfectly good list of SMMU devices we can use in the one
instance we actually need to. Finally, make a further modest code saving
with the relatively new of_device_get_match_data() helper.
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
To be able to support the generic bindings and handle of_xlate() calls,
we need to be able to associate SMMUs and stream IDs directly with
devices *before* allocating IOMMU groups. Furthermore, to support real
default domains with multi-device groups we also have to handle domain
attach on a per-device basis, as the "whole group at a time" assumption
fails to properly handle subsequent devices added to a group after the
first has already triggered default domain creation and attachment.
To that end, use the now-vacant dev->archdata.iommu field for easy
config and SMMU instance lookup, and unify config management by chopping
down the platform-device-specific tree and probing the "mmu-masters"
property on-demand instead. This may add a bit of one-off overhead to
initially adding a new device, but we're about to deprecate that binding
in favour of the inherently-more-efficient generic ones anyway.
For the sake of simplicity, this patch does temporarily regress the case
of aliasing PCI devices by losing the duplicate stream ID detection that
the previous per-group config had. Stay tuned, because we'll be back to
fix that in a better and more general way momentarily...
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Making S2CRs first-class citizens within the driver with a high-level
representation of their state offers a neat solution to a few problems:
Firstly, the information about which context a device's stream IDs are
associated with is already present by necessity in the S2CR. With that
state easily accessible we can refer directly to it and obviate the need
to track an IOMMU domain in each device's archdata (its earlier purpose
of enforcing correct attachment of multi-device groups now being handled
by the IOMMU core itself).
Secondly, the core API now deprecates explicit domain detach and expects
domain attach to move devices smoothly from one domain to another; for
SMMUv2, this notion maps directly to simply rewriting the S2CRs assigned
to the device. By giving the driver a suitable abstraction of those
S2CRs to work with, we can massively reduce the overhead of the current
heavy-handed "detach, free resources, reallocate resources, attach"
approach.
Thirdly, making the software state hardware-shaped and attached to the
SMMU instance once again makes suspend/resume of this register group
that much simpler to implement in future.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In order to consider SMR masking, we really want to be able to validate
ID/mask pairs against existing SMR contents to prevent stream match
conflicts, which at best would cause transactions to fault unexpectedly,
and at worst lead to silent unpredictable behaviour. With our SMMU
instance data holding only an allocator bitmap, and the SMR values
themselves scattered across master configs hanging off devices which we
may have no way of finding, there's essentially no way short of digging
everything back out of the hardware. Similarly, the thought of power
management ops to support suspend/resume faces the exact same problem.
By massaging the software state into a closer shape to the underlying
hardware, everything comes together quite nicely; the allocator and the
high-level view of the data become a single centralised state which we
can easily keep track of, and to which any updates can be validated in
full before being synchronised to the hardware itself.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Rather than assuming fixed worst-case values for stream IDs and SMR
masks, keep track of whatever implemented bits the hardware actually
reports. This also obviates the slightly questionable validation of SMR
fields in isolation - rather than aborting the whole SMMU probe for a
hardware configuration which is still architecturally valid, we can
simply refuse masters later if they try to claim an unrepresentable ID
or mask (which almost certainly implies a DT error anyway).
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Implement the SMMUv3 equivalent of d346180e70 ("iommu/arm-smmu: Treat
all device transactions as unprivileged"), so that once again those
pesky DMA controllers with their privileged instruction fetches don't
unexpectedly fault in stage 1 domains due to VMSAv8 rules.
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
With the device <-> stream ID relationship suitably abstracted and
of_xlate() hooked up, the PCI dependency now looks, and is, entirely
arbitrary. Any bus using the of_dma_configure() mechanism will work,
so extend support to the platform and AMBA buses which do just that.
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Now that we can properly describe the mapping between PCI RIDs and
stream IDs via "iommu-map", and have it fed it to the driver
automatically via of_xlate(), rework the SMMUv3 driver to benefit from
that, and get rid of the current misuse of the "iommus" binding.
Since having of_xlate wired up means that masters will now be given the
appropriate DMA ops, we also need to make sure that default domains work
properly. This necessitates dispensing with the "whole group at a time"
notion for attaching to a domain, as devices which share a group get
attached to the group's default domain one by one as they are initially
probed.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Unlike SMMUv2, SMMUv3 has no easy way to bypass unknown stream IDs,
other than allocating and filling in the entire stream table with bypass
entries, which for some configurations would waste *gigabytes* of RAM.
Otherwise, all transactions on unknown stream IDs will simply be aborted
with a C_BAD_STREAMID event.
Rather than render the system unusable in the case of an invalid DT,
avoid enabling the SMMU altogether such that everything bypasses
(though letting the explicit disable_bypass option take precedence).
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Introduce a common structure to hold the per-device firmware data that
most IOMMU drivers need to keep track of. This enables us to configure
much of that data from common firmware code, and consolidate a lot of
the equivalent implementations, device look-up tables, etc. which are
currently strewn across IOMMU drivers.
This will also be enable us to address the outstanding "multiple IOMMUs
on the platform bus" problem by tweaking IOMMU API calls to prefer
dev->fwspec->ops before falling back to dev->bus->iommu_ops, and thus
gracefully handle those troublesome systems which we currently cannot.
As the first user, hook up the OF IOMMU configuration mechanism. The
driver-defined nature of DT cells means that we still need the drivers
to translate and add the IDs themselves, but future users such as the
much less free-form ACPI IORT will be much simpler and self-contained.
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Now that we have a way to pick up the RID translation and target IOMMU,
hook up of_iommu_configure() to bring PCI devices into the of_xlate
mechanism and allow them IOMMU-backed DMA ops without the need for
driver-specific handling.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The cmdq lock is taken whenever we issue commands into the command queue,
which can occur in IRQ context (as a result of unmap) or in process
context (as a result of a threaded IRQ handler or device probe).
This can lead to a theoretical deadlock if the interrupt handler
performing the unmap hits whilst the lock is taken, so explicitly use
the {irqsave,irqrestore} spin_lock accessors for the cmdq lock.
Tested-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When the SMMUv3 driver attempts to send a command, it adds an entry to the
command queue. This is a circular buffer, where both the producer and
consumer have a wrap bit. When producer.index == consumer.index and
producer.wrap == consumer.wrap, the list is empty. When producer.index ==
consumer.index and producer.wrap != consumer.wrap, the list is full.
If the list is full when the driver needs to add a command, it waits for
the SMMU to consume one command, and advance the consumer pointer. The
problem is that we currently rely on "X before Y" operation to know if
entries have been consumed, which is a bit fiddly since it only makes
sense when the distance between X and Y is less than or equal to the size
of the queue. At the moment when the list is full, we use "Consumer before
Producer + 1", which is out of range and returns a value opposite to what
we expect: when the queue transitions to not full, we stay in the polling
loop and time out, printing an error.
Given that the actual bug was difficult to determine, simplify the polling
logic by relying exclusively on queue_full and queue_empty, that don't
have this range constraint. Polling the queue is now straightforward:
* When we want to add a command and the list is full, wait until it isn't
full and retry.
* After adding a sync, wait for the list to be empty before returning.
Suggested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Fill in the last bits of machinery required to drive a stage 1 context
bank in v7 short descriptor format. By default we'll prefer to use it
only when the CPUs are also using the same format, such that we're
guaranteed that everything will be strictly 32-bit.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
SMMUv3 only sends interrupts for event queues (EVTQ and PRIQ) when they
transition from empty to non-empty. At the moment, if the SMMU adds new
items to a queue before the event thread finished consuming a previous
batch, the driver ignores any new item. The queue is then stuck in
non-empty state and all subsequent events will be lost.
As an example, consider the following flow, where (P, C) is the SMMU view
of producer/consumer indices, and (p, c) the driver view.
P C | p c
1. SMMU appends a PPR to the PRI queue, 1 0 | 0 0
sends an MSI
2. PRIQ handler is called. 1 0 | 1 0
3. SMMU appends a PPR to the PRI queue. 2 0 | 1 0
4. PRIQ thread removes the first element. 2 1 | 1 1
5. PRIQ thread believes that the queue is empty, goes into idle
indefinitely.
To avoid this, always synchronize the producer index and drain the queue
once before leaving an event handler. In order to prevent races on the
local producer index, move all event queue handling into the threads.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
There is no need to call devm_free_irq when driver detach.
devres_release_all which is called after 'drv->remove' will
release all managed resources.
Signed-off-by: Peng Fan <van.freenix@gmail.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The semaphore used by the AMD IOMMU to signal command
completion lived on the stack until now, which was safe as
the driver busy-waited on the semaphore with IRQs disabled,
so the stack can't go away under the driver.
But the recently introduced vmap-based stacks break this as
the physical address of the semaphore can't be determinded
easily anymore. The driver used the __pa() macro, but that
only works in the direct-mapping. The result were
Completion-Wait timeout errors seen by the IOMMU driver,
breaking system boot.
Since putting the semaphore on the stack is bad design
anyway, move the semaphore into 'struct amd_iommu'. It is
protected by the per-iommu lock and now in the direct
mapping again. This fixes the Completion-Wait timeout errors
and makes AMD IOMMU systems boot again with vmap-based
stacks enabled.
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When a domain is allocated through the get_valid_domain_for_dev
path, it will be context-mapped before the RMRR regions are
mapped in the page-table. This opens a short time window
where device-accesses to these regions fail and causing DMAR
faults.
Fix this by mapping the RMRR regions before the domain is
context-mapped.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Split out the search for an already existing domain and the
context mapping of the device to the new domain.
This allows to map possible RMRR regions into the domain
before it is context mapped.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Let's fix the error handle of ipmmu_add_device
when failing to find utlbs, otherwise we take a
risk of pontential memleak.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Commit e85e8f69ce ("iommu/amd: Remove statistics code")
removed that configuration.
Also remove function definition (suggested by Joerg Roedel)
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Introduce struct iommu_dev_data.use_vapic flag, which IOMMU driver
uses to determine if it should enable vAPIC support, by setting
the ga_mode bit in the device's interrupt remapping table entry.
Currently, it is enabled for all pass-through device if vAPIC mode
is enabled.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch implements irq_set_vcpu_affinity() function to set up interrupt
remapping table entry with vapic mode for pass-through devices.
In case requirements for vapic mode are not met, it falls back to set up
the IRTE in legacy mode.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch adds AMD IOMMU guest virtual APIC log (GALOG) handler.
When IOMMU hardware receives an interrupt targeting a blocking vcpu,
it creates an entry in the GALOG, and generates an interrupt to notify
the AMD IOMMU driver.
At this point, the driver processes the log entry, and notify the SVM
driver via the registered iommu_ga_log_notifier function.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch adds support to detect and initialize IOMMU Guest vAPIC log
(GALOG). By default, it also enable GALog interrupt to notify IOMMU driver
when GA Log entry is created.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch enables support for the new 128-bit IOMMU IRTE format,
which can be used for both legacy and vapic interrupt remapping modes.
It replaces the existing operations on IRTE, which can only support
the older 32-bit IRTE format, with calls to the new struct amd_irt_ops.
It also provides helper functions for setting up, accessing, and
updating interrupt remapping table entries in different mode.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Currently, IOMMU support two interrupt remapping table entry formats,
32-bit (legacy) and 128-bit (GA). The spec also implies that it might
support additional modes/formats in the future.
So, this patch introduces the new struct amd_irte_ops, which allows
the same code to work with different irte formats by providing hooks
for various operations on an interrupt remapping table entry.
Suggested-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Move existing unions and structs for accessing/managing IRTE to a proper
header file. This is mainly to simplify variable declarations in subsequent
patches.
Besides, this patch also introduces new struct irte_ga for the new
128-bit IRTE format.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch introduces a new IOMMU driver parameter, amd_iommu_guest_ir,
which can be used to specify different interrupt remapping mode for
passthrough devices to VM guest:
* legacy: Legacy interrupt remapping (w/ 32-bit IRTE)
* vapic : Guest vAPIC interrupt remapping (w/ GA mode 128-bit IRTE)
Note that in vapic mode, it can also supports legacy interrupt remapping
for non-passthrough devices with the 128-bit IRTE.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The disable_bypass cmdline option changes the SMMUv3 driver to put down
faulting stream table entries by default, as opposed to bypassing
transactions from unconfigured devices.
In this mode of operation, it is entirely expected to see aborting
entries in the stream table if and when we come to installing a valid
translation, so don't trigger a BUG() as a result of misdiagnosing these
entries as stream table corruption.
Cc: <stable@vger.kernel.org>
Fixes: 48ec83bcbc ("iommu/arm-smmu: Add initial driver support for ARM SMMUv3 devices")
Tested-by: Robin Murphy <robin.murphy@arm.com>
Reported-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Enabling stalling faults can result in hardware deadlock on poorly
designed systems, particularly those with a PCI root complex upstream of
the SMMU.
Although it's not really Linux's job to save hardware integrators from
their own misfortune, it *is* our job to stop userspace (e.g. VFIO
clients) from hosing the system for everybody else, even if they might
already be required to have elevated privileges.
Given that the fault handling code currently executes entirely in IRQ
context, there is nothing that can sensibly be done to recover from
things like page faults anyway, so let's rip this code out for now and
avoid the potential for deadlock.
Cc: <stable@vger.kernel.org>
Fixes: 48ec83bcbc ("iommu/arm-smmu: Add initial driver support for ARM SMMUv3 devices")
Reported-by: Matt Evans <matt.evans@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In the unlikely event of a global command queue error, the ARM SMMUv3
driver attempts to convert the problematic command into a CMD_SYNC and
resume the command queue. Unfortunately, this code is pretty badly
broken:
1. It uses the index into the error string table as the CMDQ index,
so we probably read the wrong entry out of the queue
2. The arguments to queue_write are the wrong way round, so we end up
writing from the queue onto the stack.
These happily cancel out, so the kernel is likely to stay alive, but
the command queue will probably fault again when we resume.
This patch fixes the error handling code to use the correct queue index
and write back the CMD_SYNC to the faulting entry.
Cc: <stable@vger.kernel.org>
Fixes: 48ec83bcbc ("iommu/arm-smmu: Add initial driver support for ARM SMMUv3 devices")
Reported-by: Diwakar Subraveti <Diwakar.Subraveti@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Due to the attribute bits being all over the place in the different
types of short-descriptor PTEs, when remapping an existing entry, e.g.
splitting a section into pages, we take the approach of decomposing
the PTE attributes back to the IOMMU API flags to start from scratch.
On inspection, though, the existing code seems to have got the read-only
bit backwards and ignored the XN bit. How embarrassing...
Fortunately the primary user so far, the Mediatek IOMMU, both never
splits blocks (because it only serves non-overlapping DMA API calls) and
also ignores permissions anyway, but let's put things right before any
future users trip up.
Cc: <stable@vger.kernel.org>
Fixes: e5fc9753b1 ("iommu/io-pgtable: Add ARMv7 short descriptor support")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Where a device driver has set a 64-bit DMA mask to indicate the absence
of addressing limitations, we still need to ensure that we don't
allocate IOVAs beyond the actual input size of the IOMMU. The reported
aperture is the most reliable way we have of inferring that input
address size, so use that to enforce a hard upper limit where available.
Fixes: 0db2e5d18f ("iommu: Implement common IOMMU ops for DMA mapping")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Due to the limitations of having to wait until we see a device's DMA
restrictions before we know how we want an IOVA domain initialised,
there is a window for error if a DMA ops domain is allocated but later
freed without ever being used. In that case, init_iova_domain() was
never called, so calling put_iova_domain() from iommu_put_dma_cookie()
ends up trying to take an uninitialised lock and crashing.
Make things robust by skipping the call unless the IOVA domain actually
has been initialised, as we probably should have done from the start.
Fixes: 0db2e5d18f ("iommu: Implement common IOMMU ops for DMA mapping")
Cc: stable@vger.kernel.org
Reported-by: Nate Watterson <nwatters@codeaurora.org>
Reviewed-by: Nate Watterson <nwatters@codeaurora.org>
Tested-by: Nate Watterson <nwatters@codeaurora.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
of_platform_device_create returns NULL on error so an IS_ERR test is
incorrect here and a NULL check is required.
The Coccinelle semantic patch used to make this change is as follows:
@@
expression e;
@@
e = of_platform_device_create(...);
if(
- IS_ERR(e)
+ !e
)
{
<+...
return
- PTR_ERR(e)
+ -ENODEV
;
...+>
}
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Fix to return a negative error code from the alloc_irq_index() error
handling case instead of 0, as done elsewhere in this function.
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Fixes the following sparse warning:
drivers/iommu/amd_iommu.c:106:1: warning:
symbol '__pcpu_scope_flush_queue' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This was an oversight while merging these functions. Fix it.
Cc: Honghui Zhang <honghui.zhang@mediatek.com>
Fixes: 9ca340c98c ('iommu/mediatek: move the common struct into header file')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer. Thus the pointer can point to const data.
However the attributes do not have to be a bitfield. Instead unsigned
long will do fine:
1. This is just simpler. Both in terms of reading the code and setting
attributes. Instead of initializing local attributes on the stack
and passing pointer to it to dma_set_attr(), just set the bits.
2. It brings safeness and checking for const correctness because the
attributes are passed by value.
Semantic patches for this change (at least most of them):
virtual patch
virtual context
@r@
identifier f, attrs;
@@
f(...,
- struct dma_attrs *attrs
+ unsigned long attrs
, ...)
{
...
}
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
and
// Options: --all-includes
virtual patch
virtual context
@r@
identifier f, attrs;
type t;
@@
t f(..., struct dma_attrs *attrs);
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In the updates:
* Big endian support and preparation for defered probing for the
Exynos IOMMU driver
* Simplifications in iommu-group id handling
* Support for Mediatek generation one IOMMU hardware
* Conversion of the AMD IOMMU driver to use the generic IOVA
allocator. This driver now also benefits from the recent
scalability improvements in the IOVA code.
* Preparations to use generic DMA mapping code in the Rockchip
IOMMU driver
* Device tree adaption and conversion to use generic page-table
code for the MSM IOMMU driver
* An iova_to_phys optimization in the ARM-SMMU driver to greatly
improve page-table teardown performance with VFIO
* Various other small fixes and conversions
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJXl3e+AAoJECvwRC2XARrjMIgP/1Mm9qIfcaAxKY4ByqbVfrH8
313PO6rpwUhhywUmnf/1F/x+JbuLv8MmRXfSc106mdB1rq9NXpkORYKrqVxs0cSq
6u6TzZWbF6WN1ipqXxDITNFBSy7u97K1VuFaKyYFfLbg8xrkcdkMZJ7BqM2xIEdk
rnRKcfHo6wsmCXJ6InsUPmKAqU6AfMewZTGjO+v77Gce0rZEbsJ8n7BRKC9vO2bc
akvN2W+zzEUSyhbuyYQBG+agpmC5GJvz4u+6QvAP5sxTWfAsnwAoPpP4xxR+/KjT
eicHlja4v0YK6Hr4AJaMxoKfKIrCdqpWm0D2tg/edyWZCeg98AW/w7/s0I8OD3ao
Otj6IqC8nPk0pYciOeEPQ7aqPbvKAqU2FYWt7lWamrdr98u2R3p2nXGl0KthoAj6
JqzrCZXvBS7sj1IPLlGpj939yvbKbjpE0p7y1qhI1VEBXoBWFNvlKydkYx76BTGK
F6paGVqn2Zwy00AqAsylTEkvIK063zwShZ6nPqz4bMdVlgzjrjCzdDecjfbHr8Ic
6D2oCwyF+RJ8qw+Ecm9EmWFik80sgb+iUTeeYEXNf+YzLYt5McIj7fi3N+sUPel3
YJ4S4x0sIpgUZZ1i+rOo8ZPAFHRU6SRPYV+ewaeYKrMt+Un5dTn9SddpqrJdbiUu
YrF36BaQjc123IRGKrSd
=xiS2
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
- big-endian support and preparation for defered probing for the Exynos
IOMMU driver
- simplifications in iommu-group id handling
- support for Mediatek generation one IOMMU hardware
- conversion of the AMD IOMMU driver to use the generic IOVA allocator.
This driver now also benefits from the recent scalability
improvements in the IOVA code.
- preparations to use generic DMA mapping code in the Rockchip IOMMU
driver
- device tree adaption and conversion to use generic page-table code
for the MSM IOMMU driver
- an iova_to_phys optimization in the ARM-SMMU driver to greatly
improve page-table teardown performance with VFIO
- various other small fixes and conversions
* tag 'iommu-updates-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (59 commits)
iommu/amd: Initialize dma-ops domains with 3-level page-table
iommu/amd: Update Alias-DTE in update_device_table()
iommu/vt-d: Return error code in domain_context_mapping_one()
iommu/amd: Use container_of to get dma_ops_domain
iommu/amd: Flush iova queue before releasing dma_ops_domain
iommu/amd: Handle IOMMU_DOMAIN_DMA in ops->domain_free call-back
iommu/amd: Use dev_data->domain in get_domain()
iommu/amd: Optimize map_sg and unmap_sg
iommu/amd: Introduce dir2prot() helper
iommu/amd: Implement timeout to flush unmap queues
iommu/amd: Implement flush queue
iommu/amd: Allow NULL pointer parameter for domain_flush_complete()
iommu/amd: Set up data structures for flush queue
iommu/amd: Remove align-parameter from __map_single()
iommu/amd: Remove other remains of old address allocator
iommu/amd: Make use of the generic IOVA allocator
iommu/amd: Remove special mapping code for dma_ops path
iommu/amd: Pass gfp-flags to iommu_map_page()
iommu/amd: Implement apply_dm_region call-back
iommu/amd: Create a list of reserved iova addresses
...
- Removal of most of_platform_populate() calls in arch code. Now the DT
core code calls it in the default case and platforms only need to call
it if they have special needs.
- Use pr_fmt on all the DT core print statements.
- CoreSight binding doc improvements to block name descriptions.
- Add dt_to_config script which can parse dts files and list
corresponding kernel config options.
- Fix memory leak hit with a PowerMac DT.
- Correct a bunch of STMicro compatible strings to use the correct
vendor prefix.
- Fix DA9052 PMIC binding doc to match what is actually used in dts
files.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXm9KcAAoJEPr7XbWNvGHDRT4QAIIIOSB4AWHardnMLROgGge9
aOQKZ/05O9feOcxYKe8FkQbcH+IujJjrUL+yrRD36yGQPAyBP21gtcmmfrkCcwFM
kH915f/JbGvXpfwEf8dcarHhzYH6FFJiQGduPpWfwSSWynx+xq5EKPwCqYzMg8bN
SExxt7vUx1MKFOExZ0K8BNCo8VMVLUWQoJ1DNeJDuL25Op4EU3i2l1HQNYV/3XDk
BSA3x7Lw3GjrWEH20VWYn2Azq1OFLY+E2FC2lnG4nbkk5X8dZbUH9PR1Sk7uTQDj
uxTjWe59NBpliCxKSAbMbTAU/WwSB1pJ0I+zDJBiQsdFT+nb5F4zOrs3qSKHa/A9
Rv6AC8k5gdSMrDB1dOspfF2vWvOOInXgNV4/Kza0D92mbCpwyUuF+vhE6rfcMrZU
OiD7rj2/fvO7Y9fUAhrp6zrfrOfH9B1Z9vS+940AlK96YwPE2+J0SA2vBxR/wg8H
7fj4Ud5X+SFisXWQhh5Wlv0W9o6e7C7fsi8vpkQ7gufmezLFWVnJKsUfQaxGEwhG
Hkhm9kuSHHMd+6dEnn2756DnNfJAtQv6rSR0/QR4Lf9y5L4dvR3kAQIci8X/nx4P
sIk+IJWGZG6wziZq59hh+SO6HEqdSNuvh+5sbR0iUimdE/1HsDBdPiocXf/r8iwK
NY9nGeZPRrXmFgdpoZfm
=wLMr
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull DeviceTree updates from Rob Herring:
- remove most of_platform_populate() calls in arch code. Now the DT
core code calls it in the default case and platforms only need to
call it if they have special needs
- use pr_fmt on all the DT core print statements
- CoreSight binding doc improvements to block name descriptions
- add dt_to_config script which can parse dts files and list
corresponding kernel config options
- fix memory leak hit with a PowerMac DT
- correct a bunch of STMicro compatible strings to use the correct
vendor prefix
- fix DA9052 PMIC binding doc to match what is actually used in dts
files
* tag 'devicetree-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (35 commits)
documentation: da9052: Update regulator bindings names to match DA9052/53 DTS expectations
xtensa: Partially Revert "xtensa: Remove unnecessary of_platform_populate with default match table"
xtensa: Fix build error due to missing include file
MIPS: ath79: Add missing include file
Fix spelling errors in Documentation/devicetree
ARM: dts: fix STMicroelectronics compatible strings
powerpc/dts: fix STMicroelectronics compatible strings
Documentation: dt: i2c: use correct STMicroelectronics vendor prefix
scripts/dtc: dt_to_config - kernel config options for a devicetree
of: fdt: mark unflattened tree as detached
of: overlay: add resolver error prints
coresight: document binding acronyms
Documentation/devicetree: document cavium-pip rx-delay/tx-delay properties
of: use pr_fmt prefix for all console printing
of/irq: Mark initialised interrupt controllers as populated
of: fix memory leak related to safe_name()
Revert "of/platform: export of_default_bus_match_table"
of: unittest: use of_platform_default_populate() to populate default bus
memory: omap-gpmc: use of_platform_default_populate() to populate default bus
bus: uniphier-system-bus: use of_platform_default_populate() to populate default bus
...
Some of our "for_each_xyz()" macro constructs make gcc unhappy about
lack of braces around if-statements inside or outside the loop, because
the loop construct itself has a "if-then-else" statement inside of it.
The resulting warnings look something like this:
drivers/gpu/drm/i915/i915_debugfs.c: In function ‘i915_dump_lrc’:
drivers/gpu/drm/i915/i915_debugfs.c:2103:6: warning: suggest explicit braces to avoid ambiguous ‘else’ [-Wparentheses]
if (ctx != dev_priv->kernel_context)
^
even if the code itself is fine.
Since the warning is fairly easy to avoid by adding a braces around the
if-statement near the for_each_xyz() construct, do so, rather than
disabling the otherwise potentially useful warning.
(The if-then-else statements used in the "for_each_xyz()" constructs are
designed to be inherently safe even with no braces, but in this case
it's quite understandable that gcc isn't really able to tell that).
This finally leaves the standard "allmodconfig" build with just a
handful of remaining warnings, so new and valid warnings hopefully will
stand out.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A two-level page-table can map up to 1GB of address space.
With the IOVA allocator now in use, the allocated addresses
are often more closely to 4G, which requires the address
space to be increased much more often. Avoid that by using a
three-level page-table by default.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Not doing so might cause IO-Page-Faults when a device uses
an alias request-id and the alias-dte is left in a lower
page-mode which does not cover the address allocated from
the iova-allocator.
Fixes: 492667dacc ('x86/amd-iommu: Remove amd_iommu_pd_table')
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In 'commit <55d940430ab9> ("iommu/vt-d: Get rid of domain->iommu_lock")',
the error handling path is changed a little, which makes the function
always return 0.
This path fixes this.
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Fixes: 55d940430a ('iommu/vt-d: Get rid of domain->iommu_lock')
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This is better than storing an extra pointer in struct
protection_domain, because this pointer can now be removed
from the struct.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Before a dma_ops_domain can be freed, we need to make sure
it is not longer referenced by the flush queue. So empty the
queue before a dma_ops_domain can be freed.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This domain type is not yet handled in the
iommu_ops->domain_free() call-back. Fix that.
Fixes: 0bb6e243d7 ('iommu/amd: Support IOMMU_DOMAIN_DMA type allocation')
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Optimize these functions so that they need only one call
into the address alloctor. This also saves a couple of
io-tlb flushes in the unmap_sg path.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This function converts dma_data_direction to
iommu-protection flags. This will be needed on multiple
places in the code, so this will save some code.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In case the queue doesn't fill up, we flush the TLB at least
10ms after the unmap happened to make sure that the TLB is
cleaned up.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
With the flush queue the IOMMU TLBs will not be flushed at
every dma-ops unmap operation. The unmapped ranges will be
queued and flushed at once, when the queue is full. This
makes unmapping operations a lot faster (on average) and
restores the performance of the old address allocator.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
If domain == NULL is passed to the function, it will queue a
completion-wait command on all IOMMUs in the system.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The flush queue is the equivalent to defered-flushing in the
Intel VT-d driver. This patch sets up the data structures
needed for this.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Remove the old address allocation code and make use of the
generic IOVA allocator that is also used by other dma-ops
implementations.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Use the iommu-api map/unmap functions instead. This will be
required anyway when IOVA code is used for address
allocation.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Put the MSI-range, the HT-range and the MMIO ranges of PCI
devices into that range, so that these addresses are not
allocated for DMA.
Copy this address list into every created dma_ops_domain.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This new call-back will be used by the iommu driver to do
reserve the given dm_region in its iova space before the
mapping is created.
The call-back is temporary until the dma-ops implementation
is part of the common iommu code.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The default domain for a device might also be
identity-mapped. In this case the kernel would crash when
unity mappings are defined for the device. Fix that by
making sure the domain is a dma_ops domain.
Fixes: 0bb6e243d7 ('iommu/amd: Support IOMMU_DOMAIN_DMA type allocation')
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Ida handling can be much simplified by using the ida_simple_.. functions.
This change also fixes the bug that previously checking for errors
returned by ida_get_new() was incomplete.
ida_get_new() can return errors other than EAGAIN, e.g. ENOSPC.
This case wasn't handled.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
iommu_group_ida and iommu_group_mutex can be initialized statically.
There's no need to do this dynamically in the init function.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
According to the manual: "Hardware access to ... invalidation queue ...
are always coherent."
Remove unnecassary clflushes accordingly.
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Use devm_request_irq to simplify error handling path,
when probe smmu device.
Also devm_{request|free}_irq when init or destroy domain context.
Signed-off-by: Peng Fan <van.freenix@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
There is a race condition in the AMD IOMMU init code that
causes requested unity mappings to be blocked by the IOMMU
for a short period of time. This results on boot failures
and IO_PAGE_FAULTs on some machines.
Fix this by making sure the unity mappings are installed
before all other DMA is blocked.
Fixes: aafd8ba0ca ('iommu/amd: Implement add_device and remove_device')
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Per VT-d spec Section 10.4.2 ("Capability Register"), the maximum
number of possible domains is 64K; indeed this is the maximum value
that the cap_ndoms() macro will expand to. Since the value 65536
will not fix in a u16, the 'did' variable must be promoted to an
int, otherwise the test for < 65536 will always be true and the
loop will never end.
The symptom, in my case, was a hung machine during suspend.
Fixes: 3bd4f9112f ("iommu/vt-d: Fix overflow of iommu->domains array")
Signed-off-by: Aaron Campbell <aaron@monkey.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The implementation of iova_to_phys for the long-descriptor ARM
io-pgtable code always masks with the granule size when inserting the
low virtual address bits into the physical address determined from the
page tables. In cases where the leaf entry is found before the final
level of table (i.e. due to a block mapping), this results in rounding
down to the bottom page of the block mapping. Consequently, the physical
address range batching in the vfio_unmap_unpin is defeated and we end
up taking the long way home.
This patch fixes the problem by masking the virtual address with the
appropriate mask for the level at which the leaf descriptor is located.
The short-descriptor code already gets this right, so no change is
needed there.
Cc: <stable@vger.kernel.org>
Reported-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The PCIe ACS capability will affect the layout of iommu groups.
Generally speaking, if the path from root port to the PCIe device
is ACS enabled, the iommu will create a single iommu group for this
PCIe device. If all PCIe devices on the path are ACS enabled then
Linux can determine this path is ACS enabled.
Linux use two PCIe configuration registers to determine the ACS
status of PCIe devices:
ACS Capability Register and ACS Control Register.
The first register is used to check the implementation of ACS function
of a PCIe device, the second register is used to check the enable status
of ACS function. If one PCIe device has implemented and enabled the ACS
function then Linux will determine this PCIe device enabled ACS.
From the Chapter:6.12 of PCI Express Base Specification Revision 3.1a,
we can find that when a PCIe device implements ACS function, the enable
status is set to disabled by default and can be enabled by ACS-aware
software.
ACS will affect the iommu groups topology, so, the iommu driver is
ACS-aware software. This patch adds a call to pci_request_acs() to the
arm-smmu driver to enable the ACS function in PCIe devices that support
it, when they get probed.
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Wei Chen <Wei.Chen@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Set geometry for allocated domains and fix .domain_alloc() callback to
work with IOMMU_DOMAIN_DMA domain type, which is used for implicit
domains on ARM64.
Signed-off-by: Shunqian Zheng <zhengsq@rock-chips.com>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Use DMA API instead of architecture internal functions like
__cpuc_flush_dcache_area() etc.
The biggest difficulty here is that dma_map and _sync calls require some
struct device, while there is no real 1:1 relation between an IOMMU
domain and some device. To overcome this, a simple platform device is
registered for each allocated IOMMU domain.
With this patch, this driver can be used on both ARM and ARM64
platforms, such as RK3288 and RK3399 respectively.
Signed-off-by: Shunqian Zheng <zhengsq@rock-chips.com>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In .probe(), devm_kzalloc() is called with size == 0 and works only
by luck, due to internal behavior of the allocator and the fact
that the proper allocation size is small. Let's use proper value for
calculating the size.
Fixes: cd6438c5f8 ("iommu/rockchip: Reconstruct to support multi slaves")
Signed-off-by: Shunqian Zheng <zhengsq@rock-chips.com>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The iommu_dma_alloc() in iommu/dma-iommu.c calls iommu_map_sg()
that requires the callback iommu_ops .map_sg(). Adding the
default_iommu_map_sg() to Rockchip IOMMU accordingly.
Signed-off-by: Simon Xue <xxm@rock-chips.com>
Signed-off-by: Shunqian Zheng <xxm@rock-chips.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Even though the IOMMU shares IRQ with its master, the struct device
passed to {request,free}_irq is supposed to represent the device that is
signalling the interrupt. This patch makes the driver use IOMMU device
instead of master's device to make things clear.
Signed-off-by: Simon Xue <xxm@rock-chips.com>
Signed-off-by: Shunqian Zheng <zhengsq@rock-chips.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Commit 2a0cb4e2d4 ("iommu/amd: Add new map for storing IVHD dev entry
type HID") added a call to DUMP_printk in init_iommu_from_acpi() which
used the value of devid before this variable was initialized.
Fixes: 2a0cb4e2d4 ('iommu/amd: Add new map for storing IVHD dev entry type HID')
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The valid range of 'did' in get_iommu_domain(*iommu, did)
is 0..cap_ndoms(iommu->cap), so don't exceed that
range in free_all_cpu_cached_iovas().
The user-visible impact of the out-of-bounds access is the machine
hanging on suspend-to-ram. It is, in fact, a kernel panic, but due
to already suspended devices, that's often not visible to the user.
Fixes: 22e2f9fa63 ("iommu/vt-d: Use per-cpu IOVA caching")
Signed-off-by: Jan Niehusmann <jan@gondor.com>
Tested-By: Marius Vlad <marius.c.vlad@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The of_iommu_init() is called multiple times by arch code,
make it postcore_initcall_sync, then we can drop relevant
calls fully.
Note, the IOMMUs should have a chance to perform some basic
initialisation before we start adding masters to them. So
postcore_initcall_sync is good choice, it ensures of_iommu_init()
called before of_platform_populate.
Acked-by: Rich Felker <dalias@libc.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Rob Herring <robh@kernel.org>
Now that the driver is DT adapted, bus_set_iommu gets called only
when on compatible matching. So the driver should not break multiplatform
builds now. So remove the BROKEN config.
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Tested-by: Archit Taneja <architt@codeaurora.org>
Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This iommu uses the armv7 short descriptor format. So use the
generic ARMV7S pagetable ops instead of rewriting the same stuff
in the driver.
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Tested-by: Archit Taneja <architt@codeaurora.org>
Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This adds the xlate callback which gets invoked during
device registration from DT. The master devices gets added
through this.
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Tested-by: Archit Taneja <architt@codeaurora.org>
Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
There are only two functions left in msm_iommu_dev.c. Move it to
msm_iommu.c and delete the file.
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Tested-by: Archit Taneja <architt@codeaurora.org>
Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The driver currently works based on platform data. Remove this
and add support for DT. A single master can have multiple ports
connected to more than one iommu.
master
|
|
|
------------------------
| |
IOMMU0 IOMMU1
| |
ctx0 ctx1 ctx0 ctx1
This association of master and iommus/contexts were previously
represented by platform data parent/child device details. The client
drivers were responsible for programming all of the iommus/contexts
for the device. Now while adapting to generic DT bindings we maintain the
list of iommus, contexts that each master domain is connected to and
program all of them on attach/detach.
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Tested-by: Archit Taneja <architt@codeaurora.org>
Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Add initial support for big endian by always writing the pte
in le32. Note, revisit if hardware capable of doing big endian
fetches.
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Mediatek SoC's M4U has two generations of HW architcture. Generation one
uses flat, one layer pagetable, and was shipped with ARM architecture, it
only supports 4K size page mapping. MT2701 SoC uses this generation one
m4u HW. Generation two uses the ARM short-descriptor translation table
format for address translation, and was shipped with ARM64 architecture,
MT8173 uses this generation two m4u HW. All the two generation iommu HW
only have one iommu domain, and all its iommu clients share the same
iova address.
These two generation m4u HW have slit different register groups and
register offset, but most register names are the same. This patch add iommu
support for mediatek SoC mt2701.
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Move the struct defines of mtk iommu into a new header files for
common use.
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The device_node will be released in of_iommu_configure, it may be double
released if call of_node_put in mtk_iommu_of_xlate.
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
alloc_workqueue replaces deprecated create_workqueue().
A dedicated workqueue has been used since the workitem (viz
&fault->work), is involved in IO page-fault handling.
WQ_MEM_RECLAIM has been set to guarantee forward progress under memory
pressure, which is a requirement here.
Since there are only a fixed number of work items, explicit concurrency
limit is unnecessary.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This seems to be required on some X58 chipsets on systems
with more than one IOMMU. QI does not work until it is
enabled on all IOMMUs in the system.
Reported-by: Dheeraj CVR <cvr.dheeraj@gmail.com>
Tested-by: Dheeraj CVR <cvr.dheeraj@gmail.com>
Fixes: 5f0a7f7614 ('iommu/vt-d: Make root entry visible for hardware right after allocation')
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
On a system with an Intel PCIe port configured as an NTB device, iommu
initialization fails with
DMAR: Device scope type does not match for 0000:80:03.0
This is because the DMAR table reports this device as having scope 2
(ACPI_DMAR_SCOPE_TYPE_BRIDGE):
[0A0h 0160 1] Device Scope Entry Type : 02
[0A1h 0161 1] Entry Length : 08
[0A2h 0162 2] Reserved : 0000
[0A4h 0164 1] Enumeration ID : 00
[0A5h 0165 1] PCI Bus Number : 80
[0A6h 0166 2] PCI Path : 03,00
but the device has a type 0 PCI header:
80:03.0 Bridge [0680]: Intel Corporation Device [8086:2f0d] (rev 02)
00: 86 80 0d 2f 00 00 10 00 02 00 80 06 10 00 80 00
10: 0c 00 c0 00 c0 38 00 00 0c 00 00 00 80 38 00 00
20: 00 00 00 c8 00 00 10 c8 00 00 00 00 86 80 00 00
30: 00 00 00 00 60 00 00 00 00 00 00 00 ff 01 00 00
VT-d works perfectly on this system, so there's no reason to bail out
on initialization due to this apparent scope mismatch. Use the class
0x0680 ("Other bridge device") as a heuristic for allowing DMAR
initialization for non-bridge PCI devices listed with scope bridge.
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>