52 Commits

Author SHA1 Message Date
Robin Murphy fefe8527a1 iommu/io-pgtable: Remove tlb_flush_leaf
The only user of tlb_flush_leaf is a particularly hairy corner of the
Arm short-descriptor code, which wants a synchronous invalidation to
minimise the races inherent in trying to split a large page mapping.
This is already far enough into "here be dragons" territory that no
sensible caller should ever hit it, and thus it really doesn't need
optimising. Although using tlb_flush_walk there may technically be
more heavyweight than needed, it does the job and saves everyone else
having to carry around useless baggage.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/9844ab0c5cb3da8b2f89c6c2da16941910702b41.1606324115.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-12-08 15:23:37 +00:00
Keqian Zhu f12e0d2290 iommu: Defer the early return in arm_(v7s/lpae)_map
Although handling a mapping request with no permissions is a
trivial no-op, defer the early return until after the size/range
checks so that we are consistent with other mapping requests.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Link: https://lore.kernel.org/r/20201207115758.9400-1-zhukeqian1@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-12-08 13:56:30 +00:00
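The ordering described in f12e0d2290 can be sketched in user-space C as below. This is only an illustration under stated assumptions: `sketch_cfg`, `sketch_map` and the `SKETCH_PROT_*` flags are hypothetical stand-ins, not the kernel's definitions, and the table walk itself is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the IOMMU API protection flags. */
#define SKETCH_PROT_READ	(1u << 0)
#define SKETCH_PROT_WRITE	(1u << 1)

/* Hypothetical, cut-down view of a page-table configuration. */
struct sketch_cfg {
	unsigned int ias;	/* input (IOVA) address size, in bits */
	unsigned int oas;	/* output (physical) address size, in bits */
};

/*
 * Returns -ERANGE for out-of-range addresses and 0 otherwise. The range
 * check runs first, so a request with no permissions is validated exactly
 * like any other request before it is treated as a no-op.
 */
int sketch_map(const struct sketch_cfg *cfg, uint64_t iova, uint64_t paddr,
	       unsigned int prot)
{
	/* Keep the shifts below well defined. */
	assert(cfg->ias < 64 && cfg->oas < 64);

	if (iova >= (UINT64_C(1) << cfg->ias) ||
	    paddr >= (UINT64_C(1) << cfg->oas))
		return -ERANGE;

	/* No access requested: nothing to install, but only return now. */
	if (!(prot & (SKETCH_PROT_READ | SKETCH_PROT_WRITE)))
		return 0;

	/* A real implementation would walk and populate the tables here. */
	return 0;
}
```

With this ordering, a no-permission request for an out-of-range address reports -ERANGE instead of silently succeeding.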
Baolin Wang f34ce7a701 iommu: Add gfp parameter to io_pgtable_ops->map()
Currently the ARM page tables are always allocated with GFP_ATOMIC, but
commit 781ca2de89 ("iommu: Add gfp parameter to iommu_ops::map") added a
gfp_t parameter to iommu_ops->map(). io_pgtable_ops->map() should
therefore use the gfp parameter passed down from iommu_ops->map() to
allocate page tables, which avoids wasting the memory allocator's atomic
pools in non-atomic contexts.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/3093df4cb95497aaf713fca623ce4ecebb197c2e.1591930156.git.baolin.wang@linux.alibaba.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-07-24 14:29:47 +02:00
Robin Murphy fb485eb18e iommu/io-pgtable-arm: Rationalise TCR handling
Although it's conceptually nice for the io_pgtable_cfg to provide a
standard VMSA TCR value, the reality is that no VMSA-compliant IOMMU
looks exactly like an Arm CPU, and they all have various other TCR
controls which io-pgtable can't be expected to understand. Thus since
there is an expectation that drivers will have to add to the given TCR
value anyway, let's strip it down to just the essentials that are
directly relevant to io-pgtable's inner workings - namely the various
sizes and the walk attributes.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: Add missing include of bitfield.h]
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 15:52:24 +00:00
Robin Murphy 7618e47909 iommu/io-pgtable-arm: Improve attribute handling
By VMSA rules, using Normal Non-Cacheable type with a shareability
attribute of anything other than Outer Shareable is liable to lead into
unpredictable territory:

| Overlaying the shareability attribute (B3-1377, ARM DDI 0406C.c)
|
| A memory region with a resultant memory type attribute of Normal, and
| a resultant cacheability attribute of Inner Non-cacheable, Outer
| Non-cacheable, must have a resultant shareability attribute of Outer
| Shareable, otherwise shareability is UNPREDICTABLE

Although the SMMU architectures seem to give some slightly stronger
guarantees of Non-Cacheable output types becoming implicitly Outer
Shareable in most cases, we may as well be explicit and not take any
chances. It's also weird that LPAE attribute handling is currently split
between prot_to_pte() and init_pte() given that it can all be statically
determined up-front. Thus, collect *all* the LPAE attributes into
prot_to_pte() in order to logically pick the shareability based on the
incoming IOMMU API prot value, and tweak the short-descriptor code to
stop setting TTBR0.NOS for Non-Cacheable walks.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 15:52:24 +00:00
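The shareability rule quoted in 7618e47909 can be sketched as a small pure function. This is a minimal illustration, not the kernel's prot_to_pte(): the names are hypothetical, and it assumes that cacheable mappings are marked Inner Shareable while everything else is made explicitly Outer Shareable. The SH[1:0] field position and encodings follow the long-descriptor format.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the IOMMU API "cacheable" prot flag. */
#define SKETCH_PROT_CACHE	(1u << 2)

/* Long-descriptor SH[1:0] field, held in descriptor bits [9:8]. */
#define SKETCH_PTE_SH_OS	(UINT64_C(2) << 8)	/* Outer Shareable */
#define SKETCH_PTE_SH_IS	(UINT64_C(3) << 8)	/* Inner Shareable */

/*
 * Pick the shareability bits from the incoming prot value alone, so the
 * whole attribute set can be computed in one place up-front.
 */
uint64_t sketch_prot_to_sh(unsigned int prot)
{
	if (prot & SKETCH_PROT_CACHE)
		return SKETCH_PTE_SH_IS;

	/* Non-cacheable: be explicit rather than risk UNPREDICTABLE. */
	return SKETCH_PTE_SH_OS;
}
```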
Robin Murphy d1e5f26f14 iommu/io-pgtable-arm: Rationalise TTBRn handling
TTBR1 values have so far been redundant since no users implement any
support for split address spaces. Crucially, though, one of the main
reasons for wanting to do so is to be able to manage each half entirely
independently, e.g. context-switching one set of mappings without
disturbing the other. Thus it seems unlikely that tying two tables
together in a single io_pgtable_cfg would ever be particularly desirable
or useful.

Streamline the configs to just a single conceptual TTBR value
representing the allocated table. This paves the way for future users to
support split address spaces by simply allocating a table and dealing
with the detailed TTBRn logistics themselves.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: Drop change to ttbr value]
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 15:39:23 +00:00
Robin Murphy b5813c164e iommu/io-pgtable: Make selftest gubbins consistently __init
The selftests run as an initcall, but the annotation of the various
callbacks and data seems to be somewhat arbitrary. Add it consistently
for everything related to the selftests.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-11-04 19:34:31 +00:00
Joerg Roedel 4c00889341 Merge branch 'arm/smmu' into arm/mediatek
2019-08-30 16:12:10 +02:00
Yong Wu 4c019de653 iommu/io-pgtable-arm-v7s: Extend to support PA[33:32] for MediaTek
MediaTek extends the Arm v7s descriptor to support physical addresses of
up to 34 bits, where bit 32 and bit 33 of the PA are encoded in bit 9 and
bit 4 of the PTE respectively. The iova is still 32 bits.

The page table address itself can be above 4GB on mt8183 but not on the
earlier mt8173, so that handling is kept as is.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-08-30 15:57:26 +02:00
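The PA[33:32] encoding described in 4c019de653 can be sketched with a pair of conversion helpers. This is a simplified user-space illustration, not the kernel code: the `sketch_*` names are hypothetical, only 4KB small pages are handled, and the MediaTek extension is assumed to be always enabled.

```c
#include <assert.h>
#include <stdint.h>

/* Base-address field of a 4KB small-page descriptor (bits [31:12]). */
#define SKETCH_PAGE_MASK	UINT32_C(0xfffff000)

/* PTE bits that carry PA bit 32 and PA bit 33 respectively. */
#define SKETCH_PTE_PA_BIT32	(UINT32_C(1) << 9)
#define SKETCH_PTE_PA_BIT33	(UINT32_C(1) << 4)

/* Fold a physical address of up to 34 bits into a 32-bit descriptor. */
uint32_t sketch_paddr_to_iopte(uint64_t paddr)
{
	uint32_t pte = (uint32_t)paddr & SKETCH_PAGE_MASK;

	assert(paddr < (UINT64_C(1) << 34));

	if (paddr & (UINT64_C(1) << 32))
		pte |= SKETCH_PTE_PA_BIT32;
	if (paddr & (UINT64_C(1) << 33))
		pte |= SKETCH_PTE_PA_BIT33;
	return pte;
}

/* Recover the page-aligned physical address from a descriptor. */
uint64_t sketch_iopte_to_paddr(uint32_t pte)
{
	uint64_t paddr = pte & SKETCH_PAGE_MASK;

	if (pte & SKETCH_PTE_PA_BIT32)
		paddr |= UINT64_C(1) << 32;
	if (pte & SKETCH_PTE_PA_BIT33)
		paddr |= UINT64_C(1) << 33;
	return paddr;
}
```

Keeping both directions in dedicated helpers means the rest of the page-table code never has to know where the extra address bits live.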
Yong Wu 73d50811bc iommu/io-pgtable-arm-v7s: Rename the quirk from MTK_4GB to MTK_EXT
On the earlier mt2712/mt8173, MediaTek extended the v7s format to support
4GB of DRAM. On the latest mt8183, it is extended further to support PAs
of up to 34 bits, so the "MTK_4GB" name no longer fits. This patch only
renames the quirk to "MTK_EXT".

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-08-30 15:57:26 +02:00
Yong Wu 7f315c9da9 iommu/io-pgtable-arm-v7s: Use ias/oas to check the valid iova/pa
Use ias/oas to check that the iova/pa is valid, in sync with the
equivalent check in io-pgtable-arm.c.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-08-30 15:57:26 +02:00
Yong Wu 5950b9541b iommu/io-pgtable-arm-v7s: Add paddr_to_iopte and iopte_to_paddr helpers
Add two helper functions: paddr_to_iopte and iopte_to_paddr.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-08-30 15:57:26 +02:00
Will Deacon 3951c41af4 iommu/io-pgtable: Pass struct iommu_iotlb_gather to ->tlb_add_page()
With all the pieces in place, we can finally propagate the
iommu_iotlb_gather structure from the call to unmap() down to the IOMMU
drivers' implementation of ->tlb_add_page(). Currently everybody ignores
it, but the machinery is now there to defer invalidation.

Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29 17:22:59 +01:00
Will Deacon a2d3a382d6 iommu/io-pgtable: Pass struct iommu_iotlb_gather to ->unmap()
Update the io-pgtable ->unmap() function to take an iommu_iotlb_gather
pointer as an argument, and update the callers as appropriate.

Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29 17:22:59 +01:00
Will Deacon e953f7f2fa iommu/io-pgtable: Remove unused ->tlb_sync() callback
The ->tlb_sync() callback is no longer used, so it can be removed.

Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29 17:22:58 +01:00
Will Deacon abfd6fe0cd iommu/io-pgtable: Replace ->tlb_add_flush() with ->tlb_add_page()
The ->tlb_add_flush() callback in the io-pgtable API now looks a bit
silly:

  - It takes a size and a granule, which are always the same
  - It takes a 'bool leaf', which is always true
  - It only ever flushes a single page

With that in mind, replace it with an optional ->tlb_add_page() callback
that drops the useless parameters.

Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29 17:22:57 +01:00
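The optional callback introduced by abfd6fe0cd can be pictured with the sketch below. It is a user-space imitation under stated assumptions: `sketch_flush_ops`, `sketch_tlb_add_page` and `sketch_count_page` are hypothetical names, and the return value exists only to make the skip visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical imitation of a flush-ops table. */
struct sketch_flush_ops {
	void (*tlb_flush_all)(void *cookie);
	/* Optional: NULL if the driver does not queue per-page invalidations. */
	void (*tlb_add_page)(unsigned long iova, size_t granule, void *cookie);
};

/* Returns 1 if the callback ran, 0 if it was absent and skipped. */
int sketch_tlb_add_page(const struct sketch_flush_ops *ops,
			unsigned long iova, size_t granule, void *cookie)
{
	assert(ops != NULL);

	if (!ops->tlb_add_page)
		return 0;

	/* One page, one granule: no separate size or 'leaf' argument. */
	ops->tlb_add_page(iova, granule, cookie);
	return 1;
}

/* Example callback: counts queued pages in the caller's cookie. */
void sketch_count_page(unsigned long iova, size_t granule, void *cookie)
{
	(void)iova;
	(void)granule;
	++*(unsigned int *)cookie;
}
```

A driver that has no use for per-page invalidation simply leaves the pointer NULL, and the wrapper turns the call into a no-op.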
Will Deacon 10b7a7d912 iommu/io-pgtable-arm: Call ->tlb_flush_walk() and ->tlb_flush_leaf()
Now that all IOMMU drivers using the io-pgtable API implement the
->tlb_flush_walk() and ->tlb_flush_leaf() callbacks, we can use them in
the io-pgtable code instead of ->tlb_add_flush() immediately followed by
->tlb_sync().

Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29 17:22:57 +01:00
Will Deacon 298f78895b iommu/io-pgtable: Rename iommu_gather_ops to iommu_flush_ops
In preparation for TLB flush gathering in the IOMMU API, rename the
iommu_gather_ops structure in io-pgtable to iommu_flush_ops, which
better describes its purpose and avoids the potential for confusion
between different levels of the API.

$ find linux/ -type f -name '*.[ch]' | xargs sed -i 's/gather_ops/flush_ops/g'

Signed-off-by: Will Deacon <will@kernel.org>
2019-07-24 13:32:33 +01:00
Will Deacon f71da46719 iommu/io-pgtable-arm: Remove redundant call to io_pgtable_tlb_sync()
Commit b6b65ca20b ("iommu/io-pgtable-arm: Add support for non-strict
mode") added an unconditional call to io_pgtable_tlb_sync() immediately
after the case where we replace a block entry with a table entry during
an unmap() call. This is redundant, since the IOMMU API will call
iommu_tlb_sync() on this path and the patch in question mentions this:

 | To save having to reason about it too much, make sure the invalidation
 | in arm_lpae_split_blk_unmap() just performs its own unconditional sync
 | to minimise the window in which we're technically violating the break-
 | before-make requirement on a live mapping. This might work out redundant
 | with an outer-level sync for strict unmaps, but we'll never be splitting
 | blocks on a DMA fastpath anyway.

However, this sync gets in the way of deferred TLB invalidation for leaf
entries and is at best a questionable, unproven hack. Remove it.

Signed-off-by: Will Deacon <will@kernel.org>
2019-07-24 13:32:33 +01:00
Joerg Roedel 39debdc1d7 Merge branch 'for-joerg/arm-smmu/updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu
2019-07-01 13:44:41 +02:00
Bjorn Andersson 9e6ea59f3f iommu/io-pgtable: Support non-coherent page tables
Describe the memory related to page table walks as non-cacheable for
iommu instances that are not DMA coherent.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
[will: Use cfg->coherent_walk, fix arm-v7s, ensure outer-shareable for NC]
Signed-off-by: Will Deacon <will@kernel.org>
2019-06-25 13:26:47 +01:00
Will Deacon 4f41845b34 iommu/io-pgtable: Replace IO_PGTABLE_QUIRK_NO_DMA with specific flag
IO_PGTABLE_QUIRK_NO_DMA is a bit of a misnomer, since it's really just
an indication of whether or not the page-table walker for the IOMMU is
coherent with the CPU caches. Since cache coherency is more than just a
quirk, replace the flag with its own field in the io_pgtable_cfg
structure.

Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2019-06-25 12:51:25 +01:00
Thomas Gleixner caab277b1d treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not see http www gnu org
  licenses

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 503 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Enrico Weigelt <info@metux.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:07 +02:00
Nicolas Boichat 0a352554da iommu/io-pgtable-arm-v7s: request DMA32 memory, and improve debugging
IOMMUs using ARMv7 short-descriptor format require page tables (level 1
and 2) to be allocated within the first 4GB of RAM, even on 64-bit
systems.

For level 1/2 pages, ensure GFP_DMA32 is used if CONFIG_ZONE_DMA32 is
defined (e.g.  on arm64 platforms).

For level 2 pages, allocate a slab cache in SLAB_CACHE_DMA32.  Note that
we do not explicitly pass GFP_DMA[32] to kmem_cache_zalloc, as this is
not strictly necessary, and would cause a warning in mm/sl*b.c, as we
did not update GFP_SLAB_BUG_MASK.

Also, print an error when the physical address does not fit in
32-bit, to make debugging easier in the future.

Link: http://lkml.kernel.org/r/20181210011504.122604-3-drinkcat@chromium.org
Fixes: ad67f5a654 ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Huaisheng Ye <yehs1@lenovo.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Sasha Levin <Alexander.Levin@microsoft.com>
Cc: Tomasz Figa <tfiga@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Yong Wu <yong.wu@mediatek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 10:01:37 -07:00
Nicolas Boichat 032ebd8548 iommu/io-pgtable-arm-v7s: Only kmemleak_ignore L2 tables
L1 tables are allocated with __get_dma_pages, and therefore already
ignored by kmemleak.

Without this, the kernel would print this error message on boot,
when the first L1 table is allocated:

[    2.810533] kmemleak: Trying to color unknown object at 0xffffffd652388000 as Black
[    2.818190] CPU: 5 PID: 39 Comm: kworker/5:0 Tainted: G S                4.19.16 #8
[    2.831227] Workqueue: events deferred_probe_work_func
[    2.836353] Call trace:
...
[    2.852532]  paint_ptr+0xa0/0xa8
[    2.855750]  kmemleak_ignore+0x38/0x6c
[    2.859490]  __arm_v7s_alloc_table+0x168/0x1f4
[    2.863922]  arm_v7s_alloc_pgtable+0x114/0x17c
[    2.868354]  alloc_io_pgtable_ops+0x3c/0x78
...

Fixes: e5fc9753b1 ("iommu/io-pgtable: Add ARMv7 short descriptor support")
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-26 10:16:06 +01:00
Rob Herring b77cf11f09 iommu: Allow io-pgtable to be used outside of drivers/iommu/
Move io-pgtable.h to include/linux/ and export alloc_io_pgtable_ops
and free_io_pgtable_ops. This enables drivers outside drivers/iommu/ to
use the page table library. Specifically, some ARM Mali GPUs use the
ARM page table formats.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: iommu@lists.linux-foundation.org
Cc: linux-mediatek@lists.infradead.org
Cc: linux-arm-msm@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-11 11:26:48 +01:00
Yong Wu 2713fe3715 Revert "iommu/io-pgtable-arm: Check for v7s-incapable systems"
This reverts commit 82db33dc5e.

After commit 29859aeb8a ("iommu/io-pgtable-arm-v7s: Abort
allocation when table address overflows the PTE"), v7s returns failure
if the allocated page table address does not fit in the PTE, so this
PHYS_OFFSET check is no longer necessary.

The check can also fail spuriously. For example, if CONFIG_RANDOMIZE_BASE
is enabled, "memstart_addr" is updated randomly, so PHYS_OFFSET may be
random too.

Reported-by: CK Hu <ck.hu@mediatek.com>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-12-17 10:19:10 +01:00
Robin Murphy b2dfeba654 iommu/io-pgtable-arm-v7s: Add support for non-strict mode
As for LPAE, it's simply a case of skipping the leaf invalidation for a
regular unmap, and ensuring that the one in split_blk_unmap() is paired
with an explicit sync ASAP rather than relying on one which might only
eventually happen way down the line.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-10-01 13:01:34 +01:00
Jean-Philippe Brucker 29859aeb8a iommu/io-pgtable-arm-v7s: Abort allocation when table address overflows the PTE
When run on a 64-bit system in selftest, the v7s driver may obtain page
tables with physical addresses larger than 32 bits. Level-2 tables are 1KB
and are allocated with slab, which doesn't accept the GFP_DMA32
flag. Currently map() truncates the address written in the PTE, causing
iova_to_phys() or unmap() to access invalid memory. Kasan reports it as
a use-after-free. To avoid any nasty surprise, test if the physical
address fits in a PTE before returning a new table. 32-bit systems,
which are the main users of this page table format, shouldn't see any
difference.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-26 11:34:58 +01:00
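The check described in 29859aeb8a amounts to a round trip through the descriptor type. A minimal sketch, assuming a 32-bit descriptor and using hypothetical names (`sketch_iopte`, `sketch_table_addr_fits`):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Short-descriptor entries are 32 bits wide. */
typedef uint32_t sketch_iopte;

/*
 * True if a table's physical address survives being stored in a PTE.
 * An address above 4GB loses its upper bits in the cast and no longer
 * compares equal, so the allocation should be abandoned instead of
 * being installed truncated.
 */
bool sketch_table_addr_fits(uint64_t phys)
{
	return phys == (uint64_t)(sketch_iopte)phys;
}
```

An allocator using this would free the new table and report failure whenever the function returns false.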
YueHaibing f793b13ef0 iommu/io-pgtable-arm: Use for_each_set_bit to simplify code
We can use for_each_set_bit() to simplify code slightly in the
ARM io-pgtable self tests while unmapping.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-05-03 15:31:07 +02:00
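The kind of loop that for_each_set_bit() replaces in f793b13ef0 can be sketched in portable C as below. This is not the kernel macro: `sketch_set_bits` is a hypothetical helper that visits the set bits of a bitmap, lowest first, which is how a selftest might walk a page-size bitmap.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Store the index of every set bit in 'bitmap' into 'out', lowest bit
 * first, writing at most 'max' entries. Returns the number stored.
 */
size_t sketch_set_bits(unsigned long bitmap, unsigned int *out, size_t max)
{
	size_t n = 0;
	unsigned int bit;

	assert(out != NULL);

	for (bit = 0; bit < sizeof(bitmap) * CHAR_BIT && n < max; bit++) {
		if (bitmap & (1UL << bit))
			out[n++] = bit;
	}
	return n;
}
```

For a bitmap covering 4KB, 64KB, 1MB and 16MB pages, the helper yields the shift values 12, 16, 20 and 24, one per supported page size.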
Vivek Gautam 193e67c00e iommu/io-pgtable: Use size_t return type for all foo_unmap
Unmap returns a size_t all throughout the IOMMU framework.
Make io-pgtable match this convention.
Moreover, there isn't a need to have a signed int return type
as we return 0 in case of failures.

Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-13 19:31:32 +01:00
Joerg Roedel a593472591 Merge branches 'iommu/fixes', 'arm/omap', 'arm/exynos', 'x86/amd', 'x86/vt-d' and 'core' into next
2017-10-13 17:32:24 +02:00
Robin Murphy 4d689b6194 iommu/io-pgtable-arm-v7s: Convert to IOMMU API TLB sync
Now that the core API issues its own post-unmap TLB sync call, push that
operation out from the io-pgtable-arm-v7s internals into the users. For
now, we leave the invalidation implicit in the unmap operation, since
none of the current users would benefit much from any change to that.

Note that the conversion of msm_iommu is implicit, since that apparently
has no specific TLB sync operation anyway.

CC: Yong Wu <yong.wu@mediatek.com>
CC: Rob Clark <robdclark@gmail.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-10-02 15:45:25 +02:00
Yong Wu 5c62c1c679 iommu/io-pgtable-arm-v7s: Need dma-sync while there is no QUIRK_NO_DMA
Fix commit 81b3c25218 ("iommu/io-pgtable: Introduce explicit
coherency"). If IO_PGTABLE_QUIRK_NO_DMA is not set, we should call
dma_sync_single_for_device for cache synchronization.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Fixes: 81b3c25218 ('iommu/io-pgtable: Introduce explicit coherency')
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-09-27 16:56:17 +02:00
Robin Murphy 7655739143 iommu/io-pgtable: Sanitise map/unmap addresses
It may be an egregious error to attempt to use addresses outside the
range of the pagetable format, but that still doesn't mean we should
merrily wreak havoc by silently mapping/unmapping whatever truncated
portions of them might happen to correspond to real addresses.

Add some up-front checks to sanitise our inputs so that buggy callers
don't invite potential memory corruption.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-07-20 10:30:28 +01:00
Will Deacon 77f3445866 iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table
When writing a new table entry, we must ensure that the contents of the
table is made visible to the SMMU page table walker before the updated
table entry itself.

This is currently achieved using wmb(), which expands to an expensive and
unnecessary DSB instruction. Ideally, we'd just use cmpxchg64_release when
writing the table entry, but this doesn't have memory ordering semantics
on !SMP systems.

Instead, use dma_wmb(), which emits DMB OSHST. Strictly speaking, this
does more than we require (since it targets the outer-shareable domain),
but it's likely to be significantly faster than the DSB approach.

Reported-by: Linu Cherian <linu.cherian@cavium.com>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-23 17:58:02 +01:00
Robin Murphy 119ff305b0 iommu/io-pgtable-arm-v7s: Support lockless operation
Mirroring the LPAE implementation, rework the v7s code to be robust
against concurrent operations. The same two potential races exist, and
are solved in the same manner, with the fixed 2-level structure making
life ever so slightly simpler.

What complicates matters compared to LPAE, however, is large page
entries, since we can't update a block of 16 PTEs atomically, nor assume
available software bits to do clever things with. As most users are
never likely to do partial unmaps anyway (due to DMA API rules), it
doesn't seem unreasonable for this case to remain behind a serialising
lock; we just pull said lock down into the bowels of the implementation
so it's well out of the way of the normal call paths.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-23 17:58:00 +01:00
Robin Murphy 81b3c25218 iommu/io-pgtable: Introduce explicit coherency
Once we remove the serialising spinlock, a potential race opens up for
non-coherent IOMMUs whereby a caller of .map() can be sure that cache
maintenance has been performed on their new PTE, but will have no
guarantee that such maintenance for table entries above it has actually
completed (e.g. if another CPU took an interrupt immediately after
writing the table entry, but before initiating the DMA sync).

Handling this race safely will add some potentially non-trivial overhead
to installing a table entry, which we would much rather avoid on
coherent systems where it will be unnecessary, and where we are stirivng
to minimise latency by removing the locking in the first place.

To that end, let's introduce an explicit notion of cache-coherency to
io-pgtable, such that we will be able to avoid penalising IOMMUs which
know enough to know when they are coherent.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-23 17:58:00 +01:00
Robin Murphy b9f1ef30ac iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap
Whilst the short-descriptor format's split_blk_unmap implementation has
no need to be recursive, it followed the pattern of the LPAE version
anyway for the sake of consistency. With the latter now reworked for
both efficiency and future scalability improvements, tweak the former
similarly, not least to make it less obtuse.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-23 17:57:59 +01:00
Robin Murphy 9db829d281 iommu/io-pgtable-arm-v7s: Check table PTEs more precisely
Whilst we don't support the PXN bit at all, so should never encounter a
level 1 section or supersection PTE with it set, it would still be wise
to check both table type bits to resolve any theoretical ambiguity.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-23 17:57:58 +01:00
Arvind Yadav 60ab7a75c8 iommu/io-pgtable-arm-v7s: constify dummy_tlb_ops.
File size before:
   text	   data	    bss	    dec	    hex	filename
   6146	     56	      9	   6211	   1843	drivers/iommu/io-pgtable-arm-v7s.o

File size After adding 'const':
   text	   data	    bss	    dec	    hex	filename
   6170	     24	      9	   6203	   183b	drivers/iommu/io-pgtable-arm-v7s.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-23 17:57:57 +01:00
Oleksandr Tyshchenko a03849e721 iommu/io-pgtable-arm-v7s: Check for leaf entry before dereferencing it
Check for an already-installed leaf entry at the current level before
dereferencing it, to avoid walking down the page table with a wrong
pointer to the next level.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-03-10 18:23:34 +00:00
Robin Murphy 5baf1e9d0b iommu/io-pgtable-arm-v7s: Add support for the IOMMU_PRIV flag
The short-descriptor format also allows privileged-only mappings, so
let's wire it up.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-01-19 15:56:18 +00:00
Kefeng Wang 4ae8a5c528 iommu/io-pgtable-arm: Use for_each_set_bit to simplify the code
We can use for_each_set_bit() to simplify the code slightly in the
ARM io-pgtable self tests.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29 15:57:40 +00:00
Robin Murphy 82db33dc5e iommu/io-pgtable-arm: Check for v7s-incapable systems
On machines with no 32-bit addressable RAM whatsoever, we shouldn't
even touch the v7s format as it's never going to work.

Fixes: e5fc9753b1 ("iommu/io-pgtable: Add ARMv7 short descriptor support")
Reported-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-16 09:34:23 +01:00
Robin Murphy e633fc7a13 iommu/io-pgtable-arm-v7s: Fix attributes when splitting blocks
Due to the attribute bits being all over the place in the different
types of short-descriptor PTEs, when remapping an existing entry, e.g.
splitting a section into pages, we take the approach of decomposing
the PTE attributes back to the IOMMU API flags to start from scratch.

On inspection, though, the existing code seems to have got the read-only
bit backwards and ignored the XN bit. How embarrassing...

Fortunately the primary user so far, the Mediatek IOMMU, both never
splits blocks (because it only serves non-overlapping DMA API calls) and
also ignores permissions anyway, but let's put things right before any
future users trip up.

Cc: <stable@vger.kernel.org>
Fixes: e5fc9753b1 ("iommu/io-pgtable: Add ARMv7 short descriptor support")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-19 09:40:16 +01:00
Robin Murphy e88ccab12a iommu/io-pgtable-arm-v7s: Support IOMMU_MMIO flag
Teach the short-descriptor format to create Device mappings when asked.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-04-07 15:07:50 +02:00
Yong Wu 1afe23194d iommu/io-pgtable: Add MTK 4GB mode in Short-descriptor
In MT8173, the first 1GB of the PA space is normally used for HW SRAM and
registers, so the PA is 33 bits wide if the DRAM size is 4GB. There is a
"DRAM 4GB mode" toggle bit for this. If it is enabled, then from the
CPU's point of view the DRAM PA range is 0x1_00000000~0x1_ffffffff.

In the short-descriptor format, a page table descriptor is always 32 bits.
MediaTek uses bit 9 in the lvl1 and lvl2 page table descriptors for the
4GB mode.

In 4GB mode, bit 9 must be set, and the M4U then adds 0x1_00000000 to
the PA in the page table. Thus the M4U output address to EMI is always
33 bits (the input address is still 32 bits).

We add a special quirk for this MTK-4GB mode. In the standard spec, bit 9
in lvl1 is "IMPLEMENTATION DEFINED", while in lvl2 it is AP[2];
therefore if this quirk is enabled, NO_PERMS is also expected.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-04-05 15:39:37 +02:00
Robin Murphy 048b31ca97 iommu/io-pgtable-armv7s: Fix kmem_cache_alloc() flags
Whilst the default SLUB allocator happily just merges the original
allocation flags from kmem_cache_create() with those passed through
kmem_cache_alloc(), there is a code path in the SLAB allocator which
will aggressively BUG_ON() if the cache was created with SLAB_CACHE_DMA
but GFP_DMA is not specified for an allocation:

  kernel BUG at mm/slab.c:2536!
  Internal error: Oops - BUG: 0 [#1] SMP ARM
  Modules linked in:[    1.299311] Modules linked in:

  CPU: 1 PID: 1 Comm: swapper/0 Not tainted
  4.5.0-rc6-koelsch-05892-ge7e45ad53ab6795e #2270
  Hardware name: Generic R8A7791 (Flattened Device Tree)
  task: ef422040 ti: ef442000 task.ti: ef442000
  PC is at cache_alloc_refill+0x2a0/0x530
  LR is at _raw_spin_unlock+0x8/0xc
...
  [<c02c6928>] (cache_alloc_refill) from [<c02c6630>] (kmem_cache_alloc+0x7c/0xd4)
  [<c02c6630>] (kmem_cache_alloc) from [<c04444bc>]
  (__arm_v7s_alloc_table+0x5c/0x278)
  [<c04444bc>] (__arm_v7s_alloc_table) from [<c0444e1c>]
  (__arm_v7s_map.constprop.6+0x68/0x25c)
  [<c0444e1c>] (__arm_v7s_map.constprop.6) from [<c0445044>]
  (arm_v7s_map+0x34/0xa4)
  [<c0445044>] (arm_v7s_map) from [<c0c18ee4>] (arm_v7s_do_selftests+0x140/0x418)
  [<c0c18ee4>] (arm_v7s_do_selftests) from [<c0201760>]
  (do_one_initcall+0x100/0x1b4)
  [<c0201760>] (do_one_initcall) from [<c0c00d4c>]
  (kernel_init_freeable+0x120/0x1e8)
  [<c0c00d4c>] (kernel_init_freeable) from [<c067a364>] (kernel_init+0x8/0xec)
  [<c067a364>] (kernel_init) from [<c0206b68>] (ret_from_fork+0x14/0x2c)
  Code: 1a000003 e7f001f2 e3130001 0a000000 (e7f001f2)
  ---[ end trace 190f6f6b84352efd ]---

Keep the peace by adding GFP_DMA when allocating a table.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-03-02 14:48:50 +01:00
Robin Murphy 3850db49da iommu/io-pgtable: Rationalise quirk handling
As the number of io-pgtable implementations grows beyond 1, it's time
to rationalise the quirks mechanism before things have a chance to
start getting really ugly and out-of-hand.

To that end:
- Indicate exactly which quirks each format can/does support.
- Fail creating a table if a caller wants unsupported quirks.
- Properly document where each quirk applies and why.

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-02-17 14:15:09 +00:00