Commit Graph

1956 Commits

Author SHA1 Message Date
Robin Murphy c0d05cde2a iommu/of: Remove PCI host bridge node check
of_pci_iommu_init() tries to be clever and stop its alias walk at the
device represented by master_np, in case of weird PCI topologies where
the bridge to the IOMMU and the rest of the system is not at the root.
It turns out this is a bit short-sighted, since there are plenty of
other callers of pci_for_each_dma_alias() which would also need the same
behaviour in that situation, and the only platform so far with such a
topology (Cavium ThunderX2) already solves it more generally via a PCI
quirk. As this check is effectively redundant, and returning a boolean
value as an int is a bit broken anyway, let's just get rid of it.

Reported-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Fixes: d87beb7492 ("iommu/of: Handle PCI aliases properly")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-09-22 12:05:43 +02:00
Geert Uytterhoeven 986a5f7017 iommu/qcom: Depend on HAS_DMA to fix compile error
If NO_DMA=y:

    warning: (IPMMU_VMSA && ARM_SMMU && ARM_SMMU_V3 && QCOM_IOMMU) selects IOMMU_IO_PGTABLE_LPAE which has unmet direct dependencies (IOMMU_SUPPORT && HAS_DMA && (ARM || ARM64 || COMPILE_TEST && !GENERIC_ATOMIC64))

and

    drivers/iommu/io-pgtable-arm.o: In function `__arm_lpae_sync_pte':
    io-pgtable-arm.c:(.text+0x206): undefined reference to `bad_dma_ops'
    drivers/iommu/io-pgtable-arm.o: In function `__arm_lpae_free_pages':
    io-pgtable-arm.c:(.text+0x6a6): undefined reference to `bad_dma_ops'
    drivers/iommu/io-pgtable-arm.o: In function `__arm_lpae_alloc_pages':
    io-pgtable-arm.c:(.text+0x812): undefined reference to `bad_dma_ops'
    io-pgtable-arm.c:(.text+0x81c): undefined reference to `bad_dma_ops'
    io-pgtable-arm.c:(.text+0x862): undefined reference to `bad_dma_ops'
    drivers/iommu/io-pgtable-arm.o: In function `arm_lpae_run_tests':
    io-pgtable-arm.c:(.init.text+0x86): undefined reference to `alloc_io_pgtable_ops'
    io-pgtable-arm.c:(.init.text+0x47c): undefined reference to `free_io_pgtable_ops'
    drivers/iommu/qcom_iommu.o: In function `qcom_iommu_init_domain':
    qcom_iommu.c:(.text+0x1ce): undefined reference to `alloc_io_pgtable_ops'
    drivers/iommu/qcom_iommu.o: In function `qcom_iommu_domain_free':
    qcom_iommu.c:(.text+0x754): undefined reference to `free_io_pgtable_ops'

QCOM_IOMMU selects IOMMU_IO_PGTABLE_LPAE, which bypasses its dependency
on HAS_DMA.  Make QCOM_IOMMU depend on HAS_DMA to fix this.

Fixes: 0ae349a0f3 ("iommu/qcom: Add qcom_iommu")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-09-19 15:30:41 +02:00
Arnd Bergmann 3bd71e18c5 iommu/vt-d: Fix harmless section mismatch warning
Building with gcc-4.6 results in this warning due to
dmar_table_print_dmar_entry being inlined as in newer compiler versions:

WARNING: vmlinux.o(.text+0x5c8bee): Section mismatch in reference from the function dmar_walk_remapping_entries() to the function .init.text:dmar_table_print_dmar_entry()
The function dmar_walk_remapping_entries() references
the function __init dmar_table_print_dmar_entry().
This is often because dmar_walk_remapping_entries lacks a __init
annotation or the annotation of dmar_table_print_dmar_entry is wrong.

This removes the __init annotation to avoid the warning. On compilers
that don't show the warning today, this should have no impact since the
function gets inlined anyway.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-09-19 11:44:46 +02:00
Guenter Roeck a4aaeccc7a iommu: Add missing dependencies
parisc:allmodconfig, xtensa:allmodconfig, and possibly others generate
the following Kconfig warning.

warning: (IPMMU_VMSA && ARM_SMMU && ARM_SMMU_V3 && QCOM_IOMMU) selects
IOMMU_IO_PGTABLE_LPAE which has unmet direct dependencies (IOMMU_SUPPORT &&
HAS_DMA && (ARM || ARM64 || COMPILE_TEST && !GENERIC_ATOMIC64))

IOMMU_IO_PGTABLE_LPAE depends on (COMPILE_TEST && !GENERIC_ATOMIC64),
so any configuration option selecting it needs to have the same dependencies.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-09-19 11:42:19 +02:00
Linus Torvalds 4dfc278803 IOMMU Updates for Linux v4.14
Slightly more changes than usual this time:
 
 	- KDump Kernel IOMMU take-over code for AMD IOMMU. The code now
 	  tries to preserve the mappings of the kernel so that master
 	  aborts for devices are avoided. Master aborts cause some
 	  devices to fail in the kdump kernel, so this code makes the
 	  dump more likely to succeed when AMD IOMMU is enabled.
 
 	- Common flush queue implementation for IOVA code users. The
 	  code is still optional, but AMD and Intel IOMMU drivers had
 	  their own implementation which is now unified.
 
 	- Finish support for iommu-groups. All drivers implement this
 	  feature now so that IOMMU core code can rely on it.
 
 	- Finish support for 'struct iommu_device' in iommu drivers. All
 	  drivers now use the interface.
 
 	- New functions in the IOMMU-API for explicit IO/TLB flushing.
 	  This will help to reduce the number of IO/TLB flushes when
 	  IOMMU drivers support this interface.
 
 	- Support for mt2712 in the Mediatek IOMMU driver
 
 	- New IOMMU driver for QCOM hardware
 
 	- System PM support for ARM-SMMU
 
 	- Shutdown method for ARM-SMMU-v3
 
 	- Some constification patches
 
 	- Various other small improvements and fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZtCFNAAoJECvwRC2XARrjZnQP/AxC/ezQpq82HbegF4sM/cVE
 Ep7TeTqodEl75FS/6txe2wU0pwodqk/LB9ajfQZUbE1w8vKsNEqi5qf4ZYHGoxYI
 5bWyjJBzKIlwENH5lsBpQNt6XLevrYmRsFy7F0tRYy+qPQq8k+js2i7/XkCL3q7L
 3xklF847RRoITaTOhhaROx1pF23dSMEsS2XGuWHcZfjORtep4wcFKzd/2SvlCWCo
 P2bRU7jBzfJuuGSA80gaiUbDmrULTUfYuZNp7njASzCgsDmagERtvDEpdoXPNNSp
 u6s4LjU1Dp3fgr6g6cFRO7B6JUbWd619nwo9so/c/wZN54yEngBF9EyeeF3mv2O5
 ZbM2mOW3RlZcjxFT/AC8G4cZwwP6MpCEQOdqknoqc6ZQwcDqwN0o9I4+po0wsiAU
 89ijZZe9Mx0p9lNpihaBEB1erAUWPo5Obh62zo80W3h6x9WzkGQWM+PyFK2DYoaC
 8biEZzcc21sLEHvXQkcEGJSKrihHr9sluOqvxmCw5QAkYIFAeZRoeH7JtZWjVCnr
 T3XvaG1G1Aw6tS7ErxufdKawREAGki0Rm9i1baiH9sqNj5rllM01Y+PgU6E21Nbg
 iZp9gJLjfwM4vhYLlovvQK5PRoOBsCkyCpEI4GJqjLeam5p/WN06CFFf0ifQofYr
 qDPCVDkWHWV8nugFFKE7
 =EVh9
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:
 "Slightly more changes than usual this time:

   - KDump Kernel IOMMU take-over code for AMD IOMMU. The code now tries
     to preserve the mappings of the kernel so that master aborts for
     devices are avoided. Master aborts cause some devices to fail in
     the kdump kernel, so this code makes the dump more likely to
     succeed when AMD IOMMU is enabled.

   - common flush queue implementation for IOVA code users. The code is
     still optional, but AMD and Intel IOMMU drivers had their own
     implementation which is now unified.

   - finish support for iommu-groups. All drivers implement this feature
     now so that IOMMU core code can rely on it.

   - finish support for 'struct iommu_device' in iommu drivers. All
     drivers now use the interface.

   - new functions in the IOMMU-API for explicit IO/TLB flushing. This
     will help to reduce the number of IO/TLB flushes when IOMMU drivers
     support this interface.

   - support for mt2712 in the Mediatek IOMMU driver

   - new IOMMU driver for QCOM hardware

   - system PM support for ARM-SMMU

   - shutdown method for ARM-SMMU-v3

   - some constification patches

   - various other small improvements and fixes"

* tag 'iommu-updates-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (87 commits)
  iommu/vt-d: Don't be too aggressive when clearing one context entry
  iommu: Introduce Interface for IOMMU TLB Flushing
  iommu/s390: Constify iommu_ops
  iommu/vt-d: Avoid calling virt_to_phys() on null pointer
  iommu/vt-d: IOMMU Page Request needs to check if address is canonical.
  arm/tegra: Call bus_set_iommu() after iommu_device_register()
  iommu/exynos: Constify iommu_ops
  iommu/ipmmu-vmsa: Make ipmmu_gather_ops const
  iommu/ipmmu-vmsa: Rereserving a free context before setting up a pagetable
  iommu/amd: Rename a few flush functions
  iommu/amd: Check if domain is NULL in get_domain() and return -EBUSY
  iommu/mediatek: Fix a build warning of BIT(32) in ARM
  iommu/mediatek: Fix a build fail of m4u_type
  iommu: qcom: annotate PM functions as __maybe_unused
  iommu/pamu: Fix PAMU boot crash
  memory: mtk-smi: Degrade SMI init to module_init
  iommu/mediatek: Enlarge the validate PA range for 4GB mode
  iommu/mediatek: Disable iommu clock when system suspend
  iommu/mediatek: Move pgtable allocation into domain_alloc
  iommu/mediatek: Merge 2 M4U HWs into one iommu domain
  ...
2017-09-09 15:03:24 -07:00
Linus Torvalds 0d519f2d1e pci-v4.14-changes
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZsr8cAAoJEFmIoMA60/r8lXYQAKViYIRMJDD4n3NhjMeLOsnJ
 vwaBmWlLRjSFIEpag5kMjS1RJE17qAvmkBZnDvSNZ6cT28INkkZnVM2IW96WECVq
 64MIvDijVPcvqGuWePCfWdDiSXApiDWwJuw55BOhmvV996wGy0gYgzpPY+1g0Knh
 XzH9IOzDL79hZleLfsxX0MLV6FGBVtOsr0jvQ04k4IgEMIxEDTlbw85rnrvzQUtc
 0Vj2koaxWIESZsq7G/wiZb2n6ekaFdXO/VlVvvhmTSDLCBaJ63Hb/gfOhwMuVkS6
 B3cVprNrCT0dSzWmU4ZXf+wpOyDpBexlemW/OR/6CQUkC6AUS6kQ5si1X44dbGmJ
 nBPh414tdlm/6V4h/A3UFPOajSGa/ZWZ/uQZPfvKs1R6WfjUerWVBfUpAzPbgjam
 c/mhJ19HYT1J7vFBfhekBMeY2Px3JgSJ9rNsrFl48ynAALaX5GEwdpo4aqBfscKz
 4/f9fU4ysumopvCEuKD2SsJvsPKd5gMQGGtvAhXM1TxvAoQ5V4cc99qEetAPXXPf
 h2EqWm4ph7YP4a+n/OZBjzluHCmZJn1CntH5+//6wpUk6HnmzsftGELuO9n12cLE
 GGkreI3T9ctV1eOkzVVa0l0QTE1X/VLyEyKCtb9obXsDaG4Ud7uKQoZgB19DwyTJ
 EG76ridTolUFVV+wzJD9
 =9cLP
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:

 - add enhanced Downstream Port Containment support, which prints more
   details about Root Port Programmed I/O errors (Dongdong Liu)

 - add Layerscape ls1088a and ls2088a support (Hou Zhiqiang)

 - add MediaTek MT2712 and MT7622 support (Ryder Lee)

 - add MediaTek MT2712 and MT7622 MSI support (Honghui Zhang)

 - add Qualcom IPQ8074 support (Varadarajan Narayanan)

 - add R-Car r8a7743/5 device tree support (Biju Das)

 - add Rockchip per-lane PHY support for better power management (Shawn
   Lin)

 - fix IRQ mapping for hot-added devices by replacing the
   pci_fixup_irqs() boot-time design with a host bridge hook called at
   probe-time (Lorenzo Pieralisi, Matthew Minter)

 - fix race when enabling two devices that results in upstream bridge
   not being enabled correctly (Srinath Mannam)

 - fix pciehp power fault infinite loop (Keith Busch)

 - fix SHPC bridge MSI hotplug events by enabling bus mastering
   (Aleksandr Bezzubikov)

 - fix a VFIO issue by correcting PCIe capability sizes (Alex
   Williamson)

 - fix an INTD issue on Xilinx and possibly other drivers by unifying
   INTx IRQ domain support (Paul Burton)

 - avoid IOMMU stalls by marking AMD Stoney GPU ATS as broken (Joerg
   Roedel)

 - allow APM X-Gene device assignment to guests by adding an ACS quirk
   (Feng Kan)

 - fix driver crashes by disabling Extended Tags on Broadcom HT2100
   (Extended Tags support is required for PCIe Receivers but not
   Requesters, and we now enable them by default when Requesters support
   them) (Sinan Kaya)

 - fix MSIs for devices that use phantom RIDs for DMA by assuming MSIs
   use the real Requester ID (not a phantom RID) (Robin Murphy)

 - prevent assignment of Intel VMD children to guests (which may be
   supported eventually, but isn't yet) by not associating an IOMMU with
   them (Jon Derrick)

 - fix Intel VMD suspend/resume by releasing IRQs on suspend (Scott
   Bauer)

 - fix a Function-Level Reset issue with Intel 750 NVMe by waiting
   longer (up to 60sec instead of 1sec) for device to become ready
   (Sinan Kaya)

 - fix a Function-Level Reset issue on iProc Stingray by working around
   hardware defects in the CRS implementation (Oza Pawandeep)

 - fix an issue with Intel NVMe P3700 after an iProc reset by adding a
   delay during shutdown (Oza Pawandeep)

 - fix a Microsoft Hyper-V lockdep issue by polling instead of blocking
   in compose_msi_msg() (Stephen Hemminger)

 - fix a wireless LAN driver timeout by clearing DesignWare MSI
   interrupt status after it is handled, not before (Faiz Abbas)

 - fix DesignWare ATU enable checking (Jisheng Zhang)

 - reduce Layerscape dependencies on the bootloader by doing more
   initialization in the driver (Hou Zhiqiang)

 - improve Intel VMD performance allowing allocation of more IRQ vectors
   than present CPUs (Keith Busch)

 - improve endpoint framework support for initial DMA mask, different
   BAR sizes, configurable page sizes, MSI, test driver, etc (Kishon
   Vijay Abraham I, Stan Drozd)

 - rework CRS support to add periodic messages while we poll during
   enumeration and after Function-Level Reset and prepare for possible
   other uses of CRS (Sinan Kaya)

 - clean up Root Port AER handling by removing unnecessary code and
   moving error handler methods to struct pcie_port_service_driver
   (Christoph Hellwig)

 - clean up error handling paths in various drivers (Bjorn Andersson,
   Fabio Estevam, Gustavo A. R. Silva, Harunobu Kurokawa, Jeffy Chen,
   Lorenzo Pieralisi, Sergei Shtylyov)

 - clean up SR-IOV resource handling by disabling VF decoding before
   updating the corresponding resource structs (Gavin Shan)

 - clean up DesignWare-based drivers by unifying quirks to update Class
   Code and Interrupt Pin and related handling of write-protected
   registers (Hou Zhiqiang)

 - clean up by adding empty generic pcibios_align_resource() and
   pcibios_fixup_bus() and removing empty arch-specific implementations
   (Palmer Dabbelt)

 - request exclusive reset control for several drivers to allow cleanup
   elsewhere (Philipp Zabel)

 - constify various structures (Arvind Yadav, Bhumika Goyal)

 - convert from full_name() to %pOF (Rob Herring)

 - remove unused variables from iProc, HiSi, Altera, Keystone (Shawn
   Lin)

* tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (170 commits)
  PCI: xgene: Clean up whitespace
  PCI: xgene: Define XGENE_PCI_EXP_CAP and use generic PCI_EXP_RTCTL offset
  PCI: xgene: Fix platform_get_irq() error handling
  PCI: xilinx-nwl: Fix platform_get_irq() error handling
  PCI: rockchip: Fix platform_get_irq() error handling
  PCI: altera: Fix platform_get_irq() error handling
  PCI: spear13xx: Fix platform_get_irq() error handling
  PCI: artpec6: Fix platform_get_irq() error handling
  PCI: armada8k: Fix platform_get_irq() error handling
  PCI: dra7xx: Fix platform_get_irq() error handling
  PCI: exynos: Fix platform_get_irq() error handling
  PCI: iproc: Clean up whitespace
  PCI: iproc: Rename PCI_EXP_CAP to IPROC_PCI_EXP_CAP
  PCI: iproc: Add 500ms delay during device shutdown
  PCI: Fix typos and whitespace errors
  PCI: Remove unused "res" variable from pci_resource_io()
  PCI: Correct kernel-doc of pci_vpd_srdt_size(), pci_vpd_srdt_tag()
  PCI/AER: Reformat AER register definitions
  iommu/vt-d: Prevent VMD child devices from being remapping targets
  x86/PCI: Use is_vmd() rather than relying on the domain number
  ...
2017-09-08 15:47:43 -07:00
Linus Torvalds b1b6f83ac9 Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm changes from Ingo Molnar:
 "PCID support, 5-level paging support, Secure Memory Encryption support

  The main changes in this cycle are support for three new, complex
  hardware features of x86 CPUs:

   - Add 5-level paging support, which is a new hardware feature on
     upcoming Intel CPUs allowing up to 128 PB of virtual address space
     and 4 PB of physical RAM space - a 512-fold increase over the old
     limits. (Supercomputers of the future forecasting hurricanes on an
     ever warming planet can certainly make good use of more RAM.)

     Many of the necessary changes went upstream in previous cycles,
     v4.14 is the first kernel that can enable 5-level paging.

     This feature is activated via CONFIG_X86_5LEVEL=y - disabled by
     default.

     (By Kirill A. Shutemov)

   - Add 'encrypted memory' support, which is a new hardware feature on
     upcoming AMD CPUs ('Secure Memory Encryption', SME) allowing system
     RAM to be encrypted and decrypted (mostly) transparently by the
     CPU, with a little help from the kernel to transition to/from
     encrypted RAM. Such RAM should be more secure against various
     attacks like RAM access via the memory bus and should make the
     radio signature of memory bus traffic harder to intercept (and
     decrypt) as well.

     This feature is activated via CONFIG_AMD_MEM_ENCRYPT=y - disabled
     by default.

     (By Tom Lendacky)

   - Enable PCID optimized TLB flushing on newer Intel CPUs: PCID is a
     hardware feature that attaches an address space tag to TLB entries
     and thus allows to skip TLB flushing in many cases, even if we
     switch mm's.

     (By Andy Lutomirski)

  All three of these features were in the works for a long time, and
  it's coincidence of the three independent development paths that they
  are all enabled in v4.14 at once"

* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (65 commits)
  x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)
  x86/mm: Use pr_cont() in dump_pagetable()
  x86/mm: Fix SME encryption stack ptr handling
  kvm/x86: Avoid clearing the C-bit in rsvd_bits()
  x86/CPU: Align CR3 defines
  x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages
  acpi, x86/mm: Remove encryption mask from ACPI page protection type
  x86/mm, kexec: Fix memory corruption with SME on successive kexecs
  x86/mm/pkeys: Fix typo in Documentation/x86/protection-keys.txt
  x86/mm/dump_pagetables: Speed up page tables dump for CONFIG_KASAN=y
  x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID
  x86: Enable 5-level paging support via CONFIG_X86_5LEVEL=y
  x86/mm: Allow userspace have mappings above 47-bit
  x86/mm: Prepare to expose larger address space to userspace
  x86/mpx: Do not allow MPX if we have mappings above 47-bit
  x86/mm: Rename tasksize_32bit/64bit to task_size_32bit/64bit()
  x86/xen: Redefine XEN_ELFNOTE_INIT_P2M using PUD_SIZE * PTRS_PER_PUD
  x86/mm/dump_pagetables: Fix printout of p4d level
  x86/mm/dump_pagetables: Generalize address normalization
  x86/boot: Fix memremap() related build failure
  ...
2017-09-04 12:21:28 -07:00
Joerg Roedel 47b59d8e40 Merge branches 'arm/exynos', 'arm/renesas', 'arm/rockchip', 'arm/omap', 'arm/mediatek', 'arm/tegra', 'arm/qcom', 'arm/smmu', 'ppc/pamu', 'x86/vt-d', 'x86/amd', 's390' and 'core' into next 2017-09-01 11:31:42 +02:00
Filippo Sironi 5082219b6a iommu/vt-d: Don't be too aggressive when clearing one context entry
Previously, we were invalidating context cache and IOTLB globally when
clearing one context entry.  This is a tad too aggressive.
Invalidate the context cache and IOTLB for the interested device only.

Signed-off-by: Filippo Sironi <sironi@amazon.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-09-01 11:30:41 +02:00
Jérôme Glisse 30ef7d2c05 iommu/intel: update to new mmu_notifier semantic
Calls to mmu_notifier_invalidate_page() were replaced by calls to
mmu_notifier_invalidate_range() and are now bracketed by calls to
mmu_notifier_invalidate_range_start()/end()

Remove now useless invalidate_page callback.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: iommu@lists.linux-foundation.org
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31 16:12:59 -07:00
Jérôme Glisse f0d1c713d6 iommu/amd: update to new mmu_notifier semantic
Calls to mmu_notifier_invalidate_page() were replaced by calls to
mmu_notifier_invalidate_range() and are now bracketed by calls to
mmu_notifier_invalidate_range_start()/end()

Remove now useless invalidate_page callback.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: iommu@lists.linux-foundation.org
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31 16:12:59 -07:00
Jon Derrick 5823e330b5 iommu/vt-d: Prevent VMD child devices from being remapping targets
VMD child devices must use the VMD endpoint's ID as the requester.  Because
of this, there needs to be a way to link the parent VMD endpoint's IOMMU
group and associated mappings to the VMD child devices such that attaching
and detaching child devices modify the endpoint's mappings, while
preventing early detaching on a singular device removal or unbinding.

The reassignment of individual VMD child devices devices to VMs is outside
the scope of VMD, but may be implemented in the future. For now it is best
to prevent any such attempts.

Prevent VMD child devices from returning an IOMMU, which prevents it from
exposing an iommu_group sysfs directory and allowing subsequent binding by
userspace-access drivers such as VFIO.

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-08-30 16:43:58 -05:00
Joerg Roedel add02cfdc9 iommu: Introduce Interface for IOMMU TLB Flushing
With the current IOMMU-API the hardware TLBs have to be
flushed in every iommu_ops->unmap() call-back.

For unmapping large amounts of address space, like it
happens when a KVM domain with assigned devices is
destroyed, this causes thousands of unnecessary TLB flushes
in the IOMMU hardware because the unmap call-back runs for
every unmapped physical page.

With the TLB Flush Interface and the new iommu_unmap_fast()
function introduced here the need to clean the hardware TLBs
is removed from the unmapping code-path. Users of
iommu_unmap_fast() have to explicitly call the TLB-Flush
functions to sync the page-table changes to the hardware.

Three functions for TLB-Flushes are introduced:

	* iommu_flush_tlb_all() - Flushes all TLB entries
	                          associated with that
				  domain. TLBs entries are
				  flushed when this function
				  returns.

	* iommu_tlb_range_add() - This will add a given
				  range to the flush queue
				  for this domain.

	* iommu_tlb_sync() - Flushes all queued ranges from
			     the hardware TLBs. Returns when
			     the flush is finished.

The semantic of this interface is intentionally similar to
the iommu_gather_ops from the io-pgtable code.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-30 18:07:13 +02:00
Arvind Yadav cceb845195 iommu/s390: Constify iommu_ops
iommu_ops are not supposed to change at runtime.
Functions 'bus_set_iommu' working with const iommu_ops provided
by <linux/iommu.h>. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-30 17:53:29 +02:00
Ashok Raj 11b93ebfa0 iommu/vt-d: Avoid calling virt_to_phys() on null pointer
New kernels with debug show panic() from __phys_addr() checks. Avoid
calling virt_to_phys()  when pasid_state_tbl pointer is null

To: Joerg Roedel <joro@8bytes.org>
To: linux-kernel@vger.kernel.org>
Cc: iommu@lists.linux-foundation.org
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jacob Pan <jacob.jun.pan@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>

Fixes: 2f26e0a9c9 ('iommu/vt-d: Add basic SVM PASID support')
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-30 17:48:18 +02:00
Ashok Raj 9d8c3af316 iommu/vt-d: IOMMU Page Request needs to check if address is canonical.
Page Request from devices that support device-tlb would request translation
to pre-cache them in device to avoid overhead of IOMMU lookups.

IOMMU needs to check for canonicallity of the address before performing
page-fault processing.

To: Joerg Roedel <joro@8bytes.org>
To: linux-kernel@vger.kernel.org>
Cc: iommu@lists.linux-foundation.org
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jacob Pan <jacob.jun.pan@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Reported-by: Sudeep Dutt <sudeep.dutt@intel.com>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-30 17:48:18 +02:00
Joerg Roedel 96302d89a0 arm/tegra: Call bus_set_iommu() after iommu_device_register()
The bus_set_iommu() function will call the add_device()
call-back which needs the iommu to be registered.

Reported-by: Jon Hunter <jonathanh@nvidia.com>
Fixes: 0b480e4470 ('iommu/tegra: Add support for struct iommu_device')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-30 17:28:32 +02:00
Arvind Yadav 0b9a36947c iommu/exynos: Constify iommu_ops
iommu_ops are not supposed to change at runtime.
Functions 'iommu_device_set_ops' and 'bus_set_iommu' working with
const iommu_ops provided by <linux/iommu.h>. So mark the non-const
structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-30 15:13:54 +02:00
Bhumika Goyal 8da4af9586 iommu/ipmmu-vmsa: Make ipmmu_gather_ops const
Make these const as they are not modified anywhere.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-30 15:10:53 +02:00
Oleksandr Tyshchenko a175a67d30 iommu/ipmmu-vmsa: Rereserving a free context before setting up a pagetable
Reserving a free context is both quicker and more likely to fail
(due to limited hardware resources) than setting up a pagetable.
What is more the pagetable init/cleanup code could require
the context to be set up.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Robin Murphy <robin.murphy@arm.com>
CC: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
CC: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-30 15:10:46 +02:00
Joerg Roedel 0688a09990 iommu/amd: Rename a few flush functions
Rename a few iommu cache-flush functions that start with
iommu_ to start with amd_iommu now. This is to prevent name
collisions with generic iommu code later on.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-28 11:31:31 +02:00
Baoquan He ec62b1ab0f iommu/amd: Check if domain is NULL in get_domain() and return -EBUSY
In get_domain(), 'domain' could be NULL before it's passed to dma_ops_domain()
to dereference. And the current code calling get_domain() can't deal with the
returned 'domain' well if its value is NULL.

So before dma_ops_domain() calling, check if 'domain' is NULL, If yes just return
ERR_PTR(-EBUSY) directly.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: df3f7a6e8e ('iommu/amd: Use is_attach_deferred call-back')
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-28 11:28:37 +02:00
Yong Wu 4193998043 iommu/mediatek: Fix a build warning of BIT(32) in ARM
The commit ("iommu/mediatek: Enlarge the validate PA range
for 4GB mode") introduce the following build warning while ARCH=arm:

   drivers/iommu/mtk_iommu.c: In function 'mtk_iommu_iova_to_phys':
   include/linux/bitops.h:6:24: warning: left shift count >= width
of type [-Wshift-count-overflow]
    #define BIT(nr)   (1UL << (nr))
                           ^
>> drivers/iommu/mtk_iommu.c:407:9: note: in expansion of macro 'BIT'
      pa |= BIT(32);

  drivers/iommu/mtk_iommu.c: In function 'mtk_iommu_probe':
   include/linux/bitops.h:6:24: warning: left shift count >= width
of type [-Wshift-count-overflow]
    #define BIT(nr)   (1UL << (nr))
                           ^
   drivers/iommu/mtk_iommu.c:589:35: note: in expansion of macro 'BIT'
     data->enable_4GB = !!(max_pfn > (BIT(32) >> PAGE_SHIFT));

Use BIT_ULL instead of BIT.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-28 11:26:21 +02:00
Yong Wu 4f1c8ea16b iommu/mediatek: Fix a build fail of m4u_type
The commit ("iommu/mediatek: Enlarge the validate PA range
for 4GB mode") introduce the following build error:

   drivers/iommu/mtk_iommu.c: In function 'mtk_iommu_hw_init':
>> drivers/iommu/mtk_iommu.c:536:30: error: 'const struct mtk_iommu_data'
 has no member named 'm4u_type'; did you mean 'm4u_dom'?
     if (data->enable_4GB && data->m4u_type != M4U_MT8173) {

This patch fix it, use "m4u_plat" instead of "m4u_type".

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-28 11:26:21 +02:00
Arnd Bergmann 6ce5b0f22d iommu: qcom: annotate PM functions as __maybe_unused
The qcom_iommu_disable_clocks() function is only called from PM
code that is hidden in an #ifdef, causing a harmless warning without
CONFIG_PM:

drivers/iommu/qcom_iommu.c:601:13: error: 'qcom_iommu_disable_clocks' defined but not used [-Werror=unused-function]
 static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
drivers/iommu/qcom_iommu.c:581:12: error: 'qcom_iommu_enable_clocks' defined but not used [-Werror=unused-function]
 static int qcom_iommu_enable_clocks(struct qcom_iommu_dev *qcom_iommu)

Replacing that #ifdef with __maybe_unused annotations lets the compiler
drop the functions silently instead.

Fixes: 0ae349a0f3 ("iommu/qcom: Add qcom_iommu")
Acked-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-28 11:24:52 +02:00
Ingo Molnar 413d63d71b Merge branch 'linus' into x86/mm to pick up fixes and to fix conflicts
Conflicts:
	arch/x86/kernel/head64.c
	arch/x86/mm/mmap.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-26 09:19:13 +02:00
Joerg Roedel 3ff2dcc058 iommu/pamu: Fix PAMU boot crash
Commit 68a17f0be6 introduced an initialization order
problem, where devices are linked against an iommu which is
not yet initialized. Fix it by initializing the iommu-device
before the iommu-ops are registered against the bus.

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Fixes: 68a17f0be6 ('iommu/pamu: Add support for generic iommu-device')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-23 16:28:09 +02:00
Yong Wu 30e2fccf95 iommu/mediatek: Enlarge the validate PA range for 4GB mode
This patch is for 4GB mode, mainly for 4 issues:
1) Fix a 4GB bug:
   if the dram base is 0x4000_0000, the dram size is 0xc000_0000.
   then the code just meet a corner case because max_pfn is
   0x10_0000.
   data->enable_4GB = !!(max_pfn > (0xffffffffUL >> PAGE_SHIFT));
   It's true at the case above. That is unexpected.
2) In mt2712, there is a new register for the 4GB PA range(0x118)
   we should enlarge the max PA range, or the HW will report
   error.
   The dram range is from 0x1_0000_0000 to 0x1_ffff_ffff in the 4GB
   mode, we cut out the bit[32:30] of the SA(Start address) and
   EA(End address) into this REG_MMU_VLD_PA_RNG(0x118).
3) In mt2712, the register(0x13c) is extended for 4GB mode.
   bit[7:6] indicate the valid PA[32:33]. Thus, we don't mask the
   value and print it directly for debug.
4) if 4GB is enabled, the dram PA range is from 0x1_0000_0000 to
   0x1_ffff_ffff. Thus, the PA from iova_to_pa should also '|' BIT(32)

Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-22 16:38:00 +02:00
Yong Wu 6254b64f57 iommu/mediatek: Disable iommu clock when system suspend
When system suspend, infra power domain may be off, and the iommu's
clock must be disabled when system off, or the iommu's bclk clock maybe
disabled after system resume.

Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-22 16:37:59 +02:00
Yong Wu 4b00f5ac12 iommu/mediatek: Move pgtable allocation into domain_alloc
After adding the global list for M4U HW, We get a chance to
move the pagetable allocation into the mtk_iommu_domain_alloc.
Let the domain_alloc do the right thing.

This patch is for fixing this problem[1].
[1]: https://patchwork.codeaurora.org/patch/53987/

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-22 16:37:59 +02:00
Yong Wu 7c3a2ec028 iommu/mediatek: Merge 2 M4U HWs into one iommu domain
In theory, If there are 2 M4U HWs, there should be 2 IOMMU domains.
But one IOMMU domain(4GB iova range) is enough for us currently,
It's unnecessary to maintain 2 pagetables.

Besides, This patch can simplify our consumer code largely. They don't
need map a iova range from one domain into another, They can share the
iova address easily.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-22 16:37:59 +02:00
Yong Wu e6dec92308 iommu/mediatek: Add mt2712 IOMMU support
The M4U IP blocks in mt2712 is MTK's generation2 M4U which use the
ARM Short-descriptor like mt8173, and most of the HW registers are
the same.

The difference is that there are 2 M4U HWs in mt2712 while there's
only one in mt8173. The purpose of 2 M4U HWs is for balance the
bandwidth.

Normally if there are 2 M4U HWs, there should be 2 iommu domains,
each M4U has a iommu domain.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-22 16:37:58 +02:00
Yong Wu a9467d9542 iommu/mediatek: Move MTK_M4U_TO_LARB/PORT into mtk_iommu.c
The definition of MTK_M4U_TO_LARB and MTK_M4U_TO_PORT are shared by
all the gen2 M4U HWs. Thus, Move them out from mt8173-larb-port.h,
and put them into the c file.

Suggested-by: Honghui Zhang <honghui.zhang@mediatek.com>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-22 16:37:58 +02:00
Magnus Damm 7af9a5fdb9 iommu/ipmmu-vmsa: Use iommu_device_sysfs_add()/remove()
Extend the driver to make use of iommu_device_sysfs_add()/remove()
functions to hook up initial sysfs support.

Suggested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Magnus Damm <damm+renesas@opensource.se>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-22 16:19:00 +02:00
Joerg Roedel 2479c631d1 iommu/amd: Fix section mismatch warning
The variable amd_iommu_pre_enabled is used in non-init
code-paths, so remove the __initdata annotation.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 3ac3e5ee5e ('iommu/amd: Copy old trans table from old kernel')
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-19 10:34:31 +02:00
Joerg Roedel ae162efbf2 iommu/amd: Fix compiler warning in copy_device_table()
This was reported by the kbuild bot. The condition in which
entry would be used uninitialized can not happen, because
when there is no iommu this function would never be called.
But its no fast-path, so fix the warning anyway.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-19 10:33:42 +02:00
Joerg Roedel af6ee6c1c4 Merge branch 'for-joerg/arm-smmu/updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu 2017-08-19 00:42:01 +02:00
Robin Murphy 1464d0b1de iommu: Avoid NULL group dereference
The recently-removed FIXME in iommu_get_domain_for_dev() turns out to
have been a little misleading, since that check is still worthwhile even
when groups *are* universal. We have a few IOMMU-aware drivers which
only care whether their device is already attached to an existing domain
or not, for which the previous behaviour of iommu_get_domain_for_dev()
was ideal, and who now crash if their device does not have an IOMMU.

With IOMMU groups now serving as a reliable indicator of whether a
device has an IOMMU or not (barring false-positives from VFIO no-IOMMU
mode), drivers could arguably do this:

	group = iommu_group_get(dev);
	if (group) {
		domain = iommu_get_domain_for_dev(dev);
		iommu_group_put(group);
	}

However, rather than duplicate that code across multiple callsites,
particularly when it's still only the domain they care about, let's skip
straight to the next step and factor out the check into the common place
it applies - in iommu_get_domain_for_dev() itself. Sure, it ends up
looking rather familiar, but now it's backed by the reasoning of having
a robust API able to do the expected thing for all devices regardless.

Fixes: 05f80300dc ("iommu: Finish making iommu_group support mandatory")
Reported-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-18 11:41:17 +02:00
Joerg Roedel c184ae83c8 iommu/tegra-gart: Add support for struct iommu_device
Add a struct iommu_device to each tegra-gart and register it
with the iommu-core. Also link devices added to the driver
to their respective hardware iommus.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-17 16:31:34 +02:00
Joerg Roedel 0b480e4470 iommu/tegra: Add support for struct iommu_device
Add a struct iommu_device to each tegra-smmu and register it
with the iommu-core. Also link devices added to the driver
to their respective hardware iommus.

Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-17 16:31:34 +02:00
Joerg Roedel fa94cf84af Merge branch 'core' into arm/tegra 2017-08-17 16:31:27 +02:00
Robin Murphy a2d866f7d6 iommu/arm-smmu: Add system PM support
With all our hardware state tracked in such a way that we can naturally
restore it as part of the necessary reset, resuming is trivial, and
there's nothing to do on suspend at all.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-08-16 17:27:29 +01:00
Robin Murphy 90df373cc6 iommu/arm-smmu: Track context bank state
Echoing what we do for Stream Map Entries, maintain a software shadow
state for context bank configuration. With this in place, we are mere
moments away from blissfully easy suspend/resume support.

Reviewed-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: fix sparse warning by only clearing .cfg during domain destruction]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-08-16 17:27:28 +01:00
Nate Watterson 7aa8619a66 iommu/arm-smmu-v3: Implement shutdown method
The shutdown method disables the SMMU to avoid corrupting a new kernel
started with kexec.

Signed-off-by: Nate Watterson <nwatters@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-08-16 17:27:28 +01:00
Joerg Roedel 13cf017446 iommu/vt-d: Make use of iova deferred flushing
Remove the deferred flushing implementation in the Intel
VT-d driver and use the one from the common iova code
instead.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-15 18:23:53 +02:00
Joerg Roedel c8acb28b33 iommu/vt-d: Allow to flush more than 4GB of device TLBs
The shift qi_flush_dev_iotlb() is done on an int, which
limits the mask to 32 bits. Make the mask 64 bits wide so
that more than 4GB of address range can be flushed at once.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-15 18:23:53 +02:00
Joerg Roedel 9003d61863 iommu/amd: Make use of iova queue flushing
Rip out the implementation in the AMD IOMMU driver and use
the one in the common iova code instead.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-15 18:23:52 +02:00
Joerg Roedel 9a005a800a iommu/iova: Add flush timer
Add a timer to flush entries from the Flush-Queues every
10ms. This makes sure that no stale TLB entries remain for
too long after an IOVA has been unmapped.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-15 18:23:52 +02:00
Joerg Roedel 8109c2a2f8 iommu/iova: Add locking to Flush-Queues
The lock is taken from the same CPU most of the time. But
having it allows to flush the queue also from another CPU if
necessary.

This will be used by a timer to regularily flush any pending
IOVAs from the Flush-Queues.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-15 18:23:52 +02:00
Joerg Roedel fb418dab8a iommu/iova: Add flush counters to Flush-Queue implementation
There are two counters:

	* fq_flush_start_cnt  - Increased when a TLB flush
	                        is started.

	* fq_flush_finish_cnt - Increased when a TLB flush
				is finished.

The fq_flush_start_cnt is assigned to every Flush-Queue
entry on its creation. When freeing entries from the
Flush-Queue, the value in the entry is compared to the
fq_flush_finish_cnt. The entry can only be freed when its
value is less than the value of fq_flush_finish_cnt.

The reason for these counters it to take advantage of IOMMU
TLB flushes that happened on other CPUs. These already
flushed the TLB for Flush-Queue entries on other CPUs so
that they can already be freed without flushing the TLB
again.

This makes it less likely that the Flush-Queue is full and
saves IOMMU TLB flushes.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-15 18:23:51 +02:00