linux

Commit Graph

Author	SHA1	Message	Date
Joerg Roedel	2926a2aa5c	iommu: Fix wrong freeing of iommu_device->dev The struct iommu_device has a 'struct device' embedded into it, not as a pointer, but the whole struct. In the conversion of the iommu drivers to use struct iommu_device it was forgotten that the relase function for that struct device simply calls kfree() on the pointer. This frees memory that was never allocated and causes memory corruption. To fix this issue, use a pointer to struct device instead of embedding the whole struct. This needs some updates in the iommu sysfs code as well as the Intel VT-d and AMD IOMMU driver. Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Fixes: `39ab9555c2` ('iommu: Add sysfs bindings for struct iommu_device') Cc: stable@vger.kernel.org # >= v4.11 Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 13:58:48 +02:00
Linus Torvalds	fb4e3beeff	IOMMU Updates for Linux v4.13 This update comes with: * Support for lockless operation in the ARM io-pgtable code. This is an important step to solve the scalability problems in the common dma-iommu code for ARM * Some Errata workarounds for ARM SMMU implemenations * Rewrite of the deferred IO/TLB flush code in the AMD IOMMU driver. The code suffered from very high flush rates, with the new implementation the flush rate is down to ~1% of what it was before * Support for amd_iommu=off when booting with kexec. Problem here was that the IOMMU driver bailed out early without disabling the iommu hardware, if it was enabled in the old kernel * The Rockchip IOMMU driver is now available on ARM64 * Align the return value of the iommu_ops->device_group call-backs to not miss error values * Preempt-disable optimizations in the Intel VT-d and common IOVA code to help Linux-RT * Various other small cleanups and fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJZZgddAAoJECvwRC2XARrjurgQANO338GIBr2ZkA0oectidDpZ Y4yu7W9RH6NyhupJG/Xooya7daBWFjbaA1AVJ3ZZNlMERh69AmehVfRUfVMzF2w+ buma58HQgiJWN1zFD8xdeMzYKms9P77whA88C/9QvrK/klB3LipWP2SC0yvvvyxJ mMCDpgt+D+CGnIDqbRuyLDQoRu3yjAkAvYb6OzL8DPJVP1Y5oLffGwGnHzJbJnOf eWJwYHM5ai0uF/Qqy6RNNekacObjVaOLihjugGvokH6ipXfOrSSNriXW9pZiWR5m S91898YTP3KuWWsJM+N93UAjvc6pL9PqL/OvbB9zdYpzu+5PtUpFXHYcOebKyEEO 4j9CaRzubsWFTFjbYItJnR4WgXQRf4NKOGfTfHMHA+dY8aODYnlXNVdQDAA2aFgn TUBvHq5xb0zZ3nbPwtTDyW06oDMVfBBarLx2yFI1aQSSh+eg/GtIi5KP28gyFZNz 4gWj0q3g/e3y7WEwNbYV7L3TS0d/p8VUYFtUp7PUCddnWoY+4cJzgidub5xIViZD Ql0nZzga9pXXIE/kE5Pf74WqrG7JJzZsvK2ABy4+XGrMq6RclJf+0pXbSqiXDpXL quw8t0oXw0ZEeavQ31Za8mjXBvo5ocM5iintl1wrl2BujHEO3oKqbGsIOaRcLnlN Ukehbl4OEKzZpD3oLPPk =pmBf -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "This update comes with: - Support for lockless operation in the ARM io-pgtable code. This is an important step to solve the scalability problems in the common dma-iommu code for ARM - Some Errata workarounds for ARM SMMU implemenations - Rewrite of the deferred IO/TLB flush code in the AMD IOMMU driver. The code suffered from very high flush rates, with the new implementation the flush rate is down to ~1% of what it was before - Support for amd_iommu=off when booting with kexec. The problem here was that the IOMMU driver bailed out early without disabling the iommu hardware, if it was enabled in the old kernel - The Rockchip IOMMU driver is now available on ARM64 - Align the return value of the iommu_ops->device_group call-backs to not miss error values - Preempt-disable optimizations in the Intel VT-d and common IOVA code to help Linux-RT - Various other small cleanups and fixes" * tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (60 commits) iommu/vt-d: Constify intel_dma_ops iommu: Warn once when device_group callback returns NULL iommu/omap: Return ERR_PTR in device_group call-back iommu: Return ERR_PTR() values from device_group call-backs iommu/s390: Use iommu_group_get_for_dev() in s390_iommu_add_device() iommu/vt-d: Don't disable preemption while accessing deferred_flush() iommu/iova: Don't disable preempt around this_cpu_ptr() iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126 iommu/arm-smmu-v3: Enable ACPI based HiSilicon CMD_PREFETCH quirk(erratum 161010701) iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74 ACPI/IORT: Fixup SMMUv3 resource size for Cavium ThunderX2 SMMUv3 model iommu/arm-smmu-v3, acpi: Add temporary Cavium SMMU-V3 IORT model number definitions iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table iommu/io-pgtable: depend on !GENERIC_ATOMIC64 when using COMPILE_TEST with LPAE iommu/arm-smmu-v3: Remove io-pgtable spinlock iommu/arm-smmu: Remove io-pgtable spinlock iommu/io-pgtable-arm-v7s: Support lockless operation iommu/io-pgtable-arm: Support lockless operation iommu/io-pgtable: Introduce explicit coherency iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap ...	2017-07-12 10:00:04 -07:00
Linus Torvalds	f72e24a124	This is the first pull request for the new dma-mapping subsystem In this new subsystem we'll try to properly maintain all the generic code related to dma-mapping, and will further consolidate arch code into common helpers. This pull request contains: - removal of the DMA_ERROR_CODE macro, replacing it with calls to ->mapping_error so that the dma_map_ops instances are more self contained and can be shared across architectures (me) - removal of the ->set_dma_mask method, which duplicates the ->dma_capable one in terms of functionality, but requires more duplicate code. - various updates for the coherent dma pool and related arm code (Vladimir) - various smaller cleanups (me) -----BEGIN PGP SIGNATURE----- iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlldmw0LHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYOiKA/+Ln1mFLSf3nfTzIHa24Bbk8ZTGr0B8TD4Vmyyt8iG oO3AeaTLn3d6ugbH/uih/tPz8PuyXsdiTC1rI/ejDMiwMTSjW6phSiIHGcStSR9X VFNhmMFacp7QpUpvxceV0XZYKDViAoQgHeGdp3l+K5h/v4AYePV/v/5RjQPaEyOh YLbCzETO+24mRWdJxdAqtTW4ovYhzj6XsiJ+pAjlV0+SWU6m5L5E+VAPNi1vqv1H 1O2KeCFvVYEpcnfL3qnkw2timcjmfCfeFAd9mCUAc8mSRBfs3QgDTKw3XdHdtRml LU2WuA5cpMrOdBO4mVra2plo8E2szvpB1OZZXoKKdCpK3VGwVpVHcTvClK2Ks/3B GDLieroEQNu2ZIUIdWXf/g2x6le3BcC9MmpkAhnGPqCZ7skaIBO5Cjpxm0zTJAPl PPY3CMBBEktAvys6DcudOYGixNjKUuAm5lnfpcfTEklFdG0AjhdK/jZOplAFA6w4 LCiy0rGHM8ZbVAaFxbYoFCqgcjnv6EjSiqkJxVI4fu/Q7v9YXfdPnEmE0PJwCVo5 +i7aCLgrYshTdHr/F3e5EuofHN3TDHwXNJKGh/x97t+6tt326QMvDKX059Kxst7R rFukGbrYvG8Y7yXwrSDbusl443ta0Ht7T1oL4YUoJTZp0nScAyEluDTmrH1JVCsT R4o= =0Fso -----END PGP SIGNATURE----- Merge tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping infrastructure from Christoph Hellwig: "This is the first pull request for the new dma-mapping subsystem In this new subsystem we'll try to properly maintain all the generic code related to dma-mapping, and will further consolidate arch code into common helpers. This pull request contains: - removal of the DMA_ERROR_CODE macro, replacing it with calls to ->mapping_error so that the dma_map_ops instances are more self contained and can be shared across architectures (me) - removal of the ->set_dma_mask method, which duplicates the ->dma_capable one in terms of functionality, but requires more duplicate code. - various updates for the coherent dma pool and related arm code (Vladimir) - various smaller cleanups (me)" * tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mapping: (56 commits) ARM: dma-mapping: Remove traces of NOMMU code ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus ARM: NOMMU: Introduce dma operations for noMMU drivers: dma-mapping: allow dma_common_mmap() for NOMMU drivers: dma-coherent: Introduce default DMA pool drivers: dma-coherent: Account dma_pfn_offset when used with device tree dma: Take into account dma_pfn_offset dma-mapping: replace dmam_alloc_noncoherent with dmam_alloc_attrs dma-mapping: remove dmam_free_noncoherent crypto: qat - avoid an uninitialized variable warning au1100fb: remove a bogus dma_free_nonconsistent call MAINTAINERS: add entry for dma mapping helpers powerpc: merge __dma_set_mask into dma_set_mask dma-mapping: remove the set_dma_mask method powerpc/cell: use the dma_supported method for ops switching powerpc/cell: clean up fixed mapping dma_ops initialization tile: remove dma_supported and mapping_error methods xen-swiotlb: remove xen_swiotlb_set_dma_mask arm: implement ->dma_supported instead of ->set_dma_mask mips/loongson64: implement ->dma_supported instead of ->set_dma_mask ...	2017-07-06 19:20:54 -07:00
Christoph Hellwig	5860acc1a9	x86: remove arch specific dma_supported implementation And instead wire it up as method for all the dma_map_ops instances. Note that this also means the arch specific check will be fully instead of partially applied in the AMD iommu driver. Signed-off-by: Christoph Hellwig <hch@lst.de>	2017-06-28 06:54:46 -07:00
Joerg Roedel	6a7086431f	Merge branches 'iommu/fixes', 'arm/rockchip', 'arm/renesas', 'arm/smmu', 'arm/core', 'x86/vt-d', 'x86/amd', 's390' and 'core' into next	2017-06-28 14:45:02 +02:00
Arvind Yadav	01e1932a17	iommu/vt-d: Constify intel_dma_ops Most dma_map_ops structures are never modified. Constify these structures such that these can be write-protected. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 14:17:12 +02:00
Sebastian Andrzej Siewior	58c4a95f90	iommu/vt-d: Don't disable preemption while accessing deferred_flush() get_cpu() disables preemption and returns the current CPU number. The CPU number is only used once while retrieving the address of the local's CPU deferred_flush pointer. We can instead use raw_cpu_ptr() while we remain preemptible. The worst thing that can happen is that flush_unmaps_timeout() is invoked multiple times: once by taskA after seeing HIGH_WATER_MARK and then preempted to another CPU and then by taskB which saw HIGH_WATER_MARK on the same CPU as taskA. It is also likely that ->size got from HIGH_WATER_MARK to 0 right after its read because another CPU invoked flush_unmaps_timeout() for this CPU. The access to flush_data is protected by a spinlock so even if we get migrated to another CPU or preempted - the data structure is protected. While at it, I marked deferred_flush static since I can't find a reference to it outside of this file. Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: iommu@lists.linux-foundation.org Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 12:24:40 +02:00
Peter Xu	b316d02a13	iommu/vt-d: Unwrap __get_valid_domain_for_dev() We do find_domain() in __get_valid_domain_for_dev(), while we do the same thing in get_valid_domain_for_dev(). No need to do it twice. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-05-30 11:41:32 +02:00
Thomas Gleixner	b608fe356f	iommu/vt-d: Adjust system_state checks To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state checks in dmar_parse_one_atsr() and dmar_iommu_notify_scope_dev() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Joerg Roedel <joro@8bytes.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20170516184735.712365947@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-05-23 10:01:36 +02:00
KarimAllah Ahmed	f73a7eee90	iommu/vt-d: Flush the IOTLB to get rid of the initial kdump mappings Ever since commit `091d42e43d` ("iommu/vt-d: Copy translation tables from old kernel") the kdump kernel copies the IOMMU context tables from the previous kernel. Each device mappings will be destroyed once the driver for the respective device takes over. This unfortunately breaks the workflow of mapping and unmapping a new context to the IOMMU. The mapping function assumes that either: 1) Unmapping did the proper IOMMU flushing and it only ever flush if the IOMMU unit supports caching invalid entries. 2) The system just booted and the initialization code took care of flushing all IOMMU caches. This assumption is not true for the kdump kernel since the context tables have been copied from the previous kernel and translations could have been cached ever since. So make sure to flush the IOTLB as well when we destroy these old copied mappings. Cc: Joerg Roedel <joro@8bytes.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Anthony Liguori <aliguori@amazon.com> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: stable@vger.kernel.org v4.2+ Fixes: `091d42e43d` ("iommu/vt-d: Copy translation tables from old kernel") Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-05-17 14:44:47 +02:00
Joerg Roedel	2c0248d688	Merge branches 'arm/exynos', 'arm/omap', 'arm/rockchip', 'arm/mediatek', 'arm/smmu', 'arm/core', 'x86/vt-d', 'x86/amd' and 'core' into next	2017-05-04 18:06:17 +02:00
Shaohua Li	bfd20f1cc8	x86, iommu/vt-d: Add an option to disable Intel IOMMU force on IOMMU harms performance signficantly when we run very fast networking workloads. It's 40GB networking doing XDP test. Software overhead is almost unaware, but it's the IOTLB miss (based on our analysis) which kills the performance. We observed the same performance issue even with software passthrough (identity mapping), only the hardware passthrough survives. The pps with iommu (with software passthrough) is only about ~30% of that without it. This is a limitation in hardware based on our observation, so we'd like to disable the IOMMU force on, but we do want to use TBOOT and we can sacrifice the DMA security bought by IOMMU. I must admit I know nothing about TBOOT, but TBOOT guys (cc-ed) think not eabling IOMMU is totally ok. So introduce a new boot option to disable the force on. It's kind of silly we need to run into intel_iommu_init even without force on, but we need to disable TBOOT PMR registers. For system without the boot option, nothing is changed. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-04-26 23:57:53 +02:00
Joerg Roedel	161b28aae1	iommu/vt-d: Make sure IOMMUs are off when intel_iommu=off When booting into a kexec kernel with intel_iommu=off, and the previous kernel had intel_iommu=on, the IOMMU hardware is still enabled and gets not disabled by the new kernel. This causes the boot to fail because DMA is blocked by the hardware. Disable the IOMMUs when we find it enabled in the kexec kernel and boot with intel_iommu=off. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-03-29 17:02:00 +02:00
Robin Murphy	9d3a4de4cb	iommu: Disambiguate MSI region types The introduction of reserved regions has left a couple of rough edges which we could do with sorting out sooner rather than later. Since we are not yet addressing the potential dynamic aspect of software-managed reservations and presenting them at arbitrary fixed addresses, it is incongruous that we end up displaying hardware vs. software-managed MSI regions to userspace differently, especially since ARM-based systems may actually require one or the other, or even potentially both at once, (which iommu-dma currently has no hope of dealing with at all). Let's resolve the former user-visible inconsistency ASAP before the ABI has been baked into a kernel release, in a way that also lays the groundwork for the latter shortcoming to be addressed by follow-up patches. For clarity, rename the software-managed type to IOMMU_RESV_SW_MSI, use IOMMU_RESV_MSI to describe the hardware type, and document everything a little bit. Since the x86 MSI remapping hardware falls squarely under this meaning of IOMMU_RESV_MSI, apply that type to their regions as well, so that we tell the same story to userspace across all platforms. Secondly, as the various region types require quite different handling, and it really makes little sense to ever try combining them, convert the bitfield-esque #defines to a plain enum in the process before anyone gets the wrong impression. Fixes: `d30ddcaa7b` ("iommu: Add a new type field in iommu_resv_region") Reviewed-by: Eric Auger <eric.auger@redhat.com> CC: Alex Williamson <alex.williamson@redhat.com> CC: David Woodhouse <dwmw2@infradead.org> CC: kvm@vger.kernel.org Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-03-22 16:16:17 +01:00
Koos Vriezen	5003ae1e73	iommu/vt-d: Fix NULL pointer dereference in device_to_iommu The function device_to_iommu() in the Intel VT-d driver lacks a NULL-ptr check, resulting in this oops at boot on some platforms: BUG: unable to handle kernel NULL pointer dereference at 00000000000007ab IP: [<ffffffff8132234a>] device_to_iommu+0x11a/0x1a0 PGD 0 [...] Call Trace: ? find_or_alloc_domain.constprop.29+0x1a/0x300 ? dw_dma_probe+0x561/0x580 [dw_dmac_core] ? __get_valid_domain_for_dev+0x39/0x120 ? __intel_map_single+0x138/0x180 ? intel_alloc_coherent+0xb6/0x120 ? sst_hsw_dsp_init+0x173/0x420 [snd_soc_sst_haswell_pcm] ? mutex_lock+0x9/0x30 ? kernfs_add_one+0xdb/0x130 ? devres_add+0x19/0x60 ? hsw_pcm_dev_probe+0x46/0xd0 [snd_soc_sst_haswell_pcm] ? platform_drv_probe+0x30/0x90 ? driver_probe_device+0x1ed/0x2b0 ? __driver_attach+0x8f/0xa0 ? driver_probe_device+0x2b0/0x2b0 ? bus_for_each_dev+0x55/0x90 ? bus_add_driver+0x110/0x210 ? 0xffffffffa11ea000 ? driver_register+0x52/0xc0 ? 0xffffffffa11ea000 ? do_one_initcall+0x32/0x130 ? free_vmap_area_noflush+0x37/0x70 ? kmem_cache_alloc+0x88/0xd0 ? do_init_module+0x51/0x1c4 ? load_module+0x1ee9/0x2430 ? show_taint+0x20/0x20 ? kernel_read_file+0xfd/0x190 ? SyS_finit_module+0xa3/0xb0 ? do_syscall_64+0x4a/0xb0 ? entry_SYSCALL64_slow_path+0x25/0x25 Code: 78 ff ff ff 4d 85 c0 74 ee 49 8b 5a 10 0f b6 9b e0 00 00 00 41 38 98 e0 00 00 00 77 da 0f b6 eb 49 39 a8 88 00 00 00 72 ce eb 8f <41> f6 82 ab 07 00 00 04 0f 85 76 ff ff ff 0f b6 4d 08 88 0e 49 RIP [<ffffffff8132234a>] device_to_iommu+0x11a/0x1a0 RSP <ffffc90001457a78> CR2: 00000000000007ab ---[ end trace 16f974b6d58d0aad ]--- Add the missing pointer check. Fixes: `1c387188c6` ("iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions") Signed-off-by: Koos Vriezen <koos.vriezen@gmail.com> Cc: stable@vger.kernel.org # 4.8.15+ Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-03-22 12:01:42 +01:00
Joerg Roedel	a7fdb6e648	iommu/vt-d: Fix crash when accessing VT-d sysfs entries The link between the iommu sysfs-device and the struct intel_iommu is no longer stored as driver-data. Update the code to use the new access method. Reported-by: Dave Jones <davej@codemonkey.org.uk> Fixes: `39ab9555c2` ('iommu: Add sysfs bindings for struct iommu_device') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-02-28 15:48:15 +01:00
Lucas Stach	712c604dcd	mm: wire up GFP flag passing in dma_alloc_from_contiguous The callers of the DMA alloc functions already provide the proper context GFP flags. Make sure to pass them through to the CMA allocator, to make the CMA compaction context aware. Link: http://lkml.kernel.org/r/20170127172328.18574-3-l.stach@pengutronix.de Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Radim Krcmar <rkrcmar@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Alexander Graf <agraf@suse.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-24 17:46:55 -08:00
Joerg Roedel	8d2932dd06	Merge branches 'iommu/fixes', 'arm/exynos', 'arm/renesas', 'arm/smmu', 'arm/mediatek', 'arm/core', 'x86/vt-d' and 'core' into next	2017-02-10 15:13:10 +01:00
Joerg Roedel	e3d10af112	iommu: Make iommu_device_link/unlink take a struct iommu_device This makes the interface more consistent with iommu_device_sysfs_add/remove. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-02-10 13:44:57 +01:00
Joerg Roedel	39ab9555c2	iommu: Add sysfs bindings for struct iommu_device There is currently support for iommu sysfs bindings, but those need to be implemented in the IOMMU drivers. Add a more generic version of this by adding a struct device to struct iommu_device and use that for the sysfs bindings. Also convert the AMD and Intel IOMMU driver to make use of it. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-02-10 13:44:57 +01:00
Joerg Roedel	b0119e8708	iommu: Introduce new 'struct iommu_device' This struct represents one hardware iommu in the iommu core code. For now it only has the iommu-ops associated with it, but that will be extended soon. The register/unregister interface is also added, as well as making use of it in the Intel and AMD IOMMU drivers. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-02-10 13:44:57 +01:00
David Dillow	f7116e115a	iommu/vt-d: Don't over-free page table directories dma_pte_free_level() recurses down the IOMMU page tables and frees directory pages that are entirely contained in the given PFN range. Unfortunately, it incorrectly calculates the starting address covered by the PTE under consideration, which can lead to it clearing an entry that is still in use. This occurs if we have a scatterlist with an entry that has a length greater than 1026 MB and is aligned to 2 MB for both the IOMMU and physical addresses. For example, if __domain_mapping() is asked to map a two-entry scatterlist with 2 MB and 1028 MB segments to PFN 0xffff80000, it will ask if dma_pte_free_pagetable() is asked to PFNs from 0xffff80200 to 0xffffc05ff, it will also incorrectly clear the PFNs from 0xffff80000 to 0xffff801ff because of this issue. The current code will set level_pfn to 0xffff80200, and 0xffff80200-0xffffc01ff fits inside the range being cleared. Properly setting the level_pfn for the current level under consideration catches that this PTE is outside of the range being cleared. This patch also changes the value passed into dma_pte_free_level() when it recurses. This only affects the first PTE of the range being cleared, and is handled by the existing code that ensures we start our cursor no lower than start_pfn. This was found when using dma_map_sg() to map large chunks of contiguous memory, which immediatedly led to faults on the first access of the erroneously-deleted mappings. Fixes: `3269ee0bd6` ("intel-iommu: Fix leaks in pagetable freeing") Reviewed-by: Benjamin Serebrin <serebrin@google.com> Signed-off-by: David Dillow <dillow@google.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-01-31 12:50:05 +01:00
Ashok Raj	21e722c4c8	iommu/vt-d: Tylersburg isoch identity map check is done too late. The check to set identity map for tylersburg is done too late. It needs to be done before the check for identity_map domain is done. To: Joerg Roedel <joro@8bytes.org> To: David Woodhouse <dwmw2@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Cc: Ashok Raj <ashok.raj@intel.com> Fixes: `86080ccc22` ("iommu/vt-d: Allocate si_domain in init_dmars()") Signed-off-by: Ashok Raj <ashok.raj@intel.com> Reported-by: Yunhong Jiang <yunhong.jiang@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-01-31 12:50:05 +01:00
Eric Auger	0659b8dc45	iommu/vt-d: Implement reserved region get/put callbacks This patch registers the [FEE0_0000h - FEF0_000h] 1MB MSI range as a reserved region and RMRR regions as direct regions. This will allow to report those reserved regions in the iommu-group sysfs. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-01-23 11:48:17 +00:00
Jacob Pan	65ca7f5f7d	iommu/vt-d: Fix pasid table size encoding Different encodings are used to represent supported PASID bits and number of PASID table entries. The current code assigns ecap_pss directly to extended context table entry PTS which is wrong and could result in writing non-zero bits to the reserved fields. IOMMU fault reason 11 will be reported when reserved bits are nonzero. This patch converts ecap_pss to extend context entry pts encoding based on VT-d spec. Chapter 9.4 as follows: - number of PASID bits = ecap_pss + 1 - number of PASID table entries = 2^(pts + 5) Software assigned limit of pasid_max value is also respected to match the allocation limitation of PASID table. cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Tested-by: Mika Kuoppala <mika.kuoppala@intel.com> Fixes: `2f26e0a9c9` ('iommu/vt-d: Add basic SVM PASID support') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-01-04 15:18:57 +01:00
Xunlei Pang	aec0e86172	iommu/vt-d: Flush old iommu caches for kdump when the device gets context mapped We met the DMAR fault both on hpsa P420i and P421 SmartArray controllers under kdump, it can be steadily reproduced on several different machines, the dmesg log is like: HP HPSA Driver (v 3.4.16-0) hpsa 0000:02:00.0: using doorbell to reset controller hpsa 0000:02:00.0: board ready after hard reset. hpsa 0000:02:00.0: Waiting for controller to respond to no-op DMAR: Setting identity map for device 0000:02:00.0 [0xe8000 - 0xe8fff] DMAR: Setting identity map for device 0000:02:00.0 [0xf4000 - 0xf4fff] DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6e000 - 0xbdf6efff] DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6f000 - 0xbdf7efff] DMAR: Setting identity map for device 0000:02:00.0 [0xbdf7f000 - 0xbdf82fff] DMAR: Setting identity map for device 0000:02:00.0 [0xbdf83000 - 0xbdf84fff] DMAR: DRHD: handling fault status reg 2 DMAR: [DMA Read] Request device [02:00.0] fault addr fffff000 [fault reason 06] PTE Read access is not set hpsa 0000:02:00.0: controller message 03:00 timed out hpsa 0000:02:00.0: no-op failed; re-trying After some debugging, we found that the fault addr is from DMA initiated at the driver probe stage after reset(not in-flight DMA), and the corresponding pte entry value is correct, the fault is likely due to the old iommu caches of the in-flight DMA before it. Thus we need to flush the old cache after context mapping is setup for the device, where the device is supposed to finish reset at its driver probe stage and no in-flight DMA exists hereafter. I'm not sure if the hardware is responsible for invalidating all the related caches allocated in the iommu hardware before, but seems not the case for hpsa, actually many device drivers have problems in properly resetting the hardware. Anyway flushing (again) by software in kdump kernel when the device gets context mapped which is a quite infrequent operation does little harm. With this patch, the problematic machine can survive the kdump tests. CC: Myron Stowe <myron.stowe@gmail.com> CC: Joseph Szczypek <jszczype@redhat.com> CC: Don Brace <don.brace@microsemi.com> CC: Baoquan He <bhe@redhat.com> CC: Dave Young <dyoung@redhat.com> Fixes: `091d42e43d` ("iommu/vt-d: Copy translation tables from old kernel") Fixes: `dbcd861f25` ("iommu/vt-d: Do not re-use domain-ids from the old kernel") Fixes: `cf484d0e69` ("iommu/vt-d: Mark copied context entries") Signed-off-by: Xunlei Pang <xlpang@redhat.com> Tested-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-01-04 15:14:04 +01:00
Linus Torvalds	e71c3978d6	Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull smp hotplug updates from Thomas Gleixner: "This is the final round of converting the notifier mess to the state machine. The removal of the notifiers and the related infrastructure will happen around rc1, as there are conversions outstanding in other trees. The whole exercise removed about 2000 lines of code in total and in course of the conversion several dozen bugs got fixed. The new mechanism allows to test almost every hotplug step standalone, so usage sites can exercise all transitions extensively. There is more room for improvement, like integrating all the pointlessly different architecture mechanisms of synchronizing, setting cpus online etc into the core code" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) tracing/rb: Init the CPU mask on allocation soc/fsl/qbman: Convert to hotplug state machine soc/fsl/qbman: Convert to hotplug state machine zram: Convert to hotplug state machine KVM/PPC/Book3S HV: Convert to hotplug state machine arm64/cpuinfo: Convert to hotplug state machine arm64/cpuinfo: Make hotplug notifier symmetric mm/compaction: Convert to hotplug state machine iommu/vt-d: Convert to hotplug state machine mm/zswap: Convert pool to hotplug state machine mm/zswap: Convert dst-mem to hotplug state machine mm/zsmalloc: Convert to hotplug state machine mm/vmstat: Convert to hotplug state machine mm/vmstat: Avoid on each online CPU loops mm/vmstat: Drop get_online_cpus() from init_cpu_node_state/vmstat_cpu_dead() tracing/rb: Convert to hotplug state machine oprofile/nmi timer: Convert to hotplug state machine net/iucv: Use explicit clean up labels in iucv_init() x86/pci/amd-bus: Convert to hotplug state machine x86/oprofile/nmi: Convert to hotplug state machine ...	2016-12-12 19:25:04 -08:00
Anna-Maria Gleixner	21647615db	iommu/vt-d: Convert to hotplug state machine Install the callbacks via the state machine. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Joerg Roedel <joro@8bytes.org> Cc: iommu@lists.linux-foundation.org Cc: rt@linutronix.de Cc: David Woodhouse <dwmw2@infradead.org> Link: http://lkml.kernel.org/r/20161126231350.10321-14-bigeasy@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-12-02 00:52:37 +01:00
Linus Torvalds	105ecadc6d	Merge git://git.infradead.org/intel-iommu Pull IOMMU fixes from David Woodhouse: "Two minor fixes. The first fixes the assignment of SR-IOV virtual functions to the correct IOMMU unit, and the second fixes the excessively large (and physically contiguous) PASID tables used with SVM" * git://git.infradead.org/intel-iommu: iommu/vt-d: Fix PASID table allocation iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions	2016-11-27 08:24:46 -08:00
Joerg Roedel	bea64033dd	iommu/vt-d: Fix dead-locks in disable_dmar_iommu() path It turns out that the disable_dmar_iommu() code-path tried to get the device_domain_lock recursivly, which will dead-lock when this code runs on dmar removal. Fix both code-paths that could lead to the dead-lock. Fixes: `55d940430a` ('iommu/vt-d: Get rid of domain->iommu_lock') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-11-08 15:08:26 +01:00
Ashok Raj	1c387188c6	iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions The VT-d specification (§8.3.3) says: ‘Virtual Functions’ of a ‘Physical Function’ are under the scope of the same remapping unit as the ‘Physical Function’. The BIOS is not required to list all the possible VFs in the scope tables, and arguably shouldn't make any attempt to do so, since there could be a huge number of them. This has been broken basically for ever — the VF is never going to match against a specific unit's scope, so it ends up being assigned to the INCLUDE_ALL IOMMU. Which was always actually correct by coincidence, but now we're looking at Root-Complex integrated devices with SR-IOV support it's going to start being wrong. Fix it to simply use pci_physfn() before doing the lookup for PCI devices. Cc: stable@vger.kernel.org Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2016-10-30 05:32:51 -06:00
Joerg Roedel	1c5ebba95b	iommu/vt-d: Make sure RMRRs are mapped before domain goes public When a domain is allocated through the get_valid_domain_for_dev path, it will be context-mapped before the RMRR regions are mapped in the page-table. This opens a short time window where device-accesses to these regions fail and causing DMAR faults. Fix this by mapping the RMRR regions before the domain is context-mapped. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-09-05 13:00:28 +02:00
Joerg Roedel	76208356a0	iommu/vt-d: Split up get_domain_for_dev function Split out the search for an already existing domain and the context mapping of the device to the new domain. This allows to map possible RMRR regions into the domain before it is context mapped. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-09-05 13:00:28 +02:00
Krzysztof Kozlowski	00085f1efa	dma-mapping: use unsigned long for dma_attrs The dma-mapping core and the implementations do not change the DMA attributes passed by pointer. Thus the pointer can point to const data. However the attributes do not have to be a bitfield. Instead unsigned long will do fine: 1. This is just simpler. Both in terms of reading the code and setting attributes. Instead of initializing local attributes on the stack and passing pointer to it to dma_set_attr(), just set the bits. 2. It brings safeness and checking for const correctness because the attributes are passed by value. Semantic patches for this change (at least most of them): virtual patch virtual context @r@ identifier f, attrs; @@ f(..., - struct dma_attrs attrs + unsigned long attrs , ...) { ... } @@ identifier r.f; @@ f(..., - NULL + 0 ) and // Options: --all-includes virtual patch virtual context @r@ identifier f, attrs; type t; @@ t f(..., struct dma_attrs attrs); @@ identifier r.f; @@ f(..., - NULL + 0 ) Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> Acked-by: Mark Salter <msalter@redhat.com> [c6x] Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris] Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm] Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp] Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core] Acked-by: David Vrabel <david.vrabel@citrix.com> [xen] Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb] Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390] Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32] Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc] Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-08-04 08:50:07 -04:00
Linus Torvalds	dd9671172a	IOMMU Updates for Linux v4.8 In the updates: * Big endian support and preparation for defered probing for the Exynos IOMMU driver * Simplifications in iommu-group id handling * Support for Mediatek generation one IOMMU hardware * Conversion of the AMD IOMMU driver to use the generic IOVA allocator. This driver now also benefits from the recent scalability improvements in the IOVA code. * Preparations to use generic DMA mapping code in the Rockchip IOMMU driver * Device tree adaption and conversion to use generic page-table code for the MSM IOMMU driver * An iova_to_phys optimization in the ARM-SMMU driver to greatly improve page-table teardown performance with VFIO * Various other small fixes and conversions -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJXl3e+AAoJECvwRC2XARrjMIgP/1Mm9qIfcaAxKY4ByqbVfrH8 313PO6rpwUhhywUmnf/1F/x+JbuLv8MmRXfSc106mdB1rq9NXpkORYKrqVxs0cSq 6u6TzZWbF6WN1ipqXxDITNFBSy7u97K1VuFaKyYFfLbg8xrkcdkMZJ7BqM2xIEdk rnRKcfHo6wsmCXJ6InsUPmKAqU6AfMewZTGjO+v77Gce0rZEbsJ8n7BRKC9vO2bc akvN2W+zzEUSyhbuyYQBG+agpmC5GJvz4u+6QvAP5sxTWfAsnwAoPpP4xxR+/KjT eicHlja4v0YK6Hr4AJaMxoKfKIrCdqpWm0D2tg/edyWZCeg98AW/w7/s0I8OD3ao Otj6IqC8nPk0pYciOeEPQ7aqPbvKAqU2FYWt7lWamrdr98u2R3p2nXGl0KthoAj6 JqzrCZXvBS7sj1IPLlGpj939yvbKbjpE0p7y1qhI1VEBXoBWFNvlKydkYx76BTGK F6paGVqn2Zwy00AqAsylTEkvIK063zwShZ6nPqz4bMdVlgzjrjCzdDecjfbHr8Ic 6D2oCwyF+RJ8qw+Ecm9EmWFik80sgb+iUTeeYEXNf+YzLYt5McIj7fi3N+sUPel3 YJ4S4x0sIpgUZZ1i+rOo8ZPAFHRU6SRPYV+ewaeYKrMt+Un5dTn9SddpqrJdbiUu YrF36BaQjc123IRGKrSd =xiS2 -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: - big-endian support and preparation for defered probing for the Exynos IOMMU driver - simplifications in iommu-group id handling - support for Mediatek generation one IOMMU hardware - conversion of the AMD IOMMU driver to use the generic IOVA allocator. This driver now also benefits from the recent scalability improvements in the IOVA code. - preparations to use generic DMA mapping code in the Rockchip IOMMU driver - device tree adaption and conversion to use generic page-table code for the MSM IOMMU driver - an iova_to_phys optimization in the ARM-SMMU driver to greatly improve page-table teardown performance with VFIO - various other small fixes and conversions * tag 'iommu-updates-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (59 commits) iommu/amd: Initialize dma-ops domains with 3-level page-table iommu/amd: Update Alias-DTE in update_device_table() iommu/vt-d: Return error code in domain_context_mapping_one() iommu/amd: Use container_of to get dma_ops_domain iommu/amd: Flush iova queue before releasing dma_ops_domain iommu/amd: Handle IOMMU_DOMAIN_DMA in ops->domain_free call-back iommu/amd: Use dev_data->domain in get_domain() iommu/amd: Optimize map_sg and unmap_sg iommu/amd: Introduce dir2prot() helper iommu/amd: Implement timeout to flush unmap queues iommu/amd: Implement flush queue iommu/amd: Allow NULL pointer parameter for domain_flush_complete() iommu/amd: Set up data structures for flush queue iommu/amd: Remove align-parameter from __map_single() iommu/amd: Remove other remains of old address allocator iommu/amd: Make use of the generic IOVA allocator iommu/amd: Remove special mapping code for dma_ops path iommu/amd: Pass gfp-flags to iommu_map_page() iommu/amd: Implement apply_dm_region call-back iommu/amd: Create a list of reserved iova addresses ...	2016-08-01 07:25:10 -04:00
Linus Torvalds	194dc870a5	Add braces to avoid "ambiguous ‘else’" compiler warnings Some of our "for_each_xyz()" macro constructs make gcc unhappy about lack of braces around if-statements inside or outside the loop, because the loop construct itself has a "if-then-else" statement inside of it. The resulting warnings look something like this: drivers/gpu/drm/i915/i915_debugfs.c: In function ‘i915_dump_lrc’: drivers/gpu/drm/i915/i915_debugfs.c:2103:6: warning: suggest explicit braces to avoid ambiguous ‘else’ [-Wparentheses] if (ctx != dev_priv->kernel_context) ^ even if the code itself is fine. Since the warning is fairly easy to avoid by adding a braces around the if-statement near the for_each_xyz() construct, do so, rather than disabling the otherwise potentially useful warning. (The if-then-else statements used in the "for_each_xyz()" constructs are designed to be inherently safe even with no braces, but in this case it's quite understandable that gcc isn't really able to tell that). This finally leaves the standard "allmodconfig" build with just a handful of remaining warnings, so new and valid warnings hopefully will stand out. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-07-27 20:03:31 -07:00
Joerg Roedel	f360d3241f	Merge branches 'x86/amd', 'x86/vt-d', 'arm/exynos', 'arm/mediatek', 'arm/msm', 'arm/rockchip', 'arm/smmu' and 'core' into next	2016-07-26 16:02:37 +02:00
Wei Yang	5c365d18a7	iommu/vt-d: Return error code in domain_context_mapping_one() In 'commit <55d940430ab9> ("iommu/vt-d: Get rid of domain->iommu_lock")', the error handling path is changed a little, which makes the function always return 0. This path fixes this. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Fixes: `55d940430a` ('iommu/vt-d: Get rid of domain->iommu_lock') Cc: stable@vger.kernel.org # v4.3+ Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-07-14 10:26:30 +02:00
Aaron Campbell	0caa7616a6	iommu/vt-d: Fix infinite loop in free_all_cpu_cached_iovas Per VT-d spec Section 10.4.2 ("Capability Register"), the maximum number of possible domains is 64K; indeed this is the maximum value that the cap_ndoms() macro will expand to. Since the value 65536 will not fix in a u16, the 'did' variable must be promoted to an int, otherwise the test for < 65536 will always be true and the loop will never end. The symptom, in my case, was a hung machine during suspend. Fixes: `3bd4f9112f` ("iommu/vt-d: Fix overflow of iommu->domains array") Signed-off-by: Aaron Campbell <aaron@monkey.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-07-04 13:34:52 +02:00
Jan Niehusmann	3bd4f9112f	iommu/vt-d: Fix overflow of iommu->domains array The valid range of 'did' in get_iommu_domain(*iommu, did) is 0..cap_ndoms(iommu->cap), so don't exceed that range in free_all_cpu_cached_iovas(). The user-visible impact of the out-of-bounds access is the machine hanging on suspend-to-ram. It is, in fact, a kernel panic, but due to already suspended devices, that's often not visible to the user. Fixes: `22e2f9fa63` ("iommu/vt-d: Use per-cpu IOVA caching") Signed-off-by: Jan Niehusmann <jan@gondor.com> Tested-By: Marius Vlad <marius.c.vlad@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-06-27 13:21:37 +02:00
Joerg Roedel	a4c34ff1c0	iommu/vt-d: Enable QI on all IOMMUs before setting root entry This seems to be required on some X58 chipsets on systems with more than one IOMMU. QI does not work until it is enabled on all IOMMUs in the system. Reported-by: Dheeraj CVR <cvr.dheeraj@gmail.com> Tested-by: Dheeraj CVR <cvr.dheeraj@gmail.com> Fixes: `5f0a7f7614` ('iommu/vt-d: Make root entry visible for hardware right after allocation') Cc: stable@vger.kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-06-17 11:29:48 +02:00
Wei Yang	86f004c77c	iommu/vt-d: Reduce extra first level entry in iommu->domains In commit <8bf478163e69> ("iommu/vt-d: Split up iommu->domains array"), it it splits iommu->domains in two levels. Each first level contains 256 entries of second level. In case of the ndomains is exact a multiple of 256, it would have one more extra first level entry for current implementation. This patch refines this calculation to reduce the extra first level entry. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-06-15 13:36:58 +02:00
Linus Torvalds	2566278551	Merge git://git.infradead.org/intel-iommu Pull intel IOMMU updates from David Woodhouse: "This patchset improves the scalability of the Intel IOMMU code by resolving two spinlock bottlenecks and eliminating the linearity of the IOVA allocator, yielding up to ~5x performance improvement and approaching 'iommu=off' performance" * git://git.infradead.org/intel-iommu: iommu/vt-d: Use per-cpu IOVA caching iommu/iova: introduce per-cpu caching to iova allocation iommu/vt-d: change intel-iommu to use IOVA frame numbers iommu/vt-d: avoid dev iotlb logic for domains with no dev iotlbs iommu/vt-d: only unmap mapped entries iommu/vt-d: correct flush_unmaps pfn usage iommu/vt-d: per-cpu deferred invalidation queues iommu/vt-d: refactoring of deferred flush entries	2016-05-27 13:49:24 -07:00
Joerg Roedel	6c0b43df74	Merge branches 'arm/io-pgtable', 'arm/rockchip', 'arm/omap', 'x86/vt-d', 'ppc/pamu', 'core' and 'x86/amd' into next	2016-05-09 19:39:17 +02:00
Omer Peleg	22e2f9fa63	iommu/vt-d: Use per-cpu IOVA caching Commit `9257b4a2` ('iommu/iova: introduce per-cpu caching to iova allocation') introduced per-CPU IOVA caches to massively improve scalability. Use them. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> [dwmw2: split out VT-d part into a separate patch] Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2016-04-20 15:44:48 -04:00
Omer Peleg	2aac630429	iommu/vt-d: change intel-iommu to use IOVA frame numbers Make intel-iommu map/unmap/invalidate work with IOVA pfns instead of pointers to "struct iova". This avoids using the iova struct from the IOVA red-black tree and the resulting explicit find_iova() on unmap. This patch will allow us to cache IOVAs in the next patch, in order to avoid rbtree operations for the majority of map/unmap operations. Note: In eliminating the find_iova() operation, we have also eliminated the sanity check previously done in the unmap flow. Arguably, this was overhead that is better avoided in production code, but it could be brought back as a debug option for driver development. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased, fixed to not break iova api, and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2016-04-20 15:07:22 -04:00
Omer Peleg	0824c5920b	iommu/vt-d: avoid dev iotlb logic for domains with no dev iotlbs This patch avoids taking the device_domain_lock in iommu_flush_dev_iotlb() for domains with no dev iotlb devices. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [gvdl@google.com: fixed locking issues] Signed-off-by: Godfrey van der Linden <gvdl@google.com> [mad@cs.technion.ac.il: rebased and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2016-04-20 15:06:15 -04:00
Omer Peleg	769530e4ba	iommu/vt-d: only unmap mapped entries Current unmap implementation unmaps the entire area covered by the IOVA range, which is a power-of-2 aligned region. The corresponding map, however, only maps those pages originally mapped by the user. This discrepancy can lead to unmapping of already unmapped entries, which is unneeded work. With this patch, only mapped pages are unmapped. This is also a baseline for a map/unmap implementation based on IOVAs and not iova structures, which will allow caching. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2016-04-20 15:06:01 -04:00
Omer Peleg	f5c0c08b1e	iommu/vt-d: correct flush_unmaps pfn usage Change flush_unmaps() to correctly pass iommu_flush_iotlb_psi() dma addresses. (x86_64 mm and dma have the same size for pages at the moment, but this usage improves consistency.) Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2016-04-20 15:05:56 -04:00
Omer Peleg	aa4732406e	iommu/vt-d: per-cpu deferred invalidation queues The IOMMU's IOTLB invalidation is a costly process. When iommu mode is not set to "strict", it is done asynchronously. Current code amortizes the cost of invalidating IOTLB entries by batching all the invalidations in the system and performing a single global invalidation instead. The code queues pending invalidations in a global queue that is accessed under the global "async_umap_flush_lock" spinlock, which can result is significant spinlock contention. This patch splits this deferred queue into multiple per-cpu deferred queues, and thus gets rid of the "async_umap_flush_lock" and its contention. To keep existing deferred invalidation behavior, it still invalidates the pending invalidations of all CPUs whenever a CPU reaches its watermark or a timeout occurs. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2016-04-20 15:05:24 -04:00

1 2 3 4 5 ...

325 Commits