linux

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	e5a594643a	dma-mapping updates for 4.18: - replaceme the force_dma flag with a dma_configure bus method. (Nipun Gupta, although one patch is іncorrectly attributed to me due to a git rebase bug) - use GFP_DMA32 more agressively in dma-direct. (Takashi Iwai) - remove PCI_DMA_BUS_IS_PHYS and rely on the dma-mapping API to do the right thing for bounce buffering. - move dma-debug initialization to common code, and apply a few cleanups to the dma-debug code. - cleanup the Kconfig mess around swiotlb selection - swiotlb comment fixup (Yisheng Xie) - a trivial swiotlb fix. (Dan Carpenter) - support swiotlb on RISC-V. (based on a patch from Palmer Dabbelt) - add a new generic dma-noncoherent dma_map_ops implementation and use it for arc, c6x and nds32. - improve scatterlist validity checking in dma-debug. (Robin Murphy) - add a struct device quirk to limit the dma-mask to 32-bit due to bridge/system issues, and switch x86 to use it instead of a local hack for VIA bridges. - handle devices without a dma_mask more gracefully in the dma-direct code. -----BEGIN PGP SIGNATURE----- iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlsU1hwLHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYPraxAAocC7JiFKW133/VugCtGA1x9uE8DPHealtsWTAeEq KOOB3GxWMU2hKqQ4km5tcfdWoGJvvab6hmDXcitzZGi2JajO7Ae0FwIy3yvxSIKm iH/ON7c4sJt8gKrXYsLVylmwDaimNs4a6xfODoCRgnWuovI2QrrZzupnlzPNsiOC lv8ezzcW+Ay/gvDD/r72psO+w3QELETif/OzR/qTOtvLrVabM06eHmPQ8Wb98smu /UPMMv6/3XwQnxpxpdyqN+p/gUdneXithzT261wTeZ+8gDXmcWBwHGcMBCimcoBi FklW52moazIPIsTysqoNlVFsLGJTeS4p2D3BLAp5NwWYsLv+zHUVZsI1JY/8u5Ox mM11LIfvu9JtUzaqD9SvxlxIeLhhYZZGnUoV3bQAkpHSQhN/xp2YXd5NWSo5ac2O dch83+laZkZgd6ryw6USpt/YTPM/UHBYy7IeGGHX/PbmAke0ZlvA6Rae7kA5DG59 7GaLdwQyrHp8uGFgwze8P+R4POSk1ly73HHLBT/pFKnDD7niWCPAnBzuuEQGJs00 0zuyWLQyzOj1l6HCAcMNyGnYSsMp8Fx0fvEmKR/EYs8O83eJKXi6L9aizMZx4v1J 0wTolUWH6SIIdz474YmewhG5YOLY7mfe9E8aNr8zJFdwRZqwaALKoteRGUxa3f6e zUE= =6Acj -----END PGP SIGNATURE----- Merge tag 'dma-mapping-4.18' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping updates from Christoph Hellwig: - replace the force_dma flag with a dma_configure bus method. (Nipun Gupta, although one patch is іncorrectly attributed to me due to a git rebase bug) - use GFP_DMA32 more agressively in dma-direct. (Takashi Iwai) - remove PCI_DMA_BUS_IS_PHYS and rely on the dma-mapping API to do the right thing for bounce buffering. - move dma-debug initialization to common code, and apply a few cleanups to the dma-debug code. - cleanup the Kconfig mess around swiotlb selection - swiotlb comment fixup (Yisheng Xie) - a trivial swiotlb fix. (Dan Carpenter) - support swiotlb on RISC-V. (based on a patch from Palmer Dabbelt) - add a new generic dma-noncoherent dma_map_ops implementation and use it for arc, c6x and nds32. - improve scatterlist validity checking in dma-debug. (Robin Murphy) - add a struct device quirk to limit the dma-mask to 32-bit due to bridge/system issues, and switch x86 to use it instead of a local hack for VIA bridges. - handle devices without a dma_mask more gracefully in the dma-direct code. * tag 'dma-mapping-4.18' of git://git.infradead.org/users/hch/dma-mapping: (48 commits) dma-direct: don't crash on device without dma_mask nds32: use generic dma_noncoherent_ops nds32: implement the unmap_sg DMA operation nds32: consolidate DMA cache maintainance routines x86/pci-dma: switch the VIA 32-bit DMA quirk to use the struct device flag x86/pci-dma: remove the explicit nodac and allowdac option x86/pci-dma: remove the experimental forcesac boot option Documentation/x86: remove a stray reference to pci-nommu.c core, dma-direct: add a flag 32-bit dma limits dma-mapping: remove unused gfp_t parameter to arch_dma_alloc_attrs dma-debug: check scatterlist segments c6x: use generic dma_noncoherent_ops arc: use generic dma_noncoherent_ops arc: fix arc_dma_{map,unmap}_page arc: fix arc_dma_sync_sg_for_{cpu,device} arc: simplify arc_dma_sync_single_for_{cpu,device} dma-mapping: provide a generic dma-noncoherent implementation dma-mapping: simplify Kconfig dependencies riscv: add swiotlb support riscv: only enable ZONE_DMA32 for 64-bit ...	2018-06-04 10:58:12 -07:00
Joerg Roedel	1f56835711	Merge branches 'arm/io-pgtable', 'arm/qcom', 'arm/tegra', 'x86/vt-d', 'x86/amd' and 'core' into next	2018-05-29 16:58:17 +02:00
Robin Murphy	4b123757ee	iommu/io-pgtable-arm: Make allocations NUMA-aware We would generally expect pagetables to be read by the IOMMU more than written by the CPU, so in NUMA systems it makes sense to locate them close to the former and avoid cross-node pagetable walks if at all possible. As it turns out, we already have a handle on the IOMMU device for the sake of coherency management, so it's trivial to grab the appropriate NUMA node when allocating new pagetable pages. Note that we drop the semantics of alloc_pages_exact(), but that's fine since they have never been necessary: the only time we're allocating more than one page is for stage 2 top-level concatenation, but since that is based on the number of IPA bits, the size is always some exact power of two anyway. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-29 16:57:58 +02:00
Anna-Maria Gleixner	ea3fd04028	iommu/amd: Prevent possible null pointer dereference and infinite loop The check for !dev_data->domain in __detach_device() emits a warning and returns. The calling code in detach_device() dereferences dev_data->domain afterwards unconditionally, so in case that dev_data->domain is NULL the warning will be immediately followed by a NULL pointer dereference. The calling code in cleanup_domain() loops infinite when !dev_data->domain and the check in __detach_device() returns immediately because dev_list is not changed. do_detach() duplicates this check without throwing a warning. Move the check with the explanation of the do_detach() code into the caller detach_device() and return immediately. Throw an error, when hitting the condition in cleanup_domain(). Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-15 16:39:18 +02:00
Anna-Maria Gleixner	29a0c41541	iommu/amd: Fix grammar of comments Suggested-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-15 16:36:43 +02:00
Lu Baolu	1eefe5a034	iommu: Clean up the comments for iommu_group_alloc @name parameter has been removed. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-15 16:34:59 +02:00
Lu Baolu	bb37f7db07	iommu/vt-d: Remove unnecessary parentheses Remove unnecessary parentheses to comply with preferred coding style. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-15 16:34:53 +02:00
Lu Baolu	ab96746aaa	iommu/vt-d: Clean up pasid quirk for pre-production devices The pasid28 quirk is needed only for some pre-production devices. Remove it to make the code concise. Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-15 16:34:52 +02:00
Lu Baolu	fcc35c6342	iommu/vt-d: Clean up unused variable in find_or_alloc_domain Remove it to make the code more concise. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-15 16:34:52 +02:00
Peter Xu	87684fd997	iommu/vt-d: Fix iotlb psi missing for mappings When caching mode is enabled for IOMMU, we should send explicit IOTLB PSIs even for newly created mappings. However these events are missing for all intel_iommu_map() callers, e.g., iommu_map(). One direct user is the vfio-pci driver. To make sure we'll send the PSIs always when necessary, this patch firstly introduced domain_mapping() helper for page mappings, then fixed the problem by generalizing the explicit map IOTLB PSI logic into that new helper. With that, we let iommu_domain_identity_map() to use the simplified version to avoid sending the notifications, while for all the rest of cases we send the notifications always. For VM case, we send the PSIs to all the backend IOMMUs for the domain. This patch allows the nested device assignment to work with QEMU (assign device firstly to L1 guest, then assign it again to L2 guest). Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-15 16:31:41 +02:00
Peter Xu	eed91a0b85	iommu/vt-d: Introduce __mapping_notify_one() Introduce this new helper to notify one newly created mapping on one single IOMMU. We can further leverage this helper in the next patch. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-15 16:31:41 +02:00
Andy Shevchenko	7f9584df84	iommu: Remove extra NULL check when call strtobool() strtobool() does check for NULL parameter already. No need to repeat. While here, switch to kstrtobool() and unshadow actual error code (which is still -EINVAL). No functional change intended. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-15 16:27:48 +02:00
Gil Kupfer	cef74409ea	PCI: Add "pci=noats" boot parameter Adds a "pci=noats" boot parameter. When supplied, all ATS related functions fail immediately and the IOMMU is configured to not use device-IOTLB. Any function that checks for ATS capabilities directly against the devices should also check this flag. Currently, such functions exist only in IOMMU drivers, and they are covered by this patch. The motivation behind this patch is the existence of malicious devices. Lots of research has been done about how to use the IOMMU as protection from such devices. When ATS is supported, any I/O device can access any physical address by faking device-IOTLB entries. Adding the ability to ignore these entries lets sysadmins enhance system security. Signed-off-by: Gil Kupfer <gilkup@cs.technion.ac.il> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Joerg Roedel <jroedel@suse.de>	2018-05-10 17:56:02 -05:00
Christoph Hellwig	f616ab59c2	dma-mapping: move the NEED_DMA_MAP_STATE config symbol to lib/Kconfig This way we have one central definition of it, and user can select it as needed. Note that we now also always select it when CONFIG_DMA_API_DEBUG is select, which fixes some incorrect checks in a few network drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>	2018-05-09 06:56:08 +02:00
Arnd Bergmann	40fa84e101	iommu: rockchip: fix building without CONFIG_OF We get a build error when compiling the iommu driver without CONFIG_OF: drivers/iommu/rockchip-iommu.c: In function 'rk_iommu_of_xlate': drivers/iommu/rockchip-iommu.c:1101:2: error: implicit declaration of function 'of_dev_put'; did you mean 'of_node_put'? [-Werror=implicit-function-declaration] This replaces the of_dev_put() with the equivalent platform_device_put(). Fixes: `5fd577c3ea` ("iommu/rockchip: Use OF_IOMMU to attach devices automatically") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-03 16:36:07 +02:00
Gary R Hook	e7f63ffc1b	iommu/amd: Update logging information for new event type A new events have been defined in the AMD IOMMU spec: 0x09 - "invalid PPR request" Add support for logging this type of event. Signed-off-by: Gary R Hook <gary.hook@amd.com> ~ ~ ~ Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-03 16:33:18 +02:00
Gary R Hook	d64c0486ed	iommu/amd: Update the PASID information printed to the system log Provide detailed data for each event, as appropriate. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-03 16:33:17 +02:00
Joerg Roedel	a85894cd77	iommu/vt-d: Use WARN_ON_ONCE instead of BUG_ON in qi_flush_dev_iotlb() A misaligned address is only worth a warning, and not stopping the while execution path with a BUG_ON(). Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-03 16:32:11 +02:00
Changbin Du	0dfc0c792d	iommu/vt-d: fix shift-out-of-bounds in bug checking It allows to flush more than 4GB of device TLBs. So the mask should be 64bit wide. UBSAN captured this fault as below. [ 3.760024] ================================================================================ [ 3.768440] UBSAN: Undefined behaviour in drivers/iommu/dmar.c:1348:3 [ 3.774864] shift exponent 64 is too large for 32-bit type 'int' [ 3.780853] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G U 4.17.0-rc1+ #89 [ 3.788661] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016 [ 3.796034] Call Trace: [ 3.798472] <IRQ> [ 3.800479] dump_stack+0x90/0xfb [ 3.803787] ubsan_epilogue+0x9/0x40 [ 3.807353] __ubsan_handle_shift_out_of_bounds+0x10e/0x170 [ 3.812916] ? qi_flush_dev_iotlb+0x124/0x180 [ 3.817261] qi_flush_dev_iotlb+0x124/0x180 [ 3.821437] iommu_flush_dev_iotlb+0x94/0xf0 [ 3.825698] iommu_flush_iova+0x10b/0x1c0 [ 3.829699] ? fq_ring_free+0x1d0/0x1d0 [ 3.833527] iova_domain_flush+0x25/0x40 [ 3.837448] fq_flush_timeout+0x55/0x160 [ 3.841368] ? fq_ring_free+0x1d0/0x1d0 [ 3.845200] ? fq_ring_free+0x1d0/0x1d0 [ 3.849034] call_timer_fn+0xbe/0x310 [ 3.852696] ? fq_ring_free+0x1d0/0x1d0 [ 3.856530] run_timer_softirq+0x223/0x6e0 [ 3.860625] ? sched_clock+0x5/0x10 [ 3.864108] ? sched_clock+0x5/0x10 [ 3.867594] __do_softirq+0x1b5/0x6f5 [ 3.871250] irq_exit+0xd4/0x130 [ 3.874470] smp_apic_timer_interrupt+0xb8/0x2f0 [ 3.879075] apic_timer_interrupt+0xf/0x20 [ 3.883159] </IRQ> [ 3.885255] RIP: 0010:poll_idle+0x60/0xe7 [ 3.889252] RSP: 0018:ffffb1b201943e30 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 [ 3.896802] RAX: 0000000080200000 RBX: 000000000000008e RCX: 000000000000001f [ 3.903918] RDX: 0000000000000000 RSI: 000000002819aa06 RDI: 0000000000000000 [ 3.911031] RBP: ffff9e93c6b33280 R08: 00000010f717d567 R09: 000000000010d205 [ 3.918146] R10: ffffb1b201943df8 R11: 0000000000000001 R12: 00000000e01b169d [ 3.925260] R13: 0000000000000000 R14: ffffffffb12aa400 R15: 0000000000000000 [ 3.932382] cpuidle_enter_state+0xb4/0x470 [ 3.936558] do_idle+0x222/0x310 [ 3.939779] cpu_startup_entry+0x78/0x90 [ 3.943693] start_secondary+0x205/0x2e0 [ 3.947607] secondary_startup_64+0xa5/0xb0 [ 3.951783] ================================================================================ Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-03 16:32:11 +02:00
Shameer Kolothum	cd2c9fcf5c	iommu/dma: Move PCI window region reservation back into dma specific path. This pretty much reverts commit `273df96353` ("iommu/dma: Make PCI window reservation generic") by moving the PCI window region reservation back into the dma specific path so that these regions doesn't get exposed via the IOMMU API interface. With this change, the vfio interface will report only iommu specific reserved regions to the user space. Cc: Joerg Roedel <joro@8bytes.org> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Fixes: `273df96353` ('iommu/dma: Make PCI window reservation generic') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-03 16:32:10 +02:00
Heiko Stuebner	2f8c7f2e76	iommu/rockchip: Make clock handling optional iommu clocks are optional, so the driver should not fail if they are not present. Instead just set the number of clocks to 0, which the clk-blk APIs can handle just fine. Fixes: `f2e3a5f557` ("iommu/rockchip: Control clocks needed to access the IOMMU") Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-03 16:32:10 +02:00
Arnd Bergmann	94c793acca	iommu/amd: Hide unused iommu_table_lock The newly introduced lock is only used when CONFIG_IRQ_REMAP is enabled: drivers/iommu/amd_iommu.c:86:24: error: 'iommu_table_lock' defined but not used [-Werror=unused-variable] static DEFINE_SPINLOCK(iommu_table_lock); This moves the definition next to the user, within the #ifdef protected section of the file. Fixes: `ea6166f4b8` ("iommu/amd: Split irq_lookup_table out of the amd_iommu_devtable_lock") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-03 16:31:57 +02:00
Jagannathan Raman	aa7528fe35	iommu/vt-d: Fix usage of force parameter in intel_ir_reconfigure_irte() It was noticed that the IRTE configured for guest OS kernel was over-written while the guest was running. As a result, vt-d Posted Interrupts configured for the guest are not being delivered directly, and instead bounces off the host. Every interrupt delivery takes a VM Exit. It was noticed that the following stack is doing the over-write: [ 147.463177] modify_irte+0x171/0x1f0 [ 147.463405] intel_ir_set_affinity+0x5c/0x80 [ 147.463641] msi_domain_set_affinity+0x32/0x90 [ 147.463881] irq_do_set_affinity+0x37/0xd0 [ 147.464125] irq_set_affinity_locked+0x9d/0xb0 [ 147.464374] __irq_set_affinity+0x42/0x70 [ 147.464627] write_irq_affinity.isra.5+0xe1/0x110 [ 147.464895] proc_reg_write+0x38/0x70 [ 147.465150] __vfs_write+0x36/0x180 [ 147.465408] ? handle_mm_fault+0xdf/0x200 [ 147.465671] ? _cond_resched+0x15/0x30 [ 147.465936] vfs_write+0xad/0x1a0 [ 147.466204] SyS_write+0x52/0xc0 [ 147.466472] do_syscall_64+0x74/0x1a0 [ 147.466744] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 reversing the sense of force check in intel_ir_reconfigure_irte() restores proper posted interrupt functionality Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Fixes: `d491bdff88` ('iommu/vt-d: Reevaluate vector configuration on activate()') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-03 16:31:39 +02:00
Dmitry Osipenko	130a2fdf0b	iommu/tegra: gart: Fix gart_iommu_unmap() It must return the number of unmapped bytes on success, returning 0 means that unmapping failed and in result only one page is unmapped. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Thierry Reding <treding@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-03 16:30:28 +02:00
Dmitry Osipenko	40c9b882fa	iommu/tegra: gart: Add debugging facility Page mapping could overwritten by an accident (a bug). We can catch this case by checking 'VALID' bit of GART's page entry prior to mapping of a page. Since that check introduces a small performance impact, it should be enabled explicitly using new GART's kernel module 'debug' parameter. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-03 16:30:21 +02:00
YueHaibing	f793b13ef0	iommu/io-pgtable-arm: Use for_each_set_bit to simplify code We can use for_each_set_bit() to simplify code slightly in the ARM io-pgtable self tests while unmapping. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-03 15:31:07 +02:00
Wolfram Sang	7d1bf14f86	iommu/qcom: Simplify getting .drvdata We should get drvdata from struct device directly. Going via platform_device is an unneeded step back and forth. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-03 15:18:26 +02:00
Geert Uytterhoeven	48e6f7652d	iommu: Remove depends on HAS_DMA in case of platform dependency Remove dependencies on HAS_DMA where a Kconfig symbol depends on another symbol that implies HAS_DMA, and, optionally, on "\|\| COMPILE_TEST". In most cases this other symbol is an architecture or platform specific symbol, or PCI. Generic symbols and drivers without platform dependencies keep their dependencies on HAS_DMA, to prevent compiling subsystems or drivers that cannot work anyway. This simplifies the dependencies, and allows to improve compile-testing. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-03 15:02:06 +02:00
Dmitry Safonov	6c50d79f66	iommu/vt-d: Ratelimit each dmar fault printing There is a ratelimit for printing, but it's incremented each time the cpu recives dmar fault interrupt. While one interrupt may signal about many faults. So, measuring the impact it turns out that reading/clearing one fault takes < 1 usec, and printing info about the fault takes ~170 msec. Having in mind that maximum number of fault recording registers per remapping hardware unit is 256.. IRQ handler may run for (170*256) msec. And as fault-serving loop runs without a time limit, during servicing new faults may occur.. Ratelimit each fault printing rather than each irq printing. Fixes: commit `c43fce4eeb` ("iommu/vt-d: Ratelimit fault handler") BUG: spinlock lockup suspected on CPU#0, CliShell/9903 lock: 0xffffffff81a47440, .magic: dead4ead, .owner: kworker/u16:2/8915, .owner_cpu: 6 CPU: 0 PID: 9903 Comm: CliShell Call Trace:$\n' [..] dump_stack+0x65/0x83$\n' [..] spin_dump+0x8f/0x94$\n' [..] do_raw_spin_lock+0x123/0x170$\n' [..] _raw_spin_lock_irqsave+0x32/0x3a$\n' [..] uart_chars_in_buffer+0x20/0x4d$\n' [..] tty_chars_in_buffer+0x18/0x1d$\n' [..] n_tty_poll+0x1cb/0x1f2$\n' [..] tty_poll+0x5e/0x76$\n' [..] do_select+0x363/0x629$\n' [..] compat_core_sys_select+0x19e/0x239$\n' [..] compat_SyS_select+0x98/0xc0$\n' [..] sysenter_dispatch+0x7/0x25$\n' [..] NMI backtrace for cpu 6 CPU: 6 PID: 8915 Comm: kworker/u16:2 Workqueue: dmar_fault dmar_fault_work Call Trace:$\n' [..] wait_for_xmitr+0x26/0x8f$\n' [..] serial8250_console_putchar+0x1c/0x2c$\n' [..] uart_console_write+0x40/0x4b$\n' [..] serial8250_console_write+0xe6/0x13f$\n' [..] call_console_drivers.constprop.13+0xce/0x103$\n' [..] console_unlock+0x1f8/0x39b$\n' [..] vprintk_emit+0x39e/0x3e6$\n' [..] printk+0x4d/0x4f$\n' [..] dmar_fault+0x1a8/0x1fc$\n' [..] dmar_fault_work+0x15/0x17$\n' [..] process_one_work+0x1e8/0x3a9$\n' [..] worker_thread+0x25d/0x345$\n' [..] kthread+0xea/0xf2$\n' [..] ret_from_fork+0x58/0x90$\n' Cc: Alex Williamson <alex.williamson@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: iommu@lists.linux-foundation.org Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-05-03 14:38:58 +02:00
Linus Torvalds	e5c372280b	IOMMU Updates for Linux v4.17 These updates come with: - OF_IOMMU support for the Rockchip iommu driver so that it can use generic DT bindings - Rework of locking in the AMD IOMMU interrupt remapping code to make it work better in RT kernels - Support for improved iotlb flushing in the AMD IOMMU driver - Support for 52-bit physical and virtual addressing in the ARM-SMMU - Various other small fixes and cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJazi2BAAoJECvwRC2XARrjYVwP/AyXK7CjRvaiHFopIUO0WpwY V3GiKrODtSNHqPSKuFnqIssIhxZPw/SKFz6E/pe08pZ/pxHYTxeTL78Wz7D1+4Sp n0YokSM5qLb660OTQVnyKNCku8cEMCb9hkQ/75SFgwcILQYF93cZBDIdBn93OKVO 6xAOE+tqd8Daulnk0YpdiCTFTJPzYHPl6B7scoUav26uaKxWeMJxeYe+EXC+4WQG U1u/jDiVXyllzGgRqqfrmO4L2acmsK8HL97hD4+m1URJKDlb8ho6xwaRThFZWqXS SbrYnvH0ruWGrLiQKmVUssw8FqbcXCzq3236g2O8jE4jqWSm70twg+q31iMjwD7v bwsJGMkk7aLrquv9Zpaylpf8tRECk5bjhTFC2zB0pdum5XLx47j0IHKWMLPYhkCz E0pBefvuhoSTbt/5X0urSRzH2Hk4ljEsM+QjlfH8SN3ALTljFjay607wbxC7t35M LEL5AuNsDDBddoJIi9D13CdJEZa4lps8dbpB8m40lQVvmiLPLcKraaG0RfKQ397T wsxhsDOQYM2FCwfUP3n8RTsMKRIp/UVkKY+2G7AsKofciSeulK6nDbrV7jFnitx4 vTxbRgpNejJpqzKZG/W9lCGWk1BhmQK/Cbu6JW5IA4+ew9omWkFp61U6rtc645Te 6cNEYBiMz/RZIiC2b18J =kte5 -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: - OF_IOMMU support for the Rockchip iommu driver so that it can use generic DT bindings - rework of locking in the AMD IOMMU interrupt remapping code to make it work better in RT kernels - support for improved iotlb flushing in the AMD IOMMU driver - support for 52-bit physical and virtual addressing in the ARM-SMMU - various other small fixes and cleanups * tag 'iommu-updates-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (53 commits) iommu/io-pgtable-arm: Avoid warning with 32-bit phys_addr_t iommu/rockchip: Support sharing IOMMU between masters iommu/rockchip: Add runtime PM support iommu/rockchip: Fix error handling in init iommu/rockchip: Use OF_IOMMU to attach devices automatically iommu/rockchip: Use IOMMU device for dma mapping operations dt-bindings: iommu/rockchip: Add clock property iommu/rockchip: Control clocks needed to access the IOMMU iommu/rockchip: Fix TLB flush of secondary IOMMUs iommu/rockchip: Use iopoll helpers to wait for hardware iommu/rockchip: Fix error handling in attach iommu/rockchip: Request irqs in rk_iommu_probe() iommu/rockchip: Fix error handling in probe iommu/rockchip: Prohibit unbind and remove iommu/amd: Return proper error code in irq_remapping_alloc() iommu/amd: Make amd_iommu_devtable_lock a spin_lock iommu/amd: Drop the lock while allocating new irq remap table iommu/amd: Factor out setting the remap table for a devid iommu/amd: Use `table' instead `irt' as variable name in amd_iommu_update_ga() iommu/amd: Remove the special case from alloc_irq_table() ...	2018-04-11 18:50:41 -07:00
Randy Dunlap	514c603249	headers: untangle kmemleak.h from mm.h Currently <linux/slab.h> #includes <linux/kmemleak.h> for no obvious reason. It looks like it's only a convenience, so remove kmemleak.h from slab.h and add <linux/kmemleak.h> to any users of kmemleak_* that don't already #include it. Also remove <linux/kmemleak.h> from source files that do not use it. This is tested on i386 allmodconfig and x86_64 allmodconfig. It would be good to run it through the 0day bot for other $ARCHes. I have neither the horsepower nor the storage space for the other $ARCHes. Update: This patch has been extensively build-tested by both the 0day bot & kisskb/ozlabs build farms. Both of them reported 2 build failures for which patches are included here (in v2). [ slab.h is the second most used header file after module.h; kernel.h is right there with slab.h. There could be some minor error in the counting due to some #includes having comments after them and I didn't combine all of those. ] [akpm@linux-foundation.org: security/keys/big_key.c needs vmalloc.h, per sfr] Link: http://lkml.kernel.org/r/e4309f98-3749-93e1-4bb7-d9501a39d015@infradead.org Link: http://kisskb.ellerman.id.au/kisskb/head/13396/ Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Ingo Molnar <mingo@kernel.org> Reported-by: Michael Ellerman <mpe@ellerman.id.au> [2 build failures] Reported-by: Fengguang Wu <fengguang.wu@intel.com> [2 build failures] Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Wei Yongjun <weiyongjun1@huawei.com> Cc: Luis R. Rodriguez <mcgrof@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: John Johansen <john.johansen@canonical.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-04-05 21:36:27 -07:00
Linus Torvalds	2fcd2b306a	Merge branch 'x86-dma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 dma mapping updates from Ingo Molnar: "This tree, by Christoph Hellwig, switches over the x86 architecture to the generic dma-direct and swiotlb code, and also unifies more of the dma-direct code between architectures. The now unused x86-only primitives are removed" * 'x86-dma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: dma-mapping: Don't clear GFP_ZERO in dma_alloc_attrs swiotlb: Make swiotlb_{alloc,free}_buffer depend on CONFIG_DMA_DIRECT_OPS dma/swiotlb: Remove swiotlb_{alloc,free}_coherent() dma/direct: Handle force decryption for DMA coherent buffers in common code dma/direct: Handle the memory encryption bit in common code dma/swiotlb: Remove swiotlb_set_mem_attributes() set_memory.h: Provide set_memory_{en,de}crypted() stubs x86/dma: Remove dma_alloc_coherent_gfp_flags() iommu/intel-iommu: Enable CONFIG_DMA_DIRECT_OPS=y and clean up intel_{alloc,free}_coherent() iommu/amd_iommu: Use CONFIG_DMA_DIRECT_OPS=y and dma_direct_{alloc,free}() x86/dma/amd_gart: Use dma_direct_{alloc,free}() x86/dma/amd_gart: Look at dev->coherent_dma_mask instead of GFP_DMA x86/dma: Use generic swiotlb_ops x86/dma: Use DMA-direct (CONFIG_DMA_DIRECT_OPS=y) x86/dma: Remove dma_alloc_coherent_mask()	2018-04-02 17:18:45 -07:00
Linus Torvalds	2451d1e59d	Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic updates from Ingo Molnar: "The main x86 APIC/IOAPIC changes in this cycle were: - Robustify kexec support to more carefully restore IRQ hardware state before calling into kexec/kdump kernels. (Baoquan He) - Clean up the local APIC code a bit (Dou Liyang) - Remove unused callbacks (David Rientjes)" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Finish removing unused callbacks x86/apic: Drop logical_smp_processor_id() inline x86/apic: Modernize the pending interrupt code x86/apic: Move pending interrupt check code into it's own function x86/apic: Set up through-local-APIC mode on the boot CPU if 'noapic' specified x86/apic: Rename variables and functions related to x86_io_apic_ops x86/apic: Remove the (now) unused disable_IO_APIC() function x86/apic: Fix restoring boot IRQ mode in reboot and kexec/kdump x86/apic: Split disable_IO_APIC() into two functions to fix CONFIG_KEXEC_JUMP=y x86/apic: Split out restore_boot_irq_mode() from disable_IO_APIC() x86/apic: Make setup_local_APIC() static x86/apic: Simplify init_bsp_APIC() usage x86/x2apic: Mark set_x2apic_phys_mode() as __init	2018-04-02 13:38:43 -07:00
Joerg Roedel	d4f96fd5c2	Merge branches 'x86/amd', 'x86/vt-d', 'arm/rockchip', 'arm/omap', 'arm/mediatek', 'arm/exynos', 'arm/renesas', 'arm/smmu' and 'core' into next	2018-03-29 15:24:40 +02:00
Robin Murphy	7868805969	iommu/io-pgtable-arm: Avoid warning with 32-bit phys_addr_t It's not entirely unreasonable for io-pgtable-arm to be built for configurations with 32-bit phys_addr_t, where the compiler rightly raises a warning about the 36-bit shift. That particular code path should never actually run on those systems, but we still want it to compile cleanly, which is easily done by using an unambiguous u64 as the intermediate type instead. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 15:19:10 +02:00
Jeffy Chen	57c26957bd	iommu/rockchip: Support sharing IOMMU between masters There would be some masters sharing the same IOMMU device. Put them in the same iommu group and share the same iommu domain. Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 12:22:29 +02:00
Jeffy Chen	0f181d3cf7	iommu/rockchip: Add runtime PM support When the power domain is powered off, the IOMMU cannot be accessed and register programming must be deferred until the power domain becomes enabled. Add runtime PM support, and use runtime PM device link from IOMMU to master to enable and disable IOMMU. Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 12:22:28 +02:00
Jeffy Chen	4d88a8a4c3	iommu/rockchip: Fix error handling in init It's hard to undo bus_set_iommu() in the error path, so move it to the end of rk_iommu_probe(). Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 12:22:28 +02:00
Jeffy Chen	5fd577c3ea	iommu/rockchip: Use OF_IOMMU to attach devices automatically Converts the rockchip-iommu driver to use the OF_IOMMU infrastructure, which allows attaching master devices to their IOMMUs automatically according to DT properties. Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 12:22:28 +02:00
Jeffy Chen	9176a303d9	iommu/rockchip: Use IOMMU device for dma mapping operations Use the first registered IOMMU device for dma mapping operations, and drop the domain platform device. This is similar to exynos iommu driver. Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 12:22:27 +02:00
Tomasz Figa	f2e3a5f557	iommu/rockchip: Control clocks needed to access the IOMMU Current code relies on master driver enabling necessary clocks before IOMMU is accessed, however there are cases when the IOMMU should be accessed while the master is not running yet, for example allocating V4L2 videobuf2 buffers, which is done by the VB2 framework using DMA mapping API and doesn't engage the master driver at all. This patch fixes the problem by letting clocks needed for IOMMU operation to be listed in Device Tree and making the driver enable them for the time of accessing the hardware. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 12:22:26 +02:00
Tomasz Figa	bf2a5e717a	iommu/rockchip: Fix TLB flush of secondary IOMMUs Due to the bug in current code, only first IOMMU has the TLB lines flushed in rk_iommu_zap_lines. This patch fixes the inner loop to execute for all IOMMUs and properly flush the TLB. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 12:22:26 +02:00
Tomasz Figa	0416bf6479	iommu/rockchip: Use iopoll helpers to wait for hardware This patch converts the rockchip-iommu driver to use the in-kernel iopoll helpers to wait for certain status bits to change in registers instead of an open-coded custom macro. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 12:22:26 +02:00
Tomasz Figa	f6717d727c	iommu/rockchip: Fix error handling in attach Currently if the driver encounters an error while attaching device, it will leave the IOMMU in an inconsistent state. Even though it shouldn't really happen in reality, let's just add proper error path to keep things consistent. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 12:22:25 +02:00
Jeffy Chen	d0b912bd4c	iommu/rockchip: Request irqs in rk_iommu_probe() Move request_irq to the end of rk_iommu_probe(). Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 12:22:22 +02:00
Jeffy Chen	6d9ffaad7e	iommu/rockchip: Fix error handling in probe Add missing iommu_device_sysfs_remove in error path. Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 12:22:22 +02:00
Jeffy Chen	98b72b94de	iommu/rockchip: Prohibit unbind and remove Removal of IOMMUs cannot be done reliably. This is similar to exynos iommu driver. Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 12:22:21 +02:00
Sebastian Andrzej Siewior	29d049be94	iommu/amd: Return proper error code in irq_remapping_alloc() In the unlikely case when alloc_irq_table() is not able to return a remap table then "ret" will be assigned with an error code. Later, the code checks `index' and if it is negative (which it is because it is initialized with `-1') and then then function properly aborts but returns `-1' instead `-ENOMEM' what was intended. In order to correct this, I assign -ENOMEM to index. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 10:38:17 +02:00
Sebastian Andrzej Siewior	2cd1083d79	iommu/amd: Make amd_iommu_devtable_lock a spin_lock Before commit `0bb6e243d7` ("iommu/amd: Support IOMMU_DOMAIN_DMA type allocation") amd_iommu_devtable_lock had a read_lock() user but now there are none. In fact, after the mentioned commit we had only write_lock() user of the lock. Since there is no reason to keep it as writer lock, change its type to a spin_lock. I think that we might even be able to remove the lock because all its current user seem to have their own protection. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 10:38:17 +02:00
Sebastian Andrzej Siewior	993ca6e063	iommu/amd: Drop the lock while allocating new irq remap table The irq_remap_table is allocated while the iommu_table_lock is held with interrupts disabled. >From looking at the call sites, all callers are in the early device initialisation (apic_bsp_setup(), pci_enable_device(), pci_enable_msi()) so make sense to drop the lock which also enables interrupts and try to allocate that memory with GFP_KERNEL instead GFP_ATOMIC. Since during the allocation the iommu_table_lock is dropped, we need to recheck if table exists after the lock has been reacquired. I think that it is impossible that the "devid" entry appears in irq_lookup_table while the lock is dropped since the same device can only be probed once. However I check for both cases, just to be sure. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 10:38:16 +02:00
Sebastian Andrzej Siewior	2fcc1e8ac4	iommu/amd: Factor out setting the remap table for a devid Setting the IRQ remap table for a specific devid (or its alias devid) includes three steps. Those three steps are always repeated each time this is done. Introduce a new helper function, move those steps there and use that function instead. The compiler can still decide if it is worth to inline. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 10:38:16 +02:00
Sebastian Andrzej Siewior	4fde541c9d	iommu/amd: Use `table' instead `irt' as variable name in amd_iommu_update_ga() The variable of type struct irq_remap_table is always named `table' except in amd_iommu_update_ga() where it is called `irt'. Make it consistent and name it also `table'. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 10:38:15 +02:00
Sebastian Andrzej Siewior	fde65dd3d3	iommu/amd: Remove the special case from alloc_irq_table() alloc_irq_table() has a special ioapic argument. If set then it will pre-allocate / reserve the first 32 indexes. The argument is only once true and it would make alloc_irq_table() a little simpler if we would extract the special bits to the caller. The caller of irq_remapping_alloc() is holding irq_domain_mutex so the initialization of iommu->irte_ops->set_allocated() should not race against other user. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 10:38:15 +02:00
Sebastian Andrzej Siewior	ea6166f4b8	iommu/amd: Split irq_lookup_table out of the amd_iommu_devtable_lock The function get_irq_table() reads/writes irq_lookup_table while holding the amd_iommu_devtable_lock. It also modifies amd_iommu_dev_table[].data[2]. set_dte_entry() is using amd_iommu_dev_table[].data[0\|1] (under the domain->lock) so it should be okay. The access to the iommu is serialized with its own (iommu's) lock. So split out get_irq_table() out of amd_iommu_devtable_lock's lock. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 10:38:15 +02:00
Sebastian Andrzej Siewior	2bc0018089	iommu/amd: Split domain id out of amd_iommu_devtable_lock domain_id_alloc() and domain_id_free() is used for id management. Those two function share a bitmap (amd_iommu_pd_alloc_bitmap) and set/clear bits based on id allocation. There is no need to share this with amd_iommu_devtable_lock, it can use its own lock for this operation. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 10:38:15 +02:00
Sebastian Andrzej Siewior	779da73273	iommu/amd: Turn dev_data_list into a lock less list alloc_dev_data() adds new items to dev_data_list and search_dev_data() is searching for items in this list. Both protect the access to the list with a spinlock. There is no need to navigate forth and back within the list and there is also no deleting of a specific item. This qualifies the list to become a lock less list and as part of this, the spinlock can be removed. With this change the ordering of those items within the list is changed: before the change new items were added to the end of the list, now they are added to the front. I don't think it matters but wanted to mention it. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 10:38:14 +02:00
Sebastian Andrzej Siewior	39ffe39545	iommu/amd: Take into account that alloc_dev_data() may return NULL find_dev_data() does not check whether the return value alloc_dev_data() is NULL. This was okay once because the pointer was returned once as-is. Since commit `df3f7a6e8e` ("iommu/amd: Use is_attach_deferred call-back") the pointer may be used within find_dev_data() so a NULL check is required. Cc: Baoquan He <bhe@redhat.com> Fixes: `df3f7a6e8e` ("iommu/amd: Use is_attach_deferred call-back") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 10:38:14 +02:00
Shaokun Zhang	f746a025d3	iommu/vt-d:Remove unused variable Unused after commit <42e8c186b595> ("iommu/vt-d: Simplify io/tlb flushing in intel_iommu_unmap"), cleanup it. Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-29 10:31:25 +02:00
Robin Murphy	dcd189e6d2	iommu/arm-smmu-v3: Support 52-bit virtual address Stage 1 input addresses are effectively 64-bit in SMMUv3 anyway, so really all that's involved is letting io-pgtable know the appropriate upper bound for T0SZ. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-03-27 14:12:06 +01:00
Robin Murphy	6619c91385	iommu/arm-smmu-v3: Support 52-bit physical address Implement SMMUv3.1 support for 52-bit physical addresses. Since a 52-bit OAS implies 64KB translation granule support, permitting level 1 block entries there is simple, and the rest is just extending address fields. Tested-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-03-27 14:12:06 +01:00
Robin Murphy	6c89928ff7	iommu/io-pgtable-arm: Support 52-bit physical address Bring io-pgtable-arm in line with the ARMv8.2-LPA feature allowing 52-bit physical addresses when using the 64KB translation granule. This will be supported by SMMUv3.1. Tested-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-03-27 14:12:05 +01:00
Robin Murphy	7417b99c49	iommu/arm-smmu-v3: Clean up queue definitions As with registers and tables, use GENMASK and the bitfield accessors consistently for queue fields, to save some lines and ease maintenance a little. This now leaves everything in a nice state where all named field definitions expect to be used with bitfield accessors (although since single-bit fields can still be used directly we leave some of those uses as-is to avoid unnecessary churn), while the few remaining *_MASK definitions apply exclusively to in-place values. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-03-27 14:12:05 +01:00
Robin Murphy	ba08bdcbf7	iommu/arm-smmu-v3: Clean up table definitions As with registers, use GENMASK and the bitfield accessors consistently for table fields, to save some lines and ease maintenance a little. This also catches a subtle off-by-one wherein bit 5 of CD.T0SZ was missing. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-03-27 14:12:05 +01:00
Robin Murphy	cbcee19ac4	iommu/arm-smmu-v3: Clean up register definitions The FIELD_{GET,PREP} accessors provided by linux/bitfield.h allow us to define multi-bit register fields solely in terms of their bit positions via GENMASK(), without needing explicit _SHIFT and _MASK definitions. As well as the immediate reduction in lines of code, this avoids the awkwardness of values sometimes being pre-shifted and sometimes not, which means we can factor out some common values like memory attributes. Furthermore, it also makes it trivial to verify the definitions against the architecture spec, on which note let's also fix up a few field names to properly match the current release (IHI0070B). Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-03-27 14:12:04 +01:00
Robin Murphy	1cf9e54e91	iommu/arm-smmu-v3: Clean up address masking Before trying to add the SMMUv3.1 support for 52-bit addresses, make things bearable by cleaning up the various address mask definitions to use GENMASK_ULL() consistently. The fact that doing so reveals (and fixes) a latent off-by-one in Q_BASE_ADDR_MASK only goes to show what a jolly good idea it is... Tested-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-03-27 14:12:04 +01:00
Nate Watterson	940ded9c21	iommu/arm-smmu-v3: limit reporting of MSI allocation failures Currently, the arm-smmu-v3 driver expects to allocate MSIs for all SMMUs with FEAT_MSI set. This results in unwarranted "failed to allocate MSIs" warnings being printed on systems where FW was either deliberately configured to force the use of SMMU wired interrupts -or- is altogether incapable of describing SMMU MSI topology (ACPI IORT prior to rev.C). Remedy this by checking msi_domain before attempting to allocate SMMU MSIs. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-03-27 14:12:03 +01:00
Robin Murphy	4c8996d7d7	iommu/arm-smmu-v3: Warn about missing IRQs It is annoyingly non-obvious when DMA transactions silently go missing due to undetected SMMU faults. Help skip the first few debugging steps in those situations by making it clear when we have neither wired IRQs nor MSIs with which to raise error conditions. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-03-27 14:12:03 +01:00
Yong Wu	70ca608b2e	iommu/mediatek: Fix protect memory setting In MediaTek's IOMMU design, When a iommu translation fault occurs (HW can NOT translate the destination address to a valid physical address), the IOMMU HW output the dirty data into a special memory to avoid corrupting the main memory, this is called "protect memory". the register(0x114) for protect memory is a little different between mt8173 and mt2712. In the mt8173, bit[30:6] in the register represents [31:7] of the physical address. In the 4GB mode, the register bit[31] should be 1. While in the mt2712, the bits don't shift. bit[31:7] in the register represents [31:7] in the physical address, and bit[1:0] in the register represents bit[33:32] of the physical address if it has. Fixes: `e6dec92308` ("iommu/mediatek: Add mt2712 IOMMU support") Reported-by: Honghui Zhang <honghui.zhang@mediatek.com> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-21 06:13:57 -05:00
Lu Baolu	971401015d	iommu/vt-d: Use real PASID for flush in caching mode If caching mode is supported, the hardware will cache none-present or erroneous translation entries. Hence, software should explicitly invalidate the PASID cache after a PASID table entry becomes present. We should issue such invalidation with the PASID value that we have changed. PASID 0 is not reserved for this case. Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Sankaran Rajesh <rajesh.sankaran@intel.com> Suggested-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-20 13:43:27 -05:00
Christoph Hellwig	d657c5c73c	iommu/intel-iommu: Enable CONFIG_DMA_DIRECT_OPS=y and clean up intel_{alloc,free}_coherent() Use the dma_direct_*() helpers and clean up the code flow. Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-9-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-03-20 10:01:58 +01:00
Christoph Hellwig	b468620f2a	iommu/amd_iommu: Use CONFIG_DMA_DIRECT_OPS=y and dma_direct_{alloc,free}() This cleans up the code a lot by removing duplicate logic. Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-8-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-03-20 10:01:58 +01:00
Christoph Hellwig	fec777c385	x86/dma: Use DMA-direct (CONFIG_DMA_DIRECT_OPS=y) The generic DMA-direct (CONFIG_DMA_DIRECT_OPS=y) implementation is now functionally equivalent to the x86 nommu dma_map implementation, so switch over to using it. That includes switching from using x86_dma_supported in various IOMMU drivers to use dma_direct_supported instead, which provides the same functionality. Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-4-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-03-20 10:01:56 +01:00
Jeffy Chen	b6d57f1da7	iommu/omap: Increase group ref in .device_group() Increase group refcounting in omap_iommu_device_group(). Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-15 15:54:52 +01:00
Gary R Hook	90ca3859f5	iommu/amd: Use dev_err to send events to the system log Remove printk and use a more preferable error logging function. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-15 15:20:55 +01:00
Lu Baolu	bbe4b3af9d	iommu/vt-d: Fix a potential memory leak A memory block was allocated in intel_svm_bind_mm() but never freed in a failure path. This patch fixes this by free it to avoid memory leakage. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Fixes: `2f26e0a9c9` ('iommu/vt-d: Add basic SVM PASID support') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-15 15:15:22 +01:00
Marc Zyngier	1a4e90f25b	iommu/rockchip: Perform a reset on shutdown Trying to do a kexec whilst the iommus are still on is proving to be a challenging exercise. It is terribly unsafe, as we're reusing the memory allocated for the page tables, leading to a likely crash. Let's implement a shutdown method that will at least try to stop DMA from going crazy behind our back. Note that we need to be extra cautious when doing so, as the IOMMU may not be clocked if controlled by a another master, as typical on Rockchip system. Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-15 15:11:56 +01:00
Suravee Suthikulpanit	eb5ecd1a40	iommu/amd: Add support for fast IOTLB flushing Since AMD IOMMU driver currently flushes all TLB entries when page size is more than one, use the same interface for both iommu_ops.flush_iotlb_all() and iommu_ops.iotlb_sync(). Cc: Joerg Roedel <joro@8bytes.org> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-03-15 13:37:34 +01:00
Ingo Molnar	ed7158bae4	treewide/trivial: Remove ';;$' typo noise On lkml suggestions were made to split up such trivial typo fixes into per subsystem patches: --- a/arch/x86/boot/compressed/eboot.c +++ b/arch/x86/boot/compressed/eboot.c @@ -439,7 +439,7 @@ setup_uga32(void *uga_handle, unsigned long size, u32 width, u32 height) struct efi_uga_draw_protocol uga = NULL, first_uga; efi_guid_t uga_proto = EFI_UGA_PROTOCOL_GUID; unsigned long nr_ugas; - u32 handles = (u32 )uga_handle;; + u32 handles = (u32 *)uga_handle; efi_status_t status = EFI_INVALID_PARAMETER; int i; This patch is the result of the following script: $ sed -i 's/;;$/;/g' $(git grep -E ';;$' \| grep "\.[ch]:" \| grep -vwE 'for\|ia64' \| cut -d: -f1 \| sort \| uniq) ... followed by manual review to make sure it's all good. Splitting this up is just crazy talk, let's get over with this and just do it. Reported-by: Pavel Machek <pavel@ucw.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-02-22 10:59:33 +01:00
Baoquan He	51b146c572	x86/apic: Rename variables and functions related to x86_io_apic_ops The names of x86_io_apic_ops and its two member variables are misleading: The ->read() member is to read IO_APIC reg, while ->disable() which is called by native_disable_io_apic()/irq_remapping_disable_io_apic() is actually used to restore boot IRQ mode, not to disable the IO-APIC. So rename x86_io_apic_ops to 'x86_apic_ops' since it doesn't only handle the IO-APIC, but also the local APIC. Also rename its member variables and the related callbacks. Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: douly.fnst@cn.fujitsu.com Cc: joro@8bytes.org Cc: prarit@redhat.com Cc: uobergfe@redhat.com Link: http://lkml.kernel.org/r/20180214054656.3780-6-bhe@redhat.com [ Rewrote the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-02-17 11:47:45 +01:00
Scott Wood	df42a04b15	iommu/amd: Avoid locking get_irq_table() from atomic context get_irq_table() previously acquired amd_iommu_devtable_lock which is not a raw lock, and thus cannot be acquired from atomic context on PREEMPT_RT. Many calls to modify_irte*() come from atomic context due to the IRQ desc->lock, as does amd_iommu_update_ga() due to the preemption disabling in vcpu_load/put(). The only difference between calling get_irq_table() and reading from irq_lookup_table[] directly, other than the lock acquisition and amd_iommu_rlookup_table[] check, is if the table entry is unpopulated, which should never happen when looking up a devid that came from an irq_2_irte struct, as get_irq_table() would have already been called on that devid during irq_remapping_alloc(). The lock acquisition is not needed in these cases because entries in irq_lookup_table[] never change once non-NULL -- nor would the amd_iommu_devtable_lock usage in get_irq_table() provide meaningful protection if they did, since it's released before using the looked up table in the get_irq_table() caller. Rename the old get_irq_table() to alloc_irq_table(), and create a new lockless get_irq_table() to be used in non-allocating contexts that WARNs if it doesn't find what it's looking for. Signed-off-by: Scott Wood <swood@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-02-15 12:03:26 +01:00
Shameer Kolothum	f51dc89265	iommu/dma: Add HW MSI(GICv3 ITS) address regions reservation Modified iommu_dma_get_resv_regions() to include GICv3 ITS region on ACPI based ARM platfiorms which may require HW MSI reservations. Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-02-14 15:15:41 +01:00
Vivek Gautam	193e67c00e	iommu/io-pgtable: Use size_t return type for all foo_unmap Unmap returns a size_t all throughout the IOMMU framework. Make io-pgtable match this convention. Moreover, there isn't a need to have a signed int return type as we return 0 in case of failures. Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-02-13 19:31:32 +01:00
Suravee Suthikulpanit	c5611a8751	iommu: Do not return error code for APIs with size_t return type Currently, iommu_unmap, iommu_unmap_fast and iommu_map_sg return size_t. However, some of the return values are error codes (< 0), which can be misinterpreted as large size. Therefore, returning size 0 instead to signify failure to map/unmap. Cc: Joerg Roedel <joro@8bytes.org> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-02-13 19:31:20 +01:00
Dmitry Safonov	d15a339eac	iommu/vt-d: Add __init for dmar_register_bus_notifier() It's called only from intel_iommu_init(), which is init function. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-02-13 17:40:53 +01:00
Yong Wu	f3e827d73e	iommu/mediatek: Move attach_device after iommu-group is ready for M4Uv1 In the commit `05f80300dc` ("iommu: Finish making iommu_group support mandatory"), the iommu framework has supposed all the iommu drivers have their owner iommu-group, it get rid of the FIXME workarounds while the group is NULL. But the flow of Mediatek M4U gen1 looks a bit trick that it will hang at this case: ========================================== Unable to handle kernel NULL pointer dereference at virtual address 00000030 pgd = c0004000 [00000030] *pgd=00000000 PC is at mutex_lock+0x28/0x54 LR is at iommu_attach_device+0xa4/0xd4 pc : [<c07632e8>] lr : [<c04736fc>] psr: 60000013 sp : df0edbb8 ip : df0edbc8 fp : df0edbc4 r10: c114da14 r9 : df2a3e40 r8 : 00000003 r7 : df27a210 r6 : df2a90c4 r5 : 00000030 r4 : 00000000 r3 : df0f8000 r2 : fffff000 r1 : df29c610 r0 : 00000030 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none xxx (mutex_lock) from [<c04736fc>] (iommu_attach_device+0xa4/0xd4) (iommu_attach_device) from [<c011b9dc>] (__arm_iommu_attach_device+0x28/0x90) (__arm_iommu_attach_device) from [<c011ba60>] (arm_iommu_attach_device+0x1c/0x30) (arm_iommu_attach_device) from [<c04759ac>] (mtk_iommu_add_device+0xfc/0x214) (mtk_iommu_add_device) from [<c0472aa4>] (add_iommu_group+0x3c/0x68) (add_iommu_group) from [<c047d044>] (bus_for_each_dev+0x78/0xac) (bus_for_each_dev) from [<c04734a4>] (bus_set_iommu+0xb0/0xec) (bus_set_iommu) from [<c0476310>] (mtk_iommu_probe+0x328/0x368) (mtk_iommu_probe) from [<c048189c>] (platform_drv_probe+0x5c/0xc0) (platform_drv_probe) from [<c047f510>] (driver_probe_device+0x2f4/0x4d8) (driver_probe_device) from [<c047f800>] (__driver_attach+0x10c/0x128) (__driver_attach) from [<c047d044>] (bus_for_each_dev+0x78/0xac) (bus_for_each_dev) from [<c047ec78>] (driver_attach+0x2c/0x30) (driver_attach) from [<c047e640>] (bus_add_driver+0x1e0/0x278) (bus_add_driver) from [<c048052c>] (driver_register+0x88/0x108) (driver_register) from [<c04817ec>] (__platform_driver_register+0x50/0x58) (__platform_driver_register) from [<c0b31380>] (m4u_init+0x24/0x28) (m4u_init) from [<c0101c38>] (do_one_initcall+0xf0/0x17c) ========================= The root cause is that the device's iommu-group is NULL while arm_iommu_attach_device is called. This patch prepare a new iommu-group for the iommu consumer devices to fix this issue. CC: Robin Murphy <robin.murphy@arm.com> CC: Honghui Zhang <honghui.zhang@mediatek.com> Fixes: `05f80300dc` ("iommu: Finish making iommu_group support mandatory") Reported-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-02-13 14:40:31 +01:00
Robin Murphy	6d7cf02a86	iommu/exynos: Use generic group callback Since iommu_group_get_for_dev() already tries iommu_group_get() and will not call ops->device_group if the group is already non-NULL, the check in get_device_iommu_group() is always redundant and it reduces to a duplicate of the generic version; let's just use that one instead. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-02-13 14:33:53 +01:00
Scott Wood	01ee04bade	iommu/amd: Don't use dev_data in irte_ga_set_affinity() search_dev_data() acquires a non-raw lock, which can't be done from atomic context on PREEMPT_RT. There is no need to look at dev_data because guest_mode should never be set if use_vapic is not set. Signed-off-by: Scott Wood <swood@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-02-13 14:18:06 +01:00
Scott Wood	27790398c2	iommu/amd: Use raw locks on atomic context paths Several functions in this driver are called from atomic context, and thus raw locks must be used in order to be safe on PREEMPT_RT. This includes paths that must wait for command completion, which is a potential PREEMPT_RT latency concern but not easily avoidable. Signed-off-by: Scott Wood <swood@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-02-13 14:18:06 +01:00
Linus Torvalds	ef9417e8a9	IOMMU Updates for Linux v4.16 Including: - 5-level page-table support for the Intel IOMMU. - Error reporting improvements for the AMD IOMMU driver - Additional DT bindings for ipmmu-vmsa (Renesas) - Smaller fixes and cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJafGLMAAoJECvwRC2XARrjPTUP/0g/n8H5j35DevM56G62MrNq fNweMxPm7AqZQR/dnIkPnlH5NWfP1z5PZ47H/nAMAqd7cKHVOfUmzoufiUSGP92V eweFF4ufjqA+V5fluGcnt0UNxgbEGs+cEgf9jbEkUlpmFisV7BwOCGIJbVdHMrxG jkrr/L17iX82uqIru9JmfB2K0pEPBtBHQSZpooGHAyGsR4xU6nX1X64mV/a9Oh/2 qzfzRsAbF5ZtAszktVz9j2AMfp40BrrAcHzmvepjS5yTjlH9t5J8UdM48GHWU+Zp ptmlJ3fJybe0yUI6GDfG9M6+/RX0T/xMvV1QcSJW6KP0q/i9p4hrIQufoOzstMYM uCsFPlhMLFSDcQy6CZ3M6VEsU5mdJ0KMn0xAN8rBLAok1ScGKrlP5qWpXJLeUJRp Ie7R4WVT+Ly/SLppoiLagiTW3ZD/gQh+YPNgYwXptMdDmiqSRdXm0nF6bzTiKk1Z 8h8oEj2ittwBTC+fXuP+1C/wOKYL6KJUGnykLcHBDO+/wkEWOP0KM6939+T7IjHt zkiUapRegRvWyDOq1HFVl0tBCRLo1dqwG/3PFpqHUkj6Iyqyhd8y/V5IM3GTSI+d 1tHBz6dXin62N/xYu/ScpmPMerpjP/AtMqd3dvx7Q+9vgNIAVSPMKFqeXhQ3P2ph +p1CdWvPYPb7wUhTvcja =+LFh -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "This time there are not a lot of changes coming from the IOMMU side. That is partly because I returned from my parental leave late in the development process and probably partly because everyone was busy with Spectre and Meltdown mitigation work and didn't find the time for IOMMU work. So here are the few changes that queued up for this merge window: - 5-level page-table support for the Intel IOMMU. - error reporting improvements for the AMD IOMMU driver - additional DT bindings for ipmmu-vmsa (Renesas) - small fixes and cleanups" * tag 'iommu-updates-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu: Clean up of_iommu_init_fn iommu/ipmmu-vmsa: Remove redundant of_iommu_init_fn hook iommu/msm: Claim bus ops on probe iommu/vt-d: Enable 5-level paging mode in the PASID entry iommu/vt-d: Add a check for 5-level paging support iommu/vt-d: Add a check for 1GB page support iommu/vt-d: Enable upto 57 bits of domain address width iommu/vt-d: Use domain instead of cache fetching iommu/exynos: Don't unconditionally steal bus ops iommu/omap: Fix debugfs_create_*() usage iommu/vt-d: clean up pr_irq if request_threaded_irq fails iommu: Check the result of iommu_group_get() for NULL iommu/ipmmu-vmsa: Add r8a779(70\|95) DT bindings iommu/ipmmu-vmsa: Add r8a7796 DT binding iommu/amd: Set the device table entry PPR bit for IOMMU V2 devices iommu/amd - Record more information about unknown events	2018-02-08 12:03:54 -08:00
Linus Torvalds	105cf3c8c6	pci-v4.16-changes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJad5lgAAoJEFmIoMA60/r8s2kQAI3PztawDpaCP9Z12pkbBHSt Ho0xTyk9rCZi9kQJbNjc+a+QrlA3QmTHXIXerB3LSWoh7M+XhsECjem92eHpgLNS JvYPhTfOrCr0vdiAmOz6hD0AqN/psrbfzgiJhSwomsGEFS77k7kERSJckRv81sxb Aj5F/WjucAgLorwm4auveAJEQ7atE7/6pkXzoqYm4G6NLOb46jUcRGndrnvXZBlz fws8fBM4BHyi7i25CYQl24tFq1CGax1rIPgLg+4KnH76bQk/N6Ju0sGVSzfh+hG8 SIerK9bJbzGRAuNKoxB3aO1dyzsK3x9WztE2mG98w5trOISPIR1FqnvC/225FWAU d6eIXiC7wKnEx+DElNTzCjzfHc7SAJoupO32H7CoiTe5zPUlWlxJ1zLYkK1gt50q m8PRBiYTglxyznzrO0drtcdjEzvbdZNRrsYnul4wi1vSHzjk6F6XLtzT10XWM1M1 1pXLB8384FTj0Hu4bq6Y3Aivkmz0Sf+eQM2NaOwe+Zj7/1VV0d3lvi4LUXkqzLCA FoXPJSMxG2Qu+iflCeYRQBJjExaZH3eNLZ3dT6QpcJrjaFVedd9u5DeeFqNL27zV bhr8TdqrR4p4rc8EBAGoCapw96IxLZROKB3gxbrZVOpfIZpzthwHbElHX6aqUgF4 w/EV1JWs36WXWaxFk8wd =ttq9 -----END PGP SIGNATURE----- Merge tag 'pci-v4.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: - skip AER driver error recovery callbacks for correctable errors reported via ACPI APEI, as we already do for errors reported via the native path (Tyler Baicar) - fix DPC shared interrupt handling (Alex Williamson) - print full DPC interrupt number (Keith Busch) - enable DPC only if AER is available (Keith Busch) - simplify DPC code (Bjorn Helgaas) - calculate ASPM L1 substate parameter instead of hardcoding it (Bjorn Helgaas) - enable Latency Tolerance Reporting for ASPM L1 substates (Bjorn Helgaas) - move ASPM internal interfaces out of public header (Bjorn Helgaas) - allow hot-removal of VGA devices (Mika Westerberg) - speed up unplug and shutdown by assuming Thunderbolt controllers don't support Command Completed events (Lukas Wunner) - add AtomicOps support for GPU and Infiniband drivers (Felix Kuehling, Jay Cornwall) - expose "ari_enabled" in sysfs to help NIC naming (Stuart Hayes) - clean up PCI DMA interface usage (Christoph Hellwig) - remove PCI pool API (replaced with DMA pool) (Romain Perier) - deprecate pci_get_bus_and_slot(), which assumed PCI domain 0 (Sinan Kaya) - move DT PCI code from drivers/of/ to drivers/pci/ (Rob Herring) - add PCI-specific wrappers for dev_info(), etc (Frederick Lawler) - remove warnings on sysfs mmap failure (Bjorn Helgaas) - quiet ROM validation messages (Alex Deucher) - remove redundant memory alloc failure messages (Markus Elfring) - fill in types for compile-time VGA and other I/O port resources (Bjorn Helgaas) - make "pci=pcie_scan_all" work for Root Ports as well as Downstream Ports to help AmigaOne X1000 (Bjorn Helgaas) - add SPDX tags to all PCI files (Bjorn Helgaas) - quirk Marvell 9128 DMA aliases (Alex Williamson) - quirk broken INTx disable on Ceton InfiniTV4 (Bjorn Helgaas) - fix CONFIG_PCI=n build by adding dummy pci_irqd_intx_xlate() (Niklas Cassel) - use DMA API to get MSI address for DesignWare IP (Niklas Cassel) - fix endpoint-mode DMA mask configuration (Kishon Vijay Abraham I) - fix ARTPEC-6 incorrect IS_ERR() usage (Wei Yongjun) - add support for ARTPEC-7 SoC (Niklas Cassel) - add endpoint-mode support for ARTPEC (Niklas Cassel) - add Cadence PCIe host and endpoint controller driver (Cyrille Pitchen) - handle multiple INTx status bits being set in dra7xx (Vignesh R) - translate dra7xx hwirq range to fix INTD handling (Vignesh R) - remove deprecated Exynos PHY initialization code (Jaehoon Chung) - fix MSI erratum workaround for HiSilicon Hip06/Hip07 (Dongdong Liu) - fix NULL pointer dereference in iProc BCMA driver (Ray Jui) - fix Keystone interrupt-controller-node lookup (Johan Hovold) - constify qcom driver structures (Julia Lawall) - rework Tegra config space mapping to increase space available for endpoints (Vidya Sagar) - simplify Tegra driver by using bus->sysdata (Manikanta Maddireddy) - remove PCI_REASSIGN_ALL_BUS usage on Tegra (Manikanta Maddireddy) - add support for Global Fabric Manager Server (GFMS) event to Microsemi Switchtec switch driver (Logan Gunthorpe) - add IDs for Switchtec PSX 24xG3 and PSX 48xG3 (Kelvin Cao) * tag 'pci-v4.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (140 commits) PCI: cadence: Add EndPoint Controller driver for Cadence PCIe controller dt-bindings: PCI: cadence: Add DT bindings for Cadence PCIe endpoint controller PCI: endpoint: Fix EPF device name to support multi-function devices PCI: endpoint: Add the function number as argument to EPC ops PCI: cadence: Add host driver for Cadence PCIe controller dt-bindings: PCI: cadence: Add DT bindings for Cadence PCIe host controller PCI: Add vendor ID for Cadence PCI: Add generic function to probe PCI host controllers PCI: generic: fix missing call of pci_free_resource_list() PCI: OF: Add generic function to parse and allocate PCI resources PCI: Regroup all PCI related entries into drivers/pci/Makefile PCI/DPC: Reformat DPC register definitions PCI/DPC: Add and use DPC Status register field definitions PCI/DPC: Squash dpc_rp_pio_get_info() into dpc_process_rp_pio_error() PCI/DPC: Remove unnecessary RP PIO register structs PCI/DPC: Push dpc->rp_pio_status assignment into dpc_rp_pio_get_info() PCI/DPC: Squash dpc_rp_pio_print_error() into dpc_rp_pio_get_info() PCI/DPC: Make RP PIO log size check more generic PCI/DPC: Rename local "status" to "dpc_status" PCI/DPC: Squash dpc_rp_pio_print_tlp_header() into dpc_rp_pio_print_error() ...	2018-02-06 09:59:40 -08:00
Linus Torvalds	fe53d1443a	ARM: SoC driver updates for 4.16 A number of new drivers get added this time, along with many low-priority bugfixes. The most interesting changes by subsystem are: bus drivers: - Updates to the Broadcom bus interface driver to support newer SoC types - The TI OMAP sysc driver now supports updated DT bindings memory controllers: - A new driver for Tegra186 gets added - A new driver for the ti-emif sram, to allow relocating suspend/resume handlers there SoC specific: - A new driver for Qualcomm QMI, the interface to the modem on MSM SoCs - A new driver for power domains on the actions S700 SoC - A driver for the Xilinx Zynq VCU logicoreIP reset controllers: - A new driver for Amlogic Meson-AGX - various bug fixes tee subsystem: - A new user interface got added to enable asynchronous communication with the TEE supplicant. - A new method of using user space memory for communication with the TEE is added -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJac0fQAAoJEGCrR//JCVIntjEQALc6kflEGJc/FPundbx9V3F/ b+3+EX/uMnBnKsgZprz9ACPhx5eBH9QWja3A1zmIarb5c+q7zbBZDwhzUb8J8Yg8 xEb0im7Wx/GcKjUYZVKYBxtz9KjkXDzhrq8IAvPg6ShNcIy/8hq7ZO3iOkGsTDcy /PyioWKC5g0dhJgtp91X1kgog5tuTaWOg39uUOqyEzwVu1vYVa4w+eeCzjEd6I// 68R/zDQ52+hWw6WZGoYOsNYzuriOflnJRnNpwuGhMhLNULBJfWnd4hkqGm4E+hFa 5dzW6vVAdIqjemFqPzCBT2WB4UG871aZX8DJ9HgnfX+g970nlsm1JY8Ck9MJNJum aDkqZG41ArUYzDFWu8vJ2SKsue5lEZp6TEO2mLEVYrdOjOgedj0Zxqmq2DYeigxd +ccOVgKJ9SsYw9ft1LkQ5BHCgOh3C7y9Kcg7oBnaEI5OTVvtf5PwEkT2cwbvgxYl EVKLhlJ0Af+QXOW8E5JbNQETpYw52DMm6UKHiYn/JCGxB/8J0bgJzImDJI4Dtu2h zqJITKJeTepqbfA5pmNfKa+20RhmsktdRCw2NN/QynY7EEtGjHAUVnlpZT2mrDco 0m62b7Erek/776vJN5ECzE5e6XCs2N0MDE6Anp121C5zEmig/SMBrUosMzP7Jnis IDVC/QWkb3u85wK20Vc1 =yz0k -----END PGP SIGNATURE----- Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC driver updates from Arnd Bergmann: "A number of new drivers get added this time, along with many low-priority bugfixes. The most interesting changes by subsystem are: bus drivers: - Updates to the Broadcom bus interface driver to support newer SoC types - The TI OMAP sysc driver now supports updated DT bindings memory controllers: - A new driver for Tegra186 gets added - A new driver for the ti-emif sram, to allow relocating suspend/resume handlers there SoC specific: - A new driver for Qualcomm QMI, the interface to the modem on MSM SoCs - A new driver for power domains on the actions S700 SoC - A driver for the Xilinx Zynq VCU logicoreIP reset controllers: - A new driver for Amlogic Meson-AGX - various bug fixes tee subsystem: - A new user interface got added to enable asynchronous communication with the TEE supplicant. - A new method of using user space memory for communication with the TEE is added" * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (84 commits) of: platform: fix OF node refcount leak soc: fsl: guts: Add a NULL check for devm_kasprintf() bus: ti-sysc: Fix smartreflex sysc mask psci: add CPU_IDLE dependency soc: xilinx: Fix Kconfig alignment soc: xilinx: xlnx_vcu: Use bitwise & rather than logical && on clkoutdiv soc: xilinx: xlnx_vcu: Depends on HAS_IOMEM for xlnx_vcu soc: bcm: brcmstb: Be multi-platform compatible soc: brcmstb: biuctrl: exit without warning on non brcmstb platforms Revert "soc: brcmstb: Only register SoC device on STB platforms" bus: omap: add MODULE_LICENSE tags soc: brcmstb: Only register SoC device on STB platforms tee: shm: Potential NULL dereference calling tee_shm_register() soc: xilinx: xlnx_vcu: Add Xilinx ZYNQMP VCU logicoreIP init driver dt-bindings: soc: xilinx: Add DT bindings to xlnx_vcu driver soc: xilinx: Create folder structure for soc specific drivers of: platform: populate /firmware/ node from of_platform_default_populate_init() soc: samsung: Add SPDX license identifiers soc: qcom: smp2p: Use common error handling code in qcom_smp2p_probe() tee: shm: don't put_page on null shm->pages ...	2018-02-01 16:35:31 -08:00
David Rientjes	5ff7091f5a	mm, mmu_notifier: annotate mmu notifiers with blockable invalidate callbacks Commit `4d4bbd8526` ("mm, oom_reaper: skip mm structs with mmu notifiers") prevented the oom reaper from unmapping private anonymous memory with the oom reaper when the oom victim mm had mmu notifiers registered. The rationale is that doing mmu_notifier_invalidate_range_{start,end}() around the unmap_page_range(), which is needed, can block and the oom killer will stall forever waiting for the victim to exit, which may not be possible without reaping. That concern is real, but only true for mmu notifiers that have blockable invalidate_range_{start,end}() callbacks. This patch adds a "flags" field to mmu notifier ops that can set a bit to indicate that these callbacks do not block. The implementation is steered toward an expensive slowpath, such as after the oom reaper has grabbed mm->mmap_sem of a still alive oom victim. [rientjes@google.com: mmu_notifier_invalidate_range_end() can also call the invalidate_range() must not block, fix comment] Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1801091339570.240101@chino.kir.corp.google.com [akpm@linux-foundation.org: make mm_has_blockable_invalidate_notifiers() return bool, use rwsem_is_locked()] Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1712141329500.74052@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Dimitri Sivanich <sivanich@hpe.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Joerg Roedel <joro@8bytes.org> Cc: Doug Ledford <dledford@redhat.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-01-31 17:18:38 -08:00
Joerg Roedel	fedbd940d1	Merge branches 'arm/renesas', 'arm/omap', 'arm/exynos', 'x86/amd', 'x86/vt-d' and 'core' into next	2018-01-17 15:29:14 +01:00
Robin Murphy	b0c560f7d8	iommu: Clean up of_iommu_init_fn Now that no more drivers rely on arbitrary early initialisation via an of_iommu_init_fn hook, let's clean up the redundant remnants. The IOMMU_OF_DECLARE() macro needs to remain for now, as the probe-deferral mechanism has no other nice way to detect built-in drivers before they have registered themselves, such that it can make the right decision. Reviewed-by: Sricharan R <sricharan@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-01-17 15:25:50 +01:00
Robin Murphy	e7747d88e0	iommu/ipmmu-vmsa: Remove redundant of_iommu_init_fn hook Having of_iommu_init() call ipmmu_init() via ipmmu_vmsa_iommu_of_setup() does nothing that the subsys_initcall wouldn't do slightly later anyway, since probe-deferral of masters means it is no longer critical to register the driver super-early. Clean it up. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-01-17 15:25:49 +01:00
Robin Murphy	892d7aaddb	iommu/msm: Claim bus ops on probe Since the MSM IOMMU driver now probes via DT exclusively rather than platform data, dependent masters should be deferred until the IOMMU itself is ready. Thus we can do away with the early initialisation hook to unconditionally claim the bus ops, and instead do that only once an IOMMU is actually probed. Furthermore, this should also make the driver safe for multiplatform kernels on non-MSM SoCs. Reviewed-by: Sricharan R <sricharan@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-01-17 15:25:49 +01:00
Sohil Mehta	2f13eb7c58	iommu/vt-d: Enable 5-level paging mode in the PASID entry If the CPU has support for 5-level paging enabled and the IOMMU also supports 5-level paging then enable the 5-level paging mode for first- level translations - used when SVM is enabled. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-01-17 15:02:50 +01:00
Sohil Mehta	f1ac10c24e	iommu/vt-d: Add a check for 5-level paging support Add a check to verify IOMMU 5-level paging support. If the CPU supports supports 5-level paging but the IOMMU does not support it then disable SVM by not allocating PASID tables. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-01-17 15:02:50 +01:00
Sohil Mehta	59103caa68	iommu/vt-d: Add a check for 1GB page support Add a check to verify IOMMU 1GB page support. If the CPU supports 1GB pages but the IOMMU does not support it then disable SVM by not allocating PASID tables. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-01-17 15:02:50 +01:00
Sohil Mehta	5e3b4a15dd	iommu/vt-d: Enable upto 57 bits of domain address width Update the IOMMU default domain address width to 57 bits. This would enable the IOMMU to do upto 5-levels of paging for second level translations - IOVA translation requests without PASID. Even though the maximum supported address width is being increased to 57, __iommu_calculate_agaw() would set the actual supported address width to the maximum support available in IOMMU hardware. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-01-17 15:02:49 +01:00
Peter Xu	9d2e6505f6	iommu/vt-d: Use domain instead of cache fetching after commit `a1ddcbe930` ("iommu/vt-d: Pass dmar_domain directly into iommu_flush_iotlb_psi", 2015-08-12), we have domain pointer as parameter to iommu_flush_iotlb_psi(), so no need to fetch it from cache again. More importantly, a NULL reference pointer bug is reported on RHEL7 (and it can be reproduced on some old upstream kernels too, e.g., v4.13) by unplugging an 40g nic from a VM (hard to test unplug on real host, but it should be the same): https://bugzilla.redhat.com/show_bug.cgi?id=1531367 [ 24.391863] pciehp 0000:00:03.0:pcie004: Slot(0): Attention button pressed [ 24.393442] pciehp 0000:00:03.0:pcie004: Slot(0): Powering off due to button press [ 29.721068] i40evf 0000:01:00.0: Unable to send opcode 2 to PF, err I40E_ERR_QUEUE_EMPTY, aq_err OK [ 29.783557] iommu: Removing device 0000:01:00.0 from group 3 [ 29.784662] BUG: unable to handle kernel NULL pointer dereference at 0000000000000304 [ 29.785817] IP: iommu_flush_iotlb_psi+0xcf/0x120 [ 29.786486] PGD 0 [ 29.786487] P4D 0 [ 29.786812] [ 29.787390] Oops: 0000 [#1] SMP [ 29.787876] Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_ng [ 29.795371] CPU: 0 PID: 156 Comm: kworker/0:2 Not tainted 4.13.0 #14 [ 29.796366] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-1.el7 04/01/2014 [ 29.797593] Workqueue: pciehp-0 pciehp_power_thread [ 29.798328] task: ffff94f5745b4a00 task.stack: ffffb326805ac000 [ 29.799178] RIP: 0010:iommu_flush_iotlb_psi+0xcf/0x120 [ 29.799919] RSP: 0018:ffffb326805afbd0 EFLAGS: 00010086 [ 29.800666] RAX: ffff94f5bc56e800 RBX: 0000000000000000 RCX: 0000000200000025 [ 29.801667] RDX: ffff94f5bc56e000 RSI: 0000000000000082 RDI: 0000000000000000 [ 29.802755] RBP: ffffb326805afbf8 R08: 0000000000000000 R09: ffff94f5bc86bbf0 [ 29.803772] R10: ffffb326805afba8 R11: 00000000000ffdc4 R12: ffff94f5bc86a400 [ 29.804789] R13: 0000000000000000 R14: 00000000ffdc4000 R15: 0000000000000000 [ 29.805792] FS: 0000000000000000(0000) GS:ffff94f5bfc00000(0000) knlGS:0000000000000000 [ 29.806923] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 29.807736] CR2: 0000000000000304 CR3: 000000003499d000 CR4: 00000000000006f0 [ 29.808747] Call Trace: [ 29.809156] flush_unmaps_timeout+0x126/0x1c0 [ 29.809800] domain_exit+0xd6/0x100 [ 29.810322] device_notifier+0x6b/0x70 [ 29.810902] notifier_call_chain+0x4a/0x70 [ 29.812822] __blocking_notifier_call_chain+0x47/0x60 [ 29.814499] blocking_notifier_call_chain+0x16/0x20 [ 29.816137] device_del+0x233/0x320 [ 29.817588] pci_remove_bus_device+0x6f/0x110 [ 29.819133] pci_stop_and_remove_bus_device+0x1a/0x20 [ 29.820817] pciehp_unconfigure_device+0x7a/0x1d0 [ 29.822434] pciehp_disable_slot+0x52/0xe0 [ 29.823931] pciehp_power_thread+0x8a/0xa0 [ 29.825411] process_one_work+0x18c/0x3a0 [ 29.826875] worker_thread+0x4e/0x3b0 [ 29.828263] kthread+0x109/0x140 [ 29.829564] ? process_one_work+0x3a0/0x3a0 [ 29.831081] ? kthread_park+0x60/0x60 [ 29.832464] ret_from_fork+0x25/0x30 [ 29.833794] Code: 85 ed 74 0b 5b 41 5c 41 5d 41 5e 41 5f 5d c3 49 8b 54 24 60 44 89 f8 0f b6 c4 48 8b 04 c2 48 85 c0 74 49 45 0f b6 ff 4a 8b 3c f8 <80> bf [ 29.838514] RIP: iommu_flush_iotlb_psi+0xcf/0x120 RSP: ffffb326805afbd0 [ 29.840362] CR2: 0000000000000304 [ 29.841716] ---[ end trace b10ec0d6900868d3 ]--- This patch fixes that problem if applied to v4.13 kernel. The bug does not exist on latest upstream kernel since it's fixed as a side effect of commit `13cf017446` ("iommu/vt-d: Make use of iova deferred flushing", 2017-08-15). But IMHO it's still good to have this patch upstream. CC: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Fixes: `a1ddcbe930` ("iommu/vt-d: Pass dmar_domain directly into iommu_flush_iotlb_psi") Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-01-17 14:45:18 +01:00
Robin Murphy	dc98b8480d	iommu/exynos: Don't unconditionally steal bus ops Removing the early device registration hook overlooked the fact that it only ran conditionally on a compatible device being present in the DT. With exynos_iommu_init() now running as an unconditional initcall, problems arise on non-Exynos systems when other IOMMU drivers find themselves unable to install their ops on the platform bus, or at worst the Exynos ops get called with someone else's domain and all hell breaks loose. The global ops/cache setup could probably all now be triggered from the first IOMMU probe, as with dma_dev assigment, but for the time being the simplest fix is to resurrect the logic from commit `a7b67cd5d9` ("iommu/exynos: Play nice in multi-platform builds") to explicitly check the DT for the presence of an Exynos IOMMU before trying anything. Fixes: `928055a01b` ("iommu/exynos: Remove custom platform device registration code") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-01-17 14:38:31 +01:00
Geert Uytterhoeven	f18affbea8	iommu/omap: Fix debugfs_create_() usage When exposing data access through debugfs, the correct debugfs_create_() functions must be used, depending on data type. Remove all casts from data pointers passed to debugfs_create_() functions, as such casts prevent the compiler from flagging bugs. omap_iommu.nr_tlb_entries is "int", hence casting to "u8 " exposes only a part of it. Fix this by using debugfs_create_u32() instead. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-01-17 14:23:33 +01:00
Christoph Hellwig	4fac8076df	ia64: clean up swiotlb support Move the few remaining bits of swiotlb glue towards their callers, and remove the pointless on ia64 swiotlb variable. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian König <christian.koenig@amd.com>	2018-01-15 09:35:53 +01:00
Sinan Kaya	d5bf0f4f2b	iommu/amd: Deprecate pci_get_bus_and_slot() pci_get_bus_and_slot() is restrictive such that it assumes domain=0 as where a PCI device is present. This restricts the device drivers to be reused for other domain numbers. Getting ready to remove pci_get_bus_and_slot() function in favor of pci_get_domain_bus_and_slot(). Hard-code the domain number as 0 for the AMD IOMMU driver. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Bjorn Helgaas <helgaas@kernel.org> Reviewed-by: Gary R Hook <gary.hook@amd.com> Acked-by: Joerg Roedel <jroedel@suse.de>	2018-01-11 17:31:05 -06:00
Robin Murphy	563b5cbe33	iommu/arm-smmu-v3: Cope with duplicated Stream IDs For PCI devices behind an aliasing PCIe-to-PCI/X bridge, the bridge alias to DevFn 0.0 on the subordinate bus may match the original RID of the device, resulting in the same SID being present in the device's fwspec twice. This causes trouble later in arm_smmu_write_strtab_ent() when we wind up visiting the STE a second time and find it already live. Avoid the issue by giving arm_smmu_install_ste_for_dev() the cleverness to skip over duplicates. It seems mildly counterintuitive compared to preventing the duplicates from existing in the first place, but since the DT and ACPI probe paths build their fwspecs differently, this is actually the cleanest and most self-contained way to deal with it. Cc: <stable@vger.kernel.org> Fixes: `8f78515425` ("iommu/arm-smmu: Implement of_xlate() for SMMUv3") Reported-by: Tomasz Nowicki <tomasz.nowicki@caviumnetworks.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Tested-by: Jayachandran C. <jnair@caviumnetworks.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-01-02 16:45:51 +00:00
Jean-Philippe Brucker	57d72e159b	iommu/arm-smmu-v3: Don't free page table ops twice Kasan reports a double free when finalise_stage_fn fails: the io_pgtable ops are freed by arm_smmu_domain_finalise and then again by arm_smmu_domain_free. Prevent this by leaving pgtbl_ops empty on failure. Cc: <stable@vger.kernel.org> Fixes: `48ec83bcbc` ("iommu/arm-smmu: Add initial driver support for ARM SMMUv3 devices") Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-01-02 16:45:45 +00:00
Thomas Gleixner	702cb0a028	genirq/irqdomain: Rename early argument of irq_domain_activate_irq() The 'early' argument of irq_domain_activate_irq() is actually used to denote reservation mode. To avoid confusion, rename it before abuse happens. No functional change. Fixes: `7249164346` ("genirq/irqdomain: Update irq_domain_ops.activate() signature") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexandru Chirvasitu <achirvasub@gmail.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Mikael Pettersson <mikpelinux@gmail.com> Cc: Josh Poulson <jopoulso@microsoft.com> Cc: Mihai Costache <v-micos@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-pci@vger.kernel.org Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Simon Xiao <sixiao@microsoft.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jork Loeser <Jork.Loeser@microsoft.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: devel@linuxdriverproject.org Cc: KY Srinivasan <kys@microsoft.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Sakari Ailus <sakari.ailus@intel.com>, Cc: linux-media@vger.kernel.org	2017-12-29 21:13:04 +01:00
Arnd Bergmann	62bf8ae896	memory: tegra: Changes for v4.16-rc1 The Tegra memory controller driver will now instruct the SMMU driver to create groups, which will make it easier for device drivers to share an IOMMU domain between multiple devices. Initial Tegra186 support is also added in a separate driver. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEiOrDCAFJzPfAjcif3SOs138+s6EFAlo6tlMTHHRyZWRpbmdA bnZpZGlhLmNvbQAKCRDdI6zXfz6zoR7MEACpoyPkV5fGz3z8rdsYaVZSA+upYEHw CC4v5XslNCucmk57XIYTbAZRu7W5MuzP3hq+DoYM5B5EHxgt1a2YHNtjjg3eLx1T U+8hjoOPOYu7FBVmUf4V6Buvmws90DxtfvqcfIScJJEAPgKsT/wfGXNstc88kDnO c+huQxk5QZsyGBgt1sS61zlymaLpDlSoE6kScXnjM1K6XtUdZ+8zmoMdZ7Ka8DUv kwBoYXgUpVminWpSrfuV7f3/8BtTSJ/b2rUPz7Ren0JJTFrhIi/ckeQ5YhTa46Dn oqR0f6MuLRGADT+E5DfoF7otvE2Fvq0Uq/YlypIf9gtiit5mhPM7ZXb3LOwDGiCp 0mwljW9Uov7ZpFnb8b9SlVf1b+zDeeLxtARq4qzgky5OkWZp4hXZw+Os3ebXQYW3 d/MUfwdqgNKOccGRsRkk9OEdUiFGpGtInb0xM9OWBCHvdrXhbK9llZ3wnrE+M3SY BELM+94YorqVPx5JTmiM7kTozVwpkvZQG6kU2J81Y55FI1ELQF7Mk0yzFs562tG5 2abc07xkSUCdC6zsrgNOy+AtuRH8oHccUfZRACLUhzHNCrvfr3MrEcEv8l4VRJp4 ZLfR8UjmtTI8YllBKdSdUtFB4nN5oyoIPCNsgwzXcTnKV5SzAyA/KuDW6IWoNU1L 6ekAxpeiHVnLKA== =ncAl -----END PGP SIGNATURE----- Merge tag 'tegra-for-4.16-memory' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tegra/linux into next/drivers Pull "memory: tegra: Changes for v4.16-rc1" from Thierry Reding: The Tegra memory controller driver will now instruct the SMMU driver to create groups, which will make it easier for device drivers to share an IOMMU domain between multiple devices. Initial Tegra186 support is also added in a separate driver. * tag 'tegra-for-4.16-memory' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tegra/linux: iommu/tegra-smmu: Fix return value check in tegra_smmu_group_get() iommu/tegra: Allow devices to be grouped memory: tegra: Create SMMU display groups memory: tegra: Add Tegra186 support dt-bindings: memory: Add Tegra186 support dt-bindings: misc: Add Tegra186 MISC registers bindings	2017-12-21 18:05:04 +01:00
Wei Yongjun	83476bfaf6	iommu/tegra-smmu: Fix return value check in tegra_smmu_group_get() In case of error, the function iommu_group_alloc() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: `7f4c9176f7` ("iommu/tegra: Allow devices to be grouped") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-12-20 18:32:08 +01:00
Jerry Snitselaar	72d5481138	iommu/vt-d: clean up pr_irq if request_threaded_irq fails It is unlikely request_threaded_irq will fail, but if it does for some reason we should clear iommu->pr_irq in the error path. Also intel_svm_finish_prq shouldn't try to clean up the page request interrupt if pr_irq is 0. Without these, if request_threaded_irq were to fail the following occurs: fail with no fixes: [ 0.683147] ------------[ cut here ]------------ [ 0.683148] NULL pointer, cannot free irq [ 0.683158] WARNING: CPU: 1 PID: 1 at kernel/irq/irqdomain.c:1632 irq_domain_free_irqs+0x126/0x140 [ 0.683160] Modules linked in: [ 0.683163] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc2 #3 [ 0.683165] Hardware name: /NUC7i3BNB, BIOS BNKBL357.86A.0036.2017.0105.1112 01/05/2017 [ 0.683168] RIP: 0010:irq_domain_free_irqs+0x126/0x140 [ 0.683169] RSP: 0000:ffffc90000037ce8 EFLAGS: 00010292 [ 0.683171] RAX: 000000000000001d RBX: ffff880276283c00 RCX: ffffffff81c5e5e8 [ 0.683172] RDX: 0000000000000001 RSI: 0000000000000096 RDI: 0000000000000246 [ 0.683174] RBP: ffff880276283c00 R08: 0000000000000000 R09: 000000000000023c [ 0.683175] R10: 0000000000000007 R11: 0000000000000000 R12: 000000000000007a [ 0.683176] R13: 0000000000000001 R14: 0000000000000000 R15: 0000010010000000 [ 0.683178] FS: 0000000000000000(0000) GS:ffff88027ec80000(0000) knlGS:0000000000000000 [ 0.683180] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.683181] CR2: 0000000000000000 CR3: 0000000001c09001 CR4: 00000000003606e0 [ 0.683182] Call Trace: [ 0.683189] intel_svm_finish_prq+0x3c/0x60 [ 0.683191] free_dmar_iommu+0x1ac/0x1b0 [ 0.683195] init_dmars+0xaaa/0xaea [ 0.683200] ? klist_next+0x19/0xc0 [ 0.683203] ? pci_do_find_bus+0x50/0x50 [ 0.683205] ? pci_get_dev_by_id+0x52/0x70 [ 0.683208] intel_iommu_init+0x498/0x5c7 [ 0.683211] pci_iommu_init+0x13/0x3c [ 0.683214] ? e820__memblock_setup+0x61/0x61 [ 0.683217] do_one_initcall+0x4d/0x1a0 [ 0.683220] kernel_init_freeable+0x186/0x20e [ 0.683222] ? set_debug_rodata+0x11/0x11 [ 0.683225] ? rest_init+0xb0/0xb0 [ 0.683226] kernel_init+0xa/0xff [ 0.683229] ret_from_fork+0x1f/0x30 [ 0.683259] Code: 89 ee 44 89 e7 e8 3b e8 ff ff 5b 5d 44 89 e7 44 89 ee 41 5c 41 5d 41 5e e9 a8 84 ff ff 48 c7 c7 a8 71 a7 81 31 c0 e8 6a d3 f9 ff <0f> ff 5b 5d 41 5c 41 5d 41 5 e c3 0f 1f 44 00 00 66 2e 0f 1f 84 [ 0.683285] ---[ end trace f7650e42792627ca ]--- with iommu->pr_irq = 0, but no check in intel_svm_finish_prq: [ 0.669561] ------------[ cut here ]------------ [ 0.669563] Trying to free already-free IRQ 0 [ 0.669573] WARNING: CPU: 3 PID: 1 at kernel/irq/manage.c:1546 __free_irq+0xa4/0x2c0 [ 0.669574] Modules linked in: [ 0.669577] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc2 #4 [ 0.669579] Hardware name: /NUC7i3BNB, BIOS BNKBL357.86A.0036.2017.0105.1112 01/05/2017 [ 0.669581] RIP: 0010:__free_irq+0xa4/0x2c0 [ 0.669582] RSP: 0000:ffffc90000037cc0 EFLAGS: 00010082 [ 0.669584] RAX: 0000000000000021 RBX: 0000000000000000 RCX: ffffffff81c5e5e8 [ 0.669585] RDX: 0000000000000001 RSI: 0000000000000086 RDI: 0000000000000046 [ 0.669587] RBP: 0000000000000000 R08: 0000000000000000 R09: 000000000000023c [ 0.669588] R10: 0000000000000007 R11: 0000000000000000 R12: ffff880276253960 [ 0.669589] R13: ffff8802762538a4 R14: ffff880276253800 R15: ffff880276283600 [ 0.669593] FS: 0000000000000000(0000) GS:ffff88027ed80000(0000) knlGS:0000000000000000 [ 0.669594] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.669596] CR2: 0000000000000000 CR3: 0000000001c09001 CR4: 00000000003606e0 [ 0.669602] Call Trace: [ 0.669616] free_irq+0x30/0x60 [ 0.669620] intel_svm_finish_prq+0x34/0x60 [ 0.669623] free_dmar_iommu+0x1ac/0x1b0 [ 0.669627] init_dmars+0xaaa/0xaea [ 0.669631] ? klist_next+0x19/0xc0 [ 0.669634] ? pci_do_find_bus+0x50/0x50 [ 0.669637] ? pci_get_dev_by_id+0x52/0x70 [ 0.669639] intel_iommu_init+0x498/0x5c7 [ 0.669642] pci_iommu_init+0x13/0x3c [ 0.669645] ? e820__memblock_setup+0x61/0x61 [ 0.669648] do_one_initcall+0x4d/0x1a0 [ 0.669651] kernel_init_freeable+0x186/0x20e [ 0.669653] ? set_debug_rodata+0x11/0x11 [ 0.669656] ? rest_init+0xb0/0xb0 [ 0.669658] kernel_init+0xa/0xff [ 0.669661] ret_from_fork+0x1f/0x30 [ 0.669662] Code: 7a 08 75 0e e9 c3 01 00 00 4c 39 7b 08 74 57 48 89 da 48 8b 5a 18 48 85 db 75 ee 89 ee 48 c7 c7 78 67 a7 81 31 c0 e8 4c 37 fa ff <0f> ff 48 8b 34 24 4c 89 ef e 8 0e 4c 68 00 49 8b 46 40 48 8b 80 [ 0.669688] ---[ end trace 58a470248700f2fc ]--- Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-12-20 09:48:56 -07:00
Jordan Crouse	9ae9df035c	iommu: Check the result of iommu_group_get() for NULL The result of iommu_group_get() was being blindly used in both attach and detach which results in a dereference when trying to work with an unknown device. Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-12-20 09:48:36 -07:00
Gary R Hook	ff18c4e598	iommu/amd: Set the device table entry PPR bit for IOMMU V2 devices The AMD IOMMU specification Rev 3.00 (December 2016) introduces a new Enhanced PPR Handling Support (EPHSup) bit in the MMIO register offset 0030h (IOMMU Extended Feature Register). When EPHSup=1, the IOMMU hardware requires the PPR bit of the device table entry (DTE) to be set in order to support PPR for a particular endpoint device. Please see https://support.amd.com/TechDocs/48882_IOMMU.pdf for this revision of the AMD IOMMU specification. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-12-20 09:47:08 -07:00
Gary R Hook	f9fc049ef1	iommu/amd - Record more information about unknown events When an unknown type event occurs, the default information written to the syslog should dump raw event data. This could provide insight into the event that occurred. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-12-20 09:47:07 -07:00
Thierry Reding	7f4c9176f7	iommu/tegra: Allow devices to be grouped Implement the ->device_group() and ->of_xlate() callbacks which are used in order to group devices. Each group can then share a single domain. This is implemented primarily in order to achieve the same semantics on Tegra210 and earlier as on Tegra186 where the Tegra SMMU was replaced by an ARM SMMU. Users of the IOMMU API can now use the same code to share domains between devices, whereas previously they used to attach each device individually. Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-12-15 10:12:32 +01:00
Linus Torvalds	e56d565d67	IOMMU fixes for Linux v4.15-rc3 * Fix VT-d handling of scatterlists where sg->offset exceeds PAGE_SIZE -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iQIcBAABAgAGBQJaKDGkAAoJECObm247sIsi6t4P/3WyPbB8ik8eb0J4R8yKfJsa lqJ/KWwxDMwkOlIjW+cU07mnpZSBahHFKNr4T+5LdhSSOdEbepGC8aMeN188UHww MyqzWEgKOFL76BMqL20YWVXdIwhejZz/jVyboblS8e9ufr3Wn53btfFUdhmeZ7Yc 9vL5wu2QQPG6oRcEPNmk9ev039MPBkk/F4GM+bfmVnaqsdo0+aLSzBBT/LWZ/W8l TUnrVwRFVg3J6voWH2L3i+5LmyJiKRVXPKLppoOLMNtUc4oviHAVW2VNA6BxKpaL R3TH/N9isfMzIlk54kSU5b+90p/esXWJZZEIwv1qaF2263FVjjTBko7RcMFobAir Gz6rPB4VhPsSfYikVJXsljrJntHFFgiw/3J0X52wd7ZFPJvpp5ku/1Lql9RFcNBM ymB8jrPY5nA+IaiIqm0We2gMLUrQw3ejdorXp54KRBr78dQWRq7IF0imA7yOjSCf 8wqf3vOVTdAPhB77XII4tYj2torOzBuYwM1PMbyfgCC0Ux5xUQh8/f1L05wVzIaw fMkcE581OEogyyktAVwPQ8KFJwsf7Lqywm6CzYqxDQX8FKgvWdGlSfV6MG/ma7ng bNm5fdqwdAOloSHRnuuRiC1B9azqTuZZPbdkcY1F8qk+08AnC77GL8o6ofkHMI9i GeAscUDDvNmVTl7mHKmZ =1sMC -----END PGP SIGNATURE----- Merge tag 'iommu-v4.15-rc3' of git://github.com/awilliam/linux-vfio Pull IOMMU fix from Alex Williamson: "Fix VT-d handling of scatterlists where sg->offset exceeds PAGE_SIZE" * tag 'iommu-v4.15-rc3' of git://github.com/awilliam/linux-vfio: iommu/vt-d: Fix scatterlist offset handling	2017-12-06 10:53:02 -08:00
Kees Cook	e99e88a9d2	treewide: setup_timer() -> timer_setup() This converts all remaining cases of the old setup_timer() API into using timer_setup(), where the callback argument is the structure already holding the struct timer_list. These should have no behavioral changes, since they just change which pointer is passed into the callback with the same available pointers after conversion. It handles the following examples, in addition to some other variations. Casting from unsigned long: void my_callback(unsigned long data) { struct something ptr = (struct something )data; ... } ... setup_timer(&ptr->my_timer, my_callback, ptr); and forced object casts: void my_callback(struct something ptr) { ... } ... setup_timer(&ptr->my_timer, my_callback, (unsigned long)ptr); become: void my_callback(struct timer_list t) { struct something ptr = from_timer(ptr, t, my_timer); ... } ... timer_setup(&ptr->my_timer, my_callback, 0); Direct function assignments: void my_callback(unsigned long data) { struct something ptr = (struct something )data; ... } ... ptr->my_timer.function = my_callback; have a temporary cast added, along with converting the args: void my_callback(struct timer_list t) { struct something ptr = from_timer(ptr, t, my_timer); ... } ... ptr->my_timer.function = (TIMER_FUNC_TYPE)my_callback; And finally, callbacks without a data assignment: void my_callback(unsigned long data) { ... } ... setup_timer(&ptr->my_timer, my_callback, 0); have their argument renamed to verify they're unused during conversion: void my_callback(struct timer_list unused) { ... } ... timer_setup(&ptr->my_timer, my_callback, 0); The conversion is done with the following Coccinelle script: spatch --very-quiet --all-includes --include-headers \ -I ./arch/x86/include -I ./arch/x86/include/generated \ -I ./include -I ./arch/x86/include/uapi \ -I ./arch/x86/include/generated/uapi -I ./include/uapi \ -I ./include/generated/uapi --include ./include/linux/kconfig.h \ --dir . \ --cocci-file ~/src/data/timer_setup.cocci @fix_address_of@ expression e; @@ setup_timer( -&(e) +&e , ...) // Update any raw setup_timer() usages that have a NULL callback, but // would otherwise match change_timer_function_usage, since the latter // will update all function assignments done in the face of a NULL // function initialization in setup_timer(). @change_timer_function_usage_NULL@ expression _E; identifier _timer; type _cast_data; @@ ( -setup_timer(&_E->_timer, NULL, _E); +timer_setup(&_E->_timer, NULL, 0); \| -setup_timer(&_E->_timer, NULL, (_cast_data)_E); +timer_setup(&_E->_timer, NULL, 0); \| -setup_timer(&_E._timer, NULL, &_E); +timer_setup(&_E._timer, NULL, 0); \| -setup_timer(&_E._timer, NULL, (_cast_data)&_E); +timer_setup(&_E._timer, NULL, 0); ) @change_timer_function_usage@ expression _E; identifier _timer; struct timer_list _stl; identifier _callback; type _cast_func, _cast_data; @@ ( -setup_timer(&_E->_timer, _callback, _E); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, &_callback, _E); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, _callback, (_cast_data)_E); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, &_callback, (_cast_data)_E); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, (_cast_func)_callback, _E); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, (_cast_func)&_callback, _E); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, (_cast_func)_callback, (_cast_data)_E); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, (_cast_func)&_callback, (_cast_data)_E); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E._timer, _callback, (_cast_data)_E); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, _callback, (_cast_data)&_E); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, &_callback, (_cast_data)_E); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, &_callback, (_cast_data)&_E); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)_E); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)&_E); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)_E); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)&_E); +timer_setup(&_E._timer, _callback, 0); \| _E->_timer@_stl.function = _callback; \| _E->_timer@_stl.function = &_callback; \| _E->_timer@_stl.function = (_cast_func)_callback; \| _E->_timer@_stl.function = (_cast_func)&_callback; \| _E._timer@_stl.function = _callback; \| _E._timer@_stl.function = &_callback; \| _E._timer@_stl.function = (_cast_func)_callback; \| _E._timer@_stl.function = (_cast_func)&_callback; ) // callback(unsigned long arg) @change_callback_handle_cast depends on change_timer_function_usage@ identifier change_timer_function_usage._callback; identifier change_timer_function_usage._timer; type _origtype; identifier _origarg; type _handletype; identifier _handle; @@ void _callback( -_origtype _origarg +struct timer_list t ) { ( ... when != _origarg _handletype _handle = -(_handletype )_origarg; +from_timer(_handle, t, _timer); ... when != _origarg \| ... when != _origarg _handletype _handle = -(void )_origarg; +from_timer(_handle, t, _timer); ... when != _origarg \| ... when != _origarg _handletype _handle; ... when != _handle _handle = -(_handletype )_origarg; +from_timer(_handle, t, _timer); ... when != _origarg \| ... when != _origarg _handletype _handle; ... when != _handle _handle = -(void )_origarg; +from_timer(_handle, t, _timer); ... when != _origarg ) } // callback(unsigned long arg) without existing variable @change_callback_handle_cast_no_arg depends on change_timer_function_usage && !change_callback_handle_cast@ identifier change_timer_function_usage._callback; identifier change_timer_function_usage._timer; type _origtype; identifier _origarg; type _handletype; @@ void _callback( -_origtype _origarg +struct timer_list t ) { + _handletype _origarg = from_timer(_origarg, t, _timer); + ... when != _origarg - (_handletype )_origarg + _origarg ... when != _origarg } // Avoid already converted callbacks. @match_callback_converted depends on change_timer_function_usage && !change_callback_handle_cast && !change_callback_handle_cast_no_arg@ identifier change_timer_function_usage._callback; identifier t; @@ void _callback(struct timer_list t) { ... } // callback(struct something handle) @change_callback_handle_arg depends on change_timer_function_usage && !match_callback_converted && !change_callback_handle_cast && !change_callback_handle_cast_no_arg@ identifier change_timer_function_usage._callback; identifier change_timer_function_usage._timer; type _handletype; identifier _handle; @@ void _callback( -_handletype _handle +struct timer_list t ) { + _handletype _handle = from_timer(_handle, t, _timer); ... } // If change_callback_handle_arg ran on an empty function, remove // the added handler. @unchange_callback_handle_arg depends on change_timer_function_usage && change_callback_handle_arg@ identifier change_timer_function_usage._callback; identifier change_timer_function_usage._timer; type _handletype; identifier _handle; identifier t; @@ void _callback(struct timer_list t) { - _handletype _handle = from_timer(_handle, t, _timer); } // We only want to refactor the setup_timer() data argument if we've found // the matching callback. This undoes changes in change_timer_function_usage. @unchange_timer_function_usage depends on change_timer_function_usage && !change_callback_handle_cast && !change_callback_handle_cast_no_arg && !change_callback_handle_arg@ expression change_timer_function_usage._E; identifier change_timer_function_usage._timer; identifier change_timer_function_usage._callback; type change_timer_function_usage._cast_data; @@ ( -timer_setup(&_E->_timer, _callback, 0); +setup_timer(&_E->_timer, _callback, (_cast_data)_E); \| -timer_setup(&_E._timer, _callback, 0); +setup_timer(&_E._timer, _callback, (_cast_data)&_E); ) // If we fixed a callback from a .function assignment, fix the // assignment cast now. @change_timer_function_assignment depends on change_timer_function_usage && (change_callback_handle_cast \|\| change_callback_handle_cast_no_arg \|\| change_callback_handle_arg)@ expression change_timer_function_usage._E; identifier change_timer_function_usage._timer; identifier change_timer_function_usage._callback; type _cast_func; typedef TIMER_FUNC_TYPE; @@ ( _E->_timer.function = -_callback +(TIMER_FUNC_TYPE)_callback ; \| _E->_timer.function = -&_callback +(TIMER_FUNC_TYPE)_callback ; \| _E->_timer.function = -(_cast_func)_callback; +(TIMER_FUNC_TYPE)_callback ; \| _E->_timer.function = -(_cast_func)&_callback +(TIMER_FUNC_TYPE)_callback ; \| _E._timer.function = -_callback +(TIMER_FUNC_TYPE)_callback ; \| _E._timer.function = -&_callback; +(TIMER_FUNC_TYPE)_callback ; \| _E._timer.function = -(_cast_func)_callback +(TIMER_FUNC_TYPE)_callback ; \| _E._timer.function = -(_cast_func)&_callback +(TIMER_FUNC_TYPE)_callback ; ) // Sometimes timer functions are called directly. Replace matched args. @change_timer_function_calls depends on change_timer_function_usage && (change_callback_handle_cast \|\| change_callback_handle_cast_no_arg \|\| change_callback_handle_arg)@ expression _E; identifier change_timer_function_usage._timer; identifier change_timer_function_usage._callback; type _cast_data; @@ _callback( ( -(_cast_data)_E +&_E->_timer \| -(_cast_data)&_E +&_E._timer \| -_E +&_E->_timer ) ) // If a timer has been configured without a data argument, it can be // converted without regard to the callback argument, since it is unused. @match_timer_function_unused_data@ expression _E; identifier _timer; identifier _callback; @@ ( -setup_timer(&_E->_timer, _callback, 0); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, _callback, 0L); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, _callback, 0UL); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E._timer, _callback, 0); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, _callback, 0L); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, _callback, 0UL); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_timer, _callback, 0); +timer_setup(&_timer, _callback, 0); \| -setup_timer(&_timer, _callback, 0L); +timer_setup(&_timer, _callback, 0); \| -setup_timer(&_timer, _callback, 0UL); +timer_setup(&_timer, _callback, 0); \| -setup_timer(_timer, _callback, 0); +timer_setup(_timer, _callback, 0); \| -setup_timer(_timer, _callback, 0L); +timer_setup(_timer, _callback, 0); \| -setup_timer(_timer, _callback, 0UL); +timer_setup(_timer, _callback, 0); ) @change_callback_unused_data depends on match_timer_function_unused_data@ identifier match_timer_function_unused_data._callback; type _origtype; identifier _origarg; @@ void _callback( -_origtype _origarg +struct timer_list unused ) { ... when != _origarg } Signed-off-by: Kees Cook <keescook@chromium.org>	2017-11-21 15:57:07 -08:00
Robin Murphy	29a90b7089	iommu/vt-d: Fix scatterlist offset handling The intel-iommu DMA ops fail to correctly handle scatterlists where sg->offset is greater than PAGE_SIZE - the IOVA allocation is computed appropriately based on the page-aligned portion of the offset, but the mapping is set up relative to sg->page, which means it fails to actually cover the whole buffer (and in the worst case doesn't cover it at all): (sg->dma_address + sg->dma_len) ----+ sg->dma_address ---------+ \| iov_pfn------+ \| \| \| \| \| v v v iova: a b c d e f \|--------\|--------\|--------\|--------\|--------\| <...calculated....> [_____mapped______] pfn: 0 1 2 3 4 5 \|--------\|--------\|--------\|--------\|--------\| ^ ^ ^ \| \| \| sg->page ----+ \| \| sg->offset --------------+ \| (sg->offset + sg->length) ----------+ As a result, the caller ends up overrunning the mapping into whatever lies beyond, which usually goes badly: [ 429.645492] DMAR: DRHD: handling fault status reg 2 [ 429.650847] DMAR: [DMA Write] Request device [02:00.4] fault addr f2682000 ... Whilst this is a fairly rare occurrence, it can happen from the result of intermediate scatterlist processing such as scatterwalk_ffwd() in the crypto layer. Whilst that particular site could be fixed up, it still seems worthwhile to bring intel-iommu in line with other DMA API implementations in handling this robustly. To that end, fix the intel_map_sg() path to line up the mapping correctly (in units of MM pages rather than VT-d pages to match the aligned_nrpages() calculation) regardless of the offset, and use sg_phys() consistently for clarity. Reported-by: Harsh Jain <Harsh@chelsio.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed by: Ashok Raj <ashok.raj@intel.com> Tested by: Jacob Pan <jacob.jun.pan@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-17 10:28:58 -07:00
Linus Torvalds	2cd83ba5be	IOMMU Updates for Linux v4.15 * Enforce MSI multiple IRQ alignment in AMD IOMMU * VT-d PASID error handling fixes * Add r8a7795 IPMMU support * Manage runtime PM links on exynos at {add,remove}_device callbacks * Fix Mediatek driver name to avoid conflict * Add terminate support to qcom fault handler * 64-bit IOVA optimizations * Simplfy IOVA domain destruction, better use of rcache, and skip anchor nodes on copy * Convert to IOMMU TLB sync API in io-pgtable-arm{-v7s} * Drop command queue lock when waiting for CMD_SYNC completion on ARM SMMU implementations supporting MSI to cacheable memory * iomu-vmsa cleanup inspired by missed IOTLB sync callbacks * Fix sleeping lock with preemption disabled for RT * Dual MMU support for TI DRA7xx DSPs * Optional flush option on IOVA allocation avoiding overhead when caller can try other options -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iQIcBAABAgAGBQJaCgbBAAoJECObm247sIsiUp0P/16GwExsZFKqD0B7XBiBq+hZ Q03XiPEnfnMBt05bw4dH8lnfpDPR1+6I3OzrxByc3VurfsBZ6IUh6oXPREcwdkNE +iyS7/1RlXd7EJGsaY91fiL1Gm7iPQS/MJ50GcL288gdF68cAHygDoLPR0r1bHBC PzjLhmgSYcqYQwRgnoclUe4/sNEIWWu3Z0trltNmiQ9eFEXW4WCc1/OzCoAmfxY7 EPiKynP+vCBTvzZJ/7m4oBpfciO/WuF8KhnNDjmL/lKGAQGGChiUyQq77pkdaUzm 6MjGfLAzO1xJ3DYM1ozDGBnVXR3AlDE2f+MlveT9bwVHkFLB3WF5atb70Cx70Zju Lp2ZC2sm42AlHN0WqLGU9Q1pGPhZwXGkoF5kAxcsSh1rWC+bjfjtelpIg9PLngMi rBAgRLLveGMM3jQcWv4259dZkByJOyFurQwYzHToacj+bE0hGzLv5cFWS/AD8kIE rdxFBdNWlMARDWoJO+jYOzEr/QTDXovLgWsNvS2C2QDfmdXL1s0yqjdU9ahC3wWJ pi//PydoXoyuEbN3M04ccjDNCZLX+JYIvvaZAxF5gGVt99VCKhf2Rw2wtLPviPNN K+8NmUp1xAZX7Ak47HDNHIS3ZVB6CgP5pmoBvqZW1k5yyf5kvbvi2axQta/cyr6a 4ey4GeRpF+iad1n1U/WX =o4qF -----END PGP SIGNATURE----- Merge tag 'iommu-v4.15-rc1' of git://github.com/awilliam/linux-vfio Pull IOMMU updates from Alex Williamson: "As Joerg mentioned[1], he's out on paternity leave through the end of the year and I'm filling in for him in the interim: - Enforce MSI multiple IRQ alignment in AMD IOMMU - VT-d PASID error handling fixes - Add r8a7795 IPMMU support - Manage runtime PM links on exynos at {add,remove}_device callbacks - Fix Mediatek driver name to avoid conflict - Add terminate support to qcom fault handler - 64-bit IOVA optimizations - Simplfy IOVA domain destruction, better use of rcache, and skip anchor nodes on copy - Convert to IOMMU TLB sync API in io-pgtable-arm{-v7s} - Drop command queue lock when waiting for CMD_SYNC completion on ARM SMMU implementations supporting MSI to cacheable memory - iomu-vmsa cleanup inspired by missed IOTLB sync callbacks - Fix sleeping lock with preemption disabled for RT - Dual MMU support for TI DRA7xx DSPs - Optional flush option on IOVA allocation avoiding overhead when caller can try other options [1] https://lkml.org/lkml/2017/10/22/72" * tag 'iommu-v4.15-rc1' of git://github.com/awilliam/linux-vfio: (54 commits) iommu/iova: Use raw_cpu_ptr() instead of get_cpu_ptr() for ->fq iommu/mediatek: Fix driver name iommu/ipmmu-vmsa: Hook up r8a7795 DT matching code iommu/ipmmu-vmsa: Allow two bit SL0 iommu/ipmmu-vmsa: Make IMBUSCTR setup optional iommu/ipmmu-vmsa: Write IMCTR twice iommu/ipmmu-vmsa: IPMMU device is 40-bit bus master iommu/ipmmu-vmsa: Make use of IOMMU_OF_DECLARE() iommu/ipmmu-vmsa: Enable multi context support iommu/ipmmu-vmsa: Add optional root device feature iommu/ipmmu-vmsa: Introduce features, break out alias iommu/ipmmu-vmsa: Unify ipmmu_ops iommu/ipmmu-vmsa: Clean up struct ipmmu_vmsa_iommu_priv iommu/ipmmu-vmsa: Simplify group allocation iommu/ipmmu-vmsa: Unify domain alloc/free iommu/ipmmu-vmsa: Fix return value check in ipmmu_find_group_dma() iommu/vt-d: Clear pasid table entry when memory unbound iommu/vt-d: Clear Page Request Overflow fault bit iommu/vt-d: Missing checks for pasid tables if allocation fails iommu/amd: Limit the IOVA page range to the specified addresses ...	2017-11-14 16:43:27 -08:00
Alex Williamson	56f19441da	Merge branches 'iommu/arm/smmu', 'iommu/updates', 'iommu/vt-d', 'iommu/ipmmu-vmsa' and 'iommu/iova' into iommu-next-20171113.0	2017-11-13 12:40:51 -07:00
Ingo Molnar	141d3b1daa	Merge branch 'linus' into x86/apic, to resolve conflicts Conflicts: arch/x86/include/asm/x2apic.h Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-11-07 10:51:10 +01:00
Sebastian Andrzej Siewior	94e2cc4dba	iommu/iova: Use raw_cpu_ptr() instead of get_cpu_ptr() for ->fq get_cpu_ptr() disabled preemption and returns the ->fq object of the current CPU. raw_cpu_ptr() does the same except that it not disable preemption which means the scheduler can move it to another CPU after it obtained the per-CPU object. In this case this is not bad because the data structure itself is protected with a spin_lock. This change shouldn't matter however on RT it does because the sleeping lock can't be accessed with disabled preemption. Cc: Joerg Roedel <joro@8bytes.org> Cc: iommu@lists.linux-foundation.org Reported-by: vinadhy@gmail.com Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-06 11:24:35 -07:00
Matthias Brugger	395df08d2e	iommu/mediatek: Fix driver name There exist two Mediatek iommu drivers for the two different generations of the device. But both drivers have the same name "mtk-iommu". This breaks the registration of the second driver: Error: Driver 'mtk-iommu' is already registered, aborting... Fix this by changing the name for first generation to "mtk-iommu-v1". Fixes: `b17336c55d` ("iommu/mediatek: add support for mtk iommu generation one HW") Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-06 10:40:53 -07:00
Magnus Damm	58b8e8bf40	iommu/ipmmu-vmsa: Hook up r8a7795 DT matching code Tie in r8a7795 features and update the IOMMU_OF_DECLARE compat string to include the updated compat string. Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-06 10:29:39 -07:00
Magnus Damm	c295f504fb	iommu/ipmmu-vmsa: Allow two bit SL0 Introduce support for two bit SL0 bitfield in IMTTBCR by using a separate feature flag. Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-06 10:29:39 -07:00
Magnus Damm	f5c858912a	iommu/ipmmu-vmsa: Make IMBUSCTR setup optional Introduce a feature to allow opt-out of setting up IMBUSCR. The default case is unchanged. Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-06 10:29:39 -07:00
Magnus Damm	d574893aee	iommu/ipmmu-vmsa: Write IMCTR twice Write IMCTR both in the root device and the leaf node. To allow access of IMCTR introduce the following function: - ipmmu_ctx_write_all() While at it also rename context functions: - ipmmu_ctx_read() -> ipmmu_ctx_read_root() - ipmmu_ctx_write() -> ipmmu_ctx_write_root() Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-06 10:29:39 -07:00
Magnus Damm	1c894225bf	iommu/ipmmu-vmsa: IPMMU device is 40-bit bus master The r8a7795 IPMMU supports 40-bit bus mastering. Both the coherent DMA mask and the streaming DMA mask are set to unlock the 40-bit address space for coherent allocations and streaming operations. Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-06 10:29:39 -07:00
Magnus Damm	cda52fcd99	iommu/ipmmu-vmsa: Make use of IOMMU_OF_DECLARE() Hook up IOMMU_OF_DECLARE() support in case CONFIG_IOMMU_DMA is enabled. The only current supported case for 32-bit ARM is disabled, however for 64-bit ARM usage of OF is required. Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-06 10:29:39 -07:00
Magnus Damm	5fd163416f	iommu/ipmmu-vmsa: Enable multi context support Add support for up to 8 contexts. Each context is mapped to one domain. One domain is assigned one or more slave devices. Contexts are allocated dynamically and slave devices are grouped together based on which IPMMU device they are connected to. This makes slave devices tied to the same IPMMU device share the same IOVA space. Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-06 10:29:39 -07:00
Magnus Damm	fd5140e29a	iommu/ipmmu-vmsa: Add optional root device feature Add root device handling to the IPMMU driver by allowing certain DT compat strings to enable has_cache_leaf_nodes that in turn will support both root devices with interrupts and leaf devices that face the actual IPMMU consumer devices. Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-06 10:29:39 -07:00
Magnus Damm	33f3ac9b51	iommu/ipmmu-vmsa: Introduce features, break out alias Introduce struct ipmmu_features to track various hardware and software implementation changes inside the driver for different kinds of IPMMU hardware. Add use_ns_alias_offset as a first example of a feature to control if the secure register bank offset should be used or not. Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-06 10:29:39 -07:00
Robin Murphy	49c875f030	iommu/ipmmu-vmsa: Unify ipmmu_ops The remaining difference between the ARM-specific and iommu-dma ops is in the {add,remove}_device implementations, but even those have some overlap and duplication. By stubbing out the few arm_iommu_*() calls, we can get rid of the rest of the inline #ifdeffery to both simplify the code and improve build coverage. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-06 10:29:39 -07:00
Robin Murphy	e4efe4a9a2	iommu/ipmmu-vmsa: Clean up struct ipmmu_vmsa_iommu_priv Now that the IPMMU instance pointer is the only thing remaining in the private data structure, we no longer need the extra level of indirection and can simply stash that directlty in the fwspec. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-06 10:29:39 -07:00
Robin Murphy	b354c73edc	iommu/ipmmu-vmsa: Simplify group allocation We go through quite the merry dance in order to find masters behind the same IPMMU instance, so that we can ensure they are grouped together. None of which is really necessary, since the master's private data already points to the particular IPMMU it is associated with, and that IPMMU instance data is the perfect place to keep track of a per-instance group directly. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-06 10:27:10 -07:00
Robin Murphy	1c7e7c0278	iommu/ipmmu-vmsa: Unify domain alloc/free We have two implementations for ipmmu_ops->alloc depending on CONFIG_IOMMU_DMA, the difference being whether they accept the IOMMU_DOMAIN_DMA type or not. However, iommu_dma_get_cookie() is guaranteed to return an error when !CONFIG_IOMMU_DMA, so if ipmmu_domain_alloc_dma() was actually checking and handling the return value correctly, it would behave the same as ipmmu_domain_alloc() anyway. Similarly for freeing; iommu_put_dma_cookie() is robust by design. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-06 10:27:06 -07:00
weiyongjun (A)	105a004e21	iommu/ipmmu-vmsa: Fix return value check in ipmmu_find_group_dma() In case of error, the function iommu_group_get() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test. Fixes: `3ae4729202` ("iommu/ipmmu-vmsa: Add new IOMMU_DOMAIN_DMA ops") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-06 10:26:26 -07:00
Lu Baolu	4fa064b26c	iommu/vt-d: Clear pasid table entry when memory unbound In intel_svm_unbind_mm(), pasid table entry must be cleared during svm free. Otherwise, hardware may be set up with a wild pointer. Suggested-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-03 10:51:34 -06:00
Lu Baolu	973b546451	iommu/vt-d: Clear Page Request Overflow fault bit Currently Page Request Overflow bit in IOMMU Fault Status register is not cleared. Not clearing this bit would mean that any future page-request is going to be automatically dropped by IOMMU. Suggested-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-03 10:51:33 -06:00
Lu Baolu	2e2e35d512	iommu/vt-d: Missing checks for pasid tables if allocation fails intel_svm_alloc_pasid_tables() might return an error but never be checked by the callers. Later when intel_svm_bind_mm() is called, there are no checks for valid pasid tables before enabling them. Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Liu, Yi L <yi.l.liu@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-03 10:51:32 -06:00
Gary R Hook	b92b4fb5c1	iommu/amd: Limit the IOVA page range to the specified addresses The extent of pages specified when applying a reserved region should include up to the last page of the range, but not the page following the range. Signed-off-by: Gary R Hook <gary.hook@amd.com> Fixes: `8d54d6c8b8` ('iommu/amd: Implement apply_dm_region call-back') Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-03 10:50:34 -06:00
Rob Clark	049541e178	iommu: qcom: wire up fault handler This is quite useful for debugging. Currently, always TERMINATE the translation when the fault handler returns (since this is all we need for debugging drivers). But I expect the SVM work should eventually let us do something more clever. Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-03 10:50:33 -06:00
Colin Ian King	2c40367cbf	iommu/amd: remove unused variable flush_addr Variable flush_addr is being assigned but is never read; it is redundant and can be removed. Cleans up the clang warning: drivers/iommu/amd_iommu.c:2388:2: warning: Value stored to 'flush_addr' is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-03 10:50:32 -06:00
Alex Williamson	07d1c91b6c	iommu/amd: Fix alloc_irq_index() increment On an is_allocated() interrupt index, we ALIGN() the current index and then increment it via the for loop, guaranteeing that it is no longer aligned for alignments >1. We instead need to align the next index, to guarantee forward progress, moving the increment-only to the case where the index was found to be unallocated. Fixes: `37946d95fc` ('iommu/amd: Add align parameter to alloc_irq_index()') Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-03 10:50:31 -06:00
Greg Kroah-Hartman	b24413180f	License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a /uapi/ one with no licensing information in it, - file was a /uapi/ one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non /uapi/ files that summary was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a /uapi/ path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the /uapi/ ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------\|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-11-02 11:10:55 +01:00
Robin Murphy	8ff0f72371	iommu/arm-smmu-v3: Use burst-polling for sync completion While CMD_SYNC is unlikely to complete immediately such that we never go round the polling loop, with a lightly-loaded queue it may still do so long before the delay period is up. If we have no better completion notifier, use similar logic as we have for SMMUv2 to spin a number of times before each backoff, so that we have more chance of catching syncs which complete relatively quickly and avoid delaying unnecessarily. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:55:10 +01:00
Will Deacon	a529ea19aa	iommu/arm-smmu-v3: Consolidate identical timeouts We have separate (identical) timeout values for polling for a queue to drain and waiting for an MSI to signal CMD_SYNC completion. In reality, we only wait for the command queue to drain if we're waiting on a sync, so just merged these two timeouts into a single constant. Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:55:10 +01:00
Will Deacon	49806599c3	iommu/arm-smmu-v3: Split arm_smmu_cmdq_issue_sync in half arm_smmu_cmdq_issue_sync is a little unwieldy now that it supports both MSI and event-based polling, so split it into two functions to make things easier to follow. Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:55:09 +01:00
Robin Murphy	37de98f8f1	iommu/arm-smmu-v3: Use CMD_SYNC completion MSI As an IRQ, the CMD_SYNC interrupt is not particularly useful, not least because we often need to wait for sync completion within someone else's IRQ handler anyway. However, when the SMMU is both coherent and supports MSIs, we can have a lot more fun by not using it as an interrupt at all. Following the example suggested in the architecture and using a write targeting normal memory, we can let callers wait on a status variable outside the lock instead of having to stall the entire queue or even touch MMIO registers. Since multiple sync commands are guaranteed to complete in order, a simple incrementing sequence count is all we need to unambiguously support any realistic number of overlapping waiters. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:55:08 +01:00
Robin Murphy	dce032a15c	iommu/arm-smmu-v3: Forget about cmdq-sync interrupt The cmdq-sync interrupt is never going to be particularly useful, since for stage 1 DMA at least we'll often need to wait for sync completion within someone else's IRQ handler, thus have to implement polling anyway. Beyond that, the overhead of taking an interrupt, then still having to grovel around in the queue to figure out which sync command completed, doesn't seem much more attractive than simple polling either. Furthermore, if an implementation both has wired interrupts and supports MSIs, then we don't want to be taking the IRQ unnecessarily if we're using the MSI write to update memory. Let's just make life simpler by not even bothering to claim it in the first place. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:55:07 +01:00
Robin Murphy	2f657add07	iommu/arm-smmu-v3: Specialise CMD_SYNC handling CMD_SYNC already has a bit of special treatment here and there, but as we're about to extend it with more functionality for completing outside the CMDQ lock, things are going to get rather messy if we keep trying to cram everything into a single generic command interface. Instead, let's break out the issuing of CMD_SYNC into its own specific helper where upcoming changes will have room to breathe. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:55:06 +01:00
Robin Murphy	2a22baa2d1	iommu/arm-smmu-v3: Correct COHACC override message Slightly confusingly, when reporting a mismatch of the ID register value, we still refer to the IORT COHACC override flag as the "dma-coherent property" if we booted with ACPI. Update the message to be firmware-agnostic in line with SMMUv2. Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reported-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:55:05 +01:00
Yisheng Xie	9cff86fd2b	iommu/arm-smmu-v3: Avoid ILLEGAL setting of STE.S1STALLD and CD.S According to Spec, it is ILLEGAL to set STE.S1STALLD if STALL_MODEL is not 0b00, which means we should not disable stall mode if stall or terminate mode is not configuable. Meanwhile, it is also ILLEGAL when STALL_MODEL==0b10 && CD.S==0 which means if stall mode is force we should always set CD.S. As Jean-Philippe's suggestion, this patch introduce a feature bit ARM_SMMU_FEAT_STALL_FORCE, which means smmu only supports stall force. Therefore, we can avoid the ILLEGAL setting of STE.S1STALLD.by checking ARM_SMMU_FEAT_STALL_FORCE. This patch keeps the ARM_SMMU_FEAT_STALLS as the meaning of stall supported (force or configuable) to easy to expand the future function, i.e. we can only use ARM_SMMU_FEAT_STALLS to check whether we should register fault handle or enable master can_stall, etc to supporte platform SVM. The feature bit, STE.S1STALLD and CD.S setting will be like: STALL_MODEL FEATURE S1STALLD CD.S 0b00 ARM_SMMU_FEAT_STALLS 0b1 0b0 0b01 !ARM_SMMU_FEAT_STALLS && !ARM_SMMU_FEAT_STALL_FORCE 0b0 0b0 0b10 ARM_SMMU_FEAT_STALLS && ARM_SMMU_FEAT_STALL_FORCE 0b0 0b1 after apply this patch. Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:55:04 +01:00
Feng Kan	74f55d3441	iommu/arm-smmu: Enable bypass transaction caching for ARM SMMU 500 The ARM SMMU identity mapping performance was poor compared with the DMA mode. It was found that enable caching would restore the performance back to normal. The S2CRB_TLBEN bit in the ACR register would allow for caching of the stream to context register bypass transaction information. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Feng Kan <fkan@apm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:54:54 +01:00
Will Deacon	704c038255	iommu/arm-smmu-v3: Ensure we sync STE when only changing config field The SMMUv3 architecture permits caching of data structures deemed to be "reachable" by the SMU, which includes STEs marked as invalid. When transitioning an STE to a bypass/fault configuration at init or detach time, we mistakenly elide the CMDQ_OP_CFGI_STE operation in some cases, therefore potentially leaving the old STE state cached in the SMMU. This patch fixes the problem by ensuring that we perform the CMDQ_OP_CFGI_STE operation irrespective of the validity of the previous STE. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reported-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:54:54 +01:00
Robin Murphy	6948d4a7e1	iommu/arm-smmu: Remove ACPICA workarounds Now that the kernel headers have synced with the relevant upstream ACPICA updates, it's time to clean up the temporary local definitions. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:54:53 +01:00
Joerg Roedel	a593472591	Merge branches 'iommu/fixes', 'arm/omap', 'arm/exynos', 'x86/amd', 'x86/vt-d' and 'core' into next	2017-10-13 17:32:24 +02:00
Joerg Roedel	ce76353f16	iommu/amd: Finish TLB flush in amd_iommu_unmap() The function only sends the flush command to the IOMMU(s), but does not wait for its completion when it returns. Fix that. Fixes: `601367d76b` ('x86/amd-iommu: Remove iommu_flush_domain function') Cc: stable@vger.kernel.org # >= 2.6.33 Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-13 17:32:19 +02:00
Tomasz Nowicki	538d5b3332	iommu/iova: Make rcache flush optional on IOVA allocation failure Since IOVA allocation failure is not unusual case we need to flush CPUs' rcache in hope we will succeed in next round. However, it is useful to decide whether we need rcache flush step because of two reasons: - Not scalability. On large system with ~100 CPUs iterating and flushing rcache for each CPU becomes serious bottleneck so we may want to defer it. - free_cpu_cached_iovas() does not care about max PFN we are interested in. Thus we may flush our rcaches and still get no new IOVA like in the commonly used scenario: if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev)) iova = alloc_iova_fast(iovad, iova_len, DMA_BIT_MASK(32) >> shift); if (!iova) iova = alloc_iova_fast(iovad, iova_len, dma_limit >> shift); 1. First alloc_iova_fast() call is limited to DMA_BIT_MASK(32) to get PCI devices a SAC address 2. alloc_iova() fails due to full 32-bit space 3. rcaches contain PFNs out of 32-bit space so free_cpu_cached_iovas() throws entries away for nothing and alloc_iova() fails again 4. Next alloc_iova_fast() call cannot take advantage of rcache since we have just defeated caches. In this case we pick the slowest option to proceed. This patch reworks flushed_rcache local flag to be additional function argument instead and control rcache flush step. Also, it updates all users to do the flush as the last chance. Signed-off-by: Tomasz Nowicki <Tomasz.Nowicki@caviumnetworks.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-12 14:18:02 +02:00
Thomas Gleixner	331b57d148	Merge branch 'irq/urgent' into x86/apic Pick up core changes which affect the vector rework.	2017-10-12 11:02:50 +02:00
Marek Szyprowski	9d25e3cc83	iommu/exynos: Remove initconst attribute to avoid potential kernel oops Exynos SYSMMU registers standard platform device with sysmmu_of_match table, what means that this table is accessed every time a new platform device is registered in a system. This might happen also after the boot, so the table must not be attributed as initconst to avoid potential kernel oops caused by access to freed memory. Fixes: `6b21a5db36` ("iommu/exynos: Support for device tree") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-12 10:25:11 +02:00
Tom Lendacky	aba2d9a638	iommu/amd: Do not disable SWIOTLB if SME is active When SME memory encryption is active it will rely on SWIOTLB to handle DMA for devices that cannot support the addressing requirements of having the encryption mask set in the physical address. The IOMMU currently disables SWIOTLB if it is not running in passthrough mode. This is not desired as non-PCI devices attempting DMA may fail. Update the code to check if SME is active and not disable SWIOTLB. Fixes: `2543a786aa` ("iommu/amd: Allow the AMD IOMMU to work with memory encryption") Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-10 19:49:30 +02:00
Joerg Roedel	53b9ec3fbb	iommu/amd: Enforce alignment for MSI IRQs Make use of the new alignment capability of alloc_irq_index() to enforce IRQ index alignment for MSI. Reported-by: Thomas Gleixner <tglx@linutronix.de> Fixes: `2b32450634` ('iommu/amd: Add routines to manage irq remapping tables') Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-10 16:04:20 +02:00
Joerg Roedel	37946d95fc	iommu/amd: Add align parameter to alloc_irq_index() For multi-MSI IRQ ranges the IRQ index needs to be aligned to the power-of-two of the requested IRQ count. Extend the alloc_irq_index() function to allow such an allocation. Reported-by: Thomas Gleixner <tglx@linutronix.de> Fixes: `2b32450634` ('iommu/amd: Add routines to manage irq remapping tables') Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-10 16:03:43 +02:00
Christos Gkekas	b117e03805	iommu/vt-d: Delete unnecessary check in domain_context_mapping_one() Variable did_old is unsigned so checking whether it is greater or equal to zero is not necessary. Signed-off-by: Christos Gkekas <chris.gekas@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-10 14:09:50 +02:00
Joerg Roedel	ec154bf56b	iommu/vt-d: Don't register bus-notifier under dmar_global_lock The notifier function will take the dmar_global_lock too, so lockdep complains about inverse locking order when the notifier is registered under the dmar_global_lock. Reported-by: Jan Kiszka <jan.kiszka@siemens.com> Fixes: `59ce0515cd` ('iommu/vt-d: Update DRHD/RMRR/ATSR device scope caches when PCI hotplug happens') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-06 15:09:30 +02:00
Robin Murphy	4d689b6194	iommu/io-pgtable-arm-v7s: Convert to IOMMU API TLB sync Now that the core API issues its own post-unmap TLB sync call, push that operation out from the io-pgtable-arm-v7s internals into the users. For now, we leave the invalidation implicit in the unmap operation, since none of the current users would benefit much from any change to that. Note that the conversion of msm_iommu is implicit, since that apparently has no specific TLB sync operation anyway. CC: Yong Wu <yong.wu@mediatek.com> CC: Rob Clark <robdclark@gmail.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-02 15:45:25 +02:00
Robin Murphy	32b124492b	iommu/io-pgtable-arm: Convert to IOMMU API TLB sync Now that the core API issues its own post-unmap TLB sync call, push that operation out from the io-pgtable-arm internals into the users. For now, we leave the invalidation implicit in the unmap operation, since none of the current users would benefit much from any change to that. CC: Magnus Damm <damm+renesas@opensource.se> CC: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-02 15:45:25 +02:00
Robin Murphy	abbb8a0938	iommu/iova: Don't try to copy anchor nodes Anchor nodes are not reserved IOVAs in the way that copy_reserved_iova() cares about - while the failure from reserve_iova() is benign since the target domain will already have its own anchor, we still don't want to be triggering spurious warnings. Reported-by: kernel test robot <fengguang.wu@intel.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Fixes: `bb68b2fbfb` ('iommu/iova: Add rbtree anchor node') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-02 15:45:24 +02:00
Robin Murphy	e8b1984027	iommu/iova: Try harder to allocate from rcache magazine When devices with different DMA masks are using the same domain, or for PCI devices where we usually try a speculative 32-bit allocation first, there is a fair possibility that the top PFN of the rcache stack at any given time may be unsuitable for the lower limit, prompting a fallback to allocating anew from the rbtree. Consequently, we may end up artifically increasing pressure on the 32-bit IOVA space as unused IOVAs accumulate lower down in the rcache stacks, while callers with 32-bit masks also impose unnecessary rbtree overhead. In such cases, let's try a bit harder to satisfy the allocation locally first - scanning the whole stack should still be relatively inexpensive. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-28 14:57:16 +02:00
Robin Murphy	b826ee9a4f	iommu/iova: Make rcache limit_pfn handling more robust When popping a pfn from an rcache, we are currently checking it directly against limit_pfn for viability. Since this represents iova->pfn_lo, it is technically possible for the corresponding iova->pfn_hi to be greater than limit_pfn. Although we generally get away with it in practice since limit_pfn is typically a power-of-two boundary and the IOVAs are size-aligned, it's pretty trivial to make the iova_rcache_get() path take the allocation size into account for complete safety. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-28 14:57:15 +02:00
Robin Murphy	7595dc588a	iommu/iova: Simplify domain destruction All put_iova_domain() should have to worry about is freeing memory - by that point the domain must no longer be live, so the act of cleaning up doesn't need to be concurrency-safe or maintain the rbtree in a self-consistent state. There's no need to waste time with locking or emptying the rcache magazines, and we can just use the postorder traversal helper to clear out the remaining rbtree entries in-place. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-28 14:57:15 +02:00
Robin Murphy	973f5fbedb	iommu/iova: Simplify cached node logic The logic of __get_cached_rbnode() is a little obtuse, but then __get_prev_node_of_cached_rbnode_or_last_node_and_update_limit_pfn() wouldn't exactly roll off the tongue... Now that we have the invariant that there is always a valid node to start searching downwards from, everything gets a bit easier to follow if we simplify that function to do what it says on the tin and return the cached node (or anchor node as appropriate) directly. In turn, we can then deduplicate the rb_prev() and limit_pfn logic into the main loop itself, further reduce the amount of code under the lock, and generally make the inner workings a bit less subtle. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 17:09:58 +02:00
Robin Murphy	bb68b2fbfb	iommu/iova: Add rbtree anchor node Add a permanent dummy IOVA reservation to the rbtree, such that we can always access the top of the address space instantly. The immediate benefit is that we remove the overhead of the rb_last() traversal when not using the cached node, but it also paves the way for further simplifications. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 17:09:57 +02:00
Zhen Lei	aa3ac9469c	iommu/iova: Make dma_32bit_pfn implicit Now that the cached node optimisation can apply to all allocations, the couple of users which were playing tricks with dma_32bit_pfn in order to benefit from it can stop doing so. Conversely, there is also no need for all the other users to explicitly calculate a 'real' 32-bit PFN, when init_iova_domain() can happily do that itself from the page granularity. CC: Thierry Reding <thierry.reding@gmail.com> CC: Jonathan Hunter <jonathanh@nvidia.com> CC: David Airlie <airlied@linux.ie> CC: Sudeep Dutt <sudeep.dutt@intel.com> CC: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Zhen Lei <thunder.leizhen@huawei.com> Tested-by: Nate Watterson <nwatters@codeaurora.org> [rm: use iova_shift(), rewrote commit message] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 17:09:57 +02:00
Robin Murphy	e60aa7b538	iommu/iova: Extend rbtree node caching The cached node mechanism provides a significant performance benefit for allocations using a 32-bit DMA mask, but in the case of non-PCI devices or where the 32-bit space is full, the loss of this benefit can be significant - on large systems there can be many thousands of entries in the tree, such that walking all the way down to find free space every time becomes increasingly awful. Maintain a similar cached node for the whole IOVA space as a superset of the 32-bit space so that performance can remain much more consistent. Inspired by work by Zhen Lei <thunder.leizhen@huawei.com>. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Zhen Lei <thunder.leizhen@huawei.com> Tested-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 17:09:57 +02:00
Zhen Lei	086c83acb7	iommu/iova: Optimise the padding calculation The mask for calculating the padding size doesn't change, so there's no need to recalculate it every loop iteration. Furthermore, Once we've done that, it becomes clear that we don't actually need to calculate a padding size at all - by flipping the arithmetic around, we can just combine the upper limit, size, and mask directly to check against the lower limit. For an arm64 build, this alone knocks 20% off the object code size of the entire alloc_iova() function! Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Zhen Lei <thunder.leizhen@huawei.com> Tested-by: Nate Watterson <nwatters@codeaurora.org> [rm: simplified more of the arithmetic, rewrote commit message] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 17:09:56 +02:00
Zhen Lei	2070f940a6	iommu/iova: Optimise rbtree searching Checking the IOVA bounds separately before deciding which direction to continue the search (if necessary) results in redundantly comparing both pfns twice each. GCC can already determine that the final comparison op is redundant and optimise it down to 3 in total, but we can go one further with a little tweak of the ordering (which makes the intent of the code that much cleaner as a bonus). Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Zhen Lei <thunder.leizhen@huawei.com> Tested-by: Nate Watterson <nwatters@codeaurora.org> [rm: rewrote commit message to clarify] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 17:09:56 +02:00
Arvind Yadav	3c6bae6213	iommu/amd: pr_err() strings should end with newlines pr_err() messages should end with a new-line to avoid other messages being concatenated. So replace '/n' with '\n'. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Fixes: `45a01c4293` ('iommu/amd: Add function copy_dev_tables()') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 17:01:35 +02:00
Yong Wu	1ff9b17ced	iommu/mediatek: Limit the physical address in 32bit for v7s The ARM short descriptor has already limited the physical address to 32bit after the commit <76557391433c> ("iommu/io-pgtable: Sanitise map/unmap addresses"). But in MediaTek 4GB mode, the physical address is from 0x1_0000_0000 to 0x1_ffff_ffff. this will cause: WARNING: CPU: 4 PID: 3900 at xxx/drivers/iommu/io-pgtable-arm-v7s.c:482 arm_v7s_map+0x40/0xf8 Modules linked in: CPU: 4 PID: 3900 Comm: weston Tainted: G S W 4.9.44 #1 Hardware name: MediaTek MT2712m1v1 board (DT) task: ffffffc0eaa5b280 task.stack: ffffffc0e9858000 PC is at arm_v7s_map+0x40/0xf8 LR is at mtk_iommu_map+0x64/0x90 pc : [<ffffff80085b09e8>] lr : [<ffffff80085b29fc>] pstate: 000001c5 sp : ffffffc0e985b920 x29: ffffffc0e985b920 x28: 0000000127d00000 x27: 0000000000100000 x26: ffffff8008f9e000 x25: 0000000000000003 x24: 0000000000100000 x23: 0000000127d00000 x22: 00000000ff800000 x21: ffffffc0f7ec8ce0 x20: 0000000000000003 x19: 0000000000000003 x18: 0000000000000002 x17: 0000007f7e5d72c0 x16: ffffff80082b0f08 x15: 0000000000000001 x14: 000000000000003f x13: 0000000000000000 x12: 0000000000000028 x11: 0088000000000000 x10: 0000000000000000 x9 : ffffff80092fa000 x8 : ffffffc0e9858000 x7 : ffffff80085b29d8 x6 : 0000000000000000 x5 : ffffff80085b09a8 x4 : 0000000000000003 x3 : 0000000000100000 x2 : 0000000127d00000 x1 : 00000000ff800000 x0 : 0000000000000001 ... Call trace: [<ffffff80085b09e8>] arm_v7s_map+0x40/0xf8 [<ffffff80085b29fc>] mtk_iommu_map+0x64/0x90 [<ffffff80085ab5f8>] iommu_map+0x100/0x3a0 [<ffffff80085ab99c>] default_iommu_map_sg+0x104/0x168 [<ffffff80085aead8>] iommu_dma_alloc+0x238/0x3f8 [<ffffff8008098b30>] __iommu_alloc_attrs+0xa8/0x260 [<ffffff80085f364c>] mtk_drm_gem_create+0xac/0x180 [<ffffff80085f3894>] mtk_drm_gem_dumb_create+0x54/0xc8 [<ffffff80085d576c>] drm_mode_create_dumb_ioctl+0xa4/0xd8 [<ffffff80085cb2a0>] drm_ioctl+0x1c0/0x490 In order to satify this, Limit the physical address to 32bit. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 16:58:28 +02:00
Yong Wu	5c62c1c679	iommu/io-pgtable-arm-v7s: Need dma-sync while there is no QUIRK_NO_DMA Fix the commit `81b3c25218` ("iommu/io-pgtable: Introduce explicit coherency"). If there is no IO_PGTABLE_QUIRK_NO_DMA, we should call dma_sync_single_for_device for cache synchronization. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Fixes: `81b3c25218` ('iommu/io-pgtable: Introduce explicit coherency') Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 16:56:17 +02:00
Thomas Gleixner	5ba204a181	iommu/amd: Reevaluate vector configuration on activate() With the upcoming reservation/management scheme, early activation will assign a special vector. The final activation at request_irq() assigns a real vector, which needs to be updated in the tables. Split out the reconfiguration code in set_affinity and use it for reactivation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Yu Chen <yu.c.chen@intel.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: Borislav Petkov <bp@alien8.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Rui Zhang <rui.zhang@intel.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Len Brown <lenb@kernel.org> Link: https://lkml.kernel.org/r/20170913213155.944883733@linutronix.de	2017-09-25 20:52:01 +02:00
Thomas Gleixner	d491bdff88	iommu/vt-d: Reevaluate vector configuration on activate() With the upcoming reservation/management scheme, early activation will assign a special vector. The final activation at request_irq() assigns a real vector, which needs to be updated in the tables. Split out the reconfiguration code in set_affinity and use it for reactivation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Yu Chen <yu.c.chen@intel.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: Borislav Petkov <bp@alien8.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Rui Zhang <rui.zhang@intel.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Len Brown <lenb@kernel.org> Link: https://lkml.kernel.org/r/20170913213155.853028808@linutronix.de	2017-09-25 20:52:00 +02:00
Thomas Gleixner	7249164346	genirq/irqdomain: Update irq_domain_ops.activate() signature The irq_domain_ops.activate() callback has no return value and no way to tell the function that the activation is early. The upcoming changes to support a reservation scheme which allows to assign interrupt vectors on x86 only when the interrupt is actually requested requires: - A return value, so activation can fail at request_irq() time - Information that the activate invocation is early, i.e. before request_irq(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Yu Chen <yu.c.chen@intel.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Rui Zhang <rui.zhang@intel.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Len Brown <lenb@kernel.org> Link: https://lkml.kernel.org/r/20170913213152.848490816@linutronix.de	2017-09-25 20:38:24 +02:00
Robin Murphy	c0d05cde2a	iommu/of: Remove PCI host bridge node check of_pci_iommu_init() tries to be clever and stop its alias walk at the device represented by master_np, in case of weird PCI topologies where the bridge to the IOMMU and the rest of the system is not at the root. It turns out this is a bit short-sighted, since there are plenty of other callers of pci_for_each_dma_alias() which would also need the same behaviour in that situation, and the only platform so far with such a topology (Cavium ThunderX2) already solves it more generally via a PCI quirk. As this check is effectively redundant, and returning a boolean value as an int is a bit broken anyway, let's just get rid of it. Reported-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Fixes: `d87beb7492` ("iommu/of: Handle PCI aliases properly") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-22 12:05:43 +02:00
Geert Uytterhoeven	986a5f7017	iommu/qcom: Depend on HAS_DMA to fix compile error If NO_DMA=y: warning: (IPMMU_VMSA && ARM_SMMU && ARM_SMMU_V3 && QCOM_IOMMU) selects IOMMU_IO_PGTABLE_LPAE which has unmet direct dependencies (IOMMU_SUPPORT && HAS_DMA && (ARM \|\| ARM64 \|\| COMPILE_TEST && !GENERIC_ATOMIC64)) and drivers/iommu/io-pgtable-arm.o: In function `__arm_lpae_sync_pte': io-pgtable-arm.c:(.text+0x206): undefined reference to `bad_dma_ops' drivers/iommu/io-pgtable-arm.o: In function `__arm_lpae_free_pages': io-pgtable-arm.c:(.text+0x6a6): undefined reference to `bad_dma_ops' drivers/iommu/io-pgtable-arm.o: In function `__arm_lpae_alloc_pages': io-pgtable-arm.c:(.text+0x812): undefined reference to `bad_dma_ops' io-pgtable-arm.c:(.text+0x81c): undefined reference to `bad_dma_ops' io-pgtable-arm.c:(.text+0x862): undefined reference to `bad_dma_ops' drivers/iommu/io-pgtable-arm.o: In function `arm_lpae_run_tests': io-pgtable-arm.c:(.init.text+0x86): undefined reference to `alloc_io_pgtable_ops' io-pgtable-arm.c:(.init.text+0x47c): undefined reference to `free_io_pgtable_ops' drivers/iommu/qcom_iommu.o: In function `qcom_iommu_init_domain': qcom_iommu.c:(.text+0x1ce): undefined reference to `alloc_io_pgtable_ops' drivers/iommu/qcom_iommu.o: In function `qcom_iommu_domain_free': qcom_iommu.c:(.text+0x754): undefined reference to `free_io_pgtable_ops' QCOM_IOMMU selects IOMMU_IO_PGTABLE_LPAE, which bypasses its dependency on HAS_DMA. Make QCOM_IOMMU depend on HAS_DMA to fix this. Fixes: `0ae349a0f3` ("iommu/qcom: Add qcom_iommu") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-19 15:30:41 +02:00
Marek Szyprowski	7a974b29fe	iommu/exynos: Rework runtime PM links management add_device is a bit more suitable for establishing runtime PM links than the xlate callback. This change also makes it possible to implement proper cleanup - in remove_device callback. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-19 14:48:01 +02:00
Arnd Bergmann	3bd71e18c5	iommu/vt-d: Fix harmless section mismatch warning Building with gcc-4.6 results in this warning due to dmar_table_print_dmar_entry being inlined as in newer compiler versions: WARNING: vmlinux.o(.text+0x5c8bee): Section mismatch in reference from the function dmar_walk_remapping_entries() to the function .init.text:dmar_table_print_dmar_entry() The function dmar_walk_remapping_entries() references the function __init dmar_table_print_dmar_entry(). This is often because dmar_walk_remapping_entries lacks a __init annotation or the annotation of dmar_table_print_dmar_entry is wrong. This removes the __init annotation to avoid the warning. On compilers that don't show the warning today, this should have no impact since the function gets inlined anyway. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-19 11:44:46 +02:00
Guenter Roeck	a4aaeccc7a	iommu: Add missing dependencies parisc:allmodconfig, xtensa:allmodconfig, and possibly others generate the following Kconfig warning. warning: (IPMMU_VMSA && ARM_SMMU && ARM_SMMU_V3 && QCOM_IOMMU) selects IOMMU_IO_PGTABLE_LPAE which has unmet direct dependencies (IOMMU_SUPPORT && HAS_DMA && (ARM \|\| ARM64 \|\| COMPILE_TEST && !GENERIC_ATOMIC64)) IOMMU_IO_PGTABLE_LPAE depends on (COMPILE_TEST && !GENERIC_ATOMIC64), so any configuration option selecting it needs to have the same dependencies. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-19 11:42:19 +02:00
Suman Anna	9d5018deec	iommu/omap: Add support to program multiple iommus A client user instantiates and attaches to an iommu_domain to program the OMAP IOMMU associated with the domain. The iommus programmed by a client user are bound with the iommu_domain through the user's device archdata. The OMAP IOMMU driver currently supports only one IOMMU per IOMMU domain per user. The OMAP IOMMU driver has been enhanced to support allowing multiple IOMMUs to be programmed by a single client user. This support is being added mainly to handle the DSP subsystems on the DRA7xx SoCs, which have two MMUs within the same subsystem. These MMUs provide translations for a processor core port and an internal EDMA port. This support allows both the MMUs to be programmed together, but with each one retaining it's own internal state objects. The internal EDMA block is managed by the software running on the DSPs, and this design provides on-par functionality with previous generation OMAP DSPs where the EDMA and the DSP core shared the same MMU. The multiple iommus are expected to be provided through a sentinel terminated array of omap_iommu_arch_data objects through the client user's device archdata. The OMAP driver core is enhanced to loop through the array of attached iommus and program them for all common operations. The sentinel-terminated logic is used so as to not change the omap_iommu_arch_data structure. NOTE: 1. The IOMMU group and IOMMU core registration is done only for the DSP processor core MMU even though both MMUs are represented by their own platform device and are probed individually. The IOMMU device linking uses this registered MMU device. The struct iommu_device for the second MMU is not used even though memory for it is allocated. 2. The OMAP IOMMU debugfs code still continues to operate on individual IOMMU objects. Signed-off-by: Suman Anna <s-anna@ti.com> [t-kristo@ti.com: ported support to 4.13 based kernel] Signed-off-by: Tero Kristo <t-kristo@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-19 11:32:05 +02:00
Suman Anna	0d3642883b	iommu/omap: Change the attach detection logic The OMAP IOMMU driver allows only a single device (eg: a rproc device) to be attached per domain. The current attach detection logic relies on a check for an attached iommu for the respective client device. Change this logic to use the client device pointer instead in preparation for supporting multiple iommu devices to be bound to a single iommu domain, and thereby to a client device. Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-19 11:32:05 +02:00
Linus Torvalds	4dfc278803	IOMMU Updates for Linux v4.14 Slightly more changes than usual this time: - KDump Kernel IOMMU take-over code for AMD IOMMU. The code now tries to preserve the mappings of the kernel so that master aborts for devices are avoided. Master aborts cause some devices to fail in the kdump kernel, so this code makes the dump more likely to succeed when AMD IOMMU is enabled. - Common flush queue implementation for IOVA code users. The code is still optional, but AMD and Intel IOMMU drivers had their own implementation which is now unified. - Finish support for iommu-groups. All drivers implement this feature now so that IOMMU core code can rely on it. - Finish support for 'struct iommu_device' in iommu drivers. All drivers now use the interface. - New functions in the IOMMU-API for explicit IO/TLB flushing. This will help to reduce the number of IO/TLB flushes when IOMMU drivers support this interface. - Support for mt2712 in the Mediatek IOMMU driver - New IOMMU driver for QCOM hardware - System PM support for ARM-SMMU - Shutdown method for ARM-SMMU-v3 - Some constification patches - Various other small improvements and fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJZtCFNAAoJECvwRC2XARrjZnQP/AxC/ezQpq82HbegF4sM/cVE Ep7TeTqodEl75FS/6txe2wU0pwodqk/LB9ajfQZUbE1w8vKsNEqi5qf4ZYHGoxYI 5bWyjJBzKIlwENH5lsBpQNt6XLevrYmRsFy7F0tRYy+qPQq8k+js2i7/XkCL3q7L 3xklF847RRoITaTOhhaROx1pF23dSMEsS2XGuWHcZfjORtep4wcFKzd/2SvlCWCo P2bRU7jBzfJuuGSA80gaiUbDmrULTUfYuZNp7njASzCgsDmagERtvDEpdoXPNNSp u6s4LjU1Dp3fgr6g6cFRO7B6JUbWd619nwo9so/c/wZN54yEngBF9EyeeF3mv2O5 ZbM2mOW3RlZcjxFT/AC8G4cZwwP6MpCEQOdqknoqc6ZQwcDqwN0o9I4+po0wsiAU 89ijZZe9Mx0p9lNpihaBEB1erAUWPo5Obh62zo80W3h6x9WzkGQWM+PyFK2DYoaC 8biEZzcc21sLEHvXQkcEGJSKrihHr9sluOqvxmCw5QAkYIFAeZRoeH7JtZWjVCnr T3XvaG1G1Aw6tS7ErxufdKawREAGki0Rm9i1baiH9sqNj5rllM01Y+PgU6E21Nbg iZp9gJLjfwM4vhYLlovvQK5PRoOBsCkyCpEI4GJqjLeam5p/WN06CFFf0ifQofYr qDPCVDkWHWV8nugFFKE7 =EVh9 -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "Slightly more changes than usual this time: - KDump Kernel IOMMU take-over code for AMD IOMMU. The code now tries to preserve the mappings of the kernel so that master aborts for devices are avoided. Master aborts cause some devices to fail in the kdump kernel, so this code makes the dump more likely to succeed when AMD IOMMU is enabled. - common flush queue implementation for IOVA code users. The code is still optional, but AMD and Intel IOMMU drivers had their own implementation which is now unified. - finish support for iommu-groups. All drivers implement this feature now so that IOMMU core code can rely on it. - finish support for 'struct iommu_device' in iommu drivers. All drivers now use the interface. - new functions in the IOMMU-API for explicit IO/TLB flushing. This will help to reduce the number of IO/TLB flushes when IOMMU drivers support this interface. - support for mt2712 in the Mediatek IOMMU driver - new IOMMU driver for QCOM hardware - system PM support for ARM-SMMU - shutdown method for ARM-SMMU-v3 - some constification patches - various other small improvements and fixes" * tag 'iommu-updates-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (87 commits) iommu/vt-d: Don't be too aggressive when clearing one context entry iommu: Introduce Interface for IOMMU TLB Flushing iommu/s390: Constify iommu_ops iommu/vt-d: Avoid calling virt_to_phys() on null pointer iommu/vt-d: IOMMU Page Request needs to check if address is canonical. arm/tegra: Call bus_set_iommu() after iommu_device_register() iommu/exynos: Constify iommu_ops iommu/ipmmu-vmsa: Make ipmmu_gather_ops const iommu/ipmmu-vmsa: Rereserving a free context before setting up a pagetable iommu/amd: Rename a few flush functions iommu/amd: Check if domain is NULL in get_domain() and return -EBUSY iommu/mediatek: Fix a build warning of BIT(32) in ARM iommu/mediatek: Fix a build fail of m4u_type iommu: qcom: annotate PM functions as __maybe_unused iommu/pamu: Fix PAMU boot crash memory: mtk-smi: Degrade SMI init to module_init iommu/mediatek: Enlarge the validate PA range for 4GB mode iommu/mediatek: Disable iommu clock when system suspend iommu/mediatek: Move pgtable allocation into domain_alloc iommu/mediatek: Merge 2 M4U HWs into one iommu domain ...	2017-09-09 15:03:24 -07:00
Linus Torvalds	0d519f2d1e	pci-v4.14-changes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJZsr8cAAoJEFmIoMA60/r8lXYQAKViYIRMJDD4n3NhjMeLOsnJ vwaBmWlLRjSFIEpag5kMjS1RJE17qAvmkBZnDvSNZ6cT28INkkZnVM2IW96WECVq 64MIvDijVPcvqGuWePCfWdDiSXApiDWwJuw55BOhmvV996wGy0gYgzpPY+1g0Knh XzH9IOzDL79hZleLfsxX0MLV6FGBVtOsr0jvQ04k4IgEMIxEDTlbw85rnrvzQUtc 0Vj2koaxWIESZsq7G/wiZb2n6ekaFdXO/VlVvvhmTSDLCBaJ63Hb/gfOhwMuVkS6 B3cVprNrCT0dSzWmU4ZXf+wpOyDpBexlemW/OR/6CQUkC6AUS6kQ5si1X44dbGmJ nBPh414tdlm/6V4h/A3UFPOajSGa/ZWZ/uQZPfvKs1R6WfjUerWVBfUpAzPbgjam c/mhJ19HYT1J7vFBfhekBMeY2Px3JgSJ9rNsrFl48ynAALaX5GEwdpo4aqBfscKz 4/f9fU4ysumopvCEuKD2SsJvsPKd5gMQGGtvAhXM1TxvAoQ5V4cc99qEetAPXXPf h2EqWm4ph7YP4a+n/OZBjzluHCmZJn1CntH5+//6wpUk6HnmzsftGELuO9n12cLE GGkreI3T9ctV1eOkzVVa0l0QTE1X/VLyEyKCtb9obXsDaG4Ud7uKQoZgB19DwyTJ EG76ridTolUFVV+wzJD9 =9cLP -----END PGP SIGNATURE----- Merge tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: - add enhanced Downstream Port Containment support, which prints more details about Root Port Programmed I/O errors (Dongdong Liu) - add Layerscape ls1088a and ls2088a support (Hou Zhiqiang) - add MediaTek MT2712 and MT7622 support (Ryder Lee) - add MediaTek MT2712 and MT7622 MSI support (Honghui Zhang) - add Qualcom IPQ8074 support (Varadarajan Narayanan) - add R-Car r8a7743/5 device tree support (Biju Das) - add Rockchip per-lane PHY support for better power management (Shawn Lin) - fix IRQ mapping for hot-added devices by replacing the pci_fixup_irqs() boot-time design with a host bridge hook called at probe-time (Lorenzo Pieralisi, Matthew Minter) - fix race when enabling two devices that results in upstream bridge not being enabled correctly (Srinath Mannam) - fix pciehp power fault infinite loop (Keith Busch) - fix SHPC bridge MSI hotplug events by enabling bus mastering (Aleksandr Bezzubikov) - fix a VFIO issue by correcting PCIe capability sizes (Alex Williamson) - fix an INTD issue on Xilinx and possibly other drivers by unifying INTx IRQ domain support (Paul Burton) - avoid IOMMU stalls by marking AMD Stoney GPU ATS as broken (Joerg Roedel) - allow APM X-Gene device assignment to guests by adding an ACS quirk (Feng Kan) - fix driver crashes by disabling Extended Tags on Broadcom HT2100 (Extended Tags support is required for PCIe Receivers but not Requesters, and we now enable them by default when Requesters support them) (Sinan Kaya) - fix MSIs for devices that use phantom RIDs for DMA by assuming MSIs use the real Requester ID (not a phantom RID) (Robin Murphy) - prevent assignment of Intel VMD children to guests (which may be supported eventually, but isn't yet) by not associating an IOMMU with them (Jon Derrick) - fix Intel VMD suspend/resume by releasing IRQs on suspend (Scott Bauer) - fix a Function-Level Reset issue with Intel 750 NVMe by waiting longer (up to 60sec instead of 1sec) for device to become ready (Sinan Kaya) - fix a Function-Level Reset issue on iProc Stingray by working around hardware defects in the CRS implementation (Oza Pawandeep) - fix an issue with Intel NVMe P3700 after an iProc reset by adding a delay during shutdown (Oza Pawandeep) - fix a Microsoft Hyper-V lockdep issue by polling instead of blocking in compose_msi_msg() (Stephen Hemminger) - fix a wireless LAN driver timeout by clearing DesignWare MSI interrupt status after it is handled, not before (Faiz Abbas) - fix DesignWare ATU enable checking (Jisheng Zhang) - reduce Layerscape dependencies on the bootloader by doing more initialization in the driver (Hou Zhiqiang) - improve Intel VMD performance allowing allocation of more IRQ vectors than present CPUs (Keith Busch) - improve endpoint framework support for initial DMA mask, different BAR sizes, configurable page sizes, MSI, test driver, etc (Kishon Vijay Abraham I, Stan Drozd) - rework CRS support to add periodic messages while we poll during enumeration and after Function-Level Reset and prepare for possible other uses of CRS (Sinan Kaya) - clean up Root Port AER handling by removing unnecessary code and moving error handler methods to struct pcie_port_service_driver (Christoph Hellwig) - clean up error handling paths in various drivers (Bjorn Andersson, Fabio Estevam, Gustavo A. R. Silva, Harunobu Kurokawa, Jeffy Chen, Lorenzo Pieralisi, Sergei Shtylyov) - clean up SR-IOV resource handling by disabling VF decoding before updating the corresponding resource structs (Gavin Shan) - clean up DesignWare-based drivers by unifying quirks to update Class Code and Interrupt Pin and related handling of write-protected registers (Hou Zhiqiang) - clean up by adding empty generic pcibios_align_resource() and pcibios_fixup_bus() and removing empty arch-specific implementations (Palmer Dabbelt) - request exclusive reset control for several drivers to allow cleanup elsewhere (Philipp Zabel) - constify various structures (Arvind Yadav, Bhumika Goyal) - convert from full_name() to %pOF (Rob Herring) - remove unused variables from iProc, HiSi, Altera, Keystone (Shawn Lin) * tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (170 commits) PCI: xgene: Clean up whitespace PCI: xgene: Define XGENE_PCI_EXP_CAP and use generic PCI_EXP_RTCTL offset PCI: xgene: Fix platform_get_irq() error handling PCI: xilinx-nwl: Fix platform_get_irq() error handling PCI: rockchip: Fix platform_get_irq() error handling PCI: altera: Fix platform_get_irq() error handling PCI: spear13xx: Fix platform_get_irq() error handling PCI: artpec6: Fix platform_get_irq() error handling PCI: armada8k: Fix platform_get_irq() error handling PCI: dra7xx: Fix platform_get_irq() error handling PCI: exynos: Fix platform_get_irq() error handling PCI: iproc: Clean up whitespace PCI: iproc: Rename PCI_EXP_CAP to IPROC_PCI_EXP_CAP PCI: iproc: Add 500ms delay during device shutdown PCI: Fix typos and whitespace errors PCI: Remove unused "res" variable from pci_resource_io() PCI: Correct kernel-doc of pci_vpd_srdt_size(), pci_vpd_srdt_tag() PCI/AER: Reformat AER register definitions iommu/vt-d: Prevent VMD child devices from being remapping targets x86/PCI: Use is_vmd() rather than relying on the domain number ...	2017-09-08 15:47:43 -07:00
Linus Torvalds	b1b6f83ac9	Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm changes from Ingo Molnar: "PCID support, 5-level paging support, Secure Memory Encryption support The main changes in this cycle are support for three new, complex hardware features of x86 CPUs: - Add 5-level paging support, which is a new hardware feature on upcoming Intel CPUs allowing up to 128 PB of virtual address space and 4 PB of physical RAM space - a 512-fold increase over the old limits. (Supercomputers of the future forecasting hurricanes on an ever warming planet can certainly make good use of more RAM.) Many of the necessary changes went upstream in previous cycles, v4.14 is the first kernel that can enable 5-level paging. This feature is activated via CONFIG_X86_5LEVEL=y - disabled by default. (By Kirill A. Shutemov) - Add 'encrypted memory' support, which is a new hardware feature on upcoming AMD CPUs ('Secure Memory Encryption', SME) allowing system RAM to be encrypted and decrypted (mostly) transparently by the CPU, with a little help from the kernel to transition to/from encrypted RAM. Such RAM should be more secure against various attacks like RAM access via the memory bus and should make the radio signature of memory bus traffic harder to intercept (and decrypt) as well. This feature is activated via CONFIG_AMD_MEM_ENCRYPT=y - disabled by default. (By Tom Lendacky) - Enable PCID optimized TLB flushing on newer Intel CPUs: PCID is a hardware feature that attaches an address space tag to TLB entries and thus allows to skip TLB flushing in many cases, even if we switch mm's. (By Andy Lutomirski) All three of these features were in the works for a long time, and it's coincidence of the three independent development paths that they are all enabled in v4.14 at once" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (65 commits) x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y) x86/mm: Use pr_cont() in dump_pagetable() x86/mm: Fix SME encryption stack ptr handling kvm/x86: Avoid clearing the C-bit in rsvd_bits() x86/CPU: Align CR3 defines x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages acpi, x86/mm: Remove encryption mask from ACPI page protection type x86/mm, kexec: Fix memory corruption with SME on successive kexecs x86/mm/pkeys: Fix typo in Documentation/x86/protection-keys.txt x86/mm/dump_pagetables: Speed up page tables dump for CONFIG_KASAN=y x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID x86: Enable 5-level paging support via CONFIG_X86_5LEVEL=y x86/mm: Allow userspace have mappings above 47-bit x86/mm: Prepare to expose larger address space to userspace x86/mpx: Do not allow MPX if we have mappings above 47-bit x86/mm: Rename tasksize_32bit/64bit to task_size_32bit/64bit() x86/xen: Redefine XEN_ELFNOTE_INIT_P2M using PUD_SIZE * PTRS_PER_PUD x86/mm/dump_pagetables: Fix printout of p4d level x86/mm/dump_pagetables: Generalize address normalization x86/boot: Fix memremap() related build failure ...	2017-09-04 12:21:28 -07:00
Joerg Roedel	47b59d8e40	Merge branches 'arm/exynos', 'arm/renesas', 'arm/rockchip', 'arm/omap', 'arm/mediatek', 'arm/tegra', 'arm/qcom', 'arm/smmu', 'ppc/pamu', 'x86/vt-d', 'x86/amd', 's390' and 'core' into next	2017-09-01 11:31:42 +02:00
Filippo Sironi	5082219b6a	iommu/vt-d: Don't be too aggressive when clearing one context entry Previously, we were invalidating context cache and IOTLB globally when clearing one context entry. This is a tad too aggressive. Invalidate the context cache and IOTLB for the interested device only. Signed-off-by: Filippo Sironi <sironi@amazon.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: iommu@lists.linux-foundation.org Cc: linux-kernel@vger.kernel.org Acked-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-01 11:30:41 +02:00
Jérôme Glisse	30ef7d2c05	iommu/intel: update to new mmu_notifier semantic Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: Joerg Roedel <jroedel@suse.de> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-31 16:12:59 -07:00
Jérôme Glisse	f0d1c713d6	iommu/amd: update to new mmu_notifier semantic Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: iommu@lists.linux-foundation.org Cc: Joerg Roedel <jroedel@suse.de> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-31 16:12:59 -07:00
Jon Derrick	5823e330b5	iommu/vt-d: Prevent VMD child devices from being remapping targets VMD child devices must use the VMD endpoint's ID as the requester. Because of this, there needs to be a way to link the parent VMD endpoint's IOMMU group and associated mappings to the VMD child devices such that attaching and detaching child devices modify the endpoint's mappings, while preventing early detaching on a singular device removal or unbinding. The reassignment of individual VMD child devices devices to VMs is outside the scope of VMD, but may be implemented in the future. For now it is best to prevent any such attempts. Prevent VMD child devices from returning an IOMMU, which prevents it from exposing an iommu_group sysfs directory and allowing subsequent binding by userspace-access drivers such as VFIO. Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2017-08-30 16:43:58 -05:00
Joerg Roedel	add02cfdc9	iommu: Introduce Interface for IOMMU TLB Flushing With the current IOMMU-API the hardware TLBs have to be flushed in every iommu_ops->unmap() call-back. For unmapping large amounts of address space, like it happens when a KVM domain with assigned devices is destroyed, this causes thousands of unnecessary TLB flushes in the IOMMU hardware because the unmap call-back runs for every unmapped physical page. With the TLB Flush Interface and the new iommu_unmap_fast() function introduced here the need to clean the hardware TLBs is removed from the unmapping code-path. Users of iommu_unmap_fast() have to explicitly call the TLB-Flush functions to sync the page-table changes to the hardware. Three functions for TLB-Flushes are introduced: * iommu_flush_tlb_all() - Flushes all TLB entries associated with that domain. TLBs entries are flushed when this function returns. * iommu_tlb_range_add() - This will add a given range to the flush queue for this domain. * iommu_tlb_sync() - Flushes all queued ranges from the hardware TLBs. Returns when the flush is finished. The semantic of this interface is intentionally similar to the iommu_gather_ops from the io-pgtable code. Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 18:07:13 +02:00
Arvind Yadav	cceb845195	iommu/s390: Constify iommu_ops iommu_ops are not supposed to change at runtime. Functions 'bus_set_iommu' working with const iommu_ops provided by <linux/iommu.h>. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 17:53:29 +02:00
Ashok Raj	11b93ebfa0	iommu/vt-d: Avoid calling virt_to_phys() on null pointer New kernels with debug show panic() from __phys_addr() checks. Avoid calling virt_to_phys() when pasid_state_tbl pointer is null To: Joerg Roedel <joro@8bytes.org> To: linux-kernel@vger.kernel.org> Cc: iommu@lists.linux-foundation.org Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jacob Pan <jacob.jun.pan@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Fixes: `2f26e0a9c9` ('iommu/vt-d: Add basic SVM PASID support') Signed-off-by: Ashok Raj <ashok.raj@intel.com> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 17:48:18 +02:00
Ashok Raj	9d8c3af316	iommu/vt-d: IOMMU Page Request needs to check if address is canonical. Page Request from devices that support device-tlb would request translation to pre-cache them in device to avoid overhead of IOMMU lookups. IOMMU needs to check for canonicallity of the address before performing page-fault processing. To: Joerg Roedel <joro@8bytes.org> To: linux-kernel@vger.kernel.org> Cc: iommu@lists.linux-foundation.org Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jacob Pan <jacob.jun.pan@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Reported-by: Sudeep Dutt <sudeep.dutt@intel.com> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 17:48:18 +02:00
Joerg Roedel	96302d89a0	arm/tegra: Call bus_set_iommu() after iommu_device_register() The bus_set_iommu() function will call the add_device() call-back which needs the iommu to be registered. Reported-by: Jon Hunter <jonathanh@nvidia.com> Fixes: `0b480e4470` ('iommu/tegra: Add support for struct iommu_device') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 17:28:32 +02:00
Arvind Yadav	0b9a36947c	iommu/exynos: Constify iommu_ops iommu_ops are not supposed to change at runtime. Functions 'iommu_device_set_ops' and 'bus_set_iommu' working with const iommu_ops provided by <linux/iommu.h>. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 15:13:54 +02:00
Bhumika Goyal	8da4af9586	iommu/ipmmu-vmsa: Make ipmmu_gather_ops const Make these const as they are not modified anywhere. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 15:10:53 +02:00
Oleksandr Tyshchenko	a175a67d30	iommu/ipmmu-vmsa: Rereserving a free context before setting up a pagetable Reserving a free context is both quicker and more likely to fail (due to limited hardware resources) than setting up a pagetable. What is more the pagetable init/cleanup code could require the context to be set up. Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> CC: Robin Murphy <robin.murphy@arm.com> CC: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> CC: Joerg Roedel <joro@8bytes.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 15:10:46 +02:00
Joerg Roedel	0688a09990	iommu/amd: Rename a few flush functions Rename a few iommu cache-flush functions that start with iommu_ to start with amd_iommu now. This is to prevent name collisions with generic iommu code later on. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-28 11:31:31 +02:00
Baoquan He	ec62b1ab0f	iommu/amd: Check if domain is NULL in get_domain() and return -EBUSY In get_domain(), 'domain' could be NULL before it's passed to dma_ops_domain() to dereference. And the current code calling get_domain() can't deal with the returned 'domain' well if its value is NULL. So before dma_ops_domain() calling, check if 'domain' is NULL, If yes just return ERR_PTR(-EBUSY) directly. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: `df3f7a6e8e` ('iommu/amd: Use is_attach_deferred call-back') Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-28 11:28:37 +02:00
Yong Wu	4193998043	iommu/mediatek: Fix a build warning of BIT(32) in ARM The commit ("iommu/mediatek: Enlarge the validate PA range for 4GB mode") introduce the following build warning while ARCH=arm: drivers/iommu/mtk_iommu.c: In function 'mtk_iommu_iova_to_phys': include/linux/bitops.h:6:24: warning: left shift count >= width of type [-Wshift-count-overflow] #define BIT(nr) (1UL << (nr)) ^ >> drivers/iommu/mtk_iommu.c:407:9: note: in expansion of macro 'BIT' pa \|= BIT(32); drivers/iommu/mtk_iommu.c: In function 'mtk_iommu_probe': include/linux/bitops.h:6:24: warning: left shift count >= width of type [-Wshift-count-overflow] #define BIT(nr) (1UL << (nr)) ^ drivers/iommu/mtk_iommu.c:589:35: note: in expansion of macro 'BIT' data->enable_4GB = !!(max_pfn > (BIT(32) >> PAGE_SHIFT)); Use BIT_ULL instead of BIT. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-28 11:26:21 +02:00
Yong Wu	4f1c8ea16b	iommu/mediatek: Fix a build fail of m4u_type The commit ("iommu/mediatek: Enlarge the validate PA range for 4GB mode") introduce the following build error: drivers/iommu/mtk_iommu.c: In function 'mtk_iommu_hw_init': >> drivers/iommu/mtk_iommu.c:536:30: error: 'const struct mtk_iommu_data' has no member named 'm4u_type'; did you mean 'm4u_dom'? if (data->enable_4GB && data->m4u_type != M4U_MT8173) { This patch fix it, use "m4u_plat" instead of "m4u_type". Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-28 11:26:21 +02:00
Arnd Bergmann	6ce5b0f22d	iommu: qcom: annotate PM functions as __maybe_unused The qcom_iommu_disable_clocks() function is only called from PM code that is hidden in an #ifdef, causing a harmless warning without CONFIG_PM: drivers/iommu/qcom_iommu.c:601:13: error: 'qcom_iommu_disable_clocks' defined but not used [-Werror=unused-function] static void qcom_iommu_disable_clocks(struct qcom_iommu_dev qcom_iommu) drivers/iommu/qcom_iommu.c:581:12: error: 'qcom_iommu_enable_clocks' defined but not used [-Werror=unused-function] static int qcom_iommu_enable_clocks(struct qcom_iommu_dev qcom_iommu) Replacing that #ifdef with __maybe_unused annotations lets the compiler drop the functions silently instead. Fixes: `0ae349a0f3` ("iommu/qcom: Add qcom_iommu") Acked-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-28 11:24:52 +02:00
Ingo Molnar	413d63d71b	Merge branch 'linus' into x86/mm to pick up fixes and to fix conflicts Conflicts: arch/x86/kernel/head64.c arch/x86/mm/mmap.c Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-08-26 09:19:13 +02:00
Joerg Roedel	3ff2dcc058	iommu/pamu: Fix PAMU boot crash Commit `68a17f0be6` introduced an initialization order problem, where devices are linked against an iommu which is not yet initialized. Fix it by initializing the iommu-device before the iommu-ops are registered against the bus. Reported-by: Michael Ellerman <mpe@ellerman.id.au> Fixes: `68a17f0be6` ('iommu/pamu: Add support for generic iommu-device') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-23 16:28:09 +02:00
Yong Wu	30e2fccf95	iommu/mediatek: Enlarge the validate PA range for 4GB mode This patch is for 4GB mode, mainly for 4 issues: 1) Fix a 4GB bug: if the dram base is 0x4000_0000, the dram size is 0xc000_0000. then the code just meet a corner case because max_pfn is 0x10_0000. data->enable_4GB = !!(max_pfn > (0xffffffffUL >> PAGE_SHIFT)); It's true at the case above. That is unexpected. 2) In mt2712, there is a new register for the 4GB PA range(0x118) we should enlarge the max PA range, or the HW will report error. The dram range is from 0x1_0000_0000 to 0x1_ffff_ffff in the 4GB mode, we cut out the bit[32:30] of the SA(Start address) and EA(End address) into this REG_MMU_VLD_PA_RNG(0x118). 3) In mt2712, the register(0x13c) is extended for 4GB mode. bit[7:6] indicate the valid PA[32:33]. Thus, we don't mask the value and print it directly for debug. 4) if 4GB is enabled, the dram PA range is from 0x1_0000_0000 to 0x1_ffff_ffff. Thus, the PA from iova_to_pa should also '\|' BIT(32) Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-22 16:38:00 +02:00
Yong Wu	6254b64f57	iommu/mediatek: Disable iommu clock when system suspend When system suspend, infra power domain may be off, and the iommu's clock must be disabled when system off, or the iommu's bclk clock maybe disabled after system resume. Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-22 16:37:59 +02:00
Yong Wu	4b00f5ac12	iommu/mediatek: Move pgtable allocation into domain_alloc After adding the global list for M4U HW, We get a chance to move the pagetable allocation into the mtk_iommu_domain_alloc. Let the domain_alloc do the right thing. This patch is for fixing this problem[1]. [1]: https://patchwork.codeaurora.org/patch/53987/ Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-22 16:37:59 +02:00
Yong Wu	7c3a2ec028	iommu/mediatek: Merge 2 M4U HWs into one iommu domain In theory, If there are 2 M4U HWs, there should be 2 IOMMU domains. But one IOMMU domain(4GB iova range) is enough for us currently, It's unnecessary to maintain 2 pagetables. Besides, This patch can simplify our consumer code largely. They don't need map a iova range from one domain into another, They can share the iova address easily. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-22 16:37:59 +02:00
Yong Wu	e6dec92308	iommu/mediatek: Add mt2712 IOMMU support The M4U IP blocks in mt2712 is MTK's generation2 M4U which use the ARM Short-descriptor like mt8173, and most of the HW registers are the same. The difference is that there are 2 M4U HWs in mt2712 while there's only one in mt8173. The purpose of 2 M4U HWs is for balance the bandwidth. Normally if there are 2 M4U HWs, there should be 2 iommu domains, each M4U has a iommu domain. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-22 16:37:58 +02:00
Yong Wu	a9467d9542	iommu/mediatek: Move MTK_M4U_TO_LARB/PORT into mtk_iommu.c The definition of MTK_M4U_TO_LARB and MTK_M4U_TO_PORT are shared by all the gen2 M4U HWs. Thus, Move them out from mt8173-larb-port.h, and put them into the c file. Suggested-by: Honghui Zhang <honghui.zhang@mediatek.com> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-22 16:37:58 +02:00
Magnus Damm	7af9a5fdb9	iommu/ipmmu-vmsa: Use iommu_device_sysfs_add()/remove() Extend the driver to make use of iommu_device_sysfs_add()/remove() functions to hook up initial sysfs support. Suggested-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-22 16:19:00 +02:00
Joerg Roedel	2479c631d1	iommu/amd: Fix section mismatch warning The variable amd_iommu_pre_enabled is used in non-init code-paths, so remove the __initdata annotation. Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: `3ac3e5ee5e` ('iommu/amd: Copy old trans table from old kernel') Acked-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-19 10:34:31 +02:00
Joerg Roedel	ae162efbf2	iommu/amd: Fix compiler warning in copy_device_table() This was reported by the kbuild bot. The condition in which entry would be used uninitialized can not happen, because when there is no iommu this function would never be called. But its no fast-path, so fix the warning anyway. Reported-by: kbuild test robot <fengguang.wu@intel.com> Acked-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-19 10:33:42 +02:00
Joerg Roedel	af6ee6c1c4	Merge branch 'for-joerg/arm-smmu/updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu	2017-08-19 00:42:01 +02:00
Robin Murphy	1464d0b1de	iommu: Avoid NULL group dereference The recently-removed FIXME in iommu_get_domain_for_dev() turns out to have been a little misleading, since that check is still worthwhile even when groups are universal. We have a few IOMMU-aware drivers which only care whether their device is already attached to an existing domain or not, for which the previous behaviour of iommu_get_domain_for_dev() was ideal, and who now crash if their device does not have an IOMMU. With IOMMU groups now serving as a reliable indicator of whether a device has an IOMMU or not (barring false-positives from VFIO no-IOMMU mode), drivers could arguably do this: group = iommu_group_get(dev); if (group) { domain = iommu_get_domain_for_dev(dev); iommu_group_put(group); } However, rather than duplicate that code across multiple callsites, particularly when it's still only the domain they care about, let's skip straight to the next step and factor out the check into the common place it applies - in iommu_get_domain_for_dev() itself. Sure, it ends up looking rather familiar, but now it's backed by the reasoning of having a robust API able to do the expected thing for all devices regardless. Fixes: `05f80300dc` ("iommu: Finish making iommu_group support mandatory") Reported-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-18 11:41:17 +02:00
Joerg Roedel	c184ae83c8	iommu/tegra-gart: Add support for struct iommu_device Add a struct iommu_device to each tegra-gart and register it with the iommu-core. Also link devices added to the driver to their respective hardware iommus. Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-17 16:31:34 +02:00
Joerg Roedel	0b480e4470	iommu/tegra: Add support for struct iommu_device Add a struct iommu_device to each tegra-smmu and register it with the iommu-core. Also link devices added to the driver to their respective hardware iommus. Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-17 16:31:34 +02:00
Joerg Roedel	fa94cf84af	Merge branch 'core' into arm/tegra	2017-08-17 16:31:27 +02:00
Robin Murphy	a2d866f7d6	iommu/arm-smmu: Add system PM support With all our hardware state tracked in such a way that we can naturally restore it as part of the necessary reset, resuming is trivial, and there's nothing to do on suspend at all. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-08-16 17:27:29 +01:00
Robin Murphy	90df373cc6	iommu/arm-smmu: Track context bank state Echoing what we do for Stream Map Entries, maintain a software shadow state for context bank configuration. With this in place, we are mere moments away from blissfully easy suspend/resume support. Reviewed-by: Sricharan R <sricharan@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> [will: fix sparse warning by only clearing .cfg during domain destruction] Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-08-16 17:27:28 +01:00
Nate Watterson	7aa8619a66	iommu/arm-smmu-v3: Implement shutdown method The shutdown method disables the SMMU to avoid corrupting a new kernel started with kexec. Signed-off-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-08-16 17:27:28 +01:00
Joerg Roedel	13cf017446	iommu/vt-d: Make use of iova deferred flushing Remove the deferred flushing implementation in the Intel VT-d driver and use the one from the common iova code instead. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:53 +02:00
Joerg Roedel	c8acb28b33	iommu/vt-d: Allow to flush more than 4GB of device TLBs The shift qi_flush_dev_iotlb() is done on an int, which limits the mask to 32 bits. Make the mask 64 bits wide so that more than 4GB of address range can be flushed at once. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:53 +02:00
Joerg Roedel	9003d61863	iommu/amd: Make use of iova queue flushing Rip out the implementation in the AMD IOMMU driver and use the one in the common iova code instead. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:52 +02:00
Joerg Roedel	9a005a800a	iommu/iova: Add flush timer Add a timer to flush entries from the Flush-Queues every 10ms. This makes sure that no stale TLB entries remain for too long after an IOVA has been unmapped. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:52 +02:00
Joerg Roedel	8109c2a2f8	iommu/iova: Add locking to Flush-Queues The lock is taken from the same CPU most of the time. But having it allows to flush the queue also from another CPU if necessary. This will be used by a timer to regularily flush any pending IOVAs from the Flush-Queues. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:52 +02:00
Joerg Roedel	fb418dab8a	iommu/iova: Add flush counters to Flush-Queue implementation There are two counters: * fq_flush_start_cnt - Increased when a TLB flush is started. * fq_flush_finish_cnt - Increased when a TLB flush is finished. The fq_flush_start_cnt is assigned to every Flush-Queue entry on its creation. When freeing entries from the Flush-Queue, the value in the entry is compared to the fq_flush_finish_cnt. The entry can only be freed when its value is less than the value of fq_flush_finish_cnt. The reason for these counters it to take advantage of IOMMU TLB flushes that happened on other CPUs. These already flushed the TLB for Flush-Queue entries on other CPUs so that they can already be freed without flushing the TLB again. This makes it less likely that the Flush-Queue is full and saves IOMMU TLB flushes. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:51 +02:00
Joerg Roedel	1928210107	iommu/iova: Implement Flush-Queue ring buffer Add a function to add entries to the Flush-Queue ring buffer. If the buffer is full, call the flush-callback and free the entries. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:51 +02:00
Joerg Roedel	42f87e71c3	iommu/iova: Add flush-queue data structures This patch adds the basic data-structures to implement flush-queues in the generic IOVA code. It also adds the initialization and destroy routines for these data structures. The initialization routine is designed so that the use of this feature is optional for the users of IOVA code. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:50 +02:00
Robin Murphy	da4b02750a	iommu/of: Fix of_iommu_configure() for disabled IOMMUs Sudeep reports that the logic got slightly broken when a PCI iommu-map entry targets an IOMMU marked as disabled in DT, since of_pci_map_rid() succeeds in following a phandle, and of_iommu_xlate() doesn't return an error value, but we miss checking whether ops was actually non-NULL. Whilst this could be solved with a point fix in of_pci_iommu_init(), it suggests that all the juggling of ERR_PTR values through the ops pointer is proving rather too complicated for its own good, so let's instead simplify the whole flow (with a side-effect of eliminating the cause of the bug). The fact that we now rely on iommu_fwspec means that we no longer need to pass around an iommu_ops pointer at all - we can simply propagate a regular int return value until we know whether we have a viable IOMMU, then retrieve the ops from the fwspec if and when we actually need them. This makes everything a bit more uniform and certainly easier to follow. Fixes: `d87beb7492` ("iommu/of: Handle PCI aliases properly") Reported-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:50 +02:00
Joerg Roedel	f42c223514	iommu/s390: Add support for iommu_device handling Add support for the iommu_device_register interface to make the s390 hardware iommus visible to the iommu core and in sysfs. Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:22:45 +02:00
Baoquan He	20b46dff13	iommu/amd: Disable iommu only if amd_iommu=off is specified It's ok to disable iommu early in normal kernel or in kdump kernel when amd_iommu=off is specified. While we should not disable it in kdump kernel when on-flight dma is still on-going. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:41 +02:00
Baoquan He	daae2d25a4	iommu/amd: Don't copy GCR3 table root pointer When iommu is pre_enabled in kdump kernel, if a device is set up with guest translations (DTE.GV=1), then don't copy GCR3 table root pointer but move the device over to an empty guest-cr3 table and handle the faults in the PPR log (which answer them with INVALID). After all these PPR faults are recoverable for the device and we should not allow the device to change old-kernels data when we don't have to. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:41 +02:00
Baoquan He	b336781b82	iommu/amd: Allocate memory below 4G for dev table if translation pre-enabled AMD pointed out it's unsafe to update the device-table while iommu is enabled. It turns out that device-table pointer update is split up into two 32bit writes in the IOMMU hardware. So updating it while the IOMMU is enabled could have some nasty side effects. The safe way to work around this is to always allocate the device-table below 4G, including the old device-table in normal kernel and the device-table used for copying the content of the old device-table in kdump kernel. Meanwhile we need check if the address of old device-table is above 4G because it might has been touched accidentally in corrupted 1st kernel. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:40 +02:00
Baoquan He	df3f7a6e8e	iommu/amd: Use is_attach_deferred call-back Implement call-back is_attach_deferred and use it to defer the domain attach from iommu driver init to device driver init when iommu is pre-enabled in kdump kernel. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:39 +02:00
Baoquan He	e01d1913b0	iommu: Add is_attach_deferred call-back to iommu-ops This new call-back will be used to check if the domain attach need be deferred for now. If yes, the domain attach/detach will return directly. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:39 +02:00
Baoquan He	53019a9e88	iommu/amd: Do sanity check for address translation and irq remap of old dev table entry Firstly split the dev table entry copy into address translation part and irq remapping part. Because these two parts could be enabled independently. Secondly do sanity check for address translation and irq remap of old dev table entry separately. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:39 +02:00
Baoquan He	3ac3e5ee5e	iommu/amd: Copy old trans table from old kernel Here several things need be done: - If iommu is pre-enabled in a normal kernel, just disable it and print warning. - If any one of IOMMUs is not pre-enabled in kdump kernel, just continue as it does in normal kernel. - If failed to copy dev table of old kernel, continue to proceed as it does in normal kernel. - Only if all IOMMUs are pre-enabled and copy dev table is done well, free the dev table allocated in early_amd_iommu_init() and make amd_iommu_dev_table point to the copied one. - Disable and Re-enable event/cmd buffer, install the copied DTE table to reg, and detect and enable guest vapic. - Flush all caches Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:39 +02:00
Baoquan He	45a01c4293	iommu/amd: Add function copy_dev_tables() Add function copy_dev_tables to copy the old DEV table entries of the panicked kernel to the new allocated device table. Since all iommus share the same device table the copy only need be done one time. Here add a new global old_dev_tbl_cpy to point to the newly allocated device table which the content of old device table will be copied to. Besides, we also need to: - Check whether all IOMMUs actually use the same device table with the same size - Verify that the size of the old device table is the expected size. - Reserve the old domain id occupied in 1st kernel to avoid touching the old io-page tables. Then on-flight DMA can continue looking it up. And also define MACRO DEV_DOMID_MASK to replace magic number 0xffffULL, it can be reused in copy_dev_tables(). Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:39 +02:00
Baoquan He	07a80a6b59	iommu/amd: Define bit fields for DTE particularly In AMD-Vi spec several bits of IO PTE fields and DTE fields are similar so that both of them can share the same MACRO definition. However defining them respectively can make code more read-able. Do it now. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:38 +02:00

... 3 4 5 6 7 ...

2343 Commits