Commit Graph

125 Commits

Author SHA1 Message Date
Daniel Drake da72a379b2 iommu/vt-d: Ignore devices with out-of-spec domain number
VMD subdevices are created with a PCI domain ID of 0x10000 or
higher.

These subdevices are also handled like all other PCI devices by
dmar_pci_bus_notifier().

However, when dmar_alloc_pci_notify_info() take records of such devices,
it will truncate the domain ID to a u16 value (in info->seg).
The device at (e.g.) 10000:00:02.0 is then treated by the DMAR code as if
it is 0000:00:02.0.

In the unlucky event that a real device also exists at 0000:00:02.0 and
also has a device-specific entry in the DMAR table,
dmar_insert_dev_scope() will crash on:
   BUG_ON(i >= devices_cnt);

That's basically a sanity check that only one PCI device matches a
single DMAR entry; in this case we seem to have two matching devices.

Fix this by ignoring devices that have a domain number higher than
what can be looked up in the DMAR table.

This problem was carefully diagnosed by Jian-Hong Pan.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Fixes: 59ce0515cd ("iommu/vt-d: Update DRHD/RMRR/ATSR device scope caches when PCI hotplug happens")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-03-14 09:38:39 +01:00
Zhenzhong Duan b0bb0c22c4 iommu/vt-d: Fix the wrong printing in RHSA parsing
When base address in RHSA structure doesn't match base address in
each DRHD structure, the base address in last DRHD is printed out.

This doesn't make sense when there are multiple DRHD units, fix it
by printing the buggy RHSA's base address.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@gmail.com>
Fixes: fd0c889489 ("intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-03-14 09:37:58 +01:00
Hans de Goede 5983369644 iommu/vt-d: dmar: replace WARN_TAINT with pr_warn + add_taint
Quoting from the comment describing the WARN functions in
include/asm-generic/bug.h:

 * WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report
 * significant kernel issues that need prompt attention if they should ever
 * appear at runtime.
 *
 * Do not use these macros when checking for invalid external inputs

The (buggy) firmware tables which the dmar code was calling WARN_TAINT
for really are invalid external inputs. They are not under the kernel's
control and the issues in them cannot be fixed by a kernel update.
So logging a backtrace, which invites bug reports to be filed about this,
is not helpful.

Some distros, e.g. Fedora, have tools watching for the kernel backtraces
logged by the WARN macros and offer the user an option to file a bug for
this when these are encountered. The WARN_TAINT in warn_invalid_dmar()
+ another iommu WARN_TAINT, addressed in another patch, have lead to over
a 100 bugs being filed this way.

This commit replaces the WARN_TAINT("...") calls, with
pr_warn(FW_BUG "...") + add_taint(TAINT_FIRMWARE_WORKAROUND, ...) calls
avoiding the backtrace and thus also avoiding bug-reports being filed
about this against the kernel.

Fixes: fd0c889489 ("intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables")
Fixes: e625b4a95d ("iommu/vt-d: Parse ANDD records")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200309140138.3753-2-hdegoede@redhat.com
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1564895
2020-03-13 14:25:06 +01:00
Qian Cai f515241652 iommu/vt-d: Silence RCU-list debugging warnings
Similar to the commit 02d715b4a8 ("iommu/vt-d: Fix RCU list debugging
warnings"), there are several other places that call
list_for_each_entry_rcu() outside of an RCU read side critical section
but with dmar_global_lock held. Silence those false positives as well.

 drivers/iommu/intel-iommu.c:4288 RCU-list traversed in non-reader section!!
 1 lock held by swapper/0/1:
  #0: ffffffff935892c8 (dmar_global_lock){+.+.}, at: intel_iommu_init+0x1ad/0xb97

 drivers/iommu/dmar.c:366 RCU-list traversed in non-reader section!!
 1 lock held by swapper/0/1:
  #0: ffffffff935892c8 (dmar_global_lock){+.+.}, at: intel_iommu_init+0x125/0xb97

 drivers/iommu/intel-iommu.c:5057 RCU-list traversed in non-reader section!!
 1 lock held by swapper/0/1:
  #0: ffffffffa71892c8 (dmar_global_lock){++++}, at: intel_iommu_init+0x61a/0xb13

Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-03-10 10:30:52 +01:00
Lu Baolu 857f081426 iommu/vt-d: Remove unnecessary WARN_ON_ONCE()
Address field in device TLB invalidation descriptor is qualified
by the S field. If S field is zero, a single page at page address
specified by address [63:12] is requested to be invalidated. If S
field is set, the least significant bit in the address field with
value 0b (say bit N) indicates the invalidation address range. The
spec doesn't require the address [N - 1, 0] to be cleared, hence
remove the unnecessary WARN_ON_ONCE().

Otherwise, the caller might set "mask = MAX_AGAW_PFN_WIDTH" in order
to invalidating all the cached mappings on an endpoint, and below
overflow error will be triggered.

[...]
UBSAN: Undefined behaviour in drivers/iommu/dmar.c:1354:3
shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[...]

Reported-and-tested-by: Frank <fgndev@posteo.de>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-24 15:36:27 +01:00
jimyan 53291622e2 iommu/vt-d: Don't reject Host Bridge due to scope mismatch
On a system with two host bridges(0000:00:00.0,0000:80:00.0), iommu
initialization fails with

    DMAR: Device scope type does not match for 0000:80:00.0

This is because the DMAR table reports this device as having scope 2
(ACPI_DMAR_SCOPE_TYPE_BRIDGE):

but the device has a type 0 PCI header:
80:00.0 Class 0600: Device 8086:2020 (rev 06)
00: 86 80 20 20 47 05 10 00 06 00 00 06 10 00 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 00 00
30: 00 00 00 00 90 00 00 00 00 00 00 00 00 01 00 00

VT-d works perfectly on this system, so there's no reason to bail out
on initialization due to this apparent scope mismatch. Add the class
0x06 ("PCI_BASE_CLASS_BRIDGE") as a heuristic for allowing DMAR
initialization for non-bridge PCI devices listed with scope bridge.

Signed-off-by: jimyan <jimyan@baidu.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-17 10:57:30 +01:00
Lu Baolu 33cd6e642d iommu/vt-d: Flush PASID-based iotlb for iova over first level
When software has changed first-level tables, it should invalidate
the affected IOTLB and the paging-structure-caches using the PASID-
based-IOTLB Invalidate Descriptor defined in spec 6.5.2.4.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:58 +01:00
Deepa Dinamani 6c3a44ed3c iommu/vt-d: Turn off translations at shutdown
The intel-iommu driver assumes that the iommu state is
cleaned up at the start of the new kernel.
But, when we try to kexec boot something other than the
Linux kernel, the cleanup cannot be relied upon.
Hence, cleanup before we go down for reboot.

Keeping the cleanup at initialization also, in case BIOS
leaves the IOMMU enabled.

I considered turning off iommu only during kexec reboot, but a clean
shutdown seems always a good idea. But if someone wants to make it
conditional, such as VMM live update, we can do that.  There doesn't
seem to be such a condition at this time.

Tested that before, the info message
'DMAR: Translation was enabled for <iommu> but we are not in kdump mode'
would be reported for each iommu. The message will not appear when the
DMA-remapping is not enabled on entry to the kernel.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-11-11 16:07:13 +01:00
Kyung Min Park fd730007a0 iommu/vt-d: Add Scalable Mode fault information
Intel VT-d specification revision 3 added support for Scalable Mode
Translation for DMA remapping. Add the Scalable Mode fault reasons to
show detailed fault reasons when the translation fault happens.

Link: https://software.intel.com/sites/default/files/managed/c5/15/vt-directed-io-spec.pdf

Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-09-11 12:36:53 +02:00
Thomas Gleixner 3b20eb2372 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 320
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms and conditions of the gnu general public license
  version 2 as published by the free software foundation this program
  is distributed in the hope it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not write to the free
  software foundation inc 59 temple place suite 330 boston ma 02111
  1307 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 33 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190530000435.254582722@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:37:05 +02:00
Gustavo A. R. Silva 553d66cb1e iommu/vt-d: Use struct_size() helper
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes, in particular in the
context in which this code is being used.

So, replace code of the following form:

size = sizeof(*info) + level * sizeof(info->path[0]);

with:

size = struct_size(info, path, level);

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-04-26 16:43:21 +02:00
Anshuman Khandual 98fa15f34c mm: replace all open encodings for NUMA_NO_NODE
Patch series "Replace all open encodings for NUMA_NO_NODE", v3.

All these places for replacement were found by running the following
grep patterns on the entire kernel code.  Please let me know if this
might have missed some instances.  This might also have replaced some
false positives.  I will appreciate suggestions, inputs and review.

1. git grep "nid == -1"
2. git grep "node == -1"
3. git grep "nid = -1"
4. git grep "node = -1"

This patch (of 2):

At present there are multiple places where invalid node number is
encoded as -1.  Even though implicitly understood it is always better to
have macros in there.  Replace these open encodings for an invalid node
number with the global macro NUMA_NO_NODE.  This helps remove NUMA
related assumptions like 'invalid node' from various places redirecting
them to a common definition.

Link: http://lkml.kernel.org/r/1545127933-10711-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	[ixgbe]
Acked-by: Jens Axboe <axboe@kernel.dk>			[mtip32xx]
Acked-by: Vinod Koul <vkoul@kernel.org>			[dmaengine.c]
Acked-by: Michael Ellerman <mpe@ellerman.id.au>		[powerpc]
Acked-by: Doug Ledford <dledford@redhat.com>		[drivers/infiniband]
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Hans Verkuil <hverkuil@xs4all.nl>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:14 -08:00
Julia Cartwright cffaaf0c81 iommu/dmar: Fix buffer overflow during PCI bus notification
Commit 57384592c4 ("iommu/vt-d: Store bus information in RMRR PCI
device path") changed the type of the path data, however, the change in
path type was not reflected in size calculations.  Update to use the
correct type and prevent a buffer overflow.

This bug manifests in systems with deep PCI hierarchies, and can lead to
an overflow of the static allocated buffer (dmar_pci_notify_info_buf),
or can lead to overflow of slab-allocated data.

   BUG: KASAN: global-out-of-bounds in dmar_alloc_pci_notify_info+0x1d5/0x2e0
   Write of size 1 at addr ffffffff90445d80 by task swapper/0/1
   CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W       4.14.87-rt49-02406-gd0a0e96 #1
   Call Trace:
    ? dump_stack+0x46/0x59
    ? print_address_description+0x1df/0x290
    ? dmar_alloc_pci_notify_info+0x1d5/0x2e0
    ? kasan_report+0x256/0x340
    ? dmar_alloc_pci_notify_info+0x1d5/0x2e0
    ? e820__memblock_setup+0xb0/0xb0
    ? dmar_dev_scope_init+0x424/0x48f
    ? __down_write_common+0x1ec/0x230
    ? dmar_dev_scope_init+0x48f/0x48f
    ? dmar_free_unused_resources+0x109/0x109
    ? cpumask_next+0x16/0x20
    ? __kmem_cache_create+0x392/0x430
    ? kmem_cache_create+0x135/0x2f0
    ? e820__memblock_setup+0xb0/0xb0
    ? intel_iommu_init+0x170/0x1848
    ? _raw_spin_unlock_irqrestore+0x32/0x60
    ? migrate_enable+0x27a/0x5b0
    ? sched_setattr+0x20/0x20
    ? migrate_disable+0x1fc/0x380
    ? task_rq_lock+0x170/0x170
    ? try_to_run_init_process+0x40/0x40
    ? locks_remove_file+0x85/0x2f0
    ? dev_prepare_static_identity_mapping+0x78/0x78
    ? rt_spin_unlock+0x39/0x50
    ? lockref_put_or_lock+0x2a/0x40
    ? dput+0x128/0x2f0
    ? __rcu_read_unlock+0x66/0x80
    ? __fput+0x250/0x300
    ? __rcu_read_lock+0x1b/0x30
    ? mntput_no_expire+0x38/0x290
    ? e820__memblock_setup+0xb0/0xb0
    ? pci_iommu_init+0x25/0x63
    ? pci_iommu_init+0x25/0x63
    ? do_one_initcall+0x7e/0x1c0
    ? initcall_blacklisted+0x120/0x120
    ? kernel_init_freeable+0x27b/0x307
    ? rest_init+0xd0/0xd0
    ? kernel_init+0xf/0x120
    ? rest_init+0xd0/0xd0
    ? ret_from_fork+0x1f/0x40
   The buggy address belongs to the variable:
    dmar_pci_notify_info_buf+0x40/0x60

Fixes: 57384592c4 ("iommu/vt-d: Store bus information in RMRR PCI device path")
Signed-off-by: Julia Cartwright <julia@ni.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-26 11:24:37 +01:00
Linus Torvalds 8e143b90e4 IOMMU Updates for Linux v4.21
Including (in no particular order):
 
 	- Page table code for AMD IOMMU now supports large pages where
 	  smaller page-sizes were mapped before. VFIO had to work around
 	  that in the past and I included a patch to remove it (acked by
 	  Alex Williamson)
 
 	- Patches to unmodularize a couple of IOMMU drivers that would
 	  never work as modules anyway.
 
 	- Work to unify the the iommu-related pointers in
 	  'struct device' into one pointer. This work is not finished
 	  yet, but will probably be in the next cycle.
 
 	- NUMA aware allocation in iommu-dma code
 
 	- Support for r8a774a1 and r8a774c0 in the Renesas IOMMU driver
 
 	- Scalable mode support for the Intel VT-d driver
 
 	- PM runtime improvements for the ARM-SMMU driver
 
 	- Support for the QCOM-SMMUv2 IOMMU hardware from Qualcom
 
 	- Various smaller fixes and improvements
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJcKkEoAAoJECvwRC2XARrjCCoQAJxsgaAF5Z0s7z8j2A9SkaGp
 SIMnUAI5mDOdyhTOAI+eehpRzg5UVYt/JjFYnHz8HWqbSc8YOvDvHafmhMFIwYvO
 hq5knbs6ns2jJNFO+M4dioDq+3THdqkGIF5xoHdGTP7cn9+XyQ8lAoHo0RuL122U
 PJGqX7Cp4XnFP4HMb3uQYhVeBV7mU+XqAdB+4aDnQkzI5LkQCRr74GcqOm+Rlnyc
 cmQWc2arUMjgc1TJIrex8dx9dT6lq8kOmhyEg/IjHeGaZyJ3HqA+30XDDLEExN0G
 MeVawuxJz40HgXlkXr+iZTQtIFYkXdKvJH6rptMbOfbDeDz+YZ01TbtAMMH9o4jX
 yxjjMjdcWTsWYQ/MHHdsoMP34cajCi/EYPMNksbycw+E3Y+X/bSReCoWC0HUK8/+
 Z4TpZ9mZVygtJR+QNZ+pE9oiJpb4sroM10zTnbMoVHNnvfsO01FYk7FMPkolSKLw
 zB4MDswQYgchoFR9Z4ZB4PycYTzeafLKYgDPDoD1vIJgDavuidwvDWDRTDc+aMWM
 siIIewq19To9jDJkVjX4dsT/p99KVKgAR/Ps6jjWkAroha7g6GcmlYZHIJnyop04
 jiaSXUsk8aRucP/CRz5xdMmaGoN7BsNmpUjcrquc6Povk/6gvXvpY04oCs1+gNMX
 ipL9E3GTFCVBubRFrksv
 =DT9A
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:

 - Page table code for AMD IOMMU now supports large pages where smaller
   page-sizes were mapped before. VFIO had to work around that in the
   past and I included a patch to remove it (acked by Alex Williamson)

 - Patches to unmodularize a couple of IOMMU drivers that would never
   work as modules anyway.

 - Work to unify the the iommu-related pointers in 'struct device' into
   one pointer. This work is not finished yet, but will probably be in
   the next cycle.

 - NUMA aware allocation in iommu-dma code

 - Support for r8a774a1 and r8a774c0 in the Renesas IOMMU driver

 - Scalable mode support for the Intel VT-d driver

 - PM runtime improvements for the ARM-SMMU driver

 - Support for the QCOM-SMMUv2 IOMMU hardware from Qualcom

 - Various smaller fixes and improvements

* tag 'iommu-updates-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (78 commits)
  iommu: Check for iommu_ops == NULL in iommu_probe_device()
  ACPI/IORT: Don't call iommu_ops->add_device directly
  iommu/of: Don't call iommu_ops->add_device directly
  iommu: Consolitate ->add/remove_device() calls
  iommu/sysfs: Rename iommu_release_device()
  dmaengine: sh: rcar-dmac: Use device_iommu_mapped()
  xhci: Use device_iommu_mapped()
  powerpc/iommu: Use device_iommu_mapped()
  ACPI/IORT: Use device_iommu_mapped()
  iommu/of: Use device_iommu_mapped()
  driver core: Introduce device_iommu_mapped() function
  iommu/tegra: Use helper functions to access dev->iommu_fwspec
  iommu/qcom: Use helper functions to access dev->iommu_fwspec
  iommu/of: Use helper functions to access dev->iommu_fwspec
  iommu/mediatek: Use helper functions to access dev->iommu_fwspec
  iommu/ipmmu-vmsa: Use helper functions to access dev->iommu_fwspec
  iommu/dma: Use helper functions to access dev->iommu_fwspec
  iommu/arm-smmu: Use helper functions to access dev->iommu_fwspec
  ACPI/IORT: Use helper functions to access dev->iommu_fwspec
  iommu: Introduce wrappers around dev->iommu_fwspec
  ...
2019-01-01 15:55:29 -08:00
Lu Baolu 5d308fc1ec iommu/vt-d: Add 256-bit invalidation descriptor support
Intel vt-d spec rev3.0 requires software to use 256-bit
descriptors in invalidation queue. As the spec reads in
section 6.5.2:

Remapping hardware supporting Scalable Mode Translations
(ECAP_REG.SMTS=1) allow software to additionally program
the width of the descriptors (128-bits or 256-bits) that
will be written into the Queue. Software should setup the
Invalidation Queue for 256-bit descriptors before progra-
mming remapping hardware for scalable-mode translation as
128-bit descriptors are treated as invalid descriptors
(see Table 21 in Section 6.5.2.10) in scalable-mode.

This patch adds 256-bit invalidation descriptor support
if the hardware presents scalable mode capability.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-12-11 10:45:58 +01:00
Lu Baolu 89a6079df7 iommu/vt-d: Force IOMMU on for platform opt in hint
Intel VT-d spec added a new DMA_CTRL_PLATFORM_OPT_IN_FLAG flag in DMAR
ACPI table [1] for BIOS to report compliance about platform initiated
DMA restricted to RMRR ranges when transferring control to the OS. This
means that during OS boot, before it enables IOMMU none of the connected
devices can bypass DMA protection for instance by overwriting the data
structures used by the IOMMU. The OS also treats this as a hint that the
IOMMU should be enabled to prevent DMA attacks from possible malicious
devices.

A use of this flag is Kernel DMA protection for Thunderbolt [2] which in
practice means that IOMMU should be enabled for PCIe devices connected
to the Thunderbolt ports. With IOMMU enabled for these devices, all DMA
operations are limited in the range reserved for it, thus the DMA
attacks are prevented. All these devices are enumerated in the PCI/PCIe
module and marked with an untrusted flag.

This forces IOMMU to be enabled if DMA_CTRL_PLATFORM_OPT_IN_FLAG is set
in DMAR ACPI table and there are PCIe devices marked as untrusted in the
system. This can be turned off by adding "intel_iommu=off" in the kernel
command line, if any problems are found.

[1] https://software.intel.com/sites/default/files/managed/c5/15/vt-directed-io-spec.pdf
[2] https://docs.microsoft.com/en-us/windows/security/information-protection/kernel-dma-protection-for-thunderbolt

Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
2018-12-05 12:01:55 +03:00
Jacob Pan 1c48db4492 iommu/vt-d: Fix dev iotlb pfsid use
PFSID should be used in the invalidation descriptor for flushing
device IOTLBs on SRIOV VFs.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: stable@vger.kernel.org
Cc: "Ashok Raj" <ashok.raj@intel.com>
Cc: "Lu Baolu" <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-07-06 13:26:10 +02:00
Kees Cook 6396bb2215 treewide: kzalloc() -> kcalloc()
The kzalloc() function has a 2-factor argument form, kcalloc(). This
patch replaces cases of:

        kzalloc(a * b, gfp)

with:
        kcalloc(a * b, gfp)

as well as handling cases of:

        kzalloc(a * b * c, gfp)

with:

        kzalloc(array3_size(a, b, c), gfp)

as it's slightly less ugly than:

        kzalloc_array(array_size(a, b), c, gfp)

This does, however, attempt to ignore constant size factors like:

        kzalloc(4 * 1024, gfp)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
  kzalloc(
-	(sizeof(TYPE)) * E
+	sizeof(TYPE) * E
  , ...)
|
  kzalloc(
-	(sizeof(THING)) * E
+	sizeof(THING) * E
  , ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
  kzalloc(
-	sizeof(u8) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(__u8) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(char) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(unsigned char) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(u8) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(__u8) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(char) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(unsigned char) * COUNT
+	COUNT
  , ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (COUNT_ID)
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * COUNT_ID
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (COUNT_CONST)
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * COUNT_CONST
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (COUNT_ID)
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * COUNT_ID
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (COUNT_CONST)
+	COUNT_CONST, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * COUNT_CONST
+	COUNT_CONST, sizeof(THING)
  , ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

- kzalloc
+ kcalloc
  (
-	SIZE * COUNT
+	COUNT, SIZE
  , ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
  kzalloc(
-	sizeof(TYPE) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(THING) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
  kzalloc(
-	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kzalloc(
-	sizeof(THING1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(THING1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
  kzalloc(
-	(COUNT) * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
)

// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
  kzalloc(C1 * C2 * C3, ...)
|
  kzalloc(
-	(E1) * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	(E1) * (E2) * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	(E1) * (E2) * (E3)
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	E1 * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
)

// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@

(
  kzalloc(sizeof(THING) * C2, ...)
|
  kzalloc(sizeof(TYPE) * C2, ...)
|
  kzalloc(C1 * C2 * C3, ...)
|
  kzalloc(C1 * C2, ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (E2)
+	E2, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * E2
+	E2, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (E2)
+	E2, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * E2
+	E2, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	(E1) * E2
+	E1, E2
  , ...)
|
- kzalloc
+ kcalloc
  (
-	(E1) * (E2)
+	E1, E2
  , ...)
|
- kzalloc
+ kcalloc
  (
-	E1 * E2
+	E1, E2
  , ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12 16:19:22 -07:00
Joerg Roedel 1f56835711 Merge branches 'arm/io-pgtable', 'arm/qcom', 'arm/tegra', 'x86/vt-d', 'x86/amd' and 'core' into next 2018-05-29 16:58:17 +02:00
Joerg Roedel a85894cd77 iommu/vt-d: Use WARN_ON_ONCE instead of BUG_ON in qi_flush_dev_iotlb()
A misaligned address is only worth a warning, and not
stopping the while execution path with a BUG_ON().

Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-05-03 16:32:11 +02:00
Changbin Du 0dfc0c792d iommu/vt-d: fix shift-out-of-bounds in bug checking
It allows to flush more than 4GB of device TLBs. So the mask should be
64bit wide. UBSAN captured this fault as below.

[    3.760024] ================================================================================
[    3.768440] UBSAN: Undefined behaviour in drivers/iommu/dmar.c:1348:3
[    3.774864] shift exponent 64 is too large for 32-bit type 'int'
[    3.780853] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G     U            4.17.0-rc1+ #89
[    3.788661] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016
[    3.796034] Call Trace:
[    3.798472]  <IRQ>
[    3.800479]  dump_stack+0x90/0xfb
[    3.803787]  ubsan_epilogue+0x9/0x40
[    3.807353]  __ubsan_handle_shift_out_of_bounds+0x10e/0x170
[    3.812916]  ? qi_flush_dev_iotlb+0x124/0x180
[    3.817261]  qi_flush_dev_iotlb+0x124/0x180
[    3.821437]  iommu_flush_dev_iotlb+0x94/0xf0
[    3.825698]  iommu_flush_iova+0x10b/0x1c0
[    3.829699]  ? fq_ring_free+0x1d0/0x1d0
[    3.833527]  iova_domain_flush+0x25/0x40
[    3.837448]  fq_flush_timeout+0x55/0x160
[    3.841368]  ? fq_ring_free+0x1d0/0x1d0
[    3.845200]  ? fq_ring_free+0x1d0/0x1d0
[    3.849034]  call_timer_fn+0xbe/0x310
[    3.852696]  ? fq_ring_free+0x1d0/0x1d0
[    3.856530]  run_timer_softirq+0x223/0x6e0
[    3.860625]  ? sched_clock+0x5/0x10
[    3.864108]  ? sched_clock+0x5/0x10
[    3.867594]  __do_softirq+0x1b5/0x6f5
[    3.871250]  irq_exit+0xd4/0x130
[    3.874470]  smp_apic_timer_interrupt+0xb8/0x2f0
[    3.879075]  apic_timer_interrupt+0xf/0x20
[    3.883159]  </IRQ>
[    3.885255] RIP: 0010:poll_idle+0x60/0xe7
[    3.889252] RSP: 0018:ffffb1b201943e30 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[    3.896802] RAX: 0000000080200000 RBX: 000000000000008e RCX: 000000000000001f
[    3.903918] RDX: 0000000000000000 RSI: 000000002819aa06 RDI: 0000000000000000
[    3.911031] RBP: ffff9e93c6b33280 R08: 00000010f717d567 R09: 000000000010d205
[    3.918146] R10: ffffb1b201943df8 R11: 0000000000000001 R12: 00000000e01b169d
[    3.925260] R13: 0000000000000000 R14: ffffffffb12aa400 R15: 0000000000000000
[    3.932382]  cpuidle_enter_state+0xb4/0x470
[    3.936558]  do_idle+0x222/0x310
[    3.939779]  cpu_startup_entry+0x78/0x90
[    3.943693]  start_secondary+0x205/0x2e0
[    3.947607]  secondary_startup_64+0xa5/0xb0
[    3.951783] ================================================================================

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-05-03 16:32:11 +02:00
Dmitry Safonov 6c50d79f66 iommu/vt-d: Ratelimit each dmar fault printing
There is a ratelimit for printing, but it's incremented each time the
cpu recives dmar fault interrupt. While one interrupt may signal about
*many* faults.
So, measuring the impact it turns out that reading/clearing one fault
takes < 1 usec, and printing info about the fault takes ~170 msec.

Having in mind that maximum number of fault recording registers per
remapping hardware unit is 256.. IRQ handler may run for (170*256) msec.
And as fault-serving loop runs without a time limit, during servicing
new faults may occur..

Ratelimit each fault printing rather than each irq printing.

Fixes: commit c43fce4eeb ("iommu/vt-d: Ratelimit fault handler")

BUG: spinlock lockup suspected on CPU#0, CliShell/9903
 lock: 0xffffffff81a47440, .magic: dead4ead, .owner: kworker/u16:2/8915, .owner_cpu: 6
CPU: 0 PID: 9903 Comm: CliShell
Call Trace:$\n'
[..] dump_stack+0x65/0x83$\n'
[..] spin_dump+0x8f/0x94$\n'
[..] do_raw_spin_lock+0x123/0x170$\n'
[..] _raw_spin_lock_irqsave+0x32/0x3a$\n'
[..] uart_chars_in_buffer+0x20/0x4d$\n'
[..] tty_chars_in_buffer+0x18/0x1d$\n'
[..] n_tty_poll+0x1cb/0x1f2$\n'
[..] tty_poll+0x5e/0x76$\n'
[..] do_select+0x363/0x629$\n'
[..] compat_core_sys_select+0x19e/0x239$\n'
[..] compat_SyS_select+0x98/0xc0$\n'
[..] sysenter_dispatch+0x7/0x25$\n'
[..]
NMI backtrace for cpu 6
CPU: 6 PID: 8915 Comm: kworker/u16:2
Workqueue: dmar_fault dmar_fault_work
Call Trace:$\n'
[..] wait_for_xmitr+0x26/0x8f$\n'
[..] serial8250_console_putchar+0x1c/0x2c$\n'
[..] uart_console_write+0x40/0x4b$\n'
[..] serial8250_console_write+0xe6/0x13f$\n'
[..] call_console_drivers.constprop.13+0xce/0x103$\n'
[..] console_unlock+0x1f8/0x39b$\n'
[..] vprintk_emit+0x39e/0x3e6$\n'
[..] printk+0x4d/0x4f$\n'
[..] dmar_fault+0x1a8/0x1fc$\n'
[..] dmar_fault_work+0x15/0x17$\n'
[..] process_one_work+0x1e8/0x3a9$\n'
[..] worker_thread+0x25d/0x345$\n'
[..] kthread+0xea/0xf2$\n'
[..] ret_from_fork+0x58/0x90$\n'

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: iommu@lists.linux-foundation.org
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-05-03 14:38:58 +02:00
Dmitry Safonov d15a339eac iommu/vt-d: Add __init for dmar_register_bus_notifier()
It's called only from intel_iommu_init(), which is init function.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-13 17:40:53 +01:00
Lu Baolu 973b546451 iommu/vt-d: Clear Page Request Overflow fault bit
Currently Page Request Overflow bit in IOMMU Fault Status register
is not cleared. Not clearing this bit would mean that any  future
page-request is going to be automatically dropped by IOMMU.

Suggested-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-11-03 10:51:33 -06:00
Joerg Roedel ec154bf56b iommu/vt-d: Don't register bus-notifier under dmar_global_lock
The notifier function will take the dmar_global_lock too, so
lockdep complains about inverse locking order when the
notifier is registered under the dmar_global_lock.

Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
Fixes: 59ce0515cd ('iommu/vt-d: Update DRHD/RMRR/ATSR device scope caches when PCI hotplug happens')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-10-06 15:09:30 +02:00
Arnd Bergmann 3bd71e18c5 iommu/vt-d: Fix harmless section mismatch warning
Building with gcc-4.6 results in this warning due to
dmar_table_print_dmar_entry being inlined as in newer compiler versions:

WARNING: vmlinux.o(.text+0x5c8bee): Section mismatch in reference from the function dmar_walk_remapping_entries() to the function .init.text:dmar_table_print_dmar_entry()
The function dmar_walk_remapping_entries() references
the function __init dmar_table_print_dmar_entry().
This is often because dmar_walk_remapping_entries lacks a __init
annotation or the annotation of dmar_table_print_dmar_entry is wrong.

This removes the __init annotation to avoid the warning. On compilers
that don't show the warning today, this should have no impact since the
function gets inlined anyway.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-09-19 11:44:46 +02:00
Joerg Roedel c8acb28b33 iommu/vt-d: Allow to flush more than 4GB of device TLBs
The shift qi_flush_dev_iotlb() is done on an int, which
limits the mask to 32 bits. Make the mask 64 bits wide so
that more than 4GB of address range can be flushed at once.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-15 18:23:53 +02:00
Andy Shevchenko 94116f8126 ACPI: Switch to use generic guid_t in acpi_evaluate_dsm()
acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16
bytes. Instead we convert them to use guid_t type. At the same time we
convert current users.

acpi_str_to_uuid() becomes useless after the conversion and it's safe to
get rid of it.

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-06-07 12:20:49 +02:00
Andy Shevchenko f9808079aa iommu/dmar: Remove redundant ' != 0' when check return code
Usual pattern when we check for return code, which might be negative
errno, is either (ret) or (!ret).

Remove extra ' != 0' from condition.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-03-22 15:42:17 +01:00
Andy Shevchenko 3f6db6591a iommu/dmar: Remove redundant assignment of ret
There is no need to assign ret to 0 in some cases. Moreover it might
shadow some errors in the future.

Remove such assignments.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-03-22 15:42:17 +01:00
Andy Shevchenko 4a8ed2b819 iommu/dmar: Return directly from a loop in dmar_dev_scope_status()
There is no need to have a temporary variable.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-03-22 15:42:17 +01:00
Andy Shevchenko 8326c5d205 iommu/dmar: Rectify return code handling in detect_intel_iommu()
There is inconsistency in return codes across the functions called from
detect_intel_iommu().

Make it consistent and propagate return code to the caller.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-03-22 15:42:17 +01:00
Andy Shevchenko c37a01779b iommu/vt-d: Fix crash on boot when DMAR is disabled
By default CONFIG_INTEL_IOMMU_DEFAULT_ON is not set and thus
dmar_disabled variable is set.

Intel IOMMU driver based on above doesn't set intel_iommu_enabled
variable.

The commit b0119e8708 ("iommu: Introduce new 'struct iommu_device'")
mistakenly assumes it never happens and tries to unregister not ever
registered resources, which crashes the kernel at boot time:

	BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
	IP: iommu_device_unregister+0x31/0x60

Make unregister procedure conditional in free_iommu().

Fixes: b0119e8708 ("iommu: Introduce new 'struct iommu_device'")
Cc: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-02-22 12:25:31 +01:00
Joerg Roedel 8d2932dd06 Merge branches 'iommu/fixes', 'arm/exynos', 'arm/renesas', 'arm/smmu', 'arm/mediatek', 'arm/core', 'x86/vt-d' and 'core' into next 2017-02-10 15:13:10 +01:00
Joerg Roedel 39ab9555c2 iommu: Add sysfs bindings for struct iommu_device
There is currently support for iommu sysfs bindings, but
those need to be implemented in the IOMMU drivers. Add a
more generic version of this by adding a struct device to
struct iommu_device and use that for the sysfs bindings.

Also convert the AMD and Intel IOMMU driver to make use of
it.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-02-10 13:44:57 +01:00
Joerg Roedel b0119e8708 iommu: Introduce new 'struct iommu_device'
This struct represents one hardware iommu in the iommu core
code. For now it only has the iommu-ops associated with it,
but that will be extended soon.

The register/unregister interface is also added, as well as
making use of it in the Intel and AMD IOMMU drivers.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-02-10 13:44:57 +01:00
Rafael J. Wysocki 696c7f8e03 ACPI / DMAR: Avoid passing NULL to acpi_put_table()
Linus reported that commit 174cc7187e "ACPICA: Tables: Back port
acpi_get_table_with_size() and early_acpi_os_unmap_memory() from
Linux kernel" added a new warning on his desktop system:

 ACPI Warning: Table ffffffff9fe6c0a0, Validation count is zero before decrement

which turns out to come from the acpi_put_table() in
detect_intel_iommu().

This happens if the DMAR table is not present in which case NULL is
passed to acpi_put_table() which doesn't check against that and
attempts to handle it regardless.

For this reason, check the pointer passed to acpi_put_table()
before invoking it.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Fixes: 6b11d1d677 ("ACPI / osl: Remove acpi_get_table_with_size()/early_acpi_os_unmap_memory() users")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-01-05 15:10:52 +01:00
Rafael J. Wysocki c8e008e2a6 Merge branches 'acpica' and 'acpi-scan'
* acpica:
  ACPI / osl: Remove deprecated acpi_get_table_with_size()/early_acpi_os_unmap_memory()
  ACPI / osl: Remove acpi_get_table_with_size()/early_acpi_os_unmap_memory() users
  ACPICA: Tables: Allow FADT to be customized with virtual address
  ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel

* acpi-scan:
  ACPI: do not warn if _BQC does not exist
2016-12-22 14:34:24 +01:00
Lv Zheng 6b11d1d677 ACPI / osl: Remove acpi_get_table_with_size()/early_acpi_os_unmap_memory() users
This patch removes the users of the deprectated APIs:
 acpi_get_table_with_size()
 early_acpi_os_unmap_memory()
The following APIs should be used instead of:
 acpi_get_table()
 acpi_put_table()

The deprecated APIs are invented to be a replacement of acpi_get_table()
during the early stage so that the early mapped pointer will not be stored
in ACPICA core and thus the late stage acpi_get_table() won't return a
wrong pointer. The mapping size is returned just because it is required by
early_acpi_os_unmap_memory() to unmap the pointer during early stage.

But as the mapping size equals to the acpi_table_header.length
(see acpi_tb_init_table_descriptor() and acpi_tb_validate_table()), when
such a convenient result is returned, driver code will start to use it
instead of accessing acpi_table_header to obtain the length.

Thus this patch cleans up the drivers by replacing returned table size with
acpi_table_header.length, and should be a no-op.

Reported-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-21 02:36:38 +01:00
Ashok Raj 1c387188c6 iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions
The VT-d specification (§8.3.3) says:
    ‘Virtual Functions’ of a ‘Physical Function’ are under the scope
    of the same remapping unit as the ‘Physical Function’.

The BIOS is not required to list all the possible VFs in the scope
tables, and arguably *shouldn't* make any attempt to do so, since there
could be a huge number of them.

This has been broken basically for ever — the VF is never going to match
against a specific unit's scope, so it ends up being assigned to the
INCLUDE_ALL IOMMU. Which was always actually correct by coincidence, but
now we're looking at Root-Complex integrated devices with SR-IOV support
it's going to start being wrong.

Fix it to simply use pci_physfn() before doing the lookup for PCI devices.

Cc: stable@vger.kernel.org
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2016-10-30 05:32:51 -06:00
Linus Torvalds dd9671172a IOMMU Updates for Linux v4.8
In the updates:
 
 	* Big endian support and preparation for defered probing for the
 	  Exynos IOMMU driver
 
 	* Simplifications in iommu-group id handling
 
 	* Support for Mediatek generation one IOMMU hardware
 
 	* Conversion of the AMD IOMMU driver to use the generic IOVA
 	  allocator. This driver now also benefits from the recent
 	  scalability improvements in the IOVA code.
 
 	* Preparations to use generic DMA mapping code in the Rockchip
 	  IOMMU driver
 
 	* Device tree adaption and conversion to use generic page-table
 	  code for the MSM IOMMU driver
 
 	* An iova_to_phys optimization in the ARM-SMMU driver to greatly
 	  improve page-table teardown performance with VFIO
 
 	* Various other small fixes and conversions
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJXl3e+AAoJECvwRC2XARrjMIgP/1Mm9qIfcaAxKY4ByqbVfrH8
 313PO6rpwUhhywUmnf/1F/x+JbuLv8MmRXfSc106mdB1rq9NXpkORYKrqVxs0cSq
 6u6TzZWbF6WN1ipqXxDITNFBSy7u97K1VuFaKyYFfLbg8xrkcdkMZJ7BqM2xIEdk
 rnRKcfHo6wsmCXJ6InsUPmKAqU6AfMewZTGjO+v77Gce0rZEbsJ8n7BRKC9vO2bc
 akvN2W+zzEUSyhbuyYQBG+agpmC5GJvz4u+6QvAP5sxTWfAsnwAoPpP4xxR+/KjT
 eicHlja4v0YK6Hr4AJaMxoKfKIrCdqpWm0D2tg/edyWZCeg98AW/w7/s0I8OD3ao
 Otj6IqC8nPk0pYciOeEPQ7aqPbvKAqU2FYWt7lWamrdr98u2R3p2nXGl0KthoAj6
 JqzrCZXvBS7sj1IPLlGpj939yvbKbjpE0p7y1qhI1VEBXoBWFNvlKydkYx76BTGK
 F6paGVqn2Zwy00AqAsylTEkvIK063zwShZ6nPqz4bMdVlgzjrjCzdDecjfbHr8Ic
 6D2oCwyF+RJ8qw+Ecm9EmWFik80sgb+iUTeeYEXNf+YzLYt5McIj7fi3N+sUPel3
 YJ4S4x0sIpgUZZ1i+rOo8ZPAFHRU6SRPYV+ewaeYKrMt+Un5dTn9SddpqrJdbiUu
 YrF36BaQjc123IRGKrSd
 =xiS2
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:

 - big-endian support and preparation for defered probing for the Exynos
   IOMMU driver

 - simplifications in iommu-group id handling

 - support for Mediatek generation one IOMMU hardware

 - conversion of the AMD IOMMU driver to use the generic IOVA allocator.
   This driver now also benefits from the recent scalability
   improvements in the IOVA code.

 - preparations to use generic DMA mapping code in the Rockchip IOMMU
   driver

 - device tree adaption and conversion to use generic page-table code
   for the MSM IOMMU driver

 - an iova_to_phys optimization in the ARM-SMMU driver to greatly
   improve page-table teardown performance with VFIO

 - various other small fixes and conversions

* tag 'iommu-updates-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (59 commits)
  iommu/amd: Initialize dma-ops domains with 3-level page-table
  iommu/amd: Update Alias-DTE in update_device_table()
  iommu/vt-d: Return error code in domain_context_mapping_one()
  iommu/amd: Use container_of to get dma_ops_domain
  iommu/amd: Flush iova queue before releasing dma_ops_domain
  iommu/amd: Handle IOMMU_DOMAIN_DMA in ops->domain_free call-back
  iommu/amd: Use dev_data->domain in get_domain()
  iommu/amd: Optimize map_sg and unmap_sg
  iommu/amd: Introduce dir2prot() helper
  iommu/amd: Implement timeout to flush unmap queues
  iommu/amd: Implement flush queue
  iommu/amd: Allow NULL pointer parameter for domain_flush_complete()
  iommu/amd: Set up data structures for flush queue
  iommu/amd: Remove align-parameter from __map_single()
  iommu/amd: Remove other remains of old address allocator
  iommu/amd: Make use of the generic IOVA allocator
  iommu/amd: Remove special mapping code for dma_ops path
  iommu/amd: Pass gfp-flags to iommu_map_page()
  iommu/amd: Implement apply_dm_region call-back
  iommu/amd: Create a list of reserved iova addresses
  ...
2016-08-01 07:25:10 -04:00
Linus Torvalds 194dc870a5 Add braces to avoid "ambiguous ‘else’" compiler warnings
Some of our "for_each_xyz()" macro constructs make gcc unhappy about
lack of braces around if-statements inside or outside the loop, because
the loop construct itself has a "if-then-else" statement inside of it.

The resulting warnings look something like this:

  drivers/gpu/drm/i915/i915_debugfs.c: In function ‘i915_dump_lrc’:
  drivers/gpu/drm/i915/i915_debugfs.c:2103:6: warning: suggest explicit braces to avoid ambiguous ‘else’ [-Wparentheses]
     if (ctx != dev_priv->kernel_context)
        ^

even if the code itself is fine.

Since the warning is fairly easy to avoid by adding a braces around the
if-statement near the for_each_xyz() construct, do so, rather than
disabling the otherwise potentially useful warning.

(The if-then-else statements used in the "for_each_xyz()" constructs are
designed to be inherently safe even with no braces, but in this case
it's quite understandable that gcc isn't really able to tell that).

This finally leaves the standard "allmodconfig" build with just a
handful of remaining warnings, so new and valid warnings hopefully will
stand out.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-27 20:03:31 -07:00
Nadav Amit 452014d2b4 iommu/vt-d: Remove unnecassary qi clflushes
According to the manual: "Hardware access to ...  invalidation queue ...
are always coherent."

Remove unnecassary clflushes accordingly.

Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-07-13 12:06:35 +02:00
Roland Dreier ffb2d1eb88 iommu/vt-d: Don't reject NTB devices due to scope mismatch
On a system with an Intel PCIe port configured as an NTB device, iommu
initialization fails with

    DMAR: Device scope type does not match for 0000:80:03.0

This is because the DMAR table reports this device as having scope 2
(ACPI_DMAR_SCOPE_TYPE_BRIDGE):

    [0A0h 0160   1]      Device Scope Entry Type : 02
    [0A1h 0161   1]                 Entry Length : 08
    [0A2h 0162   2]                     Reserved : 0000
    [0A4h 0164   1]               Enumeration ID : 00
    [0A5h 0165   1]               PCI Bus Number : 80

    [0A6h 0166   2]                     PCI Path : 03,00

but the device has a type 0 PCI header:

    80:03.0 Bridge [0680]: Intel Corporation Device [8086:2f0d] (rev 02)
    00: 86 80 0d 2f 00 00 10 00 02 00 80 06 10 00 80 00
    10: 0c 00 c0 00 c0 38 00 00 0c 00 00 00 80 38 00 00
    20: 00 00 00 c8 00 00 10 c8 00 00 00 00 86 80 00 00
    30: 00 00 00 00 60 00 00 00 00 00 00 00 ff 01 00 00

VT-d works perfectly on this system, so there's no reason to bail out
on initialization due to this apparent scope mismatch.  Use the class
0x0680 ("Other bridge device") as a heuristic for allowing DMAR
initialization for non-bridge PCI devices listed with scope bridge.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-15 15:24:29 +02:00
Alex Williamson a0fe14d7dc iommu/vt-d: Improve fault handler error messages
Remove new line in error logs, avoid duplicate and explicit pr_fmt.

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Fixes: 0ac2491f57 ('x86, dmar: move page fault handling code to dmar.c')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-04-05 16:22:42 +02:00
Alex Williamson c43fce4eeb iommu/vt-d: Ratelimit fault handler
Fault rates can easily overwhelm the console and make the system
unresponsive.  Ratelimit to allow an opportunity for maintenance.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Fixes: 0ac2491f57 ('x86, dmar: move page fault handling code to dmar.c')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-04-05 16:21:49 +02:00
Joerg Roedel e6a8c9b337 iommu/vt-d: Use BUS_NOTIFY_REMOVED_DEVICE in hotplug path
In the PCI hotplug path of the Intel IOMMU driver, replace
the usage of the BUS_NOTIFY_DEL_DEVICE notifier, which is
executed before the driver is unbound from the device, with
BUS_NOTIFY_REMOVED_DEVICE, which runs after that.

This fixes a kernel BUG being triggered in the VT-d code
when the device driver tries to unmap DMA buffers and the
VT-d driver already destroyed all mappings.

Reported-by: Stefani Seibold <stefani@seibold.net>
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-02-29 23:55:16 +01:00
Linus Torvalds 87bbcfdecc SVM fixes for Linux 4.5
Minor register size and interrupt acknowledgement fixes which only showed
 up in testing on newer hardware, but mostly a fix to the MM refcount
 handling to prevent a recursive refcount issue when mmap() is used on
 the file descriptor associated with a bound PASID.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iEYEABECAAYFAlbC/gAACgkQdwG7hYl686OY8QCfUPH+IB0zou9/MH3JNMz1ujot
 I6wAoK0R4KiOFXvjNeNPy+XroZ9xKqv/
 =RM+0
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20160216' of git://git.infradead.org/intel-iommu

Pull IOMMU SVM fixes from David Woodhouse:
 "Minor register size and interrupt acknowledgement fixes which only
  showed up in testing on newer hardware, but mostly a fix to the MM
  refcount handling to prevent a recursive refcount issue when mmap() is
  used on the file descriptor associated with a bound PASID"

* tag 'for-linus-20160216' of git://git.infradead.org/intel-iommu:
  iommu/vt-d: Clear PPR bit to ensure we get more page request interrupts
  iommu/vt-d: Fix 64-bit accesses to 32-bit DMAR_GSTS_REG
  iommu/vt-d: Fix mm refcounting to hold mm_count not mm_users
2016-02-16 08:04:06 -08:00
CQ Tang fda3bec12d iommu/vt-d: Fix 64-bit accesses to 32-bit DMAR_GSTS_REG
This is a 32-bit register. Apparently harmless on real hardware, but
causing justified warnings in simulation.

Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@vger.kernel.org
2016-01-13 23:30:49 +00:00
Joerg Roedel bc8474549e iommu/vt-d: Fix up error handling in alloc_iommu
Only check for error when iommu->iommu_dev has been assigned
and only assign drhd->iommu when the function can't fail
anymore.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-01-07 13:44:41 +01:00