Commit Graph

177 Commits

Author SHA1 Message Date
David Woodhouse a131bc1855 Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6
Pull fixes in from 2.6.31 so that people testing the iommu-2.6.git tree
no longer trip over bugs which were already fixed (sorry, Horms).
2009-08-08 11:26:15 +01:00
Sheng Yang c5b1525533 intel-iommu: Fix enabling snooping feature by mistake
Two defects work together result in KVM device passthrough randomly can't
work:
1. iommu_snooping is not initialized to zero when vm_iommu_init() called.
So it is possible to get a random value.
2. One line added by commit 2c2e2c38("IOMMU Identity Mapping Support")
change the code path, let it bypass domain_update_iommu_cap(), as well as
missing the increment of domain iommu reference count.

The latter is also likely to cause a leak of domains on repeated VMM 
assignment and deassignment.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-08-06 11:35:50 +01:00
Fenghua Yu 33041ec049 intel-iommu: Mask physical address to correct page size in intel_map_single()
The physical address passed to domain_pfn_mapping() should be rounded 
down to the start of the MM page, not the VT-d page.

This issue causes kernel panic on PAGE_SIZE>VTD_PAGE_SIZE platforms e.g. ia64
platforms.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-08-05 09:15:48 +01:00
Fenghua Yu f532959b77 intel-iommu: Correct sglist size calculation.
In domain_sg_mapping(), use aligned_nrpages() instead of hand-coded
rounding code for calculating the size of each sg elem. This means that
on IA64 we correctly round up to the MM page size, not just to the VT-d
page size.

Also remove the incorrect mm_to_dma_pfn() when intel_map_sg() calls
domain_sg_mapping() -- the 'size' variable is in VT-d pages already.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-08-05 08:59:47 +01:00
David Woodhouse 19943b0e30 intel-iommu: Unify hardware and software passthrough support
This makes the hardware passthrough mode work a lot more like the
software version, so that the behaviour of a kernel with 'iommu=pt'
is the same whether the hardware supports passthrough or not.

In particular:
 - We use a single si_domain for the pass-through devices.
 - 32-bit devices can be taken out of the pass-through domain so that
   they don't have to use swiotlb.
 - Devices will work again after being removed from a KVM guest.
 - A potential oops on OOM (in init_context_pass_through()) is fixed.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-08-04 16:19:23 +01:00
Dan Carpenter 86f4d0123b intel-iommu: double kfree()
g_iommus is freed after we "goto error;".

Found by smatch (http://repo.or.cz/w/smatch.git).

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-20 05:01:20 +01:00
David Woodhouse 0db9b7aebb intel-iommu: Kill pointless intel_unmap_single() function
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-15 08:17:26 +01:00
David Woodhouse acea0018a2 intel-iommu: Defer the iotlb flush and iova free for intel_unmap_sg() too.
I see no reason why we did this _only_ in intel_unmap_page().

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-15 08:17:23 +01:00
David Woodhouse 3d39cecc48 intel-iommu: Remove superfluous iova_alloc_lock from IOVA code
We only ever obtain this lock immediately before the iova_rbtree_lock,
and release it immediately after the iova_rbtree_lock. So ditch it and
just use iova_rbtree_lock.

[v2: Remove the lockdep bits this time too]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-15 08:17:02 +01:00
Sheng Yang 4b99d35270 intel-iommu: Fix intel_iommu_unmap_range() with size 0
After some API change, intel_iommu_unmap_range() introduced a assumption that
parameter size != 0, otherwise the dma_pte_clean_range() would have a
overflowed argument. But the user like KVM don't have this assumption before,
then some BUG() triggered.

Fix it by ignoring size = 0.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-08 09:35:10 -07:00
David Woodhouse 147202aa77 intel-iommu: Speed up map routines by using cached domain ASAP
We did before, in the end -- but it was at the bottom of a long stack of
functions. Add an inline wrapper get_valid_domain_for_dev() which will
use the cached one _first_ and only make the out-of-line call if it's
not already set.

This takes the average time taken for a 1-page intel_map_sg() from 5961
cycles to 4812 cycles on my Lenovo x200s test box -- a modest 20%.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-07 19:43:24 +01:00
David Woodhouse 3dfc813d94 intel-iommu: Don't use identity mapping for PCI devices behind bridges
Our current strategy for pass-through mode is to put all devices into
the 1:1 domain at startup (which is before we know what their dma_mask
will be), and only _later_ take them out of that domain, if it turns out
that they really can't address all of memory.

However, when there are a bunch of PCI devices behind a bridge, they all
end up with the same source-id on their DMA transactions, and hence in
the same IOMMU domain. This means that we _can't_ easily move them from
the 1:1 domain into their own domain at runtime, because there might be DMA
in-flight from their siblings.

So we have to adjust our pass-through strategy: For PCI devices not on
the root bus, and for the bridges which will take responsibility for
their transactions, we have to start up _out_ of the 1:1 domain, just in
case.

This fixes the BUG() we see when we have 32-bit-capable devices behind a
PCI-PCI bridge, and use the software identity mapping.

It does mean that we might end up using 'normal' mapping mode for some
devices which could actually live with the faster 1:1 mapping -- but
this is only for PCI devices behind bridges, which presumably aren't the
devices for which people are most concerned about performance.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04 19:19:10 +01:00
David Woodhouse 6941af2810 intel-iommu: Use iommu_should_identity_map() at startup time too.
At boot time, the dma_mask won't have been set on any devices, so we
assume that all devices will be 64-bit capable (and thus get a 1:1 map).

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04 19:19:08 +01:00
David Woodhouse 736768325e intel-iommu: No mapping for non-PCI devices
This should fix kernel.org bug #11821, where the dcdbas driver makes up
a platform device and then uses dma_alloc_coherent() on it, in an
attempt to get memory < 4GiB.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04 14:12:01 +01:00
David Woodhouse 62edf5dc4a intel-iommu: Restore DMAR_BROKEN_GFX_WA option for broken graphics drivers
We need to give people a little more time to fix the broken drivers.
Re-introduce this, but tied in properly with the 'iommu=pt' support this
time. Change the config option name and make it default to 'no' too.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04 10:59:46 +01:00
David Woodhouse 40e4aa3432 intel-iommu: Add iommu_should_identity_map() function
We do this twice, and it's about to get more complicated. This makes the
code slightly clearer about what it's doing, too.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04 10:55:41 +01:00
David Woodhouse 1b7bc0a161 intel-iommu: Fix reattaching of devices to identity mapping domain
When we reattach a device to the si_domain (because it's been removed
from a VM), we weren't calling domain_context_mapping() to actually tell
the hardware about that.

We should really put the call to domain_context_mapping() into
domain_add_dev_info() -- we never call the latter without also doing the
former, and we can keep the error paths simple that way. But that's a
cleanup which can wait for 2.6.32 now.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04 10:49:46 +01:00
David Woodhouse 1e4c64c46d intel-iommu: Don't set identity mapping for bypassed graphics devices
We should check iommu_dummy() _first_, because that means it's attached
to an iommu that we've just disabled completely. At the moment, we might
try to put the device into the identity mapping domain.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04 10:40:44 +01:00
David Woodhouse 5a5e02a614 intel-iommu: Fix dma vs. mm page confusion with aligned_nrpages()
The aligned_nrpages() function rounds up to the next VM page, but
returns its result as a number of DMA pages.

Purely theoretical except on IA64, which doesn't boot with VT-d right
now anyway.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04 09:35:52 +01:00
David Woodhouse 6a43e574c5 intel-iommu: Don't keep freeing page zero in dma_pte_free_pagetable()
Check dma_pte_present() and only free the page if there _is_ one.
Kind of surprising that there was no warning about this.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-02 12:02:38 +01:00
David Woodhouse 75e6bf9638 intel-iommu: Introduce first_pte_in_page() to simplify PTE-setting loops
On Wed, 2009-07-01 at 16:59 -0700, Linus Torvalds wrote:
> I also _really_ hate how you do
>
>         (unsigned long)pte >> VTD_PAGE_SHIFT ==
>         (unsigned long)first_pte >> VTD_PAGE_SHIFT

Kill this, in favour of just looking to see if the incremented pte
pointer has 'wrapped' onto the next page. Which means we have to check
it _after_ incrementing it, not before.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-02 11:27:13 +01:00
David Woodhouse 7766a3fb90 intel-iommu: Use cmpxchg64_local() for setting PTEs
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-01 20:27:03 +01:00
David Woodhouse 85b98276f2 intel-iommu: Warn about unmatched unmap requests
This would have found the bug in i386 pci_unmap_addr() a long time ago.
We shouldn't just silently return without doing anything.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-01 19:54:37 +01:00
David Woodhouse 206a73c102 intel-iommu: Kill superfluous mapping_lock
Since we're using cmpxchg64() anyway (because that's the only way to do
an atomic 64-bit store on i386), we might as well ditch the extra
locking and just use cmpxchg64() to ensure that we don't add the page
twice.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-01 19:43:37 +01:00
David Woodhouse c85994e477 intel-iommu: Ensure that PTE writes are 64-bit atomic, even on i386
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-01 19:21:24 +01:00
David Woodhouse f3a0a52fff intel-iommu: Performance improvement for dma_pte_free_pagetable()
As with other functions, batch the CPU data cache flushes and don't keep
recalculating PTE addresses.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-30 03:58:15 +01:00
David Woodhouse 3d7b0e4154 intel-iommu: Don't free too much in dma_pte_free_pagetable()
The loop condition was wrong -- we should free a PMD only if its
_entire_ range is within the range we're intending to clear. The
early-termination condition was right, but not the loop.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-30 03:57:38 +01:00
David Woodhouse 1bf20f0dc5 intel-iommu: dump mappings but don't die on pte already set
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-30 03:55:21 +01:00
David Woodhouse 9051aa0268 intel-iommu: Combine domain_pfn_mapping() and domain_sg_mapping()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-30 03:53:31 +01:00
David Woodhouse e1605495c7 intel-iommu: Introduce domain_sg_mapping() to speed up intel_map_sg()
Instead of calling domain_pfn_mapping() repeatedly with single or
small numbers of pages, just pass the sglist in. It can optimise the
number of cache flushes like domain_pfn_mapping() does, and gives a huge
speedup for large scatterlists.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-30 03:51:30 +01:00
David Woodhouse 875764de6f intel-iommu: Simplify __intel_alloc_iova()
There's no need for the separate iommu_alloc_iova() function, and
certainly not for it to be global. Remove the underscores while we're at
it.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:39:53 +01:00
David Woodhouse 6f6a00e40a intel-iommu: Performance improvement for domain_pfn_mapping()
As with dma_pte_clear_range(), don't keep flushing a single PTE at a
time. And also micro-optimise the setting of PTE values rather than
using the helper functions to do all the masking.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:39:45 +01:00
David Woodhouse 310a5ab93c intel-iommu: Performance improvement for dma_pte_clear_range()
It's a bit silly to repeatedly call domain_flush_cache() for each PTE
individually, as we clear it. Instead, batch them up and flush a whole
range at a time. We might as well refrain from recalculating the PTE
address from scratch each time round the loop too.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:39:17 +01:00
David Woodhouse c5395d5c4a intel-iommu: Clean up iommu_domain_identity_map()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:39:12 +01:00
David Woodhouse 1a4a45516d intel-iommu: Remove last use of PHYSICAL_PAGE_MASK, for reserving PCI BARs
This is fairly broken anyway -- it doesn't take hotplug into account.
We should probably be checking page_is_ram() instead.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:39:05 +01:00
David Woodhouse 03d6a2461a intel-iommu: Make iommu_flush_iotlb_psi() take pfn as argument
Most of its callers are having to shift for themselves anyway, so we might
as well do it in iommu_flush_iotlb_psi().

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:38:11 +01:00
David Woodhouse 88cb6a7424 intel-iommu: Change aligned_size() to aligned_nrpages()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:38:04 +01:00
David Woodhouse b536d24d21 intel-iommu: Clean up intel_map_sg(), remove domain_page_mapping()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:35:06 +01:00
David Woodhouse ad05122162 intel-iommu: Use domain_pfn_mapping() in intel_iommu_map_range()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:35:00 +01:00
David Woodhouse 0ab36de274 intel-iommu: Use domain_pfn_mapping() in __intel_map_single()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:34:24 +01:00
David Woodhouse 61df744314 intel-iommu: Introduce domain_pfn_mapping()
... and use it in the trivial cases; the other callers want individual
(and bisectable) attention, since I screwed them up the first time...

Make the BUG_ON() happen on too-large virtual address rather than
physical address, too. That's the one we care about.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:33:59 +01:00
David Woodhouse 1c5a46ed49 intel-iommu: Clean up address handling in domain_page_mapping()
No more masking and alignment; just use pfns.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:33:11 +01:00
David Woodhouse b026fd28ea intel-iommu: Change addr_to_dma_pte() to pfn_to_dma_pte()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:32:26 +01:00
David Woodhouse 163cc52ccd intel-iommu: Clean up intel_iommu_unmap_range()
Use unaligned address for domain->max_addr. That algorithm isn't ideal
anyway -- we should probably just look at the last iova in the tree.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:31:12 +01:00
David Woodhouse d794dc9b30 intel-iommu: Make dma_pte_free_pagetable() take pfns as argument
With some cleanup of intel_unmap_page(), intel_unmap_sg() and
vm_domain_exit() to no longer play with 64-bit addresses.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:30:45 +01:00
David Woodhouse 6660c63a79 intel-iommu: Make dma_pte_free_pagetable() use pfns
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:30:35 +01:00
David Woodhouse 595badf5d6 intel-iommu: Make dma_pte_clear_range() take pfns as argument
Noting that this is now an _inclusive_ range.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:28:10 +01:00
David Woodhouse 04b18e65dd intel-iommu: Make dma_pte_clear_range() use pfns
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:26:36 +01:00
David Woodhouse 66eae8469e intel-iommu: Don't just mask out too-big physical addresses; BUG() instead
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 12:38:42 +01:00
David Woodhouse a75f7cf94f intel-iommu: Make dma_pte_clear_one() take pfn not address
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 12:38:40 +01:00