There is no need to keep a useful accessor for a public structure hidden
away in a private implementation. Move it out alongside the structure
definition so that other implementations may reuse it.
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When invalidating an IOVA range potentially spanning multiple pages,
such as when removing an entire intermediate-level table, we currently
only issue an invalidation for the first IOVA of that range. Since the
architecture specifies that address-based TLB maintenance operations
target a single entry, an SMMU could feasibly retain live entries for
subsequent pages within that unmapped range, which is not good.
Make sure we hit every possible entry by iterating over the whole range
at the granularity provided by the pagetable implementation.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: added missing semicolons...]
Signed-off-by: Will Deacon <will.deacon@arm.com>
IOMMU hardware with range-based TLB maintenance commands can work
happily with the iova and size arguments passed via the tlb_add_flush
callback, but for IOMMUs which require separate commands per entry in
the range, it is not straightforward to infer the necessary granularity
when it comes to issuing the actual commands.
Add an additional argument indicating the granularity for the benefit
of drivers needing to know, and update the ARM LPAE code appropriately
(for non-leaf invalidations we currently just assume the worst-case
page granularity rather than walking the table to check).
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In the case of corrupted page tables, or when an invalid size is given,
__arm_lpae_unmap() may recurse beyond the maximum number of levels.
Unfortunately the detection of this error condition only happens *after*
calculating a nonsense offset from something which might not be a valid
table pointer and dereferencing that to see if it is a valid PTE.
Make things a little more robust by checking the level is valid before
doing anything which depends on it being so.
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Whilst the architecture only defines a few of the possible CERROR values,
we should handle unknown values gracefully rather than go out of bounds
trying to print out an error description.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The basic flow for add a device:
arm_smmu_add_device
|->iommu_group_get_for_dev
|->iommu_group_get
return group; (1)
|->ops->device_group : Init/increase reference count to/by 1.
|->iommu_group_add_device : Increase reference count by 1.
return group (2)
|->return 0;
Since we are adding one device, the flow is (2) and the group reference
count will be increased by 2. So, we need to add iommu_group_put at the
end of arm_smmu_add_device to decrease the count by 1.
Also take the failure path into consideration when fail to add a device.
Signed-off-by: Peng Fan <van.freenix@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When we initialise a bypass STE, we memset the structure to zero and
set the Valid and Config fields to indicate that the stream should
bypass the SMMU. Unfortunately, this results in an SHCFG field of 0
which means that the shareability of any incoming transactions is
overridden with non-shareable, leading to potential coherence problems
down the line.
This patch fixes the issue by initialising bypass STEs to use the
incoming shareability attributes. When translation is in effect at
either stage 1 or stage 2, the shareability is determined by the
page tables.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The free_io_pgtable_ops() function tests whether its argument is NULL
and then returns immediately. Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The ARM SMMUv3 driver uses dma_{alloc,free}_coherent to manage its
queues and configuration data structures.
This patch converts the driver to the managed (dmam_*) API, so that
resources are freed automatically on device teardown. This greatly
simplifies the failure paths and allows us to remove a bunch of
handcrafted freeing code.
Signed-off-by: Will Deacon <will.deacon@arm.com>
PRIQ_0_OF has been removed from the SMMUv3 architecture, so remove its
corresponding (and unused) #define from the driver.
Signed-off-by: Will Deacon <will.deacon@arm.com>
commit db0fa0cb01 "scatterlist: use sg_phys()" did replacements of
the form:
phys_addr_t phys = page_to_phys(sg_page(s));
phys_addr_t phys = sg_phys(s) & PAGE_MASK;
However, this breaks platforms where sizeof(phys_addr_t) >
sizeof(unsigned long). Revert for 4.3 and 4.4 to make room for a
combined helper in 4.5.
Cc: <stable@vger.kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes: db0fa0cb01 ("scatterlist: use sg_phys()")
Suggested-by: Joerg Roedel <joro@8bytes.org>
Reported-by: Vitaly Lavrov <vel21ripn@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
As of commit 44d88c754e ("ARM: shmobile: Remove legacy SoC code
for R-Mobile A1"), the Renesas IPMMU/IPMMUI driver is no longer used.
In theory it could still be used on SH-Mobile AG5 and R-Mobile A1 SoCs,
but that requires adding DT support to the driver, which is not
planned.
Remove the driver, it can be resurrected from git history when needed.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
These new helpers simplify implementing multi-driver modules and
properly handle failure to register one driver by unregistering all
previously registered drivers.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This mmu_notifier_ops structure is never modified, so declare it as
const, like the other mmu_notifier_ops structures.
Done with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Get rid of the three error paths that look the same and move
error handling to a single place.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-By: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Instead of just checking for a write access, calculate the
flags that are passed to handle_mm_fault() more precisly and
use the pre-defined macros.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-By: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Not doing so is a bug and might trigger a BUG_ON in
handle_mm_fault(). So add the proper permission checks
before calling into mm code.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-By: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The handle_mm_fault function expects the caller to do the
access checks. Not doing so and calling the function with
wrong permissions is a bug (catched by a BUG_ON).
So fix this bug by adding proper access checking to the io
page-fault code in the AMD IOMMUv2 driver.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-By: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Fix these warnings:
CHECK drivers/iommu/s390-iommu.c
drivers/iommu/s390-iommu.c:52:21: warning: symbol 's390_domain_alloc' was not declared. Should it be static?
drivers/iommu/s390-iommu.c:76:6: warning: symbol 's390_domain_free' was not declared. Should it be static?
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We use lazy allocation for translation table entries but don't handle
allocation (and other) failures during translation table updates.
Handle these failures and undo translation table updates when it's
meaningful.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in interrupts. They are expected to be high priority and
have access one of two watermarks lower than "min" which can be referred
to as the "atomic reserve". __GFP_HIGH users get access to the first
lower watermark and can be called the "high priority reserve".
Over time, callers had a requirement to not block when fallback options
were available. Some have abused __GFP_WAIT leading to a situation where
an optimisitic allocation with a fallback option can access atomic
reserves.
This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
cannot sleep and have no alternative. High priority users continue to use
__GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and
are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify
callers that want to wake kswapd for background reclaim. __GFP_WAIT is
redefined as a caller that is willing to enter direct reclaim and wake
kswapd for background reclaim.
This patch then converts a number of sites
o __GFP_ATOMIC is used by callers that are high priority and have memory
pools for those requests. GFP_ATOMIC uses this flag.
o Callers that have a limited mempool to guarantee forward progress clear
__GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
into this category where kswapd will still be woken but atomic reserves
are not used as there is a one-entry mempool to guarantee progress.
o Callers that are checking if they are non-blocking should use the
helper gfpflags_allow_blocking() where possible. This is because
checking for __GFP_WAIT as was done historically now can trigger false
positives. Some exceptions like dm-crypt.c exist where the code intent
is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
flag manipulations.
o Callers that built their own GFP flags instead of starting with GFP_KERNEL
and friends now also need to specify __GFP_KSWAPD_RECLAIM.
The first key hazard to watch out for is callers that removed __GFP_WAIT
and was depending on access to atomic reserves for inconspicuous reasons.
In some cases it may be appropriate for them to use __GFP_HIGH.
The second key hazard is callers that assembled their own combination of
GFP flags instead of starting with something like GFP_KERNEL. They may
now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless
if it's missed in most cases as other activity will wake kswapd.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Kconfig: remove BE-only platforms from LE kernel build from Boqun Feng
- Refresh ps3_defconfig from Geoff Levand
- Emit GNU & SysV hashes for the vdso from Michael Ellerman
- Define an enum for the bolted SLB indexes from Anshuman Khandual
- Use a local to avoid multiple calls to get_slb_shadow() from Michael Ellerman
- Add gettimeofday() benchmark from Michael Neuling
- Avoid link stack corruption in __get_datapage() from Michael Neuling
- Add virt_to_pfn and use this instead of opencoding from Aneesh Kumar K.V
- Add ppc64le_defconfig from Michael Ellerman
- pseries: extract of_helpers module from Andy Shevchenko
- Correct string length in pseries_of_derive_parent() from Nathan Fontenot
- Free the MSI bitmap if it was slab allocated from Denis Kirjanov
- Shorten irq_chip name for the SIU from Christophe Leroy
- Wait 1s for secondaries to enter OPAL during kexec from Samuel Mendoza-Jonas
- Fix _ALIGN_* errors due to type difference. from Aneesh Kumar K.V
- powerpc/pseries/hvcserver: don't memset pi_buff if it is null from Colin Ian King
- Disable hugepd for 64K page size. from Aneesh Kumar K.V
- Differentiate between hugetlb and THP during page walk from Aneesh Kumar K.V
- Make PCI non-optional for pseries from Michael Ellerman
- Individual System V IPC system calls from Sam bobroff
- Add selftest of unmuxed IPC calls from Michael Ellerman
- discard .exit.data at runtime from Stephen Rothwell
- Delete old orphaned PrPMC 280/2800 DTS and boot file. from Paul Gortmaker
- Use of_get_next_parent to simplify code from Christophe Jaillet
- Paginate some xmon output from Sam bobroff
- Add some more elements to the xmon PACA dump from Michael Ellerman
- Allow the tm-syscall selftest to build with old headers from Michael Ellerman
- Run EBB selftests only on POWER8 from Denis Kirjanov
- Drop CONFIG_TUNE_CELL in favour of CONFIG_CELL_CPU from Michael Ellerman
- Avoid reference to potentially freed memory in prom.c from Christophe Jaillet
- Quieten boot wrapper output with run_cmd from Geoff Levand
- EEH fixes and cleanups from Gavin Shan
- Fix recursive fenced PHB on Broadcom shiner adapter from Gavin Shan
- Use of_get_next_parent() in of_get_ibm_chip_id() from Michael Ellerman
- Fix section mismatch warning in msi_bitmap_alloc() from Denis Kirjanov
- Fix ps3-lpm white space from Rudhresh Kumar J
- Fix ps3-vuart null dereference from Colin King
- nvram: Add missing kfree in error path from Christophe Jaillet
- nvram: Fix function name in some errors messages. from Christophe Jaillet
- drivers/macintosh: adb: fix misleading Kconfig help text from Aaro Koskinen
- agp/uninorth: fix a memleak in create_gatt_table from Denis Kirjanov
- cxl: Free virtual PHB when removing from Andrew Donnellan
- scripts/kconfig/Makefile: Allow KBUILD_DEFCONFIG to be a target from Michael Ellerman
- scripts/kconfig/Makefile: Fix KBUILD_DEFCONFIG check when building with O= from Michael Ellerman
- Freescale updates from Scott: Highlights include 64-bit book3e kexec/kdump
support, a rework of the qoriq clock driver, device tree changes including
qoriq fman nodes, support for a new 85xx board, and some fixes.
- MPC5xxx updates from Anatolij: Highlights include a driver for MPC512x
LocalPlus Bus FIFO with its device tree binding documentation, mpc512x
device tree updates and some minor fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWPEZgAAoJEFHr6jzI4aWANjYQAKX2Q/95hqKfCuF5FBcUmtMC
Pu/Nff027MVzxZ2ApDcvvLGps5Nz2bn3nIhc9zjkXc5E8DuL6X3Yl8ce7qyNcc3g
cJJ8RvtUo6J1OMWetXFehtPYniAAwKMhZYKnj0+WnLr2SyH/Vhl3ehDkFbGyPtuH
r+2E7krFjfVgU+bzciIFnOaDekFuFN/pXWMb6e6zQyBJe9N8ZIp96uouGCebKVd0
VDLItzdaKErT8JFfbymMPvZm3V0rMVx4WWu3kAbQX8LrD5a18NF1zrjAOHRXc61n
kkk8/DPuNOon1PbXXyiS5BcFyZRe+KE3VBnoW5sOMqMIRg5WdO1oU3e2pEfXMO8+
leXYwFLXiKzUZuOgQG2QiUhrzD2yC1o6/TJWATv0dSl9AwrecgPX+Vj6X357slAf
A9E3eMy5tgnpndBWZmvZS3W7YDKH+NkeZ+Q40+NErAlqr++ErrTcKVndk5vWlYTT
7mMZeTXagX66al/k5ATKqwB7iUSpnYHSAa9fcUYPSM2FnXsDxPyeJGkBbcoOmkGj
QrpgNYOvJaUJd076goZCV39v0c1xpfV9/9kyVch8HUadf6JcjpVZwYnbGw2qlJjh
ZanuBG2VOeSwaKQqXiRBSBetnpAg8CVpFjDmX9wOBfSek2wxEJqDX/vQExdbIDQQ
pUs7vnUxLzhmW/x+ygOI
=YwcM
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Kconfig: remove BE-only platforms from LE kernel build from Boqun
Feng
- Refresh ps3_defconfig from Geoff Levand
- Emit GNU & SysV hashes for the vdso from Michael Ellerman
- Define an enum for the bolted SLB indexes from Anshuman Khandual
- Use a local to avoid multiple calls to get_slb_shadow() from Michael
Ellerman
- Add gettimeofday() benchmark from Michael Neuling
- Avoid link stack corruption in __get_datapage() from Michael Neuling
- Add virt_to_pfn and use this instead of opencoding from Aneesh Kumar
K.V
- Add ppc64le_defconfig from Michael Ellerman
- pseries: extract of_helpers module from Andy Shevchenko
- Correct string length in pseries_of_derive_parent() from Nathan
Fontenot
- Free the MSI bitmap if it was slab allocated from Denis Kirjanov
- Shorten irq_chip name for the SIU from Christophe Leroy
- Wait 1s for secondaries to enter OPAL during kexec from Samuel
Mendoza-Jonas
- Fix _ALIGN_* errors due to type difference, from Aneesh Kumar K.V
- powerpc/pseries/hvcserver: don't memset pi_buff if it is null from
Colin Ian King
- Disable hugepd for 64K page size, from Aneesh Kumar K.V
- Differentiate between hugetlb and THP during page walk from Aneesh
Kumar K.V
- Make PCI non-optional for pseries from Michael Ellerman
- Individual System V IPC system calls from Sam bobroff
- Add selftest of unmuxed IPC calls from Michael Ellerman
- discard .exit.data at runtime from Stephen Rothwell
- Delete old orphaned PrPMC 280/2800 DTS and boot file, from Paul
Gortmaker
- Use of_get_next_parent to simplify code from Christophe Jaillet
- Paginate some xmon output from Sam bobroff
- Add some more elements to the xmon PACA dump from Michael Ellerman
- Allow the tm-syscall selftest to build with old headers from Michael
Ellerman
- Run EBB selftests only on POWER8 from Denis Kirjanov
- Drop CONFIG_TUNE_CELL in favour of CONFIG_CELL_CPU from Michael
Ellerman
- Avoid reference to potentially freed memory in prom.c from Christophe
Jaillet
- Quieten boot wrapper output with run_cmd from Geoff Levand
- EEH fixes and cleanups from Gavin Shan
- Fix recursive fenced PHB on Broadcom shiner adapter from Gavin Shan
- Use of_get_next_parent() in of_get_ibm_chip_id() from Michael
Ellerman
- Fix section mismatch warning in msi_bitmap_alloc() from Denis
Kirjanov
- Fix ps3-lpm white space from Rudhresh Kumar J
- Fix ps3-vuart null dereference from Colin King
- nvram: Add missing kfree in error path from Christophe Jaillet
- nvram: Fix function name in some errors messages, from Christophe
Jaillet
- drivers/macintosh: adb: fix misleading Kconfig help text from Aaro
Koskinen
- agp/uninorth: fix a memleak in create_gatt_table from Denis Kirjanov
- cxl: Free virtual PHB when removing from Andrew Donnellan
- scripts/kconfig/Makefile: Allow KBUILD_DEFCONFIG to be a target from
Michael Ellerman
- scripts/kconfig/Makefile: Fix KBUILD_DEFCONFIG check when building
with O= from Michael Ellerman
- Freescale updates from Scott: Highlights include 64-bit book3e
kexec/kdump support, a rework of the qoriq clock driver, device tree
changes including qoriq fman nodes, support for a new 85xx board, and
some fixes.
- MPC5xxx updates from Anatolij: Highlights include a driver for
MPC512x LocalPlus Bus FIFO with its device tree binding
documentation, mpc512x device tree updates and some minor fixes.
* tag 'powerpc-4.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (106 commits)
powerpc/msi: Fix section mismatch warning in msi_bitmap_alloc()
powerpc/prom: Use of_get_next_parent() in of_get_ibm_chip_id()
powerpc/pseries: Correct string length in pseries_of_derive_parent()
powerpc/e6500: hw tablewalk: make sure we invalidate and write to the same tlb entry
powerpc/mpc85xx: Add FSL QorIQ DPAA FMan support to the SoC device tree(s)
powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA FMan
powerpc/fsl: Add #clock-cells and clockgen label to clockgen nodes
powerpc: handle error case in cpm_muram_alloc()
powerpc: mpic: use IRQCHIP_SKIP_SET_WAKE instead of redundant mpic_irq_set_wake
powerpc/book3e-64: Enable kexec
powerpc/book3e-64/kexec: Set "r4 = 0" when entering spinloop
powerpc/booke: Only use VIRT_PHYS_OFFSET on booke32
powerpc/book3e-64/kexec: Enable SMP release
powerpc/book3e-64/kexec: create an identity TLB mapping
powerpc/book3e-64: Don't limit paca to 256 MiB
powerpc/book3e/kdump: Enable crash_kexec_wait_realmode
powerpc/book3e: support CONFIG_RELOCATABLE
powerpc/booke64: Fix args to copy_and_flush
powerpc/book3e-64: rename interrupt_end_book3e with __end_interrupts
powerpc/e6500: kexec: Handle hardware threads
...
handling.
PPC: Mostly bug fixes.
ARM: No big features, but many small fixes and prerequisites including:
- a number of fixes for the arch-timer
- introducing proper level-triggered semantics for the arch-timers
- a series of patches to synchronously halt a guest (prerequisite for
IRQ forwarding)
- some tracepoint improvements
- a tweak for the EL2 panic handlers
- some more VGIC cleanups getting rid of redundant state
x86: quite a few changes:
- support for VT-d posted interrupts (i.e. PCI devices can inject
interrupts directly into vCPUs). This introduces a new component (in
virt/lib/) that connects VFIO and KVM together. The same infrastructure
will be used for ARM interrupt forwarding as well.
- more Hyper-V features, though the main one Hyper-V synthetic interrupt
controller will have to wait for 4.5. These will let KVM expose Hyper-V
devices.
- nested virtualization now supports VPID (same as PCID but for vCPUs)
which makes it quite a bit faster
- for future hardware that supports NVDIMM, there is support for clflushopt,
clwb, pcommit
- support for "split irqchip", i.e. LAPIC in kernel + IOAPIC/PIC/PIT in
userspace, which reduces the attack surface of the hypervisor
- obligatory smattering of SMM fixes
- on the guest side, stable scheduler clock support was rewritten to not
require help from the hypervisor.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJWO2IQAAoJEL/70l94x66D/K0H/3AovAgYmJQToZlimsktMk6a
f2xhdIqfU5lIQQh5uNBCfL3o9o8H9Py1ym7aEw3fmztPHHJYc91oTatt2UEKhmEw
VtZHp/dFHt3hwaIdXmjRPEXiYctraKCyrhaUYdWmUYkoKi7lW5OL5h+S7frG2U6u
p/hFKnHRZfXHr6NSgIqvYkKqtnc+C0FWY696IZMzgCksOO8jB1xrxoSN3tANW3oJ
PDV+4og0fN/Fr1capJUFEc/fejREHneANvlKrLaa8ht0qJQutoczNADUiSFLcMPG
iHljXeDsv5eyjMtUuIL8+MPzcrIt/y4rY41ZPiKggxULrXc6H+JJL/e/zThZpXc=
=iv2z
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"First batch of KVM changes for 4.4.
s390:
A bunch of fixes and optimizations for interrupt and time handling.
PPC:
Mostly bug fixes.
ARM:
No big features, but many small fixes and prerequisites including:
- a number of fixes for the arch-timer
- introducing proper level-triggered semantics for the arch-timers
- a series of patches to synchronously halt a guest (prerequisite
for IRQ forwarding)
- some tracepoint improvements
- a tweak for the EL2 panic handlers
- some more VGIC cleanups getting rid of redundant state
x86:
Quite a few changes:
- support for VT-d posted interrupts (i.e. PCI devices can inject
interrupts directly into vCPUs). This introduces a new
component (in virt/lib/) that connects VFIO and KVM together.
The same infrastructure will be used for ARM interrupt
forwarding as well.
- more Hyper-V features, though the main one Hyper-V synthetic
interrupt controller will have to wait for 4.5. These will let
KVM expose Hyper-V devices.
- nested virtualization now supports VPID (same as PCID but for
vCPUs) which makes it quite a bit faster
- for future hardware that supports NVDIMM, there is support for
clflushopt, clwb, pcommit
- support for "split irqchip", i.e. LAPIC in kernel +
IOAPIC/PIC/PIT in userspace, which reduces the attack surface of
the hypervisor
- obligatory smattering of SMM fixes
- on the guest side, stable scheduler clock support was rewritten
to not require help from the hypervisor"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (123 commits)
KVM: VMX: Fix commit which broke PML
KVM: x86: obey KVM_X86_QUIRK_CD_NW_CLEARED in kvm_set_cr0()
KVM: x86: allow RSM from 64-bit mode
KVM: VMX: fix SMEP and SMAP without EPT
KVM: x86: move kvm_set_irq_inatomic to legacy device assignment
KVM: device assignment: remove pointless #ifdefs
KVM: x86: merge kvm_arch_set_irq with kvm_set_msi_inatomic
KVM: x86: zero apic_arb_prio on reset
drivers/hv: share Hyper-V SynIC constants with userspace
KVM: x86: handle SMBASE as physical address in RSM
KVM: x86: add read_phys to x86_emulate_ops
KVM: x86: removing unused variable
KVM: don't pointlessly leave KVM_COMPAT=y in non-KVM configs
KVM: arm/arm64: Merge vgic_set_lr() and vgic_sync_lr_elrsr()
KVM: arm/arm64: Clean up vgic_retire_lr() and surroundings
KVM: arm/arm64: Optimize away redundant LR tracking
KVM: s390: use simple switch statement as multiplexer
KVM: s390: drop useless newline in debugging data
KVM: s390: SCA must not cross page boundaries
KVM: arm: Do not indent the arguments of DECLARE_BITMAP
...
This time including:
* A new IOMMU driver for s390 pci devices
* Common dma-ops support based on iommu-api for ARM64. The plan is to
use this as a basis for ARM32 and hopefully other architectures as
well in the future.
* MSI support for ARM-SMMUv3
* Cleanups and dead code removal in the AMD IOMMU driver
* Better RMRR handling for the Intel VT-d driver
* Various other cleanups and small fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJWOz7hAAoJECvwRC2XARrjbvYQALwtITTA5iTm0y/ApwNMxI7n
pZpjZVPoBPNsGBc4t/MT8pVhUSdmpBOljbV4Y4CayL1mSSB6Bl2gooZjd66m7Z81
qMJYEVWhFQqVsIKkCSNOgaO7W5y+xt3rTgqN6vCu86/CCDfKrTPP/+CRl1T/z9bo
1J8ioM3KnZG9KzG8JuXYFg5wwbKToaBh6swSmj+O4U9hru7zV/ILP7ikcc9pyMji
12WbzCqchRORsJZD65xMRYAqRaPNN/3IlDejs00TOFhY3qpWgEgFUucyeRJBJ/+q
K4U8T5vZsnr1a04l7/BeYbLmP7y/9Qv0N0xMGtTyoy/w/BieGqRWu4hHhqf/44NO
EhCSXcEThMNCGTjP2VWC4dnQ/s7Y8OmSW9nCreUcFVxHoE5LfDoh8RngA2fpeNuS
ixb3OwP+YXHN9Ck+1BQqQCeBznsPTLuDxlhRjCJsWntIfMSkXebOkz83YxyZ9b0Q
gFvptfuknU7cotUwWa3dg8RiUB8kNlKJyEEByaVpWEbEOabnONKEMkstvuBx6Ots
kA63wbe7QcPgbUYuq7g0nijDw6E2aEtf0nx2Xx4ZDL932qjg/xUkiBpmbDXHw4Gu
nimNXVQtbCzF74SyTvxEtupiijOTm5eHtoKtg0mYnqPZ+V9eOwEvW8IHaFFf8XHD
SecikoTtH1Q4RVtqOcAQ
=jLlB
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
"This time including:
- A new IOMMU driver for s390 pci devices
- Common dma-ops support based on iommu-api for ARM64. The plan is
to use this as a basis for ARM32 and hopefully other architectures
as well in the future.
- MSI support for ARM-SMMUv3
- Cleanups and dead code removal in the AMD IOMMU driver
- Better RMRR handling for the Intel VT-d driver
- Various other cleanups and small fixes"
* tag 'iommu-updates-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (41 commits)
iommu/vt-d: Fix return value check of parse_ioapics_under_ir()
iommu/vt-d: Propagate error-value from ir_parse_ioapic_hpet_scope()
iommu/vt-d: Adjust the return value of the parse_ioapics_under_ir
iommu: Move default domain allocation to iommu_group_get_for_dev()
iommu: Remove is_pci_dev() fall-back from iommu_group_get_for_dev
iommu/arm-smmu: Switch to device_group call-back
iommu/fsl: Convert to device_group call-back
iommu: Add device_group call-back to x86 iommu drivers
iommu: Add generic_device_group() function
iommu: Export and rename iommu_group_get_for_pci_dev()
iommu: Revive device_group iommu-ops call-back
iommu/amd: Remove find_last_devid_on_pci()
iommu/amd: Remove first/last_device handling
iommu/amd: Initialize amd_iommu_last_bdf for DEV_ALL
iommu/amd: Cleanup buffer allocation
iommu/amd: Remove cmd_buf_size and evt_buf_size from struct amd_iommu
iommu/amd: Align DTE flag definitions
iommu/amd: Remove old alias handling code
iommu/amd: Set alias DTE in do_attach/do_detach
iommu/amd: WARN when __[attach|detach]_device are called with irqs enabled
...
Pull intel iommu updates from David Woodhouse:
"This adds "Shared Virtual Memory" (aka PASID support) for the Intel
IOMMU. This allows devices to do DMA using process address space,
translated through the normal CPU page tables for the relevant mm.
With corresponding support added to the i915 driver, this has been
tested with the graphics device on Skylake. We don't have the
required TLP support in our PCIe root ports for supporting discrete
devices yet, so it's only integrated devices that can do it so far"
* git://git.infradead.org/intel-iommu: (23 commits)
iommu/vt-d: Fix rwxp flags in SVM device fault callback
iommu/vt-d: Expose struct svm_dev_ops without CONFIG_INTEL_IOMMU_SVM
iommu/vt-d: Clean up pasid_enabled() and ecs_enabled() dependencies
iommu/vt-d: Handle Caching Mode implementations of SVM
iommu/vt-d: Fix SVM IOTLB flush handling
iommu/vt-d: Use dev_err(..) in intel_svm_device_to_iommu(..)
iommu/vt-d: fix a loop in prq_event_thread()
iommu/vt-d: Fix IOTLB flushing for global pages
iommu/vt-d: Fix address shifting in page request handler
iommu/vt-d: shift wrapping bug in prq_event_thread()
iommu/vt-d: Fix NULL pointer dereference in page request error case
iommu/vt-d: Implement SVM_FLAG_SUPERVISOR_MODE for kernel access
iommu/vt-d: Implement SVM_FLAG_PRIVATE_PASID to allocate unique PASIDs
iommu/vt-d: Add callback to device driver on page faults
iommu/vt-d: Implement page request handling
iommu/vt-d: Generalise DMAR MSI setup to allow for page request events
iommu/vt-d: Implement deferred invalidate for SVM
iommu/vt-d: Add basic SVM PASID support
iommu/vt-d: Always enable PASID/PRI PCI capabilities before ATS
iommu/vt-d: Add initial support for PASID tables
...
Here's the "big" driver core updates for 4.4-rc1. Primarily a bunch of
debugfs updates, with a smattering of minor driver core fixes and
updates as well.
All have been in linux-next for a long time.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlY6ePQACgkQMUfUDdst+ymNTgCgpP0CZw57GpwF/Hp2L/lMkVeo
Kx8AoKhEi4iqD5fdCQS9qTfomB+2/M6g
=g7ZO
-----END PGP SIGNATURE-----
Merge tag 'driver-core-4.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here's the "big" driver core updates for 4.4-rc1. Primarily a bunch
of debugfs updates, with a smattering of minor driver core fixes and
updates as well.
All have been in linux-next for a long time"
* tag 'driver-core-4.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
debugfs: Add debugfs_create_ulong()
of: to support binding numa node to specified device in devicetree
debugfs: Add read-only/write-only bool file ops
debugfs: Add read-only/write-only size_t file ops
debugfs: Add read-only/write-only x64 file ops
debugfs: Consolidate file mode checks in debugfs_create_*()
Revert "mm: Check if section present during memory block (un)registering"
driver-core: platform: Provide helpers for multi-driver modules
mm: Check if section present during memory block (un)registering
devres: fix a for loop bounds check
CMA: fix CONFIG_CMA_SIZE_MBYTES overflow in 64bit
base/platform: assert that dev_pm_domain callbacks are called unconditionally
sysfs: correctly handle short reads on PREALLOC attrs.
base: soc: siplify ida usage
kobject: move EXPORT_SYMBOL() macros next to corresponding definitions
kobject: explain what kobject's sd field is
debugfs: document that debugfs_remove*() accepts NULL and error values
debugfs: Pass bool pointer to debugfs_create_bool()
ACPI / EC: Fix broken 64bit big-endian users of 'global_lock'
This is the downside of using bitfields in the struct definition, rather
than doing all the explicit masking and shifting.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Two late fixes for the AMD IOMMU driver:
* One adds an additional check to the io page-fault handler to
avoid a BUG_ON being hit in handle_mm_fault()
* Second patch fixes a problem with devices writing to the
system management area and were blocked by the IOMMU because
the driver wrongly cleared out the DTE flags allowing that
access.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJWLeEzAAoJECvwRC2XARrjD/gQAIAihdsuv33iQQAfwvaztNOu
l5WqQ6gr54xQXKebkY9MgiX5qXIqyNPHku08WGp63kWkfKSkTgvqQw/WNoRcpE2+
GrMKl4EvJyvOp9S3tbtx3QNeb19wGEHyOFN6UuhNBhcXzxsqMN4c3D77upZ2N81k
hYzYYjWNL+5NwsUtK8oZSZUwhmGr4Iuim9mJDMMGhZYw88/dQICIQQtPWc8/ritJ
v/sBJA3KdyRvStxuba64NOWnByYXYnzyrJvBtVPMfPfFjfcyC0D0dwfXe3jvyjh3
nDSRoXqGsd35MBwDfVIf3HUKP4Wxwd+5pbSyrTfD5b4anEFL62ifdTb6/lpMFY/X
uo88xn9oTSMHO0TOPJw2XaBB8Y2OwW1FE7BVpa0CYFMDwQ/vaIGXAoBcxGL98a26
O+xd+pcMVELwOT0XFS5ue7eaZZdLCooj52s1ik8tMIB2qu6lFzd4JyWA7O3LbnMU
qT7YvZKbATjBvIaP0fHpZuZv6iyE2L9pdrvDIGeBb2TqE7r89JfRIYZcYfSvtrdA
TwlijU/w3eMdUoDVDCSlVT9UVfFyPZNwVS1qT3iU4OeVS8MPTnM/lNnWokruAoY0
hcbOOZ7EEtT6o/1GXyWDVaBKbAuWRNhbEEUMEBUlgPN7k9AW6dQIZDPftXwQ05tQ
XzUgi1ueNHcw1br6PItG
=4Bny
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v4.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
"Two late fixes for the AMD IOMMU driver:
- add an additional check to the io page-fault handler to avoid a
BUG_ON being hit in handle_mm_fault()
- fix a problem with devices writing to the system management area
and were blocked by the IOMMU because the driver wrongly cleared
out the DTE flags allowing that access"
* tag 'iommu-fixes-v4.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Don't clear DTE flags when modifying it
iommu/amd: Fix BUG when faulting a PROT_NONE VMA
When booted with intel_iommu=ecs_off we were still allocating the PASID
tables even though we couldn't actually use them. We really want to make
the pasid_enabled() macro depend on ecs_enabled().
Which is unfortunate, because currently they're the other way round to
cope with the Broadwell/Skylake problems with ECS.
Instead of having ecs_enabled() depend on pasid_enabled(), which was never
something that made me happy anyway, make it depend in the normal case
on the "broken PASID" bit 28 *not* being set.
Then pasid_enabled() can depend on ecs_enabled() as it should. And we also
don't need to mess with it if we ever see an implementation that has some
features requiring ECS (like PRI) but which *doesn't* have PASID support.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Not entirely clear why, but it seems we need to reserve PASID zero and
flush it when we make a PASID entry present.
Quite we we couldn't use the true PASID value, isn't clear.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Propagate the error-value from the function ir_parse_ioapic_hpet_scope()
in parse_ioapics_under_ir() and cleanup its calling loop.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Adjust the return value of parse_ioapics_under_ir as
negative value representing failure and "0" representing
succcess. Just make it consistent with other function
implementations, and we can judge if calling is successfull
by if (!parse_ioapics_under_ir()) style.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Now that the iommu core support for iommu groups is not
pci-centric anymore, we can move default domain allocation
to the bus independent iommu_group_get_for_dev() function.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
All callers of iommu_group_get_for_dev() provide a
device_group call-back now, so this fall-back is no longer
needed.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This converts the ARM SMMU and the SMMUv3 driver to use the
new device_group call-back.
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Convert the fsl pamu driver to make use of the new
device_group call-back.
Cc: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Rename that function to pci_device_group() and export it, so
that IOMMU drivers can use it as their device_group
call-back.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
That call-back is currently unused, change it into a
call-back function for finding the right IOMMU group for a
device.
This is a first step to remove the hard-coded PCI dependency
in the iommu-group code.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Pull intel-iommu bugfix from David Woodhouse:
"This contains a single fix, for when the IOMMU API is used to overlay
an existing mapping comprised of 4KiB pages, with a mapping that can
use superpages.
For the *first* superpage in the new mapping, we were correctly¹
freeing the old bottom-level page table page and clearing the link to
it, before installing the superpage. For subsequent superpages,
however, we weren't. This causes a memory leak, and a warning about
setting a PTE which is already set.
¹ Well, not *entirely* correctly. We just free the page table pages
right there and then, which is wrong. In fact they should only be
freed *after* the IOTLB is flushed so we know the hardware will no
longer be looking at them.... and in fact I note that the IOTLB
flush is completely missing from the intel_iommu_map() code path,
although it needs to be there if it's permitted to overwrite
existing mappings.
Fixing those is somewhat more intrusive though, and will probably
need to wait for 4.4 at this point"
* tag 'for-linus-20151021' of git://git.infradead.org/intel-iommu:
iommu/vt-d: fix range computation when making room for large pages
The code is buggy and the values read from PCI are not
reliable anyway, so it is the best to just remove this code.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Clean up the functions to allocate the command, event and
ppr-log buffers. Remove redundant code and change the return
value to int.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The driver always uses a constant size for these buffers
anyway, so there is no need to waste memory to store the
sizes.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This mostly removes the code to create dev_data structures
for alias device ids. They are not necessary anymore, as
they were only created for device ids which have no struct
pci_dev associated with it. But these device ids are
handled in a simpler way now, so there is no need for this
code anymore.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
With this we don't have to create dev_data entries for
non-existent devices (which only exist as request-ids).
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The alias list is handled aleady by iommu core code. No need
anymore to handle it in this part of the AMD IOMMU code
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The condition in the BUG_ON is an indicator of a BUG, but no
reason to kill the code path. Turn it into a WARN_ON and
bail out if it is hit.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
During device assignment/deassignment the flags in the DTE
get lost, which might cause spurious faults, for example
when the device tries to access the system management range.
Fix this by not clearing the flags with the rest of the DTE.
Reported-by: G. Richard Bellamy <rbellamy@pteradigm.com>
Tested-by: G. Richard Bellamy <rbellamy@pteradigm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Change the 'pages' parameter to 'unsigned long' to avoid overflow.
Fix the device-IOTLB flush parameter calculation — the size of the IOTLB
flush is indicated by the position of the least significant zero bit in
the address field. For example, a value of 0x12345f000 will flush from
0x123440000 to 0x12347ffff (256KiB).
Finally, the cap_pgsel_inv() is not relevant to SVM; the spec says that
*all* implementations must support page-selective invaliation for
"first-level" translations. So don't check for it.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This will give a little bit of assistance to those developing drivers
using SVM. It might cause a slight annoyance to end-users whose kernel
disables the IOMMU when drivers are trying to use it. But the fix there
is to fix the kernel to enable the IOMMU.
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
There is an extra semi-colon on this if statement so we always break on
the first iteration.
Fixes: 0204a49609 ('iommu/vt-d: Add callback to device driver on page faults')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This really should be VTD_PAGE_SHIFT, not PAGE_SHIFT. Not that we ever
really anticipate seeing this used on IA64, but we should get it right
anyway.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The "req->addr" variable is a bit field declared as "u64 addr:52;".
The "address" variable is a u64. We need to cast "req->addr" to a u64
before the shift or the result is truncated to 52 bits.
Fixes: a222a7f0bb ('iommu/vt-d: Implement page request handling')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Dan Carpenter pointed out an error path which could lead to us
dereferencing the 'svm' pointer after we know it to be NULL because the
PASID lookup failed. Fix that, and make it less likely to happen again.
Fixes: a222a7f0bb ('iommu/vt-d: Implement page request handling')
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Despite being a platform device, the SMMUv3 is capable of signaling
interrupts using MSIs. Hook it into the platform MSI framework and
enjoy faults being reported in a new and exciting way.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[will: tidied up the binding example and reworked most of the code]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Since commit 1463fe44fd ("iommu/arm-smmu: Don't use VMIDs for stage-1
translations"), we don't need the GR0 base address when initialising a
context bank, so remove the useless local variable and its init code.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The bitmap allocator returns an int, which is one of the standard
negative values on failure. Rather than assigning this straight to a
u16 (like we do for the ASID and VMID callers), which means that we
won't detect failure correctly, use an int for the purposes of error
checking.
Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This is only usable for the static 1:1 mapping of physical memory.
Any access to vmalloc or module regions will require some way of doing
an IOTLB flush. It's theoretically possible to hook into the
tlb_flush_kernel_range() function, but that seems like overkill — most
of the addresses accessed through a kernel PASID *will* be in the 1:1
mapping.
If we really need to allow access to more interesting kernel regions,
then the answer will probably be an explicit IOTLB flush call after use,
akin to the DMA API's unmap function.
In fact, it might be worth introducing that sooner rather than later, and
making it just BUG() if the address isn't in the static 1:1 mapping.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Taking inspiration from the existing arch/arm code, break out some
generic functions to interface the DMA-API to the IOMMU-API. This will
do the bulk of the heavy lifting for IOMMU-backed dma-mapping.
Since associating an IOVA allocator with an IOMMU domain is a fairly
common need, rather than introduce yet another private structure just to
do this for ourselves, extend the top-level struct iommu_domain with the
notion. A simple opaque cookie allows reuse by other IOMMU API users
with their various different incompatible allocator types.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
If IRTE is in posted format, the 'pda' field goes across the 64-bit
boundary, we need use cmpxchg16b to atomically update it. We only
expose posted-interrupt when X86_FEATURE_CX16 is supported and use
to update it atomically.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
handle_mm_fault indirectly triggers a BUG in do_numa_page
when given a VMA without read/write/execute access. Check
this condition in do_fault.
do_fault -> handle_mm_fault -> handle_pte_fault -> do_numa_page
mm/memory.c
3147 static int do_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
....
3159 /* A PROT_NONE fault should not end up here */
3160 BUG_ON(!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)));
Signed-off-by: Jay Cornwall <jay@jcornwall.me>
Cc: <stable@vger.kernel.org> # v4.1+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This provides basic PASID support for endpoint devices, tested with a
version of the i915 driver.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The behaviour if you enable PASID support after ATS is undefined. So we
have to enable it first, even if we don't know whether we'll need it.
This is safe enough; unless we set up a context that permits it, the device
can't actually *do* anything with it.
Also shift the feature detction to dmar_insert_one_dev_info() as it only
needs to happen once.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
As long as we use an identity mapping to work around the worst of the
hardware bugs which caused us to defeature it and change the definition
of the capability bit, we *can* use PASID support on the devices which
advertised it in bit 28 of the Extended Capability Register.
Allow people to do so with 'intel_iommu=pasid28' on the command line.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The VT-d specification says that "Software must enable ATS on endpoint
devices behind a Root Port only if the Root Port is reported as
supporting ATS transactions."
We walk up the tree to find a Root Port, but for integrated devices we
don't find one — we get to the host bridge. In that case we *should*
allow ATS. Currently we don't, which means that we are incorrectly
failing to use ATS for the integrated graphics. Fix that.
We should never break out of this loop "naturally" with bus==NULL,
since we'll always find bridge==NULL in that case (and now return 1).
So remove the check for (!bridge) after the loop, since it can never
happen. If it did, it would be worthy of a BUG_ON(!bridge). But since
it'll oops anyway in that case, that'll do just as well.
Cc: stable@vger.kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
In preparation for deprecating ioremap_cache() convert its usage in
intel-iommu to memremap. This also eliminates the mishandling of the
__iomem annotation in the implementation.
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The SMMU architecture defines two different behaviors when 64-bit
registers are written with 32-bit writes. The first behavior causes
zero extension into the upper 32-bits. The second behavior splits a
64-bit register into "normal" 32-bit register pairs.
On some buggy implementations, registers incorrectly zero extended
when they should instead behave as normal 32-bit register pairs.
Signed-off-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com>
[will: removed redundant macro parameters]
Signed-off-by: Will Deacon <will.deacon@arm.com>
'%pad' automatically prints with '0x', so remove the explicit '0x'
annotation.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Rather than keep a private list of struct arm_smmu_device and searching
this whenever we need to look up the correct SMMU instance, instead use
the drvdata field in the struct device to take care of the mapping for
us.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The DSP MMUs on DRA7xx SoC requires configuring an additional
MMU_CONFIG register present in the DSP_SYSTEM sub module. This
setting dictates whether the DSP Core's MDMA and EDMA traffic
is routed through the respective MMU or not. Add the support
to the OMAP iommu driver so that the traffic is not bypassed
when enabling the MMUs.
The MMU_CONFIG register has two different bits for enabling
each of these two MMUs present in the DSP processor sub-system
on DRA7xx. An id field is added to the OMAP iommu object to
identify and enable each IOMMU. The id information and the
DSP_SYSTEM.MMU_CONFIG register programming is achieved through
the processing of the optional "ti,syscon-mmuconfig" property.
A proper value is assigned to the id field only when this
property is present.
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In preparation for the installation of a large page, any small page
tables that may still exist in the target IOV address range are
removed. However, if a scatter/gather list entry is large enough to
fit more than one large page, the address space for any subsequent
large pages is not cleared of conflicting small page tables.
This can cause legitimate mapping requests to fail with errors of the
form below, potentially followed by a series of IOMMU faults:
ERROR: DMA PTE for vPFN 0xfde00 already set (to 7f83a4003 not 7e9e00083)
In this example, a 4MiB scatter/gather list entry resulted in the
successful installation of a large page @ vPFN 0xfdc00, followed by
a failed attempt to install another large page @ vPFN 0xfde00, due to
the presence of a pointer to a small page table @ 0x7f83a4000.
To address this problem, compute the number of large pages that fit
into a given scatter/gather list entry, and use it to derive the
last vPFN covered by the large page(s).
Cc: stable@vger.kernel.org
Signed-off-by: Christian Zander <christian@nervanasys.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
A few fixes piled up:
* Fix for a suspend/resume issue where PCI probing code overwrote
dev->irq for the MSI irq of the AMD IOMMU.
* Fix for a kernel crash when a 32 bit PCI device was assigned to a KVM
guest.
* Fix for a possible memory leak in the VT-d driver
* A couple of fixes for the ARM-SMMU driver
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJWHNdbAAoJECvwRC2XARrjB/YQAKouJaRMjBaehx6kbaZMhMJy
hXDsh8Xl6TtCe6kLD2uXrvjLZAdu32kjrtzhhcM21EO5Ms2Weq6A60/98LwnJ4Eg
AqftjfxQsIwf2G1PvHb+xepgcFxIAhW6a3nORzx6d2AGrNWmMtUhbLTSncYjmojf
Td4dscuRmRPenJUV1JhcJQBR62QonknIHV99QmevaCSAoUdyuMH+t5kQVEgPjx7C
GlMPNEZZmGl7J3NXSWRtDSkUxFZ1OU8MTKc1LmPPHHAOZk37wbePihQbLLySlHPH
v4G1R05e2hG7C66yu959fyOleL87lDToUXhwQNFJMqEc+e7IzBzZsB3ANEHjpLQH
UJC9COU+sf8mPafja4ge/KbyGDmgDg/OMQJDhU6+DSXUflwymeWJmXr7sLFQex6O
nZO/SVzkbKj+PKxV7UnGD0sTeAAk0X6vfhFCL0l/acPpQg0T6Fpky5D5fUMv5dWS
xxxvxfwBcDoI44fxWBhfPYvmLFT9f5da+bpbzeeGjVSNezOkPJ65AJcVk5An4kQu
PRzJGoq3XpZHOeg5+O7IKzeuJ+3qc7Tz4wAzMxcaNFpVBl2qp1RUkTbmS9/YV1b5
ZOcIFBMLuUROE1ExsU19c5Uo0j1Bvh9jtdy6lNFCagQYzihtA0Jk19ucllx1jIjD
sdv2hgDIauRToKF1d9xz
=v5G4
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v4.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"A few fixes piled up:
- Fix for a suspend/resume issue where PCI probing code overwrote
dev->irq for the MSI irq of the AMD IOMMU.
- Fix for a kernel crash when a 32 bit PCI device was assigned to a
KVM guest.
- Fix for a possible memory leak in the VT-d driver
- A couple of fixes for the ARM-SMMU driver"
* tag 'iommu-fixes-v4.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Fix NULL pointer deref on device detach
iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices
iommu/vt-d: Fix memory leak in dmar_insert_one_dev_info()
iommu/arm-smmu: Use correct address mask for CMD_TLBI_S2_IPA
iommu/arm-smmu: Ensure IAS is set correctly for AArch32-capable SMMUs
iommu/io-pgtable-arm: Don't use dma_to_phys()
When a device group is detached from its domain, the iommu
core code calls into the iommu driver to detach each device
individually.
Before this functionality went into the iommu core code, it
was implemented in the drivers, also in the AMD IOMMU
driver as the device alias handling code.
This code is still present, as there might be aliases that
don't exist as real PCI devices (and are therefore invisible
to the iommu core code).
Unfortunatly it might happen now, that a device is unbound
multiple times from its domain, first by the alias handling
code and then by the iommu core code (or vice verca).
This ends up in the do_detach function which dereferences
the dev_data->domain pointer. When the device is already
detached, this pointer is NULL and we get a kernel oops.
Removing the alias code completly is not an option, as that
would also remove the code which handles invisible aliases.
The code could be simplified, but this is too big of a
change outside the merge window.
For now, just check the dev_data->domain pointer in
do_detach and bail out if it is NULL.
Reported-by: Andreas Hartmann <andihartmann@freenet.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
AMD IOMMU driver makes use of IOMMU PCI devices, so prevent binding other
PCI drivers to IOMMU PCI devices.
This fixes a bug reported by Boris that system suspend/resume gets broken
on AMD platforms. For more information, please refer to:
https://lkml.org/lkml/2015/9/26/89
Fixes: 991de2e590 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()")
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This adds an IOMMU API implementation for s390 PCI devices.
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Currently the RMRR entries are created only at boot time.
This means they will vanish when the domain allocated at
boot time is destroyed.
This patch makes sure that also newly allocated domains will
get RMRR mappings.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Split the part of the function that fetches the domain out
and put the rest into into a domain_prepare_identity_map, so
that the code can also be used with when the domain is
already known.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Its a bit odd that debugfs_create_bool() takes 'u32 *' as an argument,
when all it needs is a boolean pointer.
It would be better to update this API to make it accept 'bool *'
instead, as that will make it more consistent and often more convenient.
Over that bool takes just a byte.
That required updates to all user sites as well, in the same commit
updating the API. regmap core was also using
debugfs_{read|write}_file_bool(), directly and variable types were
updated for that to be bool as well.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull IOVA fixes from David Woodhouse:
"The main fix here is the first one, fixing the over-allocation of
size-aligned requests. The other patches simply make the existing
IOVA code available to users other than the Intel VT-d driver, with no
functional change.
I concede the latter really *should* have been submitted during the
merge window, but since it's basically risk-free and people are
waiting to build on top of it and it's my fault I didn't get it in, I
(and they) would be grateful if you'd take it"
* git://git.infradead.org/intel-iommu:
iommu: Make the iova library a module
iommu: iova: Export symbols
iommu: iova: Move iova cache management to the iova library
iommu/iova: Avoid over-allocating when size-aligned
Enable VT-d Posted-Interrtups and add a command line
parameter for it.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We are returning NULL if we are not able to attach the iommu
to the domain but while returning we missed freeing info.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Fix amd_iommu_detect() to return positive value on success, like
intended, and not zero. This will not change anything in the end
as AMD IOMMU disable swiotlb and properly associate itself with
devices even if detect() doesn't return a positive value.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: iommu@lists.linux-foundation.org