Recent changes to how GFP_ATOMIC is defined seems to have broken the
condition to use mips_alloc_from_contiguous() in
mips_dma_alloc_coherent().
I couldn't bottom out the exact change but I think it's this commit
d0164adc89 ("mm, page_alloc: distinguish between being unable to
sleep, unwilling to sleep and avoiding waking kswapd").
GFP_ATOMIC has multiple bits set and the check for !(gfp & GFP_ATOMIC)
isn't enough.
The reason behind this condition is to check whether we can potentially
do a sleeping memory allocation. Use gfpflags_allow_blocking() instead
which should be more robust.
Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If there is a DMA zone (usually 24bit = 16MB I believe), but no DMA32
zone, as is the case for some 32-bit kernels, then massage_gfp_flags()
will cause DMA memory allocated for devices with a 32..63-bit
coherent_dma_mask to fall back to using __GFP_DMA, even though there may
only be 32-bits of physical address available anyway.
Correct that case to compare against a mask the size of phys_addr_t
instead of always using a 64-bit mask.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Fixes: a2e715a86c ("MIPS: DMA: Fix computation of DMA flags from device's coherent_dma_mask.")
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 2.6.36+
Patchwork: https://patchwork.linux-mips.org/patch/9610/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Most architectures do not support non-coherent allocations and either
define dma_{alloc,free}_noncoherent to their coherent versions or stub
them out.
Openrisc uses dma_{alloc,free}_attrs to implement them, and only Mips
implements them directly.
This patch moves the Openrisc version to common code, and handles the
DMA_ATTR_NON_CONSISTENT case in the mips dma_map_ops instance.
Note that actual non-coherent allocations require a dma_cache_sync
implementation, so if non-coherent allocations didn't work on
an architecture before this patch they still won't work after it.
[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since 2009 we have a nice asm-generic header implementing lots of DMA API
functions for architectures using struct dma_map_ops, but unfortunately
it's still missing a lot of APIs that all architectures still have to
duplicate.
This series consolidates the remaining functions, although we still need
arch opt outs for two of them as a few architectures have very
non-standard implementations.
This patch (of 5):
The coherent DMA allocator works the same over all architectures supporting
dma_map operations.
This patch consolidates them and converges the minor differences:
- the debug_dma helpers are now called from all architectures, including
those that were previously missing them
- dma_alloc_from_coherent and dma_release_from_coherent are now always
called from the generic alloc/free routines instead of the ops
dma-mapping-common.h always includes dma-coherent.h to get the defintions
for them, or the stubs if the architecture doesn't support this feature
- checks for ->alloc / ->free presence are removed. There is only one
magic instead of dma_map_ops without them (mic_dma_ops) and that one
is x86 only anyway.
Besides that only x86 needs special treatment to replace a default devices
if none is passed and tweak the gfp_flags. An optional arch hook is provided
for that.
[linux@roeck-us.net: fix build]
[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The generic implementation of dma_map_ops.mmap(), dma_common_mmap(),
is not correct for non-coherent devices. It expects to be passed a
virtual address previously returned by dma_alloc_coherent(), which for
a non-coherent device will return a KSEG1 address. It then attempts to
convert that virtual address to a physical address using virt_to_page()
which will yield an incorrect address.
Also, dma_common_mmap() does not handle the DMA_ATTR_WRITE_COMBINE
attribute, and therefore dma_mmap_writecombine() will not actually set
the appropriate pgprot_t flags for write combining.
This patch adds an implementation of dma_map_ops.mmap() that correctly
handles KSEG1 addresses, and enables write combining when requested.
Signed-off-by: Alex Smith <alex.smith@imgtec.com>
Cc: Sadegh Abbasi <Sadegh.Abbasi@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10808/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This replaces the plain loop over the sglist array with for_each_sg()
macro which consists of sg_next() function calls. Since MIPS doesn't
select ARCH_HAS_SG_CHAIN, it is not necessary to use for_each_sg() in
order to loop over each sg element. But this can help find problems
with drivers that do not properly initialize their sg tables when
CONFIG_DEBUG_SG is enabled.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: akpm@linux-foundation.org
Cc: linux-mips@linux-mips.org
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/9930/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Only one MIPS development board actually supports enabling/disabling DMA
coherency at runtime, so it's not a good idea to push the overhead of
checking that configuration setting onto every other supported target as
well.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John Crispin <blogic@openwrt.org>
Patchwork: http://patchwork.linux-mips.org/patch/5912/
The semantics stay the same - on Cavium Octeon the functions were dead
code (it overrides the MIPS DMA ops) - on other platforms they contained
no code at all.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5720/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The check cpu_needs_post_dma_flush() in mips_dma_sync_sg_for_cpu() and
the check !plat_device_is_coherent() in mips_dma_sync_sg_for_device()
can be moved outside the for loop.
As a side effect, this also avoids a GCC bug that caused kernel compile
to fail with the error:
arch/mips/mm/dma-default.c: In function 'mips_dma_sync_sg_for_cpu':
arch/mips/mm/dma-default.c:316:1: internal compiler error: in add_insn_before, at emit-rtl.c:3852
This gcc failure is seen in Code Sourcery toolchains [e.g. gcc version
4.7.2 (Sourcery CodeBench Lite 2012.09-99)] after commit "MIPS: Optimize
current_cpu_type() for better code."
Signed-off-by: Jayachandran C <jchandra@broadcom.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5907/
Reviewed-by: Markos Chandras <markos.chandras@imgtec.com>
Tested-by: Markos Chandras <markos.chandras@imgtec.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
o Move current_cpu_type() to a separate header file
o #ifdefing on supported CPU types lets modern GCC know that certain
code in callers may be discarded ideally turning current_cpu_type() into
a function returning a constant.
o Use current_cpu_type() rather than direct access to struct cpuinfo_mips.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5833/
The BMIPS5000 (Zephyr) processor utilizes instruction speculation. A
stale misprediction address in either the JTB or the CRS may trigger
a prefetch inside a region that is currently being used by a DMA engine,
which is not IO-coherent. This prefetch will fetch a line into the
scache, and that line will soon become stale (ie wrong) during/after the
DMA. Mayhem ensues.
In dma-default.c, the r10000 is handled as a special case in the same way
that we want to handle Zephyr. So we generalize the exception cases into
a function, and include Zephyr as one of the processors that needs this
special care.
Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: cernekee@gmail.com
Patchwork: https://patchwork.linux-mips.org/patch/5776/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Provide a default implementation of phys_to_dma and dma_to_phys in
mach-generic/dma_coherence.h.
If CONFIG_NEED_SG_DMA_LENGTH is defined, the dma_length field in
struct scatterlist is used. Set this up in mips_dma_map_sg so that
the default mips DMA ops can be used when SWIOTLB is enabled.
Signed-off-by: Jayachandran C <jchandra@broadcom.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5409/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Some MIPS controllers have hardware I/O coherency. This patch
detects those and turns off software coherency. A new kernel
command line option also allows the user to manually turn
software coherency on or off.
Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
Having received another series of whitespace patches I decided to do this
once and for all rather than dealing with this kind of patches trickling
in forever.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Use asm-generic/dma-mapping-common.h to handle all DMA mapping operations
and establish a default get_dma_ops() that forwards all operations to the
existing code.
Augment dev_archdata to carry a pointer to the struct dma_map_ops, allowing
DMA operations to be overridden on a per device basis. Currently this is
never filled in, so the default dma_map_ops are used. A follow-on patch
sets this for Octeon PCI devices.
Also initialize the dma_debug system as it is now used if it is configured.
Includes fixes by Kevin Cernekee <cernekee@gmail.com>.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Patchwork: http://patchwork.linux-mips.org/patch/1637/
Patchwork: http://patchwork.linux-mips.org/patch/1678/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This only matters for ISA devices with a 24-bit DMA limit or for devices
with a 32-bit DMA limit on systems with ZONE_DMA32 enabled. The latter
currently only affects 32-bit PCI cards on Sibyte-based systems with more
than 1GB RAM installed.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Architectures implement dma_is_consistent() in different ways (some
misinterpret the definition of API in DMA-API.txt). So it hasn't been so
useful for drivers. We have only one user of the API in tree. Unlikely
out-of-tree drivers use the API.
Even if we fix dma_is_consistent() in some architectures, it doesn't look
useful at all. It was invented long ago for some old systems that can't
allocate coherent memory at all. It's better to export only APIs that are
definitely necessary for drivers.
Let's remove this API.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
The ohci-sm501 driver requires dma_declare_coherent_memory(). It is used
by the driver's local memory allocation with dma_alloc_coherent().
Tested on TANBAC TB0287(VR4131 + SM501).
[Ralf: Fixed reject in dma-default.c and removed the entire #if 0'ed block
in dma-mapping.h instead of just the #if 0.]
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Synchronize dma_map_page/dma_unmap_page and dma_map_single/dma_unmap_single.
This will reduce unnecessary writebacks and invalidates.
[Ralf: make dma_unmap_page an inline function.]
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
dma_cache_wback_inv() expects virtual address, but physical was provided
due to translation via plat_dma_addr_to_phys().
If replaced with dma_addr_to_virt(), page fault oops from dma_unmap_page()
is gone on au1550 platform.
Signed-off-by: Jan Nikitenko <jan.nikitenko@gmail.com>
Acked-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We add a dev parameter to plat_unmap_dma_mem(), and hooks for
plat_dma_supported() and plat_extra_sync_for_device() which should be
nop changes for all existing targets.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
dma_free_noncoherent() and dma_free_coherent() are missing calls to
plat_unmap_dma_mem(). This patch adds them.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We were getting away with this for so long only because the only platform
with a non-empty plat_unmap_dma_mem() doesn't call dma_sync_sg_for_cpu()
and dma_sync_sg_for_device() from its commonly used drivers.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER
architecture does:
This enables us to cleanly fix the Calgary IOMMU issue that some devices
are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423).
I think that per-device dma_mapping_ops support would be also helpful for
KVM people to support PCI passthrough but Andi thinks that this makes it
difficult to support the PCI passthrough (see the above thread). So I
CC'ed this to KVM camp. Comments are appreciated.
A pointer to dma_mapping_ops to struct dev_archdata is added. If the
pointer is non NULL, DMA operations in asm/dma-mapping.h use it. If it's
NULL, the system-wide dma_ops pointer is used as before.
If it's useful for KVM people, I plan to implement a mechanism to register
a hook called when a new pci (or dma capable) device is created (it works
with hot plugging). It enables IOMMUs to set up an appropriate
dma_mapping_ops per device.
The major obstacle is that dma_mapping_error doesn't take a pointer to the
device unlike other DMA operations. So x86 can't have dma_mapping_ops per
device. Note all the POWER IOMMUs use the same dma_mapping_error function
so this is not a problem for POWER but x86 IOMMUs use different
dma_mapping_error functions.
The first patch adds the device argument to dma_mapping_error. The patch
is trivial but large since it touches lots of drivers and dma-mapping.h in
all the architecture.
This patch:
dma_mapping_error() doesn't take a pointer to the device unlike other DMA
operations. So we can't have dma_mapping_ops per device.
Note that POWER already has dma_mapping_ops per device but all the POWER
IOMMUs use the same dma_mapping_error function. x86 IOMMUs use device
argument.
[akpm@linux-foundation.org: fix sge]
[akpm@linux-foundation.org: fix svc_rdma]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix bnx2x]
[akpm@linux-foundation.org: fix s2io]
[akpm@linux-foundation.org: fix pasemi_mac]
[akpm@linux-foundation.org: fix sdhci]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix sparc]
[akpm@linux-foundation.org: fix ibmvscsi]
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Not cache coherent R10k systems (like IP28) need to do real cache
invalidates in dma_cache_sync().
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Sibyte SOCs only have 32-bit PCI. Due to the sparse use of the address
space only the first 1GB of memory is mapped at physical addresses
below 1GB. If a system has more than 1GB of memory 32-bit DMA will
not be able to reach all of it.
For now this patch is good enough to keep Sibyte users happy but it seems
eventually something like swiotlb will be needed for Sibyte.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch fixes two places where we used plain 'x - PAGE_OFFSET' to
achieve virtual to physical address convertions. This type of convertion
is no more allowed since commit 6f284a2ce7.
Reported-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
[Build fixes for machines that don't use the generic dma-coherence.h]
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Platforms will now have to supply a function dma_device_is_coherent which
returns if a particular device participates in the coherence domain. For
most platforms this function will always return 0 or 1.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>