This patch adds support for CMA to dma-mapping subsystem for ARM
architecture. By default a global CMA area is used, but specific devices
are allowed to have their private memory areas if required (they can be
created with dma_declare_contiguous() function during board
initialisation).
Contiguous memory areas reserved for DMA are remapped with 2-level page
tables on boot. Once a buffer is requested, a low memory kernel mapping
is updated to to match requested memory access type.
GFP_ATOMIC allocations are performed from special pool which is created
early during boot. This way remapping page attributes is not needed on
allocation time.
CMA has been enabled unconditionally for ARMv6+ systems.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
CC: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Rob Clark <rob.clark@linaro.org>
Tested-by: Ohad Ben-Cohen <ohad@wizery.com>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Tested-by: Robert Nelson <robertcnelson@gmail.com>
Tested-by: Barry Song <Baohua.Song@csr.com>
On certain architectures, there might be a need to mark certain
addresses with strongly ordered memory attributes to avoid ordering
issues at the interconnect level.
On OMAP4, the asynchronous bridge buffers can only be drained
with strongly ordered accesses and hence the need to mark the
memory strongly ordered.
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Woodruff Richard <r-woodruff2@ti.com>
Tested-by: Vishwanath BS <vishwanath.bs@ti.com>
The earlier TCM memory regions were mapped as MT_MEMORY_UNCACHED
which doesn't really work on platforms supporting the new v6
features like the NX bit. Add unique MT_MEMORY_[I|D]TCM types
instead.
Cc: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch provides a device drivers, which has a omap iommu, with
address mapping APIs between device virtual address(iommu), physical
address and MPU virtual address.
There are 4 possible patterns for iommu virtual address(iova/da) mapping.
|iova/ mapping iommu_ page
| da pa va (d)-(p)-(v) function type
---------------------------------------------------------------------------
1 | c c c 1 - 1 - 1 _kmap() / _kunmap() s
2 | c c,a c 1 - 1 - 1 _kmalloc()/ _kfree() s
3 | c d c 1 - n - 1 _vmap() / _vunmap() s
4 | c d,a c 1 - n - 1 _vmalloc()/ _vfree() n*
'iova': device iommu virtual address
'da': alias of 'iova'
'pa': physical address
'va': mpu virtual address
'c': contiguous memory area
'd': dicontiguous memory area
'a': anonymous memory allocation
'()': optional feature
'n': a normal page(4KB) size is used.
's': multiple iommu superpage(16MB, 1MB, 64KB, 4KB) size is used.
'*': not yet, but feasible.
Signed-off-by: Hiroshi DOYU <Hiroshi.DOYU@nokia.com>
This patch adds a Non-cacheable Normal ARM executable memory type,
MT_MEMORY_NONCACHED.
On OMAP3, this is used for rapid dynamic voltage/frequency scaling in
the VDD2 voltage domain. OMAP3's SDRAM controller (SDRC) is in the
VDD2 voltage domain, and its clock frequency must change along with
voltage. The SDRC clock change code cannot run from SDRAM itself,
since SDRAM accesses are paused during the clock change. So the
current implementation of the DVFS code executes from OMAP on-chip
SRAM, aka "OCM RAM."
If the OCM RAM pages are marked as Cacheable, the ARM cache controller
will attempt to flush dirty cache lines to the SDRC, so it can fill
those lines with OCM RAM instruction code. The problem is that the
SDRC is paused during DVFS, and so any SDRAM access causes the ARM MPU
subsystem to hang.
TI's original solution to this problem was to mark the OCM RAM
sections as Strongly Ordered memory, thus preventing caching. This is
overkill: since the memory is marked as non-bufferable, OCM RAM writes
become needlessly slow. The idea of "Strongly Ordered SRAM" is also
conceptually disturbing. Previous LAKML list discussion is here:
http://www.spinics.net/lists/arm-kernel/msg54312.html
This memory type MT_MEMORY_NONCACHED is used for OCM RAM by a future
patch.
Cc: Richard Woodruff <r-woodruff2@ti.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Mikael Pettersson reported:
The 2.6.28-rc kernels fail to detect PCI device 0000:00:01.0
(the first ethernet port) on my Thecus n2100 XScale box.
There is however still a strange "ghost" device that gets partially
detected in 2.6.28-rc2 vanilla.
The IOP321 manual says:
The user designates the memory region containing the OCCDR as
non-cacheable and non-bufferable from the IntelR XScaleTM core.
This guarantees that all load/stores to the OCCDR are only of
DWORD quantities.
Ensure that the OCCDR is so mapped.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As of the previous commit, MT_DEVICE_IXP2000 encodes to the same
PTE bit encoding as MT_DEVICE, so it's now redundant. Convert
MT_DEVICE_IXP2000 to use MT_DEVICE instead, and remove its aliases.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch provides an ARM implementation of ioremap_wc().
We use different page table attributes depending on which CPU we
are running on:
- Non-XScale ARMv5 and earlier systems: The ARMv5 ARM documents four
possible mapping types (CB=00/01/10/11). We can't use any of the
cached memory types (CB=10/11), since that breaks coherency with
peripheral devices. Both CB=00 and CB=01 are suitable for _wc, and
CB=01 (Uncached/Buffered) allows the hardware more freedom than
CB=00, so we'll use that.
(The ARMv5 ARM seems to suggest that CB=01 is allowed to delay stores
but isn't allowed to merge them, but there is no other mapping type
we can use that allows the hardware to delay and merge stores, so
we'll go with CB=01.)
- XScale v1/v2 (ARMv5): same as the ARMv5 case above, with the slight
difference that on these platforms, CB=01 actually _does_ allow
merging stores. (If you want noncoalescing bufferable behavior
on Xscale v1/v2, you need to use XCB=101.)
- Xscale v3 (ARMv5) and ARMv6+: on these systems, we use TEXCB=00100
mappings (Inner/Outer Uncacheable in xsc3 parlance, Uncached Normal
in ARMv6 parlance).
The ARMv6 ARM explicitly says that any accesses to Normal memory can
be merged, which makes Normal memory more suitable for _wc mappings
than Device or Strongly Ordered memory, as the latter two mapping
types are guaranteed to maintain transaction number, size and order.
We use the Uncached variety of Normal mappings for the same reason
that we can't use C=1 mappings on ARMv5.
The xsc3 Architecture Specification documents TEXCB=00100 as being
Uncacheable and allowing coalescing of writes, which is also just
what we need.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Move platform independent header files to arch/arm/include/asm, leaving
those in asm/arch* and asm/plat* alone.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>