Pull dmaengine changes from Dan
1/ Bartlomiej and Dan finalized a rework of the dma address unmap
implementation.
2/ In the course of testing 1/ a collection of enhancements to dmatest
fell out. Notably basic performance statistics, and fixed / enhanced
test control through new module parameters 'run', 'wait', 'noverify',
and 'verbose'. Thanks to Andriy and Linus for their review.
3/ Testing the raid related corner cases of 1/ triggered bugs in the
recently added 16-source operation support in the ioatdma driver.
4/ Some minor fixes / cleanups to mv_xor and ioatdma.
Conflicts:
drivers/dma/dmatest.c
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The implementation of ioat3_irq_reinit has two bugs:
1/ The mode is incorrectly set to MSIX for the MSI case
2/ The 'dev_id' parameter to free_irq is the ioatdma_device not the channel in
the msi and intx case
Include a small cleanup to clarify that ioat3_irq_reinit is only for bwd
hardware
Cc: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Once we have determined that we will not have all of our desired msix
vectors there is no point in attempting a single msix allocation. The
driver will already need to read registers to determine the source of
the interrupt the fact that it is msix is moot. Fallback directly to
msi.
Reported-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Remove support for DMA unmapping from drivers as it is no longer
needed (DMA core code is now handling it).
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
[djbw: fix up chan2parent() unused warning in drivers/dma/dw/core.c]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add a hook for a common dma unmap implementation to enable removal of
the per driver custom unmap code. (A reworked version of Bartlomiej
Zolnierkiewicz's patches to remove the custom callbacks and the size
increase of dma_async_tx_descriptor for drivers that don't care about
raid).
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Dave Jiang <dave.jiang@intel.com>
[bzolnier: prepare pl330 driver for adding missing unmap while at it]
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
There have never been any real users of MEMSET operations since they
have been introduced in January 2007 by commit 7405f74bad ("dmaengine:
refactor dmaengine around dma_async_tx_descriptor"). Therefore remove
support for them for now, it can be always brought back when needed.
[sebastian.hesselbarth@gmail.com: fix drivers/dma/mv_xor]
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Acked-by: Dan Williams <djbw@fb.com>
Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Olof Johansson <olof@lixom.net>
Cc: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Intel Atom S1200 family ioatdma changed the channel reset behavior.
It does a reset similar to PCI FLR by resetting all the MSIX
registers. We have to re-init msix interrupts because of this. This
workaround is only specific to this platform and is not expected to carry
over to the later generations.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Dan Williams <djbw@fb.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Pull slave-dmaengine updates from Vinod Koul:
"This is fairly big pull by my standards as I had missed last merge
window. So we have the support for device tree for slave-dmaengine,
large updates to dw_dmac driver from Andy for reusing on different
architectures. Along with this we have fixes on bunch of the drivers"
Fix up trivial conflicts, usually due to #include line movement next to
each other.
* 'next' of git://git.infradead.org/users/vkoul/slave-dma: (111 commits)
Revert "ARM: SPEAr13xx: Pass DW DMAC platform data from DT"
ARM: dts: pl330: Add #dma-cells for generic dma binding support
DMA: PL330: Register the DMA controller with the generic DMA helpers
DMA: PL330: Add xlate function
DMA: PL330: Add new pl330 filter for DT case.
dma: tegra20-apb-dma: remove unnecessary assignment
edma: do not waste memory for dma_mask
dma: coh901318: set residue only if dma is in progress
dma: coh901318: avoid unbalanced locking
dmaengine.h: remove redundant else keyword
dma: of-dma: protect list write operation by spin_lock
dmaengine: ste_dma40: do not remove descriptors for cyclic transfers
dma: of-dma.c: fix memory leakage
dw_dmac: apply default dma_mask if needed
dmaengine: ioat - fix spare sparse complain
dmaengine: move drivers/of/dma.c -> drivers/dma/of-dma.c
ioatdma: fix race between updating ioat->head and IOAT_COMPLETION_PENDING
dw_dmac: add support for Lynxpoint DMA controllers
dw_dmac: return proper residue value
dw_dmac: fill individual length of descriptor
...
Make ioat_dma_self_test() do DMA unmapping itself and fix handling
of failure cases.
Cc: Dan Williams <djbw@fb.com>
Cc: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Dan Williams <djbw@fb.com>
CONFIG_HOTPLUG is going away as an option. As a result, the __dev*
markings need to be removed.
This change removes the use of __devinit, __devexit_p, __devinitconst,
and __devexit from these drivers.
Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.
Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Viresh Kumar <viresh.linux@gmail.com>
Cc: Dan Williams <djbw@fb.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Barry Song <baohua.song@csr.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Jassi Brar <jassisinghbrar@gmail.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1/ regression fix for Xen as it now trips over a broken assumption
about the dma address size on 32-bit builds
2/ new quirk for netdma to ignore dma channels that cannot meet
netdma alignment requirements
3/ fixes for two long standing issues in ioatdma (ring size overflow)
and iop-adma (potential stack corruption)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJPhIfhAAoJEB7SkWpmfYgCguIQAL4qF+RC9/JggSHIjfOrYiPd
yboV80GqqQHHBwy8hfZVUrIEPMebvD/xUIk6iUQNXR+6EA8Ln0jukvQMpWNnI+Cc
TXgA5Ok70an4PD1MqnCsWyCJjsyPyhprbRHurxBcesf+y96POJxhING0rcKvft50
mvYnbtrkYe9M9x3b8TBGc0JaTVeL29Ck3FtkTz4uUktbkhRNfCcfEd28NRQpf8MB
vkjbjRGBQmGsnKxYCaEhlF1GPJyTlYjg4BBWtseJgb2R9s7tvJrkotFea/NmSfjq
XCuVKjpiFp3YyJuxJERWdwqRWvyAZFfcYyZX440nG0b7GBgSn+T7A9XhUs8vMboi
tLwoDfBbJDlKMaFpHex7Z6RtZZmVl3gWDNZTqpG44n4pabd4RPip04f0k7Wfs+cp
tzU9hGAOvgsZ8w4/JgxH8YJOZbIGzbDGOA1IhWcbxIbmFTblMiFnV3TC7qfhoRbR
8qtScIE7bUck2MYVlMMn9utd9tvKFa6HNgo41+f78/4+U7zQ/VrsbA/DWQct40R5
5k+EEvyYFUzIXn79E0GVN5h4NHH5gfAs3MZ7jIgwgHedBp4Ki68XYKNu+pIV3YwG
CFTPn1mVOXnCdt+fsjG5tL9Jecx1Mij6w3nWU93ZU6cHmC77YmU+DLxPIGuyR1a2
EmpObwfq5peXzkgQpEsB
=F3IR
-----END PGP SIGNATURE-----
Merge tag 'dmaengine-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine
Pull dmaengine fixes from Dan Williams:
1/ regression fix for Xen as it now trips over a broken assumption
about the dma address size on 32-bit builds
2/ new quirk for netdma to ignore dma channels that cannot meet
netdma alignment requirements
3/ fixes for two long standing issues in ioatdma (ring size overflow)
and iop-adma (potential stack corruption)
* tag 'dmaengine-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine:
netdma: adding alignment check for NETDMA ops
ioatdma: DMA copy alignment needed to address IOAT DMA silicon errata
ioat: ring size variables need to be 32bit to avoid overflow
iop-adma: Corrected array overflow in RAID6 Xscale(R) test.
ioat: fix size of 'completion' for Xen
Starting with v3.2 Jonathan reports that Xen crashes loading the ioatdma
driver. A debug run shows:
ioatdma 0000:00:16.4: desc[0]: (0x300cc7000->0x300cc7040) cookie: 0 flags: 0x2 ctl: 0x29 (op: 0 int_en: 1 compl: 1)
...
ioatdma 0000:00:16.4: ioat_get_current_completion: phys_complete: 0xcc7000
...which shows that in this environment GFP_KERNEL memory may be backed
by a 64-bit dma address. This breaks the driver's assumption that an
unsigned long should be able to contain the physical address for
descriptor memory. Switch to dma_addr_t which beyond being the right
size, is the true type for the data i.e. an io-virtual address
inidicating the engine's last processed descriptor.
[stable: 3.2+]
Cc: <stable@vger.kernel.org>
Reported-by: Jonathan Nieder <jrnieder@gmail.com>
Reported-by: William Dauchy <wdauchy@gmail.com>
Tested-by: William Dauchy <wdauchy@gmail.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Ensure all DMA engine drivers initialize their cookies in the same way,
so that they all behave in a similar fashion. This means their first
issued cookie will be 2 rather than 1, and will increment to INT_MAX
before returning 1 and starting over.
In connection with this, Dan Williams said:
> Russell King wrote:
> > Secondly, some DMA engine drivers initialize the dma_chan cookie to 0,
> > others to 1. Is there a reason for this, or are these all buggy?
>
> I know that ioat and iop-adma expect 0 to mean "I have cleaned up this
> descriptor and it is idle", and would break if zero was an in-flight
> cookie value. The reserved usage of zero is an driver internal
> concern, but I have no problem formalizing it as a reserved value.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
[imx-sdma.c & mxs-dma.c]
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Now that we have the completed cookie in the dma_chan structure, we
can consolidate the tx_status functions by providing a function to set
the txstate structure and returning the DMA status. We also provide
a separate helper to set the residue for cookies which are still in
progress.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
[imx-sdma.c & mxs-dma.c]
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Provide a common function to do the cookie mechanics for completing
a DMA descriptor.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
[imx-sdma.c & mxs-dma.c]
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Everyone deals with assigning DMA cookies in the same way (it's part of
the API so they should be), so lets consolidate the common code into a
helper function to avoid this duplication.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
[imx-sdma.c & mxs-dma.c]
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Add a local private header file to contain definitions and declarations
which should only be used by DMA engine drivers.
We also fix linux/dmaengine.h to use LINUX_DMAENGINE_H to guard against
multiple inclusion.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
[imx-sdma.c & mxs-dma.c]
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Every DMA engine implementation declares a last completed dma cookie
in their private dma channel structures. This is pointless, and
forces driver specific code. Move this out into the common dma_chan
structure.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
[imx-sdma.c & mxs-dma.c]
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
After discovering that wide use of prefetch on modern CPUs
could be a net loss instead of a win, net drivers which were
relying on the implicit inclusion of prefetch.h via the list
headers showed up in the resulting cleanup fallout. Give
them an explicit include via the following $0.02 script.
=========================================
#!/bin/bash
MANUAL=""
for i in `git grep -l 'prefetch(.*)' .` ; do
grep -q '<linux/prefetch.h>' $i
if [ $? = 0 ] ; then
continue
fi
( echo '?^#include <linux/?a'
echo '#include <linux/prefetch.h>'
echo .
echo w
echo q
) | ed -s $i > /dev/null 2>&1
if [ $? != 0 ]; then
echo $i needs manual fixup
MANUAL="$i $MANUAL"
fi
done
echo ------------------- 8\<----------------------
echo vi $MANUAL
=========================================
Signed-off-by: Paul <paul.gortmaker@windriver.com>
[ Fixed up some incorrect #include placements, and added some
non-network drivers and the fib_trie.c case - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Convert the device_is_tx_complete() operation on the
DMA engine to a generic device_tx_status()operation which
can return three states, DMA_TX_RUNNING, DMA_TX_COMPLETE,
DMA_TX_PAUSED.
[dan.j.williams@intel.com: update for timberdale]
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Li Yang <leoli@freescale.com>
Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Cc: Magnus Damm <damm@opensource.se>
Cc: Liam Girdwood <lrg@slimlogic.co.uk>
Cc: Joe Perches <joe@perches.com>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Constify struct sysfs_ops.
This is part of the ops structure constification
effort started by Arjan van de Ven et al.
Benefits of this constification:
* prevents modification of data that is shared
(referenced) by many other structure instances
at runtime
* detects/prevents accidental (but not intentional)
modification attempts on archs that enforce
read-only kernel data at runtime
* potentially better optimized code as the compiler
can assume that the const data cannot be changed
* the compiler/linker move const data into .rodata
and therefore exclude them from false sharing
Signed-off-by: Emese Revfy <re.emese@gmail.com>
Acked-by: David Teigland <teigland@redhat.com>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Rename for_each_bit to for_each_set_bit in the kernel source tree. To
permit for_each_clear_bit(), should that ever be added.
The patch includes a macro to map the old for_each_bit() onto the new
for_each_set_bit(). This is a (very) temporary thing to ease the migration.
[akpm@linux-foundation.org: add temporary for_each_bit()]
Suggested-by: Alexey Dobriyan <adobriyan@gmail.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the calling convention of ->timer_fn() and ->cleanup_fn() are unified
across hardware versions we can drop parameters to ioat_init_channel() and
unify ioat_is_dma_complete() implementations.
Both ->timer_fn() and ->cleanup_fn() are modified to expect a struct
dma_chan pointer.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Put the ioat2 and ioat3 state machines in the halted state with all
errors cleared.
The ioat1 init path is not disturbed for stability, there are no
reported ioat1 initiaization issues.
Cc: <stable@kernel.org>
Reported-by: Roland Dreier <rdreier@cisco.com>
Tested-by: Roland Dreier <rdreier@cisco.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Drop ioatdma's use of tx_list from struct dma_async_tx_descriptor in
preparation for removal of this field.
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This adds a hardware specific self test to be called from ioat_probe.
In the ioat3 case we will have tests for all the different raid
operations, while ioat1 and ioat2 will continue to just test memcpy.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Export driver attributes for diagnostic purposes:
'ring_size': total number of descriptors available to the engine
'ring_active': number of descriptors in-flight
'capabilities': supported operation types for this channel
'version': Intel(R) QuickData specfication revision
This also allows some chattiness to be removed from the driver startup
as this information is now available via sysfs.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Up until this point the driver for Intel(R) QuickData Technology
engines, specification versions 2 and 3, were mostly identical save for
a few quirks. Version 3.2 hardware adds many new capabilities (like
raid offload support) requiring some infrastructure that is not relevant
for v2. For better code organization of the new funcionality move v3
and v3.2 support to its own file dma_v3.c, and export some routines from
the base files (dma.c and dma_v2.c) that can be reused directly.
The first new capability included in this code reorganization is support
for v3.2 memset operations.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In order to support dynamic resizing of the descriptor ring or polling
for a descriptor in the presence of a hung channel the reset handler
needs to make progress while in a non-preemptible context. The current
workqueue implementation precludes polling channel reset completion
under spin_lock().
This conversion also allows us to return to opportunistic cleanup in the
ioat2 case as the timer implementation guarantees at least one cleanup
after every descriptor is submitted. This means the worst case
completion latency becomes the timer frequency (for exceptional
circumstances), but with the benefit of avoiding busy waiting when the
lock is contended.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Save 4 bytes per software descriptor by transmitting tx_cnt in an unused
portion of the hardware descriptor.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Mark all single use initialization routines with __devinit.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The register write in ioat_dma_cleanup_tasklet is unfortunate in two
ways:
1/ It clears the extra 'enable' bits that we set at alloc_chan_resources time
2/ It gives the impression that it disables interrupts when it is in
fact re-arming interrupts
[ Impact: fix, persist the value of the chanctrl register when re-arming ]
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Don't trust that the reserved bits are always zero, also sanity check
the returned value.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The cleanup path makes an effort to only perform an atomic read of the
64-bit completion address. However in the 32-bit case it does not
matter if we read the upper-32 and lower-32 non-atomically because the
upper-32 will always be zero.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Provide some output for debugging the driver.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The unified ioat1/ioat2 ioat_dma_unmap() implementation derives the
source and dest addresses from the unmap descriptor. There is no longer
a need to track this information in struct ioat_desc_sw.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Replace the current linked list munged into a ring with a native ring
buffer implementation. The benefit of this approach is reduced overhead
as many parameters can be derived from ring position with simple pointer
comparisons and descriptor allocation/freeing becomes just a
manipulation of head/tail pointers.
It requires a contiguous allocation for the software descriptor
information.
Since this arrangement is significantly different from the ioat1 chain,
move ioat2,3 support into its own file and header. Common routines are
exported from driver/dma/ioat/dma.[ch].
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Prepare the code for the conversion of the ioat2 linked-list-ring into a
native ring buffer. After this conversion ioat2 channels will share
less of the ioat1 infrastructure, but there will still be places where
sharing is possible. struct ioat_chan_common is created to house the
channel attributes that will remain common between ioat1 and ioat2
channels.
For every routine that accesses both common and hardware specific fields
the old unified 'ioat_chan' pointer is split into an 'ioat' and 'chan'
pointer. Where 'chan' references common fields and 'ioat' the
hardware/version specific.
[ Impact: pure structure member movement/variable renames, no logic changes ]
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
If a callback is to be attached to a descriptor the channel needs to
know at ->prep time so it can set the interrupt enable bit. This is in
preparation for moving descriptor ioat2 descriptor preparation from
->submit to ->prep.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The async_tx api assumes that after a successful ->prep a subsequent
->submit will not fail due to a lack of resources.
This also fixes a bug in the allocation failure case. Previously the
descriptors allocated prior to the allocation failure would not be
returned to the free list.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This cleans up a mess of and'ing and or'ing bit definitions, and allows
simple assignments from the specified dma_ctrl_flags parameter.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Towards the removal of ioatdma_device.version split the initialization
path into distinct versions. This conversion:
1/ moves version specific probe code to version specific routines
2/ removes the need for ioat_device
3/ turns off the ioat1 msi quirk if the device is reinitialized for intx
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* reduce device->common. to dma-> in ioat_dma_{probe,remove,selftest}
* ioat_lookup_chan_by_index to ioat_chan_by_index
* multi-line function definitions
* ioat_desc_sw.async_tx to ioat_desc_sw.txd
* desc->txd. to tx-> in cleanup routine
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The driver currently duplicates much of what these routines offer, so
just use the common code. For example ->irq_mode tracks what interrupt
mode was initialized, which duplicates the ->msix_enabled and
->msi_enabled handling in pcim_release.
This also adds a check to the return value of dma_async_device_register,
which can fail.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>