Building the caam driver on arm64 produces a harmless warning:
drivers/crypto/caam/caamalg.c:140:139: warning: comparison of distinct pointer types lacks a cast
We can use min_t to tell the compiler which type we want it to use
here.
Fixes: 5ecf8ef910 ("crypto: caam - fix sg dump")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Shared descriptors used by ahash_final() and ahash_finup()
are identical, thus get rid of one of them (sh_desc_finup).
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The pointer to the descriptor buffer is not touched,
it always points to start of the descriptor buffer.
Thus, make it const.
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
sec4_sg_entry structure is used only by helper functions in sg_sw_sec4.h.
Since SEC HW S/G entries are to be manipulated only indirectly, via these
functions, move sec4_sg_entry to the corresponding header.
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This reverts commit 66d2e20280.
Quoting from Russell's findings:
https://www.mail-archive.com/linux-crypto@vger.kernel.org/msg21136.html
[quote]
Okay, I've re-tested, using a different way of measuring, because using
openssl speed is impractical for off-loaded engines. I've decided to
use this way to measure the performance:
dd if=/dev/zero bs=1048576 count=128 | /usr/bin/time openssl dgst -md5
For the threaded IRQs case gives:
0.05user 2.74system 0:05.30elapsed 52%CPU (0avgtext+0avgdata 2400maxresident)k
0.06user 2.52system 0:05.18elapsed 49%CPU (0avgtext+0avgdata 2404maxresident)k
0.12user 2.60system 0:05.61elapsed 48%CPU (0avgtext+0avgdata 2460maxresident)k
=> 5.36s => 25.0MB/s
and the tasklet case:
0.08user 2.53system 0:04.83elapsed 54%CPU (0avgtext+0avgdata 2468maxresident)k
0.09user 2.47system 0:05.16elapsed 49%CPU (0avgtext+0avgdata 2368maxresident)k
0.10user 2.51system 0:04.87elapsed 53%CPU (0avgtext+0avgdata 2460maxresident)k
=> 4.95 => 27.1MB/s
which corresponds to an 8% slowdown for the threaded IRQ case. So,
tasklets are indeed faster than threaded IRQs.
[...]
I think I've proven from the above that this patch needs to be reverted
due to the performance regression, and that there _is_ most definitely
a deterimental effect of switching from tasklets to threaded IRQs.
[/quote]
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
alkcipher_edesc_alloc() and ablkcipher_giv_edesc_alloc() don't
free / unmap resources on error path:
- dmap_map_sg() could fail, thus make sure the return value is checked
- unmap DMA mappings in case of error
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
ERRID is a 4-bit field.
Since err_id values are in [0..15] and err_id_list array size is 16,
the condition "err_id < ARRAY_SIZE(err_id_list)" is always true.
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
REG3 no longer needs to be updated, since it's not used after that.
This shared descriptor command is a leftover of the conversion to
AEAD interface.
Fixes: 479bcc7c5b "crypto: caam - Convert authenc to new AEAD interface"
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix the following smatch warnings:
drivers/crypto/caam/caamalg.c:2350 aead_edesc_alloc() warn: we tested 'src_nents' before and it was 'true'
drivers/crypto/caam/caamrng.c:351 caam_rng_init() error: no modifiers for allocation.
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
1. fix HDR_START_IDX_MASK, HDR_SD_SHARE_MASK, HDR_JD_SHARE_MASK
Define HDR_START_IDX_MASK consistently with the other masks:
mask = bitmask << offset
2. OP_ALG_TYPE_CLASS1 and OP_ALG_TYPE_CLASS2 must be shifted.
3. fix FIFO_STORE output data type value for AFHA S-Box
4. fix OPERATION pkha modular arithmetic source mask
5. rename LDST_SRCDST_WORD_CLASS1_ICV_SZ to
LDST_SRCDST_WORD_CLASS1_IV_SZ (it refers to IV, not ICV).
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Commit 4464a7d4f5
("crypto: caam - remove error propagation handling")
removed error propagation handling only from caamalg.
Do this in all other places: caamhash, caamrng.
Update descriptors' lengths appropriately.
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The AEAD givenc descriptor relies on moving the IV through the
output FIFO and then back to the CTX2 for authentication. The
SEQ FIFO STORE could be scheduled before the data can be
read from OFIFO, especially since the SEQ FIFO LOAD needs
to wait for the SEQ FIFO LOAD SKIP to finish first. The
SKIP takes more time when the input is SG than when it's
a contiguous buffer. If the SEQ FIFO LOAD is not scheduled
before the STORE, the DECO will hang waiting for data
to be available in the OFIFO so it can be transferred to C2.
In order to overcome this, first force transfer of IV to C2
by starting the "cryptlen" transfer first and then starting to
store data from OFIFO to the output buffer.
Fixes: 1acebad3d8 ("crypto: caam - faster aead implementation")
Cc: <stable@vger.kernel.org> # 3.2+
Signed-off-by: Alex Porosanu <alexandru.porosanu@nxp.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Switch to resource-managed function devm_kzalloc instead
of kzalloc and remove unneeded kfree
Also, remove kfree in probe function and remove
function, mv_remove as it is now has nothing to do.
The Coccinelle semantic patch used to make this change is as follows:
//<smpl>
@platform@
identifier p, probefn, removefn;
@@
struct platform_driver p = {
.probe = probefn,
.remove = removefn,
};
@prb@
identifier platform.probefn, pdev;
expression e, e1, e2;
@@
probefn(struct platform_device *pdev, ...) {
<+...
- e = kzalloc(e1, e2)
+ e = devm_kzalloc(&pdev->dev, e1, e2)
...
?-kfree(e);
...+>
}
@rem depends on prb@
identifier platform.removefn;
expression prb.e;
@@
removefn(...) {
<...
- kfree(e);
...>
}
//</smpl>
Signed-off-by: Nadim Almas <nadim.902@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Trivial fix to spelling mistake "pointeur" to "pointer"
in dev_err message
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The exponent size in the ccp_op structure is in bits. A v5
CCP requires the exponent size to be in bytes, so convert
the size from bits to bytes when populating the descriptor.
The current code references the exponent in memory, but
these fields have not been set since the exponent is
actually store in the LSB. Populate the descriptor with
the LSB location (address).
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When using AES-XTS on a Wandboard, we receive a Mode error:
caam_jr 2102000.jr1: 20001311: CCB: desc idx 19: AES: Mode error.
According to the Security Reference Manual, the Low Power AES units
of the i.MX6 do not support the XTS mode. Therefore we must not
register XTS implementations in the Crypto API.
Signed-off-by: Sven Ebenfeld <sven.ebenfeld@gmail.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Cc: <stable@vger.kernel.org> # 4.4+
Fixes: c6415a6016 "crypto: caam - add support for acipher xts(aes)"
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that lazy FPU is gone, we don't use CR0.TS (except possibly in
KVM guest mode). Remove irq_ts_save(), irq_ts_restore(), and all of
their callers.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm list <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/70b9b9e7ba70659bedcb08aba63d0f9214f338f2.1477951965.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Building the caam driver on arm64 produces a harmless warning:
drivers/crypto/caam/caamalg.c:140:139: warning: comparison of distinct pointer types lacks a cast
We can use min_t to tell the compiler which type we want it to use
here.
Fixes: 5ecf8ef910 ("crypto: caam - fix sg dump")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Trivial fix to typo in dev_dbg message
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There is no need to have the 'struct atmel_aes_dev *aes_dd' variable
static since new value always be assigned before use it.
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix a few problems revealed by testing: verify consistent
units, especially in public slot allocation. Percolate
some common initialization code up to a common routine.
Add some comments.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Clean up patch for an unneeded structure member.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Bit fields are not sensitive to endianness, so use
a transparent standard data type
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fixes the following sparse warning:
drivers/crypto/ccp/ccp-dev.c:44:6: warning:
symbol 'ccp_error_codes' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
i.MX6UL does only require three clocks to enable CAAM module.
Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The lsb field uses a value of -1 to indicate that it
is unassigned. Therefore type must be a signed int.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The size used in 'dma_free_coherent()' looks un-initialized here.
ctx->sa_len is set a few lines below and is apparently not set by the
caller.
So use 'size' as in the corresponding 'dma_alloc_coherent()' a few lines
above.
This has been spotted with coccinelle, using the following script:
////////////////////
@r@
expression x0, x1, y0, y1, z0, z1, t0, t1, ret;
@@
* ret = dma_alloc_coherent(x0, y0, z0, t0);
...
* dma_free_coherent(x1, y1, ret, t1);
@script:python@
y0 << r.y0;
y1 << r.y1;
@@
if y1.find(y0) == -1:
print "WARNING: sizes look different: '%s' vs '%s'" % (y0, y1)
////////////////////
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently, the driver breaks chain for all kind of hash requests in order to
don't override intermediate states of partial ahash updates. However, some final
ahash requests can be directly processed by the engine, and so without
intermediate state. This is typically the case for most for the HMAC requests
processed via IPSec.
This commits adds a TDMA descriptor to copy context for these of requests
into the "op" dma pool, then it allow to chain these requests at the DMA level.
The 'complete' operation is also updated to retrieve the MAC digest from the
right location.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
So far, we used a dedicated dma pool to copy the result of outer IV for
cipher requests. Instead of using a dma pool per outer data, we prefer
use the op dma pool that contains all part of the request from the SRAM.
Then, the outer data that is likely to be used by the 'complete'
operation, is copied later. In this way, any type of result can be
retrieved by DMA for cipher or ahash requests.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the xts(aes) algorithm, which is supported from
hardware version 0x500 and above (sama5d2x).
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch fixes a compiler error when VERBOSE_DEBUG is defined. Indeed,
in atmel_aes_write(), the 3rd argument of atmel_aes_reg_name() was
missing.
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Reported-by: Levent Demir <levent.demir@inria.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull crypto updates from Herbert Xu:
"Here is the crypto update for 4.9:
API:
- The crypto engine code now supports hashes.
Algorithms:
- Allow keys >= 2048 bits in FIPS mode for RSA.
Drivers:
- Memory overwrite fix for vmx ghash.
- Add support for building ARM sha1-neon in Thumb2 mode.
- Reenable ARM ghash-ce code by adding import/export.
- Reenable img-hash by adding import/export.
- Add support for multiple cores in omap-aes.
- Add little-endian support for sha1-powerpc.
- Add Cavium HWRNG driver for ThunderX SoC"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (137 commits)
crypto: caam - treat SGT address pointer as u64
crypto: ccp - Make syslog errors human-readable
crypto: ccp - clean up data structure
crypto: vmx - Ensure ghash-generic is enabled
crypto: testmgr - add guard to dst buffer for ahash_export
crypto: caam - Unmap region obtained by of_iomap
crypto: sha1-powerpc - little-endian support
crypto: gcm - Fix IV buffer size in crypto_gcm_setkey
crypto: vmx - Fix memory corruption caused by p8_ghash
crypto: ghash-generic - move common definitions to a new header file
crypto: caam - fix sg dump
hwrng: omap - Only fail if pm_runtime_get_sync returns < 0
crypto: omap-sham - shrink the internal buffer size
crypto: omap-sham - add support for export/import
crypto: omap-sham - convert driver logic to use sgs for data xmit
crypto: omap-sham - change the DMA threshold value to a define
crypto: omap-sham - add support functions for sg based data handling
crypto: omap-sham - rename sgl to sgl_tmp for deprecation
crypto: omap-sham - align algorithms on word offset
crypto: omap-sham - add context export/import stubs
...
Even for i.MX, CAAM is able to use address pointers greater than
32 bits, the address pointer field being interpreted as a double word.
Enforce u64 address pointer in the sec4_sg_entry struct.
This patch fixes the SGT address pointer endianness issue for
32bit platforms where core endianness != caam endianness.
Signed-off-by: Tudor Ambarus <tudor-dan.ambarus@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add human-readable strings to log messages about CCP errors
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Change names of data structure instances. Add const
keyword where appropriate. Add error handling path.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Select CRYPTO_GHASH for vmx_crypto since p8_ghash uses it as the
fallback implementation.
Signed-off-by: Marcelo Cerri <marcelo.cerri@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Free memory mapping, if probe is not successful.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch changes the p8_ghash driver to use ghash-generic as a fixed
fallback implementation. This allows the correct value of descsize to be
defined directly in its shash_alg structure and avoids problems with
incorrect buffer sizes when its state is exported or imported.
Reported-by: Jan Stancek <jstancek@redhat.com>
Fixes: cc333cd68d ("crypto: vmx - Adding GHASH routines for VMX module")
Cc: stable@vger.kernel.org
Signed-off-by: Marcelo Cerri <marcelo.cerri@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ensure scatterlists have a virtual memory mapping before dumping.
Signed-off-by: Catalin Vasile <cata.vasile@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The current internal buffer size is way too large for crypto core, so
shrink it to be smaller. This makes the buffer to fit into the space
reserved for the export/import buffers also.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that the driver has been converted to use scatterlists for data
handling, add proper implementation for the export/import stubs also.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently, the internal buffer has been used for data transmission. Change
this so that scatterlists are used instead, and change the driver to
actually use the previously introduced helper functions for scatterlist
preparation.
This patch also removes the old buffer handling code which is no longer
needed.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the threshold value was hardcoded in the driver. Having a define
for it makes it easier to configure.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently omap-sham uses a huge internal buffer for caching data, and
pushing this out to the DMA as large chunks. This, unfortunately,
doesn't work too well with the export/import functionality required
for ahash algorithms, and must be changed towards more scatterlist
centric approach.
This patch adds support functions for (mostly) scatterlist based data
handling. omap_sham_prepare_request() prepares a scatterlist for DMA
transfer to SHA crypto accelerator. This requires checking the data /
offset / length alignment of the data, splitting the data to SHA block
size granularity, and adding any remaining data back to the buffer.
With this patch, the code doesn't actually go live yet, the support code
will be taken properly into use with additional patches that modify the
SHA driver functionality itself.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The current usage of sgl will be deprecated, and will be replaced by an
array required by the sg based driver implementation. Rename the existing
variable as sgl_tmp so that it can be removed from the driver easily later.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
OMAP HW generally expects data for DMA to be on word boundary, so make the
SHA driver inform crypto framework of the same preference.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Initially these just return -ENOTSUPP to indicate that they don't
really do anything yet. Some sort of implementation is required
for the driver to at least probe.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We get 1 warning when building kernel with W=1:
drivers/crypto/sunxi-ss/sun4i-ss-hash.c:168:5: warning: no previous prototype for 'sun4i_hash' [-Wmissing-prototypes]
In fact, this function is only used in the file in which it is
declared and don't need a declaration, but can be made static.
So this patch marks it 'static'.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix the retrn value check which testing the wrong variable
in ccp_dmaengine_register().
Fixes: 58ea8abf49 ("crypto: ccp - Register the CCP as a DMA resource")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move statements for error handling which were identical
in two if branches to the end of these functions.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The local variable "ret" will be set to an appropriate value a bit later.
Thus omit the explicit initialisation at the beginning.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* Return a value at the end without storing it in an intermediate variable.
* Delete the local variable "ret" which became unnecessary with
this refactoring.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Adjust jump labels according to the current Linux coding style convention.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Adjust jump labels according to the current Linux coding style convention.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus use the corresponding function "kmalloc_array".
This issue was detected by using the Coccinelle software.
* Replace the specification of a data type by a pointer dereference
to make the corresponding size determination a bit safer according to
the Linux coding style convention.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Using kmem_cache_zalloc() instead of kmem_cache_alloc() and memset().
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix to return error code -ENOMEM from the crypto_engine_alloc_init()
error handling case instead of 0, as done elsewhere in this function.
Fixes: 0529900a01 ("crypto: omap-aes - Support crypto engine framework")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix to return error code -ENOMEM from the crypto_engine_alloc_init()
error handling case instead of 0, as done elsewhere in this function.
Fixes: f1b77aaca8 ("crypto: omap-des - Integrate with the crypto
engine framework")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Allocate resources dynamically to cxgb4's Upper layer driver's(ULD) like
cxgbit, iw_cxgb4 and cxgb4i. Allocate resources when they register with
cxgb4 driver and free them while unregistering. All the queues and the
interrupts for them will be allocated during ULD probe only and freed
during remove.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a memory leak in an error path in uc loader.
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The crypto engine must be initialized before registering algorithms,
otherwise the test manager will crash as it attempts to execute
tests for the algos while they are being registered.
Fixes: f1b77aaca8 ("crypto: omap-des - Integrate with the crypto engine framework")
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The crypto engine must be initialized before registering algorithms,
otherwise the test manager will crash as it attempts to execute
tests for the algos while they are being registered.
Fixes: 0529900a01 ("crypto: omap-aes - Support crypto engine framework")
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As setting up the DMA operations is quite costly, add software fallback
support for requests smaller than 200 bytes. This change gives some 10%
extra performance in ipsec use case.
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
[t-kristo@ti.com: udpated against latest upstream, to use skcipher mainly]
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Some SoCs like omap4/omap5/dra7 contain multiple AES crypto accelerator
cores. Adapt the driver to support this. The driver picks the last used
device from a list of AES devices.
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
[t-kristo@ti.com: forward ported to 4.7 kernel]
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Calling runtime PM API at the cra_init/exit is bad for power management
purposes, as the lifetime for a CRA can be very long. Instead, use
pm_runtime autosuspend approach for handling the device clocks. Clocks
are enabled when they are actually required, and autosuspend disables
these if they have not been used for a sufficiently long time period.
By default, the timeout value is 1 second.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If software fallback is used on older hardware accelerator setup (OMAP2/
OMAP3), the first block of data must be purged from the buffer. The
first block contains the pre-generated ipad value required by the HW,
but the software fallback algorithm generates its own, causing wrong
results.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If we have processed any data with the hardware accelerator (digcnt > 0),
we must complete the entire hash by using it. This is because the current
hash value can't be imported to the software fallback algorithm. Otherwise
we end up with wrong hash results.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Some of the call paths of OMAP SHA driver can avoid executing the next
step of the crypto queue under tasklet; instead, execute the next step
directly via function call. This avoids a costly round-trip via the
scheduler giving a slight performance boost.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Conflicts:
drivers/net/ethernet/mediatek/mtk_eth_soc.c
drivers/net/ethernet/qlogic/qed/qed_dcbx.c
drivers/net/phy/Kconfig
All conflicts were cases of overlapping commits.
Signed-off-by: David S. Miller <davem@davemloft.net>
Drivers should not use NO_IRQ, as we are trying to get rid of that.
In this case, the call to irq_of_parse_and_map() is both wrong
(as it returns '0' on failure, not NO_IRQ) and unnecessary
(as platform_get_irq() does the same thing)
This removes the call to irq_of_parse_and_map() and checks for
the error code correctly.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
ccp_dmaengine_register used to return with an error code before
releasing all resource. This patch adds a jump to the appropriate label
ensuring that the resources are properly released before returning.
This issue was found with Hector.
Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-nonce is being loaded using append_load_imm_u32() instead of
append_load_as_imm() (nonce is a byte array / stream, not a 4-byte
variable)
-counter is not being added in big endian format, as mandatated by
RFC3686 and expected by the crypto engine
Signed-off-by: Catalin Vasile <cata.vasile@nxp.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The current crypto engine allow only ablkcipher_request to be enqueued.
Thus denying any use of it for hardware that also handle hash algo.
This patch modify the API for allowing to enqueue ciphers and hash.
Since omap-aes/omap-des are the only users, this patch also convert them
to the new cryptoengine API.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch move the whole crypto engine API to its own header
crypto/engine.h.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix incorrect value of ADF_C3XXX_ACCELERATORS_MASK.
Signed-off-by: Maksim Lukoshkov <maksim.lukoshkov@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Copy const_tab array into DMA-able memory (accesible by qat hw).
Signed-off-by: Maksim Lukoshkov <maksim.lukoshkov@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We get 1 warning when biuld kernel with W=1:
drivers/crypto/caam/ctrl.c:398:5: warning: no previous prototype for 'caam_get_era' [-Wmissing-prototypes]
In fact, this function is declared in drivers/crypto/caam/ctrl.h,
so this patch add missing header dependencies.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
For algorithms that implement IV generators before the crypto ops,
the IV needed for decryption is initially located in req->src
scatterlist, not in req->iv.
Avoid copying the IV into req->iv by modifying the (givdecrypt)
descriptors to load it directly from req->src.
aead_givdecrypt() is no longer needed and goes away.
Cc: <stable@vger.kernel.org> # 4.3+
Fixes: 479bcc7c5b ("crypto: caam - Convert authenc to new AEAD interface")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fixes the following sparse warning:
drivers/crypto/chelsio/chcr_algo.c:593:5: warning:
symbol 'cxgb4_is_crypto_q_full' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
If devm_add_action() fails we are explicitly calling the cleanup to free
the resources allocated. Lets use the helper devm_add_action_or_reset()
and return directly in case of error, as we know that the cleanup function
has been already called by the helper if there was any error.
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
clk_prepare_enable() may fail, so we should better check its return
value and propagate it in the case of failure.
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add the missing unlock before return from function sun4i_hash()
in the error handling case.
Fixes: 477d9b2e59 ("crypto: sun4i-ss - unify update/final function")
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Acked-by: Corentin LABBE <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
walk.iv is not assigned a value in blkcipher_walk_init. It makes iv uninitialized.
It is possibly a null value(as shown below), which is then used by aes_p8_encrypt.
This patch moves iv = walk.iv after blkcipher_walk_virt, in which walk.iv is set.
[17856.268050] Unable to handle kernel paging request for data at address 0x00000000
[17856.268212] Faulting instruction address: 0xd000000002ff04bc
7:mon> t
[link register ] d000000002ff47b8 p8_aes_xts_crypt+0x168/0x2a0 [vmx_crypto] (938)
[c000000013b77960] d000000002ff4794 p8_aes_xts_crypt+0x144/0x2a0 [vmx_crypto] (unreliable)
[c000000013b77a70] c000000000544d64 skcipher_decrypt_blkcipher+0x64/0x80
[c000000013b77ac0] d000000003c0175c crypt_convert+0x53c/0x620 [dm_crypt]
[c000000013b77ba0] d000000003c043fc kcryptd_crypt+0x3cc/0x440 [dm_crypt]
[c000000013b77c50] c0000000000f3070 process_one_work+0x1e0/0x590
[c000000013b77ce0] c0000000000f34c8 worker_thread+0xa8/0x660
[c000000013b77d80] c0000000000fc0b0 kthread+0x110/0x130
[c000000013b77e30] c0000000000098f0 ret_from_kernel_thread+0x5c/0x6c
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Increase value of supported key sizes for qat_aes_xts.
aes-xts keys consists of keys of equal size concatenated.
Fixes: def14bfaf3 ("crypto: qat - add support for ctr(aes) and xts(aes)")
Cc: stable@vger.kernel.org
Reported-by: Wenqian Yu <wenqian.yu@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Adds the config entry for the Chelsio Crypto Driver, Makefile changes
for the same.
Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Chelsio's Crypto Hardware can perform the following operations:
SHA1, SHA224, SHA256, SHA384 and SHA512, HMAC(SHA1), HMAC(SHA224),
HMAC(SHA256), HMAC(SHA384), HAMC(SHA512), AES-128-CBC, AES-192-CBC,
AES-256-CBC, AES-128-XTS, AES-256-XTS
This patch implements the driver for above mentioned features. This
driver is an Upper Layer Driver which is attached to Chelsio's LLD
(cxgb4) and uses the queue allocated by the LLD for sending the crypto
requests to the Hardware and receiving the responses from it.
The crypto operations can be performed by Chelsio's hardware from the
userspace applications and/or from within the kernel space using the
kernel's crypto API.
The above mentioned crypto features have been tested using kernel's
tests mentioned in testmgr.h. They also have been tested from user
space using libkcapi and Openssl.
Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following sparse warning:
drivers/crypto/ccp/ccp-dev.c:62:14: warning:
symbol 'ccp_increment_unit_ordinal' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Two crypto alg are badly indented, this patch fix this style issue.
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The dev *ss is stored both in sun4i_tfm_ctx and sun4i_req_ctx.
Since this pointer will never be changed during tfm life, it is better
to remove it from sun4i_req_ctx.
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Two words are badly spelled, this patch respell them.
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The ss variable is never used, remove it.
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The update and final functions have lots of common action.
This patch mix them in one function.
This will give some improvements:
- This will permit asynchronous support more easily
- This will permit to use finup/digest functions with some performance
improvements
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The variable i is always checked against unsigned value and cannot be
negative.
This patch set it as unsigned.
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Don't use 64 'as is', as max block size in mv_cesa_ahash_cache_req. Use
CESA_MAX_HASH_BLOCK_SIZE instead, this is better for readability.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently, in mv_cesa_{md5,sha1,sha256}_init creq->state is initialized
before the call to mv_cesa_ahash_init. This is wrong because this
function fills creq with zero by using memset, so its 'state' that
contains the default DIGEST is overwritten. This commit fixes the issue
by initializing creq->state just after the call to mv_cesa_ahash_init.
Fixes: commit b0ef51067c ("crypto: marvell/cesa - initialize hash...")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
So far, sub part of mv_cesa_int was responsible of dequeuing complete
requests, then call the 'cleanup' operation on these reqs and call the
crypto api callback 'complete'. The problem is that the transformation
context 'ctx' is retrieved only once before the while loop. Which means
that the wrong 'cleanup' operation might be called on the wrong type of
cesa requests, it can lead to memory corruptions with this message:
marvell-cesa f1090000.crypto: dma_pool_free cesa_padding, 5a5a5a5a/5a5a5a5a (bad dma)
This commit fixes the issue, by updating the transformation context for
each dequeued cesa request.
Fixes: commit 85030c5168 ("crypto: marvell - Add support for chai...")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The mv_cesa_ahash_cache_req() function always returns 0, which makes
its return value pretty much useless. However, in addition to
returning a useless value, it also returns a boolean in a variable
passed by reference to indicate if the request was already cached.
So, this commit changes mv_cesa_ahash_cache_req() to return this
boolean. It consequently simplifies the only call site of
mv_cesa_ahash_cache_req(), where the "ret" variable is no longer
needed.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The mv_cesa_ahash_init() function always returns 0, and the return
value is anyway never checked. Turn it into a function returning void.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The dma_iter parameter of mv_cesa_ahash_dma_add_cache() is never used,
so get rid of it.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The mv_cesa_dma_add_op() function builds a mv_cesa_tdma_desc structure
to copy the operation description to the SRAM, but doesn't explicitly
initialize the destination of the copy. It works fine because the
operatin description must be copied at the beginning of the SRAM, and
the mv_cesa_tdma_desc structure is initialized to zero when
allocated. However, it is somewhat confusing to not have a destination
defined.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Threaded interrupts can perform the function of the tasklet, and much
more safely too - without races when trying to take the tasklet and
interrupt down on device removal.
With the old code, there is a window where we call tasklet_kill(). If
the interrupt handler happens to be running on a different CPU, and
subsequently calls tasklet_schedule(), the tasklet will be re-scheduled
for execution.
Switching to a hardirq/threadirq combination implementation avoids this,
and it also means generic code deals with the teardown sequencing of the
threaded and non-threaded parts.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a helper to map the source scatterlist into the descriptor.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a helper function to perform the descriptor allocation.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Strictly, dma_map_sg() may coalesce SG entries, but in practise on iMX
hardware, this will never happen. However, dma_map_sg() can fail, and
we completely fail to check its return value. So, fix this properly.
Arrange the code to map the scatterlist early, so we know how many
scatter table entries to allocate, and then fill them in. This allows
us to keep relatively simple error cleanup paths.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ensure that we clean up allocations and DMA mappings after encountering
an error rather than just giving up and leaking memory and resources.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since the extended descriptor includes the hardware descriptor, and the
sec4 scatterlist immediately follows this, we can declare it as a array
at the very end of the extended descriptor. This allows us to get rid
of an initialiser for every site where we allocate an extended
descriptor.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Mark the hardware descriptor as being cache line aligned; on DMA
incoherent architectures, the hardware descriptor should sit in a
separate cache line from the CPU accessed data to avoid polluting
the caches.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Rather than giving the descriptor as hw_desc[0], give it's real size.
All places where we allocate an ahash_edesc incorporate DESC_JOB_IO_LEN
bytes of job descriptor.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
caamhash contains this weird code:
src_nents = sg_count(req->src, req->nbytes);
dma_map_sg(jrdev, req->src, src_nents ? : 1, DMA_TO_DEVICE);
...
edesc->src_nents = src_nents;
sg_count() returns zero when sg_nents_for_len() returns zero or one.
This means we don't need to use a hardware scatterlist. However,
setting src_nents to zero causes problems when we unmap:
if (edesc->src_nents)
dma_unmap_sg_chained(dev, req->src, edesc->src_nents,
DMA_TO_DEVICE, edesc->chained);
as zero here means that we have no entries to unmap. This causes us
to leak DMA mappings, where we map one scatterlist entry and then
fail to unmap it.
This can be fixed in two ways: either by writing the number of entries
that were requested of dma_map_sg(), or by reworking the "no SG
required" case.
We adopt the re-work solution here - we replace sg_count() with
sg_nents_for_len(), so src_nents now contains the real number of
scatterlist entries, and we then change the test for using the
hardware scatterlist to src_nents > 1 rather than just non-zero.
This change passes my sshd, openssl tests hashing /bin and tcrypt
tests.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Properly allocate enough memory to respect the fallback.
Signed-off-by: Will Thomas <will.thomas@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the probe function only emits an output on success
when debug is specifically enabled. It would be more useful
if this happens by default.
Signed-off-by: James Hartley <james.hartley@imgtec.com>
Reviewed-by: Will Thomas <will.thomas@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the img-hash accelerator does not probe
successfully due to a change in the checks made during
registration with the crypto framework. This is due to
import and export functions not being defined. Correct
this.
Signed-off-by: James Hartley <james.hartley@imgtec.com>
Signed-off-by: Will Thomas <will.thomas@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Current img hash claims sys and periph gate clocks
and this can be gated in system suspend scenarios.
Add support for Device pm ops for img hash to gate
the clocks claimed by img hash.
Signed-off-by: Govindraj Raja <Govindraj.Raja@imgtec.com>
Reviewed-by: Will Thomas <will.thomas@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Burst length of 16 drives the hash accelerator out of spec
and causes stability issues in some cases. Reduce this to
stop data being lost.
Signed-off-by: Will Thomas <will.thomas@imgtec.com>
Reviewed-by: James Hartley <james.hartley@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move 0 length buffer to end of structure to stop overwriting
fallback request data. This doesn't cause a bug itself as the
buffer is never used alongside the fallback but should be
changed.
Signed-off-by: Will Thomas <will.thomas@imgtec.com>
Reviewed-by: James Hartley <james.hartley@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Sporadic null pointer exceptions came from here. Fix them.
Signed-off-by: Will Thomas <will.thomas@imgtec.com>
Reviewed-by: James Hartley <james.hartley@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
A second CCP is available, identical to the first, with
its ownn PCI ID. Make it available for use by the crypto
subsystem, as well as for DMA activity and random
number generation.
This device is not pre-configured at at boot time. The
driver must configure it (during the probe) for use.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Every CCP is capable of providing general DMA services.
Register the device as a provider.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Enable equivalent function on a v5 CCP. Add support for a
version 5 CCP which enables AES/XTS/SHA services. Also,
more work on the data structures to virtualize
functionality.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Available queue space is used to decide (by counting free slots)
if we have to put a command on hold or if it can be sent
to the engine immediately.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Make the RNG support code common (where possible) in
preparation for adding a v5 device.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move the KSB access/management functions to the v3
device file, and add function pointers to the actions
structure. At the operations layer all of the references
to the storage block will be generic (virtual). This is
in preparation for a version 5 device, in which the
private storage block is managed differently.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Form and use of the local storage block in the CCP is
particular to the device version. Much of the code that
accesses the storage block can treat it as a virtual
resource, and will under go some renaming. Device-specific
access to the memory will be moved into device file.
Service functions will be added to the actions
structure.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use more concise field names; "perform_" is too verbose.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Device-specific values for the BAR and offset should be found
in the version data structure.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Most error branches following the call to npe_request contain a call to
npe_request. This patch add a call to npe_release to error branches
following the call to npe_request that do not have it.
This issue was found with Hector.
Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since 6de62f15b5 ("crypto: algif_hash - Require setkey before
accept(2)"), the AF_ALG interface requires userspace to provide a key
to any algorithm that has a setkey method. However, the non-HMAC
algorithms are not keyed, so setting a key is unnecessary.
Fix this by removing the setkey method from the non-keyed hash
algorithms.
Fixes: 6de62f15b5 ("crypto: algif_hash - Require setkey before accept(2)")
Cc: <stable@vger.kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
To be able to generate shared descriptors for AEAD, the authentication size
needs to be known. However, there is no imposed order of calling .setkey,
.setauthsize callbacks.
Thus, in case authentication size is not known at .setkey time, defer it
until .setauthsize is called.
The authsize != 0 check was incorrectly removed when converting the driver
to the new AEAD interface.
Cc: <stable@vger.kernel.org> # 4.3+
Fixes: 479bcc7c5b ("crypto: caam - Convert authenc to new AEAD interface")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There are a few things missed by the conversion to the
new AEAD interface:
1 - echainiv(authenc) encrypt shared descriptor
The shared descriptor is incorrect: due to the order of operations,
at some point in time MATH3 register is being overwritten.
2 - buffer used for echainiv(authenc) encrypt shared descriptor
Encrypt and givencrypt shared descriptors (for AEAD ops) are mutually
exclusive and thus use the same buffer in context state: sh_desc_enc.
However, there's one place missed by s/sh_desc_givenc/sh_desc_enc,
leading to errors when echainiv(authenc(...)) algorithms are used:
DECO: desc idx 14: Header Error. Invalid length or parity, or
certain other problems.
While here, also fix a typo: dma_mapping_error() is checking
for validity of sh_desc_givenc_dma instead of sh_desc_enc_dma.
Cc: <stable@vger.kernel.org> # 4.3+
Fixes: 479bcc7c5b ("crypto: caam - Convert authenc to new AEAD interface")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull crypto fixes from Herbert Xu:
"This fixes a number of regressions in the marvell cesa driver caused
by the chaining work, and a regression in lib/mpi that leads to a
GFP_KERNEL allocation with preemption disabled"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: marvell - Don't copy IV vectors from the _process op for ciphers
lib/mpi: Fix SG miter leak
crypto: marvell - Update cache with input sg only when it is unmapped
crypto: marvell - Don't chain at DMA level when backlog is disabled
crypto: marvell - Fix memory leaks in TDMA chain for cipher requests
Highlights:
- PowerNV PCI hotplug support.
- Lots more Power9 support.
- eBPF JIT support on ppc64le.
- Lots of cxl updates.
- Boot code consolidation.
Bug fixes:
- Fix spin_unlock_wait() from Boqun Feng
- Fix stack pointer corruption in __tm_recheckpoint() from Michael Neuling
- Fix multiple bugs in memory_hotplug_max() from Bharata B Rao
- mm: Ensure "special" zones are empty from Oliver O'Halloran
- ftrace: Separate the heuristics for checking call sites from Michael Ellerman
- modules: Never restore r2 for a mprofile-kernel style mcount() call from Michael Ellerman
- Fix endianness when reading TCEs from Alexey Kardashevskiy
- start rtasd before PCI probing from Greg Kurz
- PCI: rpaphp: Fix slot registration for multiple slots under a PHB from Tyrel Datwyler
- powerpc/mm: Add memory barrier in __hugepte_alloc() from Sukadev Bhattiprolu
Cleanups & fixes:
- Drop support for MPIC in pseries from Rashmica Gupta
- Define and use PPC64_ELF_ABI_v2/v1 from Michael Ellerman
- Remove unused symbols in asm-offsets.c from Rashmica Gupta
- Fix SRIOV not building without EEH enabled from Russell Currey
- Remove kretprobe_trampoline_holder. from Thiago Jung Bauermann
- Reduce log level of PCI I/O space warning from Benjamin Herrenschmidt
- Add array bounds checking to crash_shutdown_handlers from Suraj Jitindar Singh
- Avoid -maltivec when using clang integrated assembler from Anton Blanchard
- Fix array overrun in ppc_rtas() syscall from Andrew Donnellan
- Fix error return value in cmm_mem_going_offline() from Rasmus Villemoes
- export cpu_to_core_id() from Mauricio Faria de Oliveira
- Remove old symbols from defconfigs from Andrew Donnellan
- Update obsolete comments in setup_32.c about entry conditions from Benjamin Herrenschmidt
- Add comment explaining the purpose of setup_kdump_trampoline() from Benjamin Herrenschmidt
- Merge the RELOCATABLE config entries for ppc32 and ppc64 from Kevin Hao
- Remove RELOCATABLE_PPC32 from Kevin Hao
- Fix .long's in tlb-radix.c to more meaningful from Balbir Singh
Minor cleanups & fixes:
- Andrew Donnellan, Anna-Maria Gleixner, Anton Blanchard, Benjamin
Herrenschmidt, Bharata B Rao, Christophe Leroy, Colin Ian King, Geliang
Tang, Greg Kurz, Madhavan Srinivasan, Michael Ellerman, Michael Ellerman,
Stephen Rothwell, Stewart Smith.
Freescale updates from Scott:
- "Highlights include more 8xx optimizations, device tree updates,
and MVME7100 support."
PowerNV PCI hotplug from Gavin Shan:
- PCI: Add pcibios_setup_bridge()
- Override pcibios_setup_bridge()
- Remove PCI_RESET_DELAY_US
- Move pnv_pci_ioda_setup_opal_tce_kill() around
- Increase PE# capacity
- Allocate PE# in reverse order
- Create PEs in pcibios_setup_bridge()
- Setup PE for root bus
- Extend PCI bridge resources
- Make pnv_ioda_deconfigure_pe() visible
- Dynamically release PE
- Update bridge windows on PCI plug
- Delay populating pdn
- Support PCI slot ID
- Use PCI slot reset infrastructure
- Introduce pnv_pci_get_slot_id()
- Functions to get/set PCI slot state
- PCI/hotplug: PowerPC PowerNV PCI hotplug driver
- Print correct PHB type names
Power9 idle support from Shreyas B. Prabhu:
- set power_save func after the idle states are initialized
- Use PNV_THREAD_WINKLE macro while requesting for winkle
- make hypervisor state restore a function
- Rename idle_power7.S to idle_book3s.S
- Rename reusable idle functions to hardware agnostic names
- Make pnv_powersave_common more generic
- abstraction for saving SPRs before entering deep idle states
- Add platform support for stop instruction
- cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES
- cpuidle/powernv: cleanup cpuidle-powernv.c
- cpuidle/powernv: Add support for POWER ISA v3 idle states
- Use deepest stop state when cpu is offlined
Power9 PMU from Madhavan Srinivasan:
- factor out power8 pmu macros and defines
- factor out power8 pmu functions
- factor out power8 __init_pmu code
- Add power9 event list macros for generic and cache events
- Power9 PMU support
- Export Power9 generic and cache events to sysfs
Power9 preliminary interrupt & PCI support from Benjamin Herrenschmidt:
- Add XICS emulation APIs
- Move a few exception common handlers to make room
- Add support for HV virtualization interrupts
- Add mechanism to force a replay of interrupts
- Add ICP OPAL backend
- Discover IODA3 PHBs
- pci: Remove obsolete SW invalidate
- opal: Add real mode call wrappers
- Rename TCE invalidation calls
- Remove SWINV constants and obsolete TCE code
- Rework accessing the TCE invalidate register
- Fallback to OPAL for TCE invalidations
- Use the device-tree to get available range of M64's
- Check status of a PHB before using it
- pci: Don't try to allocate resources that will be reassigned
Other Power9:
- Send SIGBUS on unaligned copy and paste from Chris Smart
- Large Decrementer support from Oliver O'Halloran
- Load Monitor Register Support from Jack Miller
Performance improvements from Anton Blanchard:
- Avoid load hit store in __giveup_fpu() and __giveup_altivec()
- Avoid load hit store in setup_sigcontext()
- Remove assembly versions of strcpy, strcat, strlen and strcmp
- Align hot loops of some string functions
eBPF JIT from Naveen N. Rao:
- Fix/enhance 32-bit Load Immediate implementation
- Optimize 64-bit Immediate loads
- Introduce rotate immediate instructions
- A few cleanups
- Isolate classic BPF JIT specifics into a separate header
- Implement JIT compiler for extended BPF
Operator Panel driver from Suraj Jitindar Singh:
- devicetree/bindings: Add binding for operator panel on FSP machines
- Add inline function to get rc from an ASYNC_COMP opal_msg
- Add driver for operator panel on FSP machines
Sparse fixes from Daniel Axtens:
- make some things static
- Introduce asm-prototypes.h
- Include headers containing prototypes
- Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE
- kvm: Clarify __user annotations
- Pass endianness to sparse
- Make ppc_md.{halt, restart} __noreturn
MM fixes & cleanups from Aneesh Kumar K.V:
- radix: Update LPCR HR bit as per ISA
- use _raw variant of page table accessors
- Compile out radix related functions if RADIX_MMU is disabled
- Clear top 16 bits of va only on older cpus
- Print formation regarding the the MMU mode
- hash: Update SDR1 size encoding as documented in ISA 3.0
- radix: Update PID switch sequence
- radix: Update machine call back to support new HCALL.
- radix: Add LPID based tlb flush helpers
- radix: Add a kernel command line to disable radix
- Cleanup LPCR defines
Boot code consolidation from Benjamin Herrenschmidt:
- Move epapr_paravirt_early_init() to early_init_devtree()
- cell: Don't use flat device-tree after boot
- ge_imp3a: Don't use the flat device-tree after boot
- mpc85xx_ds: Don't use the flat device-tree after boot
- mpc85xx_rdb: Don't use the flat device-tree after boot
- Don't test for machine type in rtas_initialize()
- Don't test for machine type in smp_setup_cpu_maps()
- dt: Add of_device_compatible_match()
- Factor do_feature_fixup calls
- Move 64-bit feature fixup earlier
- Move 64-bit memory reserves to setup_arch()
- Use a cachable DART
- Move FW feature probing out of pseries probe()
- Put exception configuration in a common place
- Remove early allocation of the SMU command buffer
- Move MMU backend selection out of platform code
- pasemi: Remove IOBMAP allocation from platform probe()
- mm/hash: Don't use machine_is() early during boot
- Don't test for machine type to detect HEA special case
- pmac: Remove spurrious machine type test
- Move hash table ops to a separate structure
- Ensure that ppc_md is empty before probing for machine type
- Move 64-bit probe_machine() to later in the boot process
- Move 32-bit probe() machine to later in the boot process
- Get rid of ppc_md.init_early()
- Move the boot time info banner to a separate function
- Move setting of {i,d}cache_bsize to initialize_cache_info()
- Move the content of setup_system() to setup_arch()
- Move cache info inits to a separate function
- Re-order the call to smp_setup_cpu_maps()
- Re-order setup_panic()
- Make a few boot functions __init
- Merge 32-bit and 64-bit setup_arch()
Other new features:
- tty/hvc: Use IRQF_SHARED for OPAL hvc consoles from Sam Mendoza-Jonas
- tty/hvc: Use opal irqchip interface if available from Sam Mendoza-Jonas
- powerpc: Add module autoloading based on CPU features from Alastair D'Silva
- crypto: vmx - Convert to CPU feature based module autoloading from Alastair D'Silva
- Wake up kopald polling thread before waiting for events from Benjamin Herrenschmidt
- xmon: Dump ISA 2.06 SPRs from Michael Ellerman
- xmon: Dump ISA 2.07 SPRs from Michael Ellerman
- Add a parameter to disable 1TB segs from Oliver O'Halloran
- powerpc/boot: Add OPAL console to epapr wrappers from Oliver O'Halloran
- Assign fixed PHB number based on device-tree properties from Guilherme G. Piccoli
- pseries: Add pseries hotplug workqueue from John Allen
- pseries: Add support for hotplug interrupt source from John Allen
- pseries: Use kernel hotplug queue for PowerVM hotplug events from John Allen
- pseries: Move property cloning into its own routine from Nathan Fontenot
- pseries: Dynamic add entires to associativity lookup array from Nathan Fontenot
- pseries: Auto-online hotplugged memory from Nathan Fontenot
- pseries: Remove call to memblock_add() from Nathan Fontenot
cxl:
- Add set and get private data to context struct from Michael Neuling
- make base more explicitly non-modular from Paul Gortmaker
- Use for_each_compatible_node() macro from Wei Yongjun
- Frederic Barrat
- Abstract the differences between the PSL and XSL
- Make vPHB device node match adapter's
- Philippe Bergheaud
- Add mechanism for delivering AFU driver specific events
- Ignore CAPI adapters misplaced in switched slots
- Refine slice error debug messages
- Andrew Donnellan
- static-ify variables to fix sparse warnings
- PCI/hotplug: pnv_php: export symbols and move struct types needed by cxl
- PCI/hotplug: pnv_php: handle OPAL_PCI_SLOT_OFFLINE power state
- Add cxl_check_and_switch_mode() API to switch bi-modal cards
- remove dead Kconfig options
- fix potential NULL dereference in free_adapter()
- Ian Munsie
- Update process element after allocating interrupts
- Add support for CAPP DMA mode
- Fix allowing bogus AFU descriptors with 0 maximum processes
- Fix allocating a minimum of 2 pages for the SPA
- Fix bug where AFU disable operation had no effect
- Workaround XSL bug that does not clear the RA bit after a reset
- Fix NULL pointer dereference on kernel contexts with no AFU interrupts
- powerpc/powernv: Split cxl code out into a separate file
- Add cxl_slot_is_supported API
- Enable bus mastering for devices using CAPP DMA mode
- Move cxl_afu_get / cxl_afu_put to base
- Allow a default context to be associated with an external pci_dev
- Do not create vPHB if there are no AFU configuration records
- powerpc/powernv: Add support for the cxl kernel api on the real phb
- Add support for using the kernel API with a real PHB
- Add kernel APIs to get & set the max irqs per context
- Add preliminary workaround for CX4 interrupt limitation
- Add support for interrupts on the Mellanox CX4
- Workaround PE=0 hardware limitation in Mellanox CX4
- powerpc/powernv: Fix pci-cxl.c build when CONFIG_MODULES=n
selftests:
- Test unaligned copy and paste from Chris Smart
- Load Monitor Register Tests from Jack Miller
- Cyril Bur
- exec() with suspended transaction
- Use signed long to read perf_event_paranoid
- Fix usage message in context_switch
- Fix generation of vector instructions/types in context_switch
- Michael Ellerman
- Use "Delta" rather than "Error" in normal output
- Import Anton's mmap & futex micro benchmarks
- Add a test for PROT_SAO
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXnWchAAoJEFHr6jzI4aWAe64P/36Vd9yJLptjkoyZp8/IQtu1
Cv8buQwGdKuSMzdkcUAOXcC3fe2u70ZWXMKKLfY3koIV1IAiqdWk5/XWRKMP2XmE
dG0LhSf0uu7uh+mE0WvQnRu46ImeKtQ+mPp4Hbs/s9SxMSeYjruv3vdWWmgUq0cl
Gac2qJSRtAMmgLuHWMjf7N5mxOTOnKejU4o2i9cJ+YHmWKOdCigv2Ge1UadOQFlC
E7tRPiUR3asfDfj+e+LVTTdToH6p8pk+mOUzIoZ8jIkQ+IXzi62UDl5+Rw9mqiuX
1CtqEMUXxo2qwX+d4TcV/QUOp0YKPuIcUZ9NMMS+S3lOyJ4NFt+j2Izk7QJp5kNP
gKVqB68TjDQsBuDr3P9ynlHbduxTIhZAqopbTrLe0FIg48nUe4n1yHJBVzqaVajX
rFBJSsSUffBLAARNPSXJJhIgc2C1/qOC8dgMeDMcR2kPirDHaQZ/lY1yEpq1yiqR
q6e3v5hvIAm4IjbYk0mF7TUxBrPGVE/ExyBINyASRoYxAJ1PyeD/iljZ9vI3asRA
s+hhxT8H3f7lnqTrmJqMjHgAdGkmag07EdmvFNX4xK4aADSy7Y6g4dw25ffRopo9
p9Jf9HX+dZv65Y3UjbV/6HuXcaSEBJJLSVWvii65PebqSN0LuHEFvNeIJ6Iblx0B
AWh/hd0Iin2gdkcG39Mr
=Z5kM
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Highlights:
- PowerNV PCI hotplug support.
- Lots more Power9 support.
- eBPF JIT support on ppc64le.
- Lots of cxl updates.
- Boot code consolidation.
Bug fixes:
- Fix spin_unlock_wait() from Boqun Feng
- Fix stack pointer corruption in __tm_recheckpoint() from Michael
Neuling
- Fix multiple bugs in memory_hotplug_max() from Bharata B Rao
- mm: Ensure "special" zones are empty from Oliver O'Halloran
- ftrace: Separate the heuristics for checking call sites from
Michael Ellerman
- modules: Never restore r2 for a mprofile-kernel style mcount() call
from Michael Ellerman
- Fix endianness when reading TCEs from Alexey Kardashevskiy
- start rtasd before PCI probing from Greg Kurz
- PCI: rpaphp: Fix slot registration for multiple slots under a PHB
from Tyrel Datwyler
- powerpc/mm: Add memory barrier in __hugepte_alloc() from Sukadev
Bhattiprolu
Cleanups & fixes:
- Drop support for MPIC in pseries from Rashmica Gupta
- Define and use PPC64_ELF_ABI_v2/v1 from Michael Ellerman
- Remove unused symbols in asm-offsets.c from Rashmica Gupta
- Fix SRIOV not building without EEH enabled from Russell Currey
- Remove kretprobe_trampoline_holder from Thiago Jung Bauermann
- Reduce log level of PCI I/O space warning from Benjamin
Herrenschmidt
- Add array bounds checking to crash_shutdown_handlers from Suraj
Jitindar Singh
- Avoid -maltivec when using clang integrated assembler from Anton
Blanchard
- Fix array overrun in ppc_rtas() syscall from Andrew Donnellan
- Fix error return value in cmm_mem_going_offline() from Rasmus
Villemoes
- export cpu_to_core_id() from Mauricio Faria de Oliveira
- Remove old symbols from defconfigs from Andrew Donnellan
- Update obsolete comments in setup_32.c about entry conditions from
Benjamin Herrenschmidt
- Add comment explaining the purpose of setup_kdump_trampoline() from
Benjamin Herrenschmidt
- Merge the RELOCATABLE config entries for ppc32 and ppc64 from Kevin
Hao
- Remove RELOCATABLE_PPC32 from Kevin Hao
- Fix .long's in tlb-radix.c to more meaningful from Balbir Singh
Minor cleanups & fixes:
- Andrew Donnellan, Anna-Maria Gleixner, Anton Blanchard, Benjamin
Herrenschmidt, Bharata B Rao, Christophe Leroy, Colin Ian King,
Geliang Tang, Greg Kurz, Madhavan Srinivasan, Michael Ellerman,
Michael Ellerman, Stephen Rothwell, Stewart Smith.
Freescale updates from Scott:
- "Highlights include more 8xx optimizations, device tree updates,
and MVME7100 support."
PowerNV PCI hotplug from Gavin Shan:
- PCI: Add pcibios_setup_bridge()
- Override pcibios_setup_bridge()
- Remove PCI_RESET_DELAY_US
- Move pnv_pci_ioda_setup_opal_tce_kill() around
- Increase PE# capacity
- Allocate PE# in reverse order
- Create PEs in pcibios_setup_bridge()
- Setup PE for root bus
- Extend PCI bridge resources
- Make pnv_ioda_deconfigure_pe() visible
- Dynamically release PE
- Update bridge windows on PCI plug
- Delay populating pdn
- Support PCI slot ID
- Use PCI slot reset infrastructure
- Introduce pnv_pci_get_slot_id()
- Functions to get/set PCI slot state
- PCI/hotplug: PowerPC PowerNV PCI hotplug driver
- Print correct PHB type names
Power9 idle support from Shreyas B. Prabhu:
- set power_save func after the idle states are initialized
- Use PNV_THREAD_WINKLE macro while requesting for winkle
- make hypervisor state restore a function
- Rename idle_power7.S to idle_book3s.S
- Rename reusable idle functions to hardware agnostic names
- Make pnv_powersave_common more generic
- abstraction for saving SPRs before entering deep idle states
- Add platform support for stop instruction
- cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES
- cpuidle/powernv: cleanup cpuidle-powernv.c
- cpuidle/powernv: Add support for POWER ISA v3 idle states
- Use deepest stop state when cpu is offlined
Power9 PMU from Madhavan Srinivasan:
- factor out power8 pmu macros and defines
- factor out power8 pmu functions
- factor out power8 __init_pmu code
- Add power9 event list macros for generic and cache events
- Power9 PMU support
- Export Power9 generic and cache events to sysfs
Power9 preliminary interrupt & PCI support from Benjamin Herrenschmidt:
- Add XICS emulation APIs
- Move a few exception common handlers to make room
- Add support for HV virtualization interrupts
- Add mechanism to force a replay of interrupts
- Add ICP OPAL backend
- Discover IODA3 PHBs
- pci: Remove obsolete SW invalidate
- opal: Add real mode call wrappers
- Rename TCE invalidation calls
- Remove SWINV constants and obsolete TCE code
- Rework accessing the TCE invalidate register
- Fallback to OPAL for TCE invalidations
- Use the device-tree to get available range of M64's
- Check status of a PHB before using it
- pci: Don't try to allocate resources that will be reassigned
Other Power9:
- Send SIGBUS on unaligned copy and paste from Chris Smart
- Large Decrementer support from Oliver O'Halloran
- Load Monitor Register Support from Jack Miller
Performance improvements from Anton Blanchard:
- Avoid load hit store in __giveup_fpu() and __giveup_altivec()
- Avoid load hit store in setup_sigcontext()
- Remove assembly versions of strcpy, strcat, strlen and strcmp
- Align hot loops of some string functions
eBPF JIT from Naveen N. Rao:
- Fix/enhance 32-bit Load Immediate implementation
- Optimize 64-bit Immediate loads
- Introduce rotate immediate instructions
- A few cleanups
- Isolate classic BPF JIT specifics into a separate header
- Implement JIT compiler for extended BPF
Operator Panel driver from Suraj Jitindar Singh:
- devicetree/bindings: Add binding for operator panel on FSP machines
- Add inline function to get rc from an ASYNC_COMP opal_msg
- Add driver for operator panel on FSP machines
Sparse fixes from Daniel Axtens:
- make some things static
- Introduce asm-prototypes.h
- Include headers containing prototypes
- Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE
- kvm: Clarify __user annotations
- Pass endianness to sparse
- Make ppc_md.{halt, restart} __noreturn
MM fixes & cleanups from Aneesh Kumar K.V:
- radix: Update LPCR HR bit as per ISA
- use _raw variant of page table accessors
- Compile out radix related functions if RADIX_MMU is disabled
- Clear top 16 bits of va only on older cpus
- Print formation regarding the the MMU mode
- hash: Update SDR1 size encoding as documented in ISA 3.0
- radix: Update PID switch sequence
- radix: Update machine call back to support new HCALL.
- radix: Add LPID based tlb flush helpers
- radix: Add a kernel command line to disable radix
- Cleanup LPCR defines
Boot code consolidation from Benjamin Herrenschmidt:
- Move epapr_paravirt_early_init() to early_init_devtree()
- cell: Don't use flat device-tree after boot
- ge_imp3a: Don't use the flat device-tree after boot
- mpc85xx_ds: Don't use the flat device-tree after boot
- mpc85xx_rdb: Don't use the flat device-tree after boot
- Don't test for machine type in rtas_initialize()
- Don't test for machine type in smp_setup_cpu_maps()
- dt: Add of_device_compatible_match()
- Factor do_feature_fixup calls
- Move 64-bit feature fixup earlier
- Move 64-bit memory reserves to setup_arch()
- Use a cachable DART
- Move FW feature probing out of pseries probe()
- Put exception configuration in a common place
- Remove early allocation of the SMU command buffer
- Move MMU backend selection out of platform code
- pasemi: Remove IOBMAP allocation from platform probe()
- mm/hash: Don't use machine_is() early during boot
- Don't test for machine type to detect HEA special case
- pmac: Remove spurrious machine type test
- Move hash table ops to a separate structure
- Ensure that ppc_md is empty before probing for machine type
- Move 64-bit probe_machine() to later in the boot process
- Move 32-bit probe() machine to later in the boot process
- Get rid of ppc_md.init_early()
- Move the boot time info banner to a separate function
- Move setting of {i,d}cache_bsize to initialize_cache_info()
- Move the content of setup_system() to setup_arch()
- Move cache info inits to a separate function
- Re-order the call to smp_setup_cpu_maps()
- Re-order setup_panic()
- Make a few boot functions __init
- Merge 32-bit and 64-bit setup_arch()
Other new features:
- tty/hvc: Use IRQF_SHARED for OPAL hvc consoles from Sam Mendoza-Jonas
- tty/hvc: Use opal irqchip interface if available from Sam Mendoza-Jonas
- powerpc: Add module autoloading based on CPU features from Alastair D'Silva
- crypto: vmx - Convert to CPU feature based module autoloading from Alastair D'Silva
- Wake up kopald polling thread before waiting for events from Benjamin Herrenschmidt
- xmon: Dump ISA 2.06 SPRs from Michael Ellerman
- xmon: Dump ISA 2.07 SPRs from Michael Ellerman
- Add a parameter to disable 1TB segs from Oliver O'Halloran
- powerpc/boot: Add OPAL console to epapr wrappers from Oliver O'Halloran
- Assign fixed PHB number based on device-tree properties from Guilherme G. Piccoli
- pseries: Add pseries hotplug workqueue from John Allen
- pseries: Add support for hotplug interrupt source from John Allen
- pseries: Use kernel hotplug queue for PowerVM hotplug events from John Allen
- pseries: Move property cloning into its own routine from Nathan Fontenot
- pseries: Dynamic add entires to associativity lookup array from Nathan Fontenot
- pseries: Auto-online hotplugged memory from Nathan Fontenot
- pseries: Remove call to memblock_add() from Nathan Fontenot
cxl:
- Add set and get private data to context struct from Michael Neuling
- make base more explicitly non-modular from Paul Gortmaker
- Use for_each_compatible_node() macro from Wei Yongjun
- Frederic Barrat
- Abstract the differences between the PSL and XSL
- Make vPHB device node match adapter's
- Philippe Bergheaud
- Add mechanism for delivering AFU driver specific events
- Ignore CAPI adapters misplaced in switched slots
- Refine slice error debug messages
- Andrew Donnellan
- static-ify variables to fix sparse warnings
- PCI/hotplug: pnv_php: export symbols and move struct types needed by cxl
- PCI/hotplug: pnv_php: handle OPAL_PCI_SLOT_OFFLINE power state
- Add cxl_check_and_switch_mode() API to switch bi-modal cards
- remove dead Kconfig options
- fix potential NULL dereference in free_adapter()
- Ian Munsie
- Update process element after allocating interrupts
- Add support for CAPP DMA mode
- Fix allowing bogus AFU descriptors with 0 maximum processes
- Fix allocating a minimum of 2 pages for the SPA
- Fix bug where AFU disable operation had no effect
- Workaround XSL bug that does not clear the RA bit after a reset
- Fix NULL pointer dereference on kernel contexts with no AFU interrupts
- powerpc/powernv: Split cxl code out into a separate file
- Add cxl_slot_is_supported API
- Enable bus mastering for devices using CAPP DMA mode
- Move cxl_afu_get / cxl_afu_put to base
- Allow a default context to be associated with an external pci_dev
- Do not create vPHB if there are no AFU configuration records
- powerpc/powernv: Add support for the cxl kernel api on the real phb
- Add support for using the kernel API with a real PHB
- Add kernel APIs to get & set the max irqs per context
- Add preliminary workaround for CX4 interrupt limitation
- Add support for interrupts on the Mellanox CX4
- Workaround PE=0 hardware limitation in Mellanox CX4
- powerpc/powernv: Fix pci-cxl.c build when CONFIG_MODULES=n
selftests:
- Test unaligned copy and paste from Chris Smart
- Load Monitor Register Tests from Jack Miller
- Cyril Bur
- exec() with suspended transaction
- Use signed long to read perf_event_paranoid
- Fix usage message in context_switch
- Fix generation of vector instructions/types in context_switch
- Michael Ellerman
- Use "Delta" rather than "Error" in normal output
- Import Anton's mmap & futex micro benchmarks
- Add a test for PROT_SAO"
* tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (263 commits)
powerpc/mm: Parenthesise IS_ENABLED() in if condition
tty/hvc: Use opal irqchip interface if available
tty/hvc: Use IRQF_SHARED for OPAL hvc consoles
selftests/powerpc: exec() with suspended transaction
powerpc: Improve comment explaining why we modify VRSAVE
powerpc/mm: Drop unused externs for hpte_init_beat[_v3]()
powerpc/mm: Rename hpte_init_lpar() and move the fallback to a header
powerpc/mm: Fix build break when PPC_NATIVE=n
crypto: vmx - Convert to CPU feature based module autoloading
powerpc: Add module autoloading based on CPU features
powerpc/powernv/ioda: Fix endianness when reading TCEs
powerpc/mm: Add memory barrier in __hugepte_alloc()
powerpc/modules: Never restore r2 for a mprofile-kernel style mcount() call
powerpc/ftrace: Separate the heuristics for checking call sites
powerpc: Merge 32-bit and 64-bit setup_arch()
powerpc/64: Make a few boot functions __init
powerpc: Re-order setup_panic()
powerpc: Re-order the call to smp_setup_cpu_maps()
powerpc/32: Move cache info inits to a separate function
powerpc/64: Move the content of setup_system() to setup_arch()
...
The IV output vectors should only be copied from the _complete operation
and not from the _process operation, i.e only from the operation that is
designed to copy the result of the request to the right location. This
copy is already done in the _complete operation, so this commit removes
the duplicated code in the _process op.
Fixes: 3610d6cd5231 ("crypto: marvell - Add a complete...")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
So far, the cache of the ahash requests was updated from the 'complete'
operation. This complete operation is called from mv_cesa_tdma_process
before the cleanup operation, which means that the content of req->src
can be read and copied when it is still mapped. This commit fixes the
issue by moving this cache update from mv_cesa_ahash_complete to
mv_cesa_ahash_req_cleanup, so the copy is done once the sglist is
unmapped.
Fixes: 1bf6682cb3 ("crypto: marvell - Add a complete operation for..")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The flag CRYPTO_TFM_REQ_MAY_BACKLOG is optional and can be set from the
user to put requests into the backlog queue when the main cryptographic
queue is full. Before calling mv_cesa_tdma_chain we must check the value
of the return status to be sure that the current request has been
correctly queued or added to the backlog.
Fixes: 85030c5168 ("crypto: marvell - Add support for chaining...")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
So far in mv_cesa_ablkcipher_dma_req_init, if an error is thrown while
the tdma chain is built there is a memory leak. This issue exists
because the chain is assigned later at the end of the function, so the
cleanup function is called with the wrong version of the chain.
Fixes: db509a4533 ("crypto: marvell/cesa - add TDMA support")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull crypto updates from Herbert Xu:
"Here is the crypto update for 4.8:
API:
- first part of skcipher low-level conversions
- add KPP (Key-agreement Protocol Primitives) interface.
Algorithms:
- fix IPsec/cryptd reordering issues that affects aesni
- RSA no longer does explicit leading zero removal
- add SHA3
- add DH
- add ECDH
- improve DRBG performance by not doing CTR by hand
Drivers:
- add x86 AVX2 multibuffer SHA256/512
- add POWER8 optimised crc32c
- add xts support to vmx
- add DH support to qat
- add RSA support to caam
- add Layerscape support to caam
- add SEC1 AEAD support to talitos
- improve performance by chaining requests in marvell/cesa
- add support for Araneus Alea I USB RNG
- add support for Broadcom BCM5301 RNG
- add support for Amlogic Meson RNG
- add support Broadcom NSP SoC RNG"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (180 commits)
crypto: vmx - Fix aes_p8_xts_decrypt build failure
crypto: vmx - Ignore generated files
crypto: vmx - Adding support for XTS
crypto: vmx - Adding asm subroutines for XTS
crypto: skcipher - add comment for skcipher_alg->base
crypto: testmgr - Print akcipher algorithm name
crypto: marvell - Fix wrong flag used for GFP in mv_cesa_dma_add_iv_op
crypto: nx - off by one bug in nx_of_update_msc()
crypto: rsa-pkcs1pad - fix rsa-pkcs1pad request struct
crypto: scatterwalk - Inline start/map/done
crypto: scatterwalk - Remove unnecessary BUG in scatterwalk_start
crypto: scatterwalk - Remove unnecessary advance in scatterwalk_pagedone
crypto: scatterwalk - Fix test in scatterwalk_done
crypto: api - Optimise away crypto_yield when hard preemption is on
crypto: scatterwalk - add no-copy support to copychunks
crypto: scatterwalk - Remove scatterwalk_bytes_sglen
crypto: omap - Stop using crypto scatterwalk_bytes_sglen
crypto: skcipher - Remove top-level givcipher interface
crypto: user - Remove crypto_lookup_skcipher call
crypto: cts - Convert to skcipher
...
Pull s390 updates from Martin Schwidefsky:
"There are a couple of new things for s390 with this merge request:
- a new scheduling domain "drawer" is added to reflect the unusual
topology found on z13 machines. Performance tests showed up to 8
percent gain with the additional domain.
- the new crc-32 checksum crypto module uses the vector-galois-field
multiply and sum SIMD instruction to speed up crc-32 and crc-32c.
- proper __ro_after_init support, this requires RO_AFTER_INIT_DATA in
the generic vmlinux.lds linker script definitions.
- kcov instrumentation support. A prerequisite for that is the
inline assembly basic block cleanup, which is the reason for the
net/iucv/iucv.c change.
- support for 2GB pages is added to the hugetlbfs backend.
Then there are two removals:
- the oprofile hardware sampling support is dead code and is removed.
The oprofile user space uses the perf interface nowadays.
- the ETR clock synchronization is removed, this has been superseeded
be the STP clock synchronization. And it always has been
"interesting" code..
And the usual bug fixes and cleanups"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (82 commits)
s390/pci: Delete an unnecessary check before the function call "pci_dev_put"
s390/smp: clean up a condition
s390/cio/chp : Remove deprecated create_singlethread_workqueue
s390/chsc: improve channel path descriptor determination
s390/chsc: sanitize fmt check for chp_desc determination
s390/cio: make fmt1 channel path descriptor optional
s390/chsc: fix ioctl CHSC_INFO_CU command
s390/cio/device_ops: fix kernel doc
s390/cio: allow to reset channel measurement block
s390/console: Make preferred console handling more consistent
s390/mm: fix gmap tlb flush issues
s390/mm: add support for 2GB hugepages
s390: have unique symbol for __switch_to address
s390/cpuinfo: show maximum thread id
s390/ptrace: clarify bits in the per_struct
s390: stack address vs thread_info
s390: remove pointless load within __switch_to
s390: enable kcov support
s390/cpumf: use basic block for ecctr inline assembly
s390/hypfs: use basic block for diag inline assembly
...
This patch utilises the GENERIC_CPU_AUTOPROBE infrastructure
to automatically load the vmx_crypto module if the CPU supports
it.
Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Parallel build can sporadically fail because asn1 headers may
not be built yet by the time qat_asym_algs.o is compiled:
drivers/crypto/qat/qat_common/qat_asym_algs.c:55:32: fatal error: qat_rsapubkey-asn1.h: No such file or directory
#include "qat_rsapubkey-asn1.h"
Cc: stable@vger.kernel.org
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We use _GLOBAL so there is no need to do the manual alignment,
in fact it causes a build failure.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ignore assembly files generated by the perl script.
Signed-off-by: Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch add XTS support using VMX-crypto driver.
Signed-off-by: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
Signed-off-by: Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch add XTS subroutines using VMX-crypto driver.
It gives a boost of 20 times using XTS.
These code has been adopted from OpenSSL project in collaboration
with the original author (Andy Polyakov <appro@openssl.org>).
Signed-off-by: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
Signed-off-by: Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use the parameter 'gfp_flags' instead of 'flag' as second argument of
dma_pool_alloc(). The parameter 'flag' is for the TDMA descriptor, its
content has no sense for the allocator.
Fixes: bac8e805a3 ("crypto: marvell - Copy IV vectors by DMA...")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The props->ap[] array is defined like this:
struct alg_props ap[NX_MAX_FC][NX_MAX_MODE][3];
So we can see that if msc->fc and msc->mode are == to NX_MAX_FC or
NX_MAX_MODE then we're off by one.
Fixes: ae0222b728 ('powerpc/crypto: nx driver code supporting nx encryption')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We already have a generic function sg_nents_for_len which does
the same thing. This patch switches omap over to it and also
adds error handling in case the SG list is short.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There is not need to drop leading zeros from the RSA output
operations results.
Signed-off-by: Salvatore Benedetto <salvatore.benedetto@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add DH support under kpp api. Drop struct qat_rsa_request and
introduce a more generic struct qat_asym_request and share it
between RSA and DH requests.
Signed-off-by: Salvatore Benedetto <salvatore.benedetto@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Extend qat driver to use RSA CRT mode when all CRT related components are
present in the private key. Simplify code in qat_rsa_setkey by adding
qat_rsa_clear_ctx.
Signed-off-by: Salvatore Benedetto <salvatore.benedetto@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Different product families will use FLR or SBR.
Virtual Function devices have no reset method.
Signed-off-by: Conor McLoughlin <conor.mcloughlin@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Remove unneeded error handling on the result of a call to
platform_get_resource when the value is passed to
devm_ioremap_resource.
The Coccinelle semantic patch that makes this change is as follows:
// <smpl>
@@
expression pdev,res,n,e,e1;
expression ret != 0;
identifier l;
@@
- res = platform_get_resource(pdev, IORESOURCE_MEM, n);
... when != res
- if (res == NULL) { ... \(goto l;\|return ret;\) }
... when != res
+ res = platform_get_resource(pdev, IORESOURCE_MEM, n);
e = devm_ioremap_resource(e1, res);
// </smpl>
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add RSA support to caam driver.
Initial author is Yashpal Dutta <yashpal.dutta@freescale.com>.
Signed-off-by: Tudor Ambarus <tudor-dan.ambarus@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Drop all asn1 related code and use the new rsa_helper
functions rsa_parse_[pub|priv]_key for parsing the key
Signed-off-by: Salvatore Benedetto <salvatore.benedetto@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The arm-neon-sha implementations have cra_priority of 150...300, so
increase omap-sham priority to 400 to ensure it is on top of any
software alg.
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch replaces use of the obsolete ablkcipher with skcipher.
It also removes shash_fallback which is totally unused.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The ARM allmodconfig build currently warngs because of the
ux500 crypto driver not working well with the jump label
implementation that we started using for dynamic debug, which
breaks building with 'gcc -O0':
In file included from /git/arm-soc/include/linux/jump_label.h:105:0,
from /git/arm-soc/include/linux/dynamic_debug.h:5,
from /git/arm-soc/include/linux/printk.h:289,
from /git/arm-soc/include/linux/kernel.h:13,
from /git/arm-soc/include/linux/clk.h:16,
from /git/arm-soc/drivers/crypto/ux500/hash/hash_core.c:16:
/git/arm-soc/arch/arm/include/asm/jump_label.h: In function 'hash_set_dma_transfer':
/git/arm-soc/arch/arm/include/asm/jump_label.h:13:7: error: asm operand 0 probably doesn't match constraints [-Werror]
asm_volatile_goto("1:\n\t"
Turning off compiler optimizations has never really been supported
here, and it's only used when debugging the driver. I have not found
a good reason for doing this here, other than a misguided attempt
to produce more readable assembly output. Also, the driver is only
used in obsolete hardware that almost certainly nobody will spend
time debugging any more.
This just removes the -O0 flag from the compiler options.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Adds software fallback support for small crypto requests. In these cases,
it is undesirable to use DMA, as setting it up itself is rather heavy
operation. Gives about 40% extra performance in ipsec usecase.
Signed-off-by: Bin Liu <b-liu@ti.com>
[t-kristo@ti.com: dropped the extra traces, updated some comments
on the code]
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The extra call to dmaengine_terminate_all is not needed, as the DMA
is not running at this point. This improves performance slightly.
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Change crypto queue size from 1 to 10 for omap SHA driver. This should
allow clients to enqueue requests more effectively to avoid serializing
whole crypto sequences, giving extra performance.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Calling runtime PM API for every block causes serious performance hit to
crypto operations that are done on a long buffer. As crypto is performed
on a page boundary, encrypting large buffers can cause a series of crypto
operations divided by page. The runtime PM API is also called those many
times.
Convert the driver to use runtime_pm autosuspend instead, with a default
timeout value of 1 second. This results in upto ~50% speedup.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that crypto requests are chained together at the DMA level, we
increase the size of the crypto queue for each engine. The result is
that as the backlog list is reached later, it does not stop the crypto
stack from sending asychronous requests, so more cryptographic tasks
are processed by the engines.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The Cryptographic Engines and Security Accelerators (CESA) supports the
Multi-Packet Chain Mode. With this mode enabled, multiple tdma requests
can be chained and processed by the hardware without software
intervention. This mode was already activated, however the crypto
requests were not chained together. By doing so, we reduce significantly
the number of IRQs. Instead of being interrupted at the end of each
crypto request, we are interrupted at the end of the last cryptographic
request processed by the engine.
This commits re-factorizes the code, changes the code architecture and
adds the required data structures to chain cryptographic requests
together before sending them to an engine (stopped or possibly already
running).
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This commits adds support for fine grained load balancing on
multi-engine IPs. The engine is pre-selected based on its current load
and on the weight of the crypto request that is about to be processed.
The global crypto queue is also moved to each engine. These changes are
required to allow chaining crypto requests at the DMA level. By using
a crypto queue per engine, we make sure that we keep the state of the
tdma chain synchronized with the crypto queue. We also reduce contention
on 'cesa_dev->lock' and improve parallelism.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the crypto requests were sent to engines sequentially.
This commit moves the SRAM I/O operations from the prepare to the step
functions. It provides flexibility for future works and allow to prepare
a request while the engine is running.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
So far, the 'process' operation was used to check if the current request
was correctly handled by the engine, if it was the case it copied
information from the SRAM to the main memory. Now, we split this
operation. We keep the 'process' operation, which still checks if the
request was correctly handled by the engine or not, then we add a new
operation for completion. The 'complete' method copies the content of
the SRAM to memory. This will soon become useful if we want to call
the process and the complete operations from different locations
depending on the type of the request (different cleanup logic).
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently, the only way to access the tdma chain is to use the 'req'
union from a mv_cesa_{ablkcipher,ahash}. This will soon become a problem
if we want to handle the TDMA chaining vs standard/non-DMA processing in
a generic way (with generic functions at the cesa.c level detecting
whether the request should be queued at the DMA level or not). Hence the
decision to move the chain field a the mv_cesa_req level at the expense
of adding 2 void * fields to all request contexts (including non-DMA
ones) and to remove the type completly. To limit the overhead, we get
rid of the type field, which can now be deduced from the req->chain.first
value. Once these changes are done the union is no longer needed, so
remove it and move mv_cesa_ablkcipher_std_req and mv_cesa_req
to mv_cesa_ablkcipher_req directly. There are also no needs to keep the
'base' field into the union of mv_cesa_ahash_req, so move it into the
upper structure.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a TDMA descriptor at the end of the request for copying the
output IV vector via a DMA transfer. This is a good way for offloading
as much as processing as possible to the DMA and the crypto engine.
This is also required for processing multiple cipher requests
in chained mode, otherwise the content of the IV vector would be
overwritten by the last processed request.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
So far, the way that the type of a TDMA operation was checked was wrong.
We have to use the type mask in order to get the right part of the flag
containing the type of the operation.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a BUG_ON() call when the driver tries to launch a crypto request
while the engine is still processing the previous one. This replaces
a silent system hang by a verbose kernel panic with the associated
backtrace to let the user know that something went wrong in the CESA
driver.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Adding a macro constant to be used for the size of the crypto queue,
instead of using a numeric value directly. It will be easier to
maintain in case we add more than one crypto queue of the same size.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
EXTRA_CFLAGS is still supported but its usage is deprecated.
Signed-off-by: Tudor Ambarus <tudor-dan.ambarus@nxp.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a crypto API module to access the vector extension based CRC-32
implementations. Users can request the optimized implementation through
the shash crypto API interface.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
alloc_workqueue replaces deprecated create_workqueue().
The workqueue device_reset_wq has workitem &reset_data->reset_work per
adf_reset_dev_data. The workqueue pf2vf_resp_wq is a workqueue for
PF2VF responses has workitem &pf2vf_resp->pf2vf_resp_work per pf2vf_resp.
The workqueue adf_vf_stop_wq is used to call adf_dev_stop()
asynchronously.
Dedicated workqueues have been used in all cases since the workitems
on the workqueues are involved in operation of crypto which can be used in
the IO path which is depended upon during memory reclaim. Hence,
WQ_MEM_RECLAIM has been set to gurantee forward progress under memory
pressure.
Since there are only a fixed number of work items, explicit concurrency
limit is unnecessary.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The hash buffer is really HASH_BLOCK_SIZE bytes, someone
must have thought that memmove takes n*u32 words by mistake.
Tests work as good/bad as before after this patch.
Cc: Joakim Bech <joakim.bech@linaro.org>
Cc: stable@vger.kernel.org
Reported-by: David Binderman <linuxdev.baldrick@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
All of the VMX AES ciphers (AES, AES-CBC and AES-CTR) are set at
priority 1000. Unfortunately this means we never use AES-CBC and
AES-CTR, because the base AES-CBC cipher that is implemented on
top of AES inherits its priority.
To fix this, AES-CBC and AES-CTR have to be a higher priority. Set
them to 2000.
Testing on a POWER8 with:
cryptsetup benchmark --cipher aes --key-size 256
Shows decryption speed increase from 402.4 MB/s to 3069.2 MB/s,
over 7x faster. Thanks to Mike Strosaker for helping me debug
this issue.
Fixes: 8c755ace35 ("crypto: vmx - Adding CBC routines for VMX module")
Cc: stable@vger.kernel.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When calling ppc-xlate.pl, we pass it either linux-ppc64 or
linux-ppc64le. The script however was expecting linux64le, a result
of its OpenSSL origins. This means we aren't obeying the ppc64le
ABIv2 rules.
Fix this by checking for linux-ppc64le.
Fixes: 5ca5573820 ("crypto: vmx - comply with ABIs that specify vrsave as reserved.")
Cc: stable@vger.kernel.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
SEC1 doesn't have IPSEC_ESP descriptor type but it is able to perform
IPSEC using HMAC_SNOOP_NO_AFEU, which is also existing on SEC2
In order to be able to define descriptors templates for SEC1 without
breaking SEC2+, we have to give lower priority to HMAC_SNOOP_NO_AFEU
so that SEC2+ selects IPSEC_ESP and not HMAC_SNOOP_NO_AFEU which is
less performant.
This is done by adding a priority field in the template. If the field
is 0, we use the default priority, otherwise we used the one in the
field.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patchs enhances the IPSEC_ESP related functions for them to
also supports the same operations with descriptor type
HMAC_SNOOP_NO_AFEU.
The differences between the two descriptor types are:
* pointeurs 2 and 3 are swaped (Confidentiality key and
Primary EU Context IN)
* HMAC_SNOOP_NO_AFEU has CICV out in pointer 6
* HMAC_SNOOP_NO_AFEU has no primary EU context out so we get it
from the end of data out
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In preparation of IPSEC for SEC1, first step is to make the mapping
helpers more generic so that they can also be used by AEAD functions.
First, the functions are moved before IPSEC functions in talitos.c
talitos_sg_unmap() and unmap_sg_talitos_ptr() are merged as they
are quite similar, the second one handling the SEC1 case an calling
the first one for SEC2
map_sg_in_talitos_ptr() and map_sg_out_talitos_ptr() are merged
into talitos_sg_map() and enhenced to support offseted zones
as used for AEAD. The actual mapping is now performed outside that
helper. The DMA sync is also done outside to not make it several
times.
talitos_edesc_alloc() size calculation are fixed to also take into
account AEAD specific parts also for SEC1
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In order to be able to use the mapping/unmapping helpers for IPSEC
it needs to be move upper in the file
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use helper for all modifications to talitos_ptr in preparation to
the implementation of AEAD for SEC1
to_talitos_ptr_extent_clear() has been removed in favor of
to_talitos_ptr_ext_set() to set any value and
to_talitos_ptr_ext_or() to or the extent field with a value
name has been shorten to help keeping single lines of 80 chars
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>