Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Add 1472-byte test to tcrypt for IPsec
   - Reintroduced crypto stats interface with numerous changes
   - Support incremental algorithm dumps

  Algorithms:
   - Add xchacha12/20
   - Add nhpoly1305
   - Add adiantum
   - Add streebog hash
   - Mark cts(cbc(aes)) as FIPS allowed

  Drivers:
   - Improve performance of arm64/chacha20
   - Improve performance of x86/chacha20
   - Add NEON-accelerated nhpoly1305
   - Add SSE2 accelerated nhpoly1305
   - Add AVX2 accelerated nhpoly1305
   - Add support for 192/256-bit keys in gcmaes AVX
   - Add SG support in gcmaes AVX
   - ESN for inline IPsec tx in chcr
   - Add support for CryptoCell 703 in ccree
   - Add support for CryptoCell 713 in ccree
   - Add SM4 support in ccree
   - Add SM3 support in ccree
   - Add support for chacha20 in caam/qi2
   - Add support for chacha20 + poly1305 in caam/jr
   - Add support for chacha20 + poly1305 in caam/qi2
   - Add AEAD cipher support in cavium/nitrox"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (130 commits)
  crypto: skcipher - remove remnants of internal IV generators
  crypto: cavium/nitrox - Fix build with !CONFIG_DEBUG_FS
  crypto: salsa20-generic - don't unnecessarily use atomic walk
  crypto: skcipher - add might_sleep() to skcipher_walk_virt()
  crypto: x86/chacha - avoid sleeping under kernel_fpu_begin()
  crypto: cavium/nitrox - Added AEAD cipher support
  crypto: mxc-scc - fix build warnings on ARM64
  crypto: api - document missing stats member
  crypto: user - remove unused dump functions
  crypto: chelsio - Fix wrong error counter increments
  crypto: chelsio - Reset counters on cxgb4 Detach
  crypto: chelsio - Handle PCI shutdown event
  crypto: chelsio - cleanup:send addr as value in function argument
  crypto: chelsio - Use same value for both channel in single WR
  crypto: chelsio - Swap location of AAD and IV sent in WR
  crypto: chelsio - remove set but not used variable 'kctx_len'
  crypto: ux500 - Use proper enum in hash_set_dma_transfer
  crypto: ux500 - Use proper enum in cryp_set_dma_transfer
  crypto: aesni - Add scatter/gather avx stubs, and use them in C
  crypto: aesni - Introduce partial block macro
  ...
commit b71acb0e37
@@ -1,15 +1,6 @@
 Programming Interface
 =====================
 
-Please note that the kernel crypto API contains the AEAD givcrypt API
-(crypto_aead_giv\* and aead_givcrypt\* function calls in
-include/crypto/aead.h). This API is obsolete and will be removed in the
-future. To obtain the functionality of an AEAD cipher with internal IV
-generation, use the IV generator as a regular cipher. For example,
-rfc4106(gcm(aes)) is the AEAD cipher with external IV generation and
-seqniv(rfc4106(gcm(aes))) implies that the kernel crypto API generates
-the IV. Different IV generators are available.
-
 .. class:: toc-title
 
 Table of contents
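For reference, the two template spellings mentioned in the removed text map
onto the same allocation call. A minimal sketch, assuming only the standard
crypto_alloc_aead() API; alloc_esp_aead() itself is a hypothetical helper:

```c
#include <crypto/aead.h>
#include <linux/err.h>

/* Hypothetical helper: pick between caller-supplied IVs (rfc4106) and
 * kernel-generated IVs (the seqiv template wrapped around rfc4106).
 */
static struct crypto_aead *alloc_esp_aead(bool kernel_generates_iv)
{
	const char *name = kernel_generates_iv ? "seqiv(rfc4106(gcm(aes)))"
					       : "rfc4106(gcm(aes))";

	return crypto_alloc_aead(name, 0, 0);	/* ERR_PTR() on failure */
}
```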
@@ -157,10 +157,6 @@ applicable to a cipher, it is not displayed:
 
 - rng for random number generator
 
-- givcipher for cipher with associated IV generator (see the geniv
-  entry below for the specification of the IV generator type used by
-  the cipher implementation)
-
 - kpp for a Key-agreement Protocol Primitive (KPP) cipher such as
   an ECDH or DH implementation
 
@@ -174,16 +170,7 @@ applicable to a cipher, it is not displayed:
 
 - digestsize: output size of the message digest
 
-- geniv: IV generation type:
+- geniv: IV generator (obsolete)
 
-  - eseqiv for encrypted sequence number based IV generation
-
-  - seqiv for sequence number based IV generation
-
-  - chainiv for chain iv generation
-
-  - <builtin> is a marker that the cipher implements IV generation and
-    handling as it is specific to the given cipher
-
 Key Sizes
 ---------
@@ -218,10 +205,6 @@ the aforementioned cipher types:
 
 - CRYPTO_ALG_TYPE_ABLKCIPHER Asynchronous multi-block cipher
 
-- CRYPTO_ALG_TYPE_GIVCIPHER Asynchronous multi-block cipher packed
-  together with an IV generator (see geniv field in the /proc/crypto
-  listing for the known IV generators)
-
 - CRYPTO_ALG_TYPE_KPP Key-agreement Protocol Primitive (KPP) such as
   an ECDH or DH implementation
 
@@ -338,18 +321,14 @@ uses the API applicable to the cipher type specified for the block.
 
 The following call sequence is applicable when the IPSEC layer triggers
 an encryption operation with the esp_output function. During
-configuration, the administrator set up the use of rfc4106(gcm(aes)) as
-the cipher for ESP. The following call sequence is now depicted in the
-ASCII art above:
+configuration, the administrator set up the use of seqiv(rfc4106(gcm(aes)))
+as the cipher for ESP. The following call sequence is now depicted in
+the ASCII art above:
 
 1. esp_output() invokes crypto_aead_encrypt() to trigger an
    encryption operation of the AEAD cipher with IV generator.
 
-   In case of GCM, the SEQIV implementation is registered as GIVCIPHER
-   in crypto_rfc4106_alloc().
-
-   The SEQIV performs its operation to generate an IV where the core
-   function is seqiv_geniv().
+   The SEQIV generates the IV.
 
 2. Now, SEQIV uses the AEAD API function calls to invoke the associated
    AEAD cipher. In our case, during the instantiation of SEQIV, the
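The caller's side of step 1 can be sketched as follows. This is an
illustration of the generic AEAD request API only, not code taken from
esp_output(); the scatterlists and lengths are placeholders:

```c
#include <crypto/aead.h>
#include <linux/scatterlist.h>

/* Sketch: hand src/dst scatterlists to the AEAD tfm and let the seqiv
 * layer generate the IV inside crypto_aead_encrypt().
 */
static int esp_encrypt_sketch(struct crypto_aead *aead,
			      struct scatterlist *src,
			      struct scatterlist *dst,
			      unsigned int assoclen, unsigned int len,
			      u8 *iv)
{
	struct aead_request *req = aead_request_alloc(aead, GFP_KERNEL);
	int err;

	if (!req)
		return -ENOMEM;
	aead_request_set_callback(req, 0, NULL, NULL);
	aead_request_set_ad(req, assoclen);
	aead_request_set_crypt(req, src, dst, len, iv);
	err = crypto_aead_encrypt(req);		/* step 1 of the sequence */
	aead_request_free(req);
	return err;
}
```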
@@ -1,8 +1,12 @@
 Arm TrustZone CryptoCell cryptographic engine
 
 Required properties:
-- compatible: Should be one of: "arm,cryptocell-712-ree",
-  "arm,cryptocell-710-ree" or "arm,cryptocell-630p-ree".
+- compatible: Should be one of -
+   "arm,cryptocell-713-ree"
+   "arm,cryptocell-703-ree"
+   "arm,cryptocell-712-ree"
+   "arm,cryptocell-710-ree"
+   "arm,cryptocell-630p-ree"
 - reg: Base physical address of the engine and length of memory mapped region.
 - interrupts: Interrupt number for the device.
 
@@ -6,6 +6,8 @@ Required properties:
 - interrupts : Should contain MXS DCP interrupt numbers, VMI IRQ and DCP IRQ
                must be supplied, optionally Secure IRQ can be present, but
                is currently not implemented and not used.
+- clocks : Clock reference (only required on some SOCs: 6ull and 6sll).
+- clock-names : Must be "dcp".
 
 Example:
 
@@ -3484,6 +3484,7 @@ F:	include/linux/spi/cc2520.h
 F:	Documentation/devicetree/bindings/net/ieee802154/cc2520.txt
 
 CCREE ARM TRUSTZONE CRYPTOCELL REE DRIVER
+M:	Yael Chemla <yael.chemla@foss.arm.com>
 M:	Gilad Ben-Yossef <gilad@benyossef.com>
 L:	linux-crypto@vger.kernel.org
 S:	Supported

@@ -7147,7 +7148,9 @@ F:	crypto/842.c
 F:	lib/842/
 
 IBM Power in-Nest Crypto Acceleration
-M:	Paulo Flabiano Smorigo <pfsmorigo@linux.ibm.com>
+M:	Breno Leitão <leitao@debian.org>
+M:	Nayna Jain <nayna@linux.ibm.com>
+M:	Paulo Flabiano Smorigo <pfsmorigo@gmail.com>
 L:	linux-crypto@vger.kernel.org
 S:	Supported
 F:	drivers/crypto/nx/Makefile

@@ -7211,7 +7214,9 @@ S:	Supported
 F:	drivers/scsi/ibmvscsi_tgt/
 
 IBM Power VMX Cryptographic instructions
-M:	Paulo Flabiano Smorigo <pfsmorigo@linux.ibm.com>
+M:	Breno Leitão <leitao@debian.org>
+M:	Nayna Jain <nayna@linux.ibm.com>
+M:	Paulo Flabiano Smorigo <pfsmorigo@gmail.com>
 L:	linux-crypto@vger.kernel.org
 S:	Supported
 F:	drivers/crypto/vmx/Makefile
@@ -69,6 +69,15 @@ config CRYPTO_AES_ARM
	help
	  Use optimized AES assembler routines for ARM platforms.
 
+	  On ARM processors without the Crypto Extensions, this is the
+	  fastest AES implementation for single blocks. For multiple
+	  blocks, the NEON bit-sliced implementation is usually faster.
+
+	  This implementation may be vulnerable to cache timing attacks,
+	  since it uses lookup tables. However, as countermeasures it
+	  disables IRQs and preloads the tables; it is hoped this makes
+	  such attacks very difficult.
+
 config CRYPTO_AES_ARM_BS
	tristate "Bit sliced AES using NEON instructions"
	depends on KERNEL_MODE_NEON

@@ -117,9 +126,14 @@ config CRYPTO_CRC32_ARM_CE
	select CRYPTO_HASH
 
 config CRYPTO_CHACHA20_NEON
-	tristate "NEON accelerated ChaCha20 symmetric cipher"
+	tristate "NEON accelerated ChaCha stream cipher algorithms"
	depends on KERNEL_MODE_NEON
	select CRYPTO_BLKCIPHER
	select CRYPTO_CHACHA20
 
+config CRYPTO_NHPOLY1305_NEON
+	tristate "NEON accelerated NHPoly1305 hash function (for Adiantum)"
+	depends on KERNEL_MODE_NEON
+	select CRYPTO_NHPOLY1305
+
 endif
@@ -9,7 +9,8 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o
 obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
 obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
 obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
-obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
+obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o
+obj-$(CONFIG_CRYPTO_NHPOLY1305_NEON) += nhpoly1305-neon.o
 
 ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
 ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o

@@ -52,7 +53,8 @@ aes-arm-ce-y := aes-ce-core.o aes-ce-glue.o
 ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o
 crct10dif-arm-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o
 crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o
-chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o
+chacha-neon-y := chacha-neon-core.o chacha-neon-glue.o
+nhpoly1305-neon-y := nh-neon-core.o nhpoly1305-neon-glue.o
 
 ifdef REGENERATE_ARM_CRYPTO
 quiet_cmd_perl = PERL $@
@@ -10,7 +10,6 @@
 
 #include <asm/hwcap.h>
 #include <asm/neon.h>
-#include <asm/hwcap.h>
 #include <crypto/aes.h>
 #include <crypto/internal/simd.h>
 #include <crypto/internal/skcipher.h>
@@ -10,6 +10,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/assembler.h>
 #include <asm/cache.h>
 
	.text

@@ -41,7 +42,7 @@
	.endif
	.endm
 
-	.macro	__hround, out0, out1, in0, in1, in2, in3, t3, t4, enc, sz, op
+	.macro	__hround, out0, out1, in0, in1, in2, in3, t3, t4, enc, sz, op, oldcpsr
	__select	\out0, \in0, 0
	__select	t0, \in1, 1
	__load		\out0, \out0, 0, \sz, \op

@@ -73,6 +74,14 @@
	__load		t0, t0, 3, \sz, \op
	__load		\t4, \t4, 3, \sz, \op
 
+	.ifnb		\oldcpsr
+	/*
+	 * This is the final round and we're done with all data-dependent table
+	 * lookups, so we can safely re-enable interrupts.
+	 */
+	restore_irqs	\oldcpsr
+	.endif
+
	eor		\out1, \out1, t1, ror #24
	eor		\out0, \out0, t2, ror #16
	ldm		rk!, {t1, t2}

@@ -83,14 +92,14 @@
	eor		\out1, \out1, t2
	.endm
 
-	.macro	fround, out0, out1, out2, out3, in0, in1, in2, in3, sz=2, op
+	.macro	fround, out0, out1, out2, out3, in0, in1, in2, in3, sz=2, op, oldcpsr
	__hround	\out0, \out1, \in0, \in1, \in2, \in3, \out2, \out3, 1, \sz, \op
-	__hround	\out2, \out3, \in2, \in3, \in0, \in1, \in1, \in2, 1, \sz, \op
+	__hround	\out2, \out3, \in2, \in3, \in0, \in1, \in1, \in2, 1, \sz, \op, \oldcpsr
	.endm
 
-	.macro	iround, out0, out1, out2, out3, in0, in1, in2, in3, sz=2, op
+	.macro	iround, out0, out1, out2, out3, in0, in1, in2, in3, sz=2, op, oldcpsr
	__hround	\out0, \out1, \in0, \in3, \in2, \in1, \out2, \out3, 0, \sz, \op
-	__hround	\out2, \out3, \in2, \in1, \in0, \in3, \in1, \in0, 0, \sz, \op
+	__hround	\out2, \out3, \in2, \in1, \in0, \in3, \in1, \in0, 0, \sz, \op, \oldcpsr
	.endm
 
	.macro	__rev, out, in

@@ -118,13 +127,14 @@
	.macro	do_crypt, round, ttab, ltab, bsz
	push		{r3-r11, lr}
 
+	// Load keys first, to reduce latency in case they're not cached yet.
+	ldm		rk!, {r8-r11}
+
	ldr		r4, [in]
	ldr		r5, [in, #4]
	ldr		r6, [in, #8]
	ldr		r7, [in, #12]
 
-	ldm		rk!, {r8-r11}
-
 #ifdef CONFIG_CPU_BIG_ENDIAN
	__rev		r4, r4
	__rev		r5, r5

@@ -138,6 +148,25 @@
	eor		r7, r7, r11
 
	__adrl		ttab, \ttab
+	/*
+	 * Disable interrupts and prefetch the 1024-byte 'ft' or 'it' table into
+	 * L1 cache, assuming cacheline size >= 32. This is a hardening measure
+	 * intended to make cache-timing attacks more difficult. They may not
+	 * be fully prevented, however; see the paper
+	 * https://cr.yp.to/antiforgery/cachetiming-20050414.pdf
+	 * ("Cache-timing attacks on AES") for a discussion of the many
+	 * difficulties involved in writing truly constant-time AES software.
+	 */
+	save_and_disable_irqs	t0
+	.set		i, 0
+	.rept		1024 / 128
+	ldr		r8, [ttab, #i + 0]
+	ldr		r9, [ttab, #i + 32]
+	ldr		r10, [ttab, #i + 64]
+	ldr		r11, [ttab, #i + 96]
+	.set		i, i + 128
+	.endr
+	push		{t0}		// oldcpsr
 
	tst		rounds, #2
	bne		1f

@@ -151,8 +180,21 @@
	\round		r4, r5, r6, r7, r8, r9, r10, r11
	b		0b
 
-2:	__adrl		ttab, \ltab
-	\round		r4, r5, r6, r7, r8, r9, r10, r11, \bsz, b
+2:	.ifb		\ltab
+	add		ttab, ttab, #1
+	.else
+	__adrl		ttab, \ltab
+	// Prefetch inverse S-box for final round; see explanation above
+	.set		i, 0
+	.rept		256 / 64
+	ldr		t0, [ttab, #i + 0]
+	ldr		t1, [ttab, #i + 32]
+	.set		i, i + 64
+	.endr
+	.endif
+
+	pop		{rounds}	// oldcpsr
+	\round		r4, r5, r6, r7, r8, r9, r10, r11, \bsz, b, rounds
 
 #ifdef CONFIG_CPU_BIG_ENDIAN
	__rev		r4, r4

@@ -175,7 +217,7 @@
	.endm
 
 ENTRY(__aes_arm_encrypt)
-	do_crypt	fround, crypto_ft_tab, crypto_ft_tab + 1, 2
+	do_crypt	fround, crypto_ft_tab,, 2
 ENDPROC(__aes_arm_encrypt)
 
	.align		5
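The idea behind the prefetch loop added above can be modelled in plain C.
A sketch only -- the constants mirror the comment's assumptions, and a real
compiler may elide such loads, which is one reason the kernel does this in
assembly with IRQs disabled:

```c
#include <stdint.h>

#define TABLE_SIZE	1024	/* the 'ft'/'it' lookup table, as above */
#define CACHELINE	32	/* minimum line size assumed by the patch */

/* Touch every cache line of the table before any key/data-dependent
 * index is used, so a cache-timing observer sees the whole table
 * resident regardless of the secret values.
 */
static void prefetch_table(const volatile uint8_t *table)
{
	uint32_t sum = 0;

	for (unsigned int i = 0; i < TABLE_SIZE; i += CACHELINE)
		sum += table[i];
	(void)sum;	/* keep the loads observable */
}
```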
@@ -1,5 +1,5 @@
 /*
- * ChaCha20 256-bit cipher algorithm, RFC7539, ARM NEON functions
+ * ChaCha/XChaCha NEON helper functions
  *
  * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
  *

@@ -27,9 +27,9 @@
  * (d) vtbl.8 + vtbl.8		(multiple of 8 bits rotations only,
  *				 needs index vector)
  *
- * ChaCha20 has 16, 12, 8, and 7-bit rotations. For the 12 and 7-bit
- * rotations, the only choices are (a) and (b). We use (a) since it takes
- * two-thirds the cycles of (b) on both Cortex-A7 and Cortex-A53.
+ * ChaCha has 16, 12, 8, and 7-bit rotations. For the 12 and 7-bit rotations,
+ * the only choices are (a) and (b). We use (a) since it takes two-thirds the
+ * cycles of (b) on both Cortex-A7 and Cortex-A53.
  *
  * For the 16-bit rotation, we use vrev32.16 since it's consistently fastest
  * and doesn't need a temporary register.

@@ -52,30 +52,20 @@
	.fpu		neon
	.align		5
 
-ENTRY(chacha20_block_xor_neon)
-	// r0: Input state matrix, s
-	// r1: 1 data block output, o
-	// r2: 1 data block input, i
-	//
-	// This function encrypts one ChaCha20 block by loading the state matrix
-	// in four NEON registers. It performs matrix operation on four words in
-	// parallel, but requireds shuffling to rearrange the words after each
-	// round.
-	//
-	// x0..3 = s0..3
-	add		ip, r0, #0x20
-	vld1.32		{q0-q1}, [r0]
-	vld1.32		{q2-q3}, [ip]
-
-	vmov		q8, q0
-	vmov		q9, q1
-	vmov		q10, q2
-	vmov		q11, q3
-
+/*
+ * chacha_permute - permute one block
+ *
+ * Permute one 64-byte block where the state matrix is stored in the four NEON
+ * registers q0-q3. It performs matrix operations on four words in parallel,
+ * but requires shuffling to rearrange the words after each round.
+ *
+ * The round count is given in r3.
+ *
+ * Clobbers: r3, ip, q4-q5
+ */
+chacha_permute:
+
	adr		ip, .Lrol8_table
-	mov		r3, #10
	vld1.8		{d10}, [ip, :64]
 
 .Ldoubleround:

@@ -139,9 +129,31 @@ ENTRY(chacha20_block_xor_neon)
	// x3 = shuffle32(x3, MASK(0, 3, 2, 1))
	vext.8		q3, q3, q3, #4
 
-	subs		r3, r3, #1
+	subs		r3, r3, #2
	bne		.Ldoubleround
 
+	bx		lr
+ENDPROC(chacha_permute)
+
+ENTRY(chacha_block_xor_neon)
+	// r0: Input state matrix, s
+	// r1: 1 data block output, o
+	// r2: 1 data block input, i
+	// r3: nrounds
+	push		{lr}
+
+	// x0..3 = s0..3
+	add		ip, r0, #0x20
+	vld1.32		{q0-q1}, [r0]
+	vld1.32		{q2-q3}, [ip]
+
+	vmov		q8, q0
+	vmov		q9, q1
+	vmov		q10, q2
+	vmov		q11, q3
+
+	bl		chacha_permute
+
	add		ip, r2, #0x20
	vld1.8		{q4-q5}, [r2]
	vld1.8		{q6-q7}, [ip]

@@ -166,15 +178,33 @@ ENTRY(chacha20_block_xor_neon)
	vst1.8		{q0-q1}, [r1]
	vst1.8		{q2-q3}, [ip]
 
-	bx		lr
-ENDPROC(chacha20_block_xor_neon)
+	pop		{pc}
+ENDPROC(chacha_block_xor_neon)
+
+ENTRY(hchacha_block_neon)
+	// r0: Input state matrix, s
+	// r1: output (8 32-bit words)
+	// r2: nrounds
+	push		{lr}
+
+	vld1.32		{q0-q1}, [r0]!
+	vld1.32		{q2-q3}, [r0]
+
+	mov		r3, r2
+	bl		chacha_permute
+
+	vst1.32		{q0}, [r1]!
+	vst1.32		{q3}, [r1]
+
+	pop		{pc}
+ENDPROC(hchacha_block_neon)
 
	.align		4
 .Lctrinc:	.word	0, 1, 2, 3
 .Lrol8_table:	.byte	3, 0, 1, 2, 7, 4, 5, 6
 
	.align		5
-ENTRY(chacha20_4block_xor_neon)
+ENTRY(chacha_4block_xor_neon)
	push		{r4-r5}
	mov		r4, sp			// preserve the stack pointer
	sub		ip, sp, #0x20		// allocate a 32 byte buffer

@@ -184,9 +214,10 @@ ENTRY(chacha20_4block_xor_neon)
	// r0: Input state matrix, s
	// r1: 4 data blocks output, o
	// r2: 4 data blocks input, i
+	// r3: nrounds
 
	//
-	// This function encrypts four consecutive ChaCha20 blocks by loading
+	// This function encrypts four consecutive ChaCha blocks by loading
	// the state matrix in NEON registers four times. The algorithm performs
	// each operation on the corresponding word of each state matrix, hence
	// requires no word shuffling. The words are re-interleaved before the

@@ -219,7 +250,6 @@ ENTRY(chacha20_4block_xor_neon)
	vdup.32		q0, d0[0]
 
	adr		ip, .Lrol8_table
-	mov		r3, #10
	b		1f
 
 .Ldoubleround4:

@@ -417,7 +447,7 @@ ENTRY(chacha20_4block_xor_neon)
	vsri.u32	q5, q8, #25
	vsri.u32	q6, q9, #25
 
-	subs		r3, r3, #1
+	subs		r3, r3, #2
	bne		.Ldoubleround4
 
	// x0..7[0-3] are in q0-q7, x10..15[0-3] are in q10-q15.

@@ -527,4 +557,4 @@ ENTRY(chacha20_4block_xor_neon)
 
	pop		{r4-r5}
	bx		lr
-ENDPROC(chacha20_4block_xor_neon)
+ENDPROC(chacha_4block_xor_neon)
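Each of the NEON sequences above implements the standard ChaCha quarter
round; a generic C rendering per RFC 7539 (not part of the patch) is:

```c
#include <stdint.h>

#define ROTL32(v, n)	(((v) << (n)) | ((v) >> (32 - (n))))

/* ChaCha quarter round (RFC 7539). The NEON code runs this on four
 * 32-bit lanes at once; the 16/12/8/7-bit rotations are the ones
 * implemented with vrev32.16, shl+sri and vtbl as discussed in the
 * header comment.
 */
static void chacha_quarterround(uint32_t x[16], int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = ROTL32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = ROTL32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = ROTL32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = ROTL32(x[b] ^ x[c], 7);
}
```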
@@ -0,0 +1,201 @@
+/*
+ * ARM NEON accelerated ChaCha and XChaCha stream ciphers,
+ * including ChaCha20 (RFC7539)
+ *
+ * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Based on:
+ * ChaCha20 256-bit cipher algorithm, RFC7539, SIMD glue code
+ *
+ * Copyright (C) 2015 Martin Willi
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <crypto/algapi.h>
+#include <crypto/chacha.h>
+#include <crypto/internal/skcipher.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+#include <asm/hwcap.h>
+#include <asm/neon.h>
+#include <asm/simd.h>
+
+asmlinkage void chacha_block_xor_neon(const u32 *state, u8 *dst, const u8 *src,
+				      int nrounds);
+asmlinkage void chacha_4block_xor_neon(const u32 *state, u8 *dst, const u8 *src,
+				       int nrounds);
+asmlinkage void hchacha_block_neon(const u32 *state, u32 *out, int nrounds);
+
+static void chacha_doneon(u32 *state, u8 *dst, const u8 *src,
+			  unsigned int bytes, int nrounds)
+{
+	u8 buf[CHACHA_BLOCK_SIZE];
+
+	while (bytes >= CHACHA_BLOCK_SIZE * 4) {
+		chacha_4block_xor_neon(state, dst, src, nrounds);
+		bytes -= CHACHA_BLOCK_SIZE * 4;
+		src += CHACHA_BLOCK_SIZE * 4;
+		dst += CHACHA_BLOCK_SIZE * 4;
+		state[12] += 4;
+	}
+	while (bytes >= CHACHA_BLOCK_SIZE) {
+		chacha_block_xor_neon(state, dst, src, nrounds);
+		bytes -= CHACHA_BLOCK_SIZE;
+		src += CHACHA_BLOCK_SIZE;
+		dst += CHACHA_BLOCK_SIZE;
+		state[12]++;
+	}
+	if (bytes) {
+		memcpy(buf, src, bytes);
+		chacha_block_xor_neon(state, buf, buf, nrounds);
+		memcpy(dst, buf, bytes);
+	}
+}
+
+static int chacha_neon_stream_xor(struct skcipher_request *req,
+				  struct chacha_ctx *ctx, u8 *iv)
+{
+	struct skcipher_walk walk;
+	u32 state[16];
+	int err;
+
+	err = skcipher_walk_virt(&walk, req, false);
+
+	crypto_chacha_init(state, ctx, iv);
+
+	while (walk.nbytes > 0) {
+		unsigned int nbytes = walk.nbytes;
+
+		if (nbytes < walk.total)
+			nbytes = round_down(nbytes, walk.stride);
+
+		kernel_neon_begin();
+		chacha_doneon(state, walk.dst.virt.addr, walk.src.virt.addr,
+			      nbytes, ctx->nrounds);
+		kernel_neon_end();
+		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
+	}
+
+	return err;
+}
+
+static int chacha_neon(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+	if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
+		return crypto_chacha_crypt(req);
+
+	return chacha_neon_stream_xor(req, ctx, req->iv);
+}
+
+static int xchacha_neon(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct chacha_ctx subctx;
+	u32 state[16];
+	u8 real_iv[16];
+
+	if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
+		return crypto_xchacha_crypt(req);
+
+	crypto_chacha_init(state, ctx, req->iv);
+
+	kernel_neon_begin();
+	hchacha_block_neon(state, subctx.key, ctx->nrounds);
+	kernel_neon_end();
+	subctx.nrounds = ctx->nrounds;
+
+	memcpy(&real_iv[0], req->iv + 24, 8);
+	memcpy(&real_iv[8], req->iv + 16, 8);
+	return chacha_neon_stream_xor(req, &subctx, real_iv);
+}
+
+static struct skcipher_alg algs[] = {
+	{
+		.base.cra_name		= "chacha20",
+		.base.cra_driver_name	= "chacha20-neon",
+		.base.cra_priority	= 300,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= CHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.walksize		= 4 * CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha20_setkey,
+		.encrypt		= chacha_neon,
+		.decrypt		= chacha_neon,
+	}, {
+		.base.cra_name		= "xchacha20",
+		.base.cra_driver_name	= "xchacha20-neon",
+		.base.cra_priority	= 300,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= XCHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.walksize		= 4 * CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha20_setkey,
+		.encrypt		= xchacha_neon,
+		.decrypt		= xchacha_neon,
+	}, {
+		.base.cra_name		= "xchacha12",
+		.base.cra_driver_name	= "xchacha12-neon",
+		.base.cra_priority	= 300,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= XCHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.walksize		= 4 * CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha12_setkey,
+		.encrypt		= xchacha_neon,
+		.decrypt		= xchacha_neon,
+	}
+};
+
+static int __init chacha_simd_mod_init(void)
+{
+	if (!(elf_hwcap & HWCAP_NEON))
+		return -ENODEV;
+
+	return crypto_register_skciphers(algs, ARRAY_SIZE(algs));
+}
+
+static void __exit chacha_simd_mod_fini(void)
+{
+	crypto_unregister_skciphers(algs, ARRAY_SIZE(algs));
+}
+
+module_init(chacha_simd_mod_init);
+module_exit(chacha_simd_mod_fini);
+
+MODULE_DESCRIPTION("ChaCha and XChaCha stream ciphers (NEON accelerated)");
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS_CRYPTO("chacha20");
+MODULE_ALIAS_CRYPTO("chacha20-neon");
+MODULE_ALIAS_CRYPTO("xchacha20");
+MODULE_ALIAS_CRYPTO("xchacha20-neon");
+MODULE_ALIAS_CRYPTO("xchacha12");
+MODULE_ALIAS_CRYPTO("xchacha12-neon");
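Once registered, the new "xchacha20" skcipher is reachable from userspace
through AF_ALG. A hedged smoke-test sketch (error handling omitted; the
all-zero key and IV are arbitrary placeholders):

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/if_alg.h>

int main(void)
{
	struct sockaddr_alg sa = {
		.salg_family = AF_ALG,
		.salg_type   = "skcipher",
		.salg_name   = "xchacha20",
	};
	unsigned char key[32] = { 0 }, buf[64] = { 0 };
	char cbuf[CMSG_SPACE(sizeof(__u32)) +
		  CMSG_SPACE(offsetof(struct af_alg_iv, iv) + 32)] = { 0 };
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
	struct msghdr msg = {
		.msg_control = cbuf, .msg_controllen = sizeof(cbuf),
		.msg_iov = &iov, .msg_iovlen = 1,
	};
	struct cmsghdr *cmsg;
	int tfmfd, opfd;

	tfmfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
	bind(tfmfd, (struct sockaddr *)&sa, sizeof(sa));
	setsockopt(tfmfd, SOL_ALG, ALG_SET_KEY, key, sizeof(key));
	opfd = accept(tfmfd, NULL, 0);

	cmsg = CMSG_FIRSTHDR(&msg);		/* operation: encrypt */
	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type = ALG_SET_OP;
	cmsg->cmsg_len = CMSG_LEN(sizeof(__u32));
	*(__u32 *)CMSG_DATA(cmsg) = ALG_OP_ENCRYPT;

	cmsg = CMSG_NXTHDR(&msg, cmsg);		/* 32-byte XChaCha IV */
	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type = ALG_SET_IV;
	cmsg->cmsg_len = CMSG_LEN(offsetof(struct af_alg_iv, iv) + 32);
	((struct af_alg_iv *)CMSG_DATA(cmsg))->ivlen = 32;

	sendmsg(opfd, &msg, 0);
	read(opfd, buf, sizeof(buf));		/* ciphertext */
	close(opfd);
	close(tfmfd);
	return 0;
}
```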
@@ -1,127 +0,0 @@
-/*
- * ChaCha20 256-bit cipher algorithm, RFC7539, ARM NEON functions
- *
- * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * Based on:
- * ChaCha20 256-bit cipher algorithm, RFC7539, SIMD glue code
- *
- * Copyright (C) 2015 Martin Willi
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <crypto/algapi.h>
-#include <crypto/chacha20.h>
-#include <crypto/internal/skcipher.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-
-#include <asm/hwcap.h>
-#include <asm/neon.h>
-#include <asm/simd.h>
-
-asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src);
-asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
-
-static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
-			    unsigned int bytes)
-{
-	u8 buf[CHACHA20_BLOCK_SIZE];
-
-	while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
-		chacha20_4block_xor_neon(state, dst, src);
-		bytes -= CHACHA20_BLOCK_SIZE * 4;
-		src += CHACHA20_BLOCK_SIZE * 4;
-		dst += CHACHA20_BLOCK_SIZE * 4;
-		state[12] += 4;
-	}
-	while (bytes >= CHACHA20_BLOCK_SIZE) {
-		chacha20_block_xor_neon(state, dst, src);
-		bytes -= CHACHA20_BLOCK_SIZE;
-		src += CHACHA20_BLOCK_SIZE;
-		dst += CHACHA20_BLOCK_SIZE;
-		state[12]++;
-	}
-	if (bytes) {
-		memcpy(buf, src, bytes);
-		chacha20_block_xor_neon(state, buf, buf);
-		memcpy(dst, buf, bytes);
-	}
-}
-
-static int chacha20_neon(struct skcipher_request *req)
-{
-	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
-	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
-	struct skcipher_walk walk;
-	u32 state[16];
-	int err;
-
-	if (req->cryptlen <= CHACHA20_BLOCK_SIZE || !may_use_simd())
-		return crypto_chacha20_crypt(req);
-
-	err = skcipher_walk_virt(&walk, req, true);
-
-	crypto_chacha20_init(state, ctx, walk.iv);
-
-	kernel_neon_begin();
-	while (walk.nbytes > 0) {
-		unsigned int nbytes = walk.nbytes;
-
-		if (nbytes < walk.total)
-			nbytes = round_down(nbytes, walk.stride);
-
-		chacha20_doneon(state, walk.dst.virt.addr, walk.src.virt.addr,
-				nbytes);
-		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
-	}
-	kernel_neon_end();
-
-	return err;
-}
-
-static struct skcipher_alg alg = {
-	.base.cra_name		= "chacha20",
-	.base.cra_driver_name	= "chacha20-neon",
-	.base.cra_priority	= 300,
-	.base.cra_blocksize	= 1,
-	.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
-	.base.cra_module	= THIS_MODULE,
-
-	.min_keysize		= CHACHA20_KEY_SIZE,
-	.max_keysize		= CHACHA20_KEY_SIZE,
-	.ivsize			= CHACHA20_IV_SIZE,
-	.chunksize		= CHACHA20_BLOCK_SIZE,
-	.walksize		= 4 * CHACHA20_BLOCK_SIZE,
-	.setkey			= crypto_chacha20_setkey,
-	.encrypt		= chacha20_neon,
-	.decrypt		= chacha20_neon,
-};
-
-static int __init chacha20_simd_mod_init(void)
-{
-	if (!(elf_hwcap & HWCAP_NEON))
-		return -ENODEV;
-
-	return crypto_register_skcipher(&alg);
-}
-
-static void __exit chacha20_simd_mod_fini(void)
-{
-	crypto_unregister_skcipher(&alg);
-}
-
-module_init(chacha20_simd_mod_init);
-module_exit(chacha20_simd_mod_fini);
-
-MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
-MODULE_LICENSE("GPL v2");
-MODULE_ALIAS_CRYPTO("chacha20");
@@ -0,0 +1,116 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * NH - ε-almost-universal hash function, NEON accelerated version
+ *
+ * Copyright 2018 Google LLC
+ *
+ * Author: Eric Biggers <ebiggers@google.com>
+ */
+
+#include <linux/linkage.h>
+
+	.text
+	.fpu		neon
+
+	KEY		.req	r0
+	MESSAGE		.req	r1
+	MESSAGE_LEN	.req	r2
+	HASH		.req	r3
+
+	PASS0_SUMS	.req	q0
+	PASS0_SUM_A	.req	d0
+	PASS0_SUM_B	.req	d1
+	PASS1_SUMS	.req	q1
+	PASS1_SUM_A	.req	d2
+	PASS1_SUM_B	.req	d3
+	PASS2_SUMS	.req	q2
+	PASS2_SUM_A	.req	d4
+	PASS2_SUM_B	.req	d5
+	PASS3_SUMS	.req	q3
+	PASS3_SUM_A	.req	d6
+	PASS3_SUM_B	.req	d7
+	K0		.req	q4
+	K1		.req	q5
+	K2		.req	q6
+	K3		.req	q7
+	T0		.req	q8
+	T0_L		.req	d16
+	T0_H		.req	d17
+	T1		.req	q9
+	T1_L		.req	d18
+	T1_H		.req	d19
+	T2		.req	q10
+	T2_L		.req	d20
+	T2_H		.req	d21
+	T3		.req	q11
+	T3_L		.req	d22
+	T3_H		.req	d23
+
+.macro _nh_stride	k0, k1, k2, k3
+
+	// Load next message stride
+	vld1.8		{T3}, [MESSAGE]!
+
+	// Load next key stride
+	vld1.32		{\k3}, [KEY]!
+
+	// Add message words to key words
+	vadd.u32	T0, T3, \k0
+	vadd.u32	T1, T3, \k1
+	vadd.u32	T2, T3, \k2
+	vadd.u32	T3, T3, \k3
+
+	// Multiply 32x32 => 64 and accumulate
+	vmlal.u32	PASS0_SUMS, T0_L, T0_H
+	vmlal.u32	PASS1_SUMS, T1_L, T1_H
+	vmlal.u32	PASS2_SUMS, T2_L, T2_H
+	vmlal.u32	PASS3_SUMS, T3_L, T3_H
+.endm
+
+/*
+ * void nh_neon(const u32 *key, const u8 *message, size_t message_len,
+ *		u8 hash[NH_HASH_BYTES])
+ *
+ * It's guaranteed that message_len % 16 == 0.
+ */
+ENTRY(nh_neon)
+
+	vld1.32		{K0,K1}, [KEY]!
+	vmov.u64	PASS0_SUMS, #0
+	vmov.u64	PASS1_SUMS, #0
+	vld1.32		{K2}, [KEY]!
+	vmov.u64	PASS2_SUMS, #0
+	vmov.u64	PASS3_SUMS, #0
+
+	subs		MESSAGE_LEN, MESSAGE_LEN, #64
+	blt		.Lloop4_done
+.Lloop4:
+	_nh_stride	K0, K1, K2, K3
+	_nh_stride	K1, K2, K3, K0
+	_nh_stride	K2, K3, K0, K1
+	_nh_stride	K3, K0, K1, K2
+	subs		MESSAGE_LEN, MESSAGE_LEN, #64
+	bge		.Lloop4
+
+.Lloop4_done:
+	ands		MESSAGE_LEN, MESSAGE_LEN, #63
+	beq		.Ldone
+	_nh_stride	K0, K1, K2, K3
+
+	subs		MESSAGE_LEN, MESSAGE_LEN, #16
+	beq		.Ldone
+	_nh_stride	K1, K2, K3, K0
+
+	subs		MESSAGE_LEN, MESSAGE_LEN, #16
+	beq		.Ldone
+	_nh_stride	K2, K3, K0, K1
+
+.Ldone:
+	// Sum the accumulators for each pass, then store the sums to 'hash'
+	vadd.u64	T0_L, PASS0_SUM_A, PASS0_SUM_B
+	vadd.u64	T0_H, PASS1_SUM_A, PASS1_SUM_B
+	vadd.u64	T1_L, PASS2_SUM_A, PASS2_SUM_B
+	vadd.u64	T1_H, PASS3_SUM_A, PASS3_SUM_B
+	vst1.8		{T0-T1}, [HASH]
+	bx		lr
+ENDPROC(nh_neon)
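The per-stride arithmetic above corresponds to the following generic model
of NH (a sketch of the algorithm, not the kernel's own generic
implementation; little-endian word loads are assumed):

```c
#include <stdint.h>
#include <string.h>

/* For every 16-byte message unit, each of the four passes accumulates a
 * 64-bit product of two 32-bit "message word + key word" sums -- exactly
 * what the vadd.u32/vmlal.u32 pairs compute, with the key advancing by
 * one 16-byte stride per unit.
 */
static void nh_model(const uint32_t *key, const uint8_t *message,
		     size_t message_len, uint64_t sums[4])
{
	memset(sums, 0, 4 * sizeof(sums[0]));
	while (message_len >= 16) {
		uint32_t m[4];
		int t;

		memcpy(m, message, 16);		/* little-endian host */
		for (t = 0; t < 4; t++)
			sums[t] += (uint64_t)(uint32_t)(m[0] + key[4 * t]) *
					     (uint32_t)(m[2] + key[4 * t + 2]) +
				   (uint64_t)(uint32_t)(m[1] + key[4 * t + 1]) *
					     (uint32_t)(m[3] + key[4 * t + 3]);
		key += 4;
		message += 16;
		message_len -= 16;
	}
}
```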
@@ -0,0 +1,77 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NHPoly1305 - ε-almost-∆-universal hash function for Adiantum
+ * (NEON accelerated version)
+ *
+ * Copyright 2018 Google LLC
+ */
+
+#include <asm/neon.h>
+#include <asm/simd.h>
+#include <crypto/internal/hash.h>
+#include <crypto/nhpoly1305.h>
+#include <linux/module.h>
+
+asmlinkage void nh_neon(const u32 *key, const u8 *message, size_t message_len,
+			u8 hash[NH_HASH_BYTES]);
+
+/* wrapper to avoid indirect call to assembly, which doesn't work with CFI */
+static void _nh_neon(const u32 *key, const u8 *message, size_t message_len,
+		     __le64 hash[NH_NUM_PASSES])
+{
+	nh_neon(key, message, message_len, (u8 *)hash);
+}
+
+static int nhpoly1305_neon_update(struct shash_desc *desc,
+				  const u8 *src, unsigned int srclen)
+{
+	if (srclen < 64 || !may_use_simd())
+		return crypto_nhpoly1305_update(desc, src, srclen);
+
+	do {
+		unsigned int n = min_t(unsigned int, srclen, PAGE_SIZE);
+
+		kernel_neon_begin();
+		crypto_nhpoly1305_update_helper(desc, src, n, _nh_neon);
+		kernel_neon_end();
+		src += n;
+		srclen -= n;
+	} while (srclen);
+	return 0;
+}
+
+static struct shash_alg nhpoly1305_alg = {
+	.base.cra_name		= "nhpoly1305",
+	.base.cra_driver_name	= "nhpoly1305-neon",
+	.base.cra_priority	= 200,
+	.base.cra_ctxsize	= sizeof(struct nhpoly1305_key),
+	.base.cra_module	= THIS_MODULE,
+	.digestsize		= POLY1305_DIGEST_SIZE,
+	.init			= crypto_nhpoly1305_init,
+	.update			= nhpoly1305_neon_update,
+	.final			= crypto_nhpoly1305_final,
+	.setkey			= crypto_nhpoly1305_setkey,
+	.descsize		= sizeof(struct nhpoly1305_state),
+};
+
+static int __init nhpoly1305_mod_init(void)
+{
+	if (!(elf_hwcap & HWCAP_NEON))
+		return -ENODEV;
+
+	return crypto_register_shash(&nhpoly1305_alg);
+}
+
+static void __exit nhpoly1305_mod_exit(void)
+{
+	crypto_unregister_shash(&nhpoly1305_alg);
+}
+
+module_init(nhpoly1305_mod_init);
+module_exit(nhpoly1305_mod_exit);
+
+MODULE_DESCRIPTION("NHPoly1305 ε-almost-∆-universal hash function (NEON-accelerated)");
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
+MODULE_ALIAS_CRYPTO("nhpoly1305");
+MODULE_ALIAS_CRYPTO("nhpoly1305-neon");
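Driving the resulting "nhpoly1305" shash from other kernel code follows the
usual pattern. A sketch under the assumption that key, data, and digest
buffers are supplied by the caller (NHPoly1305 is normally used through the
adiantum template rather than directly):

```c
#include <crypto/hash.h>
#include <linux/err.h>
#include <linux/slab.h>

/* Illustrative one-shot digest over a keyed nhpoly1305 transform. */
static int nhpoly1305_demo(const u8 *key, unsigned int keylen,
			   const u8 *data, unsigned int len, u8 *digest)
{
	struct crypto_shash *tfm = crypto_alloc_shash("nhpoly1305", 0, 0);
	struct shash_desc *desc;
	int err;

	if (IS_ERR(tfm))
		return PTR_ERR(tfm);
	err = crypto_shash_setkey(tfm, key, keylen);
	if (err)
		goto out_free_tfm;
	desc = kzalloc(sizeof(*desc) + crypto_shash_descsize(tfm),
		       GFP_KERNEL);
	if (!desc) {
		err = -ENOMEM;
		goto out_free_tfm;
	}
	desc->tfm = tfm;
	err = crypto_shash_digest(desc, data, len, digest);
	kfree(desc);
out_free_tfm:
	crypto_free_shash(tfm);
	return err;
}
```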
@@ -101,11 +101,16 @@ config CRYPTO_AES_ARM64_NEON_BLK
	select CRYPTO_SIMD
 
 config CRYPTO_CHACHA20_NEON
-	tristate "NEON accelerated ChaCha20 symmetric cipher"
+	tristate "ChaCha20, XChaCha20, and XChaCha12 stream ciphers using NEON instructions"
	depends on KERNEL_MODE_NEON
	select CRYPTO_BLKCIPHER
	select CRYPTO_CHACHA20
 
+config CRYPTO_NHPOLY1305_NEON
+	tristate "NHPoly1305 hash function using NEON instructions (for Adiantum)"
+	depends on KERNEL_MODE_NEON
+	select CRYPTO_NHPOLY1305
+
 config CRYPTO_AES_ARM64_BS
	tristate "AES in ECB/CBC/CTR/XTS modes using bit-sliced NEON algorithm"
	depends on KERNEL_MODE_NEON
@@ -50,8 +50,11 @@ sha256-arm64-y := sha256-glue.o sha256-core.o
 obj-$(CONFIG_CRYPTO_SHA512_ARM64) += sha512-arm64.o
 sha512-arm64-y := sha512-glue.o sha512-core.o
 
-obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
-chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o
+obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o
+chacha-neon-y := chacha-neon-core.o chacha-neon-glue.o
+
+obj-$(CONFIG_CRYPTO_NHPOLY1305_NEON) += nhpoly1305-neon.o
+nhpoly1305-neon-y := nh-neon-core.o nhpoly1305-neon-glue.o
 
 obj-$(CONFIG_CRYPTO_AES_ARM64) += aes-arm64.o
 aes-arm64-y := aes-cipher-core.o aes-cipher-glue.o
@@ -1,13 +1,13 @@
 /*
- * ChaCha20 256-bit cipher algorithm, RFC7539, arm64 NEON functions
+ * ChaCha/XChaCha NEON helper functions
  *
- * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
+ * Copyright (C) 2016-2018 Linaro, Ltd. <ard.biesheuvel@linaro.org>
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
  *
- * Based on:
+ * Originally based on:
  * ChaCha20 256-bit cipher algorithm, RFC7539, x64 SSSE3 functions
  *
  * Copyright (C) 2015 Martin Willi

@@ -19,29 +19,27 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/assembler.h>
+#include <asm/cache.h>
 
	.text
	.align		6
 
-ENTRY(chacha20_block_xor_neon)
-	// x0: Input state matrix, s
-	// x1: 1 data block output, o
-	// x2: 1 data block input, i
-
-	//
-	// This function encrypts one ChaCha20 block by loading the state matrix
-	// in four NEON registers. It performs matrix operation on four words in
-	// parallel, but requires shuffling to rearrange the words after each
-	// round.
-	//
-
-	// x0..3 = s0..3
-	adr		x3, ROT8
-	ld1		{v0.4s-v3.4s}, [x0]
-	ld1		{v8.4s-v11.4s}, [x0]
-	ld1		{v12.4s}, [x3]
-
-	mov		x3, #10
+/*
+ * chacha_permute - permute one block
+ *
+ * Permute one 64-byte block where the state matrix is stored in the four NEON
+ * registers v0-v3. It performs matrix operations on four words in parallel,
+ * but requires shuffling to rearrange the words after each round.
+ *
+ * The round count is given in w3.
+ *
+ * Clobbers: w3, x10, v4, v12
+ */
+chacha_permute:
+
+	adr_l		x10, ROT8
+	ld1		{v12.4s}, [x10]
 
 .Ldoubleround:
	// x0 += x1, x3 = rotl32(x3 ^ x0, 16)

@@ -102,9 +100,27 @@ ENTRY(chacha20_block_xor_neon)
	// x3 = shuffle32(x3, MASK(0, 3, 2, 1))
	ext		v3.16b, v3.16b, v3.16b, #4
 
-	subs		x3, x3, #1
+	subs		w3, w3, #2
	b.ne		.Ldoubleround
 
+	ret
+ENDPROC(chacha_permute)
+
+ENTRY(chacha_block_xor_neon)
+	// x0: Input state matrix, s
+	// x1: 1 data block output, o
+	// x2: 1 data block input, i
+	// w3: nrounds
+
+	stp		x29, x30, [sp, #-16]!
+	mov		x29, sp
+
+	// x0..3 = s0..3
+	ld1		{v0.4s-v3.4s}, [x0]
+	ld1		{v8.4s-v11.4s}, [x0]
+
+	bl		chacha_permute
+
	ld1		{v4.16b-v7.16b}, [x2]
 
	// o0 = i0 ^ (x0 + s0)

@@ -125,71 +141,156 @@ ENTRY(chacha20_block_xor_neon)
 
	st1		{v0.16b-v3.16b}, [x1]
 
+	ldp		x29, x30, [sp], #16
	ret
-ENDPROC(chacha20_block_xor_neon)
+ENDPROC(chacha_block_xor_neon)
+
+ENTRY(hchacha_block_neon)
+	// x0: Input state matrix, s
+	// x1: output (8 32-bit words)
+	// w2: nrounds
+
+	stp		x29, x30, [sp, #-16]!
+	mov		x29, sp
+
+	ld1		{v0.4s-v3.4s}, [x0]
+
+	mov		w3, w2
+	bl		chacha_permute
+
+	st1		{v0.16b}, [x1], #16
+	st1		{v3.16b}, [x1]
+
+	ldp		x29, x30, [sp], #16
+	ret
+ENDPROC(hchacha_block_neon)
+
+	a0		.req	w12
+	a1		.req	w13
+	a2		.req	w14
+	a3		.req	w15
+	a4		.req	w16
+	a5		.req	w17
+	a6		.req	w19
+	a7		.req	w20
+	a8		.req	w21
+	a9		.req	w22
+	a10		.req	w23
+	a11		.req	w24
+	a12		.req	w25
+	a13		.req	w26
+	a14		.req	w27
+	a15		.req	w28
 
	.align		6
-ENTRY(chacha20_4block_xor_neon)
+ENTRY(chacha_4block_xor_neon)
+	frame_push	10
+
	// x0: Input state matrix, s
	// x1: 4 data blocks output, o
	// x2: 4 data blocks input, i
+	// w3: nrounds
+	// x4: byte count
+
+	adr_l		x10, .Lpermute
+	and		x5, x4, #63
+	add		x10, x10, x5
+	add		x11, x10, #64
 
	//
-	// This function encrypts four consecutive ChaCha20 blocks by loading
+	// This function encrypts four consecutive ChaCha blocks by loading
	// the state matrix in NEON registers four times. The algorithm performs
	// each operation on the corresponding word of each state matrix, hence
	// requires no word shuffling. For final XORing step we transpose the
	// matrix by interleaving 32- and then 64-bit words, which allows us to
	// do XOR in NEON registers.
	//
-	adr		x3, CTRINC		// ... and ROT8
-	ld1		{v30.4s-v31.4s}, [x3]
+	// At the same time, a fifth block is encrypted in parallel using
+	// scalar registers
+	//
+	adr_l		x9, CTRINC		// ... and ROT8
+	ld1		{v30.4s-v31.4s}, [x9]
 
	// x0..15[0-3] = s0..3[0..3]
-	mov		x4, x0
-	ld4r		{ v0.4s- v3.4s}, [x4], #16
-	ld4r		{ v4.4s- v7.4s}, [x4], #16
-	ld4r		{ v8.4s-v11.4s}, [x4], #16
-	ld4r		{v12.4s-v15.4s}, [x4]
-
-	// x12 += counter values 0-3
+	add		x8, x0, #16
+	ld4r		{ v0.4s- v3.4s}, [x0]
+	ld4r		{ v4.4s- v7.4s}, [x8], #16
+	ld4r		{ v8.4s-v11.4s}, [x8], #16
+	ld4r		{v12.4s-v15.4s}, [x8]
+
+	mov		a0, v0.s[0]
+	mov		a1, v1.s[0]
+	mov		a2, v2.s[0]
+	mov		a3, v3.s[0]
+	mov		a4, v4.s[0]
+	mov		a5, v5.s[0]
+	mov		a6, v6.s[0]
+	mov		a7, v7.s[0]
+	mov		a8, v8.s[0]
+	mov		a9, v9.s[0]
+	mov		a10, v10.s[0]
+	mov		a11, v11.s[0]
+	mov		a12, v12.s[0]
+	mov		a13, v13.s[0]
+	mov		a14, v14.s[0]
+	mov		a15, v15.s[0]
+
+	// x12 += counter values 1-4
	add		v12.4s, v12.4s, v30.4s
 
-	mov		x3, #10
-
 .Ldoubleround4:
	// x0 += x4, x12 = rotl32(x12 ^ x0, 16)
	// x1 += x5, x13 = rotl32(x13 ^ x1, 16)
	// x2 += x6, x14 = rotl32(x14 ^ x2, 16)
	// x3 += x7, x15 = rotl32(x15 ^ x3, 16)
	add		v0.4s, v0.4s, v4.4s
+	add		a0, a0, a4
	add		v1.4s, v1.4s, v5.4s
+	add		a1, a1, a5
	add		v2.4s, v2.4s, v6.4s
+	add		a2, a2, a6
	add		v3.4s, v3.4s, v7.4s
+	add		a3, a3, a7
 
	eor		v12.16b, v12.16b, v0.16b
+	eor		a12, a12, a0
	eor		v13.16b, v13.16b, v1.16b
+	eor		a13, a13, a1
	eor		v14.16b, v14.16b, v2.16b
+	eor		a14, a14, a2
	eor		v15.16b, v15.16b, v3.16b
+	eor		a15, a15, a3
 
	rev32		v12.8h, v12.8h
+	ror		a12, a12, #16
	rev32		v13.8h, v13.8h
+	ror		a13, a13, #16
	rev32		v14.8h, v14.8h
+	ror		a14, a14, #16
	rev32		v15.8h, v15.8h
+	ror		a15, a15, #16
 
	// x8 += x12, x4 = rotl32(x4 ^ x8, 12)
	// x9 += x13, x5 = rotl32(x5 ^ x9, 12)
	// x10 += x14, x6 = rotl32(x6 ^ x10, 12)
	// x11 += x15, x7 = rotl32(x7 ^ x11, 12)
	add		v8.4s, v8.4s, v12.4s
+	add		a8, a8, a12
	add		v9.4s, v9.4s, v13.4s
+	add		a9, a9, a13
	add		v10.4s, v10.4s, v14.4s
+	add		a10, a10, a14
	add		v11.4s, v11.4s, v15.4s
+	add		a11, a11, a15
 
	eor		v16.16b, v4.16b, v8.16b
+	eor		a4, a4, a8
	eor		v17.16b, v5.16b, v9.16b
+	eor		a5, a5, a9
	eor		v18.16b, v6.16b, v10.16b
+	eor		a6, a6, a10
	eor		v19.16b, v7.16b, v11.16b
+	eor		a7, a7, a11
 
	shl		v4.4s, v16.4s, #12
	shl		v5.4s, v17.4s, #12

@@ -197,42 +298,66 @@ ENTRY(chacha20_4block_xor_neon)
	shl		v7.4s, v19.4s, #12
 
	sri		v4.4s, v16.4s, #20
+	ror		a4, a4, #20
	sri		v5.4s, v17.4s, #20
+	ror		a5, a5, #20
	sri		v6.4s, v18.4s, #20
+	ror		a6, a6, #20
	sri		v7.4s, v19.4s, #20
+	ror		a7, a7, #20
 
	// x0 += x4, x12 = rotl32(x12 ^ x0, 8)
	// x1 += x5, x13 = rotl32(x13 ^ x1, 8)
	// x2 += x6, x14 = rotl32(x14 ^ x2, 8)
	// x3 += x7, x15 = rotl32(x15 ^ x3, 8)
	add		v0.4s, v0.4s, v4.4s
+	add		a0, a0, a4
	add		v1.4s, v1.4s, v5.4s
+	add		a1, a1, a5
	add		v2.4s, v2.4s, v6.4s
+	add		a2, a2, a6
	add		v3.4s, v3.4s, v7.4s
+	add		a3, a3, a7
 
	eor		v12.16b, v12.16b, v0.16b
+	eor		a12, a12, a0
	eor		v13.16b, v13.16b, v1.16b
+	eor		a13, a13, a1
	eor		v14.16b, v14.16b, v2.16b
+	eor		a14, a14, a2
	eor		v15.16b, v15.16b, v3.16b
+	eor		a15, a15, a3
 
	tbl		v12.16b, {v12.16b}, v31.16b
+	ror		a12, a12, #24
	tbl		v13.16b, {v13.16b}, v31.16b
+	ror		a13, a13, #24
	tbl		v14.16b, {v14.16b}, v31.16b
+	ror		a14, a14, #24
	tbl		v15.16b, {v15.16b}, v31.16b
+	ror		a15, a15, #24
 
	// x8 += x12, x4 = rotl32(x4 ^ x8, 7)
	// x9 += x13, x5 = rotl32(x5 ^ x9, 7)
	// x10 += x14, x6 = rotl32(x6 ^ x10, 7)
	// x11 += x15, x7 = rotl32(x7 ^ x11, 7)
	add		v8.4s, v8.4s, v12.4s
+	add		a8, a8, a12
	add		v9.4s, v9.4s, v13.4s
+	add		a9, a9, a13
	add		v10.4s, v10.4s, v14.4s
+	add		a10, a10, a14
	add		v11.4s, v11.4s, v15.4s
+	add		a11, a11, a15
 
	eor		v16.16b, v4.16b, v8.16b
+	eor		a4, a4, a8
	eor		v17.16b, v5.16b, v9.16b
+	eor		a5, a5, a9
	eor		v18.16b, v6.16b, v10.16b
+	eor		a6, a6, a10
	eor		v19.16b, v7.16b, v11.16b
+	eor		a7, a7, a11
 
	shl		v4.4s, v16.4s, #7
	shl		v5.4s, v17.4s, #7

@@ -240,42 +365,66 @@ ENTRY(chacha20_4block_xor_neon)
	shl		v7.4s, v19.4s, #7
 
	sri		v4.4s, v16.4s, #25
+	ror		a4, a4, #25
	sri		v5.4s, v17.4s, #25
+	ror		a5, a5, #25
	sri		v6.4s, v18.4s, #25
+	ror		a6, a6, #25
	sri		v7.4s, v19.4s, #25
+	ror		a7, a7, #25
 
	// x0 += x5, x15 = rotl32(x15 ^ x0, 16)
	// x1 += x6, x12 = rotl32(x12 ^ x1, 16)
|
// x1 += x6, x12 = rotl32(x12 ^ x1, 16)
|
||||||
// x2 += x7, x13 = rotl32(x13 ^ x2, 16)
|
// x2 += x7, x13 = rotl32(x13 ^ x2, 16)
|
||||||
// x3 += x4, x14 = rotl32(x14 ^ x3, 16)
|
// x3 += x4, x14 = rotl32(x14 ^ x3, 16)
|
||||||
add v0.4s, v0.4s, v5.4s
|
add v0.4s, v0.4s, v5.4s
|
||||||
|
add a0, a0, a5
|
||||||
add v1.4s, v1.4s, v6.4s
|
add v1.4s, v1.4s, v6.4s
|
||||||
|
add a1, a1, a6
|
||||||
add v2.4s, v2.4s, v7.4s
|
add v2.4s, v2.4s, v7.4s
|
||||||
|
add a2, a2, a7
|
||||||
add v3.4s, v3.4s, v4.4s
|
add v3.4s, v3.4s, v4.4s
|
||||||
|
add a3, a3, a4
|
||||||
|
|
||||||
eor v15.16b, v15.16b, v0.16b
|
eor v15.16b, v15.16b, v0.16b
|
||||||
|
eor a15, a15, a0
|
||||||
eor v12.16b, v12.16b, v1.16b
|
eor v12.16b, v12.16b, v1.16b
|
||||||
|
eor a12, a12, a1
|
||||||
eor v13.16b, v13.16b, v2.16b
|
eor v13.16b, v13.16b, v2.16b
|
||||||
|
eor a13, a13, a2
|
||||||
eor v14.16b, v14.16b, v3.16b
|
eor v14.16b, v14.16b, v3.16b
|
||||||
|
eor a14, a14, a3
|
||||||
|
|
||||||
rev32 v15.8h, v15.8h
|
rev32 v15.8h, v15.8h
|
||||||
|
ror a15, a15, #16
|
||||||
rev32 v12.8h, v12.8h
|
rev32 v12.8h, v12.8h
|
||||||
|
ror a12, a12, #16
|
||||||
rev32 v13.8h, v13.8h
|
rev32 v13.8h, v13.8h
|
||||||
|
ror a13, a13, #16
|
||||||
rev32 v14.8h, v14.8h
|
rev32 v14.8h, v14.8h
|
||||||
|
ror a14, a14, #16
|
||||||
|
|
||||||
// x10 += x15, x5 = rotl32(x5 ^ x10, 12)
|
// x10 += x15, x5 = rotl32(x5 ^ x10, 12)
|
||||||
// x11 += x12, x6 = rotl32(x6 ^ x11, 12)
|
// x11 += x12, x6 = rotl32(x6 ^ x11, 12)
|
||||||
// x8 += x13, x7 = rotl32(x7 ^ x8, 12)
|
// x8 += x13, x7 = rotl32(x7 ^ x8, 12)
|
||||||
// x9 += x14, x4 = rotl32(x4 ^ x9, 12)
|
// x9 += x14, x4 = rotl32(x4 ^ x9, 12)
|
||||||
add v10.4s, v10.4s, v15.4s
|
add v10.4s, v10.4s, v15.4s
|
||||||
|
add a10, a10, a15
|
||||||
add v11.4s, v11.4s, v12.4s
|
add v11.4s, v11.4s, v12.4s
|
||||||
|
add a11, a11, a12
|
||||||
add v8.4s, v8.4s, v13.4s
|
add v8.4s, v8.4s, v13.4s
|
||||||
|
add a8, a8, a13
|
||||||
add v9.4s, v9.4s, v14.4s
|
add v9.4s, v9.4s, v14.4s
|
||||||
|
add a9, a9, a14
|
||||||
|
|
||||||
eor v16.16b, v5.16b, v10.16b
|
eor v16.16b, v5.16b, v10.16b
|
||||||
|
eor a5, a5, a10
|
||||||
eor v17.16b, v6.16b, v11.16b
|
eor v17.16b, v6.16b, v11.16b
|
||||||
|
eor a6, a6, a11
|
||||||
eor v18.16b, v7.16b, v8.16b
|
eor v18.16b, v7.16b, v8.16b
|
||||||
|
eor a7, a7, a8
|
||||||
eor v19.16b, v4.16b, v9.16b
|
eor v19.16b, v4.16b, v9.16b
|
||||||
|
eor a4, a4, a9
|
||||||
|
|
||||||
shl v5.4s, v16.4s, #12
|
shl v5.4s, v16.4s, #12
|
||||||
shl v6.4s, v17.4s, #12
|
shl v6.4s, v17.4s, #12
|
||||||
|
@ -283,42 +432,66 @@ ENTRY(chacha20_4block_xor_neon)
|
||||||
shl v4.4s, v19.4s, #12
|
shl v4.4s, v19.4s, #12
|
||||||
|
|
||||||
sri v5.4s, v16.4s, #20
|
sri v5.4s, v16.4s, #20
|
||||||
|
ror a5, a5, #20
|
||||||
sri v6.4s, v17.4s, #20
|
sri v6.4s, v17.4s, #20
|
||||||
|
ror a6, a6, #20
|
||||||
sri v7.4s, v18.4s, #20
|
sri v7.4s, v18.4s, #20
|
||||||
|
ror a7, a7, #20
|
||||||
sri v4.4s, v19.4s, #20
|
sri v4.4s, v19.4s, #20
|
||||||
|
ror a4, a4, #20
|
||||||
|
|
||||||
// x0 += x5, x15 = rotl32(x15 ^ x0, 8)
|
// x0 += x5, x15 = rotl32(x15 ^ x0, 8)
|
||||||
// x1 += x6, x12 = rotl32(x12 ^ x1, 8)
|
// x1 += x6, x12 = rotl32(x12 ^ x1, 8)
|
||||||
// x2 += x7, x13 = rotl32(x13 ^ x2, 8)
|
// x2 += x7, x13 = rotl32(x13 ^ x2, 8)
|
||||||
// x3 += x4, x14 = rotl32(x14 ^ x3, 8)
|
// x3 += x4, x14 = rotl32(x14 ^ x3, 8)
|
||||||
add v0.4s, v0.4s, v5.4s
|
add v0.4s, v0.4s, v5.4s
|
||||||
|
add a0, a0, a5
|
||||||
add v1.4s, v1.4s, v6.4s
|
add v1.4s, v1.4s, v6.4s
|
||||||
|
add a1, a1, a6
|
||||||
add v2.4s, v2.4s, v7.4s
|
add v2.4s, v2.4s, v7.4s
|
||||||
|
add a2, a2, a7
|
||||||
add v3.4s, v3.4s, v4.4s
|
add v3.4s, v3.4s, v4.4s
|
||||||
|
add a3, a3, a4
|
||||||
|
|
||||||
eor v15.16b, v15.16b, v0.16b
|
eor v15.16b, v15.16b, v0.16b
|
||||||
|
eor a15, a15, a0
|
||||||
eor v12.16b, v12.16b, v1.16b
|
eor v12.16b, v12.16b, v1.16b
|
||||||
|
eor a12, a12, a1
|
||||||
eor v13.16b, v13.16b, v2.16b
|
eor v13.16b, v13.16b, v2.16b
|
||||||
|
eor a13, a13, a2
|
||||||
eor v14.16b, v14.16b, v3.16b
|
eor v14.16b, v14.16b, v3.16b
|
||||||
|
eor a14, a14, a3
|
||||||
|
|
||||||
tbl v15.16b, {v15.16b}, v31.16b
|
tbl v15.16b, {v15.16b}, v31.16b
|
||||||
|
ror a15, a15, #24
|
||||||
tbl v12.16b, {v12.16b}, v31.16b
|
tbl v12.16b, {v12.16b}, v31.16b
|
||||||
|
ror a12, a12, #24
|
||||||
tbl v13.16b, {v13.16b}, v31.16b
|
tbl v13.16b, {v13.16b}, v31.16b
|
||||||
|
ror a13, a13, #24
|
||||||
tbl v14.16b, {v14.16b}, v31.16b
|
tbl v14.16b, {v14.16b}, v31.16b
|
||||||
|
ror a14, a14, #24
|
||||||
|
|
||||||
// x10 += x15, x5 = rotl32(x5 ^ x10, 7)
|
// x10 += x15, x5 = rotl32(x5 ^ x10, 7)
|
||||||
// x11 += x12, x6 = rotl32(x6 ^ x11, 7)
|
// x11 += x12, x6 = rotl32(x6 ^ x11, 7)
|
||||||
// x8 += x13, x7 = rotl32(x7 ^ x8, 7)
|
// x8 += x13, x7 = rotl32(x7 ^ x8, 7)
|
||||||
// x9 += x14, x4 = rotl32(x4 ^ x9, 7)
|
// x9 += x14, x4 = rotl32(x4 ^ x9, 7)
|
||||||
add v10.4s, v10.4s, v15.4s
|
add v10.4s, v10.4s, v15.4s
|
||||||
|
add a10, a10, a15
|
||||||
add v11.4s, v11.4s, v12.4s
|
add v11.4s, v11.4s, v12.4s
|
||||||
|
add a11, a11, a12
|
||||||
add v8.4s, v8.4s, v13.4s
|
add v8.4s, v8.4s, v13.4s
|
||||||
|
add a8, a8, a13
|
||||||
add v9.4s, v9.4s, v14.4s
|
add v9.4s, v9.4s, v14.4s
|
||||||
|
add a9, a9, a14
|
||||||
|
|
||||||
eor v16.16b, v5.16b, v10.16b
|
eor v16.16b, v5.16b, v10.16b
|
||||||
|
eor a5, a5, a10
|
||||||
eor v17.16b, v6.16b, v11.16b
|
eor v17.16b, v6.16b, v11.16b
|
||||||
|
eor a6, a6, a11
|
||||||
eor v18.16b, v7.16b, v8.16b
|
eor v18.16b, v7.16b, v8.16b
|
||||||
|
eor a7, a7, a8
|
||||||
eor v19.16b, v4.16b, v9.16b
|
eor v19.16b, v4.16b, v9.16b
|
||||||
|
eor a4, a4, a9
|
||||||
|
|
||||||
shl v5.4s, v16.4s, #7
|
shl v5.4s, v16.4s, #7
|
||||||
shl v6.4s, v17.4s, #7
|
shl v6.4s, v17.4s, #7
|
||||||
|
@ -326,11 +499,15 @@ ENTRY(chacha20_4block_xor_neon)
|
||||||
shl v4.4s, v19.4s, #7
|
shl v4.4s, v19.4s, #7
|
||||||
|
|
||||||
sri v5.4s, v16.4s, #25
|
sri v5.4s, v16.4s, #25
|
||||||
|
ror a5, a5, #25
|
||||||
sri v6.4s, v17.4s, #25
|
sri v6.4s, v17.4s, #25
|
||||||
|
ror a6, a6, #25
|
||||||
sri v7.4s, v18.4s, #25
|
sri v7.4s, v18.4s, #25
|
||||||
|
ror a7, a7, #25
|
||||||
sri v4.4s, v19.4s, #25
|
sri v4.4s, v19.4s, #25
|
||||||
|
ror a4, a4, #25
|
||||||
|
|
||||||
subs x3, x3, #1
|
subs w3, w3, #2
|
||||||
b.ne .Ldoubleround4
|
b.ne .Ldoubleround4
|
||||||
|
|
||||||
ld4r {v16.4s-v19.4s}, [x0], #16
|
ld4r {v16.4s-v19.4s}, [x0], #16
|
||||||
|
@ -344,9 +521,17 @@ ENTRY(chacha20_4block_xor_neon)
|
||||||
// x2[0-3] += s0[2]
|
// x2[0-3] += s0[2]
|
||||||
// x3[0-3] += s0[3]
|
// x3[0-3] += s0[3]
|
||||||
add v0.4s, v0.4s, v16.4s
|
add v0.4s, v0.4s, v16.4s
|
||||||
|
mov w6, v16.s[0]
|
||||||
|
mov w7, v17.s[0]
|
||||||
add v1.4s, v1.4s, v17.4s
|
add v1.4s, v1.4s, v17.4s
|
||||||
|
mov w8, v18.s[0]
|
||||||
|
mov w9, v19.s[0]
|
||||||
add v2.4s, v2.4s, v18.4s
|
add v2.4s, v2.4s, v18.4s
|
||||||
|
add a0, a0, w6
|
||||||
|
add a1, a1, w7
|
||||||
add v3.4s, v3.4s, v19.4s
|
add v3.4s, v3.4s, v19.4s
|
||||||
|
add a2, a2, w8
|
||||||
|
add a3, a3, w9
|
||||||
|
|
||||||
ld4r {v24.4s-v27.4s}, [x0], #16
|
ld4r {v24.4s-v27.4s}, [x0], #16
|
||||||
ld4r {v28.4s-v31.4s}, [x0]
|
ld4r {v28.4s-v31.4s}, [x0]
|
||||||
|
@ -356,95 +541,304 @@ ENTRY(chacha20_4block_xor_neon)
|
||||||
// x6[0-3] += s1[2]
|
// x6[0-3] += s1[2]
|
||||||
// x7[0-3] += s1[3]
|
// x7[0-3] += s1[3]
|
||||||
add v4.4s, v4.4s, v20.4s
|
add v4.4s, v4.4s, v20.4s
|
||||||
|
mov w6, v20.s[0]
|
||||||
|
mov w7, v21.s[0]
|
||||||
add v5.4s, v5.4s, v21.4s
|
add v5.4s, v5.4s, v21.4s
|
||||||
|
mov w8, v22.s[0]
|
||||||
|
mov w9, v23.s[0]
|
||||||
add v6.4s, v6.4s, v22.4s
|
add v6.4s, v6.4s, v22.4s
|
||||||
|
add a4, a4, w6
|
||||||
|
add a5, a5, w7
|
||||||
add v7.4s, v7.4s, v23.4s
|
add v7.4s, v7.4s, v23.4s
|
||||||
|
add a6, a6, w8
|
||||||
|
add a7, a7, w9
|
||||||
|
|
||||||
// x8[0-3] += s2[0]
|
// x8[0-3] += s2[0]
|
||||||
// x9[0-3] += s2[1]
|
// x9[0-3] += s2[1]
|
||||||
// x10[0-3] += s2[2]
|
// x10[0-3] += s2[2]
|
||||||
// x11[0-3] += s2[3]
|
// x11[0-3] += s2[3]
|
||||||
add v8.4s, v8.4s, v24.4s
|
add v8.4s, v8.4s, v24.4s
|
||||||
|
mov w6, v24.s[0]
|
||||||
|
mov w7, v25.s[0]
|
||||||
add v9.4s, v9.4s, v25.4s
|
add v9.4s, v9.4s, v25.4s
|
||||||
|
mov w8, v26.s[0]
|
||||||
|
mov w9, v27.s[0]
|
||||||
add v10.4s, v10.4s, v26.4s
|
add v10.4s, v10.4s, v26.4s
|
||||||
|
add a8, a8, w6
|
||||||
|
add a9, a9, w7
|
||||||
add v11.4s, v11.4s, v27.4s
|
add v11.4s, v11.4s, v27.4s
|
||||||
|
add a10, a10, w8
|
||||||
|
add a11, a11, w9
|
||||||
|
|
||||||
// x12[0-3] += s3[0]
|
// x12[0-3] += s3[0]
|
||||||
// x13[0-3] += s3[1]
|
// x13[0-3] += s3[1]
|
||||||
// x14[0-3] += s3[2]
|
// x14[0-3] += s3[2]
|
||||||
// x15[0-3] += s3[3]
|
// x15[0-3] += s3[3]
|
||||||
add v12.4s, v12.4s, v28.4s
|
add v12.4s, v12.4s, v28.4s
|
||||||
|
mov w6, v28.s[0]
|
||||||
|
mov w7, v29.s[0]
|
||||||
add v13.4s, v13.4s, v29.4s
|
add v13.4s, v13.4s, v29.4s
|
||||||
|
mov w8, v30.s[0]
|
||||||
|
mov w9, v31.s[0]
|
||||||
add v14.4s, v14.4s, v30.4s
|
add v14.4s, v14.4s, v30.4s
|
||||||
|
add a12, a12, w6
|
||||||
|
add a13, a13, w7
|
||||||
add v15.4s, v15.4s, v31.4s
|
add v15.4s, v15.4s, v31.4s
|
||||||
|
add a14, a14, w8
|
||||||
|
add a15, a15, w9
|
||||||
|
|
||||||
// interleave 32-bit words in state n, n+1
|
// interleave 32-bit words in state n, n+1
|
||||||
|
ldp w6, w7, [x2], #64
|
||||||
zip1 v16.4s, v0.4s, v1.4s
|
zip1 v16.4s, v0.4s, v1.4s
|
||||||
|
ldp w8, w9, [x2, #-56]
|
||||||
|
eor a0, a0, w6
|
||||||
zip2 v17.4s, v0.4s, v1.4s
|
zip2 v17.4s, v0.4s, v1.4s
|
||||||
|
eor a1, a1, w7
|
||||||
zip1 v18.4s, v2.4s, v3.4s
|
zip1 v18.4s, v2.4s, v3.4s
|
||||||
|
eor a2, a2, w8
|
||||||
zip2 v19.4s, v2.4s, v3.4s
|
zip2 v19.4s, v2.4s, v3.4s
|
||||||
|
eor a3, a3, w9
|
||||||
|
ldp w6, w7, [x2, #-48]
|
||||||
zip1 v20.4s, v4.4s, v5.4s
|
zip1 v20.4s, v4.4s, v5.4s
|
||||||
|
ldp w8, w9, [x2, #-40]
|
||||||
|
eor a4, a4, w6
|
||||||
zip2 v21.4s, v4.4s, v5.4s
|
zip2 v21.4s, v4.4s, v5.4s
|
||||||
|
eor a5, a5, w7
|
||||||
zip1 v22.4s, v6.4s, v7.4s
|
zip1 v22.4s, v6.4s, v7.4s
|
||||||
|
eor a6, a6, w8
|
||||||
zip2 v23.4s, v6.4s, v7.4s
|
zip2 v23.4s, v6.4s, v7.4s
|
||||||
|
eor a7, a7, w9
|
||||||
|
ldp w6, w7, [x2, #-32]
|
||||||
zip1 v24.4s, v8.4s, v9.4s
|
zip1 v24.4s, v8.4s, v9.4s
|
||||||
|
ldp w8, w9, [x2, #-24]
|
||||||
|
eor a8, a8, w6
|
||||||
zip2 v25.4s, v8.4s, v9.4s
|
zip2 v25.4s, v8.4s, v9.4s
|
||||||
|
eor a9, a9, w7
|
||||||
zip1 v26.4s, v10.4s, v11.4s
|
zip1 v26.4s, v10.4s, v11.4s
|
||||||
|
eor a10, a10, w8
|
||||||
zip2 v27.4s, v10.4s, v11.4s
|
zip2 v27.4s, v10.4s, v11.4s
|
||||||
|
eor a11, a11, w9
|
||||||
|
ldp w6, w7, [x2, #-16]
|
||||||
zip1 v28.4s, v12.4s, v13.4s
|
zip1 v28.4s, v12.4s, v13.4s
|
||||||
|
ldp w8, w9, [x2, #-8]
|
||||||
|
eor a12, a12, w6
|
||||||
zip2 v29.4s, v12.4s, v13.4s
|
zip2 v29.4s, v12.4s, v13.4s
|
||||||
|
eor a13, a13, w7
|
||||||
zip1 v30.4s, v14.4s, v15.4s
|
zip1 v30.4s, v14.4s, v15.4s
|
||||||
|
eor a14, a14, w8
|
||||||
zip2 v31.4s, v14.4s, v15.4s
|
zip2 v31.4s, v14.4s, v15.4s
|
||||||
|
eor a15, a15, w9
|
||||||
|
|
||||||
|
mov x3, #64
|
||||||
|
subs x5, x4, #128
|
||||||
|
add x6, x5, x2
|
||||||
|
csel x3, x3, xzr, ge
|
||||||
|
csel x2, x2, x6, ge
|
||||||
|
|
||||||
// interleave 64-bit words in state n, n+2
|
// interleave 64-bit words in state n, n+2
|
||||||
zip1 v0.2d, v16.2d, v18.2d
|
zip1 v0.2d, v16.2d, v18.2d
|
||||||
zip2 v4.2d, v16.2d, v18.2d
|
zip2 v4.2d, v16.2d, v18.2d
|
||||||
|
stp a0, a1, [x1], #64
|
||||||
zip1 v8.2d, v17.2d, v19.2d
|
zip1 v8.2d, v17.2d, v19.2d
|
||||||
zip2 v12.2d, v17.2d, v19.2d
|
zip2 v12.2d, v17.2d, v19.2d
|
||||||
ld1 {v16.16b-v19.16b}, [x2], #64
|
stp a2, a3, [x1, #-56]
|
||||||
|
ld1 {v16.16b-v19.16b}, [x2], x3
|
||||||
|
|
||||||
|
subs x6, x4, #192
|
||||||
|
ccmp x3, xzr, #4, lt
|
||||||
|
add x7, x6, x2
|
||||||
|
csel x3, x3, xzr, eq
|
||||||
|
csel x2, x2, x7, eq
|
||||||
|
|
||||||
zip1 v1.2d, v20.2d, v22.2d
|
zip1 v1.2d, v20.2d, v22.2d
|
||||||
zip2 v5.2d, v20.2d, v22.2d
|
zip2 v5.2d, v20.2d, v22.2d
|
||||||
|
stp a4, a5, [x1, #-48]
|
||||||
zip1 v9.2d, v21.2d, v23.2d
|
zip1 v9.2d, v21.2d, v23.2d
|
||||||
zip2 v13.2d, v21.2d, v23.2d
|
zip2 v13.2d, v21.2d, v23.2d
|
||||||
ld1 {v20.16b-v23.16b}, [x2], #64
|
stp a6, a7, [x1, #-40]
|
||||||
|
ld1 {v20.16b-v23.16b}, [x2], x3
|
||||||
|
|
||||||
|
subs x7, x4, #256
|
||||||
|
ccmp x3, xzr, #4, lt
|
||||||
|
add x8, x7, x2
|
||||||
|
csel x3, x3, xzr, eq
|
||||||
|
csel x2, x2, x8, eq
|
||||||
|
|
||||||
zip1 v2.2d, v24.2d, v26.2d
|
zip1 v2.2d, v24.2d, v26.2d
|
||||||
zip2 v6.2d, v24.2d, v26.2d
|
zip2 v6.2d, v24.2d, v26.2d
|
||||||
|
stp a8, a9, [x1, #-32]
|
||||||
zip1 v10.2d, v25.2d, v27.2d
|
zip1 v10.2d, v25.2d, v27.2d
|
||||||
zip2 v14.2d, v25.2d, v27.2d
|
zip2 v14.2d, v25.2d, v27.2d
|
||||||
ld1 {v24.16b-v27.16b}, [x2], #64
|
stp a10, a11, [x1, #-24]
|
||||||
|
ld1 {v24.16b-v27.16b}, [x2], x3
|
||||||
|
|
||||||
|
subs x8, x4, #320
|
||||||
|
ccmp x3, xzr, #4, lt
|
||||||
|
add x9, x8, x2
|
||||||
|
csel x2, x2, x9, eq
|
||||||
|
|
||||||
zip1 v3.2d, v28.2d, v30.2d
|
zip1 v3.2d, v28.2d, v30.2d
|
||||||
zip2 v7.2d, v28.2d, v30.2d
|
zip2 v7.2d, v28.2d, v30.2d
|
||||||
|
stp a12, a13, [x1, #-16]
|
||||||
zip1 v11.2d, v29.2d, v31.2d
|
zip1 v11.2d, v29.2d, v31.2d
|
||||||
zip2 v15.2d, v29.2d, v31.2d
|
zip2 v15.2d, v29.2d, v31.2d
|
||||||
|
stp a14, a15, [x1, #-8]
|
||||||
ld1 {v28.16b-v31.16b}, [x2]
|
ld1 {v28.16b-v31.16b}, [x2]
|
||||||
|
|
||||||
// xor with corresponding input, write to output
|
// xor with corresponding input, write to output
|
||||||
|
tbnz x5, #63, 0f
|
||||||
eor v16.16b, v16.16b, v0.16b
|
eor v16.16b, v16.16b, v0.16b
|
||||||
eor v17.16b, v17.16b, v1.16b
|
eor v17.16b, v17.16b, v1.16b
|
||||||
eor v18.16b, v18.16b, v2.16b
|
eor v18.16b, v18.16b, v2.16b
|
||||||
eor v19.16b, v19.16b, v3.16b
|
eor v19.16b, v19.16b, v3.16b
|
||||||
|
st1 {v16.16b-v19.16b}, [x1], #64
|
||||||
|
cbz x5, .Lout
|
||||||
|
|
||||||
|
tbnz x6, #63, 1f
|
||||||
eor v20.16b, v20.16b, v4.16b
|
eor v20.16b, v20.16b, v4.16b
|
||||||
eor v21.16b, v21.16b, v5.16b
|
eor v21.16b, v21.16b, v5.16b
|
||||||
st1 {v16.16b-v19.16b}, [x1], #64
|
|
||||||
eor v22.16b, v22.16b, v6.16b
|
eor v22.16b, v22.16b, v6.16b
|
||||||
eor v23.16b, v23.16b, v7.16b
|
eor v23.16b, v23.16b, v7.16b
|
||||||
|
st1 {v20.16b-v23.16b}, [x1], #64
|
||||||
|
cbz x6, .Lout
|
||||||
|
|
||||||
|
tbnz x7, #63, 2f
|
||||||
eor v24.16b, v24.16b, v8.16b
|
eor v24.16b, v24.16b, v8.16b
|
||||||
eor v25.16b, v25.16b, v9.16b
|
eor v25.16b, v25.16b, v9.16b
|
||||||
st1 {v20.16b-v23.16b}, [x1], #64
|
|
||||||
eor v26.16b, v26.16b, v10.16b
|
eor v26.16b, v26.16b, v10.16b
|
||||||
eor v27.16b, v27.16b, v11.16b
|
eor v27.16b, v27.16b, v11.16b
|
||||||
eor v28.16b, v28.16b, v12.16b
|
|
||||||
st1 {v24.16b-v27.16b}, [x1], #64
|
st1 {v24.16b-v27.16b}, [x1], #64
|
||||||
|
cbz x7, .Lout
|
||||||
|
|
||||||
|
tbnz x8, #63, 3f
|
||||||
|
eor v28.16b, v28.16b, v12.16b
|
||||||
eor v29.16b, v29.16b, v13.16b
|
eor v29.16b, v29.16b, v13.16b
|
||||||
eor v30.16b, v30.16b, v14.16b
|
eor v30.16b, v30.16b, v14.16b
|
||||||
eor v31.16b, v31.16b, v15.16b
|
eor v31.16b, v31.16b, v15.16b
|
||||||
st1 {v28.16b-v31.16b}, [x1]
|
st1 {v28.16b-v31.16b}, [x1]
|
||||||
|
|
||||||
|
.Lout: frame_pop
|
||||||
ret
|
ret
|
||||||
ENDPROC(chacha20_4block_xor_neon)
|
|
||||||
|
|
||||||
CTRINC: .word 0, 1, 2, 3
|
// fewer than 128 bytes of in/output
|
||||||
|
0: ld1 {v8.16b}, [x10]
|
||||||
|
ld1 {v9.16b}, [x11]
|
||||||
|
movi v10.16b, #16
|
||||||
|
sub x2, x1, #64
|
||||||
|
add x1, x1, x5
|
||||||
|
ld1 {v16.16b-v19.16b}, [x2]
|
||||||
|
tbl v4.16b, {v0.16b-v3.16b}, v8.16b
|
||||||
|
tbx v20.16b, {v16.16b-v19.16b}, v9.16b
|
||||||
|
add v8.16b, v8.16b, v10.16b
|
||||||
|
add v9.16b, v9.16b, v10.16b
|
||||||
|
tbl v5.16b, {v0.16b-v3.16b}, v8.16b
|
||||||
|
tbx v21.16b, {v16.16b-v19.16b}, v9.16b
|
||||||
|
add v8.16b, v8.16b, v10.16b
|
||||||
|
add v9.16b, v9.16b, v10.16b
|
||||||
|
tbl v6.16b, {v0.16b-v3.16b}, v8.16b
|
||||||
|
tbx v22.16b, {v16.16b-v19.16b}, v9.16b
|
||||||
|
add v8.16b, v8.16b, v10.16b
|
||||||
|
add v9.16b, v9.16b, v10.16b
|
||||||
|
tbl v7.16b, {v0.16b-v3.16b}, v8.16b
|
||||||
|
tbx v23.16b, {v16.16b-v19.16b}, v9.16b
|
||||||
|
|
||||||
|
eor v20.16b, v20.16b, v4.16b
|
||||||
|
eor v21.16b, v21.16b, v5.16b
|
||||||
|
eor v22.16b, v22.16b, v6.16b
|
||||||
|
eor v23.16b, v23.16b, v7.16b
|
||||||
|
st1 {v20.16b-v23.16b}, [x1]
|
||||||
|
b .Lout
|
||||||
|
|
||||||
|
// fewer than 192 bytes of in/output
|
||||||
|
1: ld1 {v8.16b}, [x10]
|
||||||
|
ld1 {v9.16b}, [x11]
|
||||||
|
movi v10.16b, #16
|
||||||
|
add x1, x1, x6
|
||||||
|
tbl v0.16b, {v4.16b-v7.16b}, v8.16b
|
||||||
|
tbx v20.16b, {v16.16b-v19.16b}, v9.16b
|
||||||
|
add v8.16b, v8.16b, v10.16b
|
||||||
|
add v9.16b, v9.16b, v10.16b
|
||||||
|
tbl v1.16b, {v4.16b-v7.16b}, v8.16b
|
||||||
|
tbx v21.16b, {v16.16b-v19.16b}, v9.16b
|
||||||
|
add v8.16b, v8.16b, v10.16b
|
||||||
|
add v9.16b, v9.16b, v10.16b
|
||||||
|
tbl v2.16b, {v4.16b-v7.16b}, v8.16b
|
||||||
|
tbx v22.16b, {v16.16b-v19.16b}, v9.16b
|
||||||
|
add v8.16b, v8.16b, v10.16b
|
||||||
|
add v9.16b, v9.16b, v10.16b
|
||||||
|
tbl v3.16b, {v4.16b-v7.16b}, v8.16b
|
||||||
|
tbx v23.16b, {v16.16b-v19.16b}, v9.16b
|
||||||
|
|
||||||
|
eor v20.16b, v20.16b, v0.16b
|
||||||
|
eor v21.16b, v21.16b, v1.16b
|
||||||
|
eor v22.16b, v22.16b, v2.16b
|
||||||
|
eor v23.16b, v23.16b, v3.16b
|
||||||
|
st1 {v20.16b-v23.16b}, [x1]
|
||||||
|
b .Lout
|
||||||
|
|
||||||
|
// fewer than 256 bytes of in/output
|
||||||
|
2: ld1 {v4.16b}, [x10]
|
||||||
|
ld1 {v5.16b}, [x11]
|
||||||
|
movi v6.16b, #16
|
||||||
|
add x1, x1, x7
|
||||||
|
tbl v0.16b, {v8.16b-v11.16b}, v4.16b
|
||||||
|
tbx v24.16b, {v20.16b-v23.16b}, v5.16b
|
||||||
|
add v4.16b, v4.16b, v6.16b
|
||||||
|
add v5.16b, v5.16b, v6.16b
|
||||||
|
tbl v1.16b, {v8.16b-v11.16b}, v4.16b
|
||||||
|
tbx v25.16b, {v20.16b-v23.16b}, v5.16b
|
||||||
|
add v4.16b, v4.16b, v6.16b
|
||||||
|
add v5.16b, v5.16b, v6.16b
|
||||||
|
tbl v2.16b, {v8.16b-v11.16b}, v4.16b
|
||||||
|
tbx v26.16b, {v20.16b-v23.16b}, v5.16b
|
||||||
|
add v4.16b, v4.16b, v6.16b
|
||||||
|
add v5.16b, v5.16b, v6.16b
|
||||||
|
tbl v3.16b, {v8.16b-v11.16b}, v4.16b
|
||||||
|
tbx v27.16b, {v20.16b-v23.16b}, v5.16b
|
||||||
|
|
||||||
|
eor v24.16b, v24.16b, v0.16b
|
||||||
|
eor v25.16b, v25.16b, v1.16b
|
||||||
|
eor v26.16b, v26.16b, v2.16b
|
||||||
|
eor v27.16b, v27.16b, v3.16b
|
||||||
|
st1 {v24.16b-v27.16b}, [x1]
|
||||||
|
b .Lout
|
||||||
|
|
||||||
|
// fewer than 320 bytes of in/output
|
||||||
|
3: ld1 {v4.16b}, [x10]
|
||||||
|
ld1 {v5.16b}, [x11]
|
||||||
|
movi v6.16b, #16
|
||||||
|
add x1, x1, x8
|
||||||
|
tbl v0.16b, {v12.16b-v15.16b}, v4.16b
|
||||||
|
tbx v28.16b, {v24.16b-v27.16b}, v5.16b
|
||||||
|
add v4.16b, v4.16b, v6.16b
|
||||||
|
add v5.16b, v5.16b, v6.16b
|
||||||
|
tbl v1.16b, {v12.16b-v15.16b}, v4.16b
|
||||||
|
tbx v29.16b, {v24.16b-v27.16b}, v5.16b
|
||||||
|
add v4.16b, v4.16b, v6.16b
|
||||||
|
add v5.16b, v5.16b, v6.16b
|
||||||
|
tbl v2.16b, {v12.16b-v15.16b}, v4.16b
|
||||||
|
tbx v30.16b, {v24.16b-v27.16b}, v5.16b
|
||||||
|
add v4.16b, v4.16b, v6.16b
|
||||||
|
add v5.16b, v5.16b, v6.16b
|
||||||
|
tbl v3.16b, {v12.16b-v15.16b}, v4.16b
|
||||||
|
tbx v31.16b, {v24.16b-v27.16b}, v5.16b
|
||||||
|
|
||||||
|
eor v28.16b, v28.16b, v0.16b
|
||||||
|
eor v29.16b, v29.16b, v1.16b
|
||||||
|
eor v30.16b, v30.16b, v2.16b
|
||||||
|
eor v31.16b, v31.16b, v3.16b
|
||||||
|
st1 {v28.16b-v31.16b}, [x1]
|
||||||
|
b .Lout
|
||||||
|
ENDPROC(chacha_4block_xor_neon)
|
||||||
|
|
||||||
|
.section ".rodata", "a", %progbits
|
||||||
|
.align L1_CACHE_SHIFT
|
||||||
|
.Lpermute:
|
||||||
|
.set .Li, 0
|
||||||
|
.rept 192
|
||||||
|
.byte (.Li - 64)
|
||||||
|
.set .Li, .Li + 1
|
||||||
|
.endr
|
||||||
|
|
||||||
|
CTRINC: .word 1, 2, 3, 4
|
||||||
ROT8: .word 0x02010003, 0x06050407, 0x0a09080b, 0x0e0d0c0f
|
ROT8: .word 0x02010003, 0x06050407, 0x0a09080b, 0x0e0d0c0f
|
|
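[editor's note] The interleaved scalar ror instructions and the vector rev32/tbl tricks in the assembly above all compute the same quarter-round rotation named in the comments. A minimal C reference model (plain ISO C, for illustration only, not part of the patch):

#include <stdint.h>

/* rotl32() as written in the quarter-round comments above. The NEON code
 * specializes it: rotate by 16 is a rev32 on 16-bit lanes, rotate by 8 is
 * a tbl byte shuffle through the ROT8 index vector, and the remaining
 * amounts (7 and 12) use the shl + sri pair. */
static inline uint32_t rotl32(uint32_t x, unsigned int n)
{
	return (x << n) | (x >> (32 - n));
}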
@@ -0,0 +1,198 @@
+/*
+ * ARM NEON accelerated ChaCha and XChaCha stream ciphers,
+ * including ChaCha20 (RFC7539)
+ *
+ * Copyright (C) 2016 - 2017 Linaro, Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Based on:
+ * ChaCha20 256-bit cipher algorithm, RFC7539, SIMD glue code
+ *
+ * Copyright (C) 2015 Martin Willi
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <crypto/algapi.h>
+#include <crypto/chacha.h>
+#include <crypto/internal/skcipher.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+#include <asm/hwcap.h>
+#include <asm/neon.h>
+#include <asm/simd.h>
+
+asmlinkage void chacha_block_xor_neon(u32 *state, u8 *dst, const u8 *src,
+				      int nrounds);
+asmlinkage void chacha_4block_xor_neon(u32 *state, u8 *dst, const u8 *src,
+				       int nrounds, int bytes);
+asmlinkage void hchacha_block_neon(const u32 *state, u32 *out, int nrounds);
+
+static void chacha_doneon(u32 *state, u8 *dst, const u8 *src,
+			  int bytes, int nrounds)
+{
+	while (bytes > 0) {
+		int l = min(bytes, CHACHA_BLOCK_SIZE * 5);
+
+		if (l <= CHACHA_BLOCK_SIZE) {
+			u8 buf[CHACHA_BLOCK_SIZE];
+
+			memcpy(buf, src, l);
+			chacha_block_xor_neon(state, buf, buf, nrounds);
+			memcpy(dst, buf, l);
+			state[12] += 1;
+			break;
+		}
+		chacha_4block_xor_neon(state, dst, src, nrounds, l);
+		bytes -= CHACHA_BLOCK_SIZE * 5;
+		src += CHACHA_BLOCK_SIZE * 5;
+		dst += CHACHA_BLOCK_SIZE * 5;
+		state[12] += 5;
+	}
+}
+
+static int chacha_neon_stream_xor(struct skcipher_request *req,
+				  struct chacha_ctx *ctx, u8 *iv)
+{
+	struct skcipher_walk walk;
+	u32 state[16];
+	int err;
+
+	err = skcipher_walk_virt(&walk, req, false);
+
+	crypto_chacha_init(state, ctx, iv);
+
+	while (walk.nbytes > 0) {
+		unsigned int nbytes = walk.nbytes;
+
+		if (nbytes < walk.total)
+			nbytes = rounddown(nbytes, walk.stride);
+
+		kernel_neon_begin();
+		chacha_doneon(state, walk.dst.virt.addr, walk.src.virt.addr,
+			      nbytes, ctx->nrounds);
+		kernel_neon_end();
+		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
+	}
+
+	return err;
+}
+
+static int chacha_neon(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+	if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
+		return crypto_chacha_crypt(req);
+
+	return chacha_neon_stream_xor(req, ctx, req->iv);
+}
+
+static int xchacha_neon(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct chacha_ctx subctx;
+	u32 state[16];
+	u8 real_iv[16];
+
+	if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
+		return crypto_xchacha_crypt(req);
+
+	crypto_chacha_init(state, ctx, req->iv);
+
+	kernel_neon_begin();
+	hchacha_block_neon(state, subctx.key, ctx->nrounds);
+	kernel_neon_end();
+	subctx.nrounds = ctx->nrounds;
+
+	memcpy(&real_iv[0], req->iv + 24, 8);
+	memcpy(&real_iv[8], req->iv + 16, 8);
+	return chacha_neon_stream_xor(req, &subctx, real_iv);
+}
+
+static struct skcipher_alg algs[] = {
+	{
+		.base.cra_name		= "chacha20",
+		.base.cra_driver_name	= "chacha20-neon",
+		.base.cra_priority	= 300,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= CHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.walksize		= 5 * CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha20_setkey,
+		.encrypt		= chacha_neon,
+		.decrypt		= chacha_neon,
+	}, {
+		.base.cra_name		= "xchacha20",
+		.base.cra_driver_name	= "xchacha20-neon",
+		.base.cra_priority	= 300,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= XCHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.walksize		= 5 * CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha20_setkey,
+		.encrypt		= xchacha_neon,
+		.decrypt		= xchacha_neon,
+	}, {
+		.base.cra_name		= "xchacha12",
+		.base.cra_driver_name	= "xchacha12-neon",
+		.base.cra_priority	= 300,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= XCHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.walksize		= 5 * CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha12_setkey,
+		.encrypt		= xchacha_neon,
+		.decrypt		= xchacha_neon,
+	}
+};
+
+static int __init chacha_simd_mod_init(void)
+{
+	if (!(elf_hwcap & HWCAP_ASIMD))
+		return -ENODEV;
+
+	return crypto_register_skciphers(algs, ARRAY_SIZE(algs));
+}
+
+static void __exit chacha_simd_mod_fini(void)
+{
+	crypto_unregister_skciphers(algs, ARRAY_SIZE(algs));
+}
+
+module_init(chacha_simd_mod_init);
+module_exit(chacha_simd_mod_fini);
+
+MODULE_DESCRIPTION("ChaCha and XChaCha stream ciphers (NEON accelerated)");
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS_CRYPTO("chacha20");
+MODULE_ALIAS_CRYPTO("chacha20-neon");
+MODULE_ALIAS_CRYPTO("xchacha20");
+MODULE_ALIAS_CRYPTO("xchacha20-neon");
+MODULE_ALIAS_CRYPTO("xchacha12");
+MODULE_ALIAS_CRYPTO("xchacha12-neon");
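[editor's note] The two memcpy() calls in xchacha_neon() above encode the 32-byte XChaCha IV layout: bytes 0..15 feed HChaCha to derive the subkey, bytes 16..23 carry the remaining 64 nonce bits, and bytes 24..31 carry the 64-bit stream position. A standalone sketch (plain C; the helper name is illustrative, not from the file):

#include <stdint.h>
#include <string.h>

/* Rebuild the 16-byte ChaCha IV from a 32-byte XChaCha IV, mirroring the
 * copies in xchacha_neon(): the stream position comes first so it lands
 * in the counter words of the ChaCha state. */
static void xchacha_build_real_iv(uint8_t real_iv[16], const uint8_t iv[32])
{
	memcpy(&real_iv[0], iv + 24, 8);	/* stream position */
	memcpy(&real_iv[8], iv + 16, 8);	/* remaining nonce bits */
}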
@@ -1,133 +0,0 @@
-/*
- * ChaCha20 256-bit cipher algorithm, RFC7539, arm64 NEON functions
- *
- * Copyright (C) 2016 - 2017 Linaro, Ltd. <ard.biesheuvel@linaro.org>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * Based on:
- * ChaCha20 256-bit cipher algorithm, RFC7539, SIMD glue code
- *
- * Copyright (C) 2015 Martin Willi
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <crypto/algapi.h>
-#include <crypto/chacha20.h>
-#include <crypto/internal/skcipher.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-
-#include <asm/hwcap.h>
-#include <asm/neon.h>
-#include <asm/simd.h>
-
-asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src);
-asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
-
-static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
-			    unsigned int bytes)
-{
-	u8 buf[CHACHA20_BLOCK_SIZE];
-
-	while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
-		kernel_neon_begin();
-		chacha20_4block_xor_neon(state, dst, src);
-		kernel_neon_end();
-		bytes -= CHACHA20_BLOCK_SIZE * 4;
-		src += CHACHA20_BLOCK_SIZE * 4;
-		dst += CHACHA20_BLOCK_SIZE * 4;
-		state[12] += 4;
-	}
-
-	if (!bytes)
-		return;
-
-	kernel_neon_begin();
-	while (bytes >= CHACHA20_BLOCK_SIZE) {
-		chacha20_block_xor_neon(state, dst, src);
-		bytes -= CHACHA20_BLOCK_SIZE;
-		src += CHACHA20_BLOCK_SIZE;
-		dst += CHACHA20_BLOCK_SIZE;
-		state[12]++;
-	}
-	if (bytes) {
-		memcpy(buf, src, bytes);
-		chacha20_block_xor_neon(state, buf, buf);
-		memcpy(dst, buf, bytes);
-	}
-	kernel_neon_end();
-}
-
-static int chacha20_neon(struct skcipher_request *req)
-{
-	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
-	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
-	struct skcipher_walk walk;
-	u32 state[16];
-	int err;
-
-	if (!may_use_simd() || req->cryptlen <= CHACHA20_BLOCK_SIZE)
-		return crypto_chacha20_crypt(req);
-
-	err = skcipher_walk_virt(&walk, req, false);
-
-	crypto_chacha20_init(state, ctx, walk.iv);
-
-	while (walk.nbytes > 0) {
-		unsigned int nbytes = walk.nbytes;
-
-		if (nbytes < walk.total)
-			nbytes = round_down(nbytes, walk.stride);
-
-		chacha20_doneon(state, walk.dst.virt.addr, walk.src.virt.addr,
-				nbytes);
-		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
-	}
-
-	return err;
-}
-
-static struct skcipher_alg alg = {
-	.base.cra_name		= "chacha20",
-	.base.cra_driver_name	= "chacha20-neon",
-	.base.cra_priority	= 300,
-	.base.cra_blocksize	= 1,
-	.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
-	.base.cra_module	= THIS_MODULE,
-
-	.min_keysize		= CHACHA20_KEY_SIZE,
-	.max_keysize		= CHACHA20_KEY_SIZE,
-	.ivsize			= CHACHA20_IV_SIZE,
-	.chunksize		= CHACHA20_BLOCK_SIZE,
-	.walksize		= 4 * CHACHA20_BLOCK_SIZE,
-	.setkey			= crypto_chacha20_setkey,
-	.encrypt		= chacha20_neon,
-	.decrypt		= chacha20_neon,
-};
-
-static int __init chacha20_simd_mod_init(void)
-{
-	if (!(elf_hwcap & HWCAP_ASIMD))
-		return -ENODEV;
-
-	return crypto_register_skcipher(&alg);
-}
-
-static void __exit chacha20_simd_mod_fini(void)
-{
-	crypto_unregister_skcipher(&alg);
-}
-
-module_init(chacha20_simd_mod_init);
-module_exit(chacha20_simd_mod_fini);
-
-MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
-MODULE_LICENSE("GPL v2");
-MODULE_ALIAS_CRYPTO("chacha20");
@@ -0,0 +1,103 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * NH - ε-almost-universal hash function, ARM64 NEON accelerated version
+ *
+ * Copyright 2018 Google LLC
+ *
+ * Author: Eric Biggers <ebiggers@google.com>
+ */
+
+#include <linux/linkage.h>
+
+	KEY		.req	x0
+	MESSAGE		.req	x1
+	MESSAGE_LEN	.req	x2
+	HASH		.req	x3
+
+	PASS0_SUMS	.req	v0
+	PASS1_SUMS	.req	v1
+	PASS2_SUMS	.req	v2
+	PASS3_SUMS	.req	v3
+	K0		.req	v4
+	K1		.req	v5
+	K2		.req	v6
+	K3		.req	v7
+	T0		.req	v8
+	T1		.req	v9
+	T2		.req	v10
+	T3		.req	v11
+	T4		.req	v12
+	T5		.req	v13
+	T6		.req	v14
+	T7		.req	v15
+
+.macro _nh_stride	k0, k1, k2, k3
+
+	// Load next message stride
+	ld1		{T3.16b}, [MESSAGE], #16
+
+	// Load next key stride
+	ld1		{\k3\().4s}, [KEY], #16
+
+	// Add message words to key words
+	add		T0.4s, T3.4s, \k0\().4s
+	add		T1.4s, T3.4s, \k1\().4s
+	add		T2.4s, T3.4s, \k2\().4s
+	add		T3.4s, T3.4s, \k3\().4s
+
+	// Multiply 32x32 => 64 and accumulate
+	mov		T4.d[0], T0.d[1]
+	mov		T5.d[0], T1.d[1]
+	mov		T6.d[0], T2.d[1]
+	mov		T7.d[0], T3.d[1]
+	umlal		PASS0_SUMS.2d, T0.2s, T4.2s
+	umlal		PASS1_SUMS.2d, T1.2s, T5.2s
+	umlal		PASS2_SUMS.2d, T2.2s, T6.2s
+	umlal		PASS3_SUMS.2d, T3.2s, T7.2s
+.endm
+
+/*
+ * void nh_neon(const u32 *key, const u8 *message, size_t message_len,
+ *		u8 hash[NH_HASH_BYTES])
+ *
+ * It's guaranteed that message_len % 16 == 0.
+ */
+ENTRY(nh_neon)
+
+	ld1		{K0.4s,K1.4s}, [KEY], #32
+	movi		PASS0_SUMS.2d, #0
+	movi		PASS1_SUMS.2d, #0
+	ld1		{K2.4s}, [KEY], #16
+	movi		PASS2_SUMS.2d, #0
+	movi		PASS3_SUMS.2d, #0
+
+	subs		MESSAGE_LEN, MESSAGE_LEN, #64
+	blt		.Lloop4_done
+.Lloop4:
+	_nh_stride	K0, K1, K2, K3
+	_nh_stride	K1, K2, K3, K0
+	_nh_stride	K2, K3, K0, K1
+	_nh_stride	K3, K0, K1, K2
+	subs		MESSAGE_LEN, MESSAGE_LEN, #64
+	bge		.Lloop4
+
+.Lloop4_done:
+	ands		MESSAGE_LEN, MESSAGE_LEN, #63
+	beq		.Ldone
+	_nh_stride	K0, K1, K2, K3
+
+	subs		MESSAGE_LEN, MESSAGE_LEN, #16
+	beq		.Ldone
+	_nh_stride	K1, K2, K3, K0
+
+	subs		MESSAGE_LEN, MESSAGE_LEN, #16
+	beq		.Ldone
+	_nh_stride	K2, K3, K0, K1
+
+.Ldone:
+	// Sum the accumulators for each pass, then store the sums to 'hash'
+	addp		T0.2d, PASS0_SUMS.2d, PASS1_SUMS.2d
+	addp		T1.2d, PASS2_SUMS.2d, PASS3_SUMS.2d
+	st1		{T0.16b,T1.16b}, [HASH]
+	ret
+ENDPROC(nh_neon)
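[editor's note] A scalar model of one NH message stride as performed by _nh_stride above: each pass adds the four message words to that pass's key words, then accumulates two 32x32->64 products (lane 0 with lane 2, lane 1 with lane 3, which is what the umlal on the low/high vector halves computes). This is a sketch of the arithmetic only; the caller is assumed to advance the key pointer by four words per stride, matching the rotating K0..K3 operands:

#include <stdint.h>

/* m: one 16-byte message stride as four u32 words;
 * k: key words for this stride (pass p uses k[4p .. 4p+3]). */
static void nh_stride(uint64_t sums[4], const uint32_t m[4], const uint32_t *k)
{
	for (int p = 0; p < 4; p++) {
		sums[p] += (uint64_t)(uint32_t)(m[0] + k[4 * p + 0]) *
			   (uint32_t)(m[2] + k[4 * p + 2]);
		sums[p] += (uint64_t)(uint32_t)(m[1] + k[4 * p + 1]) *
			   (uint32_t)(m[3] + k[4 * p + 3]);
	}
}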
@@ -0,0 +1,77 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NHPoly1305 - ε-almost-∆-universal hash function for Adiantum
+ * (ARM64 NEON accelerated version)
+ *
+ * Copyright 2018 Google LLC
+ */
+
+#include <asm/neon.h>
+#include <asm/simd.h>
+#include <crypto/internal/hash.h>
+#include <crypto/nhpoly1305.h>
+#include <linux/module.h>
+
+asmlinkage void nh_neon(const u32 *key, const u8 *message, size_t message_len,
+			u8 hash[NH_HASH_BYTES]);
+
+/* wrapper to avoid indirect call to assembly, which doesn't work with CFI */
+static void _nh_neon(const u32 *key, const u8 *message, size_t message_len,
+		     __le64 hash[NH_NUM_PASSES])
+{
+	nh_neon(key, message, message_len, (u8 *)hash);
+}
+
+static int nhpoly1305_neon_update(struct shash_desc *desc,
+				  const u8 *src, unsigned int srclen)
+{
+	if (srclen < 64 || !may_use_simd())
+		return crypto_nhpoly1305_update(desc, src, srclen);
+
+	do {
+		unsigned int n = min_t(unsigned int, srclen, PAGE_SIZE);
+
+		kernel_neon_begin();
+		crypto_nhpoly1305_update_helper(desc, src, n, _nh_neon);
+		kernel_neon_end();
+		src += n;
+		srclen -= n;
+	} while (srclen);
+	return 0;
+}
+
+static struct shash_alg nhpoly1305_alg = {
+	.base.cra_name		= "nhpoly1305",
+	.base.cra_driver_name	= "nhpoly1305-neon",
+	.base.cra_priority	= 200,
+	.base.cra_ctxsize	= sizeof(struct nhpoly1305_key),
+	.base.cra_module	= THIS_MODULE,
+	.digestsize		= POLY1305_DIGEST_SIZE,
+	.init			= crypto_nhpoly1305_init,
+	.update			= nhpoly1305_neon_update,
+	.final			= crypto_nhpoly1305_final,
+	.setkey			= crypto_nhpoly1305_setkey,
+	.descsize		= sizeof(struct nhpoly1305_state),
+};
+
+static int __init nhpoly1305_mod_init(void)
+{
+	if (!(elf_hwcap & HWCAP_ASIMD))
+		return -ENODEV;
+
+	return crypto_register_shash(&nhpoly1305_alg);
+}
+
+static void __exit nhpoly1305_mod_exit(void)
+{
+	crypto_unregister_shash(&nhpoly1305_alg);
+}
+
+module_init(nhpoly1305_mod_init);
+module_exit(nhpoly1305_mod_exit);
+
+MODULE_DESCRIPTION("NHPoly1305 ε-almost-∆-universal hash function (NEON-accelerated)");
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
+MODULE_ALIAS_CRYPTO("nhpoly1305");
+MODULE_ALIAS_CRYPTO("nhpoly1305-neon");
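[editor's note] nhpoly1305_neon_update() above bounds each kernel_neon_begin()/kernel_neon_end() critical section to PAGE_SIZE of input, since preemption is disabled while NEON is in use in the kernel. A sketch of that general pattern, detached from the driver (process() is a placeholder, not a real kernel helper):

/* Process a buffer through a SIMD helper in bounded chunks so that
 * scheduling latency stays small. Kernel-style sketch; assumes
 * <linux/kernel.h> and <asm/neon.h>. */
static void neon_update_chunked(const u8 *src, unsigned int srclen,
				void (*process)(const u8 *src, unsigned int n))
{
	do {
		unsigned int n = min_t(unsigned int, srclen, PAGE_SIZE);

		kernel_neon_begin();
		process(src, n);
		kernel_neon_end();
		src += n;
		srclen -= n;
	} while (srclen);
}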
@@ -137,7 +137,7 @@ static int fallback_init_cip(struct crypto_tfm *tfm)
 	struct s390_aes_ctx *sctx = crypto_tfm_ctx(tfm);
 
 	sctx->fallback.cip = crypto_alloc_cipher(name, 0,
-			CRYPTO_ALG_ASYNC | CRYPTO_ALG_NEED_FALLBACK);
+						 CRYPTO_ALG_NEED_FALLBACK);
 
 	if (IS_ERR(sctx->fallback.cip)) {
 		pr_err("Allocating AES fallback algorithm %s failed\n",
@@ -476,11 +476,6 @@ static bool __init sparc64_has_aes_opcode(void)
 
 static int __init aes_sparc64_mod_init(void)
 {
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(algs); i++)
-		INIT_LIST_HEAD(&algs[i].cra_list);
-
 	if (sparc64_has_aes_opcode()) {
 		pr_info("Using sparc64 aes opcodes optimized AES implementation\n");
 		return crypto_register_algs(algs, ARRAY_SIZE(algs));
@@ -299,11 +299,6 @@ static bool __init sparc64_has_camellia_opcode(void)
 
 static int __init camellia_sparc64_mod_init(void)
 {
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(algs); i++)
-		INIT_LIST_HEAD(&algs[i].cra_list);
-
 	if (sparc64_has_camellia_opcode()) {
 		pr_info("Using sparc64 camellia opcodes optimized CAMELLIA implementation\n");
 		return crypto_register_algs(algs, ARRAY_SIZE(algs));
@@ -510,11 +510,6 @@ static bool __init sparc64_has_des_opcode(void)
 
 static int __init des_sparc64_mod_init(void)
 {
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(algs); i++)
-		INIT_LIST_HEAD(&algs[i].cra_list);
-
 	if (sparc64_has_des_opcode()) {
 		pr_info("Using sparc64 des opcodes optimized DES implementation\n");
 		return crypto_register_algs(algs, ARRAY_SIZE(algs));
@@ -8,6 +8,7 @@ OBJECT_FILES_NON_STANDARD := y
 avx_supported := $(call as-instr,vpxor %xmm0$(comma)%xmm0$(comma)%xmm0,yes,no)
 avx2_supported := $(call as-instr,vpgatherdd %ymm0$(comma)(%eax$(comma)%ymm1\
 				$(comma)4)$(comma)%ymm2,yes,no)
+avx512_supported :=$(call as-instr,vpmovm2b %k1$(comma)%zmm5,yes,no)
 sha1_ni_supported :=$(call as-instr,sha1msg1 %xmm0$(comma)%xmm1,yes,no)
 sha256_ni_supported :=$(call as-instr,sha256msg1 %xmm0$(comma)%xmm1,yes,no)
 
@@ -23,7 +24,7 @@ obj-$(CONFIG_CRYPTO_CAMELLIA_X86_64) += camellia-x86_64.o
 obj-$(CONFIG_CRYPTO_BLOWFISH_X86_64) += blowfish-x86_64.o
 obj-$(CONFIG_CRYPTO_TWOFISH_X86_64) += twofish-x86_64.o
 obj-$(CONFIG_CRYPTO_TWOFISH_X86_64_3WAY) += twofish-x86_64-3way.o
-obj-$(CONFIG_CRYPTO_CHACHA20_X86_64) += chacha20-x86_64.o
+obj-$(CONFIG_CRYPTO_CHACHA20_X86_64) += chacha-x86_64.o
 obj-$(CONFIG_CRYPTO_SERPENT_SSE2_X86_64) += serpent-sse2-x86_64.o
 obj-$(CONFIG_CRYPTO_AES_NI_INTEL) += aesni-intel.o
 obj-$(CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL) += ghash-clmulni-intel.o
@@ -46,6 +47,9 @@ obj-$(CONFIG_CRYPTO_MORUS1280_GLUE) += morus1280_glue.o
 obj-$(CONFIG_CRYPTO_MORUS640_SSE2) += morus640-sse2.o
 obj-$(CONFIG_CRYPTO_MORUS1280_SSE2) += morus1280-sse2.o
 
+obj-$(CONFIG_CRYPTO_NHPOLY1305_SSE2) += nhpoly1305-sse2.o
+obj-$(CONFIG_CRYPTO_NHPOLY1305_AVX2) += nhpoly1305-avx2.o
+
 # These modules require assembler to support AVX.
 ifeq ($(avx_supported),yes)
 	obj-$(CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64) += \
@@ -74,7 +78,7 @@ camellia-x86_64-y := camellia-x86_64-asm_64.o camellia_glue.o
 blowfish-x86_64-y := blowfish-x86_64-asm_64.o blowfish_glue.o
 twofish-x86_64-y := twofish-x86_64-asm_64.o twofish_glue.o
 twofish-x86_64-3way-y := twofish-x86_64-asm_64-3way.o twofish_glue_3way.o
-chacha20-x86_64-y := chacha20-ssse3-x86_64.o chacha20_glue.o
+chacha-x86_64-y := chacha-ssse3-x86_64.o chacha_glue.o
 serpent-sse2-x86_64-y := serpent-sse2-x86_64-asm_64.o serpent_sse2_glue.o
 
 aegis128-aesni-y := aegis128-aesni-asm.o aegis128-aesni-glue.o
@@ -84,6 +88,8 @@ aegis256-aesni-y := aegis256-aesni-asm.o aegis256-aesni-glue.o
 morus640-sse2-y := morus640-sse2-asm.o morus640-sse2-glue.o
 morus1280-sse2-y := morus1280-sse2-asm.o morus1280-sse2-glue.o
 
+nhpoly1305-sse2-y := nh-sse2-x86_64.o nhpoly1305-sse2-glue.o
+
 ifeq ($(avx_supported),yes)
 	camellia-aesni-avx-x86_64-y := camellia-aesni-avx-asm_64.o \
 					camellia_aesni_avx_glue.o
@@ -97,10 +103,16 @@ endif
 
 ifeq ($(avx2_supported),yes)
 	camellia-aesni-avx2-y := camellia-aesni-avx2-asm_64.o camellia_aesni_avx2_glue.o
-	chacha20-x86_64-y += chacha20-avx2-x86_64.o
+	chacha-x86_64-y += chacha-avx2-x86_64.o
 	serpent-avx2-y := serpent-avx2-asm_64.o serpent_avx2_glue.o
 
 	morus1280-avx2-y := morus1280-avx2-asm.o morus1280-avx2-glue.o
+
+	nhpoly1305-avx2-y := nh-avx2-x86_64.o nhpoly1305-avx2-glue.o
+endif
+
+ifeq ($(avx512_supported),yes)
+	chacha-x86_64-y += chacha-avx512vl-x86_64.o
 endif
 
 aesni-intel-y := aesni-intel_asm.o aesni-intel_glue.o
(file diff suppressed because it is too large)
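[editor's note] The hunks below (apparently the x86 AES-NI GCM glue code) drop the per-length static wrappers aesni_gcm_{enc,dec}_avx{,2}() in favour of a struct aesni_gcm_tfm_s function-pointer table, with one instance per implementation flavour (SSE, AVX gen2, AVX2 gen4). A sketch of how such a table is consumed (the wrapper name is illustrative, not from the file):

/* One table is selected once (e.g. at module init, stored in
 * aesni_gcm_tfm); GCM encryption then becomes a straight
 * init/update/finalize sequence through it, matching the signatures
 * declared in the struct below. */
static void gcm_enc_via_tfm(const struct aesni_gcm_tfm_s *tfm, void *ctx,
			    struct gcm_context_data *data, u8 *out,
			    const u8 *in, unsigned long plaintext_len, u8 *iv,
			    u8 *hash_subkey, const u8 *aad,
			    unsigned long aad_len, u8 *auth_tag,
			    unsigned long auth_tag_len)
{
	tfm->init(ctx, data, iv, hash_subkey, aad, aad_len);
	tfm->enc_update(ctx, data, out, in, plaintext_len);
	tfm->finalize(ctx, data, auth_tag, auth_tag_len);
}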
@@ -84,7 +84,7 @@ struct gcm_context_data {
 	u8 current_counter[GCM_BLOCK_LEN];
 	u64 partial_block_len;
 	u64 unused;
-	u8 hash_keys[GCM_BLOCK_LEN * 8];
+	u8 hash_keys[GCM_BLOCK_LEN * 16];
 };
 
 asmlinkage int aesni_set_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
@@ -175,6 +175,32 @@ asmlinkage void aesni_gcm_finalize(void *ctx,
 				   struct gcm_context_data *gdata,
 				   u8 *auth_tag, unsigned long auth_tag_len);
 
+static struct aesni_gcm_tfm_s {
+	void (*init)(void *ctx,
+		     struct gcm_context_data *gdata,
+		     u8 *iv,
+		     u8 *hash_subkey, const u8 *aad,
+		     unsigned long aad_len);
+	void (*enc_update)(void *ctx,
+			   struct gcm_context_data *gdata, u8 *out,
+			   const u8 *in,
+			   unsigned long plaintext_len);
+	void (*dec_update)(void *ctx,
+			   struct gcm_context_data *gdata, u8 *out,
+			   const u8 *in,
+			   unsigned long ciphertext_len);
+	void (*finalize)(void *ctx,
+			 struct gcm_context_data *gdata,
+			 u8 *auth_tag, unsigned long auth_tag_len);
+} *aesni_gcm_tfm;
+
+struct aesni_gcm_tfm_s aesni_gcm_tfm_sse = {
+	.init = &aesni_gcm_init,
+	.enc_update = &aesni_gcm_enc_update,
+	.dec_update = &aesni_gcm_dec_update,
+	.finalize = &aesni_gcm_finalize,
+};
+
 #ifdef CONFIG_AS_AVX
 asmlinkage void aes_ctr_enc_128_avx_by8(const u8 *in, u8 *iv,
 		void *keys, u8 *out, unsigned int num_bytes);
@@ -183,136 +209,94 @@ asmlinkage void aes_ctr_enc_192_avx_by8(const u8 *in, u8 *iv,
 asmlinkage void aes_ctr_enc_256_avx_by8(const u8 *in, u8 *iv,
 		void *keys, u8 *out, unsigned int num_bytes);
 /*
- * asmlinkage void aesni_gcm_precomp_avx_gen2()
+ * asmlinkage void aesni_gcm_init_avx_gen2()
  * gcm_data *my_ctx_data, context data
  * u8 *hash_subkey,  the Hash sub key input. Data starts on a 16-byte boundary.
  */
-asmlinkage void aesni_gcm_precomp_avx_gen2(void *my_ctx_data, u8 *hash_subkey);
+asmlinkage void aesni_gcm_init_avx_gen2(void *my_ctx_data,
+					struct gcm_context_data *gdata,
+					u8 *iv,
+					u8 *hash_subkey,
+					const u8 *aad,
+					unsigned long aad_len);
 
-asmlinkage void aesni_gcm_enc_avx_gen2(void *ctx, u8 *out,
+asmlinkage void aesni_gcm_enc_update_avx_gen2(void *ctx,
+					struct gcm_context_data *gdata, u8 *out,
+					const u8 *in, unsigned long plaintext_len);
+asmlinkage void aesni_gcm_dec_update_avx_gen2(void *ctx,
+					struct gcm_context_data *gdata, u8 *out,
+					const u8 *in,
+					unsigned long ciphertext_len);
+asmlinkage void aesni_gcm_finalize_avx_gen2(void *ctx,
+					struct gcm_context_data *gdata,
+					u8 *auth_tag, unsigned long auth_tag_len);
+
+asmlinkage void aesni_gcm_enc_avx_gen2(void *ctx,
+					struct gcm_context_data *gdata, u8 *out,
 			const u8 *in, unsigned long plaintext_len, u8 *iv,
 			const u8 *aad, unsigned long aad_len,
 			u8 *auth_tag, unsigned long auth_tag_len);
 
-asmlinkage void aesni_gcm_dec_avx_gen2(void *ctx, u8 *out,
+asmlinkage void aesni_gcm_dec_avx_gen2(void *ctx,
+					struct gcm_context_data *gdata, u8 *out,
 			const u8 *in, unsigned long ciphertext_len, u8 *iv,
 			const u8 *aad, unsigned long aad_len,
 			u8 *auth_tag, unsigned long auth_tag_len);
 
-static void aesni_gcm_enc_avx(void *ctx,
-			struct gcm_context_data *data, u8 *out,
-			const u8 *in, unsigned long plaintext_len, u8 *iv,
-			u8 *hash_subkey, const u8 *aad, unsigned long aad_len,
-			u8 *auth_tag, unsigned long auth_tag_len)
-{
-	struct crypto_aes_ctx *aes_ctx = (struct crypto_aes_ctx*)ctx;
-	if ((plaintext_len < AVX_GEN2_OPTSIZE) || (aes_ctx-> key_length != AES_KEYSIZE_128)){
-		aesni_gcm_enc(ctx, data, out, in,
-			plaintext_len, iv, hash_subkey, aad,
-			aad_len, auth_tag, auth_tag_len);
-	} else {
-		aesni_gcm_precomp_avx_gen2(ctx, hash_subkey);
-		aesni_gcm_enc_avx_gen2(ctx, out, in, plaintext_len, iv, aad,
-					aad_len, auth_tag, auth_tag_len);
-	}
-}
+struct aesni_gcm_tfm_s aesni_gcm_tfm_avx_gen2 = {
+	.init = &aesni_gcm_init_avx_gen2,
+	.enc_update = &aesni_gcm_enc_update_avx_gen2,
+	.dec_update = &aesni_gcm_dec_update_avx_gen2,
+	.finalize = &aesni_gcm_finalize_avx_gen2,
+};
 
-static void aesni_gcm_dec_avx(void *ctx,
-			struct gcm_context_data *data, u8 *out,
-			const u8 *in, unsigned long ciphertext_len, u8 *iv,
-			u8 *hash_subkey, const u8 *aad, unsigned long aad_len,
-			u8 *auth_tag, unsigned long auth_tag_len)
-{
-	struct crypto_aes_ctx *aes_ctx = (struct crypto_aes_ctx*)ctx;
-	if ((ciphertext_len < AVX_GEN2_OPTSIZE) || (aes_ctx-> key_length != AES_KEYSIZE_128)) {
-		aesni_gcm_dec(ctx, data, out, in,
-			ciphertext_len, iv, hash_subkey, aad,
-			aad_len, auth_tag, auth_tag_len);
-	} else {
-		aesni_gcm_precomp_avx_gen2(ctx, hash_subkey);
-		aesni_gcm_dec_avx_gen2(ctx, out, in, ciphertext_len, iv, aad,
-					aad_len, auth_tag, auth_tag_len);
-	}
-}
 #endif
 
 #ifdef CONFIG_AS_AVX2
 /*
- * asmlinkage void aesni_gcm_precomp_avx_gen4()
+ * asmlinkage void aesni_gcm_init_avx_gen4()
  * gcm_data *my_ctx_data, context data
  * u8 *hash_subkey, the Hash sub key input. Data starts on a 16-byte boundary.
  */
-asmlinkage void aesni_gcm_precomp_avx_gen4(void *my_ctx_data, u8 *hash_subkey);
+asmlinkage void aesni_gcm_init_avx_gen4(void *my_ctx_data,
+					struct gcm_context_data *gdata,
+					u8 *iv,
+					u8 *hash_subkey,
+					const u8 *aad,
+					unsigned long aad_len);
 
-asmlinkage void aesni_gcm_enc_avx_gen4(void *ctx, u8 *out,
+asmlinkage void aesni_gcm_enc_update_avx_gen4(void *ctx,
+					struct gcm_context_data *gdata, u8 *out,
+					const u8 *in, unsigned long plaintext_len);
+asmlinkage void aesni_gcm_dec_update_avx_gen4(void *ctx,
+					struct gcm_context_data *gdata, u8 *out,
+					const u8 *in,
+					unsigned long ciphertext_len);
+asmlinkage void aesni_gcm_finalize_avx_gen4(void *ctx,
+					struct gcm_context_data *gdata,
+					u8 *auth_tag, unsigned long auth_tag_len);
+
+asmlinkage void aesni_gcm_enc_avx_gen4(void *ctx,
+					struct gcm_context_data *gdata, u8 *out,
 			const u8 *in, unsigned long plaintext_len, u8 *iv,
 			const u8 *aad, unsigned long aad_len,
 			u8 *auth_tag, unsigned long auth_tag_len);
 
-asmlinkage void aesni_gcm_dec_avx_gen4(void *ctx, u8 *out,
+asmlinkage void aesni_gcm_dec_avx_gen4(void *ctx,
+					struct gcm_context_data *gdata, u8 *out,
 			const u8 *in, unsigned long ciphertext_len, u8 *iv,
 			const u8 *aad, unsigned long aad_len,
 			u8 *auth_tag, unsigned long auth_tag_len);
 
-static void aesni_gcm_enc_avx2(void *ctx,
-			struct gcm_context_data *data, u8 *out,
-			const u8 *in, unsigned long plaintext_len, u8 *iv,
-			u8 *hash_subkey, const u8 *aad, unsigned long aad_len,
-			u8 *auth_tag, unsigned long auth_tag_len)
-{
-	struct crypto_aes_ctx *aes_ctx = (struct crypto_aes_ctx*)ctx;
-	if ((plaintext_len < AVX_GEN2_OPTSIZE) || (aes_ctx-> key_length != AES_KEYSIZE_128)) {
-		aesni_gcm_enc(ctx, data, out, in,
-			plaintext_len, iv, hash_subkey, aad,
-			aad_len, auth_tag, auth_tag_len);
-	} else if (plaintext_len < AVX_GEN4_OPTSIZE) {
-		aesni_gcm_precomp_avx_gen2(ctx, hash_subkey);
-		aesni_gcm_enc_avx_gen2(ctx, out, in, plaintext_len, iv, aad,
-					aad_len, auth_tag, auth_tag_len);
-	} else {
-		aesni_gcm_precomp_avx_gen4(ctx, hash_subkey);
-		aesni_gcm_enc_avx_gen4(ctx, out, in, plaintext_len, iv, aad,
-					aad_len, auth_tag, auth_tag_len);
-	}
-}
+struct aesni_gcm_tfm_s aesni_gcm_tfm_avx_gen4 = {
+	.init = &aesni_gcm_init_avx_gen4,
+	.enc_update = &aesni_gcm_enc_update_avx_gen4,
+	.dec_update = &aesni_gcm_dec_update_avx_gen4,
+	.finalize = &aesni_gcm_finalize_avx_gen4,
+};
 
-static void aesni_gcm_dec_avx2(void *ctx,
-			struct gcm_context_data *data, u8 *out,
-			const u8 *in, unsigned long ciphertext_len, u8 *iv,
-			u8 *hash_subkey, const u8 *aad, unsigned long aad_len,
-			u8 *auth_tag, unsigned long auth_tag_len)
-{
-	struct crypto_aes_ctx *aes_ctx = (struct crypto_aes_ctx*)ctx;
|
|
||||||
aesni_gcm_dec(ctx, data, out, in,
|
|
||||||
ciphertext_len, iv, hash_subkey,
|
|
||||||
aad, aad_len, auth_tag, auth_tag_len);
|
|
||||||
} else if (ciphertext_len < AVX_GEN4_OPTSIZE) {
|
|
||||||
aesni_gcm_precomp_avx_gen2(ctx, hash_subkey);
|
|
||||||
aesni_gcm_dec_avx_gen2(ctx, out, in, ciphertext_len, iv, aad,
|
|
||||||
aad_len, auth_tag, auth_tag_len);
|
|
||||||
} else {
|
|
||||||
aesni_gcm_precomp_avx_gen4(ctx, hash_subkey);
|
|
||||||
aesni_gcm_dec_avx_gen4(ctx, out, in, ciphertext_len, iv, aad,
|
|
||||||
aad_len, auth_tag, auth_tag_len);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
static void (*aesni_gcm_enc_tfm)(void *ctx,
|
|
||||||
struct gcm_context_data *data, u8 *out,
|
|
||||||
const u8 *in, unsigned long plaintext_len,
|
|
||||||
u8 *iv, u8 *hash_subkey, const u8 *aad,
|
|
||||||
unsigned long aad_len, u8 *auth_tag,
|
|
||||||
unsigned long auth_tag_len);
|
|
||||||
|
|
||||||
static void (*aesni_gcm_dec_tfm)(void *ctx,
|
|
||||||
struct gcm_context_data *data, u8 *out,
|
|
||||||
const u8 *in, unsigned long ciphertext_len,
|
|
||||||
u8 *iv, u8 *hash_subkey, const u8 *aad,
|
|
||||||
unsigned long aad_len, u8 *auth_tag,
|
|
||||||
unsigned long auth_tag_len);
|
|
||||||
|
|
||||||
static inline struct
|
static inline struct
|
||||||
aesni_rfc4106_gcm_ctx *aesni_rfc4106_gcm_ctx_get(struct crypto_aead *tfm)
|
aesni_rfc4106_gcm_ctx *aesni_rfc4106_gcm_ctx_get(struct crypto_aead *tfm)
|
||||||
{
|
{
|
||||||
|
@@ -794,6 +778,7 @@ static int gcmaes_crypt_by_sg(bool enc, struct aead_request *req,
 {
 	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
 	unsigned long auth_tag_len = crypto_aead_authsize(tfm);
+	struct aesni_gcm_tfm_s *gcm_tfm = aesni_gcm_tfm;
 	struct gcm_context_data data AESNI_ALIGN_ATTR;
 	struct scatter_walk dst_sg_walk = {};
 	unsigned long left = req->cryptlen;
@@ -811,6 +796,15 @@ static int gcmaes_crypt_by_sg(bool enc, struct aead_request *req,
 	if (!enc)
 		left -= auth_tag_len;
 
+#ifdef CONFIG_AS_AVX2
+	if (left < AVX_GEN4_OPTSIZE && gcm_tfm == &aesni_gcm_tfm_avx_gen4)
+		gcm_tfm = &aesni_gcm_tfm_avx_gen2;
+#endif
+#ifdef CONFIG_AS_AVX
+	if (left < AVX_GEN2_OPTSIZE && gcm_tfm == &aesni_gcm_tfm_avx_gen2)
+		gcm_tfm = &aesni_gcm_tfm_sse;
+#endif
+
 	/* Linearize assoc, if not already linear */
 	if (req->src->length >= assoclen && req->src->length &&
 	    (!PageHighMem(sg_page(req->src)) ||
@@ -835,7 +829,7 @@ static int gcmaes_crypt_by_sg(bool enc, struct aead_request *req,
 	}
 
 	kernel_fpu_begin();
-	aesni_gcm_init(aes_ctx, &data, iv,
+	gcm_tfm->init(aes_ctx, &data, iv,
 		hash_subkey, assoc, assoclen);
 	if (req->src != req->dst) {
 		while (left) {
@@ -846,10 +840,10 @@ static int gcmaes_crypt_by_sg(bool enc, struct aead_request *req,
 			len = min(srclen, dstlen);
 			if (len) {
 				if (enc)
-					aesni_gcm_enc_update(aes_ctx, &data,
+					gcm_tfm->enc_update(aes_ctx, &data,
 							     dst, src, len);
 				else
-					aesni_gcm_dec_update(aes_ctx, &data,
+					gcm_tfm->dec_update(aes_ctx, &data,
 							     dst, src, len);
 			}
 			left -= len;
@@ -867,10 +861,10 @@ static int gcmaes_crypt_by_sg(bool enc, struct aead_request *req,
 			len = scatterwalk_clamp(&src_sg_walk, left);
 			if (len) {
 				if (enc)
-					aesni_gcm_enc_update(aes_ctx, &data,
+					gcm_tfm->enc_update(aes_ctx, &data,
 							     src, src, len);
 				else
-					aesni_gcm_dec_update(aes_ctx, &data,
+					gcm_tfm->dec_update(aes_ctx, &data,
 							     src, src, len);
 			}
 			left -= len;
@@ -879,7 +873,7 @@ static int gcmaes_crypt_by_sg(bool enc, struct aead_request *req,
 		scatterwalk_done(&src_sg_walk, 1, left);
 	}
 	}
-	aesni_gcm_finalize(aes_ctx, &data, authTag, auth_tag_len);
+	gcm_tfm->finalize(aes_ctx, &data, authTag, auth_tag_len);
 	kernel_fpu_end();
 
 	if (!assocmem)
@@ -912,147 +906,15 @@ static int gcmaes_crypt_by_sg(bool enc, struct aead_request *req,
 static int gcmaes_encrypt(struct aead_request *req, unsigned int assoclen,
 			  u8 *hash_subkey, u8 *iv, void *aes_ctx)
 {
-	u8 one_entry_in_sg = 0;
-	u8 *src, *dst, *assoc;
-	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
-	unsigned long auth_tag_len = crypto_aead_authsize(tfm);
-	struct scatter_walk src_sg_walk;
-	struct scatter_walk dst_sg_walk = {};
-	struct gcm_context_data data AESNI_ALIGN_ATTR;
-
-	if (((struct crypto_aes_ctx *)aes_ctx)->key_length != AES_KEYSIZE_128 ||
-	    aesni_gcm_enc_tfm == aesni_gcm_enc ||
-	    req->cryptlen < AVX_GEN2_OPTSIZE) {
-		return gcmaes_crypt_by_sg(true, req, assoclen, hash_subkey, iv,
-					  aes_ctx);
-	}
-	if (sg_is_last(req->src) &&
-	    (!PageHighMem(sg_page(req->src)) ||
-	     req->src->offset + req->src->length <= PAGE_SIZE) &&
-	    sg_is_last(req->dst) &&
-	    (!PageHighMem(sg_page(req->dst)) ||
-	     req->dst->offset + req->dst->length <= PAGE_SIZE)) {
-		one_entry_in_sg = 1;
-		scatterwalk_start(&src_sg_walk, req->src);
-		assoc = scatterwalk_map(&src_sg_walk);
-		src = assoc + req->assoclen;
-		dst = src;
-		if (unlikely(req->src != req->dst)) {
-			scatterwalk_start(&dst_sg_walk, req->dst);
-			dst = scatterwalk_map(&dst_sg_walk) + req->assoclen;
-		}
-	} else {
-		/* Allocate memory for src, dst, assoc */
-		assoc = kmalloc(req->cryptlen + auth_tag_len + req->assoclen,
-			GFP_ATOMIC);
-		if (unlikely(!assoc))
-			return -ENOMEM;
-		scatterwalk_map_and_copy(assoc, req->src, 0,
-					 req->assoclen + req->cryptlen, 0);
-		src = assoc + req->assoclen;
-		dst = src;
-	}
-
-	kernel_fpu_begin();
-	aesni_gcm_enc_tfm(aes_ctx, &data, dst, src, req->cryptlen, iv,
-			  hash_subkey, assoc, assoclen,
-			  dst + req->cryptlen, auth_tag_len);
-	kernel_fpu_end();
-
-	/* The authTag (aka the Integrity Check Value) needs to be written
-	 * back to the packet. */
-	if (one_entry_in_sg) {
-		if (unlikely(req->src != req->dst)) {
-			scatterwalk_unmap(dst - req->assoclen);
-			scatterwalk_advance(&dst_sg_walk, req->dst->length);
-			scatterwalk_done(&dst_sg_walk, 1, 0);
-		}
-		scatterwalk_unmap(assoc);
-		scatterwalk_advance(&src_sg_walk, req->src->length);
-		scatterwalk_done(&src_sg_walk, req->src == req->dst, 0);
-	} else {
-		scatterwalk_map_and_copy(dst, req->dst, req->assoclen,
-					 req->cryptlen + auth_tag_len, 1);
-		kfree(assoc);
-	}
-	return 0;
+	return gcmaes_crypt_by_sg(true, req, assoclen, hash_subkey, iv,
+				aes_ctx);
 }
 
 static int gcmaes_decrypt(struct aead_request *req, unsigned int assoclen,
 			  u8 *hash_subkey, u8 *iv, void *aes_ctx)
 {
-	u8 one_entry_in_sg = 0;
-	u8 *src, *dst, *assoc;
-	unsigned long tempCipherLen = 0;
-	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
-	unsigned long auth_tag_len = crypto_aead_authsize(tfm);
-	u8 authTag[16];
-	struct scatter_walk src_sg_walk;
-	struct scatter_walk dst_sg_walk = {};
-	struct gcm_context_data data AESNI_ALIGN_ATTR;
-	int retval = 0;
-
-	if (((struct crypto_aes_ctx *)aes_ctx)->key_length != AES_KEYSIZE_128 ||
-	    aesni_gcm_enc_tfm == aesni_gcm_enc ||
-	    req->cryptlen < AVX_GEN2_OPTSIZE) {
-		return gcmaes_crypt_by_sg(false, req, assoclen, hash_subkey, iv,
-					  aes_ctx);
-	}
-	tempCipherLen = (unsigned long)(req->cryptlen - auth_tag_len);
-
-	if (sg_is_last(req->src) &&
-	    (!PageHighMem(sg_page(req->src)) ||
-	     req->src->offset + req->src->length <= PAGE_SIZE) &&
-	    sg_is_last(req->dst) && req->dst->length &&
-	    (!PageHighMem(sg_page(req->dst)) ||
-	     req->dst->offset + req->dst->length <= PAGE_SIZE)) {
-		one_entry_in_sg = 1;
-		scatterwalk_start(&src_sg_walk, req->src);
-		assoc = scatterwalk_map(&src_sg_walk);
-		src = assoc + req->assoclen;
-		dst = src;
-		if (unlikely(req->src != req->dst)) {
-			scatterwalk_start(&dst_sg_walk, req->dst);
-			dst = scatterwalk_map(&dst_sg_walk) + req->assoclen;
-		}
-	} else {
-		/* Allocate memory for src, dst, assoc */
-		assoc = kmalloc(req->cryptlen + req->assoclen, GFP_ATOMIC);
-		if (!assoc)
-			return -ENOMEM;
-		scatterwalk_map_and_copy(assoc, req->src, 0,
-					 req->assoclen + req->cryptlen, 0);
-		src = assoc + req->assoclen;
-		dst = src;
-	}
-
-
-	kernel_fpu_begin();
-	aesni_gcm_dec_tfm(aes_ctx, &data, dst, src, tempCipherLen, iv,
-			  hash_subkey, assoc, assoclen,
-			  authTag, auth_tag_len);
-	kernel_fpu_end();
-
-	/* Compare generated tag with passed in tag. */
-	retval = crypto_memneq(src + tempCipherLen, authTag, auth_tag_len) ?
-		-EBADMSG : 0;
-
-	if (one_entry_in_sg) {
-		if (unlikely(req->src != req->dst)) {
-			scatterwalk_unmap(dst - req->assoclen);
-			scatterwalk_advance(&dst_sg_walk, req->dst->length);
-			scatterwalk_done(&dst_sg_walk, 1, 0);
-		}
-		scatterwalk_unmap(assoc);
-		scatterwalk_advance(&src_sg_walk, req->src->length);
-		scatterwalk_done(&src_sg_walk, req->src == req->dst, 0);
-	} else {
-		scatterwalk_map_and_copy(dst, req->dst, req->assoclen,
-					 tempCipherLen, 1);
-		kfree(assoc);
-	}
-	return retval;
-
+	return gcmaes_crypt_by_sg(false, req, assoclen, hash_subkey, iv,
+				aes_ctx);
 }
 
 static int helper_rfc4106_encrypt(struct aead_request *req)
@@ -1420,21 +1282,18 @@ static int __init aesni_init(void)
 #ifdef CONFIG_AS_AVX2
 	if (boot_cpu_has(X86_FEATURE_AVX2)) {
 		pr_info("AVX2 version of gcm_enc/dec engaged.\n");
-		aesni_gcm_enc_tfm = aesni_gcm_enc_avx2;
-		aesni_gcm_dec_tfm = aesni_gcm_dec_avx2;
+		aesni_gcm_tfm = &aesni_gcm_tfm_avx_gen4;
 	} else
 #endif
 #ifdef CONFIG_AS_AVX
 	if (boot_cpu_has(X86_FEATURE_AVX)) {
 		pr_info("AVX version of gcm_enc/dec engaged.\n");
-		aesni_gcm_enc_tfm = aesni_gcm_enc_avx;
-		aesni_gcm_dec_tfm = aesni_gcm_dec_avx;
+		aesni_gcm_tfm = &aesni_gcm_tfm_avx_gen2;
 	} else
 #endif
 	{
 		pr_info("SSE version of gcm_enc/dec engaged.\n");
-		aesni_gcm_enc_tfm = aesni_gcm_enc;
-		aesni_gcm_dec_tfm = aesni_gcm_dec;
+		aesni_gcm_tfm = &aesni_gcm_tfm_sse;
 	}
 	aesni_ctr_enc_tfm = aesni_ctr_enc;
 #ifdef CONFIG_AS_AVX

(File diff suppressed because it is too large.)
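The hunks above replace two per-direction function pointers with a single aesni_gcm_tfm ops table that aesni_init() selects once from CPU features and that gcmaes_crypt_by_sg() can downgrade per request when the buffer is short. A minimal user-space sketch of that dispatch pattern follows; the struct layout is modeled on the kernel's struct aesni_gcm_tfm_s, but the stub bodies are hypothetical stand-ins, not the real SIMD routines:

	#include <stdio.h>

	/* Ops table in the style of struct aesni_gcm_tfm_s; stub bodies
	 * here are illustrative only. */
	struct gcm_ops {
		void (*init)(void *ctx);
		void (*enc_update)(void *ctx, unsigned char *dst,
				   const unsigned char *src, unsigned long len);
		void (*finalize)(void *ctx, unsigned char *tag,
				 unsigned long taglen);
	};

	static void sse_init(void *ctx) { puts("sse init"); }
	static void sse_enc(void *ctx, unsigned char *d,
			    const unsigned char *s, unsigned long n)
	{ puts("sse update"); }
	static void sse_fin(void *ctx, unsigned char *t, unsigned long n)
	{ puts("sse finalize"); }

	static const struct gcm_ops gcm_ops_sse = {
		.init = sse_init, .enc_update = sse_enc, .finalize = sse_fin,
	};

	/* Selected once, the way aesni_init() picks a table from CPU
	 * features; a request path then makes one indirect call per phase. */
	static const struct gcm_ops *gcm_ops = &gcm_ops_sse;

	int main(void)
	{
		unsigned char buf[16] = {0}, tag[16];

		gcm_ops->init(NULL);
		gcm_ops->enc_update(NULL, buf, buf, sizeof(buf));
		gcm_ops->finalize(NULL, tag, sizeof(tag));
		return 0;
	}

The design keeps the hot loop free of CPU-feature branches: choosing a smaller table per request (as the AVX_GEN*_OPTSIZE checks do) only swaps one pointer.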
@@ -0,0 +1,836 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*
+ * ChaCha 256-bit cipher algorithm, x64 AVX-512VL functions
+ *
+ * Copyright (C) 2018 Martin Willi
+ */
+
+#include <linux/linkage.h>
+
+.section	.rodata.cst32.CTR2BL, "aM", @progbits, 32
+.align 32
+CTR2BL:	.octa 0x00000000000000000000000000000000
+	.octa 0x00000000000000000000000000000001
+
+.section	.rodata.cst32.CTR4BL, "aM", @progbits, 32
+.align 32
+CTR4BL:	.octa 0x00000000000000000000000000000002
+	.octa 0x00000000000000000000000000000003
+
+.section	.rodata.cst32.CTR8BL, "aM", @progbits, 32
+.align 32
+CTR8BL:	.octa 0x00000003000000020000000100000000
+	.octa 0x00000007000000060000000500000004
+
+.text
+
+ENTRY(chacha_2block_xor_avx512vl)
+	# %rdi: Input state matrix, s
+	# %rsi: up to 2 data blocks output, o
+	# %rdx: up to 2 data blocks input, i
+	# %rcx: input/output length in bytes
+	# %r8d: nrounds
+
+	# This function encrypts two ChaCha blocks by loading the state
+	# matrix twice across four AVX registers. It performs matrix operations
+	# on four words in each matrix in parallel, but requires shuffling to
+	# rearrange the words after each round.
+
+	vzeroupper
+
+	# x0..3[0-2] = s0..3
+	vbroadcasti128	0x00(%rdi),%ymm0
+	vbroadcasti128	0x10(%rdi),%ymm1
+	vbroadcasti128	0x20(%rdi),%ymm2
+	vbroadcasti128	0x30(%rdi),%ymm3
+
+	vpaddd		CTR2BL(%rip),%ymm3,%ymm3
+
+	vmovdqa		%ymm0,%ymm8
+	vmovdqa		%ymm1,%ymm9
+	vmovdqa		%ymm2,%ymm10
+	vmovdqa		%ymm3,%ymm11
+
+.Ldoubleround:
+
+	# x0 += x1, x3 = rotl32(x3 ^ x0, 16)
+	vpaddd		%ymm1,%ymm0,%ymm0
+	vpxord		%ymm0,%ymm3,%ymm3
+	vprold		$16,%ymm3,%ymm3
+
+	# x2 += x3, x1 = rotl32(x1 ^ x2, 12)
+	vpaddd		%ymm3,%ymm2,%ymm2
+	vpxord		%ymm2,%ymm1,%ymm1
+	vprold		$12,%ymm1,%ymm1
+
+	# x0 += x1, x3 = rotl32(x3 ^ x0, 8)
+	vpaddd		%ymm1,%ymm0,%ymm0
+	vpxord		%ymm0,%ymm3,%ymm3
+	vprold		$8,%ymm3,%ymm3
+
+	# x2 += x3, x1 = rotl32(x1 ^ x2, 7)
+	vpaddd		%ymm3,%ymm2,%ymm2
+	vpxord		%ymm2,%ymm1,%ymm1
+	vprold		$7,%ymm1,%ymm1
+
+	# x1 = shuffle32(x1, MASK(0, 3, 2, 1))
+	vpshufd		$0x39,%ymm1,%ymm1
+	# x2 = shuffle32(x2, MASK(1, 0, 3, 2))
+	vpshufd		$0x4e,%ymm2,%ymm2
+	# x3 = shuffle32(x3, MASK(2, 1, 0, 3))
+	vpshufd		$0x93,%ymm3,%ymm3
+
+	# x0 += x1, x3 = rotl32(x3 ^ x0, 16)
+	vpaddd		%ymm1,%ymm0,%ymm0
+	vpxord		%ymm0,%ymm3,%ymm3
+	vprold		$16,%ymm3,%ymm3
+
+	# x2 += x3, x1 = rotl32(x1 ^ x2, 12)
+	vpaddd		%ymm3,%ymm2,%ymm2
+	vpxord		%ymm2,%ymm1,%ymm1
+	vprold		$12,%ymm1,%ymm1
+
+	# x0 += x1, x3 = rotl32(x3 ^ x0, 8)
+	vpaddd		%ymm1,%ymm0,%ymm0
+	vpxord		%ymm0,%ymm3,%ymm3
+	vprold		$8,%ymm3,%ymm3
+
+	# x2 += x3, x1 = rotl32(x1 ^ x2, 7)
+	vpaddd		%ymm3,%ymm2,%ymm2
+	vpxord		%ymm2,%ymm1,%ymm1
+	vprold		$7,%ymm1,%ymm1
+
+	# x1 = shuffle32(x1, MASK(2, 1, 0, 3))
+	vpshufd		$0x93,%ymm1,%ymm1
+	# x2 = shuffle32(x2, MASK(1, 0, 3, 2))
+	vpshufd		$0x4e,%ymm2,%ymm2
+	# x3 = shuffle32(x3, MASK(0, 3, 2, 1))
+	vpshufd		$0x39,%ymm3,%ymm3
+
+	sub		$2,%r8d
+	jnz		.Ldoubleround
+
+	# o0 = i0 ^ (x0 + s0)
+	vpaddd		%ymm8,%ymm0,%ymm7
+	cmp		$0x10,%rcx
+	jl		.Lxorpart2
+	vpxord		0x00(%rdx),%xmm7,%xmm6
+	vmovdqu		%xmm6,0x00(%rsi)
+	vextracti128	$1,%ymm7,%xmm0
+	# o1 = i1 ^ (x1 + s1)
+	vpaddd		%ymm9,%ymm1,%ymm7
+	cmp		$0x20,%rcx
+	jl		.Lxorpart2
+	vpxord		0x10(%rdx),%xmm7,%xmm6
+	vmovdqu		%xmm6,0x10(%rsi)
+	vextracti128	$1,%ymm7,%xmm1
+	# o2 = i2 ^ (x2 + s2)
+	vpaddd		%ymm10,%ymm2,%ymm7
+	cmp		$0x30,%rcx
+	jl		.Lxorpart2
+	vpxord		0x20(%rdx),%xmm7,%xmm6
+	vmovdqu		%xmm6,0x20(%rsi)
+	vextracti128	$1,%ymm7,%xmm2
+	# o3 = i3 ^ (x3 + s3)
+	vpaddd		%ymm11,%ymm3,%ymm7
+	cmp		$0x40,%rcx
+	jl		.Lxorpart2
+	vpxord		0x30(%rdx),%xmm7,%xmm6
+	vmovdqu		%xmm6,0x30(%rsi)
+	vextracti128	$1,%ymm7,%xmm3
+
+	# xor and write second block
+	vmovdqa		%xmm0,%xmm7
+	cmp		$0x50,%rcx
+	jl		.Lxorpart2
+	vpxord		0x40(%rdx),%xmm7,%xmm6
+	vmovdqu		%xmm6,0x40(%rsi)
+
+	vmovdqa		%xmm1,%xmm7
+	cmp		$0x60,%rcx
+	jl		.Lxorpart2
+	vpxord		0x50(%rdx),%xmm7,%xmm6
+	vmovdqu		%xmm6,0x50(%rsi)
+
+	vmovdqa		%xmm2,%xmm7
+	cmp		$0x70,%rcx
+	jl		.Lxorpart2
+	vpxord		0x60(%rdx),%xmm7,%xmm6
+	vmovdqu		%xmm6,0x60(%rsi)
+
+	vmovdqa		%xmm3,%xmm7
+	cmp		$0x80,%rcx
+	jl		.Lxorpart2
+	vpxord		0x70(%rdx),%xmm7,%xmm6
+	vmovdqu		%xmm6,0x70(%rsi)
+
+.Ldone2:
+	vzeroupper
+	ret
+
+.Lxorpart2:
+	# xor remaining bytes from partial register into output
+	mov		%rcx,%rax
+	and		$0xf,%rcx
+	jz		.Ldone8
+	mov		%rax,%r9
+	and		$~0xf,%r9
+
+	mov		$1,%rax
+	shld		%cl,%rax,%rax
+	sub		$1,%rax
+	kmovq		%rax,%k1
+
+	vmovdqu8	(%rdx,%r9),%xmm1{%k1}{z}
+	vpxord		%xmm7,%xmm1,%xmm1
+	vmovdqu8	%xmm1,(%rsi,%r9){%k1}
+
+	jmp		.Ldone2
+
+ENDPROC(chacha_2block_xor_avx512vl)
+
+ENTRY(chacha_4block_xor_avx512vl)
+	# %rdi: Input state matrix, s
+	# %rsi: up to 4 data blocks output, o
+	# %rdx: up to 4 data blocks input, i
+	# %rcx: input/output length in bytes
+	# %r8d: nrounds
+
+	# This function encrypts four ChaCha blocks by loading the state
+	# matrix four times across eight AVX registers. It performs matrix
+	# operations on four words in two matrices in parallel, sequentially
+	# to the operations on the four words of the other two matrices. The
+	# required word shuffling has a rather high latency, we can do the
+	# arithmetic on two matrix-pairs without much slowdown.
+
+	vzeroupper
+
+	# x0..3[0-4] = s0..3
+	vbroadcasti128	0x00(%rdi),%ymm0
+	vbroadcasti128	0x10(%rdi),%ymm1
+	vbroadcasti128	0x20(%rdi),%ymm2
+	vbroadcasti128	0x30(%rdi),%ymm3
+
+	vmovdqa		%ymm0,%ymm4
+	vmovdqa		%ymm1,%ymm5
+	vmovdqa		%ymm2,%ymm6
+	vmovdqa		%ymm3,%ymm7
+
+	vpaddd		CTR2BL(%rip),%ymm3,%ymm3
+	vpaddd		CTR4BL(%rip),%ymm7,%ymm7
+
+	vmovdqa		%ymm0,%ymm11
+	vmovdqa		%ymm1,%ymm12
+	vmovdqa		%ymm2,%ymm13
+	vmovdqa		%ymm3,%ymm14
+	vmovdqa		%ymm7,%ymm15
+
+.Ldoubleround4:
+
+	# x0 += x1, x3 = rotl32(x3 ^ x0, 16)
+	vpaddd		%ymm1,%ymm0,%ymm0
+	vpxord		%ymm0,%ymm3,%ymm3
+	vprold		$16,%ymm3,%ymm3
+
+	vpaddd		%ymm5,%ymm4,%ymm4
+	vpxord		%ymm4,%ymm7,%ymm7
+	vprold		$16,%ymm7,%ymm7
+
+	# x2 += x3, x1 = rotl32(x1 ^ x2, 12)
+	vpaddd		%ymm3,%ymm2,%ymm2
+	vpxord		%ymm2,%ymm1,%ymm1
+	vprold		$12,%ymm1,%ymm1
+
+	vpaddd		%ymm7,%ymm6,%ymm6
+	vpxord		%ymm6,%ymm5,%ymm5
+	vprold		$12,%ymm5,%ymm5
+
+	# x0 += x1, x3 = rotl32(x3 ^ x0, 8)
+	vpaddd		%ymm1,%ymm0,%ymm0
+	vpxord		%ymm0,%ymm3,%ymm3
+	vprold		$8,%ymm3,%ymm3
+
+	vpaddd		%ymm5,%ymm4,%ymm4
+	vpxord		%ymm4,%ymm7,%ymm7
+	vprold		$8,%ymm7,%ymm7
+
+	# x2 += x3, x1 = rotl32(x1 ^ x2, 7)
+	vpaddd		%ymm3,%ymm2,%ymm2
+	vpxord		%ymm2,%ymm1,%ymm1
+	vprold		$7,%ymm1,%ymm1
+
+	vpaddd		%ymm7,%ymm6,%ymm6
+	vpxord		%ymm6,%ymm5,%ymm5
+	vprold		$7,%ymm5,%ymm5
+
+	# x1 = shuffle32(x1, MASK(0, 3, 2, 1))
+	vpshufd		$0x39,%ymm1,%ymm1
+	vpshufd		$0x39,%ymm5,%ymm5
+	# x2 = shuffle32(x2, MASK(1, 0, 3, 2))
+	vpshufd		$0x4e,%ymm2,%ymm2
+	vpshufd		$0x4e,%ymm6,%ymm6
+	# x3 = shuffle32(x3, MASK(2, 1, 0, 3))
+	vpshufd		$0x93,%ymm3,%ymm3
+	vpshufd		$0x93,%ymm7,%ymm7
+
+	# x0 += x1, x3 = rotl32(x3 ^ x0, 16)
+	vpaddd		%ymm1,%ymm0,%ymm0
+	vpxord		%ymm0,%ymm3,%ymm3
+	vprold		$16,%ymm3,%ymm3
+
+	vpaddd		%ymm5,%ymm4,%ymm4
+	vpxord		%ymm4,%ymm7,%ymm7
+	vprold		$16,%ymm7,%ymm7
+
+	# x2 += x3, x1 = rotl32(x1 ^ x2, 12)
+	vpaddd		%ymm3,%ymm2,%ymm2
+	vpxord		%ymm2,%ymm1,%ymm1
+	vprold		$12,%ymm1,%ymm1
+
+	vpaddd		%ymm7,%ymm6,%ymm6
+	vpxord		%ymm6,%ymm5,%ymm5
+	vprold		$12,%ymm5,%ymm5
+
+	# x0 += x1, x3 = rotl32(x3 ^ x0, 8)
+	vpaddd		%ymm1,%ymm0,%ymm0
+	vpxord		%ymm0,%ymm3,%ymm3
+	vprold		$8,%ymm3,%ymm3
+
+	vpaddd		%ymm5,%ymm4,%ymm4
+	vpxord		%ymm4,%ymm7,%ymm7
+	vprold		$8,%ymm7,%ymm7
+
+	# x2 += x3, x1 = rotl32(x1 ^ x2, 7)
+	vpaddd		%ymm3,%ymm2,%ymm2
+	vpxord		%ymm2,%ymm1,%ymm1
+	vprold		$7,%ymm1,%ymm1
+
+	vpaddd		%ymm7,%ymm6,%ymm6
+	vpxord		%ymm6,%ymm5,%ymm5
+	vprold		$7,%ymm5,%ymm5
+
+	# x1 = shuffle32(x1, MASK(2, 1, 0, 3))
+	vpshufd		$0x93,%ymm1,%ymm1
+	vpshufd		$0x93,%ymm5,%ymm5
+	# x2 = shuffle32(x2, MASK(1, 0, 3, 2))
+	vpshufd		$0x4e,%ymm2,%ymm2
+	vpshufd		$0x4e,%ymm6,%ymm6
+	# x3 = shuffle32(x3, MASK(0, 3, 2, 1))
+	vpshufd		$0x39,%ymm3,%ymm3
+	vpshufd		$0x39,%ymm7,%ymm7
+
+	sub		$2,%r8d
+	jnz		.Ldoubleround4
+
+	# o0 = i0 ^ (x0 + s0), first block
+	vpaddd		%ymm11,%ymm0,%ymm10
+	cmp		$0x10,%rcx
+	jl		.Lxorpart4
+	vpxord		0x00(%rdx),%xmm10,%xmm9
+	vmovdqu		%xmm9,0x00(%rsi)
+	vextracti128	$1,%ymm10,%xmm0
+	# o1 = i1 ^ (x1 + s1), first block
+	vpaddd		%ymm12,%ymm1,%ymm10
+	cmp		$0x20,%rcx
+	jl		.Lxorpart4
+	vpxord		0x10(%rdx),%xmm10,%xmm9
+	vmovdqu		%xmm9,0x10(%rsi)
+	vextracti128	$1,%ymm10,%xmm1
+	# o2 = i2 ^ (x2 + s2), first block
+	vpaddd		%ymm13,%ymm2,%ymm10
+	cmp		$0x30,%rcx
+	jl		.Lxorpart4
+	vpxord		0x20(%rdx),%xmm10,%xmm9
+	vmovdqu		%xmm9,0x20(%rsi)
+	vextracti128	$1,%ymm10,%xmm2
+	# o3 = i3 ^ (x3 + s3), first block
+	vpaddd		%ymm14,%ymm3,%ymm10
+	cmp		$0x40,%rcx
+	jl		.Lxorpart4
+	vpxord		0x30(%rdx),%xmm10,%xmm9
+	vmovdqu		%xmm9,0x30(%rsi)
+	vextracti128	$1,%ymm10,%xmm3
+
+	# xor and write second block
+	vmovdqa		%xmm0,%xmm10
+	cmp		$0x50,%rcx
+	jl		.Lxorpart4
+	vpxord		0x40(%rdx),%xmm10,%xmm9
+	vmovdqu		%xmm9,0x40(%rsi)
+
+	vmovdqa		%xmm1,%xmm10
+	cmp		$0x60,%rcx
+	jl		.Lxorpart4
+	vpxord		0x50(%rdx),%xmm10,%xmm9
+	vmovdqu		%xmm9,0x50(%rsi)
+
+	vmovdqa		%xmm2,%xmm10
+	cmp		$0x70,%rcx
+	jl		.Lxorpart4
+	vpxord		0x60(%rdx),%xmm10,%xmm9
+	vmovdqu		%xmm9,0x60(%rsi)
+
+	vmovdqa		%xmm3,%xmm10
+	cmp		$0x80,%rcx
+	jl		.Lxorpart4
+	vpxord		0x70(%rdx),%xmm10,%xmm9
+	vmovdqu		%xmm9,0x70(%rsi)
+
+	# o0 = i0 ^ (x0 + s0), third block
+	vpaddd		%ymm11,%ymm4,%ymm10
+	cmp		$0x90,%rcx
+	jl		.Lxorpart4
+	vpxord		0x80(%rdx),%xmm10,%xmm9
+	vmovdqu		%xmm9,0x80(%rsi)
+	vextracti128	$1,%ymm10,%xmm4
+	# o1 = i1 ^ (x1 + s1), third block
+	vpaddd		%ymm12,%ymm5,%ymm10
+	cmp		$0xa0,%rcx
+	jl		.Lxorpart4
+	vpxord		0x90(%rdx),%xmm10,%xmm9
+	vmovdqu		%xmm9,0x90(%rsi)
+	vextracti128	$1,%ymm10,%xmm5
+	# o2 = i2 ^ (x2 + s2), third block
+	vpaddd		%ymm13,%ymm6,%ymm10
+	cmp		$0xb0,%rcx
+	jl		.Lxorpart4
+	vpxord		0xa0(%rdx),%xmm10,%xmm9
+	vmovdqu		%xmm9,0xa0(%rsi)
+	vextracti128	$1,%ymm10,%xmm6
+	# o3 = i3 ^ (x3 + s3), third block
+	vpaddd		%ymm15,%ymm7,%ymm10
+	cmp		$0xc0,%rcx
+	jl		.Lxorpart4
+	vpxord		0xb0(%rdx),%xmm10,%xmm9
+	vmovdqu		%xmm9,0xb0(%rsi)
+	vextracti128	$1,%ymm10,%xmm7
+
+	# xor and write fourth block
+	vmovdqa		%xmm4,%xmm10
+	cmp		$0xd0,%rcx
+	jl		.Lxorpart4
+	vpxord		0xc0(%rdx),%xmm10,%xmm9
+	vmovdqu		%xmm9,0xc0(%rsi)
+
+	vmovdqa		%xmm5,%xmm10
+	cmp		$0xe0,%rcx
+	jl		.Lxorpart4
+	vpxord		0xd0(%rdx),%xmm10,%xmm9
+	vmovdqu		%xmm9,0xd0(%rsi)
+
+	vmovdqa		%xmm6,%xmm10
+	cmp		$0xf0,%rcx
+	jl		.Lxorpart4
+	vpxord		0xe0(%rdx),%xmm10,%xmm9
+	vmovdqu		%xmm9,0xe0(%rsi)
+
+	vmovdqa		%xmm7,%xmm10
+	cmp		$0x100,%rcx
+	jl		.Lxorpart4
+	vpxord		0xf0(%rdx),%xmm10,%xmm9
+	vmovdqu		%xmm9,0xf0(%rsi)
+
+.Ldone4:
+	vzeroupper
+	ret
+
+.Lxorpart4:
+	# xor remaining bytes from partial register into output
+	mov		%rcx,%rax
+	and		$0xf,%rcx
+	jz		.Ldone8
+	mov		%rax,%r9
+	and		$~0xf,%r9
+
+	mov		$1,%rax
+	shld		%cl,%rax,%rax
+	sub		$1,%rax
+	kmovq		%rax,%k1
+
+	vmovdqu8	(%rdx,%r9),%xmm1{%k1}{z}
+	vpxord		%xmm10,%xmm1,%xmm1
+	vmovdqu8	%xmm1,(%rsi,%r9){%k1}
+
+	jmp		.Ldone4
+
+ENDPROC(chacha_4block_xor_avx512vl)
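The CTR2BL/CTR4BL (and, below, CTR8BL) additions give each replicated copy of the state a consecutive block counter in word 12, so the lanes produce consecutive keystream blocks in one pass. A short C sketch of what that per-lane counter setup computes, using plain arrays in place of the %ymm registers (illustrative only):

	#include <stdint.h>

	/* Model of the vpaddd CTR*BL lines: lane n of the replicated
	 * state gets block counter base + n, so the SIMD routine emits
	 * consecutive ChaCha blocks in a single pass. */
	static void set_lane_counters(uint32_t lane_state[4][16],
				      const uint32_t state[16], int nlanes)
	{
		int n, w;

		for (n = 0; n < nlanes; n++) {
			for (w = 0; w < 16; w++)
				lane_state[n][w] = state[w];
			lane_state[n][12] += (uint32_t)n;	/* counter word */
		}
	}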
+
+ENTRY(chacha_8block_xor_avx512vl)
+	# %rdi: Input state matrix, s
+	# %rsi: up to 8 data blocks output, o
+	# %rdx: up to 8 data blocks input, i
+	# %rcx: input/output length in bytes
+	# %r8d: nrounds
+
+	# This function encrypts eight consecutive ChaCha blocks by loading
+	# the state matrix in AVX registers eight times. Compared to AVX2, this
+	# mostly benefits from the new rotate instructions in VL and the
+	# additional registers.
+
+	vzeroupper
+
+	# x0..15[0-7] = s[0..15]
+	vpbroadcastd	0x00(%rdi),%ymm0
+	vpbroadcastd	0x04(%rdi),%ymm1
+	vpbroadcastd	0x08(%rdi),%ymm2
+	vpbroadcastd	0x0c(%rdi),%ymm3
+	vpbroadcastd	0x10(%rdi),%ymm4
+	vpbroadcastd	0x14(%rdi),%ymm5
+	vpbroadcastd	0x18(%rdi),%ymm6
+	vpbroadcastd	0x1c(%rdi),%ymm7
+	vpbroadcastd	0x20(%rdi),%ymm8
+	vpbroadcastd	0x24(%rdi),%ymm9
+	vpbroadcastd	0x28(%rdi),%ymm10
+	vpbroadcastd	0x2c(%rdi),%ymm11
+	vpbroadcastd	0x30(%rdi),%ymm12
+	vpbroadcastd	0x34(%rdi),%ymm13
+	vpbroadcastd	0x38(%rdi),%ymm14
+	vpbroadcastd	0x3c(%rdi),%ymm15
+
+	# x12 += counter values 0-3
+	vpaddd		CTR8BL(%rip),%ymm12,%ymm12
+
+	vmovdqa64	%ymm0,%ymm16
+	vmovdqa64	%ymm1,%ymm17
+	vmovdqa64	%ymm2,%ymm18
+	vmovdqa64	%ymm3,%ymm19
+	vmovdqa64	%ymm4,%ymm20
+	vmovdqa64	%ymm5,%ymm21
+	vmovdqa64	%ymm6,%ymm22
+	vmovdqa64	%ymm7,%ymm23
+	vmovdqa64	%ymm8,%ymm24
+	vmovdqa64	%ymm9,%ymm25
+	vmovdqa64	%ymm10,%ymm26
+	vmovdqa64	%ymm11,%ymm27
+	vmovdqa64	%ymm12,%ymm28
+	vmovdqa64	%ymm13,%ymm29
+	vmovdqa64	%ymm14,%ymm30
+	vmovdqa64	%ymm15,%ymm31
+
+.Ldoubleround8:
+	# x0 += x4, x12 = rotl32(x12 ^ x0, 16)
+	vpaddd		%ymm0,%ymm4,%ymm0
+	vpxord		%ymm0,%ymm12,%ymm12
+	vprold		$16,%ymm12,%ymm12
+	# x1 += x5, x13 = rotl32(x13 ^ x1, 16)
+	vpaddd		%ymm1,%ymm5,%ymm1
+	vpxord		%ymm1,%ymm13,%ymm13
+	vprold		$16,%ymm13,%ymm13
+	# x2 += x6, x14 = rotl32(x14 ^ x2, 16)
+	vpaddd		%ymm2,%ymm6,%ymm2
+	vpxord		%ymm2,%ymm14,%ymm14
+	vprold		$16,%ymm14,%ymm14
+	# x3 += x7, x15 = rotl32(x15 ^ x3, 16)
+	vpaddd		%ymm3,%ymm7,%ymm3
+	vpxord		%ymm3,%ymm15,%ymm15
+	vprold		$16,%ymm15,%ymm15
+
+	# x8 += x12, x4 = rotl32(x4 ^ x8, 12)
+	vpaddd		%ymm12,%ymm8,%ymm8
+	vpxord		%ymm8,%ymm4,%ymm4
+	vprold		$12,%ymm4,%ymm4
+	# x9 += x13, x5 = rotl32(x5 ^ x9, 12)
+	vpaddd		%ymm13,%ymm9,%ymm9
+	vpxord		%ymm9,%ymm5,%ymm5
+	vprold		$12,%ymm5,%ymm5
+	# x10 += x14, x6 = rotl32(x6 ^ x10, 12)
+	vpaddd		%ymm14,%ymm10,%ymm10
+	vpxord		%ymm10,%ymm6,%ymm6
+	vprold		$12,%ymm6,%ymm6
+	# x11 += x15, x7 = rotl32(x7 ^ x11, 12)
+	vpaddd		%ymm15,%ymm11,%ymm11
+	vpxord		%ymm11,%ymm7,%ymm7
+	vprold		$12,%ymm7,%ymm7
+
+	# x0 += x4, x12 = rotl32(x12 ^ x0, 8)
+	vpaddd		%ymm0,%ymm4,%ymm0
+	vpxord		%ymm0,%ymm12,%ymm12
+	vprold		$8,%ymm12,%ymm12
+	# x1 += x5, x13 = rotl32(x13 ^ x1, 8)
+	vpaddd		%ymm1,%ymm5,%ymm1
+	vpxord		%ymm1,%ymm13,%ymm13
+	vprold		$8,%ymm13,%ymm13
+	# x2 += x6, x14 = rotl32(x14 ^ x2, 8)
+	vpaddd		%ymm2,%ymm6,%ymm2
+	vpxord		%ymm2,%ymm14,%ymm14
+	vprold		$8,%ymm14,%ymm14
+	# x3 += x7, x15 = rotl32(x15 ^ x3, 8)
+	vpaddd		%ymm3,%ymm7,%ymm3
+	vpxord		%ymm3,%ymm15,%ymm15
+	vprold		$8,%ymm15,%ymm15
+
+	# x8 += x12, x4 = rotl32(x4 ^ x8, 7)
+	vpaddd		%ymm12,%ymm8,%ymm8
+	vpxord		%ymm8,%ymm4,%ymm4
+	vprold		$7,%ymm4,%ymm4
+	# x9 += x13, x5 = rotl32(x5 ^ x9, 7)
+	vpaddd		%ymm13,%ymm9,%ymm9
+	vpxord		%ymm9,%ymm5,%ymm5
+	vprold		$7,%ymm5,%ymm5
+	# x10 += x14, x6 = rotl32(x6 ^ x10, 7)
+	vpaddd		%ymm14,%ymm10,%ymm10
+	vpxord		%ymm10,%ymm6,%ymm6
+	vprold		$7,%ymm6,%ymm6
+	# x11 += x15, x7 = rotl32(x7 ^ x11, 7)
+	vpaddd		%ymm15,%ymm11,%ymm11
+	vpxord		%ymm11,%ymm7,%ymm7
+	vprold		$7,%ymm7,%ymm7
+
+	# x0 += x5, x15 = rotl32(x15 ^ x0, 16)
+	vpaddd		%ymm0,%ymm5,%ymm0
+	vpxord		%ymm0,%ymm15,%ymm15
+	vprold		$16,%ymm15,%ymm15
+	# x1 += x6, x12 = rotl32(x12 ^ x1, 16)
+	vpaddd		%ymm1,%ymm6,%ymm1
+	vpxord		%ymm1,%ymm12,%ymm12
+	vprold		$16,%ymm12,%ymm12
+	# x2 += x7, x13 = rotl32(x13 ^ x2, 16)
+	vpaddd		%ymm2,%ymm7,%ymm2
+	vpxord		%ymm2,%ymm13,%ymm13
+	vprold		$16,%ymm13,%ymm13
+	# x3 += x4, x14 = rotl32(x14 ^ x3, 16)
+	vpaddd		%ymm3,%ymm4,%ymm3
+	vpxord		%ymm3,%ymm14,%ymm14
+	vprold		$16,%ymm14,%ymm14
+
+	# x10 += x15, x5 = rotl32(x5 ^ x10, 12)
+	vpaddd		%ymm15,%ymm10,%ymm10
+	vpxord		%ymm10,%ymm5,%ymm5
+	vprold		$12,%ymm5,%ymm5
+	# x11 += x12, x6 = rotl32(x6 ^ x11, 12)
+	vpaddd		%ymm12,%ymm11,%ymm11
+	vpxord		%ymm11,%ymm6,%ymm6
+	vprold		$12,%ymm6,%ymm6
+	# x8 += x13, x7 = rotl32(x7 ^ x8, 12)
+	vpaddd		%ymm13,%ymm8,%ymm8
+	vpxord		%ymm8,%ymm7,%ymm7
+	vprold		$12,%ymm7,%ymm7
+	# x9 += x14, x4 = rotl32(x4 ^ x9, 12)
+	vpaddd		%ymm14,%ymm9,%ymm9
+	vpxord		%ymm9,%ymm4,%ymm4
+	vprold		$12,%ymm4,%ymm4
+
+	# x0 += x5, x15 = rotl32(x15 ^ x0, 8)
+	vpaddd		%ymm0,%ymm5,%ymm0
+	vpxord		%ymm0,%ymm15,%ymm15
+	vprold		$8,%ymm15,%ymm15
+	# x1 += x6, x12 = rotl32(x12 ^ x1, 8)
+	vpaddd		%ymm1,%ymm6,%ymm1
+	vpxord		%ymm1,%ymm12,%ymm12
+	vprold		$8,%ymm12,%ymm12
+	# x2 += x7, x13 = rotl32(x13 ^ x2, 8)
+	vpaddd		%ymm2,%ymm7,%ymm2
+	vpxord		%ymm2,%ymm13,%ymm13
+	vprold		$8,%ymm13,%ymm13
+	# x3 += x4, x14 = rotl32(x14 ^ x3, 8)
+	vpaddd		%ymm3,%ymm4,%ymm3
+	vpxord		%ymm3,%ymm14,%ymm14
+	vprold		$8,%ymm14,%ymm14
+
+	# x10 += x15, x5 = rotl32(x5 ^ x10, 7)
+	vpaddd		%ymm15,%ymm10,%ymm10
+	vpxord		%ymm10,%ymm5,%ymm5
+	vprold		$7,%ymm5,%ymm5
+	# x11 += x12, x6 = rotl32(x6 ^ x11, 7)
+	vpaddd		%ymm12,%ymm11,%ymm11
+	vpxord		%ymm11,%ymm6,%ymm6
+	vprold		$7,%ymm6,%ymm6
+	# x8 += x13, x7 = rotl32(x7 ^ x8, 7)
+	vpaddd		%ymm13,%ymm8,%ymm8
+	vpxord		%ymm8,%ymm7,%ymm7
+	vprold		$7,%ymm7,%ymm7
+	# x9 += x14, x4 = rotl32(x4 ^ x9, 7)
+	vpaddd		%ymm14,%ymm9,%ymm9
+	vpxord		%ymm9,%ymm4,%ymm4
+	vprold		$7,%ymm4,%ymm4
+
+	sub		$2,%r8d
+	jnz		.Ldoubleround8
+
+	# x0..15[0-3] += s[0..15]
+	vpaddd		%ymm16,%ymm0,%ymm0
+	vpaddd		%ymm17,%ymm1,%ymm1
+	vpaddd		%ymm18,%ymm2,%ymm2
+	vpaddd		%ymm19,%ymm3,%ymm3
+	vpaddd		%ymm20,%ymm4,%ymm4
+	vpaddd		%ymm21,%ymm5,%ymm5
+	vpaddd		%ymm22,%ymm6,%ymm6
+	vpaddd		%ymm23,%ymm7,%ymm7
+	vpaddd		%ymm24,%ymm8,%ymm8
+	vpaddd		%ymm25,%ymm9,%ymm9
+	vpaddd		%ymm26,%ymm10,%ymm10
+	vpaddd		%ymm27,%ymm11,%ymm11
+	vpaddd		%ymm28,%ymm12,%ymm12
+	vpaddd		%ymm29,%ymm13,%ymm13
+	vpaddd		%ymm30,%ymm14,%ymm14
+	vpaddd		%ymm31,%ymm15,%ymm15
+
+	# interleave 32-bit words in state n, n+1
+	vpunpckldq	%ymm1,%ymm0,%ymm16
+	vpunpckhdq	%ymm1,%ymm0,%ymm17
+	vpunpckldq	%ymm3,%ymm2,%ymm18
+	vpunpckhdq	%ymm3,%ymm2,%ymm19
+	vpunpckldq	%ymm5,%ymm4,%ymm20
+	vpunpckhdq	%ymm5,%ymm4,%ymm21
+	vpunpckldq	%ymm7,%ymm6,%ymm22
+	vpunpckhdq	%ymm7,%ymm6,%ymm23
+	vpunpckldq	%ymm9,%ymm8,%ymm24
+	vpunpckhdq	%ymm9,%ymm8,%ymm25
+	vpunpckldq	%ymm11,%ymm10,%ymm26
+	vpunpckhdq	%ymm11,%ymm10,%ymm27
+	vpunpckldq	%ymm13,%ymm12,%ymm28
+	vpunpckhdq	%ymm13,%ymm12,%ymm29
+	vpunpckldq	%ymm15,%ymm14,%ymm30
+	vpunpckhdq	%ymm15,%ymm14,%ymm31
+
+	# interleave 64-bit words in state n, n+2
+	vpunpcklqdq	%ymm18,%ymm16,%ymm0
+	vpunpcklqdq	%ymm19,%ymm17,%ymm1
+	vpunpckhqdq	%ymm18,%ymm16,%ymm2
+	vpunpckhqdq	%ymm19,%ymm17,%ymm3
+	vpunpcklqdq	%ymm22,%ymm20,%ymm4
+	vpunpcklqdq	%ymm23,%ymm21,%ymm5
+	vpunpckhqdq	%ymm22,%ymm20,%ymm6
+	vpunpckhqdq	%ymm23,%ymm21,%ymm7
+	vpunpcklqdq	%ymm26,%ymm24,%ymm8
+	vpunpcklqdq	%ymm27,%ymm25,%ymm9
+	vpunpckhqdq	%ymm26,%ymm24,%ymm10
+	vpunpckhqdq	%ymm27,%ymm25,%ymm11
+	vpunpcklqdq	%ymm30,%ymm28,%ymm12
+	vpunpcklqdq	%ymm31,%ymm29,%ymm13
+	vpunpckhqdq	%ymm30,%ymm28,%ymm14
+	vpunpckhqdq	%ymm31,%ymm29,%ymm15
+
+	# interleave 128-bit words in state n, n+4
+	# xor/write first four blocks
+	vmovdqa64	%ymm0,%ymm16
+	vperm2i128	$0x20,%ymm4,%ymm0,%ymm0
+	cmp		$0x0020,%rcx
+	jl		.Lxorpart8
+	vpxord		0x0000(%rdx),%ymm0,%ymm0
+	vmovdqu64	%ymm0,0x0000(%rsi)
+	vmovdqa64	%ymm16,%ymm0
+	vperm2i128	$0x31,%ymm4,%ymm0,%ymm4
+
+	vperm2i128	$0x20,%ymm12,%ymm8,%ymm0
+	cmp		$0x0040,%rcx
+	jl		.Lxorpart8
+	vpxord		0x0020(%rdx),%ymm0,%ymm0
+	vmovdqu64	%ymm0,0x0020(%rsi)
+	vperm2i128	$0x31,%ymm12,%ymm8,%ymm12
+
+	vperm2i128	$0x20,%ymm6,%ymm2,%ymm0
+	cmp		$0x0060,%rcx
+	jl		.Lxorpart8
+	vpxord		0x0040(%rdx),%ymm0,%ymm0
+	vmovdqu64	%ymm0,0x0040(%rsi)
+	vperm2i128	$0x31,%ymm6,%ymm2,%ymm6
+
+	vperm2i128	$0x20,%ymm14,%ymm10,%ymm0
+	cmp		$0x0080,%rcx
+	jl		.Lxorpart8
+	vpxord		0x0060(%rdx),%ymm0,%ymm0
+	vmovdqu64	%ymm0,0x0060(%rsi)
+	vperm2i128	$0x31,%ymm14,%ymm10,%ymm14
+
+	vperm2i128	$0x20,%ymm5,%ymm1,%ymm0
+	cmp		$0x00a0,%rcx
+	jl		.Lxorpart8
+	vpxord		0x0080(%rdx),%ymm0,%ymm0
+	vmovdqu64	%ymm0,0x0080(%rsi)
+	vperm2i128	$0x31,%ymm5,%ymm1,%ymm5
+
+	vperm2i128	$0x20,%ymm13,%ymm9,%ymm0
+	cmp		$0x00c0,%rcx
+	jl		.Lxorpart8
+	vpxord		0x00a0(%rdx),%ymm0,%ymm0
+	vmovdqu64	%ymm0,0x00a0(%rsi)
+	vperm2i128	$0x31,%ymm13,%ymm9,%ymm13
+
+	vperm2i128	$0x20,%ymm7,%ymm3,%ymm0
+	cmp		$0x00e0,%rcx
+	jl		.Lxorpart8
+	vpxord		0x00c0(%rdx),%ymm0,%ymm0
+	vmovdqu64	%ymm0,0x00c0(%rsi)
+	vperm2i128	$0x31,%ymm7,%ymm3,%ymm7
+
+	vperm2i128	$0x20,%ymm15,%ymm11,%ymm0
+	cmp		$0x0100,%rcx
+	jl		.Lxorpart8
+	vpxord		0x00e0(%rdx),%ymm0,%ymm0
+	vmovdqu64	%ymm0,0x00e0(%rsi)
+	vperm2i128	$0x31,%ymm15,%ymm11,%ymm15
+
+	# xor remaining blocks, write to output
+	vmovdqa64	%ymm4,%ymm0
+	cmp		$0x0120,%rcx
+	jl		.Lxorpart8
+	vpxord		0x0100(%rdx),%ymm0,%ymm0
+	vmovdqu64	%ymm0,0x0100(%rsi)
+
+	vmovdqa64	%ymm12,%ymm0
+	cmp		$0x0140,%rcx
+	jl		.Lxorpart8
+	vpxord		0x0120(%rdx),%ymm0,%ymm0
+	vmovdqu64	%ymm0,0x0120(%rsi)
+
+	vmovdqa64	%ymm6,%ymm0
+	cmp		$0x0160,%rcx
+	jl		.Lxorpart8
+	vpxord		0x0140(%rdx),%ymm0,%ymm0
+	vmovdqu64	%ymm0,0x0140(%rsi)
+
+	vmovdqa64	%ymm14,%ymm0
+	cmp		$0x0180,%rcx
+	jl		.Lxorpart8
+	vpxord		0x0160(%rdx),%ymm0,%ymm0
+	vmovdqu64	%ymm0,0x0160(%rsi)
+
+	vmovdqa64	%ymm5,%ymm0
+	cmp		$0x01a0,%rcx
+	jl		.Lxorpart8
+	vpxord		0x0180(%rdx),%ymm0,%ymm0
+	vmovdqu64	%ymm0,0x0180(%rsi)
+
+	vmovdqa64	%ymm13,%ymm0
+	cmp		$0x01c0,%rcx
+	jl		.Lxorpart8
+	vpxord		0x01a0(%rdx),%ymm0,%ymm0
+	vmovdqu64	%ymm0,0x01a0(%rsi)
+
+	vmovdqa64	%ymm7,%ymm0
+	cmp		$0x01e0,%rcx
+	jl		.Lxorpart8
+	vpxord		0x01c0(%rdx),%ymm0,%ymm0
+	vmovdqu64	%ymm0,0x01c0(%rsi)
+
+	vmovdqa64	%ymm15,%ymm0
+	cmp		$0x0200,%rcx
+	jl		.Lxorpart8
+	vpxord		0x01e0(%rdx),%ymm0,%ymm0
+	vmovdqu64	%ymm0,0x01e0(%rsi)
+
+.Ldone8:
+	vzeroupper
+	ret
+
+.Lxorpart8:
+	# xor remaining bytes from partial register into output
+	mov		%rcx,%rax
+	and		$0x1f,%rcx
+	jz		.Ldone8
+	mov		%rax,%r9
+	and		$~0x1f,%r9
+
+	mov		$1,%rax
+	shld		%cl,%rax,%rax
+	sub		$1,%rax
+	kmovq		%rax,%k1
+
+	vmovdqu8	(%rdx,%r9),%ymm1{%k1}{z}
+	vpxord		%ymm0,%ymm1,%ymm1
+	vmovdqu8	%ymm1,(%rsi,%r9){%k1}
+
+	jmp		.Ldone8
+
+ENDPROC(chacha_8block_xor_avx512vl)
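The rotl32 and shuffle comments running through all of these routines implement one scalar primitive: the ChaCha quarter-round, applied first to columns and then to diagonals of the 4x4 word state. For reference, a plain C version of one double round over a 16-word state (the standard ChaCha definition, not kernel code; the nrounds/2 loop in the assembly iterates exactly this):

	#include <stdint.h>

	#define ROTL32(v, n) (((v) << (n)) | ((v) >> (32 - (n))))

	/* One ChaCha quarter-round: the add/xor/rotate pattern the SIMD
	 * code applies to whole vectors of words at a time. */
	static void quarterround(uint32_t x[16], int a, int b, int c, int d)
	{
		x[a] += x[b]; x[d] = ROTL32(x[d] ^ x[a], 16);
		x[c] += x[d]; x[b] = ROTL32(x[b] ^ x[c], 12);
		x[a] += x[b]; x[d] = ROTL32(x[d] ^ x[a], 8);
		x[c] += x[d]; x[b] = ROTL32(x[b] ^ x[c], 7);
	}

	/* A double round: four column rounds, then four diagonal rounds. */
	static void doubleround(uint32_t x[16])
	{
		quarterround(x, 0, 4,  8, 12);
		quarterround(x, 1, 5,  9, 13);
		quarterround(x, 2, 6, 10, 14);
		quarterround(x, 3, 7, 11, 15);
		quarterround(x, 0, 5, 10, 15);
		quarterround(x, 1, 6, 11, 12);
		quarterround(x, 2, 7,  8, 13);
		quarterround(x, 3, 4,  9, 14);
	}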
@ -1,5 +1,5 @@
|
||||||
/*
|
/*
|
||||||
* ChaCha20 256-bit cipher algorithm, RFC7539, x64 SSSE3 functions
|
* ChaCha 256-bit cipher algorithm, x64 SSSE3 functions
|
||||||
*
|
*
|
||||||
* Copyright (C) 2015 Martin Willi
|
* Copyright (C) 2015 Martin Willi
|
||||||
*
|
*
|
||||||
|
@ -10,6 +10,7 @@
|
||||||
*/
|
*/
|
||||||
|
|
||||||
#include <linux/linkage.h>
|
#include <linux/linkage.h>
|
||||||
|
#include <asm/frame.h>
|
||||||
|
|
||||||
.section .rodata.cst16.ROT8, "aM", @progbits, 16
|
.section .rodata.cst16.ROT8, "aM", @progbits, 16
|
||||||
.align 16
|
.align 16
|
||||||
|
@ -23,35 +24,25 @@ CTRINC: .octa 0x00000003000000020000000100000000
|
||||||
|
|
||||||
.text
|
.text
|
||||||
|
|
||||||
ENTRY(chacha20_block_xor_ssse3)
|
/*
|
||||||
# %rdi: Input state matrix, s
|
* chacha_permute - permute one block
|
||||||
# %rsi: 1 data block output, o
|
*
|
||||||
# %rdx: 1 data block input, i
|
* Permute one 64-byte block where the state matrix is in %xmm0-%xmm3. This
|
||||||
|
* function performs matrix operations on four words in parallel, but requires
|
||||||
# This function encrypts one ChaCha20 block by loading the state matrix
|
* shuffling to rearrange the words after each round. 8/16-bit word rotation is
|
||||||
# in four SSE registers. It performs matrix operation on four words in
|
* done with the slightly better performing SSSE3 byte shuffling, 7/12-bit word
|
||||||
# parallel, but requireds shuffling to rearrange the words after each
|
* rotation uses traditional shift+OR.
|
||||||
# round. 8/16-bit word rotation is done with the slightly better
|
*
|
||||||
# performing SSSE3 byte shuffling, 7/12-bit word rotation uses
|
* The round count is given in %r8d.
|
||||||
# traditional shift+OR.
|
*
|
||||||
|
* Clobbers: %r8d, %xmm4-%xmm7
|
||||||
# x0..3 = s0..3
|
*/
|
||||||
movdqa 0x00(%rdi),%xmm0
|
chacha_permute:
|
||||||
movdqa 0x10(%rdi),%xmm1
|
|
||||||
movdqa 0x20(%rdi),%xmm2
|
|
||||||
movdqa 0x30(%rdi),%xmm3
|
|
||||||
movdqa %xmm0,%xmm8
|
|
||||||
movdqa %xmm1,%xmm9
|
|
||||||
movdqa %xmm2,%xmm10
|
|
||||||
movdqa %xmm3,%xmm11
|
|
||||||
|
|
||||||
movdqa ROT8(%rip),%xmm4
|
movdqa ROT8(%rip),%xmm4
|
||||||
movdqa ROT16(%rip),%xmm5
|
movdqa ROT16(%rip),%xmm5
|
||||||
|
|
||||||
mov $10,%ecx
|
|
||||||
|
|
||||||
.Ldoubleround:
|
.Ldoubleround:
|
||||||
|
|
||||||
# x0 += x1, x3 = rotl32(x3 ^ x0, 16)
|
# x0 += x1, x3 = rotl32(x3 ^ x0, 16)
|
||||||
paddd %xmm1,%xmm0
|
paddd %xmm1,%xmm0
|
||||||
pxor %xmm0,%xmm3
|
pxor %xmm0,%xmm3
|
||||||
|
@ -118,39 +109,129 @@ ENTRY(chacha20_block_xor_ssse3)
|
||||||
# x3 = shuffle32(x3, MASK(0, 3, 2, 1))
|
# x3 = shuffle32(x3, MASK(0, 3, 2, 1))
|
||||||
pshufd $0x39,%xmm3,%xmm3
|
pshufd $0x39,%xmm3,%xmm3
|
||||||
|
|
||||||
dec %ecx
|
sub $2,%r8d
|
||||||
jnz .Ldoubleround
|
jnz .Ldoubleround
|
||||||
|
|
||||||
|
ret
|
||||||
|
ENDPROC(chacha_permute)
|
||||||
|
|
||||||
|
ENTRY(chacha_block_xor_ssse3)
|
||||||
|
# %rdi: Input state matrix, s
|
||||||
|
# %rsi: up to 1 data block output, o
|
||||||
|
# %rdx: up to 1 data block input, i
|
||||||
|
# %rcx: input/output length in bytes
|
||||||
|
# %r8d: nrounds
|
||||||
|
FRAME_BEGIN
|
||||||
|
|
||||||
|
# x0..3 = s0..3
|
||||||
|
movdqa 0x00(%rdi),%xmm0
|
||||||
|
movdqa 0x10(%rdi),%xmm1
|
||||||
|
movdqa 0x20(%rdi),%xmm2
|
||||||
|
movdqa 0x30(%rdi),%xmm3
|
||||||
|
movdqa %xmm0,%xmm8
|
||||||
|
movdqa %xmm1,%xmm9
|
||||||
|
movdqa %xmm2,%xmm10
|
||||||
|
movdqa %xmm3,%xmm11
|
||||||
|
|
||||||
|
mov %rcx,%rax
|
||||||
|
call chacha_permute
|
||||||
|
|
||||||
# o0 = i0 ^ (x0 + s0)
|
# o0 = i0 ^ (x0 + s0)
|
||||||
movdqu 0x00(%rdx),%xmm4
|
|
||||||
paddd %xmm8,%xmm0
|
paddd %xmm8,%xmm0
|
||||||
|
cmp $0x10,%rax
|
||||||
|
jl .Lxorpart
|
||||||
|
movdqu 0x00(%rdx),%xmm4
|
||||||
pxor %xmm4,%xmm0
|
pxor %xmm4,%xmm0
|
||||||
movdqu %xmm0,0x00(%rsi)
|
movdqu %xmm0,0x00(%rsi)
|
||||||
# o1 = i1 ^ (x1 + s1)
|
# o1 = i1 ^ (x1 + s1)
|
||||||
movdqu 0x10(%rdx),%xmm5
|
|
||||||
paddd %xmm9,%xmm1
|
paddd %xmm9,%xmm1
|
||||||
pxor %xmm5,%xmm1
|
movdqa %xmm1,%xmm0
|
||||||
movdqu %xmm1,0x10(%rsi)
|
cmp $0x20,%rax
|
||||||
|
jl .Lxorpart
|
||||||
|
movdqu 0x10(%rdx),%xmm0
|
||||||
|
pxor %xmm1,%xmm0
|
||||||
|
movdqu %xmm0,0x10(%rsi)
|
||||||
# o2 = i2 ^ (x2 + s2)
|
# o2 = i2 ^ (x2 + s2)
|
||||||
movdqu 0x20(%rdx),%xmm6
|
|
||||||
paddd %xmm10,%xmm2
|
paddd %xmm10,%xmm2
|
||||||
pxor %xmm6,%xmm2
|
movdqa %xmm2,%xmm0
|
||||||
movdqu %xmm2,0x20(%rsi)
|
cmp $0x30,%rax
|
||||||
|
jl .Lxorpart
|
||||||
|
movdqu 0x20(%rdx),%xmm0
|
||||||
|
pxor %xmm2,%xmm0
|
||||||
|
movdqu %xmm0,0x20(%rsi)
|
||||||
# o3 = i3 ^ (x3 + s3)
|
# o3 = i3 ^ (x3 + s3)
|
||||||
movdqu 0x30(%rdx),%xmm7
|
|
||||||
paddd %xmm11,%xmm3
|
paddd %xmm11,%xmm3
|
||||||
pxor %xmm7,%xmm3
|
movdqa %xmm3,%xmm0
|
||||||
movdqu %xmm3,0x30(%rsi)
|
cmp $0x40,%rax
|
||||||
|
jl .Lxorpart
|
||||||
|
movdqu 0x30(%rdx),%xmm0
|
||||||
|
pxor %xmm3,%xmm0
|
||||||
|
movdqu %xmm0,0x30(%rsi)
|
||||||
|
|
||||||
|
.Ldone:
|
||||||
|
FRAME_END
|
||||||
ret
|
ret
|
||||||
ENDPROC(chacha20_block_xor_ssse3)
|
|
||||||
|
|
||||||
ENTRY(chacha20_4block_xor_ssse3)
|
.Lxorpart:
|
||||||
|
# xor remaining bytes from partial register into output
|
||||||
|
mov %rax,%r9
|
||||||
|
and $0x0f,%r9
|
||||||
|
jz .Ldone
|
||||||
|
and $~0x0f,%rax
|
||||||
|
|
||||||
|
mov %rsi,%r11
|
||||||
|
|
||||||
|
lea 8(%rsp),%r10
|
||||||
|
sub $0x10,%rsp
|
||||||
|
and $~31,%rsp
|
||||||
|
|
||||||
|
lea (%rdx,%rax),%rsi
|
||||||
|
mov %rsp,%rdi
|
||||||
|
mov %r9,%rcx
|
||||||
|
rep movsb
|
||||||
|
|
||||||
|
pxor 0x00(%rsp),%xmm0
|
||||||
|
movdqa %xmm0,0x00(%rsp)
|
||||||
|
|
||||||
|
mov %rsp,%rsi
|
||||||
|
lea (%r11,%rax),%rdi
|
||||||
|
mov %r9,%rcx
|
||||||
|
rep movsb
|
||||||
|
|
||||||
|
lea -8(%r10),%rsp
|
||||||
|
jmp .Ldone
|
||||||
|
|
||||||
|
ENDPROC(chacha_block_xor_ssse3)
|
||||||
|
|
||||||
|
ENTRY(hchacha_block_ssse3)
|
||||||
# %rdi: Input state matrix, s
|
# %rdi: Input state matrix, s
|
||||||
# %rsi: 4 data blocks output, o
|
# %rsi: output (8 32-bit words)
|
||||||
# %rdx: 4 data blocks input, i
|
# %edx: nrounds
|
||||||
|
FRAME_BEGIN
|
||||||
|
|
||||||
# This function encrypts four consecutive ChaCha20 blocks by loading the
|
movdqa 0x00(%rdi),%xmm0
|
||||||
|
movdqa 0x10(%rdi),%xmm1
|
||||||
|
movdqa 0x20(%rdi),%xmm2
|
||||||
|
movdqa 0x30(%rdi),%xmm3
|
||||||
|
|
||||||
|
mov %edx,%r8d
|
||||||
|
call chacha_permute
|
||||||
|
|
||||||
|
movdqu %xmm0,0x00(%rsi)
|
||||||
|
movdqu %xmm3,0x10(%rsi)
|
||||||
|
|
||||||
|
FRAME_END
|
||||||
|
ret
|
||||||
|
ENDPROC(hchacha_block_ssse3)
|
||||||
|
|
||||||
|
ENTRY(chacha_4block_xor_ssse3)
|
||||||
|
# %rdi: Input state matrix, s
|
||||||
|
# %rsi: up to 4 data blocks output, o
|
||||||
|
# %rdx: up to 4 data blocks input, i
|
||||||
|
# %rcx: input/output length in bytes
|
||||||
|
# %r8d: nrounds
|
||||||
|
|
||||||
|
# This function encrypts four consecutive ChaCha blocks by loading the
|
||||||
# the state matrix in SSE registers four times. As we need some scratch
|
# the state matrix in SSE registers four times. As we need some scratch
|
||||||
# registers, we save the first four registers on the stack. The
|
# registers, we save the first four registers on the stack. The
|
||||||
# algorithm performs each operation on the corresponding word of each
|
# algorithm performs each operation on the corresponding word of each
|
||||||
|
@ -163,6 +244,7 @@ ENTRY(chacha20_4block_xor_ssse3)
|
||||||
lea 8(%rsp),%r10
|
lea 8(%rsp),%r10
|
||||||
sub $0x80,%rsp
|
sub $0x80,%rsp
|
||||||
and $~63,%rsp
|
and $~63,%rsp
|
||||||
|
mov %rcx,%rax
|
||||||
|
|
||||||
# x0..15[0-3] = s0..3[0..3]
|
# x0..15[0-3] = s0..3[0..3]
|
||||||
movq 0x00(%rdi),%xmm1
|
movq 0x00(%rdi),%xmm1
|
||||||
|
@ -202,8 +284,6 @@ ENTRY(chacha20_4block_xor_ssse3)
|
||||||
# x12 += counter values 0-3
|
# x12 += counter values 0-3
|
||||||
paddd %xmm1,%xmm12
|
paddd %xmm1,%xmm12
|
||||||
|
|
||||||
mov $10,%ecx
|
|
||||||
|
|
||||||
.Ldoubleround4:
|
.Ldoubleround4:
|
||||||
# x0 += x4, x12 = rotl32(x12 ^ x0, 16)
|
# x0 += x4, x12 = rotl32(x12 ^ x0, 16)
|
||||||
movdqa 0x00(%rsp),%xmm0
|
movdqa 0x00(%rsp),%xmm0
|
||||||
|
@ -421,7 +501,7 @@ ENTRY(chacha20_4block_xor_ssse3)
|
||||||
psrld $25,%xmm4
|
psrld $25,%xmm4
|
||||||
por %xmm0,%xmm4
|
por %xmm0,%xmm4
|
||||||
|
|
||||||
-	dec		%ecx
+	sub		$2,%r8d
 	jnz		.Ldoubleround4
 
 	# x0[0-3] += s0[0]
@@ -573,58 +653,143 @@ ENTRY(chacha20_4block_xor_ssse3)
 
 	# xor with corresponding input, write to output
 	movdqa		0x00(%rsp),%xmm0
+	cmp		$0x10,%rax
+	jl		.Lxorpart4
 	movdqu		0x00(%rdx),%xmm1
 	pxor		%xmm1,%xmm0
 	movdqu		%xmm0,0x00(%rsi)
-	movdqa		0x10(%rsp),%xmm0
-	movdqu		0x80(%rdx),%xmm1
+
+	movdqu		%xmm4,%xmm0
+	cmp		$0x20,%rax
+	jl		.Lxorpart4
+	movdqu		0x10(%rdx),%xmm1
 	pxor		%xmm1,%xmm0
-	movdqu		%xmm0,0x80(%rsi)
+	movdqu		%xmm0,0x10(%rsi)
+
+	movdqu		%xmm8,%xmm0
+	cmp		$0x30,%rax
+	jl		.Lxorpart4
+	movdqu		0x20(%rdx),%xmm1
+	pxor		%xmm1,%xmm0
+	movdqu		%xmm0,0x20(%rsi)
+
+	movdqu		%xmm12,%xmm0
+	cmp		$0x40,%rax
+	jl		.Lxorpart4
+	movdqu		0x30(%rdx),%xmm1
+	pxor		%xmm1,%xmm0
+	movdqu		%xmm0,0x30(%rsi)
+
 	movdqa		0x20(%rsp),%xmm0
+	cmp		$0x50,%rax
+	jl		.Lxorpart4
 	movdqu		0x40(%rdx),%xmm1
 	pxor		%xmm1,%xmm0
 	movdqu		%xmm0,0x40(%rsi)
+
+	movdqu		%xmm6,%xmm0
+	cmp		$0x60,%rax
+	jl		.Lxorpart4
+	movdqu		0x50(%rdx),%xmm1
+	pxor		%xmm1,%xmm0
+	movdqu		%xmm0,0x50(%rsi)
+
+	movdqu		%xmm10,%xmm0
+	cmp		$0x70,%rax
+	jl		.Lxorpart4
+	movdqu		0x60(%rdx),%xmm1
+	pxor		%xmm1,%xmm0
+	movdqu		%xmm0,0x60(%rsi)
+
+	movdqu		%xmm14,%xmm0
+	cmp		$0x80,%rax
+	jl		.Lxorpart4
+	movdqu		0x70(%rdx),%xmm1
+	pxor		%xmm1,%xmm0
+	movdqu		%xmm0,0x70(%rsi)
+
+	movdqa		0x10(%rsp),%xmm0
+	cmp		$0x90,%rax
+	jl		.Lxorpart4
+	movdqu		0x80(%rdx),%xmm1
+	pxor		%xmm1,%xmm0
+	movdqu		%xmm0,0x80(%rsi)
+
+	movdqu		%xmm5,%xmm0
+	cmp		$0xa0,%rax
+	jl		.Lxorpart4
+	movdqu		0x90(%rdx),%xmm1
+	pxor		%xmm1,%xmm0
+	movdqu		%xmm0,0x90(%rsi)
+
+	movdqu		%xmm9,%xmm0
+	cmp		$0xb0,%rax
+	jl		.Lxorpart4
+	movdqu		0xa0(%rdx),%xmm1
+	pxor		%xmm1,%xmm0
+	movdqu		%xmm0,0xa0(%rsi)
+
+	movdqu		%xmm13,%xmm0
+	cmp		$0xc0,%rax
+	jl		.Lxorpart4
+	movdqu		0xb0(%rdx),%xmm1
+	pxor		%xmm1,%xmm0
+	movdqu		%xmm0,0xb0(%rsi)
+
 	movdqa		0x30(%rsp),%xmm0
+	cmp		$0xd0,%rax
+	jl		.Lxorpart4
 	movdqu		0xc0(%rdx),%xmm1
 	pxor		%xmm1,%xmm0
 	movdqu		%xmm0,0xc0(%rsi)
-	movdqu		0x10(%rdx),%xmm1
-	pxor		%xmm1,%xmm4
-	movdqu		%xmm4,0x10(%rsi)
-	movdqu		0x90(%rdx),%xmm1
-	pxor		%xmm1,%xmm5
-	movdqu		%xmm5,0x90(%rsi)
-	movdqu		0x50(%rdx),%xmm1
-	pxor		%xmm1,%xmm6
-	movdqu		%xmm6,0x50(%rsi)
-	movdqu		0xd0(%rdx),%xmm1
-	pxor		%xmm1,%xmm7
-	movdqu		%xmm7,0xd0(%rsi)
-	movdqu		0x20(%rdx),%xmm1
-	pxor		%xmm1,%xmm8
-	movdqu		%xmm8,0x20(%rsi)
-	movdqu		0xa0(%rdx),%xmm1
-	pxor		%xmm1,%xmm9
-	movdqu		%xmm9,0xa0(%rsi)
-	movdqu		0x60(%rdx),%xmm1
-	pxor		%xmm1,%xmm10
-	movdqu		%xmm10,0x60(%rsi)
-	movdqu		0xe0(%rdx),%xmm1
-	pxor		%xmm1,%xmm11
-	movdqu		%xmm11,0xe0(%rsi)
-	movdqu		0x30(%rdx),%xmm1
-	pxor		%xmm1,%xmm12
-	movdqu		%xmm12,0x30(%rsi)
-	movdqu		0xb0(%rdx),%xmm1
-	pxor		%xmm1,%xmm13
-	movdqu		%xmm13,0xb0(%rsi)
-	movdqu		0x70(%rdx),%xmm1
-	pxor		%xmm1,%xmm14
-	movdqu		%xmm14,0x70(%rsi)
-	movdqu		0xf0(%rdx),%xmm1
-	pxor		%xmm1,%xmm15
-	movdqu		%xmm15,0xf0(%rsi)
+
+	movdqu		%xmm7,%xmm0
+	cmp		$0xe0,%rax
+	jl		.Lxorpart4
+	movdqu		0xd0(%rdx),%xmm1
+	pxor		%xmm1,%xmm0
+	movdqu		%xmm0,0xd0(%rsi)
+
+	movdqu		%xmm11,%xmm0
+	cmp		$0xf0,%rax
+	jl		.Lxorpart4
+	movdqu		0xe0(%rdx),%xmm1
+	pxor		%xmm1,%xmm0
+	movdqu		%xmm0,0xe0(%rsi)
+
+	movdqu		%xmm15,%xmm0
+	cmp		$0x100,%rax
+	jl		.Lxorpart4
+	movdqu		0xf0(%rdx),%xmm1
+	pxor		%xmm1,%xmm0
+	movdqu		%xmm0,0xf0(%rsi)
+
+.Ldone4:
 	lea		-8(%r10),%rsp
 	ret
-ENDPROC(chacha20_4block_xor_ssse3)
+
+.Lxorpart4:
+	# xor remaining bytes from partial register into output
+	mov		%rax,%r9
+	and		$0x0f,%r9
+	jz		.Ldone4
+	and		$~0x0f,%rax
+
+	mov		%rsi,%r11
+
+	lea		(%rdx,%rax),%rsi
+	mov		%rsp,%rdi
+	mov		%r9,%rcx
+	rep movsb
+
+	pxor		0x00(%rsp),%xmm0
+	movdqa		%xmm0,0x00(%rsp)
+
+	mov		%rsp,%rsi
+	lea		(%r11,%rax),%rdi
+	mov		%r9,%rcx
+	rep movsb
+
+	jmp		.Ldone4
+
+ENDPROC(chacha_4block_xor_ssse3)
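The `.Lxorpart4` path added above implements the standard tail handling for a stream cipher: round the byte count down to whole 16-byte chunks, bounce the leftover bytes through an aligned stack slot with `rep movsb`, XOR that slot against the last keystream register, and copy the result back out. A minimal standalone C sketch of the same idea (the helper name is illustrative, not a kernel function):

#include <stdint.h>
#include <string.h>

/* Sketch: XOR a final partial 16-byte block through a bounce buffer.
 * 'keystream' stands in for the XMM register that .Lxorpart4 reuses. */
static void xor_partial_block(uint8_t *dst, const uint8_t *src,
			      size_t tail_len, const uint8_t keystream[16])
{
	uint8_t buf[16] = { 0 };

	memcpy(buf, src, tail_len);	/* rep movsb: input -> stack */
	for (size_t i = 0; i < 16; i++)
		buf[i] ^= keystream[i];	/* pxor 0x00(%rsp),%xmm0 */
	memcpy(dst, buf, tail_len);	/* rep movsb: stack -> output */
}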
@@ -1,448 +0,0 @@
-/*
- * ChaCha20 256-bit cipher algorithm, RFC7539, x64 AVX2 functions
- *
- * Copyright (C) 2015 Martin Willi
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <linux/linkage.h>
-
-.section	.rodata.cst32.ROT8, "aM", @progbits, 32
-.align 32
-ROT8:	.octa 0x0e0d0c0f0a09080b0605040702010003
-	.octa 0x0e0d0c0f0a09080b0605040702010003
-
-.section	.rodata.cst32.ROT16, "aM", @progbits, 32
-.align 32
-ROT16:	.octa 0x0d0c0f0e09080b0a0504070601000302
-	.octa 0x0d0c0f0e09080b0a0504070601000302
-
-.section	.rodata.cst32.CTRINC, "aM", @progbits, 32
-.align 32
-CTRINC:	.octa 0x00000003000000020000000100000000
-	.octa 0x00000007000000060000000500000004
-
-.text
-
-ENTRY(chacha20_8block_xor_avx2)
-	# %rdi: Input state matrix, s
-	# %rsi: 8 data blocks output, o
-	# %rdx: 8 data blocks input, i
-
-	# This function encrypts eight consecutive ChaCha20 blocks by loading
-	# the state matrix in AVX registers eight times. As we need some
-	# scratch registers, we save the first four registers on the stack. The
-	# algorithm performs each operation on the corresponding word of each
-	# state matrix, hence requires no word shuffling. For final XORing step
-	# we transpose the matrix by interleaving 32-, 64- and then 128-bit
-	# words, which allows us to do XOR in AVX registers. 8/16-bit word
-	# rotation is done with the slightly better performing byte shuffling,
-	# 7/12-bit word rotation uses traditional shift+OR.
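The transpose this comment describes can be checked in isolation: three rounds of interleaves at doubling granularity transpose eight 8-word rows, leaving the rows in bit-reversed rather than sequential order; the same effect is visible in the out-of-order store offsets at the end of this function. A standalone C model of the index shuffle (plain arrays stand in for the ymm registers; this only demonstrates the interleave pattern, not real AVX2 lane semantics):

#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Interleave two 8-word rows in chunks of 'w' words; 'lo' receives the
 * first 8 interleaved words, 'hi' the last 8. */
static void ileave(const uint32_t a[8], const uint32_t b[8], int w,
		   uint32_t lo[8], uint32_t hi[8])
{
	uint32_t tmp[16];
	int n = 0;

	for (int c = 0; c < 8 / w; c++) {
		memcpy(&tmp[n], &a[c * w], w * sizeof(*a)); n += w;
		memcpy(&tmp[n], &b[c * w], w * sizeof(*b)); n += w;
	}
	memcpy(lo, tmp, 8 * sizeof(*tmp));
	memcpy(hi, tmp + 8, 8 * sizeof(*tmp));
}

int main(void)
{
	uint32_t m[8][8];
	static const int step[3] = { 1, 2, 4 };	/* 32-, 64-, 128-bit */
	/* after three rounds, row k holds column rev3(k) */
	static const int rev3[8] = { 0, 4, 2, 6, 1, 5, 3, 7 };

	for (int r = 0; r < 8; r++)
		for (int c = 0; c < 8; c++)
			m[r][c] = r * 8 + c;

	for (int round = 0; round < 3; round++) {
		int s = step[round];

		for (int r = 0; r < 8; r++) {
			uint32_t lo[8], hi[8];

			if (r & s)	/* pair rows (r, r + s) once */
				continue;
			ileave(m[r], m[r + s], s, lo, hi);
			memcpy(m[r], lo, sizeof(lo));
			memcpy(m[r + s], hi, sizeof(hi));
		}
	}

	for (int r = 0; r < 8; r++)
		for (int c = 0; c < 8; c++)
			assert(m[r][c] == (uint32_t)(c * 8 + rev3[r]));
	return 0;
}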
-
-	vzeroupper
-	# 4 * 32 byte stack, 32-byte aligned
-	lea		8(%rsp),%r10
-	and		$~31, %rsp
-	sub		$0x80, %rsp
-
-	# x0..15[0-7] = s[0..15]
-	vpbroadcastd	0x00(%rdi),%ymm0
-	vpbroadcastd	0x04(%rdi),%ymm1
-	vpbroadcastd	0x08(%rdi),%ymm2
-	vpbroadcastd	0x0c(%rdi),%ymm3
-	vpbroadcastd	0x10(%rdi),%ymm4
-	vpbroadcastd	0x14(%rdi),%ymm5
-	vpbroadcastd	0x18(%rdi),%ymm6
-	vpbroadcastd	0x1c(%rdi),%ymm7
-	vpbroadcastd	0x20(%rdi),%ymm8
-	vpbroadcastd	0x24(%rdi),%ymm9
-	vpbroadcastd	0x28(%rdi),%ymm10
-	vpbroadcastd	0x2c(%rdi),%ymm11
-	vpbroadcastd	0x30(%rdi),%ymm12
-	vpbroadcastd	0x34(%rdi),%ymm13
-	vpbroadcastd	0x38(%rdi),%ymm14
-	vpbroadcastd	0x3c(%rdi),%ymm15
-	# x0..3 on stack
-	vmovdqa		%ymm0,0x00(%rsp)
-	vmovdqa		%ymm1,0x20(%rsp)
-	vmovdqa		%ymm2,0x40(%rsp)
-	vmovdqa		%ymm3,0x60(%rsp)
-
-	vmovdqa		CTRINC(%rip),%ymm1
-	vmovdqa		ROT8(%rip),%ymm2
-	vmovdqa		ROT16(%rip),%ymm3
-
-	# x12 += counter values 0-3
-	vpaddd		%ymm1,%ymm12,%ymm12
-
-	mov		$10,%ecx
-
-.Ldoubleround8:
-	# x0 += x4, x12 = rotl32(x12 ^ x0, 16)
-	vpaddd		0x00(%rsp),%ymm4,%ymm0
-	vmovdqa		%ymm0,0x00(%rsp)
-	vpxor		%ymm0,%ymm12,%ymm12
-	vpshufb		%ymm3,%ymm12,%ymm12
-	# x1 += x5, x13 = rotl32(x13 ^ x1, 16)
-	vpaddd		0x20(%rsp),%ymm5,%ymm0
-	vmovdqa		%ymm0,0x20(%rsp)
-	vpxor		%ymm0,%ymm13,%ymm13
-	vpshufb		%ymm3,%ymm13,%ymm13
-	# x2 += x6, x14 = rotl32(x14 ^ x2, 16)
-	vpaddd		0x40(%rsp),%ymm6,%ymm0
-	vmovdqa		%ymm0,0x40(%rsp)
-	vpxor		%ymm0,%ymm14,%ymm14
-	vpshufb		%ymm3,%ymm14,%ymm14
-	# x3 += x7, x15 = rotl32(x15 ^ x3, 16)
-	vpaddd		0x60(%rsp),%ymm7,%ymm0
-	vmovdqa		%ymm0,0x60(%rsp)
-	vpxor		%ymm0,%ymm15,%ymm15
-	vpshufb		%ymm3,%ymm15,%ymm15
-
-	# x8 += x12, x4 = rotl32(x4 ^ x8, 12)
-	vpaddd		%ymm12,%ymm8,%ymm8
-	vpxor		%ymm8,%ymm4,%ymm4
-	vpslld		$12,%ymm4,%ymm0
-	vpsrld		$20,%ymm4,%ymm4
-	vpor		%ymm0,%ymm4,%ymm4
-	# x9 += x13, x5 = rotl32(x5 ^ x9, 12)
-	vpaddd		%ymm13,%ymm9,%ymm9
-	vpxor		%ymm9,%ymm5,%ymm5
-	vpslld		$12,%ymm5,%ymm0
-	vpsrld		$20,%ymm5,%ymm5
-	vpor		%ymm0,%ymm5,%ymm5
-	# x10 += x14, x6 = rotl32(x6 ^ x10, 12)
-	vpaddd		%ymm14,%ymm10,%ymm10
-	vpxor		%ymm10,%ymm6,%ymm6
-	vpslld		$12,%ymm6,%ymm0
-	vpsrld		$20,%ymm6,%ymm6
-	vpor		%ymm0,%ymm6,%ymm6
-	# x11 += x15, x7 = rotl32(x7 ^ x11, 12)
-	vpaddd		%ymm15,%ymm11,%ymm11
-	vpxor		%ymm11,%ymm7,%ymm7
-	vpslld		$12,%ymm7,%ymm0
-	vpsrld		$20,%ymm7,%ymm7
-	vpor		%ymm0,%ymm7,%ymm7
-
-	# x0 += x4, x12 = rotl32(x12 ^ x0, 8)
-	vpaddd		0x00(%rsp),%ymm4,%ymm0
-	vmovdqa		%ymm0,0x00(%rsp)
-	vpxor		%ymm0,%ymm12,%ymm12
-	vpshufb		%ymm2,%ymm12,%ymm12
-	# x1 += x5, x13 = rotl32(x13 ^ x1, 8)
-	vpaddd		0x20(%rsp),%ymm5,%ymm0
-	vmovdqa		%ymm0,0x20(%rsp)
-	vpxor		%ymm0,%ymm13,%ymm13
-	vpshufb		%ymm2,%ymm13,%ymm13
-	# x2 += x6, x14 = rotl32(x14 ^ x2, 8)
-	vpaddd		0x40(%rsp),%ymm6,%ymm0
-	vmovdqa		%ymm0,0x40(%rsp)
-	vpxor		%ymm0,%ymm14,%ymm14
-	vpshufb		%ymm2,%ymm14,%ymm14
-	# x3 += x7, x15 = rotl32(x15 ^ x3, 8)
-	vpaddd		0x60(%rsp),%ymm7,%ymm0
-	vmovdqa		%ymm0,0x60(%rsp)
-	vpxor		%ymm0,%ymm15,%ymm15
-	vpshufb		%ymm2,%ymm15,%ymm15
-
-	# x8 += x12, x4 = rotl32(x4 ^ x8, 7)
-	vpaddd		%ymm12,%ymm8,%ymm8
-	vpxor		%ymm8,%ymm4,%ymm4
-	vpslld		$7,%ymm4,%ymm0
-	vpsrld		$25,%ymm4,%ymm4
-	vpor		%ymm0,%ymm4,%ymm4
-	# x9 += x13, x5 = rotl32(x5 ^ x9, 7)
-	vpaddd		%ymm13,%ymm9,%ymm9
-	vpxor		%ymm9,%ymm5,%ymm5
-	vpslld		$7,%ymm5,%ymm0
-	vpsrld		$25,%ymm5,%ymm5
-	vpor		%ymm0,%ymm5,%ymm5
-	# x10 += x14, x6 = rotl32(x6 ^ x10, 7)
-	vpaddd		%ymm14,%ymm10,%ymm10
-	vpxor		%ymm10,%ymm6,%ymm6
-	vpslld		$7,%ymm6,%ymm0
-	vpsrld		$25,%ymm6,%ymm6
-	vpor		%ymm0,%ymm6,%ymm6
-	# x11 += x15, x7 = rotl32(x7 ^ x11, 7)
-	vpaddd		%ymm15,%ymm11,%ymm11
-	vpxor		%ymm11,%ymm7,%ymm7
-	vpslld		$7,%ymm7,%ymm0
-	vpsrld		$25,%ymm7,%ymm7
-	vpor		%ymm0,%ymm7,%ymm7
-
-	# x0 += x5, x15 = rotl32(x15 ^ x0, 16)
-	vpaddd		0x00(%rsp),%ymm5,%ymm0
-	vmovdqa		%ymm0,0x00(%rsp)
-	vpxor		%ymm0,%ymm15,%ymm15
-	vpshufb		%ymm3,%ymm15,%ymm15
-	# x1 += x6, x12 = rotl32(x12 ^ x1, 16)%ymm0
-	vpaddd		0x20(%rsp),%ymm6,%ymm0
-	vmovdqa		%ymm0,0x20(%rsp)
-	vpxor		%ymm0,%ymm12,%ymm12
-	vpshufb		%ymm3,%ymm12,%ymm12
-	# x2 += x7, x13 = rotl32(x13 ^ x2, 16)
-	vpaddd		0x40(%rsp),%ymm7,%ymm0
-	vmovdqa		%ymm0,0x40(%rsp)
-	vpxor		%ymm0,%ymm13,%ymm13
-	vpshufb		%ymm3,%ymm13,%ymm13
-	# x3 += x4, x14 = rotl32(x14 ^ x3, 16)
-	vpaddd		0x60(%rsp),%ymm4,%ymm0
-	vmovdqa		%ymm0,0x60(%rsp)
-	vpxor		%ymm0,%ymm14,%ymm14
-	vpshufb		%ymm3,%ymm14,%ymm14
-
-	# x10 += x15, x5 = rotl32(x5 ^ x10, 12)
-	vpaddd		%ymm15,%ymm10,%ymm10
-	vpxor		%ymm10,%ymm5,%ymm5
-	vpslld		$12,%ymm5,%ymm0
-	vpsrld		$20,%ymm5,%ymm5
-	vpor		%ymm0,%ymm5,%ymm5
-	# x11 += x12, x6 = rotl32(x6 ^ x11, 12)
-	vpaddd		%ymm12,%ymm11,%ymm11
-	vpxor		%ymm11,%ymm6,%ymm6
-	vpslld		$12,%ymm6,%ymm0
-	vpsrld		$20,%ymm6,%ymm6
-	vpor		%ymm0,%ymm6,%ymm6
-	# x8 += x13, x7 = rotl32(x7 ^ x8, 12)
-	vpaddd		%ymm13,%ymm8,%ymm8
-	vpxor		%ymm8,%ymm7,%ymm7
-	vpslld		$12,%ymm7,%ymm0
-	vpsrld		$20,%ymm7,%ymm7
-	vpor		%ymm0,%ymm7,%ymm7
-	# x9 += x14, x4 = rotl32(x4 ^ x9, 12)
-	vpaddd		%ymm14,%ymm9,%ymm9
-	vpxor		%ymm9,%ymm4,%ymm4
-	vpslld		$12,%ymm4,%ymm0
-	vpsrld		$20,%ymm4,%ymm4
-	vpor		%ymm0,%ymm4,%ymm4
-
-	# x0 += x5, x15 = rotl32(x15 ^ x0, 8)
-	vpaddd		0x00(%rsp),%ymm5,%ymm0
-	vmovdqa		%ymm0,0x00(%rsp)
-	vpxor		%ymm0,%ymm15,%ymm15
-	vpshufb		%ymm2,%ymm15,%ymm15
-	# x1 += x6, x12 = rotl32(x12 ^ x1, 8)
-	vpaddd		0x20(%rsp),%ymm6,%ymm0
-	vmovdqa		%ymm0,0x20(%rsp)
-	vpxor		%ymm0,%ymm12,%ymm12
-	vpshufb		%ymm2,%ymm12,%ymm12
-	# x2 += x7, x13 = rotl32(x13 ^ x2, 8)
-	vpaddd		0x40(%rsp),%ymm7,%ymm0
-	vmovdqa		%ymm0,0x40(%rsp)
-	vpxor		%ymm0,%ymm13,%ymm13
-	vpshufb		%ymm2,%ymm13,%ymm13
-	# x3 += x4, x14 = rotl32(x14 ^ x3, 8)
-	vpaddd		0x60(%rsp),%ymm4,%ymm0
-	vmovdqa		%ymm0,0x60(%rsp)
-	vpxor		%ymm0,%ymm14,%ymm14
-	vpshufb		%ymm2,%ymm14,%ymm14
-
-	# x10 += x15, x5 = rotl32(x5 ^ x10, 7)
-	vpaddd		%ymm15,%ymm10,%ymm10
-	vpxor		%ymm10,%ymm5,%ymm5
-	vpslld		$7,%ymm5,%ymm0
-	vpsrld		$25,%ymm5,%ymm5
-	vpor		%ymm0,%ymm5,%ymm5
-	# x11 += x12, x6 = rotl32(x6 ^ x11, 7)
-	vpaddd		%ymm12,%ymm11,%ymm11
-	vpxor		%ymm11,%ymm6,%ymm6
-	vpslld		$7,%ymm6,%ymm0
-	vpsrld		$25,%ymm6,%ymm6
-	vpor		%ymm0,%ymm6,%ymm6
-	# x8 += x13, x7 = rotl32(x7 ^ x8, 7)
-	vpaddd		%ymm13,%ymm8,%ymm8
-	vpxor		%ymm8,%ymm7,%ymm7
-	vpslld		$7,%ymm7,%ymm0
-	vpsrld		$25,%ymm7,%ymm7
-	vpor		%ymm0,%ymm7,%ymm7
-	# x9 += x14, x4 = rotl32(x4 ^ x9, 7)
-	vpaddd		%ymm14,%ymm9,%ymm9
-	vpxor		%ymm9,%ymm4,%ymm4
-	vpslld		$7,%ymm4,%ymm0
-	vpsrld		$25,%ymm4,%ymm4
-	vpor		%ymm0,%ymm4,%ymm4
-
-	dec		%ecx
-	jnz		.Ldoubleround8
-
-	# x0..15[0-3] += s[0..15]
-	vpbroadcastd	0x00(%rdi),%ymm0
-	vpaddd		0x00(%rsp),%ymm0,%ymm0
-	vmovdqa		%ymm0,0x00(%rsp)
-	vpbroadcastd	0x04(%rdi),%ymm0
-	vpaddd		0x20(%rsp),%ymm0,%ymm0
-	vmovdqa		%ymm0,0x20(%rsp)
-	vpbroadcastd	0x08(%rdi),%ymm0
-	vpaddd		0x40(%rsp),%ymm0,%ymm0
-	vmovdqa		%ymm0,0x40(%rsp)
-	vpbroadcastd	0x0c(%rdi),%ymm0
-	vpaddd		0x60(%rsp),%ymm0,%ymm0
-	vmovdqa		%ymm0,0x60(%rsp)
-	vpbroadcastd	0x10(%rdi),%ymm0
-	vpaddd		%ymm0,%ymm4,%ymm4
-	vpbroadcastd	0x14(%rdi),%ymm0
-	vpaddd		%ymm0,%ymm5,%ymm5
-	vpbroadcastd	0x18(%rdi),%ymm0
-	vpaddd		%ymm0,%ymm6,%ymm6
-	vpbroadcastd	0x1c(%rdi),%ymm0
-	vpaddd		%ymm0,%ymm7,%ymm7
-	vpbroadcastd	0x20(%rdi),%ymm0
-	vpaddd		%ymm0,%ymm8,%ymm8
-	vpbroadcastd	0x24(%rdi),%ymm0
-	vpaddd		%ymm0,%ymm9,%ymm9
-	vpbroadcastd	0x28(%rdi),%ymm0
-	vpaddd		%ymm0,%ymm10,%ymm10
-	vpbroadcastd	0x2c(%rdi),%ymm0
-	vpaddd		%ymm0,%ymm11,%ymm11
-	vpbroadcastd	0x30(%rdi),%ymm0
-	vpaddd		%ymm0,%ymm12,%ymm12
-	vpbroadcastd	0x34(%rdi),%ymm0
-	vpaddd		%ymm0,%ymm13,%ymm13
-	vpbroadcastd	0x38(%rdi),%ymm0
-	vpaddd		%ymm0,%ymm14,%ymm14
-	vpbroadcastd	0x3c(%rdi),%ymm0
-	vpaddd		%ymm0,%ymm15,%ymm15
-
-	# x12 += counter values 0-3
-	vpaddd		%ymm1,%ymm12,%ymm12
-
-	# interleave 32-bit words in state n, n+1
-	vmovdqa		0x00(%rsp),%ymm0
-	vmovdqa		0x20(%rsp),%ymm1
-	vpunpckldq	%ymm1,%ymm0,%ymm2
-	vpunpckhdq	%ymm1,%ymm0,%ymm1
-	vmovdqa		%ymm2,0x00(%rsp)
-	vmovdqa		%ymm1,0x20(%rsp)
-	vmovdqa		0x40(%rsp),%ymm0
-	vmovdqa		0x60(%rsp),%ymm1
-	vpunpckldq	%ymm1,%ymm0,%ymm2
-	vpunpckhdq	%ymm1,%ymm0,%ymm1
-	vmovdqa		%ymm2,0x40(%rsp)
-	vmovdqa		%ymm1,0x60(%rsp)
-	vmovdqa		%ymm4,%ymm0
-	vpunpckldq	%ymm5,%ymm0,%ymm4
-	vpunpckhdq	%ymm5,%ymm0,%ymm5
-	vmovdqa		%ymm6,%ymm0
-	vpunpckldq	%ymm7,%ymm0,%ymm6
-	vpunpckhdq	%ymm7,%ymm0,%ymm7
-	vmovdqa		%ymm8,%ymm0
-	vpunpckldq	%ymm9,%ymm0,%ymm8
-	vpunpckhdq	%ymm9,%ymm0,%ymm9
-	vmovdqa		%ymm10,%ymm0
-	vpunpckldq	%ymm11,%ymm0,%ymm10
-	vpunpckhdq	%ymm11,%ymm0,%ymm11
-	vmovdqa		%ymm12,%ymm0
-	vpunpckldq	%ymm13,%ymm0,%ymm12
-	vpunpckhdq	%ymm13,%ymm0,%ymm13
-	vmovdqa		%ymm14,%ymm0
-	vpunpckldq	%ymm15,%ymm0,%ymm14
-	vpunpckhdq	%ymm15,%ymm0,%ymm15
-
-	# interleave 64-bit words in state n, n+2
-	vmovdqa		0x00(%rsp),%ymm0
-	vmovdqa		0x40(%rsp),%ymm2
-	vpunpcklqdq	%ymm2,%ymm0,%ymm1
-	vpunpckhqdq	%ymm2,%ymm0,%ymm2
-	vmovdqa		%ymm1,0x00(%rsp)
-	vmovdqa		%ymm2,0x40(%rsp)
-	vmovdqa		0x20(%rsp),%ymm0
-	vmovdqa		0x60(%rsp),%ymm2
-	vpunpcklqdq	%ymm2,%ymm0,%ymm1
-	vpunpckhqdq	%ymm2,%ymm0,%ymm2
-	vmovdqa		%ymm1,0x20(%rsp)
-	vmovdqa		%ymm2,0x60(%rsp)
-	vmovdqa		%ymm4,%ymm0
-	vpunpcklqdq	%ymm6,%ymm0,%ymm4
-	vpunpckhqdq	%ymm6,%ymm0,%ymm6
-	vmovdqa		%ymm5,%ymm0
-	vpunpcklqdq	%ymm7,%ymm0,%ymm5
-	vpunpckhqdq	%ymm7,%ymm0,%ymm7
-	vmovdqa		%ymm8,%ymm0
-	vpunpcklqdq	%ymm10,%ymm0,%ymm8
-	vpunpckhqdq	%ymm10,%ymm0,%ymm10
-	vmovdqa		%ymm9,%ymm0
-	vpunpcklqdq	%ymm11,%ymm0,%ymm9
-	vpunpckhqdq	%ymm11,%ymm0,%ymm11
-	vmovdqa		%ymm12,%ymm0
-	vpunpcklqdq	%ymm14,%ymm0,%ymm12
-	vpunpckhqdq	%ymm14,%ymm0,%ymm14
-	vmovdqa		%ymm13,%ymm0
-	vpunpcklqdq	%ymm15,%ymm0,%ymm13
-	vpunpckhqdq	%ymm15,%ymm0,%ymm15
-
-	# interleave 128-bit words in state n, n+4
-	vmovdqa		0x00(%rsp),%ymm0
-	vperm2i128	$0x20,%ymm4,%ymm0,%ymm1
-	vperm2i128	$0x31,%ymm4,%ymm0,%ymm4
-	vmovdqa		%ymm1,0x00(%rsp)
-	vmovdqa		0x20(%rsp),%ymm0
-	vperm2i128	$0x20,%ymm5,%ymm0,%ymm1
-	vperm2i128	$0x31,%ymm5,%ymm0,%ymm5
-	vmovdqa		%ymm1,0x20(%rsp)
-	vmovdqa		0x40(%rsp),%ymm0
-	vperm2i128	$0x20,%ymm6,%ymm0,%ymm1
-	vperm2i128	$0x31,%ymm6,%ymm0,%ymm6
-	vmovdqa		%ymm1,0x40(%rsp)
-	vmovdqa		0x60(%rsp),%ymm0
-	vperm2i128	$0x20,%ymm7,%ymm0,%ymm1
-	vperm2i128	$0x31,%ymm7,%ymm0,%ymm7
-	vmovdqa		%ymm1,0x60(%rsp)
-	vperm2i128	$0x20,%ymm12,%ymm8,%ymm0
-	vperm2i128	$0x31,%ymm12,%ymm8,%ymm12
-	vmovdqa		%ymm0,%ymm8
-	vperm2i128	$0x20,%ymm13,%ymm9,%ymm0
-	vperm2i128	$0x31,%ymm13,%ymm9,%ymm13
-	vmovdqa		%ymm0,%ymm9
-	vperm2i128	$0x20,%ymm14,%ymm10,%ymm0
-	vperm2i128	$0x31,%ymm14,%ymm10,%ymm14
-	vmovdqa		%ymm0,%ymm10
-	vperm2i128	$0x20,%ymm15,%ymm11,%ymm0
-	vperm2i128	$0x31,%ymm15,%ymm11,%ymm15
-	vmovdqa		%ymm0,%ymm11
-
-	# xor with corresponding input, write to output
-	vmovdqa		0x00(%rsp),%ymm0
-	vpxor		0x0000(%rdx),%ymm0,%ymm0
-	vmovdqu		%ymm0,0x0000(%rsi)
-	vmovdqa		0x20(%rsp),%ymm0
-	vpxor		0x0080(%rdx),%ymm0,%ymm0
-	vmovdqu		%ymm0,0x0080(%rsi)
-	vmovdqa		0x40(%rsp),%ymm0
-	vpxor		0x0040(%rdx),%ymm0,%ymm0
-	vmovdqu		%ymm0,0x0040(%rsi)
-	vmovdqa		0x60(%rsp),%ymm0
-	vpxor		0x00c0(%rdx),%ymm0,%ymm0
-	vmovdqu		%ymm0,0x00c0(%rsi)
-	vpxor		0x0100(%rdx),%ymm4,%ymm4
-	vmovdqu		%ymm4,0x0100(%rsi)
-	vpxor		0x0180(%rdx),%ymm5,%ymm5
-	vmovdqu		%ymm5,0x00180(%rsi)
-	vpxor		0x0140(%rdx),%ymm6,%ymm6
-	vmovdqu		%ymm6,0x0140(%rsi)
-	vpxor		0x01c0(%rdx),%ymm7,%ymm7
-	vmovdqu		%ymm7,0x01c0(%rsi)
-	vpxor		0x0020(%rdx),%ymm8,%ymm8
-	vmovdqu		%ymm8,0x0020(%rsi)
-	vpxor		0x00a0(%rdx),%ymm9,%ymm9
-	vmovdqu		%ymm9,0x00a0(%rsi)
-	vpxor		0x0060(%rdx),%ymm10,%ymm10
-	vmovdqu		%ymm10,0x0060(%rsi)
-	vpxor		0x00e0(%rdx),%ymm11,%ymm11
-	vmovdqu		%ymm11,0x00e0(%rsi)
-	vpxor		0x0120(%rdx),%ymm12,%ymm12
-	vmovdqu		%ymm12,0x0120(%rsi)
-	vpxor		0x01a0(%rdx),%ymm13,%ymm13
-	vmovdqu		%ymm13,0x01a0(%rsi)
-	vpxor		0x0160(%rdx),%ymm14,%ymm14
-	vmovdqu		%ymm14,0x0160(%rsi)
-	vpxor		0x01e0(%rdx),%ymm15,%ymm15
-	vmovdqu		%ymm15,0x01e0(%rsi)
-
-	vzeroupper
-	lea		-8(%r10),%rsp
-	ret
-ENDPROC(chacha20_8block_xor_avx2)
@@ -1,146 +0,0 @@
-/*
- * ChaCha20 256-bit cipher algorithm, RFC7539, SIMD glue code
- *
- * Copyright (C) 2015 Martin Willi
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <crypto/algapi.h>
-#include <crypto/chacha20.h>
-#include <crypto/internal/skcipher.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <asm/fpu/api.h>
-#include <asm/simd.h>
-
-#define CHACHA20_STATE_ALIGN 16
-
-asmlinkage void chacha20_block_xor_ssse3(u32 *state, u8 *dst, const u8 *src);
-asmlinkage void chacha20_4block_xor_ssse3(u32 *state, u8 *dst, const u8 *src);
-#ifdef CONFIG_AS_AVX2
-asmlinkage void chacha20_8block_xor_avx2(u32 *state, u8 *dst, const u8 *src);
-static bool chacha20_use_avx2;
-#endif
-
-static void chacha20_dosimd(u32 *state, u8 *dst, const u8 *src,
-			    unsigned int bytes)
-{
-	u8 buf[CHACHA20_BLOCK_SIZE];
-
-#ifdef CONFIG_AS_AVX2
-	if (chacha20_use_avx2) {
-		while (bytes >= CHACHA20_BLOCK_SIZE * 8) {
-			chacha20_8block_xor_avx2(state, dst, src);
-			bytes -= CHACHA20_BLOCK_SIZE * 8;
-			src += CHACHA20_BLOCK_SIZE * 8;
-			dst += CHACHA20_BLOCK_SIZE * 8;
-			state[12] += 8;
-		}
-	}
-#endif
-	while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
-		chacha20_4block_xor_ssse3(state, dst, src);
-		bytes -= CHACHA20_BLOCK_SIZE * 4;
-		src += CHACHA20_BLOCK_SIZE * 4;
-		dst += CHACHA20_BLOCK_SIZE * 4;
-		state[12] += 4;
-	}
-	while (bytes >= CHACHA20_BLOCK_SIZE) {
-		chacha20_block_xor_ssse3(state, dst, src);
-		bytes -= CHACHA20_BLOCK_SIZE;
-		src += CHACHA20_BLOCK_SIZE;
-		dst += CHACHA20_BLOCK_SIZE;
-		state[12]++;
-	}
-	if (bytes) {
-		memcpy(buf, src, bytes);
-		chacha20_block_xor_ssse3(state, buf, buf);
-		memcpy(dst, buf, bytes);
-	}
-}
-
-static int chacha20_simd(struct skcipher_request *req)
-{
-	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
-	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
-	u32 *state, state_buf[16 + 2] __aligned(8);
-	struct skcipher_walk walk;
-	int err;
-
-	BUILD_BUG_ON(CHACHA20_STATE_ALIGN != 16);
-	state = PTR_ALIGN(state_buf + 0, CHACHA20_STATE_ALIGN);
-
-	if (req->cryptlen <= CHACHA20_BLOCK_SIZE || !may_use_simd())
-		return crypto_chacha20_crypt(req);
-
-	err = skcipher_walk_virt(&walk, req, true);
-
-	crypto_chacha20_init(state, ctx, walk.iv);
-
-	kernel_fpu_begin();
-
-	while (walk.nbytes >= CHACHA20_BLOCK_SIZE) {
-		chacha20_dosimd(state, walk.dst.virt.addr, walk.src.virt.addr,
-				rounddown(walk.nbytes, CHACHA20_BLOCK_SIZE));
-		err = skcipher_walk_done(&walk,
-					 walk.nbytes % CHACHA20_BLOCK_SIZE);
-	}
-
-	if (walk.nbytes) {
-		chacha20_dosimd(state, walk.dst.virt.addr, walk.src.virt.addr,
-				walk.nbytes);
-		err = skcipher_walk_done(&walk, 0);
-	}
-
-	kernel_fpu_end();
-
-	return err;
-}
-
-static struct skcipher_alg alg = {
-	.base.cra_name		= "chacha20",
-	.base.cra_driver_name	= "chacha20-simd",
-	.base.cra_priority	= 300,
-	.base.cra_blocksize	= 1,
-	.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
-	.base.cra_module	= THIS_MODULE,
-
-	.min_keysize		= CHACHA20_KEY_SIZE,
-	.max_keysize		= CHACHA20_KEY_SIZE,
-	.ivsize			= CHACHA20_IV_SIZE,
-	.chunksize		= CHACHA20_BLOCK_SIZE,
-	.setkey			= crypto_chacha20_setkey,
-	.encrypt		= chacha20_simd,
-	.decrypt		= chacha20_simd,
-};
-
-static int __init chacha20_simd_mod_init(void)
-{
-	if (!boot_cpu_has(X86_FEATURE_SSSE3))
-		return -ENODEV;
-
-#ifdef CONFIG_AS_AVX2
-	chacha20_use_avx2 = boot_cpu_has(X86_FEATURE_AVX) &&
-			    boot_cpu_has(X86_FEATURE_AVX2) &&
-			    cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL);
-#endif
-	return crypto_register_skcipher(&alg);
-}
-
-static void __exit chacha20_simd_mod_fini(void)
-{
-	crypto_unregister_skcipher(&alg);
-}
-
-module_init(chacha20_simd_mod_init);
-module_exit(chacha20_simd_mod_fini);
-
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Martin Willi <martin@strongswan.org>");
-MODULE_DESCRIPTION("chacha20 cipher algorithm, SIMD accelerated");
-MODULE_ALIAS_CRYPTO("chacha20");
-MODULE_ALIAS_CRYPTO("chacha20-simd");
@@ -0,0 +1,304 @@
+/*
+ * x64 SIMD accelerated ChaCha and XChaCha stream ciphers,
+ * including ChaCha20 (RFC7539)
+ *
+ * Copyright (C) 2015 Martin Willi
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <crypto/algapi.h>
+#include <crypto/chacha.h>
+#include <crypto/internal/skcipher.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <asm/fpu/api.h>
+#include <asm/simd.h>
+
+#define CHACHA_STATE_ALIGN 16
+
+asmlinkage void chacha_block_xor_ssse3(u32 *state, u8 *dst, const u8 *src,
+				       unsigned int len, int nrounds);
+asmlinkage void chacha_4block_xor_ssse3(u32 *state, u8 *dst, const u8 *src,
+					unsigned int len, int nrounds);
+asmlinkage void hchacha_block_ssse3(const u32 *state, u32 *out, int nrounds);
+#ifdef CONFIG_AS_AVX2
+asmlinkage void chacha_2block_xor_avx2(u32 *state, u8 *dst, const u8 *src,
+				       unsigned int len, int nrounds);
+asmlinkage void chacha_4block_xor_avx2(u32 *state, u8 *dst, const u8 *src,
+				       unsigned int len, int nrounds);
+asmlinkage void chacha_8block_xor_avx2(u32 *state, u8 *dst, const u8 *src,
+				       unsigned int len, int nrounds);
+static bool chacha_use_avx2;
+#ifdef CONFIG_AS_AVX512
+asmlinkage void chacha_2block_xor_avx512vl(u32 *state, u8 *dst, const u8 *src,
+					   unsigned int len, int nrounds);
+asmlinkage void chacha_4block_xor_avx512vl(u32 *state, u8 *dst, const u8 *src,
+					   unsigned int len, int nrounds);
+asmlinkage void chacha_8block_xor_avx512vl(u32 *state, u8 *dst, const u8 *src,
+					   unsigned int len, int nrounds);
+static bool chacha_use_avx512vl;
+#endif
+#endif
+
+static unsigned int chacha_advance(unsigned int len, unsigned int maxblocks)
+{
+	len = min(len, maxblocks * CHACHA_BLOCK_SIZE);
+	return round_up(len, CHACHA_BLOCK_SIZE) / CHACHA_BLOCK_SIZE;
+}
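chacha_advance() exists because the SIMD routines now take an explicit length: called on a partial tail, they still generate whole keystream blocks internally, so the caller must bump the block counter (state[12]) by the number of blocks actually produced, not by the bytes consumed rounded down. A worked standalone restatement of the arithmetic (assumes CHACHA_BLOCK_SIZE == 64, as in the kernel):

#include <assert.h>

#define CHACHA_BLOCK_SIZE 64

/* Local re-statement of chacha_advance() above. */
static unsigned int advance(unsigned int len, unsigned int maxblocks)
{
	if (len > maxblocks * CHACHA_BLOCK_SIZE)
		len = maxblocks * CHACHA_BLOCK_SIZE;
	return (len + CHACHA_BLOCK_SIZE - 1) / CHACHA_BLOCK_SIZE;
}

int main(void)
{
	assert(advance(100, 4) == 2);	/* 100-byte tail: 2 keystream blocks */
	assert(advance(64, 2) == 1);	/* exactly one block */
	assert(advance(65, 2) == 2);	/* one byte over needs a second block */
	assert(advance(1000, 8) == 8);	/* clamped to what the routine did */
	return 0;
}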
+
+static void chacha_dosimd(u32 *state, u8 *dst, const u8 *src,
+			  unsigned int bytes, int nrounds)
+{
+#ifdef CONFIG_AS_AVX2
+#ifdef CONFIG_AS_AVX512
+	if (chacha_use_avx512vl) {
+		while (bytes >= CHACHA_BLOCK_SIZE * 8) {
+			chacha_8block_xor_avx512vl(state, dst, src, bytes,
+						   nrounds);
+			bytes -= CHACHA_BLOCK_SIZE * 8;
+			src += CHACHA_BLOCK_SIZE * 8;
+			dst += CHACHA_BLOCK_SIZE * 8;
+			state[12] += 8;
+		}
+		if (bytes > CHACHA_BLOCK_SIZE * 4) {
+			chacha_8block_xor_avx512vl(state, dst, src, bytes,
+						   nrounds);
+			state[12] += chacha_advance(bytes, 8);
+			return;
+		}
+		if (bytes > CHACHA_BLOCK_SIZE * 2) {
+			chacha_4block_xor_avx512vl(state, dst, src, bytes,
+						   nrounds);
+			state[12] += chacha_advance(bytes, 4);
+			return;
+		}
+		if (bytes) {
+			chacha_2block_xor_avx512vl(state, dst, src, bytes,
+						   nrounds);
+			state[12] += chacha_advance(bytes, 2);
+			return;
+		}
+	}
+#endif
+	if (chacha_use_avx2) {
+		while (bytes >= CHACHA_BLOCK_SIZE * 8) {
+			chacha_8block_xor_avx2(state, dst, src, bytes, nrounds);
+			bytes -= CHACHA_BLOCK_SIZE * 8;
+			src += CHACHA_BLOCK_SIZE * 8;
+			dst += CHACHA_BLOCK_SIZE * 8;
+			state[12] += 8;
+		}
+		if (bytes > CHACHA_BLOCK_SIZE * 4) {
+			chacha_8block_xor_avx2(state, dst, src, bytes, nrounds);
+			state[12] += chacha_advance(bytes, 8);
+			return;
+		}
+		if (bytes > CHACHA_BLOCK_SIZE * 2) {
+			chacha_4block_xor_avx2(state, dst, src, bytes, nrounds);
+			state[12] += chacha_advance(bytes, 4);
+			return;
+		}
+		if (bytes > CHACHA_BLOCK_SIZE) {
+			chacha_2block_xor_avx2(state, dst, src, bytes, nrounds);
+			state[12] += chacha_advance(bytes, 2);
+			return;
+		}
+	}
+#endif
+	while (bytes >= CHACHA_BLOCK_SIZE * 4) {
+		chacha_4block_xor_ssse3(state, dst, src, bytes, nrounds);
+		bytes -= CHACHA_BLOCK_SIZE * 4;
+		src += CHACHA_BLOCK_SIZE * 4;
+		dst += CHACHA_BLOCK_SIZE * 4;
+		state[12] += 4;
+	}
+	if (bytes > CHACHA_BLOCK_SIZE) {
+		chacha_4block_xor_ssse3(state, dst, src, bytes, nrounds);
+		state[12] += chacha_advance(bytes, 4);
+		return;
+	}
+	if (bytes) {
+		chacha_block_xor_ssse3(state, dst, src, bytes, nrounds);
+		state[12]++;
+	}
+}
+
+static int chacha_simd_stream_xor(struct skcipher_walk *walk,
+				  struct chacha_ctx *ctx, u8 *iv)
+{
+	u32 *state, state_buf[16 + 2] __aligned(8);
+	int next_yield = 4096; /* bytes until next FPU yield */
+	int err = 0;
+
+	BUILD_BUG_ON(CHACHA_STATE_ALIGN != 16);
+	state = PTR_ALIGN(state_buf + 0, CHACHA_STATE_ALIGN);
+
+	crypto_chacha_init(state, ctx, iv);
+
+	while (walk->nbytes > 0) {
+		unsigned int nbytes = walk->nbytes;
+
+		if (nbytes < walk->total) {
+			nbytes = round_down(nbytes, walk->stride);
+			next_yield -= nbytes;
+		}
+
+		chacha_dosimd(state, walk->dst.virt.addr, walk->src.virt.addr,
+			      nbytes, ctx->nrounds);
+
+		if (next_yield <= 0) {
+			/* temporarily allow preemption */
+			kernel_fpu_end();
+			kernel_fpu_begin();
+			next_yield = 4096;
+		}
+
+		err = skcipher_walk_done(walk, walk->nbytes - nbytes);
+	}
+
+	return err;
+}
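Unlike the deleted glue code, which held the FPU for the whole walk unconditionally, chacha_simd_stream_xor() runs the entire walk inside one kernel_fpu_begin()/kernel_fpu_end() section but briefly drops out of it roughly every 4096 bytes so preemption latency stays bounded. A standalone sketch of that general pattern (begin()/end() are local stand-ins for kernel_fpu_begin()/kernel_fpu_end(); the chunk callback is hypothetical):

#include <stddef.h>

static void begin(void) { /* stands in for kernel_fpu_begin() */ }
static void end(void)   { /* stands in for kernel_fpu_end() */ }

/* Keep SIMD enabled across the whole input, yielding every ~4 KiB. */
static void process_with_yield(const unsigned char *p, size_t len,
			       void (*simd_chunk)(const unsigned char *,
						  size_t))
{
	int budget = 4096;

	begin();
	while (len) {
		size_t n = len < 4096 ? len : 4096;

		simd_chunk(p, n);
		p += n;
		len -= n;
		budget -= (int)n;
		if (budget <= 0 && len) {
			end();		/* give the scheduler a chance */
			begin();
			budget = 4096;
		}
	}
	end();
}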
+
+static int chacha_simd(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct skcipher_walk walk;
+	int err;
+
+	if (req->cryptlen <= CHACHA_BLOCK_SIZE || !irq_fpu_usable())
+		return crypto_chacha_crypt(req);
+
+	err = skcipher_walk_virt(&walk, req, true);
+	if (err)
+		return err;
+
+	kernel_fpu_begin();
+	err = chacha_simd_stream_xor(&walk, ctx, req->iv);
+	kernel_fpu_end();
+	return err;
+}
+
+static int xchacha_simd(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct skcipher_walk walk;
+	struct chacha_ctx subctx;
+	u32 *state, state_buf[16 + 2] __aligned(8);
+	u8 real_iv[16];
+	int err;
+
+	if (req->cryptlen <= CHACHA_BLOCK_SIZE || !irq_fpu_usable())
+		return crypto_xchacha_crypt(req);
+
+	err = skcipher_walk_virt(&walk, req, true);
+	if (err)
+		return err;
+
+	BUILD_BUG_ON(CHACHA_STATE_ALIGN != 16);
+	state = PTR_ALIGN(state_buf + 0, CHACHA_STATE_ALIGN);
+	crypto_chacha_init(state, ctx, req->iv);
+
+	kernel_fpu_begin();
+
+	hchacha_block_ssse3(state, subctx.key, ctx->nrounds);
+	subctx.nrounds = ctx->nrounds;
+
+	memcpy(&real_iv[0], req->iv + 24, 8);
+	memcpy(&real_iv[8], req->iv + 16, 8);
+	err = chacha_simd_stream_xor(&walk, &subctx, real_iv);
+
+	kernel_fpu_end();
+
+	return err;
+}
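xchacha_simd() is the XChaCha construction: HChaCha over the key and the first 16 bytes of the 32-byte IV yields a one-time subkey, IV bytes 16..23 carry the remaining 64 nonce bits, and bytes 24..31 carry the block counter. A compilable sketch of just that key/nonce split (the function-pointer helper stands in for hchacha_block_ssse3(); none of these names are kernel APIs):

#include <stdint.h>
#include <string.h>

struct chacha_key { uint32_t key[8]; int nrounds; };

static void xchacha_setup(const struct chacha_key *k, const uint8_t iv[32],
			  struct chacha_key *subkey, uint8_t real_iv[16],
			  void (*hchacha)(const uint32_t *key,
					  const uint8_t *nonce16,
					  uint32_t *out8, int nrounds))
{
	/* Subkey = HChaCha(key, first 16 bytes of the 192-bit nonce). */
	hchacha(k->key, iv, subkey->key, k->nrounds);
	subkey->nrounds = k->nrounds;

	/* Remaining 16-byte IV: bytes 24..31 (block counter) first, then
	 * bytes 16..23 (last 64 nonce bits), matching the memcpy() pair
	 * in xchacha_simd() above. */
	memcpy(&real_iv[0], iv + 24, 8);
	memcpy(&real_iv[8], iv + 16, 8);
}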
+
+static struct skcipher_alg algs[] = {
+	{
+		.base.cra_name		= "chacha20",
+		.base.cra_driver_name	= "chacha20-simd",
+		.base.cra_priority	= 300,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= CHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha20_setkey,
+		.encrypt		= chacha_simd,
+		.decrypt		= chacha_simd,
+	}, {
+		.base.cra_name		= "xchacha20",
+		.base.cra_driver_name	= "xchacha20-simd",
+		.base.cra_priority	= 300,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= XCHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha20_setkey,
+		.encrypt		= xchacha_simd,
+		.decrypt		= xchacha_simd,
+	}, {
+		.base.cra_name		= "xchacha12",
+		.base.cra_driver_name	= "xchacha12-simd",
+		.base.cra_priority	= 300,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= XCHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha12_setkey,
+		.encrypt		= xchacha_simd,
+		.decrypt		= xchacha_simd,
+	},
+};
+
+static int __init chacha_simd_mod_init(void)
+{
+	if (!boot_cpu_has(X86_FEATURE_SSSE3))
+		return -ENODEV;
+
+#ifdef CONFIG_AS_AVX2
+	chacha_use_avx2 = boot_cpu_has(X86_FEATURE_AVX) &&
+			  boot_cpu_has(X86_FEATURE_AVX2) &&
+			  cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL);
+#ifdef CONFIG_AS_AVX512
+	chacha_use_avx512vl = chacha_use_avx2 &&
+			      boot_cpu_has(X86_FEATURE_AVX512VL) &&
+			      boot_cpu_has(X86_FEATURE_AVX512BW); /* kmovq */
+#endif
+#endif
+	return crypto_register_skciphers(algs, ARRAY_SIZE(algs));
+}
+
+static void __exit chacha_simd_mod_fini(void)
+{
+	crypto_unregister_skciphers(algs, ARRAY_SIZE(algs));
+}
+
+module_init(chacha_simd_mod_init);
+module_exit(chacha_simd_mod_fini);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Martin Willi <martin@strongswan.org>");
+MODULE_DESCRIPTION("ChaCha and XChaCha stream ciphers (x64 SIMD accelerated)");
+MODULE_ALIAS_CRYPTO("chacha20");
+MODULE_ALIAS_CRYPTO("chacha20-simd");
+MODULE_ALIAS_CRYPTO("xchacha20");
+MODULE_ALIAS_CRYPTO("xchacha20-simd");
+MODULE_ALIAS_CRYPTO("xchacha12");
+MODULE_ALIAS_CRYPTO("xchacha12-simd");
@@ -0,0 +1,157 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * NH - ε-almost-universal hash function, x86_64 AVX2 accelerated
+ *
+ * Copyright 2018 Google LLC
+ *
+ * Author: Eric Biggers <ebiggers@google.com>
+ */
+
+#include <linux/linkage.h>
+
+#define		PASS0_SUMS	%ymm0
+#define		PASS1_SUMS	%ymm1
+#define		PASS2_SUMS	%ymm2
+#define		PASS3_SUMS	%ymm3
+#define		K0		%ymm4
+#define		K0_XMM		%xmm4
+#define		K1		%ymm5
+#define		K1_XMM		%xmm5
+#define		K2		%ymm6
+#define		K2_XMM		%xmm6
+#define		K3		%ymm7
+#define		K3_XMM		%xmm7
+#define		T0		%ymm8
+#define		T1		%ymm9
+#define		T2		%ymm10
+#define		T2_XMM		%xmm10
+#define		T3		%ymm11
+#define		T3_XMM		%xmm11
+#define		T4		%ymm12
+#define		T5		%ymm13
+#define		T6		%ymm14
+#define		T7		%ymm15
+#define		KEY		%rdi
+#define		MESSAGE		%rsi
+#define		MESSAGE_LEN	%rdx
+#define		HASH		%rcx
+
+.macro _nh_2xstride	k0, k1, k2, k3
+
+	// Add message words to key words
+	vpaddd		\k0, T3, T0
+	vpaddd		\k1, T3, T1
+	vpaddd		\k2, T3, T2
+	vpaddd		\k3, T3, T3
+
+	// Multiply 32x32 => 64 and accumulate
+	vpshufd		$0x10, T0, T4
+	vpshufd		$0x32, T0, T0
+	vpshufd		$0x10, T1, T5
+	vpshufd		$0x32, T1, T1
+	vpshufd		$0x10, T2, T6
+	vpshufd		$0x32, T2, T2
+	vpshufd		$0x10, T3, T7
+	vpshufd		$0x32, T3, T3
+	vpmuludq	T4, T0, T0
+	vpmuludq	T5, T1, T1
+	vpmuludq	T6, T2, T2
+	vpmuludq	T7, T3, T3
+	vpaddq		T0, PASS0_SUMS, PASS0_SUMS
+	vpaddq		T1, PASS1_SUMS, PASS1_SUMS
+	vpaddq		T2, PASS2_SUMS, PASS2_SUMS
+	vpaddq		T3, PASS3_SUMS, PASS3_SUMS
+.endm
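What _nh_2xstride computes per pass is the NH update: message words are added to key words mod 2^32, the results multiplied pairwise into 64-bit products, and the products accumulated mod 2^64. A scalar sketch of a single 16-byte stride, modelled on the kernel's generic C implementation (the 4-pass, 4-key-words-per-stride layout here is an assumption of the sketch, not taken from this file):

#include <stdint.h>

/* One 16-byte NH stride, scalar sketch (4 passes, pair stride 2).
 * m[0..3] are the four message words; key points at this stride's
 * first key word; sums[] are the 64-bit per-pass accumulators. */
static void nh_stride_sketch(uint64_t sums[4], const uint32_t *key,
			     const uint32_t m[4])
{
	for (int pass = 0; pass < 4; pass++) {
		const uint32_t *k = key + pass * 4;

		/* 32-bit adds, 32x32->64 multiplies, 64-bit adds */
		sums[pass] += (uint64_t)(uint32_t)(m[0] + k[0]) *
			      (uint32_t)(m[2] + k[2]);
		sums[pass] += (uint64_t)(uint32_t)(m[1] + k[1]) *
			      (uint32_t)(m[3] + k[3]);
	}
}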
+
+/*
+ * void nh_avx2(const u32 *key, const u8 *message, size_t message_len,
+ *		u8 hash[NH_HASH_BYTES])
+ *
+ * It's guaranteed that message_len % 16 == 0.
+ */
+ENTRY(nh_avx2)
+
+	vmovdqu		0x00(KEY), K0
+	vmovdqu		0x10(KEY), K1
+	add		$0x20, KEY
+	vpxor		PASS0_SUMS, PASS0_SUMS, PASS0_SUMS
+	vpxor		PASS1_SUMS, PASS1_SUMS, PASS1_SUMS
+	vpxor		PASS2_SUMS, PASS2_SUMS, PASS2_SUMS
+	vpxor		PASS3_SUMS, PASS3_SUMS, PASS3_SUMS
+
+	sub		$0x40, MESSAGE_LEN
+	jl		.Lloop4_done
+.Lloop4:
+	vmovdqu		(MESSAGE), T3
+	vmovdqu		0x00(KEY), K2
+	vmovdqu		0x10(KEY), K3
+	_nh_2xstride	K0, K1, K2, K3
+
+	vmovdqu		0x20(MESSAGE), T3
+	vmovdqu		0x20(KEY), K0
+	vmovdqu		0x30(KEY), K1
+	_nh_2xstride	K2, K3, K0, K1
+
+	add		$0x40, MESSAGE
+	add		$0x40, KEY
+	sub		$0x40, MESSAGE_LEN
+	jge		.Lloop4
+
+.Lloop4_done:
+	and		$0x3f, MESSAGE_LEN
+	jz		.Ldone
+
+	cmp		$0x20, MESSAGE_LEN
+	jl		.Llast
+
+	// 2 or 3 strides remain; do 2 more.
+	vmovdqu		(MESSAGE), T3
+	vmovdqu		0x00(KEY), K2
+	vmovdqu		0x10(KEY), K3
+	_nh_2xstride	K0, K1, K2, K3
+	add		$0x20, MESSAGE
+	add		$0x20, KEY
+	sub		$0x20, MESSAGE_LEN
+	jz		.Ldone
+	vmovdqa		K2, K0
+	vmovdqa		K3, K1
+.Llast:
+	// Last stride. Zero the high 128 bits of the message and keys so they
+	// don't affect the result when processing them like 2 strides.
+	vmovdqu		(MESSAGE), T3_XMM
+	vmovdqa		K0_XMM, K0_XMM
+	vmovdqa		K1_XMM, K1_XMM
+	vmovdqu		0x00(KEY), K2_XMM
+	vmovdqu		0x10(KEY), K3_XMM
+	_nh_2xstride	K0, K1, K2, K3
+
+.Ldone:
+	// Sum the accumulators for each pass, then store the sums to 'hash'
+
+	// PASS0_SUMS is (0A 0B 0C 0D)
+	// PASS1_SUMS is (1A 1B 1C 1D)
+	// PASS2_SUMS is (2A 2B 2C 2D)
+	// PASS3_SUMS is (3A 3B 3C 3D)
+	// We need the horizontal sums:
+	//	(0A + 0B + 0C + 0D,
+	//	 1A + 1B + 1C + 1D,
+	//	 2A + 2B + 2C + 2D,
+	//	 3A + 3B + 3C + 3D)
+	//
+
+	vpunpcklqdq	PASS1_SUMS, PASS0_SUMS, T0	// T0 = (0A 1A 0C 1C)
+	vpunpckhqdq	PASS1_SUMS, PASS0_SUMS, T1	// T1 = (0B 1B 0D 1D)
+	vpunpcklqdq	PASS3_SUMS, PASS2_SUMS, T2	// T2 = (2A 3A 2C 3C)
+	vpunpckhqdq	PASS3_SUMS, PASS2_SUMS, T3	// T3 = (2B 3B 2D 3D)
+
+	vinserti128	$0x1, T2_XMM, T0, T4		// T4 = (0A 1A 2A 3A)
+	vinserti128	$0x1, T3_XMM, T1, T5		// T5 = (0B 1B 2B 3B)
+	vperm2i128	$0x31, T2, T0, T0		// T0 = (0C 1C 2C 3C)
+	vperm2i128	$0x31, T3, T1, T1		// T1 = (0D 1D 2D 3D)
+
+	vpaddq		T5, T4, T4
+	vpaddq		T1, T0, T0
+	vpaddq		T4, T0, T0
+	vmovdqu		T0, (HASH)
+	ret
+ENDPROC(nh_avx2)
@@ -0,0 +1,123 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * NH - ε-almost-universal hash function, x86_64 SSE2 accelerated
+ *
+ * Copyright 2018 Google LLC
+ *
+ * Author: Eric Biggers <ebiggers@google.com>
+ */
+
+#include <linux/linkage.h>
+
+#define		PASS0_SUMS	%xmm0
+#define		PASS1_SUMS	%xmm1
+#define		PASS2_SUMS	%xmm2
+#define		PASS3_SUMS	%xmm3
+#define		K0		%xmm4
+#define		K1		%xmm5
+#define		K2		%xmm6
+#define		K3		%xmm7
+#define		T0		%xmm8
+#define		T1		%xmm9
+#define		T2		%xmm10
+#define		T3		%xmm11
+#define		T4		%xmm12
+#define		T5		%xmm13
+#define		T6		%xmm14
+#define		T7		%xmm15
+#define		KEY		%rdi
+#define		MESSAGE		%rsi
+#define		MESSAGE_LEN	%rdx
+#define		HASH		%rcx
+
+.macro _nh_stride	k0, k1, k2, k3, offset
+
+	// Load next message stride
+	movdqu		\offset(MESSAGE), T1
+
+	// Load next key stride
+	movdqu		\offset(KEY), \k3
+
+	// Add message words to key words
+	movdqa		T1, T2
+	movdqa		T1, T3
+	paddd		T1, \k0 // reuse k0 to avoid a move
+	paddd		\k1, T1
+	paddd		\k2, T2
+	paddd		\k3, T3
+
+	// Multiply 32x32 => 64 and accumulate
+	pshufd		$0x10, \k0, T4
+	pshufd		$0x32, \k0, \k0
+	pshufd		$0x10, T1, T5
+	pshufd		$0x32, T1, T1
+	pshufd		$0x10, T2, T6
+	pshufd		$0x32, T2, T2
+	pshufd		$0x10, T3, T7
+	pshufd		$0x32, T3, T3
+	pmuludq		T4, \k0
+	pmuludq		T5, T1
+	pmuludq		T6, T2
+	pmuludq		T7, T3
+	paddq		\k0, PASS0_SUMS
+	paddq		T1, PASS1_SUMS
+	paddq		T2, PASS2_SUMS
+	paddq		T3, PASS3_SUMS
+.endm
+
+/*
+ * void nh_sse2(const u32 *key, const u8 *message, size_t message_len,
+ *		u8 hash[NH_HASH_BYTES])
+ *
+ * It's guaranteed that message_len % 16 == 0.
+ */
+ENTRY(nh_sse2)
+
+	movdqu		0x00(KEY), K0
+	movdqu		0x10(KEY), K1
+	movdqu		0x20(KEY), K2
+	add		$0x30, KEY
+	pxor		PASS0_SUMS, PASS0_SUMS
+	pxor		PASS1_SUMS, PASS1_SUMS
+	pxor		PASS2_SUMS, PASS2_SUMS
+	pxor		PASS3_SUMS, PASS3_SUMS
+
+	sub		$0x40, MESSAGE_LEN
+	jl		.Lloop4_done
+.Lloop4:
+	_nh_stride	K0, K1, K2, K3, 0x00
+	_nh_stride	K1, K2, K3, K0, 0x10
+	_nh_stride	K2, K3, K0, K1, 0x20
+	_nh_stride	K3, K0, K1, K2, 0x30
+	add		$0x40, KEY
+	add		$0x40, MESSAGE
+	sub		$0x40, MESSAGE_LEN
+	jge		.Lloop4
+
+.Lloop4_done:
+	and		$0x3f, MESSAGE_LEN
+	jz		.Ldone
+	_nh_stride	K0, K1, K2, K3, 0x00
+
+	sub		$0x10, MESSAGE_LEN
+	jz		.Ldone
+	_nh_stride	K1, K2, K3, K0, 0x10
+
+	sub		$0x10, MESSAGE_LEN
+	jz		.Ldone
+	_nh_stride	K2, K3, K0, K1, 0x20
+
+.Ldone:
+	// Sum the accumulators for each pass, then store the sums to 'hash'
+	movdqa		PASS0_SUMS, T0
+	movdqa		PASS2_SUMS, T1
+	punpcklqdq	PASS1_SUMS, T0		// => (PASS0_SUM_A PASS1_SUM_A)
+	punpcklqdq	PASS3_SUMS, T1		// => (PASS2_SUM_A PASS3_SUM_A)
+	punpckhqdq	PASS1_SUMS, PASS0_SUMS	// => (PASS0_SUM_B PASS1_SUM_B)
+	punpckhqdq	PASS3_SUMS, PASS2_SUMS	// => (PASS2_SUM_B PASS3_SUM_B)
+	paddq		PASS0_SUMS, T0
+	paddq		PASS2_SUMS, T1
+	movdqu		T0, 0x00(HASH)
+	movdqu		T1, 0x10(HASH)
+	ret
+ENDPROC(nh_sse2)
@@ -0,0 +1,77 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NHPoly1305 - ε-almost-∆-universal hash function for Adiantum
+ * (AVX2 accelerated version)
+ *
+ * Copyright 2018 Google LLC
+ */
+
+#include <crypto/internal/hash.h>
+#include <crypto/nhpoly1305.h>
+#include <linux/module.h>
+#include <asm/fpu/api.h>
+
+asmlinkage void nh_avx2(const u32 *key, const u8 *message, size_t message_len,
+			u8 hash[NH_HASH_BYTES]);
+
+/* wrapper to avoid indirect call to assembly, which doesn't work with CFI */
+static void _nh_avx2(const u32 *key, const u8 *message, size_t message_len,
+		     __le64 hash[NH_NUM_PASSES])
+{
+	nh_avx2(key, message, message_len, (u8 *)hash);
+}
+
+static int nhpoly1305_avx2_update(struct shash_desc *desc,
+				  const u8 *src, unsigned int srclen)
+{
+	if (srclen < 64 || !irq_fpu_usable())
+		return crypto_nhpoly1305_update(desc, src, srclen);
+
+	do {
+		unsigned int n = min_t(unsigned int, srclen, PAGE_SIZE);
+
+		kernel_fpu_begin();
+		crypto_nhpoly1305_update_helper(desc, src, n, _nh_avx2);
+		kernel_fpu_end();
+		src += n;
+		srclen -= n;
+	} while (srclen);
+	return 0;
+}
+
+static struct shash_alg nhpoly1305_alg = {
+	.base.cra_name		= "nhpoly1305",
+	.base.cra_driver_name	= "nhpoly1305-avx2",
+	.base.cra_priority	= 300,
+	.base.cra_ctxsize	= sizeof(struct nhpoly1305_key),
+	.base.cra_module	= THIS_MODULE,
+	.digestsize		= POLY1305_DIGEST_SIZE,
+	.init			= crypto_nhpoly1305_init,
+	.update			= nhpoly1305_avx2_update,
+	.final			= crypto_nhpoly1305_final,
+	.setkey			= crypto_nhpoly1305_setkey,
+	.descsize		= sizeof(struct nhpoly1305_state),
+};
+
+static int __init nhpoly1305_mod_init(void)
+{
+	if (!boot_cpu_has(X86_FEATURE_AVX2) ||
+	    !boot_cpu_has(X86_FEATURE_OSXSAVE))
+		return -ENODEV;
+
+	return crypto_register_shash(&nhpoly1305_alg);
+}
+
+static void __exit nhpoly1305_mod_exit(void)
+{
+	crypto_unregister_shash(&nhpoly1305_alg);
+}
+
+module_init(nhpoly1305_mod_init);
+module_exit(nhpoly1305_mod_exit);
+
+MODULE_DESCRIPTION("NHPoly1305 ε-almost-∆-universal hash function (AVX2-accelerated)");
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
+MODULE_ALIAS_CRYPTO("nhpoly1305");
+MODULE_ALIAS_CRYPTO("nhpoly1305-avx2");
@@ -0,0 +1,76 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NHPoly1305 - ε-almost-∆-universal hash function for Adiantum
+ * (SSE2 accelerated version)
+ *
+ * Copyright 2018 Google LLC
+ */
+
+#include <crypto/internal/hash.h>
+#include <crypto/nhpoly1305.h>
+#include <linux/module.h>
+#include <asm/fpu/api.h>
+
+asmlinkage void nh_sse2(const u32 *key, const u8 *message, size_t message_len,
+			u8 hash[NH_HASH_BYTES]);
+
+/* wrapper to avoid indirect call to assembly, which doesn't work with CFI */
+static void _nh_sse2(const u32 *key, const u8 *message, size_t message_len,
+		     __le64 hash[NH_NUM_PASSES])
+{
+	nh_sse2(key, message, message_len, (u8 *)hash);
+}
+
+static int nhpoly1305_sse2_update(struct shash_desc *desc,
+				  const u8 *src, unsigned int srclen)
+{
+	if (srclen < 64 || !irq_fpu_usable())
+		return crypto_nhpoly1305_update(desc, src, srclen);
+
+	do {
+		unsigned int n = min_t(unsigned int, srclen, PAGE_SIZE);
+
+		kernel_fpu_begin();
+		crypto_nhpoly1305_update_helper(desc, src, n, _nh_sse2);
+		kernel_fpu_end();
+		src += n;
+		srclen -= n;
+	} while (srclen);
+	return 0;
+}
+
+static struct shash_alg nhpoly1305_alg = {
+	.base.cra_name		= "nhpoly1305",
+	.base.cra_driver_name	= "nhpoly1305-sse2",
+	.base.cra_priority	= 200,
+	.base.cra_ctxsize	= sizeof(struct nhpoly1305_key),
+	.base.cra_module	= THIS_MODULE,
+	.digestsize		= POLY1305_DIGEST_SIZE,
+	.init			= crypto_nhpoly1305_init,
+	.update			= nhpoly1305_sse2_update,
+	.final			= crypto_nhpoly1305_final,
+	.setkey			= crypto_nhpoly1305_setkey,
+	.descsize		= sizeof(struct nhpoly1305_state),
+};
+
+static int __init nhpoly1305_mod_init(void)
+{
+	if (!boot_cpu_has(X86_FEATURE_XMM2))
+		return -ENODEV;
+
+	return crypto_register_shash(&nhpoly1305_alg);
+}
+
+static void __exit nhpoly1305_mod_exit(void)
+{
+	crypto_unregister_shash(&nhpoly1305_alg);
+}
+
+module_init(nhpoly1305_mod_init);
+module_exit(nhpoly1305_mod_exit);
+
+MODULE_DESCRIPTION("NHPoly1305 ε-almost-∆-universal hash function (SSE2-accelerated)");
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
+MODULE_ALIAS_CRYPTO("nhpoly1305");
+MODULE_ALIAS_CRYPTO("nhpoly1305-sse2");
@@ -83,35 +83,37 @@ static unsigned int poly1305_simd_blocks(struct poly1305_desc_ctx *dctx,
 	if (poly1305_use_avx2 && srclen >= POLY1305_BLOCK_SIZE * 4) {
 		if (unlikely(!sctx->wset)) {
 			if (!sctx->uset) {
-				memcpy(sctx->u, dctx->r, sizeof(sctx->u));
-				poly1305_simd_mult(sctx->u, dctx->r);
+				memcpy(sctx->u, dctx->r.r, sizeof(sctx->u));
+				poly1305_simd_mult(sctx->u, dctx->r.r);
 				sctx->uset = true;
 			}
 			memcpy(sctx->u + 5, sctx->u, sizeof(sctx->u));
-			poly1305_simd_mult(sctx->u + 5, dctx->r);
+			poly1305_simd_mult(sctx->u + 5, dctx->r.r);
 			memcpy(sctx->u + 10, sctx->u + 5, sizeof(sctx->u));
-			poly1305_simd_mult(sctx->u + 10, dctx->r);
+			poly1305_simd_mult(sctx->u + 10, dctx->r.r);
 			sctx->wset = true;
 		}
 		blocks = srclen / (POLY1305_BLOCK_SIZE * 4);
-		poly1305_4block_avx2(dctx->h, src, dctx->r, blocks, sctx->u);
+		poly1305_4block_avx2(dctx->h.h, src, dctx->r.r, blocks,
+				     sctx->u);
 		src += POLY1305_BLOCK_SIZE * 4 * blocks;
 		srclen -= POLY1305_BLOCK_SIZE * 4 * blocks;
 	}
 #endif
 	if (likely(srclen >= POLY1305_BLOCK_SIZE * 2)) {
 		if (unlikely(!sctx->uset)) {
-			memcpy(sctx->u, dctx->r, sizeof(sctx->u));
-			poly1305_simd_mult(sctx->u, dctx->r);
+			memcpy(sctx->u, dctx->r.r, sizeof(sctx->u));
+			poly1305_simd_mult(sctx->u, dctx->r.r);
 			sctx->uset = true;
 		}
 		blocks = srclen / (POLY1305_BLOCK_SIZE * 2);
-		poly1305_2block_sse2(dctx->h, src, dctx->r, blocks, sctx->u);
+		poly1305_2block_sse2(dctx->h.h, src, dctx->r.r, blocks,
+				     sctx->u);
 		src += POLY1305_BLOCK_SIZE * 2 * blocks;
 		srclen -= POLY1305_BLOCK_SIZE * 2 * blocks;
 	}
 	if (srclen >= POLY1305_BLOCK_SIZE) {
-		poly1305_block_sse2(dctx->h, src, dctx->r, 1);
+		poly1305_block_sse2(dctx->h.h, src, dctx->r.r, 1);
 		srclen -= POLY1305_BLOCK_SIZE;
 	}
 	return srclen;
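The mechanical dctx->r to dctx->r.r and dctx->h to dctx->h.h renames track a split of the Poly1305 context into dedicated key and state structs; the cached powers of r in sctx->u are unchanged: offset 0 holds r^2, and for the AVX2 four-block path offsets 5 and 10 hold r^3 and r^4 (five limbs per power, as visible from the offsets in the code above). Those powers let four blocks fold into the accumulator in one pass, since

	h' \equiv (h + m_1)\,r^4 + m_2\,r^3 + m_3\,r^2 + m_4\,r \pmod{2^{130} - 5}

is algebraically the same as four sequential steps of h = (h + m_i) * r, but lets the SIMD code compute the four products independently and sum them.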
@@ -430,11 +430,14 @@ config CRYPTO_CTS
 	help
 	  CTS: Cipher Text Stealing
 	  This is the Cipher Text Stealing mode as described by
-	  Section 8 of rfc2040 and referenced by rfc3962.
-	  (rfc3962 includes errata information in its Appendix A)
+	  Section 8 of rfc2040 and referenced by rfc3962
+	  (rfc3962 includes errata information in its Appendix A) or
+	  CBC-CS3 as defined by NIST in Sp800-38A addendum from Oct 2010.
 	  This mode is required for Kerberos gss mechanism support
 	  for AES encryption.
+
+	  See: https://csrc.nist.gov/publications/detail/sp/800-38a/addendum/final
 
 config CRYPTO_ECB
 	tristate "ECB support"
 	select CRYPTO_BLKCIPHER
@@ -493,6 +496,50 @@ config CRYPTO_KEYWRAP
 	  Support for key wrapping (NIST SP800-38F / RFC3394) without
 	  padding.
 
+config CRYPTO_NHPOLY1305
+	tristate
+	select CRYPTO_HASH
+	select CRYPTO_POLY1305
+
+config CRYPTO_NHPOLY1305_SSE2
+	tristate "NHPoly1305 hash function (x86_64 SSE2 implementation)"
+	depends on X86 && 64BIT
+	select CRYPTO_NHPOLY1305
+	help
+	  SSE2 optimized implementation of the hash function used by the
+	  Adiantum encryption mode.
+
+config CRYPTO_NHPOLY1305_AVX2
+	tristate "NHPoly1305 hash function (x86_64 AVX2 implementation)"
+	depends on X86 && 64BIT
+	select CRYPTO_NHPOLY1305
+	help
+	  AVX2 optimized implementation of the hash function used by the
+	  Adiantum encryption mode.
+
+config CRYPTO_ADIANTUM
+	tristate "Adiantum support"
+	select CRYPTO_CHACHA20
+	select CRYPTO_POLY1305
+	select CRYPTO_NHPOLY1305
+	help
+	  Adiantum is a tweakable, length-preserving encryption mode
+	  designed for fast and secure disk encryption, especially on
+	  CPUs without dedicated crypto instructions.  It encrypts
+	  each sector using the XChaCha12 stream cipher, two passes of
+	  an ε-almost-∆-universal hash function, and an invocation of
+	  the AES-256 block cipher on a single 16-byte block.  On CPUs
+	  without AES instructions, Adiantum is much faster than
+	  AES-XTS.
+
+	  Adiantum's security is provably reducible to that of its
+	  underlying stream and block ciphers, subject to a security
+	  bound.  Unlike XTS, Adiantum is a true wide-block encryption
+	  mode, so it actually provides an even stronger notion of
+	  security than XTS, subject to the security bound.
+
+	  If unsure, say N.
|
|
||||||
comment "Hash modes"
|
comment "Hash modes"
|
||||||
|
|
||||||
config CRYPTO_CMAC
|
config CRYPTO_CMAC
|
||||||
|
@ -936,6 +983,18 @@ config CRYPTO_SM3
|
||||||
http://www.oscca.gov.cn/UpFile/20101222141857786.pdf
|
http://www.oscca.gov.cn/UpFile/20101222141857786.pdf
|
||||||
https://datatracker.ietf.org/doc/html/draft-shen-sm3-hash
|
https://datatracker.ietf.org/doc/html/draft-shen-sm3-hash
|
||||||
|
|
||||||
|
config CRYPTO_STREEBOG
|
||||||
|
tristate "Streebog Hash Function"
|
||||||
|
select CRYPTO_HASH
|
||||||
|
help
|
||||||
|
Streebog Hash Function (GOST R 34.11-2012, RFC 6986) is one of the Russian
|
||||||
|
cryptographic standard algorithms (called GOST algorithms).
|
||||||
|
This setting enables two hash algorithms with 256 and 512 bits output.
|
||||||
|
|
||||||
|
References:
|
||||||
|
https://tc26.ru/upload/iblock/fed/feddbb4d26b685903faa2ba11aea43f6.pdf
|
||||||
|
https://tools.ietf.org/html/rfc6986
|
||||||
|
|
||||||
config CRYPTO_TGR192
|
config CRYPTO_TGR192
|
||||||
tristate "Tiger digest algorithms"
|
tristate "Tiger digest algorithms"
|
||||||
select CRYPTO_HASH
|
select CRYPTO_HASH
|
||||||
|
@ -1006,7 +1065,8 @@ config CRYPTO_AES_TI
|
||||||
8 for decryption), this implementation only uses just two S-boxes of
|
8 for decryption), this implementation only uses just two S-boxes of
|
||||||
256 bytes each, and attempts to eliminate data dependent latencies by
|
256 bytes each, and attempts to eliminate data dependent latencies by
|
||||||
prefetching the entire table into the cache at the start of each
|
prefetching the entire table into the cache at the start of each
|
||||||
block.
|
block. Interrupts are also disabled to avoid races where cachelines
|
||||||
|
are evicted when the CPU is interrupted to do something else.
|
||||||
|
|
||||||
config CRYPTO_AES_586
|
config CRYPTO_AES_586
|
||||||
tristate "AES cipher algorithms (i586)"
|
tristate "AES cipher algorithms (i586)"
|
||||||
|
@ -1387,32 +1447,34 @@ config CRYPTO_SALSA20
|
||||||
Bernstein <djb@cr.yp.to>. See <http://cr.yp.to/snuffle.html>
|
Bernstein <djb@cr.yp.to>. See <http://cr.yp.to/snuffle.html>
|
||||||
|
|
||||||
config CRYPTO_CHACHA20
|
config CRYPTO_CHACHA20
|
||||||
tristate "ChaCha20 cipher algorithm"
|
tristate "ChaCha stream cipher algorithms"
|
||||||
select CRYPTO_BLKCIPHER
|
select CRYPTO_BLKCIPHER
|
||||||
help
|
help
|
||||||
ChaCha20 cipher algorithm, RFC7539.
|
The ChaCha20, XChaCha20, and XChaCha12 stream cipher algorithms.
|
||||||
|
|
||||||
ChaCha20 is a 256-bit high-speed stream cipher designed by Daniel J.
|
ChaCha20 is a 256-bit high-speed stream cipher designed by Daniel J.
|
||||||
Bernstein and further specified in RFC7539 for use in IETF protocols.
|
Bernstein and further specified in RFC7539 for use in IETF protocols.
|
||||||
This is the portable C implementation of ChaCha20.
|
This is the portable C implementation of ChaCha20. See also:
|
||||||
|
|
||||||
See also:
|
|
||||||
<http://cr.yp.to/chacha/chacha-20080128.pdf>
|
<http://cr.yp.to/chacha/chacha-20080128.pdf>
|
||||||
|
|
||||||
|
XChaCha20 is the application of the XSalsa20 construction to ChaCha20
|
||||||
|
rather than to Salsa20. XChaCha20 extends ChaCha20's nonce length
|
||||||
|
from 64 bits (or 96 bits using the RFC7539 convention) to 192 bits,
|
||||||
|
while provably retaining ChaCha20's security. See also:
|
||||||
|
<https://cr.yp.to/snuffle/xsalsa-20081128.pdf>
|
||||||
|
|
||||||
|
XChaCha12 is XChaCha20 reduced to 12 rounds, with correspondingly
|
||||||
|
reduced security margin but increased performance. It can be needed
|
||||||
|
in some performance-sensitive scenarios.
|
||||||
|
|
||||||
config CRYPTO_CHACHA20_X86_64
|
config CRYPTO_CHACHA20_X86_64
|
||||||
tristate "ChaCha20 cipher algorithm (x86_64/SSSE3/AVX2)"
|
tristate "ChaCha stream cipher algorithms (x86_64/SSSE3/AVX2/AVX-512VL)"
|
||||||
depends on X86 && 64BIT
|
depends on X86 && 64BIT
|
||||||
select CRYPTO_BLKCIPHER
|
select CRYPTO_BLKCIPHER
|
||||||
select CRYPTO_CHACHA20
|
select CRYPTO_CHACHA20
|
||||||
help
|
help
|
||||||
ChaCha20 cipher algorithm, RFC7539.
|
SSSE3, AVX2, and AVX-512VL optimized implementations of the ChaCha20,
|
||||||
|
XChaCha20, and XChaCha12 stream ciphers.
|
||||||
ChaCha20 is a 256-bit high-speed stream cipher designed by Daniel J.
|
|
||||||
Bernstein and further specified in RFC7539 for use in IETF protocols.
|
|
||||||
This is the x86_64 assembler implementation using SIMD instructions.
|
|
||||||
|
|
||||||
See also:
|
|
||||||
<http://cr.yp.to/chacha/chacha-20080128.pdf>
|
|
||||||
|
|
||||||
config CRYPTO_SEED
|
config CRYPTO_SEED
|
||||||
tristate "SEED cipher algorithm"
|
tristate "SEED cipher algorithm"
|
||||||
|
@ -1812,7 +1874,8 @@ config CRYPTO_USER_API_AEAD
|
||||||
cipher algorithms.
|
cipher algorithms.
|
||||||
|
|
||||||
config CRYPTO_STATS
|
config CRYPTO_STATS
|
||||||
bool
|
bool "Crypto usage statistics for User-space"
|
||||||
|
depends on CRYPTO_USER
|
||||||
help
|
help
|
||||||
This option enables the gathering of crypto stats.
|
This option enables the gathering of crypto stats.
|
||||||
This will collect:
|
This will collect:
|
||||||
|
|
|
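CRYPTO_ADIANTUM registers a template, so a concrete algorithm is named by its inner ciphers, e.g. "adiantum(xchacha12,aes)", and its 32-byte IV is really the tweak. A minimal in-kernel usage sketch, assuming a caller-supplied 32-byte key and a 4096-byte sector; the function name here is illustrative, not part of the kernel API:

	#include <crypto/skcipher.h>
	#include <linux/scatterlist.h>

	/* Sketch: allocate an Adiantum instance over XChaCha12 + AES and
	 * encrypt one 4096-byte sector in place, waiting synchronously. */
	static int adiantum_encrypt_sector(u8 *sector, u8 tweak[32],
					   const u8 *key, unsigned int keylen)
	{
		struct crypto_skcipher *tfm;
		struct skcipher_request *req;
		struct scatterlist sg;
		DECLARE_CRYPTO_WAIT(wait);
		int err;

		tfm = crypto_alloc_skcipher("adiantum(xchacha12,aes)", 0, 0);
		if (IS_ERR(tfm))
			return PTR_ERR(tfm);

		err = crypto_skcipher_setkey(tfm, key, keylen);
		if (err)
			goto out;

		req = skcipher_request_alloc(tfm, GFP_KERNEL);
		if (!req) {
			err = -ENOMEM;
			goto out;
		}
		sg_init_one(&sg, sector, 4096);
		skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP |
						   CRYPTO_TFM_REQ_MAY_BACKLOG,
					      crypto_req_done, &wait);
		/* The 32-byte "IV" carries Adiantum's per-sector tweak. */
		skcipher_request_set_crypt(req, &sg, &sg, 4096, tweak);
		err = crypto_wait_req(crypto_skcipher_encrypt(req), &wait);
		skcipher_request_free(req);
	out:
		crypto_free_skcipher(tfm);
		return err;
	}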
crypto/Makefile

@@ -54,7 +54,8 @@ cryptomgr-y := algboss.o testmgr.o

 obj-$(CONFIG_CRYPTO_MANAGER2) += cryptomgr.o
 obj-$(CONFIG_CRYPTO_USER) += crypto_user.o
-crypto_user-y := crypto_user_base.o crypto_user_stat.o
+crypto_user-y := crypto_user_base.o
+crypto_user-$(CONFIG_CRYPTO_STATS) += crypto_user_stat.o
 obj-$(CONFIG_CRYPTO_CMAC) += cmac.o
 obj-$(CONFIG_CRYPTO_HMAC) += hmac.o
 obj-$(CONFIG_CRYPTO_VMAC) += vmac.o
@@ -71,6 +72,7 @@ obj-$(CONFIG_CRYPTO_SHA256) += sha256_generic.o
 obj-$(CONFIG_CRYPTO_SHA512) += sha512_generic.o
 obj-$(CONFIG_CRYPTO_SHA3) += sha3_generic.o
 obj-$(CONFIG_CRYPTO_SM3) += sm3_generic.o
+obj-$(CONFIG_CRYPTO_STREEBOG) += streebog_generic.o
 obj-$(CONFIG_CRYPTO_WP512) += wp512.o
 CFLAGS_wp512.o := $(call cc-option,-fno-schedule-insns)  # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
 obj-$(CONFIG_CRYPTO_TGR192) += tgr192.o
@@ -84,6 +86,8 @@ obj-$(CONFIG_CRYPTO_LRW) += lrw.o
 obj-$(CONFIG_CRYPTO_XTS) += xts.o
 obj-$(CONFIG_CRYPTO_CTR) += ctr.o
 obj-$(CONFIG_CRYPTO_KEYWRAP) += keywrap.o
+obj-$(CONFIG_CRYPTO_ADIANTUM) += adiantum.o
+obj-$(CONFIG_CRYPTO_NHPOLY1305) += nhpoly1305.o
 obj-$(CONFIG_CRYPTO_GCM) += gcm.o
 obj-$(CONFIG_CRYPTO_CCM) += ccm.o
 obj-$(CONFIG_CRYPTO_CHACHA20POLY1305) += chacha20poly1305.o
@@ -116,7 +120,7 @@ obj-$(CONFIG_CRYPTO_KHAZAD) += khazad.o
 obj-$(CONFIG_CRYPTO_ANUBIS) += anubis.o
 obj-$(CONFIG_CRYPTO_SEED) += seed.o
 obj-$(CONFIG_CRYPTO_SALSA20) += salsa20_generic.o
-obj-$(CONFIG_CRYPTO_CHACHA20) += chacha20_generic.o
+obj-$(CONFIG_CRYPTO_CHACHA20) += chacha_generic.o
 obj-$(CONFIG_CRYPTO_POLY1305) += poly1305_generic.o
 obj-$(CONFIG_CRYPTO_DEFLATE) += deflate.o
 obj-$(CONFIG_CRYPTO_MICHAEL_MIC) += michael_mic.o

crypto/ablkcipher.c

@@ -365,23 +365,18 @@ static int crypto_ablkcipher_report(struct sk_buff *skb, struct crypto_alg *alg)
 {
 	struct crypto_report_blkcipher rblkcipher;

-	strncpy(rblkcipher.type, "ablkcipher", sizeof(rblkcipher.type));
-	strncpy(rblkcipher.geniv, alg->cra_ablkcipher.geniv ?: "<default>",
-		sizeof(rblkcipher.geniv));
-	rblkcipher.geniv[sizeof(rblkcipher.geniv) - 1] = '\0';
+	memset(&rblkcipher, 0, sizeof(rblkcipher));
+
+	strscpy(rblkcipher.type, "ablkcipher", sizeof(rblkcipher.type));
+	strscpy(rblkcipher.geniv, "<default>", sizeof(rblkcipher.geniv));

 	rblkcipher.blocksize = alg->cra_blocksize;
 	rblkcipher.min_keysize = alg->cra_ablkcipher.min_keysize;
 	rblkcipher.max_keysize = alg->cra_ablkcipher.max_keysize;
 	rblkcipher.ivsize = alg->cra_ablkcipher.ivsize;

-	if (nla_put(skb, CRYPTOCFGA_REPORT_BLKCIPHER,
-		    sizeof(struct crypto_report_blkcipher), &rblkcipher))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_REPORT_BLKCIPHER,
+		       sizeof(rblkcipher), &rblkcipher);
 }
 #else
 static int crypto_ablkcipher_report(struct sk_buff *skb, struct crypto_alg *alg)
@@ -403,7 +398,7 @@ static void crypto_ablkcipher_show(struct seq_file *m, struct crypto_alg *alg)
 	seq_printf(m, "min keysize  : %u\n", ablkcipher->min_keysize);
 	seq_printf(m, "max keysize  : %u\n", ablkcipher->max_keysize);
 	seq_printf(m, "ivsize       : %u\n", ablkcipher->ivsize);
-	seq_printf(m, "geniv        : %s\n", ablkcipher->geniv ?: "<default>");
+	seq_printf(m, "geniv        : <default>\n");
 }

 const struct crypto_type crypto_ablkcipher_type = {
@@ -415,78 +410,3 @@ const struct crypto_type crypto_ablkcipher_type = {
 	.report = crypto_ablkcipher_report,
 };
 EXPORT_SYMBOL_GPL(crypto_ablkcipher_type);
-
-static int crypto_init_givcipher_ops(struct crypto_tfm *tfm, u32 type,
-				     u32 mask)
-{
-	struct ablkcipher_alg *alg = &tfm->__crt_alg->cra_ablkcipher;
-	struct ablkcipher_tfm *crt = &tfm->crt_ablkcipher;
-
-	if (alg->ivsize > PAGE_SIZE / 8)
-		return -EINVAL;
-
-	crt->setkey = tfm->__crt_alg->cra_flags & CRYPTO_ALG_GENIV ?
-		      alg->setkey : setkey;
-	crt->encrypt = alg->encrypt;
-	crt->decrypt = alg->decrypt;
-	crt->base = __crypto_ablkcipher_cast(tfm);
-	crt->ivsize = alg->ivsize;
-
-	return 0;
-}
-
-#ifdef CONFIG_NET
-static int crypto_givcipher_report(struct sk_buff *skb, struct crypto_alg *alg)
-{
-	struct crypto_report_blkcipher rblkcipher;
-
-	strncpy(rblkcipher.type, "givcipher", sizeof(rblkcipher.type));
-	strncpy(rblkcipher.geniv, alg->cra_ablkcipher.geniv ?: "<built-in>",
-		sizeof(rblkcipher.geniv));
-	rblkcipher.geniv[sizeof(rblkcipher.geniv) - 1] = '\0';
-
-	rblkcipher.blocksize = alg->cra_blocksize;
-	rblkcipher.min_keysize = alg->cra_ablkcipher.min_keysize;
-	rblkcipher.max_keysize = alg->cra_ablkcipher.max_keysize;
-	rblkcipher.ivsize = alg->cra_ablkcipher.ivsize;
-
-	if (nla_put(skb, CRYPTOCFGA_REPORT_BLKCIPHER,
-		    sizeof(struct crypto_report_blkcipher), &rblkcipher))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
-}
-#else
-static int crypto_givcipher_report(struct sk_buff *skb, struct crypto_alg *alg)
-{
-	return -ENOSYS;
-}
-#endif
-
-static void crypto_givcipher_show(struct seq_file *m, struct crypto_alg *alg)
-	__maybe_unused;
-static void crypto_givcipher_show(struct seq_file *m, struct crypto_alg *alg)
-{
-	struct ablkcipher_alg *ablkcipher = &alg->cra_ablkcipher;
-
-	seq_printf(m, "type         : givcipher\n");
-	seq_printf(m, "async        : %s\n", alg->cra_flags & CRYPTO_ALG_ASYNC ?
-					     "yes" : "no");
-	seq_printf(m, "blocksize    : %u\n", alg->cra_blocksize);
-	seq_printf(m, "min keysize  : %u\n", ablkcipher->min_keysize);
-	seq_printf(m, "max keysize  : %u\n", ablkcipher->max_keysize);
-	seq_printf(m, "ivsize       : %u\n", ablkcipher->ivsize);
-	seq_printf(m, "geniv        : %s\n", ablkcipher->geniv ?: "<built-in>");
-}
-
-const struct crypto_type crypto_givcipher_type = {
-	.ctxsize = crypto_ablkcipher_ctxsize,
-	.init = crypto_init_givcipher_ops,
-#ifdef CONFIG_PROC_FS
-	.show = crypto_givcipher_show,
-#endif
-	.report = crypto_givcipher_report,
-};
-EXPORT_SYMBOL_GPL(crypto_givcipher_type);

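The report conversions in this and the following files all follow one pattern: zero the whole structure so that padding and unused tail bytes never reach userspace over netlink, copy strings with strscpy() so they are always NUL-terminated, and return nla_put()'s result (0 or -EMSGSIZE) directly instead of the goto dance. A standalone sketch of why the zeroing matters, using a hypothetical report struct:

	#include <string.h>

	struct report {		/* hypothetical stand-in for crypto_report_* */
		char type[64];
		unsigned int blocksize;
	};

	/* Copy 'name' into a fully-initialized report.  Without the memset,
	 * the bytes of 'type' past the NUL, plus any struct padding, would
	 * hold stack garbage that must never be copied to userspace. */
	static void fill_report(struct report *r, const char *name)
	{
		memset(r, 0, sizeof(*r));	/* no uninitialized bytes */
		strncpy(r->type, name, sizeof(r->type) - 1); /* stays NUL-terminated */
		r->blocksize = 16;
	}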
crypto/acompress.c

@@ -33,15 +33,11 @@ static int crypto_acomp_report(struct sk_buff *skb, struct crypto_alg *alg)
 {
 	struct crypto_report_acomp racomp;

-	strncpy(racomp.type, "acomp", sizeof(racomp.type));
+	memset(&racomp, 0, sizeof(racomp));

-	if (nla_put(skb, CRYPTOCFGA_REPORT_ACOMP,
-		    sizeof(struct crypto_report_acomp), &racomp))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	strscpy(racomp.type, "acomp", sizeof(racomp.type));
+
+	return nla_put(skb, CRYPTOCFGA_REPORT_ACOMP, sizeof(racomp), &racomp);
 }
 #else
 static int crypto_acomp_report(struct sk_buff *skb, struct crypto_alg *alg)

crypto/adiantum.c (new file)

@@ -0,0 +1,664 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Adiantum length-preserving encryption mode
+ *
+ * Copyright 2018 Google LLC
+ */
+
+/*
+ * Adiantum is a tweakable, length-preserving encryption mode designed for fast
+ * and secure disk encryption, especially on CPUs without dedicated crypto
+ * instructions.  Adiantum encrypts each sector using the XChaCha12 stream
+ * cipher, two passes of an ε-almost-∆-universal (ε-∆U) hash function based on
+ * NH and Poly1305, and an invocation of the AES-256 block cipher on a single
+ * 16-byte block.  See the paper for details:
+ *
+ *	Adiantum: length-preserving encryption for entry-level processors
+ *	(https://eprint.iacr.org/2018/720.pdf)
+ *
+ * For flexibility, this implementation also allows other ciphers:
+ *
+ *	- Stream cipher: XChaCha12 or XChaCha20
+ *	- Block cipher: any with a 128-bit block size and 256-bit key
+ *
+ * This implementation doesn't currently allow other ε-∆U hash functions, i.e.
+ * HPolyC is not supported.  This is because Adiantum is ~20% faster than HPolyC
+ * but still provably as secure, and also the ε-∆U hash function of HBSH is
+ * formally defined to take two inputs (tweak, message) which makes it difficult
+ * to wrap with the crypto_shash API.  Rather, some details need to be handled
+ * here.  Nevertheless, if needed in the future, support for other ε-∆U hash
+ * functions could be added here.
+ */
+
+#include <crypto/b128ops.h>
+#include <crypto/chacha.h>
+#include <crypto/internal/hash.h>
+#include <crypto/internal/skcipher.h>
+#include <crypto/nhpoly1305.h>
+#include <crypto/scatterwalk.h>
+#include <linux/module.h>
+
+#include "internal.h"
+
+/*
+ * Size of right-hand part of input data, in bytes; also the size of the block
+ * cipher's block size and the hash function's output.
+ */
+#define BLOCKCIPHER_BLOCK_SIZE		16
+
+/* Size of the block cipher key (K_E) in bytes */
+#define BLOCKCIPHER_KEY_SIZE		32
+
+/* Size of the hash key (K_H) in bytes */
+#define HASH_KEY_SIZE		(POLY1305_BLOCK_SIZE + NHPOLY1305_KEY_SIZE)
+
+/*
+ * The specification allows variable-length tweaks, but Linux's crypto API
+ * currently only allows algorithms to support a single length.  The "natural"
+ * tweak length for Adiantum is 16, since that fits into one Poly1305 block for
+ * the best performance.  But longer tweaks are useful for fscrypt, to avoid
+ * needing to derive per-file keys.  So instead we use two blocks, or 32 bytes.
+ */
+#define TWEAK_SIZE		32
+
+struct adiantum_instance_ctx {
+	struct crypto_skcipher_spawn streamcipher_spawn;
+	struct crypto_spawn blockcipher_spawn;
+	struct crypto_shash_spawn hash_spawn;
+};
+
+struct adiantum_tfm_ctx {
+	struct crypto_skcipher *streamcipher;
+	struct crypto_cipher *blockcipher;
+	struct crypto_shash *hash;
+	struct poly1305_key header_hash_key;
+};
+
+struct adiantum_request_ctx {
+
+	/*
+	 * Buffer for right-hand part of data, i.e.
+	 *
+	 *    P_L => P_M => C_M => C_R when encrypting, or
+	 *    C_R => C_M => P_M => P_L when decrypting.
+	 *
+	 * Also used to build the IV for the stream cipher.
+	 */
+	union {
+		u8 bytes[XCHACHA_IV_SIZE];
+		__le32 words[XCHACHA_IV_SIZE / sizeof(__le32)];
+		le128 bignum;	/* interpret as element of Z/(2^{128}Z) */
+	} rbuf;
+
+	bool enc; /* true if encrypting, false if decrypting */
+
+	/*
+	 * The result of the Poly1305 ε-∆U hash function applied to
+	 * (bulk length, tweak)
+	 */
+	le128 header_hash;
+
+	/* Sub-requests, must be last */
+	union {
+		struct shash_desc hash_desc;
+		struct skcipher_request streamcipher_req;
+	} u;
+};
+
+/*
+ * Given the XChaCha stream key K_S, derive the block cipher key K_E and the
+ * hash key K_H as follows:
+ *
+ *     K_E || K_H || ... = XChaCha(key=K_S, nonce=1||0^191)
+ *
+ * Note that this denotes using bits from the XChaCha keystream, which here we
+ * get indirectly by encrypting a buffer containing all 0's.
+ */
+static int adiantum_setkey(struct crypto_skcipher *tfm, const u8 *key,
+			   unsigned int keylen)
+{
+	struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
+	struct {
+		u8 iv[XCHACHA_IV_SIZE];
+		u8 derived_keys[BLOCKCIPHER_KEY_SIZE + HASH_KEY_SIZE];
+		struct scatterlist sg;
+		struct crypto_wait wait;
+		struct skcipher_request req; /* must be last */
+	} *data;
+	u8 *keyp;
+	int err;
+
+	/* Set the stream cipher key (K_S) */
+	crypto_skcipher_clear_flags(tctx->streamcipher, CRYPTO_TFM_REQ_MASK);
+	crypto_skcipher_set_flags(tctx->streamcipher,
+				  crypto_skcipher_get_flags(tfm) &
+				  CRYPTO_TFM_REQ_MASK);
+	err = crypto_skcipher_setkey(tctx->streamcipher, key, keylen);
+	crypto_skcipher_set_flags(tfm,
+				crypto_skcipher_get_flags(tctx->streamcipher) &
+				CRYPTO_TFM_RES_MASK);
+	if (err)
+		return err;
+
+	/* Derive the subkeys */
+	data = kzalloc(sizeof(*data) +
+		       crypto_skcipher_reqsize(tctx->streamcipher), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+	data->iv[0] = 1;
+	sg_init_one(&data->sg, data->derived_keys, sizeof(data->derived_keys));
+	crypto_init_wait(&data->wait);
+	skcipher_request_set_tfm(&data->req, tctx->streamcipher);
+	skcipher_request_set_callback(&data->req, CRYPTO_TFM_REQ_MAY_SLEEP |
+						  CRYPTO_TFM_REQ_MAY_BACKLOG,
+				      crypto_req_done, &data->wait);
+	skcipher_request_set_crypt(&data->req, &data->sg, &data->sg,
+				   sizeof(data->derived_keys), data->iv);
+	err = crypto_wait_req(crypto_skcipher_encrypt(&data->req), &data->wait);
+	if (err)
+		goto out;
+	keyp = data->derived_keys;
+
+	/* Set the block cipher key (K_E) */
+	crypto_cipher_clear_flags(tctx->blockcipher, CRYPTO_TFM_REQ_MASK);
+	crypto_cipher_set_flags(tctx->blockcipher,
+				crypto_skcipher_get_flags(tfm) &
+				CRYPTO_TFM_REQ_MASK);
+	err = crypto_cipher_setkey(tctx->blockcipher, keyp,
+				   BLOCKCIPHER_KEY_SIZE);
+	crypto_skcipher_set_flags(tfm,
+				  crypto_cipher_get_flags(tctx->blockcipher) &
+				  CRYPTO_TFM_RES_MASK);
+	if (err)
+		goto out;
+	keyp += BLOCKCIPHER_KEY_SIZE;
+
+	/* Set the hash key (K_H) */
+	poly1305_core_setkey(&tctx->header_hash_key, keyp);
+	keyp += POLY1305_BLOCK_SIZE;
+
+	crypto_shash_clear_flags(tctx->hash, CRYPTO_TFM_REQ_MASK);
+	crypto_shash_set_flags(tctx->hash, crypto_skcipher_get_flags(tfm) &
+					   CRYPTO_TFM_REQ_MASK);
+	err = crypto_shash_setkey(tctx->hash, keyp, NHPOLY1305_KEY_SIZE);
+	crypto_skcipher_set_flags(tfm, crypto_shash_get_flags(tctx->hash) &
+				       CRYPTO_TFM_RES_MASK);
+	keyp += NHPOLY1305_KEY_SIZE;
+	WARN_ON(keyp != &data->derived_keys[ARRAY_SIZE(data->derived_keys)]);
+out:
+	kzfree(data);
+	return err;
+}
+
+/* Addition in Z/(2^{128}Z) */
+static inline void le128_add(le128 *r, const le128 *v1, const le128 *v2)
+{
+	u64 x = le64_to_cpu(v1->b);
+	u64 y = le64_to_cpu(v2->b);
+
+	r->b = cpu_to_le64(x + y);
+	r->a = cpu_to_le64(le64_to_cpu(v1->a) + le64_to_cpu(v2->a) +
+			   (x + y < x));
+}
+
+/* Subtraction in Z/(2^{128}Z) */
+static inline void le128_sub(le128 *r, const le128 *v1, const le128 *v2)
+{
+	u64 x = le64_to_cpu(v1->b);
+	u64 y = le64_to_cpu(v2->b);
+
+	r->b = cpu_to_le64(x - y);
+	r->a = cpu_to_le64(le64_to_cpu(v1->a) - le64_to_cpu(v2->a) -
+			   (x - y > x));
+}
+
+/*
+ * Apply the Poly1305 ε-∆U hash function to (bulk length, tweak) and save the
+ * result to rctx->header_hash.  This is the calculation
+ *
+ *	H_T ← Poly1305_{K_T}(bin_{128}(|L|) || T)
+ *
+ * from the procedure in section 6.4 of the Adiantum paper.  The resulting value
+ * is reused in both the first and second hash steps.  Specifically, it's added
+ * to the result of an independently keyed ε-∆U hash function (for equal length
+ * inputs only) taken over the left-hand part (the "bulk") of the message, to
+ * give the overall Adiantum hash of the (tweak, left-hand part) pair.
+ */
+static void adiantum_hash_header(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	const struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
+	struct adiantum_request_ctx *rctx = skcipher_request_ctx(req);
+	const unsigned int bulk_len = req->cryptlen - BLOCKCIPHER_BLOCK_SIZE;
+	struct {
+		__le64 message_bits;
+		__le64 padding;
+	} header = {
+		.message_bits = cpu_to_le64((u64)bulk_len * 8)
+	};
+	struct poly1305_state state;
+
+	poly1305_core_init(&state);
+
+	BUILD_BUG_ON(sizeof(header) % POLY1305_BLOCK_SIZE != 0);
+	poly1305_core_blocks(&state, &tctx->header_hash_key,
+			     &header, sizeof(header) / POLY1305_BLOCK_SIZE);
+
+	BUILD_BUG_ON(TWEAK_SIZE % POLY1305_BLOCK_SIZE != 0);
+	poly1305_core_blocks(&state, &tctx->header_hash_key, req->iv,
+			     TWEAK_SIZE / POLY1305_BLOCK_SIZE);
+
+	poly1305_core_emit(&state, &rctx->header_hash);
+}
+
+/* Hash the left-hand part (the "bulk") of the message using NHPoly1305 */
+static int adiantum_hash_message(struct skcipher_request *req,
+				 struct scatterlist *sgl, le128 *digest)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	const struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
+	struct adiantum_request_ctx *rctx = skcipher_request_ctx(req);
+	const unsigned int bulk_len = req->cryptlen - BLOCKCIPHER_BLOCK_SIZE;
+	struct shash_desc *hash_desc = &rctx->u.hash_desc;
+	struct sg_mapping_iter miter;
+	unsigned int i, n;
+	int err;
+
+	hash_desc->tfm = tctx->hash;
+	hash_desc->flags = 0;
+
+	err = crypto_shash_init(hash_desc);
+	if (err)
+		return err;
+
+	sg_miter_start(&miter, sgl, sg_nents(sgl),
+		       SG_MITER_FROM_SG | SG_MITER_ATOMIC);
+	for (i = 0; i < bulk_len; i += n) {
+		sg_miter_next(&miter);
+		n = min_t(unsigned int, miter.length, bulk_len - i);
+		err = crypto_shash_update(hash_desc, miter.addr, n);
+		if (err)
+			break;
+	}
+	sg_miter_stop(&miter);
+	if (err)
+		return err;
+
+	return crypto_shash_final(hash_desc, (u8 *)digest);
+}
+
+/* Continue Adiantum encryption/decryption after the stream cipher step */
+static int adiantum_finish(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	const struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
+	struct adiantum_request_ctx *rctx = skcipher_request_ctx(req);
+	const unsigned int bulk_len = req->cryptlen - BLOCKCIPHER_BLOCK_SIZE;
+	le128 digest;
+	int err;
+
+	/* If decrypting, decrypt C_M with the block cipher to get P_M */
+	if (!rctx->enc)
+		crypto_cipher_decrypt_one(tctx->blockcipher, rctx->rbuf.bytes,
+					  rctx->rbuf.bytes);
+
+	/*
+	 * Second hash step
+	 *	enc: C_R = C_M - H_{K_H}(T, C_L)
+	 *	dec: P_R = P_M - H_{K_H}(T, P_L)
+	 */
+	err = adiantum_hash_message(req, req->dst, &digest);
+	if (err)
+		return err;
+	le128_add(&digest, &digest, &rctx->header_hash);
+	le128_sub(&rctx->rbuf.bignum, &rctx->rbuf.bignum, &digest);
+	scatterwalk_map_and_copy(&rctx->rbuf.bignum, req->dst,
+				 bulk_len, BLOCKCIPHER_BLOCK_SIZE, 1);
+	return 0;
+}
+
+static void adiantum_streamcipher_done(struct crypto_async_request *areq,
+				       int err)
+{
+	struct skcipher_request *req = areq->data;
+
+	if (!err)
+		err = adiantum_finish(req);
+
+	skcipher_request_complete(req, err);
+}
+
+static int adiantum_crypt(struct skcipher_request *req, bool enc)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	const struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
+	struct adiantum_request_ctx *rctx = skcipher_request_ctx(req);
+	const unsigned int bulk_len = req->cryptlen - BLOCKCIPHER_BLOCK_SIZE;
+	unsigned int stream_len;
+	le128 digest;
+	int err;
+
+	if (req->cryptlen < BLOCKCIPHER_BLOCK_SIZE)
+		return -EINVAL;
+
+	rctx->enc = enc;
+
+	/*
+	 * First hash step
+	 *	enc: P_M = P_R + H_{K_H}(T, P_L)
+	 *	dec: C_M = C_R + H_{K_H}(T, C_L)
+	 */
+	adiantum_hash_header(req);
+	err = adiantum_hash_message(req, req->src, &digest);
+	if (err)
+		return err;
+	le128_add(&digest, &digest, &rctx->header_hash);
+	scatterwalk_map_and_copy(&rctx->rbuf.bignum, req->src,
+				 bulk_len, BLOCKCIPHER_BLOCK_SIZE, 0);
+	le128_add(&rctx->rbuf.bignum, &rctx->rbuf.bignum, &digest);
+
+	/* If encrypting, encrypt P_M with the block cipher to get C_M */
+	if (enc)
+		crypto_cipher_encrypt_one(tctx->blockcipher, rctx->rbuf.bytes,
+					  rctx->rbuf.bytes);
+
+	/* Initialize the rest of the XChaCha IV (first part is C_M) */
+	BUILD_BUG_ON(BLOCKCIPHER_BLOCK_SIZE != 16);
+	BUILD_BUG_ON(XCHACHA_IV_SIZE != 32);	/* nonce || stream position */
+	rctx->rbuf.words[4] = cpu_to_le32(1);
+	rctx->rbuf.words[5] = 0;
+	rctx->rbuf.words[6] = 0;
+	rctx->rbuf.words[7] = 0;
+
+	/*
+	 * XChaCha needs to be done on all the data except the last 16 bytes;
+	 * for disk encryption that usually means 4080 or 496 bytes.  But ChaCha
+	 * implementations tend to be most efficient when passed a whole number
+	 * of 64-byte ChaCha blocks, or sometimes even a multiple of 256 bytes.
+	 * And here it doesn't matter whether the last 16 bytes are written to,
+	 * as the second hash step will overwrite them.  Thus, round the XChaCha
+	 * length up to the next 64-byte boundary if possible.
+	 */
+	stream_len = bulk_len;
+	if (round_up(stream_len, CHACHA_BLOCK_SIZE) <= req->cryptlen)
+		stream_len = round_up(stream_len, CHACHA_BLOCK_SIZE);
+
+	skcipher_request_set_tfm(&rctx->u.streamcipher_req, tctx->streamcipher);
+	skcipher_request_set_crypt(&rctx->u.streamcipher_req, req->src,
+				   req->dst, stream_len, &rctx->rbuf);
+	skcipher_request_set_callback(&rctx->u.streamcipher_req,
+				      req->base.flags,
+				      adiantum_streamcipher_done, req);
+	return crypto_skcipher_encrypt(&rctx->u.streamcipher_req) ?:
+		adiantum_finish(req);
+}
+
+static int adiantum_encrypt(struct skcipher_request *req)
+{
+	return adiantum_crypt(req, true);
+}
+
+static int adiantum_decrypt(struct skcipher_request *req)
+{
+	return adiantum_crypt(req, false);
+}
+
+static int adiantum_init_tfm(struct crypto_skcipher *tfm)
+{
+	struct skcipher_instance *inst = skcipher_alg_instance(tfm);
+	struct adiantum_instance_ctx *ictx = skcipher_instance_ctx(inst);
+	struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
+	struct crypto_skcipher *streamcipher;
+	struct crypto_cipher *blockcipher;
+	struct crypto_shash *hash;
+	unsigned int subreq_size;
+	int err;
+
+	streamcipher = crypto_spawn_skcipher(&ictx->streamcipher_spawn);
+	if (IS_ERR(streamcipher))
+		return PTR_ERR(streamcipher);
+
+	blockcipher = crypto_spawn_cipher(&ictx->blockcipher_spawn);
+	if (IS_ERR(blockcipher)) {
+		err = PTR_ERR(blockcipher);
+		goto err_free_streamcipher;
+	}
+
+	hash = crypto_spawn_shash(&ictx->hash_spawn);
+	if (IS_ERR(hash)) {
+		err = PTR_ERR(hash);
+		goto err_free_blockcipher;
+	}
+
+	tctx->streamcipher = streamcipher;
+	tctx->blockcipher = blockcipher;
+	tctx->hash = hash;
+
+	BUILD_BUG_ON(offsetofend(struct adiantum_request_ctx, u) !=
+		     sizeof(struct adiantum_request_ctx));
+	subreq_size = max(FIELD_SIZEOF(struct adiantum_request_ctx,
+				       u.hash_desc) +
+			  crypto_shash_descsize(hash),
+			  FIELD_SIZEOF(struct adiantum_request_ctx,
+				       u.streamcipher_req) +
+			  crypto_skcipher_reqsize(streamcipher));
+
+	crypto_skcipher_set_reqsize(tfm,
+				    offsetof(struct adiantum_request_ctx, u) +
+				    subreq_size);
+	return 0;
+
+err_free_blockcipher:
+	crypto_free_cipher(blockcipher);
+err_free_streamcipher:
+	crypto_free_skcipher(streamcipher);
+	return err;
+}
+
+static void adiantum_exit_tfm(struct crypto_skcipher *tfm)
+{
+	struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
+
+	crypto_free_skcipher(tctx->streamcipher);
+	crypto_free_cipher(tctx->blockcipher);
+	crypto_free_shash(tctx->hash);
+}
+
+static void adiantum_free_instance(struct skcipher_instance *inst)
+{
+	struct adiantum_instance_ctx *ictx = skcipher_instance_ctx(inst);
+
+	crypto_drop_skcipher(&ictx->streamcipher_spawn);
+	crypto_drop_spawn(&ictx->blockcipher_spawn);
+	crypto_drop_shash(&ictx->hash_spawn);
+	kfree(inst);
+}
+
+/*
+ * Check for a supported set of inner algorithms.
+ * See the comment at the beginning of this file.
+ */
+static bool adiantum_supported_algorithms(struct skcipher_alg *streamcipher_alg,
+					  struct crypto_alg *blockcipher_alg,
+					  struct shash_alg *hash_alg)
+{
+	if (strcmp(streamcipher_alg->base.cra_name, "xchacha12") != 0 &&
+	    strcmp(streamcipher_alg->base.cra_name, "xchacha20") != 0)
+		return false;
+
+	if (blockcipher_alg->cra_cipher.cia_min_keysize > BLOCKCIPHER_KEY_SIZE ||
+	    blockcipher_alg->cra_cipher.cia_max_keysize < BLOCKCIPHER_KEY_SIZE)
+		return false;
+	if (blockcipher_alg->cra_blocksize != BLOCKCIPHER_BLOCK_SIZE)
+		return false;
+
+	if (strcmp(hash_alg->base.cra_name, "nhpoly1305") != 0)
+		return false;
+
+	return true;
+}
+
+static int adiantum_create(struct crypto_template *tmpl, struct rtattr **tb)
+{
+	struct crypto_attr_type *algt;
+	const char *streamcipher_name;
+	const char *blockcipher_name;
+	const char *nhpoly1305_name;
+	struct skcipher_instance *inst;
+	struct adiantum_instance_ctx *ictx;
+	struct skcipher_alg *streamcipher_alg;
+	struct crypto_alg *blockcipher_alg;
+	struct crypto_alg *_hash_alg;
+	struct shash_alg *hash_alg;
+	int err;
+
+	algt = crypto_get_attr_type(tb);
+	if (IS_ERR(algt))
+		return PTR_ERR(algt);
+
+	if ((algt->type ^ CRYPTO_ALG_TYPE_SKCIPHER) & algt->mask)
+		return -EINVAL;
+
+	streamcipher_name = crypto_attr_alg_name(tb[1]);
+	if (IS_ERR(streamcipher_name))
+		return PTR_ERR(streamcipher_name);
+
+	blockcipher_name = crypto_attr_alg_name(tb[2]);
+	if (IS_ERR(blockcipher_name))
+		return PTR_ERR(blockcipher_name);
+
+	nhpoly1305_name = crypto_attr_alg_name(tb[3]);
+	if (nhpoly1305_name == ERR_PTR(-ENOENT))
+		nhpoly1305_name = "nhpoly1305";
+	if (IS_ERR(nhpoly1305_name))
+		return PTR_ERR(nhpoly1305_name);
+
+	inst = kzalloc(sizeof(*inst) + sizeof(*ictx), GFP_KERNEL);
+	if (!inst)
+		return -ENOMEM;
+	ictx = skcipher_instance_ctx(inst);
+
+	/* Stream cipher, e.g. "xchacha12" */
+	err = crypto_grab_skcipher(&ictx->streamcipher_spawn, streamcipher_name,
+				   0, crypto_requires_sync(algt->type,
+							   algt->mask));
+	if (err)
+		goto out_free_inst;
+	streamcipher_alg = crypto_spawn_skcipher_alg(&ictx->streamcipher_spawn);
+
+	/* Block cipher, e.g. "aes" */
+	err = crypto_grab_spawn(&ictx->blockcipher_spawn, blockcipher_name,
+				CRYPTO_ALG_TYPE_CIPHER, CRYPTO_ALG_TYPE_MASK);
+	if (err)
+		goto out_drop_streamcipher;
+	blockcipher_alg = ictx->blockcipher_spawn.alg;
+
+	/* NHPoly1305 ε-∆U hash function */
+	_hash_alg = crypto_alg_mod_lookup(nhpoly1305_name,
+					  CRYPTO_ALG_TYPE_SHASH,
+					  CRYPTO_ALG_TYPE_MASK);
+	if (IS_ERR(_hash_alg)) {
+		err = PTR_ERR(_hash_alg);
+		goto out_drop_blockcipher;
+	}
+	hash_alg = __crypto_shash_alg(_hash_alg);
+	err = crypto_init_shash_spawn(&ictx->hash_spawn, hash_alg,
+				      skcipher_crypto_instance(inst));
+	if (err)
+		goto out_put_hash;
+
+	/* Check the set of algorithms */
+	if (!adiantum_supported_algorithms(streamcipher_alg, blockcipher_alg,
+					   hash_alg)) {
+		pr_warn("Unsupported Adiantum instantiation: (%s,%s,%s)\n",
+			streamcipher_alg->base.cra_name,
+			blockcipher_alg->cra_name, hash_alg->base.cra_name);
+		err = -EINVAL;
+		goto out_drop_hash;
+	}
+
+	/* Instance fields */
+
+	err = -ENAMETOOLONG;
+	if (snprintf(inst->alg.base.cra_name, CRYPTO_MAX_ALG_NAME,
+		     "adiantum(%s,%s)", streamcipher_alg->base.cra_name,
+		     blockcipher_alg->cra_name) >= CRYPTO_MAX_ALG_NAME)
+		goto out_drop_hash;
+	if (snprintf(inst->alg.base.cra_driver_name, CRYPTO_MAX_ALG_NAME,
+		     "adiantum(%s,%s,%s)",
+		     streamcipher_alg->base.cra_driver_name,
+		     blockcipher_alg->cra_driver_name,
+		     hash_alg->base.cra_driver_name) >= CRYPTO_MAX_ALG_NAME)
+		goto out_drop_hash;
+
+	inst->alg.base.cra_flags = streamcipher_alg->base.cra_flags &
+				   CRYPTO_ALG_ASYNC;
+	inst->alg.base.cra_blocksize = BLOCKCIPHER_BLOCK_SIZE;
+	inst->alg.base.cra_ctxsize = sizeof(struct adiantum_tfm_ctx);
+	inst->alg.base.cra_alignmask = streamcipher_alg->base.cra_alignmask |
+				       hash_alg->base.cra_alignmask;
+	/*
+	 * The block cipher is only invoked once per message, so for long
+	 * messages (e.g. sectors for disk encryption) its performance doesn't
+	 * matter as much as that of the stream cipher and hash function.  Thus,
+	 * weigh the block cipher's ->cra_priority less.
+	 */
+	inst->alg.base.cra_priority = (4 * streamcipher_alg->base.cra_priority +
+				       2 * hash_alg->base.cra_priority +
+				       blockcipher_alg->cra_priority) / 7;
+
+	inst->alg.setkey = adiantum_setkey;
+	inst->alg.encrypt = adiantum_encrypt;
+	inst->alg.decrypt = adiantum_decrypt;
+	inst->alg.init = adiantum_init_tfm;
+	inst->alg.exit = adiantum_exit_tfm;
+	inst->alg.min_keysize = crypto_skcipher_alg_min_keysize(streamcipher_alg);
+	inst->alg.max_keysize = crypto_skcipher_alg_max_keysize(streamcipher_alg);
+	inst->alg.ivsize = TWEAK_SIZE;
+
+	inst->free = adiantum_free_instance;
+
+	err = skcipher_register_instance(tmpl, inst);
+	if (err)
+		goto out_drop_hash;
+
+	crypto_mod_put(_hash_alg);
+	return 0;
+
+out_drop_hash:
+	crypto_drop_shash(&ictx->hash_spawn);
+out_put_hash:
+	crypto_mod_put(_hash_alg);
+out_drop_blockcipher:
+	crypto_drop_spawn(&ictx->blockcipher_spawn);
+out_drop_streamcipher:
+	crypto_drop_skcipher(&ictx->streamcipher_spawn);
+out_free_inst:
+	kfree(inst);
+	return err;
+}
+
+/* adiantum(streamcipher_name, blockcipher_name [, nhpoly1305_name]) */
+static struct crypto_template adiantum_tmpl = {
+	.name = "adiantum",
+	.create = adiantum_create,
+	.module = THIS_MODULE,
+};
+
+static int __init adiantum_module_init(void)
+{
+	return crypto_register_template(&adiantum_tmpl);
+}
+
+static void __exit adiantum_module_exit(void)
+{
+	crypto_unregister_template(&adiantum_tmpl);
+}
+
+module_init(adiantum_module_init);
+module_exit(adiantum_module_exit);
+
+MODULE_DESCRIPTION("Adiantum length-preserving encryption mode");
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
+MODULE_ALIAS_CRYPTO("adiantum");

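One detail of the new file worth spelling out: le128_add() detects a carry out of the low 64-bit limb with the comparison `x + y < x`, and le128_sub() detects a borrow with `x - y > x`, both relying on well-defined unsigned wraparound. A standalone check of the carry form, using plain uint64_t instead of the kernel's le64 types:

	#include <assert.h>
	#include <stdint.h>

	/* 128-bit value as two 64-bit limbs, low limb first. */
	struct u128 { uint64_t lo, hi; };

	static struct u128 add128(struct u128 a, struct u128 b)
	{
		struct u128 r;

		r.lo = a.lo + b.lo;
		/* Unsigned addition wrapped around iff the sum is smaller
		 * than an operand, so (r.lo < a.lo) is exactly the carry. */
		r.hi = a.hi + b.hi + (r.lo < a.lo);
		return r;
	}

	int main(void)
	{
		struct u128 a = { UINT64_MAX, 0 }, one = { 1, 0 };
		struct u128 r = add128(a, one);

		assert(r.lo == 0 && r.hi == 1);	/* carry propagated */
		return 0;
	}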
crypto/aead.c

@@ -119,20 +119,16 @@ static int crypto_aead_report(struct sk_buff *skb, struct crypto_alg *alg)
 	struct crypto_report_aead raead;
 	struct aead_alg *aead = container_of(alg, struct aead_alg, base);

-	strncpy(raead.type, "aead", sizeof(raead.type));
-	strncpy(raead.geniv, "<none>", sizeof(raead.geniv));
+	memset(&raead, 0, sizeof(raead));
+
+	strscpy(raead.type, "aead", sizeof(raead.type));
+	strscpy(raead.geniv, "<none>", sizeof(raead.geniv));

 	raead.blocksize = alg->cra_blocksize;
 	raead.maxauthsize = aead->maxauthsize;
 	raead.ivsize = aead->ivsize;

-	if (nla_put(skb, CRYPTOCFGA_REPORT_AEAD,
-		    sizeof(struct crypto_report_aead), &raead))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_REPORT_AEAD, sizeof(raead), &raead);
 }
 #else
 static int crypto_aead_report(struct sk_buff *skb, struct crypto_alg *alg)

crypto/aes_generic.c

@@ -63,7 +63,8 @@ static inline u8 byte(const u32 x, const unsigned n)

 static const u32 rco_tab[10] = { 1, 2, 4, 8, 16, 32, 64, 128, 27, 54 };

-__visible const u32 crypto_ft_tab[4][256] = {
+/* cacheline-aligned to facilitate prefetching into cache */
+__visible const u32 crypto_ft_tab[4][256] __cacheline_aligned = {
 	{
 		0xa56363c6, 0x847c7cf8, 0x997777ee, 0x8d7b7bf6,
 		0x0df2f2ff, 0xbd6b6bd6, 0xb16f6fde, 0x54c5c591,
@@ -327,7 +328,7 @@ __visible const u32 crypto_ft_tab[4][256] = {
 	}
 };

-__visible const u32 crypto_fl_tab[4][256] = {
+__visible const u32 crypto_fl_tab[4][256] __cacheline_aligned = {
 	{
 		0x00000063, 0x0000007c, 0x00000077, 0x0000007b,
 		0x000000f2, 0x0000006b, 0x0000006f, 0x000000c5,
@@ -591,7 +592,7 @@ __visible const u32 crypto_fl_tab[4][256] = {
 	}
 };

-__visible const u32 crypto_it_tab[4][256] = {
+__visible const u32 crypto_it_tab[4][256] __cacheline_aligned = {
 	{
 		0x50a7f451, 0x5365417e, 0xc3a4171a, 0x965e273a,
 		0xcb6bab3b, 0xf1459d1f, 0xab58faac, 0x9303e34b,
@@ -855,7 +856,7 @@ __visible const u32 crypto_it_tab[4][256] = {
 	}
 };

-__visible const u32 crypto_il_tab[4][256] = {
+__visible const u32 crypto_il_tab[4][256] __cacheline_aligned = {
 	{
 		0x00000052, 0x00000009, 0x0000006a, 0x000000d5,
 		0x00000030, 0x00000036, 0x000000a5, 0x00000038,

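The __cacheline_aligned annotations make each 1 KiB lookup table start on a cache line boundary, so a constant-time-minded implementation can touch every line of a table with a fixed stride before indexing it with secret-dependent values. A sketch of such a prefetch pass; the table contents and the 64-byte line size are placeholders, not the real kernel definitions:

	#define L1_CACHE_BYTES	64	/* typical x86 line size; arch-specific */

	static const unsigned int table[4][256]
		__attribute__((aligned(L1_CACHE_BYTES))) =
		{ { 0 /* ... 1024 words per table in the real file */ } };

	/* Touch one byte per cache line so the whole table is resident before
	 * any data-dependent index is used; because the table is aligned,
	 * sizeof(table)/L1_CACHE_BYTES loads cover it exactly. */
	static void prefetch_table(void)
	{
		const volatile unsigned char *p =
			(const volatile unsigned char *)table;
		unsigned int i;

		for (i = 0; i < sizeof(table); i += L1_CACHE_BYTES)
			(void)p[i];
	}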
crypto/aes_ti.c

@@ -269,6 +269,7 @@ static void aesti_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
 	const u32 *rkp = ctx->key_enc + 4;
 	int rounds = 6 + ctx->key_length / 4;
 	u32 st0[4], st1[4];
+	unsigned long flags;
 	int round;

 	st0[0] = ctx->key_enc[0] ^ get_unaligned_le32(in);
@@ -276,6 +277,12 @@ static void aesti_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
 	st0[2] = ctx->key_enc[2] ^ get_unaligned_le32(in + 8);
 	st0[3] = ctx->key_enc[3] ^ get_unaligned_le32(in + 12);

+	/*
+	 * Temporarily disable interrupts to avoid races where cachelines are
+	 * evicted when the CPU is interrupted to do something else.
+	 */
+	local_irq_save(flags);
+
 	st0[0] ^= __aesti_sbox[ 0] ^ __aesti_sbox[128];
 	st0[1] ^= __aesti_sbox[32] ^ __aesti_sbox[160];
 	st0[2] ^= __aesti_sbox[64] ^ __aesti_sbox[192];
@@ -300,6 +307,8 @@ static void aesti_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
 	put_unaligned_le32(subshift(st1, 1) ^ rkp[5], out + 4);
 	put_unaligned_le32(subshift(st1, 2) ^ rkp[6], out + 8);
 	put_unaligned_le32(subshift(st1, 3) ^ rkp[7], out + 12);
+
+	local_irq_restore(flags);
 }

 static void aesti_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
@@ -308,6 +317,7 @@ static void aesti_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
 	const u32 *rkp = ctx->key_dec + 4;
 	int rounds = 6 + ctx->key_length / 4;
 	u32 st0[4], st1[4];
+	unsigned long flags;
 	int round;

 	st0[0] = ctx->key_dec[0] ^ get_unaligned_le32(in);
@@ -315,6 +325,12 @@ static void aesti_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
 	st0[2] = ctx->key_dec[2] ^ get_unaligned_le32(in + 8);
 	st0[3] = ctx->key_dec[3] ^ get_unaligned_le32(in + 12);

+	/*
+	 * Temporarily disable interrupts to avoid races where cachelines are
+	 * evicted when the CPU is interrupted to do something else.
+	 */
+	local_irq_save(flags);
+
 	st0[0] ^= __aesti_inv_sbox[ 0] ^ __aesti_inv_sbox[128];
 	st0[1] ^= __aesti_inv_sbox[32] ^ __aesti_inv_sbox[160];
 	st0[2] ^= __aesti_inv_sbox[64] ^ __aesti_inv_sbox[192];
@@ -339,6 +355,8 @@ static void aesti_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
 	put_unaligned_le32(inv_subshift(st1, 1) ^ rkp[5], out + 4);
 	put_unaligned_le32(inv_subshift(st1, 2) ^ rkp[6], out + 8);
 	put_unaligned_le32(inv_subshift(st1, 3) ^ rkp[7], out + 12);
+
+	local_irq_restore(flags);
 }

 static struct crypto_alg aes_alg = {

crypto/ahash.c

@@ -364,20 +364,28 @@ static int crypto_ahash_op(struct ahash_request *req,

 int crypto_ahash_final(struct ahash_request *req)
 {
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct crypto_alg *alg = tfm->base.__crt_alg;
+	unsigned int nbytes = req->nbytes;
 	int ret;

+	crypto_stats_get(alg);
 	ret = crypto_ahash_op(req, crypto_ahash_reqtfm(req)->final);
-	crypto_stat_ahash_final(req, ret);
+	crypto_stats_ahash_final(nbytes, ret, alg);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(crypto_ahash_final);

 int crypto_ahash_finup(struct ahash_request *req)
 {
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct crypto_alg *alg = tfm->base.__crt_alg;
+	unsigned int nbytes = req->nbytes;
 	int ret;

+	crypto_stats_get(alg);
 	ret = crypto_ahash_op(req, crypto_ahash_reqtfm(req)->finup);
-	crypto_stat_ahash_final(req, ret);
+	crypto_stats_ahash_final(nbytes, ret, alg);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(crypto_ahash_finup);
@@ -385,13 +393,16 @@ EXPORT_SYMBOL_GPL(crypto_ahash_finup);
 int crypto_ahash_digest(struct ahash_request *req)
 {
 	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct crypto_alg *alg = tfm->base.__crt_alg;
+	unsigned int nbytes = req->nbytes;
 	int ret;

+	crypto_stats_get(alg);
 	if (crypto_ahash_get_flags(tfm) & CRYPTO_TFM_NEED_KEY)
 		ret = -ENOKEY;
 	else
 		ret = crypto_ahash_op(req, tfm->digest);
-	crypto_stat_ahash_final(req, ret);
+	crypto_stats_ahash_final(nbytes, ret, alg);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(crypto_ahash_digest);
@@ -498,18 +509,14 @@ static int crypto_ahash_report(struct sk_buff *skb, struct crypto_alg *alg)
 {
 	struct crypto_report_hash rhash;

-	strncpy(rhash.type, "ahash", sizeof(rhash.type));
+	memset(&rhash, 0, sizeof(rhash));
+
+	strscpy(rhash.type, "ahash", sizeof(rhash.type));

 	rhash.blocksize = alg->cra_blocksize;
 	rhash.digestsize = __crypto_hash_alg_common(alg)->digestsize;

-	if (nla_put(skb, CRYPTOCFGA_REPORT_HASH,
-		    sizeof(struct crypto_report_hash), &rhash))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_REPORT_HASH, sizeof(rhash), &rhash);
 }
 #else
 static int crypto_ahash_report(struct sk_buff *skb, struct crypto_alg *alg)

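The pattern added to the ahash entry points pairs crypto_stats_get(), which takes a reference on the algorithm before the operation starts, with a crypto_stats_*() accounting helper that drops the reference once the result is recorded, so the alg stays pinned across the window between starting the operation and updating alg->stats. A standalone model of that discipline using C11 atomics; everything here is a simplified stand-in, not the kernel's definitions:

	#include <stdatomic.h>

	struct alg_stats { atomic_ulong refcnt, ok_cnt, err_cnt; };

	static int op_start(void) { return 0; }	/* stand-in for the real op */

	static int run_op_with_stats(struct alg_stats *s)
	{
		int ret;

		atomic_fetch_add(&s->refcnt, 1);	/* crypto_stats_get() */
		ret = op_start();
		if (ret)
			atomic_fetch_add(&s->err_cnt, 1);
		else
			atomic_fetch_add(&s->ok_cnt, 1);
		atomic_fetch_sub(&s->refcnt, 1);	/* paired put at accounting */
		return ret;
	}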
crypto/akcipher.c

@@ -30,15 +30,12 @@ static int crypto_akcipher_report(struct sk_buff *skb, struct crypto_alg *alg)
 {
 	struct crypto_report_akcipher rakcipher;

-	strncpy(rakcipher.type, "akcipher", sizeof(rakcipher.type));
+	memset(&rakcipher, 0, sizeof(rakcipher));

-	if (nla_put(skb, CRYPTOCFGA_REPORT_AKCIPHER,
-		    sizeof(struct crypto_report_akcipher), &rakcipher))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	strscpy(rakcipher.type, "akcipher", sizeof(rakcipher.type));
+
+	return nla_put(skb, CRYPTOCFGA_REPORT_AKCIPHER,
+		       sizeof(rakcipher), &rakcipher);
 }
 #else
 static int crypto_akcipher_report(struct sk_buff *skb, struct crypto_alg *alg)

247
crypto/algapi.c
247
crypto/algapi.c
|
@ -258,13 +258,7 @@ static struct crypto_larval *__crypto_register_alg(struct crypto_alg *alg)
|
||||||
list_add(&alg->cra_list, &crypto_alg_list);
|
list_add(&alg->cra_list, &crypto_alg_list);
|
||||||
list_add(&larval->alg.cra_list, &crypto_alg_list);
|
list_add(&larval->alg.cra_list, &crypto_alg_list);
|
||||||
|
|
||||||
atomic_set(&alg->encrypt_cnt, 0);
|
crypto_stats_init(alg);
|
||||||
atomic_set(&alg->decrypt_cnt, 0);
|
|
||||||
atomic64_set(&alg->encrypt_tlen, 0);
|
|
||||||
atomic64_set(&alg->decrypt_tlen, 0);
|
|
||||||
atomic_set(&alg->verify_cnt, 0);
|
|
||||||
atomic_set(&alg->cipher_err_cnt, 0);
|
|
||||||
atomic_set(&alg->sign_cnt, 0);
|
|
||||||
|
|
||||||
out:
|
out:
|
||||||
return larval;
|
return larval;
|
||||||
|
@@ -1076,6 +1070,245 @@ int crypto_type_has_alg(const char *name, const struct crypto_type *frontend,
 }
 EXPORT_SYMBOL_GPL(crypto_type_has_alg);
 
+#ifdef CONFIG_CRYPTO_STATS
+void crypto_stats_init(struct crypto_alg *alg)
+{
+	memset(&alg->stats, 0, sizeof(alg->stats));
+}
+EXPORT_SYMBOL_GPL(crypto_stats_init);
+
+void crypto_stats_get(struct crypto_alg *alg)
+{
+	crypto_alg_get(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_get);
+
+void crypto_stats_ablkcipher_encrypt(unsigned int nbytes, int ret,
+				     struct crypto_alg *alg)
+{
+	if (ret && ret != -EINPROGRESS && ret != -EBUSY) {
+		atomic64_inc(&alg->stats.cipher.err_cnt);
+	} else {
+		atomic64_inc(&alg->stats.cipher.encrypt_cnt);
+		atomic64_add(nbytes, &alg->stats.cipher.encrypt_tlen);
+	}
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_ablkcipher_encrypt);
+
+void crypto_stats_ablkcipher_decrypt(unsigned int nbytes, int ret,
+				     struct crypto_alg *alg)
+{
+	if (ret && ret != -EINPROGRESS && ret != -EBUSY) {
+		atomic64_inc(&alg->stats.cipher.err_cnt);
+	} else {
+		atomic64_inc(&alg->stats.cipher.decrypt_cnt);
+		atomic64_add(nbytes, &alg->stats.cipher.decrypt_tlen);
+	}
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_ablkcipher_decrypt);
+
+void crypto_stats_aead_encrypt(unsigned int cryptlen, struct crypto_alg *alg,
+			       int ret)
+{
+	if (ret && ret != -EINPROGRESS && ret != -EBUSY) {
+		atomic64_inc(&alg->stats.aead.err_cnt);
+	} else {
+		atomic64_inc(&alg->stats.aead.encrypt_cnt);
+		atomic64_add(cryptlen, &alg->stats.aead.encrypt_tlen);
+	}
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_aead_encrypt);
+
+void crypto_stats_aead_decrypt(unsigned int cryptlen, struct crypto_alg *alg,
+			       int ret)
+{
+	if (ret && ret != -EINPROGRESS && ret != -EBUSY) {
+		atomic64_inc(&alg->stats.aead.err_cnt);
+	} else {
+		atomic64_inc(&alg->stats.aead.decrypt_cnt);
+		atomic64_add(cryptlen, &alg->stats.aead.decrypt_tlen);
+	}
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_aead_decrypt);
+
+void crypto_stats_akcipher_encrypt(unsigned int src_len, int ret,
+				   struct crypto_alg *alg)
+{
+	if (ret && ret != -EINPROGRESS && ret != -EBUSY) {
+		atomic64_inc(&alg->stats.akcipher.err_cnt);
+	} else {
+		atomic64_inc(&alg->stats.akcipher.encrypt_cnt);
+		atomic64_add(src_len, &alg->stats.akcipher.encrypt_tlen);
+	}
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_akcipher_encrypt);
+
+void crypto_stats_akcipher_decrypt(unsigned int src_len, int ret,
+				   struct crypto_alg *alg)
+{
+	if (ret && ret != -EINPROGRESS && ret != -EBUSY) {
+		atomic64_inc(&alg->stats.akcipher.err_cnt);
+	} else {
+		atomic64_inc(&alg->stats.akcipher.decrypt_cnt);
+		atomic64_add(src_len, &alg->stats.akcipher.decrypt_tlen);
+	}
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_akcipher_decrypt);
+
+void crypto_stats_akcipher_sign(int ret, struct crypto_alg *alg)
+{
+	if (ret && ret != -EINPROGRESS && ret != -EBUSY)
+		atomic64_inc(&alg->stats.akcipher.err_cnt);
+	else
+		atomic64_inc(&alg->stats.akcipher.sign_cnt);
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_akcipher_sign);
+
+void crypto_stats_akcipher_verify(int ret, struct crypto_alg *alg)
+{
+	if (ret && ret != -EINPROGRESS && ret != -EBUSY)
+		atomic64_inc(&alg->stats.akcipher.err_cnt);
+	else
+		atomic64_inc(&alg->stats.akcipher.verify_cnt);
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_akcipher_verify);
+
+void crypto_stats_compress(unsigned int slen, int ret, struct crypto_alg *alg)
+{
+	if (ret && ret != -EINPROGRESS && ret != -EBUSY) {
+		atomic64_inc(&alg->stats.compress.err_cnt);
+	} else {
+		atomic64_inc(&alg->stats.compress.compress_cnt);
+		atomic64_add(slen, &alg->stats.compress.compress_tlen);
+	}
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_compress);
+
+void crypto_stats_decompress(unsigned int slen, int ret, struct crypto_alg *alg)
+{
+	if (ret && ret != -EINPROGRESS && ret != -EBUSY) {
+		atomic64_inc(&alg->stats.compress.err_cnt);
+	} else {
+		atomic64_inc(&alg->stats.compress.decompress_cnt);
+		atomic64_add(slen, &alg->stats.compress.decompress_tlen);
+	}
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_decompress);
+
+void crypto_stats_ahash_update(unsigned int nbytes, int ret,
+			       struct crypto_alg *alg)
+{
+	if (ret && ret != -EINPROGRESS && ret != -EBUSY)
+		atomic64_inc(&alg->stats.hash.err_cnt);
+	else
+		atomic64_add(nbytes, &alg->stats.hash.hash_tlen);
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_ahash_update);
+
+void crypto_stats_ahash_final(unsigned int nbytes, int ret,
+			      struct crypto_alg *alg)
+{
+	if (ret && ret != -EINPROGRESS && ret != -EBUSY) {
+		atomic64_inc(&alg->stats.hash.err_cnt);
+	} else {
+		atomic64_inc(&alg->stats.hash.hash_cnt);
+		atomic64_add(nbytes, &alg->stats.hash.hash_tlen);
+	}
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_ahash_final);
+
+void crypto_stats_kpp_set_secret(struct crypto_alg *alg, int ret)
+{
+	if (ret)
+		atomic64_inc(&alg->stats.kpp.err_cnt);
+	else
+		atomic64_inc(&alg->stats.kpp.setsecret_cnt);
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_kpp_set_secret);
+
+void crypto_stats_kpp_generate_public_key(struct crypto_alg *alg, int ret)
+{
+	if (ret)
+		atomic64_inc(&alg->stats.kpp.err_cnt);
+	else
+		atomic64_inc(&alg->stats.kpp.generate_public_key_cnt);
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_kpp_generate_public_key);
+
+void crypto_stats_kpp_compute_shared_secret(struct crypto_alg *alg, int ret)
+{
+	if (ret)
+		atomic64_inc(&alg->stats.kpp.err_cnt);
+	else
+		atomic64_inc(&alg->stats.kpp.compute_shared_secret_cnt);
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_kpp_compute_shared_secret);
+
+void crypto_stats_rng_seed(struct crypto_alg *alg, int ret)
+{
+	if (ret && ret != -EINPROGRESS && ret != -EBUSY)
+		atomic64_inc(&alg->stats.rng.err_cnt);
+	else
+		atomic64_inc(&alg->stats.rng.seed_cnt);
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_rng_seed);
+
+void crypto_stats_rng_generate(struct crypto_alg *alg, unsigned int dlen,
+			       int ret)
+{
+	if (ret && ret != -EINPROGRESS && ret != -EBUSY) {
+		atomic64_inc(&alg->stats.rng.err_cnt);
+	} else {
+		atomic64_inc(&alg->stats.rng.generate_cnt);
+		atomic64_add(dlen, &alg->stats.rng.generate_tlen);
+	}
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_rng_generate);
+
+void crypto_stats_skcipher_encrypt(unsigned int cryptlen, int ret,
+				   struct crypto_alg *alg)
+{
+	if (ret && ret != -EINPROGRESS && ret != -EBUSY) {
+		atomic64_inc(&alg->stats.cipher.err_cnt);
+	} else {
+		atomic64_inc(&alg->stats.cipher.encrypt_cnt);
+		atomic64_add(cryptlen, &alg->stats.cipher.encrypt_tlen);
+	}
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_skcipher_encrypt);
+
+void crypto_stats_skcipher_decrypt(unsigned int cryptlen, int ret,
+				   struct crypto_alg *alg)
+{
+	if (ret && ret != -EINPROGRESS && ret != -EBUSY) {
+		atomic64_inc(&alg->stats.cipher.err_cnt);
+	} else {
+		atomic64_inc(&alg->stats.cipher.decrypt_cnt);
+		atomic64_add(cryptlen, &alg->stats.cipher.decrypt_tlen);
+	}
+	crypto_alg_put(alg);
+}
+EXPORT_SYMBOL_GPL(crypto_stats_skcipher_decrypt);
+#endif
 
 static int __init crypto_algapi_init(void)
 {
 	crypto_init_proc();
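The helpers above only do the bookkeeping; callers are expected to pair them with crypto_stats_get(). A minimal sketch of that call pattern (my illustration, assuming the internal-header helper crypto_skcipher_alg(); not code from this diff):

    static inline int stats_wrapped_skcipher_encrypt(struct skcipher_request *req)
    {
    	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
    	struct crypto_alg *alg = tfm->base.__crt_alg;
    	unsigned int cryptlen = req->cryptlen;	/* may be consumed by the op */
    	int ret;

    	crypto_stats_get(alg);	/* take a ref so the alg outlives the op */
    	ret = crypto_skcipher_alg(tfm)->encrypt(req);
    	/* accounts the result and drops the ref via crypto_alg_put() */
    	crypto_stats_skcipher_encrypt(cryptlen, ret, alg);
    	return ret;
    }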
@@ -507,23 +507,18 @@ static int crypto_blkcipher_report(struct sk_buff *skb, struct crypto_alg *alg)
 {
 	struct crypto_report_blkcipher rblkcipher;
 
-	strncpy(rblkcipher.type, "blkcipher", sizeof(rblkcipher.type));
-	strncpy(rblkcipher.geniv, alg->cra_blkcipher.geniv ?: "<default>",
-		sizeof(rblkcipher.geniv));
-	rblkcipher.geniv[sizeof(rblkcipher.geniv) - 1] = '\0';
+	memset(&rblkcipher, 0, sizeof(rblkcipher));
+
+	strscpy(rblkcipher.type, "blkcipher", sizeof(rblkcipher.type));
+	strscpy(rblkcipher.geniv, "<default>", sizeof(rblkcipher.geniv));
 
 	rblkcipher.blocksize = alg->cra_blocksize;
 	rblkcipher.min_keysize = alg->cra_blkcipher.min_keysize;
 	rblkcipher.max_keysize = alg->cra_blkcipher.max_keysize;
 	rblkcipher.ivsize = alg->cra_blkcipher.ivsize;
 
-	if (nla_put(skb, CRYPTOCFGA_REPORT_BLKCIPHER,
-		    sizeof(struct crypto_report_blkcipher), &rblkcipher))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_REPORT_BLKCIPHER,
+		       sizeof(rblkcipher), &rblkcipher);
 }
 #else
 static int crypto_blkcipher_report(struct sk_buff *skb, struct crypto_alg *alg)
@@ -541,8 +536,7 @@ static void crypto_blkcipher_show(struct seq_file *m, struct crypto_alg *alg)
 	seq_printf(m, "min keysize  : %u\n", alg->cra_blkcipher.min_keysize);
 	seq_printf(m, "max keysize  : %u\n", alg->cra_blkcipher.max_keysize);
 	seq_printf(m, "ivsize       : %u\n", alg->cra_blkcipher.ivsize);
-	seq_printf(m, "geniv        : %s\n", alg->cra_blkcipher.geniv ?:
-					     "<default>");
+	seq_printf(m, "geniv        : <default>\n");
 }
 
 const struct crypto_type crypto_blkcipher_type = {
@@ -144,7 +144,7 @@ static int crypto_cfb_decrypt_segment(struct skcipher_walk *walk,
 
 	do {
 		crypto_cfb_encrypt_one(tfm, iv, dst);
-		crypto_xor(dst, iv, bsize);
+		crypto_xor(dst, src, bsize);
 		iv = src;
 
 		src += bsize;
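The one-line fix above is easy to gloss over: CFB decryption computes P[i] = E_k(C[i-1]) XOR C[i], so the keystream block must be XORed with the current ciphertext block (src), not with the feedback pointer (iv) the buggy code reused. A freestanding reference loop for whole blocks, assuming a hypothetical block_encrypt() primitive for E_k:

    #include <stddef.h>
    #include <stdint.h>

    void block_encrypt(uint8_t *out, const uint8_t *in, size_t bs); /* E_k, hypothetical */

    void cfb_decrypt_blocks(uint8_t *dst, const uint8_t *src, size_t nblocks,
    			const uint8_t *iv, size_t bs)
    {
    	uint8_t ks[64];	/* keystream scratch; bs <= 64 assumed */

    	while (nblocks--) {
    		block_encrypt(ks, iv, bs);	/* ks = E_k(C[i-1]) */
    		for (size_t i = 0; i < bs; i++)
    			dst[i] = ks[i] ^ src[i];	/* P[i] = ks ^ C[i] */
    		iv = src;	/* feedback is always the *ciphertext* block */
    		src += bs;
    		dst += bs;
    	}
    }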
@@ -1,137 +0,0 @@
-/*
- * ChaCha20 256-bit cipher algorithm, RFC7539
- *
- * Copyright (C) 2015 Martin Willi
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <asm/unaligned.h>
-#include <crypto/algapi.h>
-#include <crypto/chacha20.h>
-#include <crypto/internal/skcipher.h>
-#include <linux/module.h>
-
-static void chacha20_docrypt(u32 *state, u8 *dst, const u8 *src,
-			     unsigned int bytes)
-{
-	/* aligned to potentially speed up crypto_xor() */
-	u8 stream[CHACHA20_BLOCK_SIZE] __aligned(sizeof(long));
-
-	if (dst != src)
-		memcpy(dst, src, bytes);
-
-	while (bytes >= CHACHA20_BLOCK_SIZE) {
-		chacha20_block(state, stream);
-		crypto_xor(dst, stream, CHACHA20_BLOCK_SIZE);
-		bytes -= CHACHA20_BLOCK_SIZE;
-		dst += CHACHA20_BLOCK_SIZE;
-	}
-	if (bytes) {
-		chacha20_block(state, stream);
-		crypto_xor(dst, stream, bytes);
-	}
-}
-
-void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv)
-{
-	state[0]  = 0x61707865; /* "expa" */
-	state[1]  = 0x3320646e; /* "nd 3" */
-	state[2]  = 0x79622d32; /* "2-by" */
-	state[3]  = 0x6b206574; /* "te k" */
-	state[4]  = ctx->key[0];
-	state[5]  = ctx->key[1];
-	state[6]  = ctx->key[2];
-	state[7]  = ctx->key[3];
-	state[8]  = ctx->key[4];
-	state[9]  = ctx->key[5];
-	state[10] = ctx->key[6];
-	state[11] = ctx->key[7];
-	state[12] = get_unaligned_le32(iv +  0);
-	state[13] = get_unaligned_le32(iv +  4);
-	state[14] = get_unaligned_le32(iv +  8);
-	state[15] = get_unaligned_le32(iv + 12);
-}
-EXPORT_SYMBOL_GPL(crypto_chacha20_init);
-
-int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
-			   unsigned int keysize)
-{
-	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
-	int i;
-
-	if (keysize != CHACHA20_KEY_SIZE)
-		return -EINVAL;
-
-	for (i = 0; i < ARRAY_SIZE(ctx->key); i++)
-		ctx->key[i] = get_unaligned_le32(key + i * sizeof(u32));
-
-	return 0;
-}
-EXPORT_SYMBOL_GPL(crypto_chacha20_setkey);
-
-int crypto_chacha20_crypt(struct skcipher_request *req)
-{
-	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
-	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
-	struct skcipher_walk walk;
-	u32 state[16];
-	int err;
-
-	err = skcipher_walk_virt(&walk, req, true);
-
-	crypto_chacha20_init(state, ctx, walk.iv);
-
-	while (walk.nbytes > 0) {
-		unsigned int nbytes = walk.nbytes;
-
-		if (nbytes < walk.total)
-			nbytes = round_down(nbytes, walk.stride);
-
-		chacha20_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
-				 nbytes);
-		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
-	}
-
-	return err;
-}
-EXPORT_SYMBOL_GPL(crypto_chacha20_crypt);
-
-static struct skcipher_alg alg = {
-	.base.cra_name		= "chacha20",
-	.base.cra_driver_name	= "chacha20-generic",
-	.base.cra_priority	= 100,
-	.base.cra_blocksize	= 1,
-	.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
-	.base.cra_module	= THIS_MODULE,
-
-	.min_keysize		= CHACHA20_KEY_SIZE,
-	.max_keysize		= CHACHA20_KEY_SIZE,
-	.ivsize			= CHACHA20_IV_SIZE,
-	.chunksize		= CHACHA20_BLOCK_SIZE,
-	.setkey			= crypto_chacha20_setkey,
-	.encrypt		= crypto_chacha20_crypt,
-	.decrypt		= crypto_chacha20_crypt,
-};
-
-static int __init chacha20_generic_mod_init(void)
-{
-	return crypto_register_skcipher(&alg);
-}
-
-static void __exit chacha20_generic_mod_fini(void)
-{
-	crypto_unregister_skcipher(&alg);
-}
-
-module_init(chacha20_generic_mod_init);
-module_exit(chacha20_generic_mod_fini);
-
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Martin Willi <martin@strongswan.org>");
-MODULE_DESCRIPTION("chacha20 cipher algorithm");
-MODULE_ALIAS_CRYPTO("chacha20");
-MODULE_ALIAS_CRYPTO("chacha20-generic");
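(The file removed above is superseded by the multi-algorithm crypto/chacha_generic.c added further down, which parameterizes the round count so the same code backs chacha20, xchacha20 and xchacha12.)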
@@ -13,7 +13,7 @@
 #include <crypto/internal/hash.h>
 #include <crypto/internal/skcipher.h>
 #include <crypto/scatterwalk.h>
-#include <crypto/chacha20.h>
+#include <crypto/chacha.h>
 #include <crypto/poly1305.h>
 #include <linux/err.h>
 #include <linux/init.h>
@@ -22,8 +22,6 @@
 
 #include "internal.h"
 
-#define CHACHAPOLY_IV_SIZE	12
-
 struct chachapoly_instance_ctx {
 	struct crypto_skcipher_spawn chacha;
 	struct crypto_ahash_spawn poly;
@@ -51,7 +49,7 @@ struct poly_req {
 };
 
 struct chacha_req {
-	u8 iv[CHACHA20_IV_SIZE];
+	u8 iv[CHACHA_IV_SIZE];
 	struct scatterlist src[1];
 	struct skcipher_request req; /* must be last member */
 };
@@ -91,7 +89,7 @@ static void chacha_iv(u8 *iv, struct aead_request *req, u32 icb)
 	memcpy(iv, &leicb, sizeof(leicb));
 	memcpy(iv + sizeof(leicb), ctx->salt, ctx->saltlen);
 	memcpy(iv + sizeof(leicb) + ctx->saltlen, req->iv,
-	       CHACHA20_IV_SIZE - sizeof(leicb) - ctx->saltlen);
+	       CHACHA_IV_SIZE - sizeof(leicb) - ctx->saltlen);
 }
 
 static int poly_verify_tag(struct aead_request *req)
@@ -494,7 +492,7 @@ static int chachapoly_setkey(struct crypto_aead *aead, const u8 *key,
 	struct chachapoly_ctx *ctx = crypto_aead_ctx(aead);
 	int err;
 
-	if (keylen != ctx->saltlen + CHACHA20_KEY_SIZE)
+	if (keylen != ctx->saltlen + CHACHA_KEY_SIZE)
 		return -EINVAL;
 
 	keylen -= ctx->saltlen;
@@ -639,7 +637,7 @@ static int chachapoly_create(struct crypto_template *tmpl, struct rtattr **tb,
 
 	err = -EINVAL;
 	/* Need 16-byte IV size, including Initial Block Counter value */
-	if (crypto_skcipher_alg_ivsize(chacha) != CHACHA20_IV_SIZE)
+	if (crypto_skcipher_alg_ivsize(chacha) != CHACHA_IV_SIZE)
 		goto out_drop_chacha;
 	/* Not a stream cipher? */
 	if (chacha->base.cra_blocksize != 1)
@@ -0,0 +1,217 @@
+/*
+ * ChaCha and XChaCha stream ciphers, including ChaCha20 (RFC7539)
+ *
+ * Copyright (C) 2015 Martin Willi
+ * Copyright (C) 2018 Google LLC
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <asm/unaligned.h>
+#include <crypto/algapi.h>
+#include <crypto/chacha.h>
+#include <crypto/internal/skcipher.h>
+#include <linux/module.h>
+
+static void chacha_docrypt(u32 *state, u8 *dst, const u8 *src,
+			   unsigned int bytes, int nrounds)
+{
+	/* aligned to potentially speed up crypto_xor() */
+	u8 stream[CHACHA_BLOCK_SIZE] __aligned(sizeof(long));
+
+	if (dst != src)
+		memcpy(dst, src, bytes);
+
+	while (bytes >= CHACHA_BLOCK_SIZE) {
+		chacha_block(state, stream, nrounds);
+		crypto_xor(dst, stream, CHACHA_BLOCK_SIZE);
+		bytes -= CHACHA_BLOCK_SIZE;
+		dst += CHACHA_BLOCK_SIZE;
+	}
+	if (bytes) {
+		chacha_block(state, stream, nrounds);
+		crypto_xor(dst, stream, bytes);
+	}
+}
+
+static int chacha_stream_xor(struct skcipher_request *req,
+			     struct chacha_ctx *ctx, u8 *iv)
+{
+	struct skcipher_walk walk;
+	u32 state[16];
+	int err;
+
+	err = skcipher_walk_virt(&walk, req, false);
+
+	crypto_chacha_init(state, ctx, iv);
+
+	while (walk.nbytes > 0) {
+		unsigned int nbytes = walk.nbytes;
+
+		if (nbytes < walk.total)
+			nbytes = round_down(nbytes, walk.stride);
+
+		chacha_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
+			       nbytes, ctx->nrounds);
+		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
+	}
+
+	return err;
+}
+
+void crypto_chacha_init(u32 *state, struct chacha_ctx *ctx, u8 *iv)
+{
+	state[0]  = 0x61707865; /* "expa" */
+	state[1]  = 0x3320646e; /* "nd 3" */
+	state[2]  = 0x79622d32; /* "2-by" */
+	state[3]  = 0x6b206574; /* "te k" */
+	state[4]  = ctx->key[0];
+	state[5]  = ctx->key[1];
+	state[6]  = ctx->key[2];
+	state[7]  = ctx->key[3];
+	state[8]  = ctx->key[4];
+	state[9]  = ctx->key[5];
+	state[10] = ctx->key[6];
+	state[11] = ctx->key[7];
+	state[12] = get_unaligned_le32(iv +  0);
+	state[13] = get_unaligned_le32(iv +  4);
+	state[14] = get_unaligned_le32(iv +  8);
+	state[15] = get_unaligned_le32(iv + 12);
+}
+EXPORT_SYMBOL_GPL(crypto_chacha_init);
+
+static int chacha_setkey(struct crypto_skcipher *tfm, const u8 *key,
+			 unsigned int keysize, int nrounds)
+{
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
+	int i;
+
+	if (keysize != CHACHA_KEY_SIZE)
+		return -EINVAL;
+
+	for (i = 0; i < ARRAY_SIZE(ctx->key); i++)
+		ctx->key[i] = get_unaligned_le32(key + i * sizeof(u32));
+
+	ctx->nrounds = nrounds;
+	return 0;
+}
+
+int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
+			   unsigned int keysize)
+{
+	return chacha_setkey(tfm, key, keysize, 20);
+}
+EXPORT_SYMBOL_GPL(crypto_chacha20_setkey);
+
+int crypto_chacha12_setkey(struct crypto_skcipher *tfm, const u8 *key,
+			   unsigned int keysize)
+{
+	return chacha_setkey(tfm, key, keysize, 12);
+}
+EXPORT_SYMBOL_GPL(crypto_chacha12_setkey);
+
+int crypto_chacha_crypt(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+	return chacha_stream_xor(req, ctx, req->iv);
+}
+EXPORT_SYMBOL_GPL(crypto_chacha_crypt);
+
+int crypto_xchacha_crypt(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct chacha_ctx subctx;
+	u32 state[16];
+	u8 real_iv[16];
+
+	/* Compute the subkey given the original key and first 128 nonce bits */
+	crypto_chacha_init(state, ctx, req->iv);
+	hchacha_block(state, subctx.key, ctx->nrounds);
+	subctx.nrounds = ctx->nrounds;
+
+	/* Build the real IV */
+	memcpy(&real_iv[0], req->iv + 24, 8); /* stream position */
+	memcpy(&real_iv[8], req->iv + 16, 8); /* remaining 64 nonce bits */
+
+	/* Generate the stream and XOR it with the data */
+	return chacha_stream_xor(req, &subctx, real_iv);
+}
+EXPORT_SYMBOL_GPL(crypto_xchacha_crypt);
+
+static struct skcipher_alg algs[] = {
+	{
+		.base.cra_name		= "chacha20",
+		.base.cra_driver_name	= "chacha20-generic",
+		.base.cra_priority	= 100,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= CHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha20_setkey,
+		.encrypt		= crypto_chacha_crypt,
+		.decrypt		= crypto_chacha_crypt,
+	}, {
+		.base.cra_name		= "xchacha20",
+		.base.cra_driver_name	= "xchacha20-generic",
+		.base.cra_priority	= 100,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= XCHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha20_setkey,
+		.encrypt		= crypto_xchacha_crypt,
+		.decrypt		= crypto_xchacha_crypt,
+	}, {
+		.base.cra_name		= "xchacha12",
+		.base.cra_driver_name	= "xchacha12-generic",
+		.base.cra_priority	= 100,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= XCHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha12_setkey,
+		.encrypt		= crypto_xchacha_crypt,
+		.decrypt		= crypto_xchacha_crypt,
+	}
+};
+
+static int __init chacha_generic_mod_init(void)
+{
+	return crypto_register_skciphers(algs, ARRAY_SIZE(algs));
+}
+
+static void __exit chacha_generic_mod_fini(void)
+{
+	crypto_unregister_skciphers(algs, ARRAY_SIZE(algs));
+}
+
+module_init(chacha_generic_mod_init);
+module_exit(chacha_generic_mod_fini);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Martin Willi <martin@strongswan.org>");
+MODULE_DESCRIPTION("ChaCha and XChaCha stream ciphers (generic)");
+MODULE_ALIAS_CRYPTO("chacha20");
+MODULE_ALIAS_CRYPTO("chacha20-generic");
+MODULE_ALIAS_CRYPTO("xchacha20");
+MODULE_ALIAS_CRYPTO("xchacha20-generic");
+MODULE_ALIAS_CRYPTO("xchacha12");
+MODULE_ALIAS_CRYPTO("xchacha12-generic");
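For orientation, a hedged usage sketch (mine, not part of the diff) of driving the new "xchacha12" algorithm through the standard skcipher API from kernel code; error paths are trimmed and buf must be linearly mapped:

    #include <crypto/chacha.h>
    #include <crypto/skcipher.h>
    #include <linux/scatterlist.h>

    static int xchacha12_demo(u8 *buf, unsigned int len,
    			  const u8 key[CHACHA_KEY_SIZE],
    			  const u8 iv[XCHACHA_IV_SIZE])
    {
    	struct crypto_skcipher *tfm;
    	struct skcipher_request *req;
    	struct scatterlist sg;
    	u8 ivbuf[XCHACHA_IV_SIZE];
    	int err;

    	/* CRYPTO_ALG_ASYNC in the mask requests a synchronous implementation */
    	tfm = crypto_alloc_skcipher("xchacha12", 0, CRYPTO_ALG_ASYNC);
    	if (IS_ERR(tfm))
    		return PTR_ERR(tfm);

    	err = crypto_skcipher_setkey(tfm, key, CHACHA_KEY_SIZE);
    	if (err)
    		goto out;

    	req = skcipher_request_alloc(tfm, GFP_KERNEL);
    	if (!req) {
    		err = -ENOMEM;
    		goto out;
    	}

    	sg_init_one(&sg, buf, len);
    	memcpy(ivbuf, iv, sizeof(ivbuf));	/* the walk may modify the IV */
    	skcipher_request_set_crypt(req, &sg, &sg, len, ivbuf);

    	err = crypto_skcipher_encrypt(req);	/* stream cipher: same call decrypts */

    	skcipher_request_free(req);
    out:
    	crypto_free_skcipher(tfm);
    	return err;
    }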
@@ -422,8 +422,6 @@ static int cryptd_create_blkcipher(struct crypto_template *tmpl,
 	inst->alg.cra_ablkcipher.min_keysize = alg->cra_blkcipher.min_keysize;
 	inst->alg.cra_ablkcipher.max_keysize = alg->cra_blkcipher.max_keysize;
 
-	inst->alg.cra_ablkcipher.geniv = alg->cra_blkcipher.geniv;
-
 	inst->alg.cra_ctxsize = sizeof(struct cryptd_blkcipher_ctx);
 
 	inst->alg.cra_init = cryptd_blkcipher_init_tfm;
@@ -1174,7 +1172,7 @@ struct cryptd_ablkcipher *cryptd_alloc_ablkcipher(const char *alg_name,
 		return ERR_PTR(-EINVAL);
 	type = crypto_skcipher_type(type);
 	mask &= ~CRYPTO_ALG_TYPE_MASK;
-	mask |= (CRYPTO_ALG_GENIV | CRYPTO_ALG_TYPE_BLKCIPHER_MASK);
+	mask |= CRYPTO_ALG_TYPE_BLKCIPHER_MASK;
 	tfm = crypto_alloc_base(cryptd_alg_name, type, mask);
 	if (IS_ERR(tfm))
 		return ERR_CAST(tfm);
@@ -84,87 +84,38 @@ static int crypto_report_cipher(struct sk_buff *skb, struct crypto_alg *alg)
 {
 	struct crypto_report_cipher rcipher;
 
-	strncpy(rcipher.type, "cipher", sizeof(rcipher.type));
+	memset(&rcipher, 0, sizeof(rcipher));
+
+	strscpy(rcipher.type, "cipher", sizeof(rcipher.type));
 
 	rcipher.blocksize = alg->cra_blocksize;
 	rcipher.min_keysize = alg->cra_cipher.cia_min_keysize;
 	rcipher.max_keysize = alg->cra_cipher.cia_max_keysize;
 
-	if (nla_put(skb, CRYPTOCFGA_REPORT_CIPHER,
-		    sizeof(struct crypto_report_cipher), &rcipher))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_REPORT_CIPHER,
+		       sizeof(rcipher), &rcipher);
 }
 
 static int crypto_report_comp(struct sk_buff *skb, struct crypto_alg *alg)
 {
 	struct crypto_report_comp rcomp;
 
-	strncpy(rcomp.type, "compression", sizeof(rcomp.type));
-	if (nla_put(skb, CRYPTOCFGA_REPORT_COMPRESS,
-		    sizeof(struct crypto_report_comp), &rcomp))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
-}
-
-static int crypto_report_acomp(struct sk_buff *skb, struct crypto_alg *alg)
-{
-	struct crypto_report_acomp racomp;
-
-	strncpy(racomp.type, "acomp", sizeof(racomp.type));
-
-	if (nla_put(skb, CRYPTOCFGA_REPORT_ACOMP,
-		    sizeof(struct crypto_report_acomp), &racomp))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
-}
-
-static int crypto_report_akcipher(struct sk_buff *skb, struct crypto_alg *alg)
-{
-	struct crypto_report_akcipher rakcipher;
-
-	strncpy(rakcipher.type, "akcipher", sizeof(rakcipher.type));
-
-	if (nla_put(skb, CRYPTOCFGA_REPORT_AKCIPHER,
-		    sizeof(struct crypto_report_akcipher), &rakcipher))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
-}
-
-static int crypto_report_kpp(struct sk_buff *skb, struct crypto_alg *alg)
-{
-	struct crypto_report_kpp rkpp;
-
-	strncpy(rkpp.type, "kpp", sizeof(rkpp.type));
-
-	if (nla_put(skb, CRYPTOCFGA_REPORT_KPP,
-		    sizeof(struct crypto_report_kpp), &rkpp))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	memset(&rcomp, 0, sizeof(rcomp));
+
+	strscpy(rcomp.type, "compression", sizeof(rcomp.type));
+
+	return nla_put(skb, CRYPTOCFGA_REPORT_COMPRESS, sizeof(rcomp), &rcomp);
 }
 
 static int crypto_report_one(struct crypto_alg *alg,
 			     struct crypto_user_alg *ualg, struct sk_buff *skb)
 {
-	strncpy(ualg->cru_name, alg->cra_name, sizeof(ualg->cru_name));
-	strncpy(ualg->cru_driver_name, alg->cra_driver_name,
+	memset(ualg, 0, sizeof(*ualg));
+
+	strscpy(ualg->cru_name, alg->cra_name, sizeof(ualg->cru_name));
+	strscpy(ualg->cru_driver_name, alg->cra_driver_name,
 		sizeof(ualg->cru_driver_name));
-	strncpy(ualg->cru_module_name, module_name(alg->cra_module),
+	strscpy(ualg->cru_module_name, module_name(alg->cra_module),
 		sizeof(ualg->cru_module_name));
 
 	ualg->cru_type = 0;
@@ -177,9 +128,9 @@ static int crypto_report_one(struct crypto_alg *alg,
 	if (alg->cra_flags & CRYPTO_ALG_LARVAL) {
 		struct crypto_report_larval rl;
 
-		strncpy(rl.type, "larval", sizeof(rl.type));
-		if (nla_put(skb, CRYPTOCFGA_REPORT_LARVAL,
-			    sizeof(struct crypto_report_larval), &rl))
+		memset(&rl, 0, sizeof(rl));
+		strscpy(rl.type, "larval", sizeof(rl.type));
+		if (nla_put(skb, CRYPTOCFGA_REPORT_LARVAL, sizeof(rl), &rl))
 			goto nla_put_failure;
 		goto out;
 	}
@@ -202,20 +153,6 @@ static int crypto_report_one(struct crypto_alg *alg,
 			goto nla_put_failure;
 
 		break;
-	case CRYPTO_ALG_TYPE_ACOMPRESS:
-		if (crypto_report_acomp(skb, alg))
-			goto nla_put_failure;
-
-		break;
-	case CRYPTO_ALG_TYPE_AKCIPHER:
-		if (crypto_report_akcipher(skb, alg))
-			goto nla_put_failure;
-
-		break;
-	case CRYPTO_ALG_TYPE_KPP:
-		if (crypto_report_kpp(skb, alg))
-			goto nla_put_failure;
-		break;
 	}
 
 out:
@@ -294,30 +231,33 @@ static int crypto_report(struct sk_buff *in_skb, struct nlmsghdr *in_nlh,
 
 static int crypto_dump_report(struct sk_buff *skb, struct netlink_callback *cb)
 {
-	struct crypto_alg *alg;
+	const size_t start_pos = cb->args[0];
+	size_t pos = 0;
 	struct crypto_dump_info info;
-	int err;
-
-	if (cb->args[0])
-		goto out;
-
-	cb->args[0] = 1;
+	struct crypto_alg *alg;
+	int res;
 
 	info.in_skb = cb->skb;
 	info.out_skb = skb;
 	info.nlmsg_seq = cb->nlh->nlmsg_seq;
 	info.nlmsg_flags = NLM_F_MULTI;
 
+	down_read(&crypto_alg_sem);
 	list_for_each_entry(alg, &crypto_alg_list, cra_list) {
-		err = crypto_report_alg(alg, &info);
-		if (err)
-			goto out_err;
+		if (pos >= start_pos) {
+			res = crypto_report_alg(alg, &info);
+			if (res == -EMSGSIZE)
+				break;
+			if (res)
+				goto out;
+		}
+		pos++;
 	}
+	cb->args[0] = pos;
+	res = skb->len;
 out:
-	return skb->len;
-out_err:
-	return err;
+	up_read(&crypto_alg_sem);
+	return res;
 }
 
 static int crypto_dump_report_done(struct netlink_callback *cb)
@@ -483,9 +423,7 @@ static const struct crypto_link {
 	       .dump = crypto_dump_report,
 	       .done = crypto_dump_report_done},
 	[CRYPTO_MSG_DELRNG	- CRYPTO_MSG_BASE] = { .doit = crypto_del_rng },
-	[CRYPTO_MSG_GETSTAT	- CRYPTO_MSG_BASE] = { .doit = crypto_reportstat,
-					       .dump = crypto_dump_reportstat,
-					       .done = crypto_dump_reportstat_done},
+	[CRYPTO_MSG_GETSTAT	- CRYPTO_MSG_BASE] = { .doit = crypto_reportstat},
 };
 
 static int crypto_user_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh,
@@ -505,7 +443,7 @@ static int crypto_user_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if ((type == (CRYPTO_MSG_GETALG - CRYPTO_MSG_BASE) &&
 	    (nlh->nlmsg_flags & NLM_F_DUMP))) {
 		struct crypto_alg *alg;
-		u16 dump_alloc = 0;
+		unsigned long dump_alloc = 0;
 
 		if (link->dump == NULL)
 			return -EINVAL;
@@ -513,16 +451,16 @@ static int crypto_user_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh,
 		down_read(&crypto_alg_sem);
 		list_for_each_entry(alg, &crypto_alg_list, cra_list)
 			dump_alloc += CRYPTO_REPORT_MAXSIZE;
+		up_read(&crypto_alg_sem);
 
 		{
 			struct netlink_dump_control c = {
 				.dump = link->dump,
 				.done = link->done,
-				.min_dump_alloc = dump_alloc,
+				.min_dump_alloc = min(dump_alloc, 65535UL),
 			};
 			err = netlink_dump_start(crypto_nlsk, skb, nlh, &c);
 		}
-		up_read(&crypto_alg_sem);
 
 		return err;
 	}
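The crypto_dump_report() rewrite above is what makes the dump incremental: instead of emitting everything in one pass (and failing once the registry outgrows a single skb), it records how far it got in cb->args[0] and resumes there on the next callback. A stripped-down sketch of that contract over a hypothetical flat array (demo_nitems and demo_emit_one() are stand-ins, not real kernel symbols):

    static int demo_dump(struct sk_buff *skb, struct netlink_callback *cb)
    {
    	size_t pos, start = cb->args[0];	/* args[] persists across calls */
    	int res;

    	for (pos = 0; pos < demo_nitems; pos++) {
    		if (pos < start)
    			continue;		/* already sent in an earlier skb */
    		res = demo_emit_one(skb, pos);	/* hypothetical emitter */
    		if (res == -EMSGSIZE)
    			break;			/* skb full: resume here next time */
    		if (res)
    			return res;		/* hard error aborts the dump */
    	}
    	cb->args[0] = pos;
    	return skb->len;			/* nonzero => netlink calls again */
    }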
@@ -33,260 +33,149 @@ struct crypto_dump_info {
 
 static int crypto_report_aead(struct sk_buff *skb, struct crypto_alg *alg)
 {
-	struct crypto_stat raead;
-	u64 v64;
-	u32 v32;
+	struct crypto_stat_aead raead;
 
 	memset(&raead, 0, sizeof(raead));
 
-	strncpy(raead.type, "aead", sizeof(raead.type));
+	strscpy(raead.type, "aead", sizeof(raead.type));
 
-	v32 = atomic_read(&alg->encrypt_cnt);
-	raead.stat_encrypt_cnt = v32;
-	v64 = atomic64_read(&alg->encrypt_tlen);
-	raead.stat_encrypt_tlen = v64;
-	v32 = atomic_read(&alg->decrypt_cnt);
-	raead.stat_decrypt_cnt = v32;
-	v64 = atomic64_read(&alg->decrypt_tlen);
-	raead.stat_decrypt_tlen = v64;
-	v32 = atomic_read(&alg->aead_err_cnt);
-	raead.stat_aead_err_cnt = v32;
+	raead.stat_encrypt_cnt = atomic64_read(&alg->stats.aead.encrypt_cnt);
+	raead.stat_encrypt_tlen = atomic64_read(&alg->stats.aead.encrypt_tlen);
+	raead.stat_decrypt_cnt = atomic64_read(&alg->stats.aead.decrypt_cnt);
+	raead.stat_decrypt_tlen = atomic64_read(&alg->stats.aead.decrypt_tlen);
+	raead.stat_err_cnt = atomic64_read(&alg->stats.aead.err_cnt);
 
-	if (nla_put(skb, CRYPTOCFGA_STAT_AEAD,
-		    sizeof(struct crypto_stat), &raead))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_STAT_AEAD, sizeof(raead), &raead);
 }
 
 static int crypto_report_cipher(struct sk_buff *skb, struct crypto_alg *alg)
 {
-	struct crypto_stat rcipher;
-	u64 v64;
-	u32 v32;
+	struct crypto_stat_cipher rcipher;
 
 	memset(&rcipher, 0, sizeof(rcipher));
 
-	strlcpy(rcipher.type, "cipher", sizeof(rcipher.type));
+	strscpy(rcipher.type, "cipher", sizeof(rcipher.type));
 
-	v32 = atomic_read(&alg->encrypt_cnt);
-	rcipher.stat_encrypt_cnt = v32;
-	v64 = atomic64_read(&alg->encrypt_tlen);
-	rcipher.stat_encrypt_tlen = v64;
-	v32 = atomic_read(&alg->decrypt_cnt);
-	rcipher.stat_decrypt_cnt = v32;
-	v64 = atomic64_read(&alg->decrypt_tlen);
-	rcipher.stat_decrypt_tlen = v64;
-	v32 = atomic_read(&alg->cipher_err_cnt);
-	rcipher.stat_cipher_err_cnt = v32;
+	rcipher.stat_encrypt_cnt = atomic64_read(&alg->stats.cipher.encrypt_cnt);
+	rcipher.stat_encrypt_tlen = atomic64_read(&alg->stats.cipher.encrypt_tlen);
+	rcipher.stat_decrypt_cnt = atomic64_read(&alg->stats.cipher.decrypt_cnt);
+	rcipher.stat_decrypt_tlen = atomic64_read(&alg->stats.cipher.decrypt_tlen);
+	rcipher.stat_err_cnt = atomic64_read(&alg->stats.cipher.err_cnt);
 
-	if (nla_put(skb, CRYPTOCFGA_STAT_CIPHER,
-		    sizeof(struct crypto_stat), &rcipher))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_STAT_CIPHER, sizeof(rcipher), &rcipher);
 }
 
 static int crypto_report_comp(struct sk_buff *skb, struct crypto_alg *alg)
 {
-	struct crypto_stat rcomp;
-	u64 v64;
-	u32 v32;
+	struct crypto_stat_compress rcomp;
 
 	memset(&rcomp, 0, sizeof(rcomp));
 
-	strlcpy(rcomp.type, "compression", sizeof(rcomp.type));
-	v32 = atomic_read(&alg->compress_cnt);
-	rcomp.stat_compress_cnt = v32;
-	v64 = atomic64_read(&alg->compress_tlen);
-	rcomp.stat_compress_tlen = v64;
-	v32 = atomic_read(&alg->decompress_cnt);
-	rcomp.stat_decompress_cnt = v32;
-	v64 = atomic64_read(&alg->decompress_tlen);
-	rcomp.stat_decompress_tlen = v64;
-	v32 = atomic_read(&alg->cipher_err_cnt);
-	rcomp.stat_compress_err_cnt = v32;
+	strscpy(rcomp.type, "compression", sizeof(rcomp.type));
+	rcomp.stat_compress_cnt = atomic64_read(&alg->stats.compress.compress_cnt);
+	rcomp.stat_compress_tlen = atomic64_read(&alg->stats.compress.compress_tlen);
+	rcomp.stat_decompress_cnt = atomic64_read(&alg->stats.compress.decompress_cnt);
+	rcomp.stat_decompress_tlen = atomic64_read(&alg->stats.compress.decompress_tlen);
+	rcomp.stat_err_cnt = atomic64_read(&alg->stats.compress.err_cnt);
 
-	if (nla_put(skb, CRYPTOCFGA_STAT_COMPRESS,
-		    sizeof(struct crypto_stat), &rcomp))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_STAT_COMPRESS, sizeof(rcomp), &rcomp);
 }
 
 static int crypto_report_acomp(struct sk_buff *skb, struct crypto_alg *alg)
 {
-	struct crypto_stat racomp;
-	u64 v64;
-	u32 v32;
+	struct crypto_stat_compress racomp;
 
 	memset(&racomp, 0, sizeof(racomp));
 
-	strlcpy(racomp.type, "acomp", sizeof(racomp.type));
-	v32 = atomic_read(&alg->compress_cnt);
-	racomp.stat_compress_cnt = v32;
-	v64 = atomic64_read(&alg->compress_tlen);
-	racomp.stat_compress_tlen = v64;
-	v32 = atomic_read(&alg->decompress_cnt);
-	racomp.stat_decompress_cnt = v32;
-	v64 = atomic64_read(&alg->decompress_tlen);
-	racomp.stat_decompress_tlen = v64;
-	v32 = atomic_read(&alg->cipher_err_cnt);
-	racomp.stat_compress_err_cnt = v32;
+	strscpy(racomp.type, "acomp", sizeof(racomp.type));
+	racomp.stat_compress_cnt = atomic64_read(&alg->stats.compress.compress_cnt);
+	racomp.stat_compress_tlen = atomic64_read(&alg->stats.compress.compress_tlen);
+	racomp.stat_decompress_cnt = atomic64_read(&alg->stats.compress.decompress_cnt);
+	racomp.stat_decompress_tlen = atomic64_read(&alg->stats.compress.decompress_tlen);
+	racomp.stat_err_cnt = atomic64_read(&alg->stats.compress.err_cnt);
 
-	if (nla_put(skb, CRYPTOCFGA_STAT_ACOMP,
-		    sizeof(struct crypto_stat), &racomp))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_STAT_ACOMP, sizeof(racomp), &racomp);
 }
 
 static int crypto_report_akcipher(struct sk_buff *skb, struct crypto_alg *alg)
 {
-	struct crypto_stat rakcipher;
-	u64 v64;
-	u32 v32;
+	struct crypto_stat_akcipher rakcipher;
 
 	memset(&rakcipher, 0, sizeof(rakcipher));
 
-	strncpy(rakcipher.type, "akcipher", sizeof(rakcipher.type));
-	v32 = atomic_read(&alg->encrypt_cnt);
-	rakcipher.stat_encrypt_cnt = v32;
-	v64 = atomic64_read(&alg->encrypt_tlen);
-	rakcipher.stat_encrypt_tlen = v64;
-	v32 = atomic_read(&alg->decrypt_cnt);
-	rakcipher.stat_decrypt_cnt = v32;
-	v64 = atomic64_read(&alg->decrypt_tlen);
-	rakcipher.stat_decrypt_tlen = v64;
-	v32 = atomic_read(&alg->sign_cnt);
-	rakcipher.stat_sign_cnt = v32;
-	v32 = atomic_read(&alg->verify_cnt);
-	rakcipher.stat_verify_cnt = v32;
-	v32 = atomic_read(&alg->akcipher_err_cnt);
-	rakcipher.stat_akcipher_err_cnt = v32;
+	strscpy(rakcipher.type, "akcipher", sizeof(rakcipher.type));
+	rakcipher.stat_encrypt_cnt = atomic64_read(&alg->stats.akcipher.encrypt_cnt);
+	rakcipher.stat_encrypt_tlen = atomic64_read(&alg->stats.akcipher.encrypt_tlen);
+	rakcipher.stat_decrypt_cnt = atomic64_read(&alg->stats.akcipher.decrypt_cnt);
+	rakcipher.stat_decrypt_tlen = atomic64_read(&alg->stats.akcipher.decrypt_tlen);
+	rakcipher.stat_sign_cnt = atomic64_read(&alg->stats.akcipher.sign_cnt);
+	rakcipher.stat_verify_cnt = atomic64_read(&alg->stats.akcipher.verify_cnt);
+	rakcipher.stat_err_cnt = atomic64_read(&alg->stats.akcipher.err_cnt);
 
-	if (nla_put(skb, CRYPTOCFGA_STAT_AKCIPHER,
-		    sizeof(struct crypto_stat), &rakcipher))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_STAT_AKCIPHER,
+		       sizeof(rakcipher), &rakcipher);
 }
 
 static int crypto_report_kpp(struct sk_buff *skb, struct crypto_alg *alg)
 {
-	struct crypto_stat rkpp;
-	u32 v;
+	struct crypto_stat_kpp rkpp;
 
 	memset(&rkpp, 0, sizeof(rkpp));
 
-	strlcpy(rkpp.type, "kpp", sizeof(rkpp.type));
+	strscpy(rkpp.type, "kpp", sizeof(rkpp.type));
 
-	v = atomic_read(&alg->setsecret_cnt);
-	rkpp.stat_setsecret_cnt = v;
-	v = atomic_read(&alg->generate_public_key_cnt);
-	rkpp.stat_generate_public_key_cnt = v;
-	v = atomic_read(&alg->compute_shared_secret_cnt);
-	rkpp.stat_compute_shared_secret_cnt = v;
-	v = atomic_read(&alg->kpp_err_cnt);
-	rkpp.stat_kpp_err_cnt = v;
+	rkpp.stat_setsecret_cnt = atomic64_read(&alg->stats.kpp.setsecret_cnt);
+	rkpp.stat_generate_public_key_cnt = atomic64_read(&alg->stats.kpp.generate_public_key_cnt);
+	rkpp.stat_compute_shared_secret_cnt = atomic64_read(&alg->stats.kpp.compute_shared_secret_cnt);
+	rkpp.stat_err_cnt = atomic64_read(&alg->stats.kpp.err_cnt);
 
-	if (nla_put(skb, CRYPTOCFGA_STAT_KPP,
-		    sizeof(struct crypto_stat), &rkpp))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_STAT_KPP, sizeof(rkpp), &rkpp);
 }
 
 static int crypto_report_ahash(struct sk_buff *skb, struct crypto_alg *alg)
 {
-	struct crypto_stat rhash;
-	u64 v64;
-	u32 v32;
+	struct crypto_stat_hash rhash;
 
 	memset(&rhash, 0, sizeof(rhash));
 
-	strncpy(rhash.type, "ahash", sizeof(rhash.type));
+	strscpy(rhash.type, "ahash", sizeof(rhash.type));
 
-	v32 = atomic_read(&alg->hash_cnt);
-	rhash.stat_hash_cnt = v32;
-	v64 = atomic64_read(&alg->hash_tlen);
-	rhash.stat_hash_tlen = v64;
-	v32 = atomic_read(&alg->hash_err_cnt);
-	rhash.stat_hash_err_cnt = v32;
+	rhash.stat_hash_cnt = atomic64_read(&alg->stats.hash.hash_cnt);
+	rhash.stat_hash_tlen = atomic64_read(&alg->stats.hash.hash_tlen);
+	rhash.stat_err_cnt = atomic64_read(&alg->stats.hash.err_cnt);
 
-	if (nla_put(skb, CRYPTOCFGA_STAT_HASH,
-		    sizeof(struct crypto_stat), &rhash))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_STAT_HASH, sizeof(rhash), &rhash);
 }
 
 static int crypto_report_shash(struct sk_buff *skb, struct crypto_alg *alg)
 {
-	struct crypto_stat rhash;
-	u64 v64;
-	u32 v32;
+	struct crypto_stat_hash rhash;
 
 	memset(&rhash, 0, sizeof(rhash));
 
-	strncpy(rhash.type, "shash", sizeof(rhash.type));
+	strscpy(rhash.type, "shash", sizeof(rhash.type));
 
-	v32 = atomic_read(&alg->hash_cnt);
-	rhash.stat_hash_cnt = v32;
-	v64 = atomic64_read(&alg->hash_tlen);
-	rhash.stat_hash_tlen = v64;
-	v32 = atomic_read(&alg->hash_err_cnt);
-	rhash.stat_hash_err_cnt = v32;
+	rhash.stat_hash_cnt = atomic64_read(&alg->stats.hash.hash_cnt);
+	rhash.stat_hash_tlen = atomic64_read(&alg->stats.hash.hash_tlen);
+	rhash.stat_err_cnt = atomic64_read(&alg->stats.hash.err_cnt);
 
-	if (nla_put(skb, CRYPTOCFGA_STAT_HASH,
-		    sizeof(struct crypto_stat), &rhash))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_STAT_HASH, sizeof(rhash), &rhash);
 }
 
 static int crypto_report_rng(struct sk_buff *skb, struct crypto_alg *alg)
 {
-	struct crypto_stat rrng;
-	u64 v64;
-	u32 v32;
+	struct crypto_stat_rng rrng;
 
 	memset(&rrng, 0, sizeof(rrng));
 
-	strncpy(rrng.type, "rng", sizeof(rrng.type));
+	strscpy(rrng.type, "rng", sizeof(rrng.type));
 
-	v32 = atomic_read(&alg->generate_cnt);
-	rrng.stat_generate_cnt = v32;
-	v64 = atomic64_read(&alg->generate_tlen);
-	rrng.stat_generate_tlen = v64;
-	v32 = atomic_read(&alg->seed_cnt);
-	rrng.stat_seed_cnt = v32;
-	v32 = atomic_read(&alg->hash_err_cnt);
-	rrng.stat_rng_err_cnt = v32;
+	rrng.stat_generate_cnt = atomic64_read(&alg->stats.rng.generate_cnt);
+	rrng.stat_generate_tlen = atomic64_read(&alg->stats.rng.generate_tlen);
+	rrng.stat_seed_cnt = atomic64_read(&alg->stats.rng.seed_cnt);
+	rrng.stat_err_cnt = atomic64_read(&alg->stats.rng.err_cnt);
 
-	if (nla_put(skb, CRYPTOCFGA_STAT_RNG,
-		    sizeof(struct crypto_stat), &rrng))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_STAT_RNG, sizeof(rrng), &rrng);
 }
 
 static int crypto_reportstat_one(struct crypto_alg *alg,
@@ -295,10 +184,10 @@ static int crypto_reportstat_one(struct crypto_alg *alg,
 {
 	memset(ualg, 0, sizeof(*ualg));
 
-	strlcpy(ualg->cru_name, alg->cra_name, sizeof(ualg->cru_name));
-	strlcpy(ualg->cru_driver_name, alg->cra_driver_name,
+	strscpy(ualg->cru_name, alg->cra_name, sizeof(ualg->cru_name));
+	strscpy(ualg->cru_driver_name, alg->cra_driver_name,
 		sizeof(ualg->cru_driver_name));
-	strlcpy(ualg->cru_module_name, module_name(alg->cra_module),
+	strscpy(ualg->cru_module_name, module_name(alg->cra_module),
 		sizeof(ualg->cru_module_name));
 
 	ualg->cru_type = 0;
@@ -309,12 +198,11 @@ static int crypto_reportstat_one(struct crypto_alg *alg,
 	if (nla_put_u32(skb, CRYPTOCFGA_PRIORITY_VAL, alg->cra_priority))
 		goto nla_put_failure;
 	if (alg->cra_flags & CRYPTO_ALG_LARVAL) {
-		struct crypto_stat rl;
+		struct crypto_stat_larval rl;
 
 		memset(&rl, 0, sizeof(rl));
-		strlcpy(rl.type, "larval", sizeof(rl.type));
-		if (nla_put(skb, CRYPTOCFGA_STAT_LARVAL,
-			    sizeof(struct crypto_stat), &rl))
+		strscpy(rl.type, "larval", sizeof(rl.type));
+		if (nla_put(skb, CRYPTOCFGA_STAT_LARVAL, sizeof(rl), &rl))
 			goto nla_put_failure;
 		goto out;
 	}
@@ -448,37 +336,4 @@ int crypto_reportstat(struct sk_buff *in_skb, struct nlmsghdr *in_nlh,
 	return nlmsg_unicast(crypto_nlsk, skb, NETLINK_CB(in_skb).portid);
 }
 
-int crypto_dump_reportstat(struct sk_buff *skb, struct netlink_callback *cb)
-{
-	struct crypto_alg *alg;
-	struct crypto_dump_info info;
-	int err;
-
-	if (cb->args[0])
-		goto out;
-
-	cb->args[0] = 1;
-
-	info.in_skb = cb->skb;
-	info.out_skb = skb;
-	info.nlmsg_seq = cb->nlh->nlmsg_seq;
-	info.nlmsg_flags = NLM_F_MULTI;
-
-	list_for_each_entry(alg, &crypto_alg_list, cra_list) {
-		err = crypto_reportstat_alg(alg, &info);
-		if (err)
-			goto out_err;
-	}
-
-out:
-	return skb->len;
-out_err:
-	return err;
-}
-
-int crypto_dump_reportstat_done(struct netlink_callback *cb)
-{
-	return 0;
-}
-
 MODULE_LICENSE("GPL");
@@ -233,8 +233,6 @@ static struct crypto_instance *crypto_ctr_alloc(struct rtattr **tb)
 	inst->alg.cra_blkcipher.encrypt = crypto_ctr_crypt;
 	inst->alg.cra_blkcipher.decrypt = crypto_ctr_crypt;
 
-	inst->alg.cra_blkcipher.geniv = "chainiv";
-
 out:
 	crypto_mod_put(alg);
 	return inst;
crypto/ecc.c
@@ -842,15 +842,23 @@ static void xycz_add_c(u64 *x1, u64 *y1, u64 *x2, u64 *y2, u64 *curve_prime,

 static void ecc_point_mult(struct ecc_point *result,
 			   const struct ecc_point *point, const u64 *scalar,
-			   u64 *initial_z, u64 *curve_prime,
+			   u64 *initial_z, const struct ecc_curve *curve,
 			   unsigned int ndigits)
 {
 	/* R0 and R1 */
 	u64 rx[2][ECC_MAX_DIGITS];
 	u64 ry[2][ECC_MAX_DIGITS];
 	u64 z[ECC_MAX_DIGITS];
+	u64 sk[2][ECC_MAX_DIGITS];
+	u64 *curve_prime = curve->p;
 	int i, nb;
-	int num_bits = vli_num_bits(scalar, ndigits);
+	int num_bits;
+	int carry;
+
+	carry = vli_add(sk[0], scalar, curve->n, ndigits);
+	vli_add(sk[1], sk[0], curve->n, ndigits);
+	scalar = sk[!carry];
+	num_bits = sizeof(u64) * ndigits * 8 + 1;

 	vli_set(rx[1], point->x, ndigits);
 	vli_set(ry[1], point->y, ndigits);
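
A note on the change above: it is a timing hardening. Instead of walking vli_num_bits(scalar) ladder steps, the code now adds n (or 2n, depending on the carry) to the scalar, so the Montgomery ladder always runs sizeof(u64) * ndigits * 8 + 1 iterations regardless of the private key. A minimal sketch of the regularization step, assuming vli_add() returns the carry out as it does in crypto/ecc.c:

	/* Sketch only: pick scalar + n or scalar + 2n so the ladder length
	 * is fixed; sk[] is scratch space for both candidates. */
	static const u64 *regularize_scalar(u64 sk[2][ECC_MAX_DIGITS],
					    const u64 *scalar,
					    const struct ecc_curve *curve,
					    unsigned int ndigits)
	{
		/* carry == 1: scalar + n already spilled into the extra top
		 * bit, so use it; otherwise scalar + 2n is one bit longer. */
		int carry = vli_add(sk[0], scalar, curve->n, ndigits);

		vli_add(sk[1], sk[0], curve->n, ndigits);
		return sk[!carry];
	}

Either candidate is congruent to the original scalar mod n, so the computed point is unchanged; only the iteration count becomes key-independent.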
@@ -904,28 +912,41 @@ static inline void ecc_swap_digits(const u64 *in, u64 *out,
 		out[i] = __swab64(in[ndigits - 1 - i]);
 }

+static int __ecc_is_key_valid(const struct ecc_curve *curve,
+			      const u64 *private_key, unsigned int ndigits)
+{
+	u64 one[ECC_MAX_DIGITS] = { 1, };
+	u64 res[ECC_MAX_DIGITS];
+
+	if (!private_key)
+		return -EINVAL;
+
+	if (curve->g.ndigits != ndigits)
+		return -EINVAL;
+
+	/* Make sure the private key is in the range [2, n-3]. */
+	if (vli_cmp(one, private_key, ndigits) != -1)
+		return -EINVAL;
+	vli_sub(res, curve->n, one, ndigits);
+	vli_sub(res, res, one, ndigits);
+	if (vli_cmp(res, private_key, ndigits) != 1)
+		return -EINVAL;
+
+	return 0;
+}
+
 int ecc_is_key_valid(unsigned int curve_id, unsigned int ndigits,
 		     const u64 *private_key, unsigned int private_key_len)
 {
 	int nbytes;
 	const struct ecc_curve *curve = ecc_get_curve(curve_id);

-	if (!private_key)
-		return -EINVAL;
-
 	nbytes = ndigits << ECC_DIGITS_TO_BYTES_SHIFT;

 	if (private_key_len != nbytes)
 		return -EINVAL;

-	if (vli_is_zero(private_key, ndigits))
-		return -EINVAL;
-
-	/* Make sure the private key is in the range [1, n-1]. */
-	if (vli_cmp(curve->n, private_key, ndigits) != 1)
-		return -EINVAL;
-
-	return 0;
+	return __ecc_is_key_valid(curve, private_key, ndigits);
 }

 /*
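
The consolidated range test is also slightly stricter than before: the first vli_cmp() requires 1 < d and the second requires d < n - 2, so a key d is accepted only when it lies in [2, n-3], whereas the old check admitted the whole of [1, n-1]. A trivial single-digit sketch of the same predicate, for illustration only:

	/* Sketch: the [2, n-3] test on single-digit values.
	 * For n = 11 the accepted keys are exactly 2..8. */
	static int key_in_range(u64 d, u64 n)
	{
		return d >= 2 && d <= n - 3;
	}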
@@ -971,11 +992,8 @@ int ecc_gen_privkey(unsigned int curve_id, unsigned int ndigits, u64 *privkey)
 	if (err)
 		return err;

-	if (vli_is_zero(priv, ndigits))
-		return -EINVAL;
-
-	/* Make sure the private key is in the range [1, n-1]. */
-	if (vli_cmp(curve->n, priv, ndigits) != 1)
+	/* Make sure the private key is in the valid range. */
+	if (__ecc_is_key_valid(curve, priv, ndigits))
 		return -EINVAL;

 	ecc_swap_digits(priv, privkey, ndigits);
@@ -1004,7 +1022,7 @@ int ecc_make_pub_key(unsigned int curve_id, unsigned int ndigits,
 		goto out;
 	}

-	ecc_point_mult(pk, &curve->g, priv, NULL, curve->p, ndigits);
+	ecc_point_mult(pk, &curve->g, priv, NULL, curve, ndigits);
 	if (ecc_point_is_zero(pk)) {
 		ret = -EAGAIN;
 		goto err_free_point;
@@ -1090,7 +1108,7 @@ int crypto_ecdh_shared_secret(unsigned int curve_id, unsigned int ndigits,
 		goto err_alloc_product;
 	}

-	ecc_point_mult(product, pk, priv, rand_z, curve->p, ndigits);
+	ecc_point_mult(product, pk, priv, rand_z, curve, ndigits);

 	ecc_swap_digits(product->x, secret, ndigits);
@@ -32,6 +32,8 @@ const char *const hash_algo_name[HASH_ALGO__LAST] = {
 	[HASH_ALGO_TGR_160]	= "tgr160",
 	[HASH_ALGO_TGR_192]	= "tgr192",
 	[HASH_ALGO_SM3_256]	= "sm3-256",
+	[HASH_ALGO_STREEBOG_256] = "streebog256",
+	[HASH_ALGO_STREEBOG_512] = "streebog512",
 };
 EXPORT_SYMBOL_GPL(hash_algo_name);
@@ -54,5 +56,7 @@ const int hash_digest_size[HASH_ALGO__LAST] = {
 	[HASH_ALGO_TGR_160]	= TGR160_DIGEST_SIZE,
 	[HASH_ALGO_TGR_192]	= TGR192_DIGEST_SIZE,
 	[HASH_ALGO_SM3_256]	= SM3256_DIGEST_SIZE,
+	[HASH_ALGO_STREEBOG_256] = STREEBOG256_DIGEST_SIZE,
+	[HASH_ALGO_STREEBOG_512] = STREEBOG512_DIGEST_SIZE,
 };
 EXPORT_SYMBOL_GPL(hash_digest_size);
10
crypto/kpp.c
10
crypto/kpp.c
|
@@ -30,15 +30,11 @@ static int crypto_kpp_report(struct sk_buff *skb, struct crypto_alg *alg)
 {
 	struct crypto_report_kpp rkpp;

-	strncpy(rkpp.type, "kpp", sizeof(rkpp.type));
+	memset(&rkpp, 0, sizeof(rkpp));

-	if (nla_put(skb, CRYPTOCFGA_REPORT_KPP,
-		    sizeof(struct crypto_report_kpp), &rkpp))
-		goto nla_put_failure;
-	return 0;
+	strscpy(rkpp.type, "kpp", sizeof(rkpp.type));

-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_REPORT_KPP, sizeof(rkpp), &rkpp);
 }
 #else
 static int crypto_kpp_report(struct sk_buff *skb, struct crypto_alg *alg)
@@ -122,7 +122,6 @@ static struct crypto_alg alg_lz4 = {
 	.cra_flags		= CRYPTO_ALG_TYPE_COMPRESS,
 	.cra_ctxsize		= sizeof(struct lz4_ctx),
 	.cra_module		= THIS_MODULE,
-	.cra_list		= LIST_HEAD_INIT(alg_lz4.cra_list),
 	.cra_init		= lz4_init,
 	.cra_exit		= lz4_exit,
 	.cra_u			= { .compress = {
@@ -123,7 +123,6 @@ static struct crypto_alg alg_lz4hc = {
 	.cra_flags		= CRYPTO_ALG_TYPE_COMPRESS,
 	.cra_ctxsize		= sizeof(struct lz4hc_ctx),
 	.cra_module		= THIS_MODULE,
-	.cra_list		= LIST_HEAD_INIT(alg_lz4hc.cra_list),
 	.cra_init		= lz4hc_init,
 	.cra_exit		= lz4hc_exit,
 	.cra_u			= { .compress = {
@@ -0,0 +1,254 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NHPoly1305 - ε-almost-∆-universal hash function for Adiantum
+ *
+ * Copyright 2018 Google LLC
+ */
+
+/*
+ * "NHPoly1305" is the main component of Adiantum hashing.
+ * Specifically, it is the calculation
+ *
+ *	H_L ← Poly1305_{K_L}(NH_{K_N}(pad_{128}(L)))
+ *
+ * from the procedure in section 6.4 of the Adiantum paper [1].  It is an
+ * ε-almost-∆-universal (ε-∆U) hash function for equal-length inputs over
+ * Z/(2^{128}Z), where the "∆" operation is addition.  It hashes 1024-byte
+ * chunks of the input with the NH hash function [2], reducing the input length
+ * by 32x.  The resulting NH digests are evaluated as a polynomial in
+ * GF(2^{130}-5), like in the Poly1305 MAC [3].  Note that the polynomial
+ * evaluation by itself would suffice to achieve the ε-∆U property; NH is used
+ * for performance since it's over twice as fast as Poly1305.
+ *
+ * This is *not* a cryptographic hash function; do not use it as such!
+ *
+ * [1] Adiantum: length-preserving encryption for entry-level processors
+ *     (https://eprint.iacr.org/2018/720.pdf)
+ * [2] UMAC: Fast and Secure Message Authentication
+ *     (https://fastcrypto.org/umac/umac_proc.pdf)
+ * [3] The Poly1305-AES message-authentication code
+ *     (https://cr.yp.to/mac/poly1305-20050329.pdf)
+ */
+
+#include <asm/unaligned.h>
+#include <crypto/algapi.h>
+#include <crypto/internal/hash.h>
+#include <crypto/nhpoly1305.h>
+#include <linux/crypto.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+static void nh_generic(const u32 *key, const u8 *message, size_t message_len,
+		       __le64 hash[NH_NUM_PASSES])
+{
+	u64 sums[4] = { 0, 0, 0, 0 };
+
+	BUILD_BUG_ON(NH_PAIR_STRIDE != 2);
+	BUILD_BUG_ON(NH_NUM_PASSES != 4);
+
+	while (message_len) {
+		u32 m0 = get_unaligned_le32(message + 0);
+		u32 m1 = get_unaligned_le32(message + 4);
+		u32 m2 = get_unaligned_le32(message + 8);
+		u32 m3 = get_unaligned_le32(message + 12);
+
+		sums[0] += (u64)(u32)(m0 + key[ 0]) * (u32)(m2 + key[ 2]);
+		sums[1] += (u64)(u32)(m0 + key[ 4]) * (u32)(m2 + key[ 6]);
+		sums[2] += (u64)(u32)(m0 + key[ 8]) * (u32)(m2 + key[10]);
+		sums[3] += (u64)(u32)(m0 + key[12]) * (u32)(m2 + key[14]);
+		sums[0] += (u64)(u32)(m1 + key[ 1]) * (u32)(m3 + key[ 3]);
+		sums[1] += (u64)(u32)(m1 + key[ 5]) * (u32)(m3 + key[ 7]);
+		sums[2] += (u64)(u32)(m1 + key[ 9]) * (u32)(m3 + key[11]);
+		sums[3] += (u64)(u32)(m1 + key[13]) * (u32)(m3 + key[15]);
+		key += NH_MESSAGE_UNIT / sizeof(key[0]);
+		message += NH_MESSAGE_UNIT;
+		message_len -= NH_MESSAGE_UNIT;
+	}
+
+	hash[0] = cpu_to_le64(sums[0]);
+	hash[1] = cpu_to_le64(sums[1]);
+	hash[2] = cpu_to_le64(sums[2]);
+	hash[3] = cpu_to_le64(sums[3]);
+}
+
+/* Pass the next NH hash value through Poly1305 */
+static void process_nh_hash_value(struct nhpoly1305_state *state,
+				  const struct nhpoly1305_key *key)
+{
+	BUILD_BUG_ON(NH_HASH_BYTES % POLY1305_BLOCK_SIZE != 0);
+
+	poly1305_core_blocks(&state->poly_state, &key->poly_key, state->nh_hash,
+			     NH_HASH_BYTES / POLY1305_BLOCK_SIZE);
+}
+
+/*
+ * Feed the next portion of the source data, as a whole number of 16-byte
+ * "NH message units", through NH and Poly1305.  Each NH hash is taken over
+ * 1024 bytes, except possibly the final one which is taken over a multiple of
+ * 16 bytes up to 1024.  Also, in the case where data is passed in misaligned
+ * chunks, we combine partial hashes; the end result is the same either way.
+ */
+static void nhpoly1305_units(struct nhpoly1305_state *state,
+			     const struct nhpoly1305_key *key,
+			     const u8 *src, unsigned int srclen, nh_t nh_fn)
+{
+	do {
+		unsigned int bytes;
+
+		if (state->nh_remaining == 0) {
+			/* Starting a new NH message */
+			bytes = min_t(unsigned int, srclen, NH_MESSAGE_BYTES);
+			nh_fn(key->nh_key, src, bytes, state->nh_hash);
+			state->nh_remaining = NH_MESSAGE_BYTES - bytes;
+		} else {
+			/* Continuing a previous NH message */
+			__le64 tmp_hash[NH_NUM_PASSES];
+			unsigned int pos;
+			int i;
+
+			pos = NH_MESSAGE_BYTES - state->nh_remaining;
+			bytes = min(srclen, state->nh_remaining);
+			nh_fn(&key->nh_key[pos / 4], src, bytes, tmp_hash);
+			for (i = 0; i < NH_NUM_PASSES; i++)
+				le64_add_cpu(&state->nh_hash[i],
+					     le64_to_cpu(tmp_hash[i]));
+			state->nh_remaining -= bytes;
+		}
+		if (state->nh_remaining == 0)
+			process_nh_hash_value(state, key);
+		src += bytes;
+		srclen -= bytes;
+	} while (srclen);
+}
+
+int crypto_nhpoly1305_setkey(struct crypto_shash *tfm,
+			     const u8 *key, unsigned int keylen)
+{
+	struct nhpoly1305_key *ctx = crypto_shash_ctx(tfm);
+	int i;
+
+	if (keylen != NHPOLY1305_KEY_SIZE)
+		return -EINVAL;
+
+	poly1305_core_setkey(&ctx->poly_key, key);
+	key += POLY1305_BLOCK_SIZE;
+
+	for (i = 0; i < NH_KEY_WORDS; i++)
+		ctx->nh_key[i] = get_unaligned_le32(key + i * sizeof(u32));
+
+	return 0;
+}
+EXPORT_SYMBOL(crypto_nhpoly1305_setkey);
+
+int crypto_nhpoly1305_init(struct shash_desc *desc)
+{
+	struct nhpoly1305_state *state = shash_desc_ctx(desc);
+
+	poly1305_core_init(&state->poly_state);
+	state->buflen = 0;
+	state->nh_remaining = 0;
+	return 0;
+}
+EXPORT_SYMBOL(crypto_nhpoly1305_init);
+
+int crypto_nhpoly1305_update_helper(struct shash_desc *desc,
+				    const u8 *src, unsigned int srclen,
+				    nh_t nh_fn)
+{
+	struct nhpoly1305_state *state = shash_desc_ctx(desc);
+	const struct nhpoly1305_key *key = crypto_shash_ctx(desc->tfm);
+	unsigned int bytes;
+
+	if (state->buflen) {
+		bytes = min(srclen, (int)NH_MESSAGE_UNIT - state->buflen);
+		memcpy(&state->buffer[state->buflen], src, bytes);
+		state->buflen += bytes;
+		if (state->buflen < NH_MESSAGE_UNIT)
+			return 0;
+		nhpoly1305_units(state, key, state->buffer, NH_MESSAGE_UNIT,
+				 nh_fn);
+		state->buflen = 0;
+		src += bytes;
+		srclen -= bytes;
+	}
+
+	if (srclen >= NH_MESSAGE_UNIT) {
+		bytes = round_down(srclen, NH_MESSAGE_UNIT);
+		nhpoly1305_units(state, key, src, bytes, nh_fn);
+		src += bytes;
+		srclen -= bytes;
+	}
+
+	if (srclen) {
+		memcpy(state->buffer, src, srclen);
+		state->buflen = srclen;
+	}
+	return 0;
+}
+EXPORT_SYMBOL(crypto_nhpoly1305_update_helper);
+
+int crypto_nhpoly1305_update(struct shash_desc *desc,
+			     const u8 *src, unsigned int srclen)
+{
+	return crypto_nhpoly1305_update_helper(desc, src, srclen, nh_generic);
+}
+EXPORT_SYMBOL(crypto_nhpoly1305_update);
+
+int crypto_nhpoly1305_final_helper(struct shash_desc *desc, u8 *dst, nh_t nh_fn)
+{
+	struct nhpoly1305_state *state = shash_desc_ctx(desc);
+	const struct nhpoly1305_key *key = crypto_shash_ctx(desc->tfm);
+
+	if (state->buflen) {
+		memset(&state->buffer[state->buflen], 0,
+		       NH_MESSAGE_UNIT - state->buflen);
+		nhpoly1305_units(state, key, state->buffer, NH_MESSAGE_UNIT,
+				 nh_fn);
+	}
+
+	if (state->nh_remaining)
+		process_nh_hash_value(state, key);
+
+	poly1305_core_emit(&state->poly_state, dst);
+	return 0;
+}
+EXPORT_SYMBOL(crypto_nhpoly1305_final_helper);
+
+int crypto_nhpoly1305_final(struct shash_desc *desc, u8 *dst)
+{
+	return crypto_nhpoly1305_final_helper(desc, dst, nh_generic);
+}
+EXPORT_SYMBOL(crypto_nhpoly1305_final);
+
+static struct shash_alg nhpoly1305_alg = {
+	.base.cra_name		= "nhpoly1305",
+	.base.cra_driver_name	= "nhpoly1305-generic",
+	.base.cra_priority	= 100,
+	.base.cra_ctxsize	= sizeof(struct nhpoly1305_key),
+	.base.cra_module	= THIS_MODULE,
+	.digestsize		= POLY1305_DIGEST_SIZE,
+	.init			= crypto_nhpoly1305_init,
+	.update			= crypto_nhpoly1305_update,
+	.final			= crypto_nhpoly1305_final,
+	.setkey			= crypto_nhpoly1305_setkey,
+	.descsize		= sizeof(struct nhpoly1305_state),
+};
+
+static int __init nhpoly1305_mod_init(void)
+{
+	return crypto_register_shash(&nhpoly1305_alg);
+}
+
+static void __exit nhpoly1305_mod_exit(void)
+{
+	crypto_unregister_shash(&nhpoly1305_alg);
+}
+
+module_init(nhpoly1305_mod_init);
+module_exit(nhpoly1305_mod_exit);
+
+MODULE_DESCRIPTION("NHPoly1305 ε-almost-∆-universal hash function");
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
+MODULE_ALIAS_CRYPTO("nhpoly1305");
+MODULE_ALIAS_CRYPTO("nhpoly1305-generic");
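
Since the new algorithm registers through the regular shash interface, in-kernel users drive it like any other keyed hash. A minimal sketch (process context assumed, error paths trimmed; real consumers such as the Adiantum template derive the NHPOLY1305_KEY_SIZE key internally rather than taking it as a buffer):

	/* Sketch only: one-shot NHPoly1305 digest via the shash API. */
	static int nhpoly1305_demo(const u8 *key, const u8 *msg,
				   unsigned int msglen, u8 digest[POLY1305_DIGEST_SIZE])
	{
		struct crypto_shash *tfm;
		int err;

		tfm = crypto_alloc_shash("nhpoly1305", 0, 0);
		if (IS_ERR(tfm))
			return PTR_ERR(tfm);

		err = crypto_shash_setkey(tfm, key, NHPOLY1305_KEY_SIZE);
		if (!err) {
			SHASH_DESC_ON_STACK(desc, tfm);

			desc->tfm = tfm;
			desc->flags = 0;
			err = crypto_shash_digest(desc, msg, msglen, digest);
		}
		crypto_free_shash(tfm);
		return err;
	}

Remember the warning in the file header: this is an ε-∆U hash for Adiantum, not a general-purpose cryptographic hash.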
@@ -394,7 +394,7 @@ static int pcrypt_sysfs_add(struct padata_instance *pinst, const char *name)
 	int ret;

 	pinst->kobj.kset = pcrypt_kset;
-	ret = kobject_add(&pinst->kobj, NULL, name);
+	ret = kobject_add(&pinst->kobj, NULL, "%s", name);
 	if (!ret)
 		kobject_uevent(&pinst->kobj, KOBJ_ADD);
@@ -38,7 +38,7 @@ int crypto_poly1305_init(struct shash_desc *desc)
 {
 	struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);

-	memset(dctx->h, 0, sizeof(dctx->h));
+	poly1305_core_init(&dctx->h);
 	dctx->buflen = 0;
 	dctx->rset = false;
 	dctx->sset = false;
@@ -47,23 +47,16 @@ int crypto_poly1305_init(struct shash_desc *desc)
 }
 EXPORT_SYMBOL_GPL(crypto_poly1305_init);

-static void poly1305_setrkey(struct poly1305_desc_ctx *dctx, const u8 *key)
+void poly1305_core_setkey(struct poly1305_key *key, const u8 *raw_key)
 {
 	/* r &= 0xffffffc0ffffffc0ffffffc0fffffff */
-	dctx->r[0] = (get_unaligned_le32(key +  0) >> 0) & 0x3ffffff;
-	dctx->r[1] = (get_unaligned_le32(key +  3) >> 2) & 0x3ffff03;
-	dctx->r[2] = (get_unaligned_le32(key +  6) >> 4) & 0x3ffc0ff;
-	dctx->r[3] = (get_unaligned_le32(key +  9) >> 6) & 0x3f03fff;
-	dctx->r[4] = (get_unaligned_le32(key + 12) >> 8) & 0x00fffff;
-}
-
-static void poly1305_setskey(struct poly1305_desc_ctx *dctx, const u8 *key)
-{
-	dctx->s[0] = get_unaligned_le32(key +  0);
-	dctx->s[1] = get_unaligned_le32(key +  4);
-	dctx->s[2] = get_unaligned_le32(key +  8);
-	dctx->s[3] = get_unaligned_le32(key + 12);
+	key->r[0] = (get_unaligned_le32(raw_key +  0) >> 0) & 0x3ffffff;
+	key->r[1] = (get_unaligned_le32(raw_key +  3) >> 2) & 0x3ffff03;
+	key->r[2] = (get_unaligned_le32(raw_key +  6) >> 4) & 0x3ffc0ff;
+	key->r[3] = (get_unaligned_le32(raw_key +  9) >> 6) & 0x3f03fff;
+	key->r[4] = (get_unaligned_le32(raw_key + 12) >> 8) & 0x00fffff;
 }
+EXPORT_SYMBOL_GPL(poly1305_core_setkey);

 /*
  * Poly1305 requires a unique key for each tag, which implies that we can't set
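
The masks applied while loading r implement the standard Poly1305 clamp from RFC 7539: 22 bits of r are forced to zero (the top four bits of bytes 3, 7, 11 and 15, and the bottom two bits of bytes 4, 8 and 12) before r is split into five 26-bit limbs. The same operation expressed byte-wise on a flat 16-byte r, as a sketch:

	/* Sketch: the Poly1305 clamp, equivalent to
	 * r &= 0xffffffc0ffffffc0ffffffc0fffffff on the little-endian value. */
	static void poly1305_clamp(u8 r[16])
	{
		r[3]  &= 15;	/* clear top 4 bits  */
		r[7]  &= 15;
		r[11] &= 15;
		r[15] &= 15;
		r[4]  &= 252;	/* clear bottom 2 bits */
		r[8]  &= 252;
		r[12] &= 252;
	}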
@@ -75,13 +68,16 @@ unsigned int crypto_poly1305_setdesckey(struct poly1305_desc_ctx *dctx,
 {
 	if (!dctx->sset) {
 		if (!dctx->rset && srclen >= POLY1305_BLOCK_SIZE) {
-			poly1305_setrkey(dctx, src);
+			poly1305_core_setkey(&dctx->r, src);
 			src += POLY1305_BLOCK_SIZE;
 			srclen -= POLY1305_BLOCK_SIZE;
 			dctx->rset = true;
 		}
 		if (srclen >= POLY1305_BLOCK_SIZE) {
-			poly1305_setskey(dctx, src);
+			dctx->s[0] = get_unaligned_le32(src +  0);
+			dctx->s[1] = get_unaligned_le32(src +  4);
+			dctx->s[2] = get_unaligned_le32(src +  8);
+			dctx->s[3] = get_unaligned_le32(src + 12);
 			src += POLY1305_BLOCK_SIZE;
 			srclen -= POLY1305_BLOCK_SIZE;
 			dctx->sset = true;
@@ -91,41 +87,37 @@ unsigned int crypto_poly1305_setdesckey(struct poly1305_desc_ctx *dctx,
 }
 EXPORT_SYMBOL_GPL(crypto_poly1305_setdesckey);

-static unsigned int poly1305_blocks(struct poly1305_desc_ctx *dctx,
-				    const u8 *src, unsigned int srclen,
-				    u32 hibit)
+static void poly1305_blocks_internal(struct poly1305_state *state,
+				     const struct poly1305_key *key,
+				     const void *src, unsigned int nblocks,
+				     u32 hibit)
 {
 	u32 r0, r1, r2, r3, r4;
 	u32 s1, s2, s3, s4;
 	u32 h0, h1, h2, h3, h4;
 	u64 d0, d1, d2, d3, d4;
-	unsigned int datalen;

-	if (unlikely(!dctx->sset)) {
-		datalen = crypto_poly1305_setdesckey(dctx, src, srclen);
-		src += srclen - datalen;
-		srclen = datalen;
-	}
+	if (!nblocks)
+		return;

-	r0 = dctx->r[0];
-	r1 = dctx->r[1];
-	r2 = dctx->r[2];
-	r3 = dctx->r[3];
-	r4 = dctx->r[4];
+	r0 = key->r[0];
+	r1 = key->r[1];
+	r2 = key->r[2];
+	r3 = key->r[3];
+	r4 = key->r[4];

 	s1 = r1 * 5;
 	s2 = r2 * 5;
 	s3 = r3 * 5;
 	s4 = r4 * 5;

-	h0 = dctx->h[0];
-	h1 = dctx->h[1];
-	h2 = dctx->h[2];
-	h3 = dctx->h[3];
-	h4 = dctx->h[4];
+	h0 = state->h[0];
+	h1 = state->h[1];
+	h2 = state->h[2];
+	h3 = state->h[3];
+	h4 = state->h[4];

-	while (likely(srclen >= POLY1305_BLOCK_SIZE)) {
-
+	do {
 		/* h += m[i] */
 		h0 += (get_unaligned_le32(src +  0) >> 0) & 0x3ffffff;
 		h1 += (get_unaligned_le32(src +  3) >> 2) & 0x3ffffff;
@@ -154,16 +146,36 @@ static unsigned int poly1305_blocks(struct poly1305_desc_ctx *dctx,
 		h1 += h0 >> 26;       h0 = h0 & 0x3ffffff;

 		src += POLY1305_BLOCK_SIZE;
-		srclen -= POLY1305_BLOCK_SIZE;
-	}
+	} while (--nblocks);

+	state->h[0] = h0;
+	state->h[1] = h1;
+	state->h[2] = h2;
+	state->h[3] = h3;
+	state->h[4] = h4;
+}
+
+void poly1305_core_blocks(struct poly1305_state *state,
+			  const struct poly1305_key *key,
+			  const void *src, unsigned int nblocks)
+{
+	poly1305_blocks_internal(state, key, src, nblocks, 1 << 24);
+}
+EXPORT_SYMBOL_GPL(poly1305_core_blocks);
+
+static void poly1305_blocks(struct poly1305_desc_ctx *dctx,
+			    const u8 *src, unsigned int srclen, u32 hibit)
+{
+	unsigned int datalen;
+
+	if (unlikely(!dctx->sset)) {
+		datalen = crypto_poly1305_setdesckey(dctx, src, srclen);
+		src += srclen - datalen;
+		srclen = datalen;
 	}

-	dctx->h[0] = h0;
-	dctx->h[1] = h1;
-	dctx->h[2] = h2;
-	dctx->h[3] = h3;
-	dctx->h[4] = h4;
-
-	return srclen;
+	poly1305_blocks_internal(&dctx->h, &dctx->r,
+				 src, srclen / POLY1305_BLOCK_SIZE, hibit);
 }

 int crypto_poly1305_update(struct shash_desc *desc,
@@ -187,9 +199,9 @@ int crypto_poly1305_update(struct shash_desc *desc,
 	}

 	if (likely(srclen >= POLY1305_BLOCK_SIZE)) {
-		bytes = poly1305_blocks(dctx, src, srclen, 1 << 24);
-		src += srclen - bytes;
-		srclen = bytes;
+		poly1305_blocks(dctx, src, srclen, 1 << 24);
+		src += srclen - (srclen % POLY1305_BLOCK_SIZE);
+		srclen %= POLY1305_BLOCK_SIZE;
 	}

 	if (unlikely(srclen)) {
@@ -201,30 +213,18 @@ int crypto_poly1305_update(struct shash_desc *desc,
 }
 EXPORT_SYMBOL_GPL(crypto_poly1305_update);

-int crypto_poly1305_final(struct shash_desc *desc, u8 *dst)
+void poly1305_core_emit(const struct poly1305_state *state, void *dst)
 {
-	struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
 	u32 h0, h1, h2, h3, h4;
 	u32 g0, g1, g2, g3, g4;
 	u32 mask;
-	u64 f = 0;
-
-	if (unlikely(!dctx->sset))
-		return -ENOKEY;
-
-	if (unlikely(dctx->buflen)) {
-		dctx->buf[dctx->buflen++] = 1;
-		memset(dctx->buf + dctx->buflen, 0,
-		       POLY1305_BLOCK_SIZE - dctx->buflen);
-		poly1305_blocks(dctx, dctx->buf, POLY1305_BLOCK_SIZE, 0);
-	}

 	/* fully carry h */
-	h0 = dctx->h[0];
-	h1 = dctx->h[1];
-	h2 = dctx->h[2];
-	h3 = dctx->h[3];
-	h4 = dctx->h[4];
+	h0 = state->h[0];
+	h1 = state->h[1];
+	h2 = state->h[2];
+	h3 = state->h[3];
+	h4 = state->h[4];

 	h2 += (h1 >> 26);     h1 = h1 & 0x3ffffff;
 	h3 += (h2 >> 26);     h2 = h2 & 0x3ffffff;
@@ -254,16 +254,40 @@ int crypto_poly1305_final(struct shash_desc *desc, u8 *dst)
 	h4 = (h4 & mask) | g4;

 	/* h = h % (2^128) */
-	h0 = (h0 >>  0) | (h1 << 26);
-	h1 = (h1 >>  6) | (h2 << 20);
-	h2 = (h2 >> 12) | (h3 << 14);
-	h3 = (h3 >> 18) | (h4 <<  8);
+	put_unaligned_le32((h0 >>  0) | (h1 << 26), dst +  0);
+	put_unaligned_le32((h1 >>  6) | (h2 << 20), dst +  4);
+	put_unaligned_le32((h2 >> 12) | (h3 << 14), dst +  8);
+	put_unaligned_le32((h3 >> 18) | (h4 <<  8), dst + 12);
+}
+EXPORT_SYMBOL_GPL(poly1305_core_emit);
+
+int crypto_poly1305_final(struct shash_desc *desc, u8 *dst)
+{
+	struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
+	__le32 digest[4];
+	u64 f = 0;
+
+	if (unlikely(!dctx->sset))
+		return -ENOKEY;
+
+	if (unlikely(dctx->buflen)) {
+		dctx->buf[dctx->buflen++] = 1;
+		memset(dctx->buf + dctx->buflen, 0,
+		       POLY1305_BLOCK_SIZE - dctx->buflen);
+		poly1305_blocks(dctx, dctx->buf, POLY1305_BLOCK_SIZE, 0);
+	}
+
+	poly1305_core_emit(&dctx->h, digest);

 	/* mac = (h + s) % (2^128) */
-	f = (f >> 32) + h0 + dctx->s[0]; put_unaligned_le32(f, dst +  0);
-	f = (f >> 32) + h1 + dctx->s[1]; put_unaligned_le32(f, dst +  4);
-	f = (f >> 32) + h2 + dctx->s[2]; put_unaligned_le32(f, dst +  8);
-	f = (f >> 32) + h3 + dctx->s[3]; put_unaligned_le32(f, dst + 12);
+	f = (f >> 32) + le32_to_cpu(digest[0]) + dctx->s[0];
+	put_unaligned_le32(f, dst +  0);
+	f = (f >> 32) + le32_to_cpu(digest[1]) + dctx->s[1];
+	put_unaligned_le32(f, dst +  4);
+	f = (f >> 32) + le32_to_cpu(digest[2]) + dctx->s[2];
+	put_unaligned_le32(f, dst +  8);
+	f = (f >> 32) + le32_to_cpu(digest[3]) + dctx->s[3];
+	put_unaligned_le32(f, dst + 12);

 	return 0;
 }
crypto/rng.c (16 changed lines)
@@ -35,9 +35,11 @@ static int crypto_default_rng_refcnt;

 int crypto_rng_reset(struct crypto_rng *tfm, const u8 *seed, unsigned int slen)
 {
+	struct crypto_alg *alg = tfm->base.__crt_alg;
 	u8 *buf = NULL;
 	int err;

+	crypto_stats_get(alg);
 	if (!seed && slen) {
 		buf = kmalloc(slen, GFP_KERNEL);
 		if (!buf)
@@ -50,7 +52,7 @@ int crypto_rng_reset(struct crypto_rng *tfm, const u8 *seed, unsigned int slen)
 	}

 	err = crypto_rng_alg(tfm)->seed(tfm, seed, slen);
-	crypto_stat_rng_seed(tfm, err);
+	crypto_stats_rng_seed(alg, err);
 out:
 	kzfree(buf);
 	return err;
@@ -74,17 +76,13 @@ static int crypto_rng_report(struct sk_buff *skb, struct crypto_alg *alg)
 {
 	struct crypto_report_rng rrng;

-	strncpy(rrng.type, "rng", sizeof(rrng.type));
+	memset(&rrng, 0, sizeof(rrng));
+
+	strscpy(rrng.type, "rng", sizeof(rrng.type));

 	rrng.seedsize = seedsize(alg);

-	if (nla_put(skb, CRYPTOCFGA_REPORT_RNG,
-		    sizeof(struct crypto_report_rng), &rrng))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_REPORT_RNG, sizeof(rrng), &rrng);
 }
 #else
 static int crypto_rng_report(struct sk_buff *skb, struct crypto_alg *alg)
@@ -159,7 +159,7 @@ static int salsa20_crypt(struct skcipher_request *req)
 	u32 state[16];
 	int err;

-	err = skcipher_walk_virt(&walk, req, true);
+	err = skcipher_walk_virt(&walk, req, false);

 	salsa20_init(state, ctx, walk.iv);
@@ -40,15 +40,12 @@ static int crypto_scomp_report(struct sk_buff *skb, struct crypto_alg *alg)
 {
 	struct crypto_report_comp rscomp;

-	strncpy(rscomp.type, "scomp", sizeof(rscomp.type));
+	memset(&rscomp, 0, sizeof(rscomp));

-	if (nla_put(skb, CRYPTOCFGA_REPORT_COMPRESS,
-		    sizeof(struct crypto_report_comp), &rscomp))
-		goto nla_put_failure;
-	return 0;
+	strscpy(rscomp.type, "scomp", sizeof(rscomp.type));

-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_REPORT_COMPRESS,
+		       sizeof(rscomp), &rscomp);
 }
 #else
 static int crypto_scomp_report(struct sk_buff *skb, struct crypto_alg *alg)
@@ -408,18 +408,14 @@ static int crypto_shash_report(struct sk_buff *skb, struct crypto_alg *alg)
 	struct crypto_report_hash rhash;
 	struct shash_alg *salg = __crypto_shash_alg(alg);

-	strncpy(rhash.type, "shash", sizeof(rhash.type));
+	memset(&rhash, 0, sizeof(rhash));
+
+	strscpy(rhash.type, "shash", sizeof(rhash.type));

 	rhash.blocksize = alg->cra_blocksize;
 	rhash.digestsize = salg->digestsize;

-	if (nla_put(skb, CRYPTOCFGA_REPORT_HASH,
-		    sizeof(struct crypto_report_hash), &rhash))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_REPORT_HASH, sizeof(rhash), &rhash);
 }
 #else
 static int crypto_shash_report(struct sk_buff *skb, struct crypto_alg *alg)
@@ -474,6 +474,8 @@ int skcipher_walk_virt(struct skcipher_walk *walk,
 {
 	int err;

+	might_sleep_if(req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP);
+
 	walk->flags &= ~SKCIPHER_WALK_PHYS;

 	err = skcipher_walk_skcipher(walk, req);
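
The added might_sleep_if() makes a long-standing contract checkable: a caller that sets CRYPTO_TFM_REQ_MAY_SLEEP promises it is in process context, and debug kernels will now warn when that promise is broken. A sketch of how a caller states the intent (my_complete_cb and my_ctx are placeholder names):

	/* Sketch only: the caller, not the walk code, decides whether
	 * sleeping is allowed; atomic-context callers must pass 0 instead
	 * of CRYPTO_TFM_REQ_MAY_SLEEP. */
	static int encrypt_may_sleep(struct skcipher_request *req,
				     struct scatterlist *src,
				     struct scatterlist *dst,
				     unsigned int nbytes, u8 *iv)
	{
		skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP,
					      my_complete_cb, my_ctx);
		skcipher_request_set_crypt(req, src, dst, nbytes, iv);
		return crypto_skcipher_encrypt(req);
	}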
@@ -577,8 +579,7 @@ static unsigned int crypto_skcipher_extsize(struct crypto_alg *alg)
 	if (alg->cra_type == &crypto_blkcipher_type)
 		return sizeof(struct crypto_blkcipher *);

-	if (alg->cra_type == &crypto_ablkcipher_type ||
-	    alg->cra_type == &crypto_givcipher_type)
+	if (alg->cra_type == &crypto_ablkcipher_type)
 		return sizeof(struct crypto_ablkcipher *);

 	return crypto_alg_extsize(alg);
@@ -842,8 +843,7 @@ static int crypto_skcipher_init_tfm(struct crypto_tfm *tfm)
 	if (tfm->__crt_alg->cra_type == &crypto_blkcipher_type)
 		return crypto_init_skcipher_ops_blkcipher(tfm);

-	if (tfm->__crt_alg->cra_type == &crypto_ablkcipher_type ||
-	    tfm->__crt_alg->cra_type == &crypto_givcipher_type)
+	if (tfm->__crt_alg->cra_type == &crypto_ablkcipher_type)
 		return crypto_init_skcipher_ops_ablkcipher(tfm);

 	skcipher->setkey = skcipher_setkey;
@@ -897,21 +897,18 @@ static int crypto_skcipher_report(struct sk_buff *skb, struct crypto_alg *alg)
 	struct skcipher_alg *skcipher = container_of(alg, struct skcipher_alg,
 						     base);

-	strncpy(rblkcipher.type, "skcipher", sizeof(rblkcipher.type));
-	strncpy(rblkcipher.geniv, "<none>", sizeof(rblkcipher.geniv));
+	memset(&rblkcipher, 0, sizeof(rblkcipher));
+
+	strscpy(rblkcipher.type, "skcipher", sizeof(rblkcipher.type));
+	strscpy(rblkcipher.geniv, "<none>", sizeof(rblkcipher.geniv));

 	rblkcipher.blocksize = alg->cra_blocksize;
 	rblkcipher.min_keysize = skcipher->min_keysize;
 	rblkcipher.max_keysize = skcipher->max_keysize;
 	rblkcipher.ivsize = skcipher->ivsize;

-	if (nla_put(skb, CRYPTOCFGA_REPORT_BLKCIPHER,
-		    sizeof(struct crypto_report_blkcipher), &rblkcipher))
-		goto nla_put_failure;
-	return 0;
-
-nla_put_failure:
-	return -EMSGSIZE;
+	return nla_put(skb, CRYPTOCFGA_REPORT_BLKCIPHER,
+		       sizeof(rblkcipher), &rblkcipher);
 }
 #else
 static int crypto_skcipher_report(struct sk_buff *skb, struct crypto_alg *alg)
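
All of the crypto_*_report() conversions in this series share one shape, and the reason is worth spelling out: the report struct lives on the stack, so the memset() guarantees that padding and unused tail bytes are zeroed before the whole struct is copied to userspace via nla_put(), and strscpy(), unlike strncpy(), always NUL-terminates. A generic sketch of the pattern (crypto_report_foo and CRYPTOCFGA_REPORT_FOO are hypothetical names):

	/* Sketch of the common netlink report pattern after this series:
	 * zero first, fill, then emit the whole struct in one nla_put(). */
	static int crypto_foo_report(struct sk_buff *skb, struct crypto_alg *alg)
	{
		struct crypto_report_foo rfoo;

		memset(&rfoo, 0, sizeof(rfoo));	/* no stack bytes leak */

		strscpy(rfoo.type, "foo", sizeof(rfoo.type));

		return nla_put(skb, CRYPTOCFGA_REPORT_FOO, sizeof(rfoo), &rfoo);
	}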
(one file's diff was suppressed by the web viewer as too large to display)
@@ -76,10 +76,12 @@ static char *check[] = {
 	"cast6", "arc4", "michael_mic", "deflate", "crc32c", "tea", "xtea",
 	"khazad", "wp512", "wp384", "wp256", "tnepres", "xeta",  "fcrypt",
 	"camellia", "seed", "salsa20", "rmd128", "rmd160", "rmd256", "rmd320",
-	"lzo", "cts", "sha3-224", "sha3-256", "sha3-384", "sha3-512", NULL
+	"lzo", "cts", "sha3-224", "sha3-256", "sha3-384", "sha3-512",
+	"streebog256", "streebog512",
+	NULL
 };

-static u32 block_sizes[] = { 16, 64, 256, 1024, 8192, 0 };
+static u32 block_sizes[] = { 16, 64, 256, 1024, 1472, 8192, 0 };
 static u32 aead_sizes[] = { 16, 64, 256, 512, 1024, 2048, 4096, 8192, 0 };

 #define XBUFSIZE	8
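
The new 1472-byte entry in block_sizes[] is presumably sized to a realistic packet payload rather than a power of two: on a standard 1500-byte Ethernet MTU,

	1500 (MTU) - 20 (IPv4 header) - 8 (UDP header) = 1472 bytes

so the cipher speed tests now cover a request size close to what IPsec-style traffic actually presents.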
@@ -1736,6 +1738,7 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
 		ret += tcrypt_test("ctr(aes)");
 		ret += tcrypt_test("rfc3686(ctr(aes))");
 		ret += tcrypt_test("ofb(aes)");
+		ret += tcrypt_test("cfb(aes)");
 		break;

 	case 11:
@@ -1913,6 +1916,14 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
 		ret += tcrypt_test("sm3");
 		break;

+	case 53:
+		ret += tcrypt_test("streebog256");
+		break;
+
+	case 54:
+		ret += tcrypt_test("streebog512");
+		break;
+
 	case 100:
 		ret += tcrypt_test("hmac(md5)");
 		break;
@@ -1969,6 +1980,14 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
 		ret += tcrypt_test("hmac(sha3-512)");
 		break;

+	case 115:
+		ret += tcrypt_test("hmac(streebog256)");
+		break;
+
+	case 116:
+		ret += tcrypt_test("hmac(streebog512)");
+		break;
+
 	case 150:
 		ret += tcrypt_test("ansi_cprng");
 		break;
@@ -2060,6 +2079,10 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
 				speed_template_16_24_32);
 		test_cipher_speed("ctr(aes)", DECRYPT, sec, NULL, 0,
 				speed_template_16_24_32);
+		test_cipher_speed("cfb(aes)", ENCRYPT, sec, NULL, 0,
+				speed_template_16_24_32);
+		test_cipher_speed("cfb(aes)", DECRYPT, sec, NULL, 0,
+				speed_template_16_24_32);
 		break;

 	case 201:
@@ -2297,6 +2320,18 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
 	test_cipher_speed("ctr(sm4)", DECRYPT, sec, NULL, 0,
 			speed_template_16);
 		break;

+	case 219:
+		test_cipher_speed("adiantum(xchacha12,aes)", ENCRYPT, sec, NULL,
+				  0, speed_template_32);
+		test_cipher_speed("adiantum(xchacha12,aes)", DECRYPT, sec, NULL,
+				  0, speed_template_32);
+		test_cipher_speed("adiantum(xchacha20,aes)", ENCRYPT, sec, NULL,
+				  0, speed_template_32);
+		test_cipher_speed("adiantum(xchacha20,aes)", DECRYPT, sec, NULL,
+				  0, speed_template_32);
+		break;
+
 	case 300:
 		if (alg) {
 			test_hash_speed(alg, sec, generic_hash_speed_template);
@@ -2407,6 +2442,16 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
 		test_hash_speed("sm3", sec, generic_hash_speed_template);
 		if (mode > 300 && mode < 400) break;
 		/* fall through */
+	case 327:
+		test_hash_speed("streebog256", sec,
+				generic_hash_speed_template);
+		if (mode > 300 && mode < 400) break;
+		/* fall through */
+	case 328:
+		test_hash_speed("streebog512", sec,
+				generic_hash_speed_template);
+		if (mode > 300 && mode < 400) break;
+		/* fall through */
 	case 399:
 		break;
@@ -2520,6 +2565,16 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
 				    num_mb);
 		if (mode > 400 && mode < 500) break;
 		/* fall through */
+	case 426:
+		test_mb_ahash_speed("streebog256", sec,
+				    generic_hash_speed_template, num_mb);
+		if (mode > 400 && mode < 500) break;
+		/* fall through */
+	case 427:
+		test_mb_ahash_speed("streebog512", sec,
+				    generic_hash_speed_template, num_mb);
+		if (mode > 400 && mode < 500) break;
+		/* fall through */
 	case 499:
 		break;
@@ -2404,6 +2404,18 @@ static int alg_test_null(const struct alg_test_desc *desc,
 /* Please keep this list sorted by algorithm name. */
 static const struct alg_test_desc alg_test_descs[] = {
 	{
+		.alg = "adiantum(xchacha12,aes)",
+		.test = alg_test_skcipher,
+		.suite = {
+			.cipher = __VECS(adiantum_xchacha12_aes_tv_template)
+		},
+	}, {
+		.alg = "adiantum(xchacha20,aes)",
+		.test = alg_test_skcipher,
+		.suite = {
+			.cipher = __VECS(adiantum_xchacha20_aes_tv_template)
+		},
+	}, {
 		.alg = "aegis128",
 		.test = alg_test_aead,
 		.suite = {
@@ -2690,6 +2702,13 @@ static const struct alg_test_desc alg_test_descs[] = {
 			.dec = __VECS(aes_ccm_dec_tv_template)
 			}
 		}
+	}, {
+		.alg = "cfb(aes)",
+		.test = alg_test_skcipher,
+		.fips_allowed = 1,
+		.suite = {
+			.cipher = __VECS(aes_cfb_tv_template)
+		},
 	}, {
 		.alg = "chacha20",
 		.test = alg_test_skcipher,
@@ -2805,6 +2824,7 @@ static const struct alg_test_desc alg_test_descs[] = {
 	}, {
 		.alg = "cts(cbc(aes))",
 		.test = alg_test_skcipher,
+		.fips_allowed = 1,
 		.suite = {
 			.cipher = __VECS(cts_mode_tv_template)
 		}
@@ -3184,6 +3204,18 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.hash = __VECS(hmac_sha512_tv_template)
 		}
+	}, {
+		.alg = "hmac(streebog256)",
+		.test = alg_test_hash,
+		.suite = {
+			.hash = __VECS(hmac_streebog256_tv_template)
+		}
+	}, {
+		.alg = "hmac(streebog512)",
+		.test = alg_test_hash,
+		.suite = {
+			.hash = __VECS(hmac_streebog512_tv_template)
+		}
 	}, {
 		.alg = "jitterentropy_rng",
 		.fips_allowed = 1,
@@ -3291,6 +3323,12 @@ static const struct alg_test_desc alg_test_descs[] = {
 			.dec = __VECS(morus640_dec_tv_template),
 		}
 	}
+	}, {
+		.alg = "nhpoly1305",
+		.test = alg_test_hash,
+		.suite = {
+			.hash = __VECS(nhpoly1305_tv_template)
+		}
 	}, {
 		.alg = "ofb(aes)",
 		.test = alg_test_skcipher,
@@ -3496,6 +3534,18 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.hash = __VECS(sm3_tv_template)
 		}
+	}, {
+		.alg = "streebog256",
+		.test = alg_test_hash,
+		.suite = {
+			.hash = __VECS(streebog256_tv_template)
+		}
+	}, {
+		.alg = "streebog512",
+		.test = alg_test_hash,
+		.suite = {
+			.hash = __VECS(streebog512_tv_template)
+		}
 	}, {
 		.alg = "tgr128",
 		.test = alg_test_hash,
@@ -3544,6 +3594,18 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.hash = __VECS(aes_xcbc128_tv_template)
 		}
+	}, {
+		.alg = "xchacha12",
+		.test = alg_test_skcipher,
+		.suite = {
+			.cipher = __VECS(xchacha12_tv_template)
+		},
+	}, {
+		.alg = "xchacha20",
+		.test = alg_test_skcipher,
+		.suite = {
+			.cipher = __VECS(xchacha20_tv_template)
+		},
 	}, {
 		.alg = "xts(aes)",
 		.test = alg_test_skcipher,
crypto/testmgr.h (3220 changed lines; diff suppressed by the web viewer as too large to display)
@@ -3623,7 +3623,7 @@ static int receive_protocol(struct drbd_connection *connection, struct packet_in
 	 * change.
 	 */

-	peer_integrity_tfm = crypto_alloc_shash(integrity_alg, 0, CRYPTO_ALG_ASYNC);
+	peer_integrity_tfm = crypto_alloc_shash(integrity_alg, 0, 0);
 	if (IS_ERR(peer_integrity_tfm)) {
 		peer_integrity_tfm = NULL;
 		drbd_err(connection, "peer data-integrity-alg %s not supported\n",
@@ -1,10 +1,7 @@
-/**
+// SPDX-License-Identifier: GPL-2.0
+/*
  * Copyright (c) 2010-2012 Broadcom. All rights reserved.
  * Copyright (c) 2013 Lubomir Rintel
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License ("GPL")
- * version 2, as published by the Free Software Foundation.
  */

 #include <linux/hw_random.h>
@@ -265,7 +265,7 @@
 #include <linux/syscalls.h>
 #include <linux/completion.h>
 #include <linux/uuid.h>
-#include <crypto/chacha20.h>
+#include <crypto/chacha.h>

 #include <asm/processor.h>
 #include <linux/uaccess.h>
@@ -431,11 +431,10 @@ static int crng_init = 0;
 #define crng_ready() (likely(crng_init > 1))
 static int crng_init_cnt = 0;
 static unsigned long crng_global_init_time = 0;
-#define CRNG_INIT_CNT_THRESH (2*CHACHA20_KEY_SIZE)
-static void _extract_crng(struct crng_state *crng,
-			  __u8 out[CHACHA20_BLOCK_SIZE]);
+#define CRNG_INIT_CNT_THRESH (2*CHACHA_KEY_SIZE)
+static void _extract_crng(struct crng_state *crng, __u8 out[CHACHA_BLOCK_SIZE]);
 static void _crng_backtrack_protect(struct crng_state *crng,
-				    __u8 tmp[CHACHA20_BLOCK_SIZE], int used);
+				    __u8 tmp[CHACHA_BLOCK_SIZE], int used);
 static void process_random_ready_list(void);
 static void _get_random_bytes(void *buf, int nbytes);
@@ -863,7 +862,7 @@ static int crng_fast_load(const char *cp, size_t len)
 	}
 	p = (unsigned char *) &primary_crng.state[4];
 	while (len > 0 && crng_init_cnt < CRNG_INIT_CNT_THRESH) {
-		p[crng_init_cnt % CHACHA20_KEY_SIZE] ^= *cp;
+		p[crng_init_cnt % CHACHA_KEY_SIZE] ^= *cp;
 		cp++; crng_init_cnt++; len--;
 	}
 	spin_unlock_irqrestore(&primary_crng.lock, flags);
@@ -895,7 +894,7 @@ static int crng_slow_load(const char *cp, size_t len)
 	unsigned long flags;
 	static unsigned char lfsr = 1;
 	unsigned char tmp;
-	unsigned i, max = CHACHA20_KEY_SIZE;
+	unsigned i, max = CHACHA_KEY_SIZE;
 	const char * src_buf = cp;
 	char * dest_buf = (char *) &primary_crng.state[4];
@@ -913,8 +912,8 @@ static int crng_slow_load(const char *cp, size_t len)
 		lfsr >>= 1;
 		if (tmp & 1)
 			lfsr ^= 0xE1;
-		tmp = dest_buf[i % CHACHA20_KEY_SIZE];
-		dest_buf[i % CHACHA20_KEY_SIZE] ^= src_buf[i % len] ^ lfsr;
+		tmp = dest_buf[i % CHACHA_KEY_SIZE];
+		dest_buf[i % CHACHA_KEY_SIZE] ^= src_buf[i % len] ^ lfsr;
 		lfsr += (tmp << 3) | (tmp >> 5);
 	}
 	spin_unlock_irqrestore(&primary_crng.lock, flags);
@@ -926,7 +925,7 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
 	unsigned long	flags;
 	int		i, num;
 	union {
-		__u8	block[CHACHA20_BLOCK_SIZE];
+		__u8	block[CHACHA_BLOCK_SIZE];
 		__u32	key[8];
 	} buf;
@@ -937,7 +936,7 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
 	} else {
 		_extract_crng(&primary_crng, buf.block);
 		_crng_backtrack_protect(&primary_crng, buf.block,
-					CHACHA20_KEY_SIZE);
+					CHACHA_KEY_SIZE);
 	}
 	spin_lock_irqsave(&crng->lock, flags);
 	for (i = 0; i < 8; i++) {
@@ -973,7 +972,7 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
 }

 static void _extract_crng(struct crng_state *crng,
-			  __u8 out[CHACHA20_BLOCK_SIZE])
+			  __u8 out[CHACHA_BLOCK_SIZE])
 {
 	unsigned long v, flags;
@@ -990,7 +989,7 @@ static void _extract_crng(struct crng_state *crng,
 	spin_unlock_irqrestore(&crng->lock, flags);
 }

-static void extract_crng(__u8 out[CHACHA20_BLOCK_SIZE])
+static void extract_crng(__u8 out[CHACHA_BLOCK_SIZE])
 {
 	struct crng_state *crng = NULL;
@@ -1008,14 +1007,14 @@ static void extract_crng(__u8 out[CHACHA_BLOCK_SIZE])
  * enough) to mutate the CRNG key to provide backtracking protection.
  */
 static void _crng_backtrack_protect(struct crng_state *crng,
-				    __u8 tmp[CHACHA20_BLOCK_SIZE], int used)
+				    __u8 tmp[CHACHA_BLOCK_SIZE], int used)
 {
 	unsigned long	flags;
 	__u32		*s, *d;
 	int		i;

 	used = round_up(used, sizeof(__u32));
-	if (used + CHACHA20_KEY_SIZE > CHACHA20_BLOCK_SIZE) {
+	if (used + CHACHA_KEY_SIZE > CHACHA_BLOCK_SIZE) {
 		extract_crng(tmp);
 		used = 0;
 	}
@@ -1027,7 +1026,7 @@ static void _crng_backtrack_protect(struct crng_state *crng,
 	spin_unlock_irqrestore(&crng->lock, flags);
 }

-static void crng_backtrack_protect(__u8 tmp[CHACHA20_BLOCK_SIZE], int used)
+static void crng_backtrack_protect(__u8 tmp[CHACHA_BLOCK_SIZE], int used)
 {
 	struct crng_state *crng = NULL;
@@ -1042,8 +1041,8 @@ static void crng_backtrack_protect(__u8 tmp[CHACHA_BLOCK_SIZE], int used)

 static ssize_t extract_crng_user(void __user *buf, size_t nbytes)
 {
-	ssize_t ret = 0, i = CHACHA20_BLOCK_SIZE;
-	__u8 tmp[CHACHA20_BLOCK_SIZE] __aligned(4);
+	ssize_t ret = 0, i = CHACHA_BLOCK_SIZE;
+	__u8 tmp[CHACHA_BLOCK_SIZE] __aligned(4);
 	int large_request = (nbytes > 256);

 	while (nbytes) {
@@ -1057,7 +1056,7 @@ static ssize_t extract_crng_user(void __user *buf, size_t nbytes)
 		}

 		extract_crng(tmp);
-		i = min_t(int, nbytes, CHACHA20_BLOCK_SIZE);
+		i = min_t(int, nbytes, CHACHA_BLOCK_SIZE);
 		if (copy_to_user(buf, tmp, i)) {
 			ret = -EFAULT;
 			break;
@@ -1622,14 +1621,14 @@ static void _warn_unseeded_randomness(const char *func_name, void *caller,
  */
 static void _get_random_bytes(void *buf, int nbytes)
 {
-	__u8 tmp[CHACHA20_BLOCK_SIZE] __aligned(4);
+	__u8 tmp[CHACHA_BLOCK_SIZE] __aligned(4);

 	trace_get_random_bytes(nbytes, _RET_IP_);

-	while (nbytes >= CHACHA20_BLOCK_SIZE) {
+	while (nbytes >= CHACHA_BLOCK_SIZE) {
 		extract_crng(buf);
-		buf += CHACHA20_BLOCK_SIZE;
-		nbytes -= CHACHA20_BLOCK_SIZE;
+		buf += CHACHA_BLOCK_SIZE;
+		nbytes -= CHACHA_BLOCK_SIZE;
 	}

 	if (nbytes > 0) {
@ -1637,7 +1636,7 @@ static void _get_random_bytes(void *buf, int nbytes)
|
||||||
memcpy(buf, tmp, nbytes);
|
memcpy(buf, tmp, nbytes);
|
||||||
crng_backtrack_protect(tmp, nbytes);
|
crng_backtrack_protect(tmp, nbytes);
|
||||||
} else
|
} else
|
||||||
crng_backtrack_protect(tmp, CHACHA20_BLOCK_SIZE);
|
crng_backtrack_protect(tmp, CHACHA_BLOCK_SIZE);
|
||||||
memzero_explicit(tmp, sizeof(tmp));
|
memzero_explicit(tmp, sizeof(tmp));
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -2208,8 +2207,8 @@ struct ctl_table random_table[] = {
|
||||||
|
|
||||||
struct batched_entropy {
|
struct batched_entropy {
|
||||||
union {
|
union {
|
||||||
u64 entropy_u64[CHACHA20_BLOCK_SIZE / sizeof(u64)];
|
u64 entropy_u64[CHACHA_BLOCK_SIZE / sizeof(u64)];
|
||||||
u32 entropy_u32[CHACHA20_BLOCK_SIZE / sizeof(u32)];
|
u32 entropy_u32[CHACHA_BLOCK_SIZE / sizeof(u32)];
|
||||||
};
|
};
|
||||||
unsigned int position;
|
unsigned int position;
|
||||||
};
|
};
|
||||||
|
|
|
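
The hunks above are the drivers/char/random.c side of the ChaCha library rename: the CHACHA20_* constants become CHACHA_* so the same definitions can also serve ChaCha12 and XChaCha; no logic changes. As a stand-alone illustration of the backtracking protection these functions implement, here is a minimal C sketch under stated assumptions: chacha_permute_block() is a dummy stand-in for the kernel's real ChaCha20 block function, and every name below is ours, not the kernel's.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define BLOCK_SIZE 64                   /* CHACHA_BLOCK_SIZE */
    #define KEY_SIZE   32                   /* CHACHA_KEY_SIZE   */

    /* Dummy keystream expansion; the kernel uses the ChaCha20 core. */
    static void chacha_permute_block(const uint8_t key[KEY_SIZE],
                                     uint64_t ctr, uint8_t out[BLOCK_SIZE])
    {
            for (int i = 0; i < BLOCK_SIZE; i++)
                    out[i] = (uint8_t)(key[i % KEY_SIZE] ^ (ctr + i) * 167u);
    }

    static uint8_t crng_key[KEY_SIZE];
    static uint64_t crng_ctr;

    /*
     * After handing out 'used' bytes of a block, overwrite the key with
     * the unused tail of that block, so a later key compromise cannot
     * recompute bytes already given out -- the idea behind
     * _crng_backtrack_protect() above.
     */
    static void backtrack_protect(uint8_t block[BLOCK_SIZE], size_t used)
    {
            used = (used + 3) & ~(size_t)3;  /* round_up(used, 4) */
            if (used + KEY_SIZE > BLOCK_SIZE) {
                    chacha_permute_block(crng_key, crng_ctr++, block);
                    used = 0;
            }
            memcpy(crng_key, block + used, KEY_SIZE);
    }

    int main(void)
    {
            uint8_t block[BLOCK_SIZE], out[16];

            chacha_permute_block(crng_key, crng_ctr++, block);
            memcpy(out, block, sizeof(out));       /* hand out 16 bytes   */
            backtrack_protect(block, sizeof(out)); /* rekey from the rest */

            for (size_t i = 0; i < sizeof(out); i++)
                    printf("%02x", out[i]);
            putchar('\n');
            return 0;
    }
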
diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
@@ -762,10 +762,12 @@ config CRYPTO_DEV_CCREE
 	select CRYPTO_ECB
 	select CRYPTO_CTR
 	select CRYPTO_XTS
+	select CRYPTO_SM4
+	select CRYPTO_SM3
 	help
 	  Say 'Y' to enable a driver for the REE interface of the Arm
 	  TrustZone CryptoCell family of processors. Currently the
-	  CryptoCell 712, 710 and 630 are supported.
+	  CryptoCell 713, 703, 712, 710 and 630 are supported.
 	  Choose this if you wish to use hardware acceleration of
 	  cryptographic operations on the system REE.
 	  If unsure say Y.
diff --git a/drivers/crypto/amcc/crypto4xx_alg.c b/drivers/crypto/amcc/crypto4xx_alg.c
@@ -520,8 +520,7 @@ static int crypto4xx_compute_gcm_hash_key_sw(__le32 *hash_start, const u8 *key,
 	uint8_t src[16] = { 0 };
 	int rc = 0;
 
-	aes_tfm = crypto_alloc_cipher("aes", 0, CRYPTO_ALG_ASYNC |
-				      CRYPTO_ALG_NEED_FALLBACK);
+	aes_tfm = crypto_alloc_cipher("aes", 0, CRYPTO_ALG_NEED_FALLBACK);
 	if (IS_ERR(aes_tfm)) {
 		rc = PTR_ERR(aes_tfm);
 		pr_warn("could not load aes cipher driver: %d\n", rc);
diff --git a/drivers/crypto/bcm/cipher.c b/drivers/crypto/bcm/cipher.c
@@ -3868,7 +3868,6 @@ static struct iproc_alg_s driver_algs[] = {
 			.cra_driver_name = "ctr-aes-iproc",
 			.cra_blocksize = AES_BLOCK_SIZE,
 			.cra_ablkcipher = {
-					   /* .geniv = "chainiv", */
 					   .min_keysize = AES_MIN_KEY_SIZE,
 					   .max_keysize = AES_MAX_KEY_SIZE,
 					   .ivsize = AES_BLOCK_SIZE,
@@ -4605,7 +4604,6 @@ static int spu_register_ablkcipher(struct iproc_alg_s *driver_alg)
 	crypto->cra_priority = cipher_pri;
 	crypto->cra_alignmask = 0;
 	crypto->cra_ctxsize = sizeof(struct iproc_ctx_s);
-	INIT_LIST_HEAD(&crypto->cra_list);
 
 	crypto->cra_init = ablkcipher_cra_init;
 	crypto->cra_exit = generic_cra_exit;
@@ -4652,12 +4650,16 @@ static int spu_register_ahash(struct iproc_alg_s *driver_alg)
 	hash->halg.statesize = sizeof(struct spu_hash_export_s);
 
 	if (driver_alg->auth_info.mode != HASH_MODE_HMAC) {
-		hash->setkey = ahash_setkey;
 		hash->init = ahash_init;
 		hash->update = ahash_update;
 		hash->final = ahash_final;
 		hash->finup = ahash_finup;
 		hash->digest = ahash_digest;
+		if ((driver_alg->auth_info.alg == HASH_ALG_AES) &&
+		    ((driver_alg->auth_info.mode == HASH_MODE_XCBC) ||
+		    (driver_alg->auth_info.mode == HASH_MODE_CMAC))) {
+			hash->setkey = ahash_setkey;
+		}
 	} else {
 		hash->setkey = ahash_hmac_setkey;
 		hash->init = ahash_hmac_init;
@@ -4687,7 +4689,6 @@ static int spu_register_aead(struct iproc_alg_s *driver_alg)
 	aead->base.cra_priority = aead_pri;
 	aead->base.cra_alignmask = 0;
 	aead->base.cra_ctxsize = sizeof(struct iproc_ctx_s);
-	INIT_LIST_HEAD(&aead->base.cra_list);
 
 	aead->base.cra_flags |= CRYPTO_ALG_ASYNC;
 	/* setkey set in alg initialization */
diff --git a/drivers/crypto/caam/caamalg.c b/drivers/crypto/caam/caamalg.c
@@ -72,6 +72,8 @@
 #define AUTHENC_DESC_JOB_IO_LEN	(AEAD_DESC_JOB_IO_LEN + \
 					 CAAM_CMD_SZ * 5)
 
+#define CHACHAPOLY_DESC_JOB_IO_LEN	(AEAD_DESC_JOB_IO_LEN + CAAM_CMD_SZ * 6)
+
 #define DESC_MAX_USED_BYTES		(CAAM_DESC_BYTES_MAX - DESC_JOB_IO_LEN)
 #define DESC_MAX_USED_LEN		(DESC_MAX_USED_BYTES / CAAM_CMD_SZ)
 
@@ -513,6 +515,61 @@ static int rfc4543_setauthsize(struct crypto_aead *authenc,
 	return 0;
 }
 
+static int chachapoly_set_sh_desc(struct crypto_aead *aead)
+{
+	struct caam_ctx *ctx = crypto_aead_ctx(aead);
+	struct device *jrdev = ctx->jrdev;
+	unsigned int ivsize = crypto_aead_ivsize(aead);
+	u32 *desc;
+
+	if (!ctx->cdata.keylen || !ctx->authsize)
+		return 0;
+
+	desc = ctx->sh_desc_enc;
+	cnstr_shdsc_chachapoly(desc, &ctx->cdata, &ctx->adata, ivsize,
+			       ctx->authsize, true, false);
+	dma_sync_single_for_device(jrdev, ctx->sh_desc_enc_dma,
+				   desc_bytes(desc), ctx->dir);
+
+	desc = ctx->sh_desc_dec;
+	cnstr_shdsc_chachapoly(desc, &ctx->cdata, &ctx->adata, ivsize,
+			       ctx->authsize, false, false);
+	dma_sync_single_for_device(jrdev, ctx->sh_desc_dec_dma,
+				   desc_bytes(desc), ctx->dir);
+
+	return 0;
+}
+
+static int chachapoly_setauthsize(struct crypto_aead *aead,
+				  unsigned int authsize)
+{
+	struct caam_ctx *ctx = crypto_aead_ctx(aead);
+
+	if (authsize != POLY1305_DIGEST_SIZE)
+		return -EINVAL;
+
+	ctx->authsize = authsize;
+	return chachapoly_set_sh_desc(aead);
+}
+
+static int chachapoly_setkey(struct crypto_aead *aead, const u8 *key,
+			     unsigned int keylen)
+{
+	struct caam_ctx *ctx = crypto_aead_ctx(aead);
+	unsigned int ivsize = crypto_aead_ivsize(aead);
+	unsigned int saltlen = CHACHAPOLY_IV_SIZE - ivsize;
+
+	if (keylen != CHACHA_KEY_SIZE + saltlen) {
+		crypto_aead_set_flags(aead, CRYPTO_TFM_RES_BAD_KEY_LEN);
+		return -EINVAL;
+	}
+
+	ctx->cdata.key_virt = key;
+	ctx->cdata.keylen = keylen - saltlen;
+
+	return chachapoly_set_sh_desc(aead);
+}
+
 static int aead_setkey(struct crypto_aead *aead,
 			       const u8 *key, unsigned int keylen)
 {
@@ -1031,6 +1088,40 @@ static void init_gcm_job(struct aead_request *req,
 	/* End of blank commands */
 }
 
+static void init_chachapoly_job(struct aead_request *req,
+				struct aead_edesc *edesc, bool all_contig,
+				bool encrypt)
+{
+	struct crypto_aead *aead = crypto_aead_reqtfm(req);
+	unsigned int ivsize = crypto_aead_ivsize(aead);
+	unsigned int assoclen = req->assoclen;
+	u32 *desc = edesc->hw_desc;
+	u32 ctx_iv_off = 4;
+
+	init_aead_job(req, edesc, all_contig, encrypt);
+
+	if (ivsize != CHACHAPOLY_IV_SIZE) {
+		/* IPsec specific: CONTEXT1[223:128] = {NONCE, IV} */
+		ctx_iv_off += 4;
+
+		/*
+		 * The associated data comes already with the IV but we need
+		 * to skip it when we authenticate or encrypt...
+		 */
+		assoclen -= ivsize;
+	}
+
+	append_math_add_imm_u32(desc, REG3, ZERO, IMM, assoclen);
+
+	/*
+	 * For IPsec load the IV further in the same register.
+	 * For RFC7539 simply load the 12 bytes nonce in a single operation
+	 */
+	append_load_as_imm(desc, req->iv, ivsize, LDST_CLASS_1_CCB |
+			   LDST_SRCDST_BYTE_CONTEXT |
+			   ctx_iv_off << LDST_OFFSET_SHIFT);
+}
+
 static void init_authenc_job(struct aead_request *req,
 			     struct aead_edesc *edesc,
 			     bool all_contig, bool encrypt)
@@ -1289,6 +1380,72 @@ static int gcm_encrypt(struct aead_request *req)
 	return ret;
 }
 
+static int chachapoly_encrypt(struct aead_request *req)
+{
+	struct aead_edesc *edesc;
+	struct crypto_aead *aead = crypto_aead_reqtfm(req);
+	struct caam_ctx *ctx = crypto_aead_ctx(aead);
+	struct device *jrdev = ctx->jrdev;
+	bool all_contig;
+	u32 *desc;
+	int ret;
+
+	edesc = aead_edesc_alloc(req, CHACHAPOLY_DESC_JOB_IO_LEN, &all_contig,
+				 true);
+	if (IS_ERR(edesc))
+		return PTR_ERR(edesc);
+
+	desc = edesc->hw_desc;
+
+	init_chachapoly_job(req, edesc, all_contig, true);
+	print_hex_dump_debug("chachapoly jobdesc@" __stringify(__LINE__)": ",
+			     DUMP_PREFIX_ADDRESS, 16, 4, desc, desc_bytes(desc),
+			     1);
+
+	ret = caam_jr_enqueue(jrdev, desc, aead_encrypt_done, req);
+	if (!ret) {
+		ret = -EINPROGRESS;
+	} else {
+		aead_unmap(jrdev, edesc, req);
+		kfree(edesc);
+	}
+
+	return ret;
+}
+
+static int chachapoly_decrypt(struct aead_request *req)
+{
+	struct aead_edesc *edesc;
+	struct crypto_aead *aead = crypto_aead_reqtfm(req);
+	struct caam_ctx *ctx = crypto_aead_ctx(aead);
+	struct device *jrdev = ctx->jrdev;
+	bool all_contig;
+	u32 *desc;
+	int ret;
+
+	edesc = aead_edesc_alloc(req, CHACHAPOLY_DESC_JOB_IO_LEN, &all_contig,
+				 false);
+	if (IS_ERR(edesc))
+		return PTR_ERR(edesc);
+
+	desc = edesc->hw_desc;
+
+	init_chachapoly_job(req, edesc, all_contig, false);
+	print_hex_dump_debug("chachapoly jobdesc@" __stringify(__LINE__)": ",
+			     DUMP_PREFIX_ADDRESS, 16, 4, desc, desc_bytes(desc),
+			     1);
+
+	ret = caam_jr_enqueue(jrdev, desc, aead_decrypt_done, req);
+	if (!ret) {
+		ret = -EINPROGRESS;
+	} else {
+		aead_unmap(jrdev, edesc, req);
+		kfree(edesc);
+	}
+
+	return ret;
+}
+
 static int ipsec_gcm_encrypt(struct aead_request *req)
 {
 	if (req->assoclen < 8)
@@ -3002,6 +3159,50 @@ static struct caam_aead_alg driver_aeads[] = {
 			.geniv = true,
 		},
 	},
+	{
+		.aead = {
+			.base = {
+				.cra_name = "rfc7539(chacha20,poly1305)",
+				.cra_driver_name = "rfc7539-chacha20-poly1305-"
+						   "caam",
+				.cra_blocksize = 1,
+			},
+			.setkey = chachapoly_setkey,
+			.setauthsize = chachapoly_setauthsize,
+			.encrypt = chachapoly_encrypt,
+			.decrypt = chachapoly_decrypt,
+			.ivsize = CHACHAPOLY_IV_SIZE,
+			.maxauthsize = POLY1305_DIGEST_SIZE,
+		},
+		.caam = {
+			.class1_alg_type = OP_ALG_ALGSEL_CHACHA20 |
+					   OP_ALG_AAI_AEAD,
+			.class2_alg_type = OP_ALG_ALGSEL_POLY1305 |
+					   OP_ALG_AAI_AEAD,
+		},
+	},
+	{
+		.aead = {
+			.base = {
+				.cra_name = "rfc7539esp(chacha20,poly1305)",
+				.cra_driver_name = "rfc7539esp-chacha20-"
+						   "poly1305-caam",
+				.cra_blocksize = 1,
+			},
+			.setkey = chachapoly_setkey,
+			.setauthsize = chachapoly_setauthsize,
+			.encrypt = chachapoly_encrypt,
+			.decrypt = chachapoly_decrypt,
+			.ivsize = 8,
+			.maxauthsize = POLY1305_DIGEST_SIZE,
+		},
+		.caam = {
+			.class1_alg_type = OP_ALG_ALGSEL_CHACHA20 |
+					   OP_ALG_AAI_AEAD,
+			.class2_alg_type = OP_ALG_ALGSEL_POLY1305 |
+					   OP_ALG_AAI_AEAD,
+		},
+	},
 };
 
 static int caam_init_common(struct caam_ctx *ctx, struct caam_alg_entry *caam,
@@ -3135,7 +3336,7 @@ static int __init caam_algapi_init(void)
 	struct device *ctrldev;
 	struct caam_drv_private *priv;
 	int i = 0, err = 0;
-	u32 cha_vid, cha_inst, des_inst, aes_inst, md_inst;
+	u32 aes_vid, aes_inst, des_inst, md_vid, md_inst, ccha_inst, ptha_inst;
 	unsigned int md_limit = SHA512_DIGEST_SIZE;
 	bool registered = false;
 
@@ -3168,14 +3369,38 @@
 	 * Register crypto algorithms the device supports.
 	 * First, detect presence and attributes of DES, AES, and MD blocks.
 	 */
-	cha_vid = rd_reg32(&priv->ctrl->perfmon.cha_id_ls);
-	cha_inst = rd_reg32(&priv->ctrl->perfmon.cha_num_ls);
-	des_inst = (cha_inst & CHA_ID_LS_DES_MASK) >> CHA_ID_LS_DES_SHIFT;
-	aes_inst = (cha_inst & CHA_ID_LS_AES_MASK) >> CHA_ID_LS_AES_SHIFT;
-	md_inst = (cha_inst & CHA_ID_LS_MD_MASK) >> CHA_ID_LS_MD_SHIFT;
+	if (priv->era < 10) {
+		u32 cha_vid, cha_inst;
+
+		cha_vid = rd_reg32(&priv->ctrl->perfmon.cha_id_ls);
+		aes_vid = cha_vid & CHA_ID_LS_AES_MASK;
+		md_vid = (cha_vid & CHA_ID_LS_MD_MASK) >> CHA_ID_LS_MD_SHIFT;
+
+		cha_inst = rd_reg32(&priv->ctrl->perfmon.cha_num_ls);
+		des_inst = (cha_inst & CHA_ID_LS_DES_MASK) >>
+			   CHA_ID_LS_DES_SHIFT;
+		aes_inst = cha_inst & CHA_ID_LS_AES_MASK;
+		md_inst = (cha_inst & CHA_ID_LS_MD_MASK) >> CHA_ID_LS_MD_SHIFT;
+		ccha_inst = 0;
+		ptha_inst = 0;
+	} else {
+		u32 aesa, mdha;
+
+		aesa = rd_reg32(&priv->ctrl->vreg.aesa);
+		mdha = rd_reg32(&priv->ctrl->vreg.mdha);
+
+		aes_vid = (aesa & CHA_VER_VID_MASK) >> CHA_VER_VID_SHIFT;
+		md_vid = (mdha & CHA_VER_VID_MASK) >> CHA_VER_VID_SHIFT;
+
+		des_inst = rd_reg32(&priv->ctrl->vreg.desa) & CHA_VER_NUM_MASK;
+		aes_inst = aesa & CHA_VER_NUM_MASK;
+		md_inst = mdha & CHA_VER_NUM_MASK;
+		ccha_inst = rd_reg32(&priv->ctrl->vreg.ccha) & CHA_VER_NUM_MASK;
+		ptha_inst = rd_reg32(&priv->ctrl->vreg.ptha) & CHA_VER_NUM_MASK;
+	}
 
 	/* If MD is present, limit digest size based on LP256 */
-	if (md_inst && ((cha_vid & CHA_ID_LS_MD_MASK) == CHA_ID_LS_MD_LP256))
+	if (md_inst && md_vid == CHA_VER_VID_MD_LP256)
 		md_limit = SHA256_DIGEST_SIZE;
 
 	for (i = 0; i < ARRAY_SIZE(driver_algs); i++) {
@@ -3196,10 +3421,10 @@ static int __init caam_algapi_init(void)
 		 * Check support for AES modes not available
 		 * on LP devices.
 		 */
-		if ((cha_vid & CHA_ID_LS_AES_MASK) == CHA_ID_LS_AES_LP)
-			if ((t_alg->caam.class1_alg_type & OP_ALG_AAI_MASK) ==
-			     OP_ALG_AAI_XTS)
-				continue;
+		if (aes_vid == CHA_VER_VID_AES_LP &&
+		    (t_alg->caam.class1_alg_type & OP_ALG_AAI_MASK) ==
+		    OP_ALG_AAI_XTS)
+			continue;
 
 		caam_skcipher_alg_init(t_alg);
 
@@ -3232,21 +3457,28 @@ static int __init caam_algapi_init(void)
 		if (!aes_inst && (c1_alg_sel == OP_ALG_ALGSEL_AES))
 				continue;
 
+		/* Skip CHACHA20 algorithms if not supported by device */
+		if (c1_alg_sel == OP_ALG_ALGSEL_CHACHA20 && !ccha_inst)
+			continue;
+
+		/* Skip POLY1305 algorithms if not supported by device */
+		if (c2_alg_sel == OP_ALG_ALGSEL_POLY1305 && !ptha_inst)
+			continue;
+
 		/*
 		 * Check support for AES algorithms not available
 		 * on LP devices.
 		 */
-		if ((cha_vid & CHA_ID_LS_AES_MASK) == CHA_ID_LS_AES_LP)
-			if (alg_aai == OP_ALG_AAI_GCM)
-				continue;
+		if (aes_vid == CHA_VER_VID_AES_LP && alg_aai == OP_ALG_AAI_GCM)
+			continue;
 
 		/*
 		 * Skip algorithms requiring message digests
 		 * if MD or MD size is not supported by device.
 		 */
-		if (c2_alg_sel &&
-		    (!md_inst || (t_alg->aead.maxauthsize > md_limit)))
+		if ((c2_alg_sel & ~OP_ALG_ALGSEL_SUBMASK) == 0x40 &&
+		    (!md_inst || t_alg->aead.maxauthsize > md_limit))
 			continue;
 
 		caam_aead_alg_init(t_alg);
 
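
The two templates registered above are reached through the generic AEAD API like any other implementation. A minimal in-kernel usage sketch follows, assuming a 4.20-era API; chachapoly_encrypt_once() and its parameters are ours, error unwinding is abbreviated, and buf must have room for the 16-byte Poly1305 tag after msglen bytes of plaintext.

    #include <crypto/aead.h>
    #include <linux/scatterlist.h>

    static int chachapoly_encrypt_once(u8 *buf, unsigned int msglen,
                                       const u8 key[32], const u8 nonce[12])
    {
            struct crypto_aead *tfm;
            struct aead_request *req;
            struct scatterlist sg;
            DECLARE_CRYPTO_WAIT(wait);
            int err;

            /* May resolve to rfc7539-chacha20-poly1305-caam or software. */
            tfm = crypto_alloc_aead("rfc7539(chacha20,poly1305)", 0, 0);
            if (IS_ERR(tfm))
                    return PTR_ERR(tfm);

            err = crypto_aead_setkey(tfm, key, 32);
            if (!err)
                    err = crypto_aead_setauthsize(tfm, 16);

            req = aead_request_alloc(tfm, GFP_KERNEL);
            if (!req) {
                    crypto_free_aead(tfm);
                    return -ENOMEM;
            }

            /* One buffer, in place: plaintext in, ciphertext + tag out. */
            sg_init_one(&sg, buf, msglen + 16);
            aead_request_set_callback(req, 0, crypto_req_done, &wait);
            aead_request_set_ad(req, 0);            /* no AAD in this sketch */
            aead_request_set_crypt(req, &sg, &sg, msglen, (u8 *)nonce);

            err = err ?: crypto_wait_req(crypto_aead_encrypt(req), &wait);

            aead_request_free(req);
            crypto_free_aead(tfm);
            return err;
    }
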
diff --git a/drivers/crypto/caam/caamalg_desc.c b/drivers/crypto/caam/caamalg_desc.c
@@ -1213,6 +1213,139 @@ void cnstr_shdsc_rfc4543_decap(u32 * const desc, struct alginfo *cdata,
 }
 EXPORT_SYMBOL(cnstr_shdsc_rfc4543_decap);
 
+/**
+ * cnstr_shdsc_chachapoly - Chacha20 + Poly1305 generic AEAD (rfc7539) and
+ *                          IPsec ESP (rfc7634, a.k.a. rfc7539esp) shared
+ *                          descriptor (non-protocol).
+ * @desc: pointer to buffer used for descriptor construction
+ * @cdata: pointer to block cipher transform definitions
+ *         Valid algorithm values - OP_ALG_ALGSEL_CHACHA20 ANDed with
+ *         OP_ALG_AAI_AEAD.
+ * @adata: pointer to authentication transform definitions
+ *         Valid algorithm values - OP_ALG_ALGSEL_POLY1305 ANDed with
+ *         OP_ALG_AAI_AEAD.
+ * @ivsize: initialization vector size
+ * @icvsize: integrity check value (ICV) size (truncated or full)
+ * @encap: true if encapsulation, false if decapsulation
+ * @is_qi: true when called from caam/qi
+ */
+void cnstr_shdsc_chachapoly(u32 * const desc, struct alginfo *cdata,
+			    struct alginfo *adata, unsigned int ivsize,
+			    unsigned int icvsize, const bool encap,
+			    const bool is_qi)
+{
+	u32 *key_jump_cmd, *wait_cmd;
+	u32 nfifo;
+	const bool is_ipsec = (ivsize != CHACHAPOLY_IV_SIZE);
+
+	/* Note: Context registers are saved. */
+	init_sh_desc(desc, HDR_SHARE_SERIAL | HDR_SAVECTX);
+
+	/* skip key loading if they are loaded due to sharing */
+	key_jump_cmd = append_jump(desc, JUMP_JSL | JUMP_TEST_ALL |
+				   JUMP_COND_SHRD);
+
+	append_key_as_imm(desc, cdata->key_virt, cdata->keylen, cdata->keylen,
+			  CLASS_1 | KEY_DEST_CLASS_REG);
+
+	/* For IPsec load the salt from keymat in the context register */
+	if (is_ipsec)
+		append_load_as_imm(desc, cdata->key_virt + cdata->keylen, 4,
+				   LDST_CLASS_1_CCB | LDST_SRCDST_BYTE_CONTEXT |
+				   4 << LDST_OFFSET_SHIFT);
+
+	set_jump_tgt_here(desc, key_jump_cmd);
+
+	/* Class 2 and 1 operations: Poly & ChaCha */
+	if (encap) {
+		append_operation(desc, adata->algtype | OP_ALG_AS_INITFINAL |
+				 OP_ALG_ENCRYPT);
+		append_operation(desc, cdata->algtype | OP_ALG_AS_INITFINAL |
+				 OP_ALG_ENCRYPT);
+	} else {
+		append_operation(desc, adata->algtype | OP_ALG_AS_INITFINAL |
+				 OP_ALG_DECRYPT | OP_ALG_ICV_ON);
+		append_operation(desc, cdata->algtype | OP_ALG_AS_INITFINAL |
+				 OP_ALG_DECRYPT);
+	}
+
+	if (is_qi) {
+		u32 *wait_load_cmd;
+		u32 ctx1_iv_off = is_ipsec ? 8 : 4;
+
+		/* REG3 = assoclen */
+		append_seq_load(desc, 4, LDST_CLASS_DECO |
+				LDST_SRCDST_WORD_DECO_MATH3 |
+				4 << LDST_OFFSET_SHIFT);
+
+		wait_load_cmd = append_jump(desc, JUMP_JSL | JUMP_TEST_ALL |
+					    JUMP_COND_CALM | JUMP_COND_NCP |
+					    JUMP_COND_NOP | JUMP_COND_NIP |
+					    JUMP_COND_NIFP);
+		set_jump_tgt_here(desc, wait_load_cmd);
+
+		append_seq_load(desc, ivsize, LDST_CLASS_1_CCB |
+				LDST_SRCDST_BYTE_CONTEXT |
+				ctx1_iv_off << LDST_OFFSET_SHIFT);
+	}
+
+	/*
+	 * MAGIC with NFIFO
+	 * Read associated data from the input and send them to class1 and
+	 * class2 alignment blocks. From class1 send data to output fifo and
+	 * then write it to memory since we don't need to encrypt AD.
+	 */
+	nfifo = NFIFOENTRY_DEST_BOTH | NFIFOENTRY_FC1 | NFIFOENTRY_FC2 |
+		NFIFOENTRY_DTYPE_POLY | NFIFOENTRY_BND;
+	append_load_imm_u32(desc, nfifo, LDST_CLASS_IND_CCB |
+			    LDST_SRCDST_WORD_INFO_FIFO_SM | LDLEN_MATH3);
+
+	append_math_add(desc, VARSEQINLEN, ZERO, REG3, CAAM_CMD_SZ);
+	append_math_add(desc, VARSEQOUTLEN, ZERO, REG3, CAAM_CMD_SZ);
+	append_seq_fifo_load(desc, 0, FIFOLD_TYPE_NOINFOFIFO |
+			     FIFOLD_CLASS_CLASS1 | LDST_VLF);
+	append_move_len(desc, MOVE_AUX_LS | MOVE_SRC_AUX_ABLK |
+			MOVE_DEST_OUTFIFO | MOVELEN_MRSEL_MATH3);
+	append_seq_fifo_store(desc, 0, FIFOST_TYPE_MESSAGE_DATA | LDST_VLF);
+
+	/* IPsec - copy IV at the output */
+	if (is_ipsec)
+		append_seq_fifo_store(desc, ivsize, FIFOST_TYPE_METADATA |
+				      0x2 << 25);
+
+	wait_cmd = append_jump(desc, JUMP_JSL | JUMP_TYPE_LOCAL |
+			       JUMP_COND_NOP | JUMP_TEST_ALL);
+	set_jump_tgt_here(desc, wait_cmd);
+
+	if (encap) {
+		/* Read and write cryptlen bytes */
+		append_math_add(desc, VARSEQINLEN, SEQINLEN, REG0, CAAM_CMD_SZ);
+		append_math_add(desc, VARSEQOUTLEN, SEQINLEN, REG0,
+				CAAM_CMD_SZ);
+		aead_append_src_dst(desc, FIFOLD_TYPE_MSG1OUT2);
+
+		/* Write ICV */
+		append_seq_store(desc, icvsize, LDST_CLASS_2_CCB |
+				 LDST_SRCDST_BYTE_CONTEXT);
+	} else {
+		/* Read and write cryptlen bytes */
+		append_math_add(desc, VARSEQINLEN, SEQOUTLEN, REG0,
+				CAAM_CMD_SZ);
+		append_math_add(desc, VARSEQOUTLEN, SEQOUTLEN, REG0,
+				CAAM_CMD_SZ);
+		aead_append_src_dst(desc, FIFOLD_TYPE_MSG);
+
+		/* Load ICV for verification */
+		append_seq_fifo_load(desc, icvsize, FIFOLD_CLASS_CLASS2 |
+				     FIFOLD_TYPE_LAST2 | FIFOLD_TYPE_ICV);
+	}
+
+	print_hex_dump_debug("chachapoly shdesc@" __stringify(__LINE__)": ",
+			     DUMP_PREFIX_ADDRESS, 16, 4, desc, desc_bytes(desc),
+			     1);
+}
+EXPORT_SYMBOL(cnstr_shdsc_chachapoly);
+
 /* For skcipher encrypt and decrypt, read from req->src and write to req->dst */
 static inline void skcipher_append_src_dst(u32 *desc)
 {
@@ -1228,7 +1361,8 @@ static inline void skcipher_append_src_dst(u32 *desc)
  * @desc: pointer to buffer used for descriptor construction
  * @cdata: pointer to block cipher transform definitions
  *         Valid algorithm values - one of OP_ALG_ALGSEL_{AES, DES, 3DES} ANDed
- *         with OP_ALG_AAI_CBC or OP_ALG_AAI_CTR_MOD128.
+ *         with OP_ALG_AAI_CBC or OP_ALG_AAI_CTR_MOD128
+ *                                - OP_ALG_ALGSEL_CHACHA20
  * @ivsize: initialization vector size
  * @is_rfc3686: true when ctr(aes) is wrapped by rfc3686 template
  * @ctx1_iv_off: IV offset in CONTEXT1 register
@@ -1293,7 +1427,8 @@ EXPORT_SYMBOL(cnstr_shdsc_skcipher_encap);
  * @desc: pointer to buffer used for descriptor construction
  * @cdata: pointer to block cipher transform definitions
  *         Valid algorithm values - one of OP_ALG_ALGSEL_{AES, DES, 3DES} ANDed
- *         with OP_ALG_AAI_CBC or OP_ALG_AAI_CTR_MOD128.
+ *         with OP_ALG_AAI_CBC or OP_ALG_AAI_CTR_MOD128
+ *                                - OP_ALG_ALGSEL_CHACHA20
  * @ivsize: initialization vector size
  * @is_rfc3686: true when ctr(aes) is wrapped by rfc3686 template
  * @ctx1_iv_off: IV offset in CONTEXT1 register
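
In cnstr_shdsc_chachapoly() above, ivsize != CHACHAPOLY_IV_SIZE is what selects the IPsec (rfc7539esp) path: setkey passes a 36-byte keymat whose last 4 bytes are the salt loaded into the context register, and the 8-byte per-packet IV completes the 96-bit nonce. A sketch of that RFC 7634 layout; the helper name is ours:

    #include <string.h>

    #define CHACHA_KEY_SIZE    32
    #define CHACHAPOLY_IV_SIZE 12

    /* nonce = 4-byte salt (tail of keymat) || 8-byte per-packet IV */
    static void rfc7539esp_nonce(const unsigned char keymat[CHACHA_KEY_SIZE + 4],
                                 const unsigned char iv[8],
                                 unsigned char nonce[CHACHAPOLY_IV_SIZE])
    {
            memcpy(nonce, keymat + CHACHA_KEY_SIZE, 4);
            memcpy(nonce + 4, iv, 8);
    }
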
diff --git a/drivers/crypto/caam/caamalg_desc.h b/drivers/crypto/caam/caamalg_desc.h
@@ -96,6 +96,11 @@ void cnstr_shdsc_rfc4543_decap(u32 * const desc, struct alginfo *cdata,
 			       unsigned int ivsize, unsigned int icvsize,
 			       const bool is_qi);
 
+void cnstr_shdsc_chachapoly(u32 * const desc, struct alginfo *cdata,
+			    struct alginfo *adata, unsigned int ivsize,
+			    unsigned int icvsize, const bool encap,
+			    const bool is_qi);
+
 void cnstr_shdsc_skcipher_encap(u32 * const desc, struct alginfo *cdata,
 				unsigned int ivsize, const bool is_rfc3686,
 				const u32 ctx1_iv_off);
diff --git a/drivers/crypto/caam/caamalg_qi.c b/drivers/crypto/caam/caamalg_qi.c
@@ -2462,7 +2462,7 @@ static int __init caam_qi_algapi_init(void)
 	struct device *ctrldev;
 	struct caam_drv_private *priv;
 	int i = 0, err = 0;
-	u32 cha_vid, cha_inst, des_inst, aes_inst, md_inst;
+	u32 aes_vid, aes_inst, des_inst, md_vid, md_inst;
 	unsigned int md_limit = SHA512_DIGEST_SIZE;
 	bool registered = false;
 
@@ -2497,14 +2497,34 @@
 	 * Register crypto algorithms the device supports.
 	 * First, detect presence and attributes of DES, AES, and MD blocks.
 	 */
-	cha_vid = rd_reg32(&priv->ctrl->perfmon.cha_id_ls);
-	cha_inst = rd_reg32(&priv->ctrl->perfmon.cha_num_ls);
-	des_inst = (cha_inst & CHA_ID_LS_DES_MASK) >> CHA_ID_LS_DES_SHIFT;
-	aes_inst = (cha_inst & CHA_ID_LS_AES_MASK) >> CHA_ID_LS_AES_SHIFT;
-	md_inst = (cha_inst & CHA_ID_LS_MD_MASK) >> CHA_ID_LS_MD_SHIFT;
+	if (priv->era < 10) {
+		u32 cha_vid, cha_inst;
+
+		cha_vid = rd_reg32(&priv->ctrl->perfmon.cha_id_ls);
+		aes_vid = cha_vid & CHA_ID_LS_AES_MASK;
+		md_vid = (cha_vid & CHA_ID_LS_MD_MASK) >> CHA_ID_LS_MD_SHIFT;
+
+		cha_inst = rd_reg32(&priv->ctrl->perfmon.cha_num_ls);
+		des_inst = (cha_inst & CHA_ID_LS_DES_MASK) >>
+			   CHA_ID_LS_DES_SHIFT;
+		aes_inst = cha_inst & CHA_ID_LS_AES_MASK;
+		md_inst = (cha_inst & CHA_ID_LS_MD_MASK) >> CHA_ID_LS_MD_SHIFT;
+	} else {
+		u32 aesa, mdha;
+
+		aesa = rd_reg32(&priv->ctrl->vreg.aesa);
+		mdha = rd_reg32(&priv->ctrl->vreg.mdha);
+
+		aes_vid = (aesa & CHA_VER_VID_MASK) >> CHA_VER_VID_SHIFT;
+		md_vid = (mdha & CHA_VER_VID_MASK) >> CHA_VER_VID_SHIFT;
+
+		des_inst = rd_reg32(&priv->ctrl->vreg.desa) & CHA_VER_NUM_MASK;
+		aes_inst = aesa & CHA_VER_NUM_MASK;
+		md_inst = mdha & CHA_VER_NUM_MASK;
+	}
 
 	/* If MD is present, limit digest size based on LP256 */
-	if (md_inst && ((cha_vid & CHA_ID_LS_MD_MASK) == CHA_ID_LS_MD_LP256))
+	if (md_inst && md_vid == CHA_VER_VID_MD_LP256)
 		md_limit = SHA256_DIGEST_SIZE;
 
 	for (i = 0; i < ARRAY_SIZE(driver_algs); i++) {
@@ -2556,8 +2576,7 @@ static int __init caam_qi_algapi_init(void)
 		 * Check support for AES algorithms not available
 		 * on LP devices.
 		 */
-		if (((cha_vid & CHA_ID_LS_AES_MASK) == CHA_ID_LS_AES_LP) &&
-		    (alg_aai == OP_ALG_AAI_GCM))
+		if (aes_vid == CHA_VER_VID_AES_LP && alg_aai == OP_ALG_AAI_GCM)
 			continue;
 
 		/*
diff --git a/drivers/crypto/caam/caamalg_qi2.c b/drivers/crypto/caam/caamalg_qi2.c
@@ -462,7 +462,15 @@ static struct aead_edesc *aead_edesc_alloc(struct aead_request *req,
 	edesc->dst_nents = dst_nents;
 	edesc->iv_dma = iv_dma;
 
-	edesc->assoclen = cpu_to_caam32(req->assoclen);
+	if ((alg->caam.class1_alg_type & OP_ALG_ALGSEL_MASK) ==
+	    OP_ALG_ALGSEL_CHACHA20 && ivsize != CHACHAPOLY_IV_SIZE)
+		/*
+		 * The associated data comes already with the IV but we need
+		 * to skip it when we authenticate or encrypt...
+		 */
+		edesc->assoclen = cpu_to_caam32(req->assoclen - ivsize);
+	else
+		edesc->assoclen = cpu_to_caam32(req->assoclen);
 	edesc->assoclen_dma = dma_map_single(dev, &edesc->assoclen, 4,
 					     DMA_TO_DEVICE);
 	if (dma_mapping_error(dev, edesc->assoclen_dma)) {
@@ -532,6 +540,68 @@ static struct aead_edesc *aead_edesc_alloc(struct aead_request *req,
 	return edesc;
 }
 
+static int chachapoly_set_sh_desc(struct crypto_aead *aead)
+{
+	struct caam_ctx *ctx = crypto_aead_ctx(aead);
+	unsigned int ivsize = crypto_aead_ivsize(aead);
+	struct device *dev = ctx->dev;
+	struct caam_flc *flc;
+	u32 *desc;
+
+	if (!ctx->cdata.keylen || !ctx->authsize)
+		return 0;
+
+	flc = &ctx->flc[ENCRYPT];
+	desc = flc->sh_desc;
+	cnstr_shdsc_chachapoly(desc, &ctx->cdata, &ctx->adata, ivsize,
+			       ctx->authsize, true, true);
+	flc->flc[1] = cpu_to_caam32(desc_len(desc)); /* SDL */
+	dma_sync_single_for_device(dev, ctx->flc_dma[ENCRYPT],
+				   sizeof(flc->flc) + desc_bytes(desc),
+				   ctx->dir);
+
+	flc = &ctx->flc[DECRYPT];
+	desc = flc->sh_desc;
+	cnstr_shdsc_chachapoly(desc, &ctx->cdata, &ctx->adata, ivsize,
+			       ctx->authsize, false, true);
+	flc->flc[1] = cpu_to_caam32(desc_len(desc)); /* SDL */
+	dma_sync_single_for_device(dev, ctx->flc_dma[DECRYPT],
+				   sizeof(flc->flc) + desc_bytes(desc),
+				   ctx->dir);
+
+	return 0;
+}
+
+static int chachapoly_setauthsize(struct crypto_aead *aead,
+				  unsigned int authsize)
+{
+	struct caam_ctx *ctx = crypto_aead_ctx(aead);
+
+	if (authsize != POLY1305_DIGEST_SIZE)
+		return -EINVAL;
+
+	ctx->authsize = authsize;
+	return chachapoly_set_sh_desc(aead);
+}
+
+static int chachapoly_setkey(struct crypto_aead *aead, const u8 *key,
+			     unsigned int keylen)
+{
+	struct caam_ctx *ctx = crypto_aead_ctx(aead);
+	unsigned int ivsize = crypto_aead_ivsize(aead);
+	unsigned int saltlen = CHACHAPOLY_IV_SIZE - ivsize;
+
+	if (keylen != CHACHA_KEY_SIZE + saltlen) {
+		crypto_aead_set_flags(aead, CRYPTO_TFM_RES_BAD_KEY_LEN);
+		return -EINVAL;
+	}
+
+	ctx->cdata.key_virt = key;
+	ctx->cdata.keylen = keylen - saltlen;
+
+	return chachapoly_set_sh_desc(aead);
+}
+
 static int gcm_set_sh_desc(struct crypto_aead *aead)
 {
 	struct caam_ctx *ctx = crypto_aead_ctx(aead);
@@ -816,7 +886,9 @@ static int skcipher_setkey(struct crypto_skcipher *skcipher, const u8 *key,
 	u32 *desc;
 	u32 ctx1_iv_off = 0;
 	const bool ctr_mode = ((ctx->cdata.algtype & OP_ALG_AAI_MASK) ==
-			       OP_ALG_AAI_CTR_MOD128);
+			       OP_ALG_AAI_CTR_MOD128) &&
+			       ((ctx->cdata.algtype & OP_ALG_ALGSEL_MASK) !=
+			       OP_ALG_ALGSEL_CHACHA20);
 	const bool is_rfc3686 = alg->caam.rfc3686;
 
 	print_hex_dump_debug("key in @" __stringify(__LINE__)": ",
@@ -1494,7 +1566,23 @@ static struct caam_skcipher_alg driver_algs[] = {
 			.ivsize = AES_BLOCK_SIZE,
 		},
 		.caam.class1_alg_type = OP_ALG_ALGSEL_AES | OP_ALG_AAI_XTS,
-	}
+	},
+	{
+		.skcipher = {
+			.base = {
+				.cra_name = "chacha20",
+				.cra_driver_name = "chacha20-caam-qi2",
+				.cra_blocksize = 1,
+			},
+			.setkey = skcipher_setkey,
+			.encrypt = skcipher_encrypt,
+			.decrypt = skcipher_decrypt,
+			.min_keysize = CHACHA_KEY_SIZE,
+			.max_keysize = CHACHA_KEY_SIZE,
+			.ivsize = CHACHA_IV_SIZE,
+		},
+		.caam.class1_alg_type = OP_ALG_ALGSEL_CHACHA20,
+	},
 };
 
 static struct caam_aead_alg driver_aeads[] = {
@@ -2608,6 +2696,50 @@ static struct caam_aead_alg driver_aeads[] = {
 			.geniv = true,
 		},
 	},
+	{
+		.aead = {
+			.base = {
+				.cra_name = "rfc7539(chacha20,poly1305)",
+				.cra_driver_name = "rfc7539-chacha20-poly1305-"
+						   "caam-qi2",
+				.cra_blocksize = 1,
+			},
+			.setkey = chachapoly_setkey,
+			.setauthsize = chachapoly_setauthsize,
+			.encrypt = aead_encrypt,
+			.decrypt = aead_decrypt,
+			.ivsize = CHACHAPOLY_IV_SIZE,
+			.maxauthsize = POLY1305_DIGEST_SIZE,
+		},
+		.caam = {
+			.class1_alg_type = OP_ALG_ALGSEL_CHACHA20 |
+					   OP_ALG_AAI_AEAD,
+			.class2_alg_type = OP_ALG_ALGSEL_POLY1305 |
+					   OP_ALG_AAI_AEAD,
+		},
+	},
+	{
+		.aead = {
+			.base = {
+				.cra_name = "rfc7539esp(chacha20,poly1305)",
+				.cra_driver_name = "rfc7539esp-chacha20-"
+						   "poly1305-caam-qi2",
+				.cra_blocksize = 1,
+			},
+			.setkey = chachapoly_setkey,
+			.setauthsize = chachapoly_setauthsize,
+			.encrypt = aead_encrypt,
+			.decrypt = aead_decrypt,
+			.ivsize = 8,
+			.maxauthsize = POLY1305_DIGEST_SIZE,
+		},
+		.caam = {
+			.class1_alg_type = OP_ALG_ALGSEL_CHACHA20 |
+					   OP_ALG_AAI_AEAD,
+			.class2_alg_type = OP_ALG_ALGSEL_POLY1305 |
+					   OP_ALG_AAI_AEAD,
+		},
+	},
 	{
 		.aead = {
 			.base = {
@@ -4908,6 +5040,11 @@ static int dpaa2_caam_probe(struct fsl_mc_device *dpseci_dev)
 		    alg_sel == OP_ALG_ALGSEL_AES)
 			continue;
 
+		/* Skip CHACHA20 algorithms if not supported by device */
+		if (alg_sel == OP_ALG_ALGSEL_CHACHA20 &&
+		    !priv->sec_attr.ccha_acc_num)
+			continue;
+
 		t_alg->caam.dev = dev;
 		caam_skcipher_alg_init(t_alg);
 
@@ -4940,11 +5077,22 @@ static int dpaa2_caam_probe(struct fsl_mc_device *dpseci_dev)
 		    c1_alg_sel == OP_ALG_ALGSEL_AES)
 			continue;
 
+		/* Skip CHACHA20 algorithms if not supported by device */
+		if (c1_alg_sel == OP_ALG_ALGSEL_CHACHA20 &&
+		    !priv->sec_attr.ccha_acc_num)
+			continue;
+
+		/* Skip POLY1305 algorithms if not supported by device */
+		if (c2_alg_sel == OP_ALG_ALGSEL_POLY1305 &&
+		    !priv->sec_attr.ptha_acc_num)
+			continue;
+
 		/*
 		 * Skip algorithms requiring message digests
 		 * if MD not supported by device.
 		 */
-		if (!priv->sec_attr.md_acc_num && c2_alg_sel)
+		if ((c2_alg_sel & ~OP_ALG_ALGSEL_SUBMASK) == 0x40 &&
+		    !priv->sec_attr.md_acc_num)
 			continue;
 
 		t_alg->caam.dev = dev;
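
The aead_edesc_alloc() change above mirrors init_chachapoly_job() on the job-ring side: in the rfc7539esp case the 8-byte IV arrives appended to the associated data, so the length handed to the hardware must exclude it. As a one-line sketch (helper name ours):

    /* Length of the AAD as the CAAM descriptor should see it. */
    static inline u32 chachapoly_hw_assoclen(u32 assoclen, u32 ivsize,
                                             bool is_ipsec_esp)
    {
            return is_ipsec_esp ? assoclen - ivsize : assoclen;
    }
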
diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c
@@ -3,6 +3,7 @@
  * caam - Freescale FSL CAAM support for ahash functions of crypto API
  *
  * Copyright 2011 Freescale Semiconductor, Inc.
+ * Copyright 2018 NXP
  *
  * Based on caamalg.c crypto API driver.
  *
@@ -1801,7 +1802,7 @@ static int __init caam_algapi_hash_init(void)
 	int i = 0, err = 0;
 	struct caam_drv_private *priv;
 	unsigned int md_limit = SHA512_DIGEST_SIZE;
-	u32 cha_inst, cha_vid;
+	u32 md_inst, md_vid;
 
 	dev_node = of_find_compatible_node(NULL, NULL, "fsl,sec-v4.0");
 	if (!dev_node) {
@@ -1831,18 +1832,27 @@ static int __init caam_algapi_hash_init(void)
 	 * Register crypto algorithms the device supports.  First, identify
 	 * presence and attributes of MD block.
 	 */
-	cha_vid = rd_reg32(&priv->ctrl->perfmon.cha_id_ls);
-	cha_inst = rd_reg32(&priv->ctrl->perfmon.cha_num_ls);
+	if (priv->era < 10) {
+		md_vid = (rd_reg32(&priv->ctrl->perfmon.cha_id_ls) &
+			  CHA_ID_LS_MD_MASK) >> CHA_ID_LS_MD_SHIFT;
+		md_inst = (rd_reg32(&priv->ctrl->perfmon.cha_num_ls) &
+			   CHA_ID_LS_MD_MASK) >> CHA_ID_LS_MD_SHIFT;
+	} else {
+		u32 mdha = rd_reg32(&priv->ctrl->vreg.mdha);
+
+		md_vid = (mdha & CHA_VER_VID_MASK) >> CHA_VER_VID_SHIFT;
+		md_inst = mdha & CHA_VER_NUM_MASK;
+	}
 
 	/*
 	 * Skip registration of any hashing algorithms if MD block
 	 * is not present.
 	 */
-	if (!((cha_inst & CHA_ID_LS_MD_MASK) >> CHA_ID_LS_MD_SHIFT))
+	if (!md_inst)
 		return -ENODEV;
 
 	/* Limit digest size based on LP256 */
-	if ((cha_vid & CHA_ID_LS_MD_MASK) == CHA_ID_LS_MD_LP256)
+	if (md_vid == CHA_VER_VID_MD_LP256)
 		md_limit = SHA256_DIGEST_SIZE;
 
 	INIT_LIST_HEAD(&hash_list);
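
This is the same Era-10 split already applied to caamalg.c and caamalg_qi.c above and repeated below for caampkc.c and caamrng.c: pre-Era-10 parts pack per-block instantiation counts into the perfmon CHA_NUM_LS word, while Era 10+ parts expose one version register per accelerator block. A hedged sketch of the recurring shape; the helper name is ours and error handling is omitted:

    /* How many instances of one accelerator block the part has. */
    static u32 block_instances(struct caam_drv_private *priv,
                               u32 legacy_mask, int legacy_shift,
                               u32 __iomem *vreg)
    {
            if (priv->era < 10)     /* packed CHA_NUM_LS bitfield */
                    return (rd_reg32(&priv->ctrl->perfmon.cha_num_ls) &
                            legacy_mask) >> legacy_shift;
            /* Era 10+: per-block version register, count in low bits */
            return rd_reg32(vreg) & CHA_VER_NUM_MASK;
    }
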
diff --git a/drivers/crypto/caam/caampkc.c b/drivers/crypto/caam/caampkc.c
@@ -3,6 +3,7 @@
  * caam - Freescale FSL CAAM support for Public Key Cryptography
  *
  * Copyright 2016 Freescale Semiconductor, Inc.
+ * Copyright 2018 NXP
  *
  * There is no Shared Descriptor for PKC so that the Job Descriptor must carry
  * all the desired key parameters, input and output pointers.
@@ -1017,7 +1018,7 @@ static int __init caam_pkc_init(void)
 	struct platform_device *pdev;
 	struct device *ctrldev;
 	struct caam_drv_private *priv;
-	u32 cha_inst, pk_inst;
+	u32 pk_inst;
 	int err;
 
 	dev_node = of_find_compatible_node(NULL, NULL, "fsl,sec-v4.0");
@@ -1045,8 +1046,11 @@ static int __init caam_pkc_init(void)
 		return -ENODEV;
 
 	/* Determine public key hardware accelerator presence. */
-	cha_inst = rd_reg32(&priv->ctrl->perfmon.cha_num_ls);
-	pk_inst = (cha_inst & CHA_ID_LS_PK_MASK) >> CHA_ID_LS_PK_SHIFT;
+	if (priv->era < 10)
+		pk_inst = (rd_reg32(&priv->ctrl->perfmon.cha_num_ls) &
+			   CHA_ID_LS_PK_MASK) >> CHA_ID_LS_PK_SHIFT;
+	else
+		pk_inst = rd_reg32(&priv->ctrl->vreg.pkha) & CHA_VER_NUM_MASK;
 
 	/* Do not register algorithms if PKHA is not present. */
 	if (!pk_inst)
diff --git a/drivers/crypto/caam/caamrng.c b/drivers/crypto/caam/caamrng.c
@@ -3,6 +3,7 @@
  * caam - Freescale FSL CAAM support for hw_random
  *
  * Copyright 2011 Freescale Semiconductor, Inc.
+ * Copyright 2018 NXP
 *
 * Based on caamalg.c crypto API driver.
 *
@@ -309,6 +310,7 @@ static int __init caam_rng_init(void)
 	struct platform_device *pdev;
 	struct device *ctrldev;
 	struct caam_drv_private *priv;
+	u32 rng_inst;
 	int err;
 
 	dev_node = of_find_compatible_node(NULL, NULL, "fsl,sec-v4.0");
@@ -336,7 +338,13 @@ static int __init caam_rng_init(void)
 		return -ENODEV;
 
 	/* Check for an instantiated RNG before registration */
-	if (!(rd_reg32(&priv->ctrl->perfmon.cha_num_ls) & CHA_ID_LS_RNG_MASK))
+	if (priv->era < 10)
+		rng_inst = (rd_reg32(&priv->ctrl->perfmon.cha_num_ls) &
+			    CHA_ID_LS_RNG_MASK) >> CHA_ID_LS_RNG_SHIFT;
+	else
+		rng_inst = rd_reg32(&priv->ctrl->vreg.rng) & CHA_VER_NUM_MASK;
+
+	if (!rng_inst)
 		return -ENODEV;
 
 	dev = caam_jr_alloc();
diff --git a/drivers/crypto/caam/compat.h b/drivers/crypto/caam/compat.h
@@ -36,6 +36,8 @@
 #include <crypto/gcm.h>
 #include <crypto/sha.h>
 #include <crypto/md5.h>
+#include <crypto/chacha.h>
+#include <crypto/poly1305.h>
 #include <crypto/internal/aead.h>
 #include <crypto/authenc.h>
 #include <crypto/akcipher.h>
diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
@@ -3,6 +3,7 @@
  * Controller-level driver, kernel property detection, initialization
  *
  * Copyright 2008-2012 Freescale Semiconductor, Inc.
+ * Copyright 2018 NXP
  */
 
 #include <linux/device.h>
@@ -106,7 +107,7 @@ static inline int run_descriptor_deco0(struct device *ctrldev, u32 *desc,
 	struct caam_ctrl __iomem *ctrl = ctrlpriv->ctrl;
 	struct caam_deco __iomem *deco = ctrlpriv->deco;
 	unsigned int timeout = 100000;
-	u32 deco_dbg_reg, flags;
+	u32 deco_dbg_reg, deco_state, flags;
 	int i;
 
 
@@ -149,13 +150,22 @@
 	timeout = 10000000;
 	do {
 		deco_dbg_reg = rd_reg32(&deco->desc_dbg);
+
+		if (ctrlpriv->era < 10)
+			deco_state = (deco_dbg_reg & DESC_DBG_DECO_STAT_MASK) >>
+				     DESC_DBG_DECO_STAT_SHIFT;
+		else
+			deco_state = (rd_reg32(&deco->dbg_exec) &
+				      DESC_DER_DECO_STAT_MASK) >>
+				     DESC_DER_DECO_STAT_SHIFT;
+
 		/*
 		 * If an error occured in the descriptor, then
 		 * the DECO status field will be set to 0x0D
 		 */
-		if ((deco_dbg_reg & DESC_DBG_DECO_STAT_MASK) ==
-		    DESC_DBG_DECO_STAT_HOST_ERR)
+		if (deco_state == DECO_STAT_HOST_ERR)
 			break;
 
 		cpu_relax();
 	} while ((deco_dbg_reg & DESC_DBG_DECO_STAT_VALID) && --timeout);
@@ -491,7 +501,7 @@ static int caam_probe(struct platform_device *pdev)
 	struct caam_perfmon *perfmon;
 #endif
 	u32 scfgr, comp_params;
-	u32 cha_vid_ls;
+	u8 rng_vid;
 	int pg_size;
 	int BLOCK_OFFSET = 0;
 
@@ -733,15 +743,19 @@ static int caam_probe(struct platform_device *pdev)
 		goto caam_remove;
 	}
 
-	cha_vid_ls = rd_reg32(&ctrl->perfmon.cha_id_ls);
+	if (ctrlpriv->era < 10)
+		rng_vid = (rd_reg32(&ctrl->perfmon.cha_id_ls) &
+			   CHA_ID_LS_RNG_MASK) >> CHA_ID_LS_RNG_SHIFT;
+	else
+		rng_vid = (rd_reg32(&ctrl->vreg.rng) & CHA_VER_VID_MASK) >>
+			   CHA_VER_VID_SHIFT;
 
 	/*
 	 * If SEC has RNG version >= 4 and RNG state handle has not been
 	 * already instantiated, do RNG instantiation
 	 * In case of SoCs with Management Complex, RNG is managed by MC f/w.
 	 */
-	if (!ctrlpriv->mc_en &&
-	    (cha_vid_ls & CHA_ID_LS_RNG_MASK) >> CHA_ID_LS_RNG_SHIFT >= 4) {
+	if (!ctrlpriv->mc_en && rng_vid >= 4) {
 		ctrlpriv->rng4_sh_init =
 			rd_reg32(&ctrl->r4tst[0].rdsta);
 		/*
@ -4,6 +4,7 @@
|
||||||
* Definitions to support CAAM descriptor instruction generation
|
* Definitions to support CAAM descriptor instruction generation
|
||||||
*
|
*
|
||||||
* Copyright 2008-2011 Freescale Semiconductor, Inc.
|
* Copyright 2008-2011 Freescale Semiconductor, Inc.
|
||||||
|
* Copyright 2018 NXP
|
||||||
*/
|
*/
|
||||||
|
|
||||||
#ifndef DESC_H
|
#ifndef DESC_H
|
||||||
|
@ -242,6 +243,7 @@
|
||||||
#define LDST_SRCDST_WORD_DESCBUF_SHARED (0x42 << LDST_SRCDST_SHIFT)
|
#define LDST_SRCDST_WORD_DESCBUF_SHARED (0x42 << LDST_SRCDST_SHIFT)
|
||||||
#define LDST_SRCDST_WORD_DESCBUF_JOB_WE (0x45 << LDST_SRCDST_SHIFT)
|
#define LDST_SRCDST_WORD_DESCBUF_JOB_WE (0x45 << LDST_SRCDST_SHIFT)
|
||||||
#define LDST_SRCDST_WORD_DESCBUF_SHARED_WE (0x46 << LDST_SRCDST_SHIFT)
|
#define LDST_SRCDST_WORD_DESCBUF_SHARED_WE (0x46 << LDST_SRCDST_SHIFT)
|
||||||
|
#define LDST_SRCDST_WORD_INFO_FIFO_SM (0x71 << LDST_SRCDST_SHIFT)
|
||||||
#define LDST_SRCDST_WORD_INFO_FIFO (0x7a << LDST_SRCDST_SHIFT)
|
#define LDST_SRCDST_WORD_INFO_FIFO (0x7a << LDST_SRCDST_SHIFT)
|
||||||
|
|
||||||
/* Offset in source/destination */
|
/* Offset in source/destination */
|
||||||
|
@ -284,6 +286,12 @@
|
||||||
#define LDLEN_SET_OFIFO_OFFSET_SHIFT 0
|
#define LDLEN_SET_OFIFO_OFFSET_SHIFT 0
|
||||||
#define LDLEN_SET_OFIFO_OFFSET_MASK (3 << LDLEN_SET_OFIFO_OFFSET_SHIFT)
|
#define LDLEN_SET_OFIFO_OFFSET_MASK (3 << LDLEN_SET_OFIFO_OFFSET_SHIFT)
|
||||||
|
|
||||||
|
/* Special Length definitions when dst=sm, nfifo-{sm,m} */
|
||||||
|
#define LDLEN_MATH0 0
|
||||||
|
#define LDLEN_MATH1 1
|
||||||
|
#define LDLEN_MATH2 2
|
||||||
|
#define LDLEN_MATH3 3
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* FIFO_LOAD/FIFO_STORE/SEQ_FIFO_LOAD/SEQ_FIFO_STORE
|
* FIFO_LOAD/FIFO_STORE/SEQ_FIFO_LOAD/SEQ_FIFO_STORE
|
||||||
* Command Constructs
|
* Command Constructs
|
||||||
|
@ -408,6 +416,7 @@
|
||||||
#define FIFOST_TYPE_MESSAGE_DATA (0x30 << FIFOST_TYPE_SHIFT)
|
#define FIFOST_TYPE_MESSAGE_DATA (0x30 << FIFOST_TYPE_SHIFT)
|
||||||
#define FIFOST_TYPE_RNGSTORE (0x34 << FIFOST_TYPE_SHIFT)
|
#define FIFOST_TYPE_RNGSTORE (0x34 << FIFOST_TYPE_SHIFT)
|
||||||
#define FIFOST_TYPE_RNGFIFO (0x35 << FIFOST_TYPE_SHIFT)
|
#define FIFOST_TYPE_RNGFIFO (0x35 << FIFOST_TYPE_SHIFT)
|
||||||
|
#define FIFOST_TYPE_METADATA (0x3e << FIFOST_TYPE_SHIFT)
|
||||||
diff --git a/drivers/crypto/caam/desc.h b/drivers/crypto/caam/desc.h
 #define FIFOST_TYPE_SKIP (0x3f << FIFOST_TYPE_SHIFT)
 
 /*
@@ -1133,6 +1142,12 @@
 #define OP_ALG_TYPE_CLASS1 (2 << OP_ALG_TYPE_SHIFT)
 #define OP_ALG_TYPE_CLASS2 (4 << OP_ALG_TYPE_SHIFT)
 
+/* version register fields */
+#define OP_VER_CCHA_NUM  0x000000ff /* Number CCHAs instantiated */
+#define OP_VER_CCHA_MISC 0x0000ff00 /* CCHA Miscellaneous Information */
+#define OP_VER_CCHA_REV  0x00ff0000 /* CCHA Revision Number */
+#define OP_VER_CCHA_VID  0xff000000 /* CCHA Version ID */
+
 #define OP_ALG_ALGSEL_SHIFT 16
 #define OP_ALG_ALGSEL_MASK (0xff << OP_ALG_ALGSEL_SHIFT)
 #define OP_ALG_ALGSEL_SUBMASK (0x0f << OP_ALG_ALGSEL_SHIFT)
@@ -1152,6 +1167,8 @@
 #define OP_ALG_ALGSEL_KASUMI (0x70 << OP_ALG_ALGSEL_SHIFT)
 #define OP_ALG_ALGSEL_CRC (0x90 << OP_ALG_ALGSEL_SHIFT)
 #define OP_ALG_ALGSEL_SNOW_F9 (0xA0 << OP_ALG_ALGSEL_SHIFT)
+#define OP_ALG_ALGSEL_CHACHA20 (0xD0 << OP_ALG_ALGSEL_SHIFT)
+#define OP_ALG_ALGSEL_POLY1305 (0xE0 << OP_ALG_ALGSEL_SHIFT)
 
 #define OP_ALG_AAI_SHIFT 4
 #define OP_ALG_AAI_MASK (0x1ff << OP_ALG_AAI_SHIFT)
@@ -1199,6 +1216,11 @@
 #define OP_ALG_AAI_RNG4_AI (0x80 << OP_ALG_AAI_SHIFT)
 #define OP_ALG_AAI_RNG4_SK (0x100 << OP_ALG_AAI_SHIFT)
 
+/* Chacha20 AAI set */
+#define OP_ALG_AAI_AEAD (0x002 << OP_ALG_AAI_SHIFT)
+#define OP_ALG_AAI_KEYSTREAM (0x001 << OP_ALG_AAI_SHIFT)
+#define OP_ALG_AAI_BC8 (0x008 << OP_ALG_AAI_SHIFT)
+
 /* hmac/smac AAI set */
 #define OP_ALG_AAI_HASH (0x00 << OP_ALG_AAI_SHIFT)
 #define OP_ALG_AAI_HMAC (0x01 << OP_ALG_AAI_SHIFT)
@@ -1387,6 +1409,7 @@
 #define MOVE_SRC_MATH3 (0x07 << MOVE_SRC_SHIFT)
 #define MOVE_SRC_INFIFO (0x08 << MOVE_SRC_SHIFT)
 #define MOVE_SRC_INFIFO_CL (0x09 << MOVE_SRC_SHIFT)
+#define MOVE_SRC_AUX_ABLK (0x0a << MOVE_SRC_SHIFT)
 
 #define MOVE_DEST_SHIFT 16
 #define MOVE_DEST_MASK (0x0f << MOVE_DEST_SHIFT)
@@ -1413,6 +1436,10 @@
 
 #define MOVELEN_MRSEL_SHIFT 0
 #define MOVELEN_MRSEL_MASK (0x3 << MOVE_LEN_SHIFT)
+#define MOVELEN_MRSEL_MATH0 (0 << MOVELEN_MRSEL_SHIFT)
+#define MOVELEN_MRSEL_MATH1 (1 << MOVELEN_MRSEL_SHIFT)
+#define MOVELEN_MRSEL_MATH2 (2 << MOVELEN_MRSEL_SHIFT)
+#define MOVELEN_MRSEL_MATH3 (3 << MOVELEN_MRSEL_SHIFT)
 
 /*
  * MATH Command Constructs
@@ -1589,6 +1616,7 @@
 #define NFIFOENTRY_DTYPE_IV (0x2 << NFIFOENTRY_DTYPE_SHIFT)
 #define NFIFOENTRY_DTYPE_SAD (0x3 << NFIFOENTRY_DTYPE_SHIFT)
 #define NFIFOENTRY_DTYPE_ICV (0xA << NFIFOENTRY_DTYPE_SHIFT)
+#define NFIFOENTRY_DTYPE_POLY (0xB << NFIFOENTRY_DTYPE_SHIFT)
 #define NFIFOENTRY_DTYPE_SKIP (0xE << NFIFOENTRY_DTYPE_SHIFT)
 #define NFIFOENTRY_DTYPE_MSG (0xF << NFIFOENTRY_DTYPE_SHIFT)
 
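For reference, the new ChaCha20 selector and the Chacha20 AAI bits above are OR-ed together with the usual OP_TYPE field into a single 32-bit OPERATION command word. A minimal standalone sketch (plain user-space C; the OP_TYPE_CLASS1_ALG value is an assumption reproduced here only for illustration, the other defines are copied from this hunk):

/* Sketch: composing a class-1 ChaCha20 AEAD operation word. */
#include <stdint.h>
#include <stdio.h>

#define OP_ALG_ALGSEL_SHIFT    16
#define OP_ALG_ALGSEL_CHACHA20 (0xD0 << OP_ALG_ALGSEL_SHIFT)
#define OP_ALG_AAI_SHIFT       4
#define OP_ALG_AAI_AEAD        (0x002 << OP_ALG_AAI_SHIFT)
#define OP_TYPE_CLASS1_ALG     (0x02U << 24) /* assumed value, for the sketch only */

int main(void)
{
        /* One OPERATION word selecting the ChaCha20 CHA in AEAD mode */
        uint32_t op = OP_TYPE_CLASS1_ALG | OP_ALG_ALGSEL_CHACHA20 | OP_ALG_AAI_AEAD;

        printf("OPERATION word: 0x%08x\n", op);
        return 0;
}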
diff --git a/drivers/crypto/caam/desc_constr.h b/drivers/crypto/caam/desc_constr.h
@@ -189,6 +189,7 @@ static inline u32 *append_##cmd(u32 * const desc, u32 options) \
 }
 APPEND_CMD_RET(jump, JUMP)
 APPEND_CMD_RET(move, MOVE)
+APPEND_CMD_RET(move_len, MOVE_LEN)
 
 static inline void set_jump_tgt_here(u32 * const desc, u32 *jump_cmd)
 {
@@ -327,7 +328,11 @@ static inline void append_##cmd##_imm_##type(u32 * const desc, type immediate, \
 					     u32 options) \
 { \
 	PRINT_POS; \
-	append_cmd(desc, CMD_##op | IMMEDIATE | options | sizeof(type)); \
+	if (options & LDST_LEN_MASK) \
+		append_cmd(desc, CMD_##op | IMMEDIATE | options); \
+	else \
+		append_cmd(desc, CMD_##op | IMMEDIATE | options | \
+			   sizeof(type)); \
 	append_cmd(desc, immediate); \
 }
 APPEND_CMD_RAW_IMM(load, LOAD, u32);
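The change above fixes the immediate-append macro: when the caller already encodes an explicit length in the LDST_LEN_MASK bits of options, OR-ing sizeof(type) on top would corrupt that length, so the size is now used only as a fallback. A standalone sketch of the two resulting encodings (the mask mirrors the header's low-byte length field; the base word is a stand-in, not a real descriptor value):

/* Sketch: explicit length in options wins over sizeof(type). */
#include <stdint.h>
#include <stdio.h>

#define LDST_LEN_MASK 0xffU /* assumed low-byte length field */

static uint32_t imm_cmd_word(uint32_t base, uint32_t options, size_t type_size)
{
        if (options & LDST_LEN_MASK)          /* caller gave a length */
                return base | options;
        return base | options | (uint32_t)type_size; /* fall back to sizeof */
}

int main(void)
{
        uint32_t base = 0x10000000; /* hypothetical CMD | IMMEDIATE bits */

        printf("no length in options: 0x%08x\n", imm_cmd_word(base, 0x0, 4));
        printf("explicit length of 8: 0x%08x\n", imm_cmd_word(base, 0x8, 4));
        return 0;
}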
diff --git a/drivers/crypto/caam/regs.h b/drivers/crypto/caam/regs.h
@@ -3,6 +3,7 @@
  * CAAM hardware register-level view
  *
  * Copyright 2008-2011 Freescale Semiconductor, Inc.
+ * Copyright 2018 NXP
  */
 
 #ifndef REGS_H
@@ -211,6 +212,47 @@ struct jr_outentry {
 	u32 jrstatus;	/* Status for completed descriptor */
 } __packed;
 
+/* Version registers (Era 10+) e80-eff */
+struct version_regs {
+	u32 crca;	/* CRCA_VERSION */
+	u32 afha;	/* AFHA_VERSION */
+	u32 kfha;	/* KFHA_VERSION */
+	u32 pkha;	/* PKHA_VERSION */
+	u32 aesa;	/* AESA_VERSION */
+	u32 mdha;	/* MDHA_VERSION */
+	u32 desa;	/* DESA_VERSION */
+	u32 snw8a;	/* SNW8A_VERSION */
+	u32 snw9a;	/* SNW9A_VERSION */
+	u32 zuce;	/* ZUCE_VERSION */
+	u32 zuca;	/* ZUCA_VERSION */
+	u32 ccha;	/* CCHA_VERSION */
+	u32 ptha;	/* PTHA_VERSION */
+	u32 rng;	/* RNG_VERSION */
+	u32 trng;	/* TRNG_VERSION */
+	u32 aaha;	/* AAHA_VERSION */
+	u32 rsvd[10];
+	u32 sr;		/* SR_VERSION */
+	u32 dma;	/* DMA_VERSION */
+	u32 ai;		/* AI_VERSION */
+	u32 qi;		/* QI_VERSION */
+	u32 jr;		/* JR_VERSION */
+	u32 deco;	/* DECO_VERSION */
+};
+
+/* Version registers bitfields */
+
+/* Number of CHAs instantiated */
+#define CHA_VER_NUM_MASK	0xffull
+/* CHA Miscellaneous Information */
+#define CHA_VER_MISC_SHIFT	8
+#define CHA_VER_MISC_MASK	(0xffull << CHA_VER_MISC_SHIFT)
+/* CHA Revision Number */
+#define CHA_VER_REV_SHIFT	16
+#define CHA_VER_REV_MASK	(0xffull << CHA_VER_REV_SHIFT)
+/* CHA Version ID */
+#define CHA_VER_VID_SHIFT	24
+#define CHA_VER_VID_MASK	(0xffull << CHA_VER_VID_SHIFT)
+
 /*
  * caam_perfmon - Performance Monitor/Secure Memory Status/
  *		  CAAM Global Status/Component Version IDs
@@ -223,15 +265,13 @@
 #define CHA_NUM_MS_DECONUM_MASK	(0xfull << CHA_NUM_MS_DECONUM_SHIFT)
 
 /*
- * CHA version IDs / instantiation bitfields
+ * CHA version IDs / instantiation bitfields (< Era 10)
  * Defined for use with the cha_id fields in perfmon, but the same shift/mask
  * selectors can be used to pull out the number of instantiated blocks within
  * cha_num fields in perfmon because the locations are the same.
  */
 #define CHA_ID_LS_AES_SHIFT	0
 #define CHA_ID_LS_AES_MASK	(0xfull << CHA_ID_LS_AES_SHIFT)
-#define CHA_ID_LS_AES_LP	(0x3ull << CHA_ID_LS_AES_SHIFT)
-#define CHA_ID_LS_AES_HP	(0x4ull << CHA_ID_LS_AES_SHIFT)
 
 #define CHA_ID_LS_DES_SHIFT	4
 #define CHA_ID_LS_DES_MASK	(0xfull << CHA_ID_LS_DES_SHIFT)
@@ -241,9 +281,6 @@
 
 #define CHA_ID_LS_MD_SHIFT	12
 #define CHA_ID_LS_MD_MASK	(0xfull << CHA_ID_LS_MD_SHIFT)
-#define CHA_ID_LS_MD_LP256	(0x0ull << CHA_ID_LS_MD_SHIFT)
-#define CHA_ID_LS_MD_LP512	(0x1ull << CHA_ID_LS_MD_SHIFT)
-#define CHA_ID_LS_MD_HP	(0x2ull << CHA_ID_LS_MD_SHIFT)
 
 #define CHA_ID_LS_RNG_SHIFT	16
 #define CHA_ID_LS_RNG_MASK	(0xfull << CHA_ID_LS_RNG_SHIFT)
@@ -269,6 +306,13 @@
 #define CHA_ID_MS_JR_SHIFT	28
 #define CHA_ID_MS_JR_MASK	(0xfull << CHA_ID_MS_JR_SHIFT)
 
+/* Specific CHA version IDs */
+#define CHA_VER_VID_AES_LP	0x3ull
+#define CHA_VER_VID_AES_HP	0x4ull
+#define CHA_VER_VID_MD_LP256	0x0ull
+#define CHA_VER_VID_MD_LP512	0x1ull
+#define CHA_VER_VID_MD_HP	0x2ull
+
 struct sec_vid {
 	u16 ip_id;
 	u8 maj_rev;
@@ -479,8 +523,10 @@ struct caam_ctrl {
 		struct rng4tst r4tst[2];
 	};
 
-	u32 rsvd9[448];
+	u32 rsvd9[416];
 
+	/* Version registers - introduced with era 10 e80-eff */
+	struct version_regs vreg;
 	/* Performance Monitor f00-fff */
 	struct caam_perfmon perfmon;
 };
@@ -570,8 +616,10 @@ struct caam_job_ring {
 	u32 rsvd11;
 	u32 jrcommand;	/* JRCRx - JobR command */
 
-	u32 rsvd12[932];
+	u32 rsvd12[900];
 
+	/* Version registers - introduced with era 10 e80-eff */
+	struct version_regs vreg;
 	/* Performance Monitor f00-fff */
 	struct caam_perfmon perfmon;
 };
@@ -878,13 +926,19 @@ struct caam_deco {
 	u32 rsvd29[48];
 	u32 descbuf[64];	/* DxDESB - Descriptor buffer */
 	u32 rscvd30[193];
-#define DESC_DBG_DECO_STAT_HOST_ERR	0x00D00000
 #define DESC_DBG_DECO_STAT_VALID	0x80000000
 #define DESC_DBG_DECO_STAT_MASK		0x00F00000
+#define DESC_DBG_DECO_STAT_SHIFT	20
 	u32 desc_dbg;	/* DxDDR - DECO Debug Register */
-	u32 rsvd31[126];
+	u32 rsvd31[13];
+#define DESC_DER_DECO_STAT_MASK		0x000F0000
+#define DESC_DER_DECO_STAT_SHIFT	16
+	u32 dbg_exec;	/* DxDER - DECO Debug Exec Register */
+	u32 rsvd32[112];
 };
 
+#define DECO_STAT_HOST_ERR	0xD
+
 #define DECO_JQCR_WHL		0x20000000
 #define DECO_JQCR_FOUR		0x10000000
 
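The version_regs block and CHA_VER_* fields added above replace the old perfmon-based LP/HP probing on Era 10+ parts: each 32-bit *_VERSION register packs the instantiation count, miscellaneous info, revision, and version ID. A standalone sketch of decoding one such value (a made-up AESA_VERSION constant stands in for the actual register read):

/* Sketch: decoding an Era 10+ CHA version register. */
#include <stdint.h>
#include <stdio.h>

#define CHA_VER_NUM_MASK   0xffull
#define CHA_VER_REV_SHIFT  16
#define CHA_VER_REV_MASK   (0xffull << CHA_VER_REV_SHIFT)
#define CHA_VER_VID_SHIFT  24
#define CHA_VER_VID_MASK   (0xffull << CHA_VER_VID_SHIFT)
#define CHA_VER_VID_AES_HP 0x4ull

int main(void)
{
        uint64_t aesa = 0x04010203; /* hypothetical AESA_VERSION value */

        printf("instantiated: %llu\n",
               (unsigned long long)(aesa & CHA_VER_NUM_MASK));
        printf("revision:     %llu\n",
               (unsigned long long)((aesa & CHA_VER_REV_MASK) >> CHA_VER_REV_SHIFT));
        if (((aesa & CHA_VER_VID_MASK) >> CHA_VER_VID_SHIFT) == CHA_VER_VID_AES_HP)
                printf("high-performance AES accelerator\n");
        return 0;
}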
diff --git a/drivers/crypto/cavium/nitrox/Makefile b/drivers/crypto/cavium/nitrox/Makefile
@@ -6,7 +6,10 @@ n5pf-objs := nitrox_main.o \
 	nitrox_lib.o \
 	nitrox_hal.o \
 	nitrox_reqmgr.o \
-	nitrox_algs.o
+	nitrox_algs.o \
+	nitrox_mbx.o \
+	nitrox_skcipher.o \
+	nitrox_aead.o
 
 n5pf-$(CONFIG_PCI_IOV) += nitrox_sriov.o
 n5pf-$(CONFIG_DEBUG_FS) += nitrox_debugfs.o
diff --git a/drivers/crypto/cavium/nitrox/nitrox_aead.c b/drivers/crypto/cavium/nitrox/nitrox_aead.c
@@ -0,0 +1,364 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/kernel.h>
+#include <linux/printk.h>
+#include <linux/crypto.h>
+#include <linux/rtnetlink.h>
+
+#include <crypto/aead.h>
+#include <crypto/authenc.h>
+#include <crypto/des.h>
+#include <crypto/sha.h>
+#include <crypto/internal/aead.h>
+#include <crypto/scatterwalk.h>
+#include <crypto/gcm.h>
+
+#include "nitrox_dev.h"
+#include "nitrox_common.h"
+#include "nitrox_req.h"
+
+#define GCM_AES_SALT_SIZE	4
+
+/**
+ * struct nitrox_crypt_params - Params to set nitrox crypto request.
+ * @cryptlen: Encryption/Decryption data length
+ * @authlen: Assoc data length + Cryptlen
+ * @srclen: Input buffer length
+ * @dstlen: Output buffer length
+ * @iv: IV data
+ * @ivsize: IV data length
+ * @ctrl_arg: Identifies the request type (ENCRYPT/DECRYPT)
+ */
+struct nitrox_crypt_params {
+	unsigned int cryptlen;
+	unsigned int authlen;
+	unsigned int srclen;
+	unsigned int dstlen;
+	u8 *iv;
+	int ivsize;
+	u8 ctrl_arg;
+};
+
+union gph_p3 {
+	struct {
+#ifdef __BIG_ENDIAN_BITFIELD
+		u16 iv_offset : 8;
+		u16 auth_offset : 8;
+#else
+		u16 auth_offset : 8;
+		u16 iv_offset : 8;
+#endif
+	};
+	u16 param;
+};
+
+static int nitrox_aes_gcm_setkey(struct crypto_aead *aead, const u8 *key,
+				 unsigned int keylen)
+{
+	int aes_keylen;
+	struct nitrox_crypto_ctx *nctx = crypto_aead_ctx(aead);
+	struct flexi_crypto_context *fctx;
+	union fc_ctx_flags flags;
+
+	aes_keylen = flexi_aes_keylen(keylen);
+	if (aes_keylen < 0) {
+		crypto_aead_set_flags(aead, CRYPTO_TFM_RES_BAD_KEY_LEN);
+		return -EINVAL;
+	}
+
+	/* fill crypto context */
+	fctx = nctx->u.fctx;
+	flags.f = be64_to_cpu(fctx->flags.f);
+	flags.w0.aes_keylen = aes_keylen;
+	fctx->flags.f = cpu_to_be64(flags.f);
+
+	/* copy enc key to context */
+	memset(&fctx->crypto, 0, sizeof(fctx->crypto));
+	memcpy(fctx->crypto.u.key, key, keylen);
+
+	return 0;
+}
+
+static int nitrox_aead_setauthsize(struct crypto_aead *aead,
+				   unsigned int authsize)
+{
+	struct nitrox_crypto_ctx *nctx = crypto_aead_ctx(aead);
+	struct flexi_crypto_context *fctx = nctx->u.fctx;
+	union fc_ctx_flags flags;
+
+	flags.f = be64_to_cpu(fctx->flags.f);
+	flags.w0.mac_len = authsize;
+	fctx->flags.f = cpu_to_be64(flags.f);
+
+	aead->authsize = authsize;
+
+	return 0;
+}
+
+static int alloc_src_sglist(struct aead_request *areq, char *iv, int ivsize,
+			    int buflen)
+{
+	struct nitrox_kcrypt_request *nkreq = aead_request_ctx(areq);
+	int nents = sg_nents_for_len(areq->src, buflen) + 1;
+	int ret;
+
+	if (nents < 0)
+		return nents;
+
+	/* Allocate buffer to hold IV and input scatterlist array */
+	ret = alloc_src_req_buf(nkreq, nents, ivsize);
+	if (ret)
+		return ret;
+
+	nitrox_creq_copy_iv(nkreq->src, iv, ivsize);
+	nitrox_creq_set_src_sg(nkreq, nents, ivsize, areq->src, buflen);
+
+	return 0;
+}
+
+static int alloc_dst_sglist(struct aead_request *areq, int ivsize, int buflen)
+{
+	struct nitrox_kcrypt_request *nkreq = aead_request_ctx(areq);
+	int nents = sg_nents_for_len(areq->dst, buflen) + 3;
+	int ret;
+
+	if (nents < 0)
+		return nents;
+
+	/* Allocate buffer to hold ORH, COMPLETION and output scatterlist
+	 * array
+	 */
+	ret = alloc_dst_req_buf(nkreq, nents);
+	if (ret)
+		return ret;
+
+	nitrox_creq_set_orh(nkreq);
+	nitrox_creq_set_comp(nkreq);
+	nitrox_creq_set_dst_sg(nkreq, nents, ivsize, areq->dst, buflen);
+
+	return 0;
+}
+
+static void free_src_sglist(struct aead_request *areq)
+{
+	struct nitrox_kcrypt_request *nkreq = aead_request_ctx(areq);
+
+	kfree(nkreq->src);
+}
+
+static void free_dst_sglist(struct aead_request *areq)
+{
+	struct nitrox_kcrypt_request *nkreq = aead_request_ctx(areq);
+
+	kfree(nkreq->dst);
+}
+
+static int nitrox_set_creq(struct aead_request *areq,
+			   struct nitrox_crypt_params *params)
+{
+	struct nitrox_kcrypt_request *nkreq = aead_request_ctx(areq);
+	struct se_crypto_request *creq = &nkreq->creq;
+	struct crypto_aead *aead = crypto_aead_reqtfm(areq);
+	union gph_p3 param3;
+	struct nitrox_crypto_ctx *nctx = crypto_aead_ctx(aead);
+	int ret;
+
+	creq->flags = areq->base.flags;
+	creq->gfp = (areq->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
+		     GFP_KERNEL : GFP_ATOMIC;
+
+	creq->ctrl.value = 0;
+	creq->opcode = FLEXI_CRYPTO_ENCRYPT_HMAC;
+	creq->ctrl.s.arg = params->ctrl_arg;
+
+	creq->gph.param0 = cpu_to_be16(params->cryptlen);
+	creq->gph.param1 = cpu_to_be16(params->authlen);
+	creq->gph.param2 = cpu_to_be16(params->ivsize + areq->assoclen);
+	param3.iv_offset = 0;
+	param3.auth_offset = params->ivsize;
+	creq->gph.param3 = cpu_to_be16(param3.param);
+
+	creq->ctx_handle = nctx->u.ctx_handle;
+	creq->ctrl.s.ctxl = sizeof(struct flexi_crypto_context);
+
+	ret = alloc_src_sglist(areq, params->iv, params->ivsize,
+			       params->srclen);
+	if (ret)
+		return ret;
+
+	ret = alloc_dst_sglist(areq, params->ivsize, params->dstlen);
+	if (ret) {
+		free_src_sglist(areq);
+		return ret;
+	}
+
+	return 0;
+}
+
+static void nitrox_aead_callback(void *arg, int err)
+{
+	struct aead_request *areq = arg;
+
+	free_src_sglist(areq);
+	free_dst_sglist(areq);
+	if (err) {
+		pr_err_ratelimited("request failed status 0x%0x\n", err);
+		err = -EINVAL;
+	}
+
+	areq->base.complete(&areq->base, err);
+}
+
+static int nitrox_aes_gcm_enc(struct aead_request *areq)
+{
+	struct crypto_aead *aead = crypto_aead_reqtfm(areq);
+	struct nitrox_crypto_ctx *nctx = crypto_aead_ctx(aead);
+	struct nitrox_kcrypt_request *nkreq = aead_request_ctx(areq);
+	struct se_crypto_request *creq = &nkreq->creq;
+	struct flexi_crypto_context *fctx = nctx->u.fctx;
+	struct nitrox_crypt_params params;
+	int ret;
+
+	memcpy(fctx->crypto.iv, areq->iv, GCM_AES_SALT_SIZE);
+
+	memset(&params, 0, sizeof(params));
+	params.cryptlen = areq->cryptlen;
+	params.authlen = areq->assoclen + params.cryptlen;
+	params.srclen = params.authlen;
+	params.dstlen = params.srclen + aead->authsize;
+	params.iv = &areq->iv[GCM_AES_SALT_SIZE];
+	params.ivsize = GCM_AES_IV_SIZE - GCM_AES_SALT_SIZE;
+	params.ctrl_arg = ENCRYPT;
+	ret = nitrox_set_creq(areq, &params);
+	if (ret)
+		return ret;
+
+	/* send the crypto request */
+	return nitrox_process_se_request(nctx->ndev, creq, nitrox_aead_callback,
+					 areq);
+}
+
+static int nitrox_aes_gcm_dec(struct aead_request *areq)
+{
+	struct crypto_aead *aead = crypto_aead_reqtfm(areq);
+	struct nitrox_crypto_ctx *nctx = crypto_aead_ctx(aead);
+	struct nitrox_kcrypt_request *nkreq = aead_request_ctx(areq);
+	struct se_crypto_request *creq = &nkreq->creq;
+	struct flexi_crypto_context *fctx = nctx->u.fctx;
+	struct nitrox_crypt_params params;
+	int ret;
+
+	memcpy(fctx->crypto.iv, areq->iv, GCM_AES_SALT_SIZE);
+
+	memset(&params, 0, sizeof(params));
+	params.cryptlen = areq->cryptlen - aead->authsize;
+	params.authlen = areq->assoclen + params.cryptlen;
+	params.srclen = areq->cryptlen + areq->assoclen;
+	params.dstlen = params.srclen - aead->authsize;
+	params.iv = &areq->iv[GCM_AES_SALT_SIZE];
+	params.ivsize = GCM_AES_IV_SIZE - GCM_AES_SALT_SIZE;
+	params.ctrl_arg = DECRYPT;
+	ret = nitrox_set_creq(areq, &params);
+	if (ret)
+		return ret;
+
+	/* send the crypto request */
+	return nitrox_process_se_request(nctx->ndev, creq, nitrox_aead_callback,
+					 areq);
+}
+
+static int nitrox_aead_init(struct crypto_aead *aead)
+{
+	struct nitrox_crypto_ctx *nctx = crypto_aead_ctx(aead);
+	struct crypto_ctx_hdr *chdr;
+
+	/* get the first device */
+	nctx->ndev = nitrox_get_first_device();
+	if (!nctx->ndev)
+		return -ENODEV;
+
+	/* allocate nitrox crypto context */
+	chdr = crypto_alloc_context(nctx->ndev);
+	if (!chdr) {
+		nitrox_put_device(nctx->ndev);
+		return -ENOMEM;
+	}
+	nctx->chdr = chdr;
+	nctx->u.ctx_handle = (uintptr_t)((u8 *)chdr->vaddr +
+					 sizeof(struct ctx_hdr));
+	nctx->u.fctx->flags.f = 0;
+
+	return 0;
+}
+
+static int nitrox_aes_gcm_init(struct crypto_aead *aead)
+{
+	int ret;
+	struct nitrox_crypto_ctx *nctx = crypto_aead_ctx(aead);
+	union fc_ctx_flags *flags;
+
+	ret = nitrox_aead_init(aead);
+	if (ret)
+		return ret;
+
+	flags = &nctx->u.fctx->flags;
+	flags->w0.cipher_type = CIPHER_AES_GCM;
+	flags->w0.hash_type = AUTH_NULL;
+	flags->w0.iv_source = IV_FROM_DPTR;
+	/* ask microcode to calculate ipad/opad */
+	flags->w0.auth_input_type = 1;
+	flags->f = be64_to_cpu(flags->f);
+
+	crypto_aead_set_reqsize(aead, sizeof(struct aead_request) +
+					sizeof(struct nitrox_kcrypt_request));
+
+	return 0;
+}
+
+static void nitrox_aead_exit(struct crypto_aead *aead)
+{
+	struct nitrox_crypto_ctx *nctx = crypto_aead_ctx(aead);
+
+	/* free the nitrox crypto context */
+	if (nctx->u.ctx_handle) {
+		struct flexi_crypto_context *fctx = nctx->u.fctx;
+
+		memzero_explicit(&fctx->crypto, sizeof(struct crypto_keys));
+		memzero_explicit(&fctx->auth, sizeof(struct auth_keys));
+		crypto_free_context((void *)nctx->chdr);
+	}
+	nitrox_put_device(nctx->ndev);
+
+	nctx->u.ctx_handle = 0;
+	nctx->ndev = NULL;
+}
+
+static struct aead_alg nitrox_aeads[] = { {
+	.base = {
+		.cra_name = "gcm(aes)",
+		.cra_driver_name = "n5_aes_gcm",
+		.cra_priority = PRIO,
+		.cra_flags = CRYPTO_ALG_ASYNC,
+		.cra_blocksize = AES_BLOCK_SIZE,
+		.cra_ctxsize = sizeof(struct nitrox_crypto_ctx),
+		.cra_alignmask = 0,
+		.cra_module = THIS_MODULE,
+	},
+	.setkey = nitrox_aes_gcm_setkey,
+	.setauthsize = nitrox_aead_setauthsize,
+	.encrypt = nitrox_aes_gcm_enc,
+	.decrypt = nitrox_aes_gcm_dec,
+	.init = nitrox_aes_gcm_init,
+	.exit = nitrox_aead_exit,
+	.ivsize = GCM_AES_IV_SIZE,
+	.maxauthsize = AES_BLOCK_SIZE,
+} };
+
+int nitrox_register_aeads(void)
+{
+	return crypto_register_aeads(nitrox_aeads, ARRAY_SIZE(nitrox_aeads));
+}
+
+void nitrox_unregister_aeads(void)
+{
+	crypto_unregister_aeads(nitrox_aeads, ARRAY_SIZE(nitrox_aeads));
+}
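The length bookkeeping in nitrox_aes_gcm_enc()/nitrox_aes_gcm_dec() above is easy to mis-read because areq->cryptlen includes the authentication tag only on the decrypt side. A standalone worked example with assumed sizes (16 bytes of AAD, 32 bytes of plaintext, 16-byte tag):

/* Sketch: the GCM length arithmetic from the two functions above. */
#include <stdio.h>

int main(void)
{
        unsigned int assoclen = 16, ptlen = 32, authsize = 16;

        /* encrypt: input is AAD || plaintext, output gains the tag */
        unsigned int enc_cryptlen = ptlen;
        unsigned int enc_srclen   = assoclen + enc_cryptlen;       /* 48 */
        unsigned int enc_dstlen   = enc_srclen + authsize;         /* 64 */

        /* decrypt: areq->cryptlen carries the tag, so it is peeled off */
        unsigned int dec_cryptlen = (ptlen + authsize) - authsize; /* 32 */
        unsigned int dec_srclen   = (ptlen + authsize) + assoclen; /* 64 */
        unsigned int dec_dstlen   = dec_srclen - authsize;         /* 48 */

        printf("enc: src=%u dst=%u cryptlen=%u\n", enc_srclen, enc_dstlen,
               enc_cryptlen);
        printf("dec: src=%u dst=%u cryptlen=%u\n", dec_srclen, dec_dstlen,
               dec_cryptlen);
        return 0;
}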
diff --git a/drivers/crypto/cavium/nitrox/nitrox_algs.c b/drivers/crypto/cavium/nitrox/nitrox_algs.c
@@ -1,458 +1,24 @@
-// SPDX-License-Identifier: GPL-2.0
-#include <linux/crypto.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/printk.h>
-
-#include <crypto/aes.h>
-#include <crypto/skcipher.h>
-#include <crypto/ctr.h>
-#include <crypto/des.h>
-#include <crypto/xts.h>
-
-#include "nitrox_dev.h"
 #include "nitrox_common.h"
-#include "nitrox_req.h"
-
-#define PRIO 4001
-
-struct nitrox_cipher {
-	const char *name;
-	enum flexi_cipher value;
-};
-
-/**
- * supported cipher list
- */
-static const struct nitrox_cipher flexi_cipher_table[] = {
-	{ "null", CIPHER_NULL },
-	{ "cbc(des3_ede)", CIPHER_3DES_CBC },
-	{ "ecb(des3_ede)", CIPHER_3DES_ECB },
-	{ "cbc(aes)", CIPHER_AES_CBC },
-	{ "ecb(aes)", CIPHER_AES_ECB },
-	{ "cfb(aes)", CIPHER_AES_CFB },
-	{ "rfc3686(ctr(aes))", CIPHER_AES_CTR },
-	{ "xts(aes)", CIPHER_AES_XTS },
-	{ "cts(cbc(aes))", CIPHER_AES_CBC_CTS },
-	{ NULL, CIPHER_INVALID }
-};
-
-static enum flexi_cipher flexi_cipher_type(const char *name)
-{
-	const struct nitrox_cipher *cipher = flexi_cipher_table;
-
-	while (cipher->name) {
-		if (!strcmp(cipher->name, name))
-			break;
-		cipher++;
-	}
-	return cipher->value;
-}
-
-static int flexi_aes_keylen(int keylen)
-{
-	int aes_keylen;
-
-	switch (keylen) {
-	case AES_KEYSIZE_128:
-		aes_keylen = 1;
-		break;
-	case AES_KEYSIZE_192:
-		aes_keylen = 2;
-		break;
-	case AES_KEYSIZE_256:
-		aes_keylen = 3;
-		break;
-	default:
-		aes_keylen = -EINVAL;
-		break;
-	}
-	return aes_keylen;
-}
-
-static int nitrox_skcipher_init(struct crypto_skcipher *tfm)
-{
-	struct nitrox_crypto_ctx *nctx = crypto_skcipher_ctx(tfm);
-	void *fctx;
-
-	/* get the first device */
-	nctx->ndev = nitrox_get_first_device();
-	if (!nctx->ndev)
-		return -ENODEV;
-
-	/* allocate nitrox crypto context */
-	fctx = crypto_alloc_context(nctx->ndev);
-	if (!fctx) {
-		nitrox_put_device(nctx->ndev);
-		return -ENOMEM;
-	}
-	nctx->u.ctx_handle = (uintptr_t)fctx;
-	crypto_skcipher_set_reqsize(tfm, crypto_skcipher_reqsize(tfm) +
-					 sizeof(struct nitrox_kcrypt_request));
-	return 0;
-}
-
-static void nitrox_skcipher_exit(struct crypto_skcipher *tfm)
-{
-	struct nitrox_crypto_ctx *nctx = crypto_skcipher_ctx(tfm);
-
-	/* free the nitrox crypto context */
-	if (nctx->u.ctx_handle) {
-		struct flexi_crypto_context *fctx = nctx->u.fctx;
-
-		memset(&fctx->crypto, 0, sizeof(struct crypto_keys));
-		memset(&fctx->auth, 0, sizeof(struct auth_keys));
-		crypto_free_context((void *)fctx);
-	}
-	nitrox_put_device(nctx->ndev);
-
-	nctx->u.ctx_handle = 0;
-	nctx->ndev = NULL;
-}
-
-static inline int nitrox_skcipher_setkey(struct crypto_skcipher *cipher,
-					 int aes_keylen, const u8 *key,
-					 unsigned int keylen)
-{
-	struct crypto_tfm *tfm = crypto_skcipher_tfm(cipher);
-	struct nitrox_crypto_ctx *nctx = crypto_tfm_ctx(tfm);
-	struct flexi_crypto_context *fctx;
-	enum flexi_cipher cipher_type;
-	const char *name;
-
-	name = crypto_tfm_alg_name(tfm);
-	cipher_type = flexi_cipher_type(name);
-	if (unlikely(cipher_type == CIPHER_INVALID)) {
-		pr_err("unsupported cipher: %s\n", name);
-		return -EINVAL;
-	}
-
-	/* fill crypto context */
-	fctx = nctx->u.fctx;
-	fctx->flags = 0;
-	fctx->w0.cipher_type = cipher_type;
-	fctx->w0.aes_keylen = aes_keylen;
-	fctx->w0.iv_source = IV_FROM_DPTR;
-	fctx->flags = cpu_to_be64(*(u64 *)&fctx->w0);
-	/* copy the key to context */
-	memcpy(fctx->crypto.u.key, key, keylen);
-
-	return 0;
-}
-
-static int nitrox_aes_setkey(struct crypto_skcipher *cipher, const u8 *key,
-			     unsigned int keylen)
-{
-	int aes_keylen;
-
-	aes_keylen = flexi_aes_keylen(keylen);
-	if (aes_keylen < 0) {
-		crypto_skcipher_set_flags(cipher, CRYPTO_TFM_RES_BAD_KEY_LEN);
-		return -EINVAL;
-	}
-	return nitrox_skcipher_setkey(cipher, aes_keylen, key, keylen);
-}
-
-static void nitrox_skcipher_callback(struct skcipher_request *skreq,
-				     int err)
-{
-	if (err) {
-		pr_err_ratelimited("request failed status 0x%0x\n", err);
-		err = -EINVAL;
-	}
-	skcipher_request_complete(skreq, err);
-}
-
-static int nitrox_skcipher_crypt(struct skcipher_request *skreq, bool enc)
-{
-	struct crypto_skcipher *cipher = crypto_skcipher_reqtfm(skreq);
-	struct nitrox_crypto_ctx *nctx = crypto_skcipher_ctx(cipher);
-	struct nitrox_kcrypt_request *nkreq = skcipher_request_ctx(skreq);
-	int ivsize = crypto_skcipher_ivsize(cipher);
-	struct se_crypto_request *creq;
-
-	creq = &nkreq->creq;
-	creq->flags = skreq->base.flags;
-	creq->gfp = (skreq->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
-		     GFP_KERNEL : GFP_ATOMIC;
-
-	/* fill the request */
-	creq->ctrl.value = 0;
-	creq->opcode = FLEXI_CRYPTO_ENCRYPT_HMAC;
-	creq->ctrl.s.arg = (enc ? ENCRYPT : DECRYPT);
-	/* param0: length of the data to be encrypted */
-	creq->gph.param0 = cpu_to_be16(skreq->cryptlen);
-	creq->gph.param1 = 0;
-	/* param2: encryption data offset */
-	creq->gph.param2 = cpu_to_be16(ivsize);
-	creq->gph.param3 = 0;
-
-	creq->ctx_handle = nctx->u.ctx_handle;
-	creq->ctrl.s.ctxl = sizeof(struct flexi_crypto_context);
-
-	/* copy the iv */
-	memcpy(creq->iv, skreq->iv, ivsize);
-	creq->ivsize = ivsize;
-	creq->src = skreq->src;
-	creq->dst = skreq->dst;
-
-	nkreq->nctx = nctx;
-	nkreq->skreq = skreq;
-
-	/* send the crypto request */
-	return nitrox_process_se_request(nctx->ndev, creq,
-					 nitrox_skcipher_callback, skreq);
-}
-
-static int nitrox_aes_encrypt(struct skcipher_request *skreq)
-{
-	return nitrox_skcipher_crypt(skreq, true);
-}
-
-static int nitrox_aes_decrypt(struct skcipher_request *skreq)
-{
-	return nitrox_skcipher_crypt(skreq, false);
-}
-
-static int nitrox_3des_setkey(struct crypto_skcipher *cipher,
-			      const u8 *key, unsigned int keylen)
-{
-	if (keylen != DES3_EDE_KEY_SIZE) {
-		crypto_skcipher_set_flags(cipher, CRYPTO_TFM_RES_BAD_KEY_LEN);
-		return -EINVAL;
-	}
-
-	return nitrox_skcipher_setkey(cipher, 0, key, keylen);
-}
-
-static int nitrox_3des_encrypt(struct skcipher_request *skreq)
-{
-	return nitrox_skcipher_crypt(skreq, true);
-}
-
-static int nitrox_3des_decrypt(struct skcipher_request *skreq)
-{
-	return nitrox_skcipher_crypt(skreq, false);
-}
-
-static int nitrox_aes_xts_setkey(struct crypto_skcipher *cipher,
-				 const u8 *key, unsigned int keylen)
-{
-	struct crypto_tfm *tfm = crypto_skcipher_tfm(cipher);
-	struct nitrox_crypto_ctx *nctx = crypto_tfm_ctx(tfm);
-	struct flexi_crypto_context *fctx;
-	int aes_keylen, ret;
-
-	ret = xts_check_key(tfm, key, keylen);
-	if (ret)
-		return ret;
-
-	keylen /= 2;
-
-	aes_keylen = flexi_aes_keylen(keylen);
-	if (aes_keylen < 0) {
-		crypto_skcipher_set_flags(cipher, CRYPTO_TFM_RES_BAD_KEY_LEN);
-		return -EINVAL;
-	}
-
-	fctx = nctx->u.fctx;
-	/* copy KEY2 */
-	memcpy(fctx->auth.u.key2, (key + keylen), keylen);
-
-	return nitrox_skcipher_setkey(cipher, aes_keylen, key, keylen);
-}
-
-static int nitrox_aes_ctr_rfc3686_setkey(struct crypto_skcipher *cipher,
-					 const u8 *key, unsigned int keylen)
-{
-	struct crypto_tfm *tfm = crypto_skcipher_tfm(cipher);
-	struct nitrox_crypto_ctx *nctx = crypto_tfm_ctx(tfm);
-	struct flexi_crypto_context *fctx;
-	int aes_keylen;
-
-	if (keylen < CTR_RFC3686_NONCE_SIZE)
-		return -EINVAL;
-
-	fctx = nctx->u.fctx;
-
-	memcpy(fctx->crypto.iv, key + (keylen - CTR_RFC3686_NONCE_SIZE),
-	       CTR_RFC3686_NONCE_SIZE);
-
-	keylen -= CTR_RFC3686_NONCE_SIZE;
-
-	aes_keylen = flexi_aes_keylen(keylen);
-	if (aes_keylen < 0) {
-		crypto_skcipher_set_flags(cipher, CRYPTO_TFM_RES_BAD_KEY_LEN);
-		return -EINVAL;
-	}
-	return nitrox_skcipher_setkey(cipher, aes_keylen, key, keylen);
-}
-
-static struct skcipher_alg nitrox_skciphers[] = { {
-	.base = {
-		.cra_name = "cbc(aes)",
-		.cra_driver_name = "n5_cbc(aes)",
-		.cra_priority = PRIO,
-		.cra_flags = CRYPTO_ALG_ASYNC,
-		.cra_blocksize = AES_BLOCK_SIZE,
-		.cra_ctxsize = sizeof(struct nitrox_crypto_ctx),
-		.cra_alignmask = 0,
-		.cra_module = THIS_MODULE,
-	},
-	.min_keysize = AES_MIN_KEY_SIZE,
-	.max_keysize = AES_MAX_KEY_SIZE,
-	.ivsize = AES_BLOCK_SIZE,
-	.setkey = nitrox_aes_setkey,
-	.encrypt = nitrox_aes_encrypt,
-	.decrypt = nitrox_aes_decrypt,
-	.init = nitrox_skcipher_init,
-	.exit = nitrox_skcipher_exit,
-}, {
-	.base = {
-		.cra_name = "ecb(aes)",
-		.cra_driver_name = "n5_ecb(aes)",
-		.cra_priority = PRIO,
-		.cra_flags = CRYPTO_ALG_ASYNC,
-		.cra_blocksize = AES_BLOCK_SIZE,
-		.cra_ctxsize = sizeof(struct nitrox_crypto_ctx),
-		.cra_alignmask = 0,
-		.cra_module = THIS_MODULE,
-	},
-	.min_keysize = AES_MIN_KEY_SIZE,
-	.max_keysize = AES_MAX_KEY_SIZE,
-	.ivsize = AES_BLOCK_SIZE,
-	.setkey = nitrox_aes_setkey,
-	.encrypt = nitrox_aes_encrypt,
-	.decrypt = nitrox_aes_decrypt,
-	.init = nitrox_skcipher_init,
-	.exit = nitrox_skcipher_exit,
-}, {
-	.base = {
-		.cra_name = "cfb(aes)",
-		.cra_driver_name = "n5_cfb(aes)",
-		.cra_priority = PRIO,
-		.cra_flags = CRYPTO_ALG_ASYNC,
-		.cra_blocksize = AES_BLOCK_SIZE,
-		.cra_ctxsize = sizeof(struct nitrox_crypto_ctx),
-		.cra_alignmask = 0,
-		.cra_module = THIS_MODULE,
-	},
-	.min_keysize = AES_MIN_KEY_SIZE,
-	.max_keysize = AES_MAX_KEY_SIZE,
-	.ivsize = AES_BLOCK_SIZE,
-	.setkey = nitrox_aes_setkey,
-	.encrypt = nitrox_aes_encrypt,
-	.decrypt = nitrox_aes_decrypt,
-	.init = nitrox_skcipher_init,
-	.exit = nitrox_skcipher_exit,
-}, {
-	.base = {
-		.cra_name = "xts(aes)",
-		.cra_driver_name = "n5_xts(aes)",
-		.cra_priority = PRIO,
-		.cra_flags = CRYPTO_ALG_ASYNC,
-		.cra_blocksize = AES_BLOCK_SIZE,
-		.cra_ctxsize = sizeof(struct nitrox_crypto_ctx),
-		.cra_alignmask = 0,
-		.cra_module = THIS_MODULE,
-	},
-	.min_keysize = 2 * AES_MIN_KEY_SIZE,
-	.max_keysize = 2 * AES_MAX_KEY_SIZE,
-	.ivsize = AES_BLOCK_SIZE,
-	.setkey = nitrox_aes_xts_setkey,
-	.encrypt = nitrox_aes_encrypt,
-	.decrypt = nitrox_aes_decrypt,
-	.init = nitrox_skcipher_init,
-	.exit = nitrox_skcipher_exit,
-}, {
-	.base = {
-		.cra_name = "rfc3686(ctr(aes))",
-		.cra_driver_name = "n5_rfc3686(ctr(aes))",
-		.cra_priority = PRIO,
-		.cra_flags = CRYPTO_ALG_ASYNC,
-		.cra_blocksize = 1,
-		.cra_ctxsize = sizeof(struct nitrox_crypto_ctx),
-		.cra_alignmask = 0,
-		.cra_module = THIS_MODULE,
-	},
-	.min_keysize = AES_MIN_KEY_SIZE + CTR_RFC3686_NONCE_SIZE,
-	.max_keysize = AES_MAX_KEY_SIZE + CTR_RFC3686_NONCE_SIZE,
-	.ivsize = CTR_RFC3686_IV_SIZE,
-	.init = nitrox_skcipher_init,
-	.exit = nitrox_skcipher_exit,
-	.setkey = nitrox_aes_ctr_rfc3686_setkey,
-	.encrypt = nitrox_aes_encrypt,
-	.decrypt = nitrox_aes_decrypt,
-}, {
-	.base = {
-		.cra_name = "cts(cbc(aes))",
-		.cra_driver_name = "n5_cts(cbc(aes))",
-		.cra_priority = PRIO,
-		.cra_flags = CRYPTO_ALG_ASYNC,
-		.cra_blocksize = AES_BLOCK_SIZE,
-		.cra_ctxsize = sizeof(struct nitrox_crypto_ctx),
-		.cra_alignmask = 0,
-		.cra_type = &crypto_ablkcipher_type,
-		.cra_module = THIS_MODULE,
-	},
-	.min_keysize = AES_MIN_KEY_SIZE,
-	.max_keysize = AES_MAX_KEY_SIZE,
-	.ivsize = AES_BLOCK_SIZE,
-	.setkey = nitrox_aes_setkey,
-	.encrypt = nitrox_aes_encrypt,
-	.decrypt = nitrox_aes_decrypt,
-	.init = nitrox_skcipher_init,
-	.exit = nitrox_skcipher_exit,
-}, {
-	.base = {
-		.cra_name = "cbc(des3_ede)",
-		.cra_driver_name = "n5_cbc(des3_ede)",
-		.cra_priority = PRIO,
-		.cra_flags = CRYPTO_ALG_ASYNC,
-		.cra_blocksize = DES3_EDE_BLOCK_SIZE,
-		.cra_ctxsize = sizeof(struct nitrox_crypto_ctx),
-		.cra_alignmask = 0,
-		.cra_module = THIS_MODULE,
-	},
-	.min_keysize = DES3_EDE_KEY_SIZE,
-	.max_keysize = DES3_EDE_KEY_SIZE,
-	.ivsize = DES3_EDE_BLOCK_SIZE,
-	.setkey = nitrox_3des_setkey,
-	.encrypt = nitrox_3des_encrypt,
-	.decrypt = nitrox_3des_decrypt,
-	.init = nitrox_skcipher_init,
-	.exit = nitrox_skcipher_exit,
-}, {
-	.base = {
-		.cra_name = "ecb(des3_ede)",
-		.cra_driver_name = "n5_ecb(des3_ede)",
-		.cra_priority = PRIO,
-		.cra_flags = CRYPTO_ALG_ASYNC,
-		.cra_blocksize = DES3_EDE_BLOCK_SIZE,
-		.cra_ctxsize = sizeof(struct nitrox_crypto_ctx),
-		.cra_alignmask = 0,
-		.cra_module = THIS_MODULE,
-	},
-	.min_keysize = DES3_EDE_KEY_SIZE,
-	.max_keysize = DES3_EDE_KEY_SIZE,
-	.ivsize = DES3_EDE_BLOCK_SIZE,
-	.setkey = nitrox_3des_setkey,
-	.encrypt = nitrox_3des_encrypt,
-	.decrypt = nitrox_3des_decrypt,
-	.init = nitrox_skcipher_init,
-	.exit = nitrox_skcipher_exit,
-}
-
-};
 
 int nitrox_crypto_register(void)
 {
-	return crypto_register_skciphers(nitrox_skciphers,
-					 ARRAY_SIZE(nitrox_skciphers));
+	int err;
+
+	err = nitrox_register_skciphers();
+	if (err)
+		return err;
+
+	err = nitrox_register_aeads();
+	if (err) {
+		nitrox_unregister_skciphers();
+		return err;
+	}
+
+	return 0;
 }
 
 void nitrox_crypto_unregister(void)
 {
-	crypto_unregister_skciphers(nitrox_skciphers,
-				    ARRAY_SIZE(nitrox_skciphers));
+	nitrox_unregister_aeads();
+	nitrox_unregister_skciphers();
 }
diff --git a/drivers/crypto/cavium/nitrox/nitrox_common.h b/drivers/crypto/cavium/nitrox/nitrox_common.h
@@ -7,6 +7,10 @@
 
 int nitrox_crypto_register(void);
 void nitrox_crypto_unregister(void);
+int nitrox_register_aeads(void);
+void nitrox_unregister_aeads(void);
+int nitrox_register_skciphers(void);
+void nitrox_unregister_skciphers(void);
 void *crypto_alloc_context(struct nitrox_device *ndev);
 void crypto_free_context(void *ctx);
 struct nitrox_device *nitrox_get_first_device(void);
@@ -19,7 +23,7 @@ void pkt_slc_resp_tasklet(unsigned long data);
 int nitrox_process_se_request(struct nitrox_device *ndev,
 			      struct se_crypto_request *req,
 			      completion_t cb,
-			      struct skcipher_request *skreq);
+			      void *cb_arg);
 void backlog_qflush_work(struct work_struct *work);
 
 
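The last hunk widens the callback argument of nitrox_process_se_request() from struct skcipher_request * to void *, so one completion signature can carry either an skcipher or an AEAD request. A standalone sketch of the idea (the names mirror the driver, but every definition here is a stand-in):

/* Sketch: one void * callback argument serving two request types. */
#include <stdio.h>

typedef void (*completion_t)(void *cb_arg, int err);

struct skcipher_request { const char *name; };
struct aead_request     { const char *name; };

static void skcipher_done(void *cb_arg, int err)
{
        struct skcipher_request *req = cb_arg;
        printf("%s completed: %d\n", req->name, err);
}

static void aead_done(void *cb_arg, int err)
{
        struct aead_request *req = cb_arg;
        printf("%s completed: %d\n", req->name, err);
}

/* stand-in for nitrox_process_se_request(ndev, creq, cb, cb_arg) */
static void process(completion_t cb, void *cb_arg)
{
        cb(cb_arg, 0);
}

int main(void)
{
        struct skcipher_request s = { "skcipher" };
        struct aead_request a = { "aead" };

        process(skcipher_done, &s);
        process(aead_done, &a);
        return 0;
}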
diff --git a/drivers/crypto/cavium/nitrox/nitrox_csr.h b/drivers/crypto/cavium/nitrox/nitrox_csr.h
@@ -54,7 +54,13 @@
 #define NPS_STATS_PKT_DMA_WR_CNT	0x1000190
 
 /* NPS packet registers */
 #define NPS_PKT_INT	0x1040018
+#define NPS_PKT_MBOX_INT_LO	0x1040020
+#define NPS_PKT_MBOX_INT_LO_ENA_W1C	0x1040030
+#define NPS_PKT_MBOX_INT_LO_ENA_W1S	0x1040038
+#define NPS_PKT_MBOX_INT_HI	0x1040040
+#define NPS_PKT_MBOX_INT_HI_ENA_W1C	0x1040050
+#define NPS_PKT_MBOX_INT_HI_ENA_W1S	0x1040058
 #define NPS_PKT_IN_RERR_HI	0x1040108
 #define NPS_PKT_IN_RERR_HI_ENA_W1S	0x1040120
 #define NPS_PKT_IN_RERR_LO	0x1040128
@@ -74,6 +80,10 @@
 #define NPS_PKT_SLC_RERR_LO_ENA_W1S	0x1040240
 #define NPS_PKT_SLC_ERR_TYPE	0x1040248
 #define NPS_PKT_SLC_ERR_TYPE_ENA_W1S	0x1040260
+/* Mailbox PF->VF PF Accessible Data registers */
+#define NPS_PKT_MBOX_PF_VF_PFDATAX(_i)	(0x1040800 + ((_i) * 0x8))
+#define NPS_PKT_MBOX_VF_PF_PFDATAX(_i)	(0x1040C00 + ((_i) * 0x8))
+
 #define NPS_PKT_SLC_CTLX(_i)	(0x10000 + ((_i) * 0x40000))
 #define NPS_PKT_SLC_CNTSX(_i)	(0x10008 + ((_i) * 0x40000))
 #define NPS_PKT_SLC_INT_LEVELSX(_i)	(0x10010 + ((_i) * 0x40000))
Some files were not shown because too many files have changed in this diff.