Commit Graph

27 Commits

Author SHA1 Message Date
Suresh Siddha e49140120c crypto: padlock - fix VIA PadLock instruction usage with irq_ts_save/restore()
Wolfgang Walter reported this oops on his via C3 using padlock for
AES-encryption:

##################################################################

BUG: unable to handle kernel NULL pointer dereference at 000001f0
IP: [<c01028c5>] __switch_to+0x30/0x117
*pde = 00000000
Oops: 0002 [#1] PREEMPT
Modules linked in:

Pid: 2071, comm: sleep Not tainted (2.6.26 #11)
EIP: 0060:[<c01028c5>] EFLAGS: 00010002 CPU: 0
EIP is at __switch_to+0x30/0x117
EAX: 00000000 EBX: c0493300 ECX: dc48dd00 EDX: c0493300
ESI: dc48dd00 EDI: c0493530 EBP: c04cff8c ESP: c04cff7c
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process sleep (pid: 2071, ti=c04ce000 task=dc48dd00 task.ti=d2fe6000)
Stack: dc48df30 c0493300 00000000 00000000 d2fe7f44 c03b5b43 c04cffc8 00000046
       c0131856 0000005a dc472d3c c0493300 c0493470 d983ae00 00002696 00000000
       c0239f54 00000000 c04c4000 c04cffd8 c01025fe c04f3740 00049800 c04cffe0
Call Trace:
 [<c03b5b43>] ? schedule+0x285/0x2ff
 [<c0131856>] ? pm_qos_requirement+0x3c/0x53
 [<c0239f54>] ? acpi_processor_idle+0x0/0x434
 [<c01025fe>] ? cpu_idle+0x73/0x7f
 [<c03a4dcd>] ? rest_init+0x61/0x63
 =======================

Wolfgang also found out that adding kernel_fpu_begin() and kernel_fpu_end()
around the padlock instructions fix the oops.

Suresh wrote:

These padlock instructions though don't use/touch SSE registers, but it behaves
similar to other SSE instructions. For example, it might cause DNA faults
when cr0.ts is set. While this is a spurious DNA trap, it might cause
oops with the recent fpu code changes.

This is the code sequence  that is probably causing this problem:

a) new app is getting exec'd and it is somewhere in between
   start_thread() and flush_old_exec() in the load_xyz_binary()

b) At pont "a", task's fpu state (like TS_USEDFPU, used_math() etc) is
   cleared.

c) Now we get an interrupt/softirq which starts using these encrypt/decrypt
   routines in the network stack. This generates a math fault (as
   cr0.ts is '1') which sets TS_USEDFPU and restores the math that is
   in the task's xstate.

d) Return to exec code path, which does start_thread() which does
   free_thread_xstate() and sets xstate pointer to NULL while
   the TS_USEDFPU is still set.

e) At the next context switch from the new exec'd task to another task,
   we have a scenarios where TS_USEDFPU is set but xstate pointer is null.
   This can cause an oops during unlazy_fpu() in __switch_to()

Now:

1) This should happen with or with out pre-emption. Viro also encountered
   similar problem with out CONFIG_PREEMPT.

2) kernel_fpu_begin() and kernel_fpu_end() will fix this problem, because
   kernel_fpu_begin() will manually do a clts() and won't run in to the
   situation of setting TS_USEDFPU in step "c" above.

3) This was working before the fpu changes, because its a spurious
   math fault  which doesn't corrupt any fpu/sse registers and the task's
   math state was always in an allocated state.

With out the recent lazy fpu allocation changes, while we don't see oops,
there is a possible race still present in older kernels(for example,
while kernel is using kernel_fpu_begin() in some optimized clear/copy
page and an interrupt/softirq happens which uses these padlock
instructions generating DNA fault).

This is the failing scenario that existed even before the lazy fpu allocation
changes:

0. CPU's TS flag is set

1. kernel using FPU in some optimized copy  routine and while doing
kernel_fpu_begin() takes an interrupt just before doing clts()

2. Takes an interrupt and ipsec uses padlock instruction. And we
take a DNA fault as TS flag is still set.

3. We handle the DNA fault and set TS_USEDFPU and clear cr0.ts

4. We complete the padlock routine

5. Go back to step-1, which resumes clts() in kernel_fpu_begin(), finishes
the optimized copy routine and does kernel_fpu_end(). At this point,
we have cr0.ts again set to '1' but the task's TS_USEFPU is stilll
set and not cleared.

6. Now kernel resumes its user operation. And at the next context
switch, kernel sees it has do a FP save as TS_USEDFPU is still set
and then will do a unlazy_fpu() in __switch_to(). unlazy_fpu()
will take a DNA fault, as cr0.ts is '1' and now, because we are
in __switch_to(), math_state_restore() will get confused and will
restore the next task's FP state and will save it in prev tasks's FP state.
Remember, in __switch_to() we are already on the stack of the next task
but take a DNA fault for the prev task.

This causes the fpu leakage.

Fix the padlock instruction usage by calling them inside the
context of new routines irq_ts_save/restore(), which clear/restore cr0.ts
manually in the interrupt context. This will not generate spurious DNA
in the  context of the interrupt which will fix the oops encountered and
the possible FPU leakage issue.

Reported-and-bisected-by: Wolfgang Walter <wolfgang.walter@stwm.de>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-08-13 22:02:26 +10:00
Jeremy Katz b43e726b32 crypto: padlock - Make module loading quieter when hardware isn't available
When loading aes or sha256 via the module aliases, the padlock modules
also try to get loaded.  Make the error message for them not being
present only be a NOTICE rather than an ERROR so that use of 'quiet'
will suppress the messages

Signed-off-by: Jeremy Katz <katzj@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-07-10 20:35:16 +08:00
Sebastian Siewior 7dc748e4e7 [CRYPTO] padlock-aes: Use generic setkey function
The Padlock AES setkey routine is the same as exported by the generic
implementation. So we could use it.

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Cc: Michal Ludvig <michal@logix.cz>
Tested-by: Stefan Hellermann <stefan@the2masters.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-04-21 10:19:34 +08:00
Herbert Xu 866cd902e8 [CRYPTO] padlock: Only reset the key once for each CBC and ECB operation
Currently we reset the key for each segment fed to the xcrypt instructions.
This patch optimises this for CBC and ECB so that we only do this once for
each encrypt/decrypt operation.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-01-11 08:16:59 +11:00
Sebastian Siewior 89e1265431 [CRYPTO] aes: Move common defines into a header file
This three defines are used in all AES related hardware.

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-01-11 08:16:04 +11:00
Herbert Xu 490fe3f05b [CRYPTO] padlock: Fix alignment fault in aes_crypt_copy
The previous patch fixed spurious read faults from occuring by copying
the data if we happen to have a single block at the end of a page.  It
appears that gcc cannot guarantee 16-byte alignment in the kernel with
__attribute__.  The following report from Torben Viets shows a buffer
that's only 8-byte aligned:

> eneral protection fault: 0000 [#1]
> Modules linked in: xt_TCPMSS xt_tcpmss iptable_mangle ipt_MASQUERADE
> xt_tcpudp xt_mark xt_state iptable_nat nf_nat nf_conntrack_ipv4
> iptable_filter ip_tables x_tables pppoe pppox af_packet ppp_generic slhc
> aes_i586
> CPU:    0
> EIP:    0060:[<c035b828>]    Not tainted VLI
> EFLAGS: 00010292   (2.6.23.12 #7)
> EIP is at aes_crypt_copy+0x28/0x40
> eax: f7639ff0   ebx: f6c24050   ecx: 00000001   edx: f6c24030
> esi: f7e89dc8   edi: f7639ff0   ebp: 00010000   esp: f7e89dc8

Since the hardware must have 16-byte alignment, the following patch fixes
this by open coding the alignment adjustment.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-01-11 08:09:35 +11:00
Herbert Xu d4a7dd8e63 [CRYPTO] padlock: Fix spurious ECB page fault
The xcryptecb instruction always processes an even number of blocks so
we need to ensure th existence of an extra block if we have to process
an odd number of blocks.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2007-12-28 11:05:46 +11:00
Sebastian Siewior f8246af005 [CRYPTO] aes: Rename aes to aes-generic
Loading the crypto algorithm by the alias instead of by module directly
has the advantage that all possible implementations of this algorithm
are loaded automatically and the crypto API can choose the best one
depending on its priority.

Additionally it ensures that the generic implementation as well as the
HW driver (if available) is loaded in case the HW driver needs the
generic version as fallback in corner cases.

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2007-10-10 16:55:49 -07:00
Herbert Xu efcf8023e2 [CRYPTO] drivers: Remove obsolete block cipher operations
This patch removes obsolete block operations of the simple cipher type
from drivers.  These were preserved so that existing users can make a
smooth transition.  Now that the transition is complete, they are no
longer needed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:46:16 +10:00
Herbert Xu 28ce728a90 [CRYPTO] padlock: Added block cipher versions of CBC/ECB
This patch adds block cipher algorithms for cbc(aes) and ecb(aes) for
the PadLock device.  Once all users to the old cipher type have been
converted the old cbc/ecb PadLock operations will be removed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:44:35 +10:00
Herbert Xu 560c06ae1a [CRYPTO] api: Get rid of flags argument to setkey
Now that the tfm is passed directly to setkey instead of the ctx, we no
longer need to pass the &tfm->crt_flags pointer.

This patch also gets rid of a few unnecessary checks on the key length
for ciphers as the cipher layer guarantees that the key length is within
the bounds specified by the algorithm.

Rather than testing dia_setkey every time, this patch does it only once
during crypto_alloc_tfm.  The redundant check from crypto_digest_setkey
is also removed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:41:02 +10:00
Michal Ludvig 5644bda5d6 [CRYPTO] padlock: Helper module padlock.ko
Compile a helper module padlock.ko that will try
to autoload all configured padlock algorithms.

This also provides backward compatibility with 
the ancient times before padlock.ko was renamed 
to padlock-aes.ko

Signed-off-by: Michal Ludvig <michal@logix.cz>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:41:00 +10:00
Michal Ludvig ccc17c34d6 [CRYPTO] padlock: Update private header file
PADLOCK_CRA_PRIORITY is shared between padlock-aes and padlock-sha
so it should be in the header.

On the other hand "struct cword" is only used in padlock-aes.c
so it's unnecessary to have it in padlock.h

Signed-off-by: Michal Ludvig <michal@logix.cz>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:40:22 +10:00
Herbert Xu db5e9a4237 [CRYPTO] padlock: Add compatibility alias after rename
Whenever we rename modules we should add an alias to ensure that existing
users can still locate the new module.

This patch also gets rid of the now unused module function prototypes from
padlock.h.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:40:22 +10:00
Michal Ludvig 1191f0a493 [CRYPTO] padlock: Get rid of padlock-generic.c
Merge padlock-generic.c into padlock-aes.c and compile
AES as a standalone module. We won't make a monolithic
padlock.ko with all supported algorithms, instead we'll
compile each driver into its own module.

Signed-off-by: Michal Ludvig <michal@logix.cz>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:40:21 +10:00
Michal Ludvig cc08632f8f [CRYPTO] padlock: Fix alignment after aes_ctx rearrange
Herbert's patch 82062c72cd 
in cryptodev-2.6 tree breaks alignment rules for PadLock 
xcrypt instruction leading to General protection Oopses.

This patch fixes the problem.

Signed-off-by: Michal Ludvig <michal@logix.cz>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-07-15 11:08:50 +10:00
Herbert Xu 82062c72cd [CRYPTO] padlock: Rearrange context structure to reduce code size
i386 assembly has more compact instructions for accessing 7-bit offsets.
So by moving the large members to the end of the structure we can save
quite a bit of code size.  This patch shaves about 10% or 300 bytes off
the padlock-aes file.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-06-26 17:34:39 +10:00
Herbert Xu 6c2bb98bc3 [CRYPTO] all: Pass tfm instead of ctx to algorithms
Up until now algorithms have been happy to get a context pointer since
they know everything that's in the tfm already (e.g., alignment, block
size).

However, once we have parameterised algorithms, such information will
be specific to each tfm.  So the algorithm API needs to be changed to
pass the tfm structure instead of the context pointer.

This patch is basically a text substitution.  The only tricky bit is
the assembly routines that need to get the context pointer offset
through asm-offsets.h.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-06-26 17:34:39 +10:00
Herbert Xu f10b7897ee [CRYPTO] api: Align tfm context as wide as possible
Since tfm contexts can contain arbitrary types we should provide at least
natural alignment (__attribute__ ((__aligned__))) for them.  In particular,
this is needed on the Xscale which is a 32-bit architecture with a u64 type
that requires 64-bit alignment.  This problem was reported by Ronen Shitrit.

The crypto_tfm structure's size was 44 bytes on 32-bit architectures and
80 bytes on 64-bit architectures.  So adding this requirement only means
that we have to add an extra 4 bytes on 32-bit architectures.

On i386 the natural alignment is 16 bytes which also benefits the VIA
Padlock as it no longer has to manually align its context structure to
128 bits.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-03-21 20:14:08 +11:00
Herbert Xu 102d60a2d8 [PATCH] padlock: Fix typo that broke 256-bit keys
A typo crept into the le32_to_cpu patch which broke 256-bit keys
in the padlock driver.  The following patch based on observations
by Michael Heyse fixes the problem.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-22 07:47:07 -08:00
Herbert Xu c8a19c91b5 [CRYPTO] Allow AES C/ASM implementations to coexist
As the Crypto API now allows multiple implementations to be registered
for the same algorithm, we no longer have to play tricks with Kconfig
to select the right AES implementation.

This patch sets the driver name and priority for all the AES
implementations and removes the Kconfig conditions on the C implementation
for AES.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-01-09 14:15:39 -08:00
Herbert Xu 06ace7a9ba [CRYPTO] Use standard byte order macros wherever possible
A lot of crypto code needs to read/write a 32-bit/64-bit words in a
specific gender.  Many of them open code them by reading/writing one
byte at a time.  This patch converts all the applicable usages over
to use the standard byte order macros.

This is based on a previous patch by Denis Vlasenko.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-01-09 14:15:34 -08:00
Herbert Xu 476df259cd [CRYPTO] Update IV correctly for Padlock CBC encryption
When the Padlock does CBC encryption, the memory pointed to by EAX is
not updated at all.  Instead, it updates the value of EAX by pointing
it to the last block in the output.  Therefore to maintain the correct
semantics we need to copy the IV.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-06 13:54:09 -07:00
Herbert Xu fbdae9f3e7 [CRYPTO] Ensure cit_iv is aligned correctly
This patch ensures that cit_iv is aligned according to cra_alignmask
by allocating it as part of the tfm structure.  As a side effect the
crypto layer will also guarantee that the tfm ctx area has enough space
to be aligned by cra_alignmask.  This allows us to remove the extra
space reservation from the Padlock driver.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-06 13:53:29 -07:00
Herbert Xu 28e8c3ad94 [PADLOCK] Implement multi-block operations
By operating on multiple blocks at once, we expect to extract more
performance out of the VIA Padlock.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-06 13:52:43 -07:00
Herbert Xu 6789b2dc45 [PADLOCK] Move fast path work into aes_set_key and upper layer
Most of the work done aes_padlock can be done in aes_set_key.  This
means that we only have to do it once when the key changes rather
than every time we perform an encryption or decryption.

This patch also sets cra_alignmask to let the upper layer ensure
that the buffers fed to us are aligned correctly.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-06 13:52:27 -07:00
Linus Torvalds 1da177e4c3 Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!
2005-04-16 15:20:36 -07:00