Commit Graph

16 Commits

Author SHA1 Message Date
Arnd Bergmann 23e0e8afaf powerpc/cell/axon-msi: Fix MSI after kexec
Commit d015fe995 'powerpc/cell/axon-msi: Retry on missing interrupt'
has turned a rare failure to kexec on QS22 into a reproducible
error, which we have now analysed.

The problem is that after a kexec, the MSIC hardware still points
into the middle of the old ring buffer.  We set up the ring buffer
during reboot, but not the offset into it.  On older kernels, this
would cause a storm of thousands of spurious interrupts after a
kexec, which would most of the time get dropped silently.

With the new code, we time out on each interrupt, waiting for
it to become valid.  If more interrupts come in that we time
out on, this goes on indefinitely, which eventually leads to
a hard crash.

The solution in this commit is to read the current offset from
the MSIC when reinitializing it.  This now works correctly, as
expected.

Reported-by: Dirk Herrendoerfer <d.herrendoerfer@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-16 13:48:18 +11:00
Arnd Bergmann d015fe9951 powerpc/cell/axon-msi: Retry on missing interrupt
The MSI capture logic for the axon bridge can sometimes
lose interrupts in case of high DMA and interrupt load,
when it signals an MSI interrupt to the MPIC interrupt
controller while we are already handling another MSI.

Each MSI vector gets written into a FIFO buffer in main
memory using DMA, and that DMA access is normally flushed
by the actual interrupt packet on the IOIF.  An MMIO
register in the MSIC holds the position of the last
entry in the FIFO buffer that was written.  However,
reading that position does not flush the DMA, so that
we can observe stale data in the buffer.

In a stress test, we have observed the DMA to arrive
up to 14 microseconds after reading the register.

This patch works around this problem by retrying the
access to the FIFO buffer.

We can reliably detect the conditioning by writing
an invalid MSI vector into the FIFO buffer after
reading from it, assuming that all MSIs we get
are valid.  After detecting an invalid MSI vector,
we udelay(1) in the interrupt cascade for up to
100 times before giving up.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-01 09:40:18 +11:00
Michael Ellerman 19fc65b525 powerpc: Fix irq_alloc_host() reference counting and callers
When I changed irq_alloc_host() to take an of_node
(52964f87c64e6c6ea671b5bf3030fb1494090a48: "Add an optional
device_node pointer to the irq_host"), I botched the reference
counting semantics.

Stephen pointed out that it's irq_alloc_host()'s business if
it needs to take an additional reference to the device_node,
the caller shouldn't need to care.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-09 13:51:16 +10:00
Michael Ellerman 997526db9f powerpc: Rework Axon MSI setup so we can avoid freeing the irq_host
If we do the call to irq_of_parse_and_map() first, then we don't
need to worry about freeing the irq_host.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-09 13:51:13 +10:00
Michael Ellerman 72cac213fd [POWERPC] Add debugging trigger to Axon MSI code
This adds some debugging code to the Axon MSI driver.  It creates a
file per MSIC in /sys/kernel/debug/powerpc, which allows the user to
trigger a fake MSI interrupt by writing to the file.

This can be used to test some of the MSI generation path.  In
particular, that the MSIC recognises a write to the MSI address,
generates an interrupt and writes the MSI packet into the ring buffer.

All the code is inside #ifdef DEBUG so it causes no harm unless it's
enabled.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-23 15:27:28 +10:00
Michael Ellerman 988479ebcc [POWERPC] Use of_get_next_parent() in platforms/cell/axon_msi.c
Replace two open-coded occurences of the of_get_next_parent() logic.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-24 20:58:03 +10:00
Stephen Rothwell c6d01179bf [POWERPC] Avoid possible extra of_node_put in axon_msi.c
I got this warning from gcc:
arch/powerpc/platforms/cell/axon_msi.c:118: warning: 'tmp' may be used uninitialized in this function

Which turns out to be a false positive, but pointed out that it was
possible for the error path in find_msi_translator() to do an extra
of_node_put on a node.  This fixes it by localising the ref counting
a bit.  As a side effect, the warning goes away.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-02-06 16:30:00 +11:00
Michael Ellerman de4c928b84 [POWERPC] Avoid DMA exception when using axon_msi with IOMMU
There's a brown-paper-bag bug in axon_msi, we pass the address of our
FIFO directly to the hardware, without DMA mapping it.  This leads to
DMA exceptions if you enable MSI & the IOMMU.

The fix is to correctly DMA map the fifo, dma_alloc_coherent() does
what we want - and we need to track the virt & phys addresses.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-02-06 16:30:00 +11:00
Michael Ellerman e4347dfb58 [POWERPC] Convert axon_msi to an of_platform driver
Now that we create of_platform devices earlier on cell, we can make the
axon_msi driver an of_platform driver.  This makes the code cleaner in
several ways, and most importantly means we have a struct device.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-02-06 16:30:00 +11:00
Michael Ellerman 2843e7f7d6 Remove msic_dcr_read() in axon_msi.c
msic_dcr_read() doesn't really do anything useful, just replace it with
direct calls to dcr_read().

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-15 14:29:49 -04:00
Michael Ellerman 83f34df4e7 Add dcr_host_t.base in dcr_read()/dcr_write()
Now that all users of dcr_read()/dcr_write() add the dcr_host_t.base, we
can save them the trouble and do it in dcr_read()/dcr_write().

As some background to why we just went through all this jiggery-pokery,
benh sayeth:

 Initially the goal of the dcr_read/dcr_write routines was to operate like
 mfdcr/mtdcr which take absolute DCR numbers. The reason is that on 4xx
 hardware, indirect DCR access is a pain (goes through a table of
 instructions) and it's useful to have the compiler resolve an absolute DCR
 inline.

 We decided that wasn't worth the API bastardisation since most places
 where absolute DCR values are used are low level 4xx-only code which may
 as well continue using mfdcr/mtdcr, while the new API is designed for
 device "instances" that can exist on 4xx and Axon type platforms and may
 be located at variable DCR offsets.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-15 14:29:49 -04:00
Michael Ellerman 4acb889627 [POWERPC] Update axon_msi to use dcr_host_t.base
Now that dcr_host_t contains the base address, we can use that in the
axon_msi code, rather than storing it separately.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-03 13:25:28 +10:00
Michael Ellerman db220b234d [POWERPC] Make sure to of_node_get() the result of pci_device_to_OF_node()
pci_device_to_OF_node() returns the device node attached to a PCI device,
but doesn't actually grab a reference - we need to do it ourselves.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-03 09:11:29 +10:00
Michael Ellerman 6815800601 [POWERPC] Provide a default irq_host match, which matches on an exact of_node
The most common match semantic is an exact match based on the device node.
So provide a default implementation that does this, and hook it up if no
match routine is specified.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-14 01:33:20 +10:00
Michael Ellerman 52964f87c6 [POWERPC] Add an optional device_node pointer to the irq_host
The majority of irq_host implementations (3 out of 4) are associated
with a device_node, and need to stash it somewhere. Rather than having
it somewhere different for each host, add an optional device_node pointer
to the irq_host structure.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-14 01:33:20 +10:00
Michael Ellerman ce21b3c964 [CELL] add support for MSI on Axon-based Cell systems
This patch adds support for the setup and decoding of MSIs
on Axon-based Cell systems, using the MSIC mechanism.

This involves setting up an area of BE memory which the Axon
then uses as a FIFO for MSI messages. When one or more MSIs
are decoded by the MSIC we receive an interrupt on the MPIC,
and the MSI messages are written into the FIFO. At the moment
we use a 64KB FIFO, one per MSIC/BE.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-07-20 21:41:45 +02:00