Don't piggy back the SMP receive signal code to do the
context version change handling.
Instead allocate another fixed PIL number for this
asynchronous cross-call. We can't use smp_call_function()
because this thing is invoked with interrupts disabled
and a few spinlocks held.
Also, fix smp_call_function_mask() to count "cpus" correctly.
There is no guarentee that the local cpu is in the mask
yet that is exactly what this code was assuming.
Signed-off-by: David S. Miller <davem@davemloft.net>
Earlier I unifdefed PageCompound, so that snd_pcm_mmap_control_nopage and
others can give out a 0-order component of a higher-order page, which won't
be mistakenly freed when zap_pte_range unmaps it. But many Bad page states
reported a PG_reserved was freed after all: I had missed that we need to
say __GFP_COMP to get compound page behaviour.
Some of these higher-order pages are allocated by snd_malloc_pages, some by
snd_malloc_dev_pages; or if SBUS, by sbus_alloc_consistent - but that has
no gfp arg, so add __GFP_COMP into its sparc32/64 implementations.
I'm still rather puzzled that DRM seems not to need a similar change.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use ARRAY_SIZE macro instead of sizeof(x)/sizeof(x[0]) and remove a
duplicate of ARRAY_SIZE which is never used anyways.
Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
It appears that a memory barrier soon after a mispredicted
branch, not just in the delay slot, can cause the hang
condition of this cpu errata.
So move them out-of-line, and explicitly put them into
a "branch always, predict taken" delay slot which should
fully kill this problem.
Signed-off-by: David S. Miller <davem@davemloft.net>
Firstly, if the direction is TODEVICE, then dirty data in the
streaming cache is impossible so we can elide the flush-flag
synchronization in that case.
Next, the context allocator is broken. It is highly likely
that contexts get used multiple times for different dma
mappings, which confuses the strbuf flushing code and makes
it run inefficiently.
Signed-off-by: David S. Miller <davem@davemloft.net>
The recent change to add a timeout to strbuf flushing had
a negative performance impact. The udelay()'s are too long,
and they were done in the wrong order wrt. the register read
checks. Fix both, and things are happy again.
There are more possible improvements in this area. In fact,
PCI streaming buffer flushing seems to be part of the bottleneck
in network receive performance on my SunBlade1000 box.
Signed-off-by: David S. Miller <davem@davemloft.net>
If some hardware error occurs and the flush flag never updates,
we will hang forever in these routines. Add a timeout, and
print out a diagnostic if it is reached.
Signed-off-by: David S. Miller <davem@davemloft.net>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!