The generated copy_page for R4k CPU with a 128 byte cache line size used
Create Dirty Exclusive cache line operations even if only part of the
cache line was filled. This change avoids generating cache operations,
if only part of the cache line size is copied in one loop. It also
increases the maxmimum loop size, because the generated code even fits
into the available space for r4k CPUs with 128 byte cache line size.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Fold the SB-1 specific implementation of clear_page/copy_page in the
generic version, and rewrite that one in tlbex style. The immediate
benefits:
- It converts the compile-time workaround for SB-1 pass 1 prefetches
to a more efficient run-time check.
- It allows adjustment of loop unfolling, which helps to reduce the
number of redundant cdex cache ops.
- It fixes some esoteric cornercases (the cache line length calculations
can go wrong, and support for 64k pages without prefetch instructions
will overflow the addiu immediate).
- Somewhat better guesses of "good" prefetch values.
Signed-off-by: Thiemo Seufer <ths@networkno.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>