Pgtable bits are assigned dynamically depending on processor feature and
statically based on kernel configuration. To make sense out of the
disassembled TLB exception handlers a list of the actual assignments
used for a particular configuration and hardware setup can be very useful.
Output the actual TLB exception handlers in a format that simplifies their
post processsing from dmesg output.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
From a software perspective R5000 and R5000A are the same thing which is
why the symbol CPU_R5000A never got used, so finally delete it.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
R5000 and the Nevada CPUs (RM5230, RM5231, RM5260, RM5261, RM5270 and
RM5271) are basically the same CPU core and all are documented to require
two instructions separating a write to c0_pagemask, c0_entryhi, c0_entrylo0,
c0_entrylo1 or c0_index.
So far we were only providing on cycle before / after a TLBR/TLBWI
for R5000 but 3 cycles before and 1 cycles after for the Nevadas.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The microassembler used in tlbex.c does not notice if a label is redefined
resulting in relocations against such labels silently missrelocated.
The issues exists since commit add6eb04776db4189ea89f596cbcde31b899be9d
[Synthesize TLB exception handlers at runtime.] in 2.6.10 and went unnoticed
for so long because the relocations for the affected branches got computed
to do something *almost* sensible.
The issue affects R4000, R4400, QED/IDT RM5230, RM5231, RM5260, RM5261,
RM5270 and RM5271 processors.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We don't have to do a separate shift to eliminate the software bits,
just rotate them into the fill and they will be ignored.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/4294/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Remove usage of the 'kernel_uses_smartmips_rixi' macro from all files
and use new 'cpu_has_rixi' instead.
Signed-off-by: Steven J. Hill <sjhill@mips.com>
Acked-by: David Daney <david.daney@cavium.com>
The EXT and INS instructions can be used to decrease code size and
thus speed up TLB handlers on MIPS32R2 and MIPS64R2 cores.
Signed-off-by: Steven J. Hill <sjhill@mips.com>
The architecture specification says that an EHB instruction is
needed to avoid a hazard when writing TLB entries. However, some
cores do not have this hazard, and thus the EHB instruction causes
a costly pipeline stall. Detect these cores and do not use the EHB
instruction.
Signed-off-by: Steven J. Hill <sjhill@mips.com>
For the case PM_DEFAULT_MASK == 0, we were placing a branch in the
delay slot of another branch. This leads to undefined behavior.
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2775/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Only some GCC versions such as gcc 4.2 notice that the variable wr in
build_r3000_tlb_modify_handler is used uninitialized. When using one
of those GCCs the build will fail due to -Werror. GCC 4.6 does not
warn about the uninitialized use of wr.
This issue was introduced by 7211f4d7a3dcbe57c5d396c334dca525315dceb2
[MIPS: Close races in TLB modify handlers.]
Reported-by: Ganesan Ramalingam <ganesan18@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Page table entries are made invalid by writing a zero into the the PTE
slot in a page table. This creates a race condition with the TLB
modify handlers when they are updating the PTE.
CPU0 CPU1
Test for _PAGE_PRESENT
. set to not _PAGE_PRESENT (zero)
Set to _PAGE_VALID
So now the page not present value (zero) is suddenly valid and user
space programs have access to physical page zero.
We close the race by putting the test for _PAGE_PRESENT and setting of
_PAGE_VALID into an atomic LL/SC section. This requires more registers
than just K0 and K1 in the handlers, so we need to save some registers
to a save area and then restore them when we are done.
The save area is an array of cacheline aligned structures that should
not suffer cache line bouncing as they are CPU private.
[ralf@linux-mips.org: Fix !defined(CONFIG_MIPS_PGD_C0_CONTEXT) build error.]
Signed-off-by: David Daney <david.daney@cavium.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2577/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
CPU_XLR case added to mm/tlbex.c
CPU_XLR case added to mm/c-r4k.c for PINDEX attribute
Feature overrides for XLR cpu.
Signed-off-by: Jayachandran C <jayachandranc@netlogicmicro.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2333/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
CC arch/mips/mm/tlbex.o
arch/mips/mm/tlbex.c: In function 'build_r4000_tlb_refill_handler':
arch/mips/mm/tlbex.c:1155:22: error: variable 'vmalloc_mode' set but not used [-Werror=unused-but-set-variable]
arch/mips/mm/tlbex.c:1154:28: error: variable 'htlb_info' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
It was reported that GCC-4.3.3 (with CodeSourcery extensions) fails
without this.
Reported-by: Jonas Gorski <jonas.gorski@gmail.com>
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2010/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Octeon can use scratch registers in the TLB handlers. Octeon II can
use LDX instructions.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/1904/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
If the CPU supports BBIT0 and BBIT1, use them in TLB handlers as they
are more efficient than an AND followed by an branch and then
restoring the clobbered register.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/1873/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
BMIPS processor cores are used in 50+ different chipsets spread across
5+ product lines. In many cases the chipsets do not share the same
peripheral register layouts, the same register blocks, the same
interrupt controllers, the same memory maps, or much of anything else.
But, across radically different SoCs that share nothing more than the
same BMIPS CPU, a few things are still mostly constant:
SMP operations
Access to performance counters
DMA cache coherency quirks
Cache and memory bus configuration
So, it makes sense to treat each BMIPS processor type as a generic
"building block," rather than tying it to a specific SoC. This makes it
easier to support a large number of BMIPS-based chipsets without
unnecessary duplication of code, and provides the infrastructure needed
to support BMIPS-proprietary features.
Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
Cc: mbizon@freebox.fr
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Tested-by: Florian Fainelli <ffainelli@freebox.fr>
Patchwork: https://patchwork.linux-mips.org/patch/1706/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org
For some combinations of PAGE_SIZE and vmbits, it is possible to have
userspace access that are beyond what is covered by the PGD, but within
vmbits. Such an access would cause the TLB refill handler to load garbage
values for PMD and PTE potentially giving userspace access to parts of the
physical address space to which it is not entitled.
In the TLB refill hot path, we add a single dsrl instruction so we can
check if any bits outside of the range covered by the PGD are set. In
the vmalloc side we then separate the bad case from the normal vmalloc
case and call tlb_do_page_fault_0 if warranted. This slows us down a
bit, but has the benefit of yielding deterministic behavior.
[Ralf: Fixed build error for 32-bit kernels.]
[Ralf: Folded lmo commit c8c0e22b2aa3982852b44279638ef37f9aa31b7d into this
commit.]
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/1152/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
---
This makes the code somewhat cleaner while reducing the risk of shift
amount overflows when various page table related options are changed.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/1154/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The SmartMIPS ASE specifies how Read Inhibit (RI) and eXecute Inhibit
(XI) bits in the page tables work. The upper two bits of EntryLo{0,1}
are RI and XI when the feature is enabled in the PageGrain register.
SmartMIPS only covers 32-bit systems. Cavium Octeon+ extends this to
64-bit systems by continuing to place the RI and XI bits in the top of
EntryLo even when EntryLo is 64-bits wide.
Because we need to carry the RI and XI bits in the PTE, the layout of
the PTE is changed. There is a two instruction overhead in the TLB
refill hot path to get the EntryLo bits into the proper position.
Also the TLB load exception has to probe the TLB to check if RI or XI
caused the exception.
Also of note is that the layout of the PTE bits is done at compile and
runtime rather than statically. In the 32-bit case this allows for
the same number of PFN bits as before the patch as the _PAGE_HUGE is
not supported in 32-bit kernels (we have _PAGE_NO_EXEC and
_PAGE_NO_READ instead of _PAGE_READ and _PAGE_HUGE).
The patch is tested on Cavium Octeon+, but should also work on 32-bit
systems with the Smart-MIPS ASE.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/952/
Patchwork: http://patchwork.linux-mips.org/patch/956/
Patchwork: http://patchwork.linux-mips.org/patch/962/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
64-bit CPUs have 64-bit c0_entrylo{0,1} registers. We should use the
64-bit dmtc0 instruction to set them. This becomes important if we
want to set the RI and XI bits present in some processors.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/954/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
For 64-bit kernels with 64KB pages and two level page tables, there are
42 bits worth of virtual address space This is larger than the 40 bits of
virtual address space obtained with the default 4KB Page size and three
levels, so there are no draw backs for using two level tables with this
configuration.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/761/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Processors that support the mips64r2 ISA can in four instructions
convert a shifted PGD pointer stored in the upper bits of c0_context
into a usable pointer. By doing this we save a memory load and
associated potential cache miss in the TLB exception handlers.
Since the upper bits of c0_context were holding the CPU number, we
move this to the upper bits of c0_xcontext which doesn't have enough
bits to hold the PGD pointer, but has plenty for the CPU number.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Todo: Nothing ever detects CPU_BCM6338 but the code tests for it anyway.
Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
By combining swapper_pg_dir and module_pg_dir, several if conditions
can be eliminated from the tlb exception handler. The reason they
can be combined is that, the effective virtual address of vmalloc
returned is at the bottom, and of module_alloc returned is at the
top. It also fixes the bug in vmalloc(), which happens when its
return address is not covered by the first pgd.
Signed-off-by: Wu Fei <at.wufei@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Some of the were relying into smp.h being dragged in by another header
which of course is fragile. <asm/cpu-info.h> uses smp_processor_id()
only in macros and including smp.h there leads to an include loop, so
don't change cpu-info.h.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The TLB handlers need to check for huge pages and give them special
handling. Huge pages consist of two contiguous sub-pages of physical
memory.
* Loading entrylo0 and entrylo1 need to be handled specially.
* The page mask must be set for huge pages and then restored after
writing the TLB entries.
* The PTE for huge pages resides in the PMD, we halt traversal of the
tables there.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The l parameter to iPTE_LW() is unused. Remove it and from some of its
callers as well.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
CPU_CAVIUM_OCTEON is mips_r2 which is handled before the switch. This
label in the switch statement is dead code, so we remove it.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Reviewed by: David VomLehn <dvomlehn@cisco.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Some CPUs do not need ehb instructions after writing CP0 registers.
By allowing ehb generation to be overridden in
cpu-feature-overrides.h, we can save a few instructions in the TLB
handler hot paths.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Try to fold the 64-bit TLB refill handler opportunistically at the
beginning of the vmalloc path so as to avoid splitting execution flow in
half and wasting cycles for a branch required at that point then. Resort
to doing the split if either of the newly created parts would not fit into
its designated slot.
Original-patch-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The logic used to split the r4000 refill handler is liberally
sprinkled with magic numbers. We attempt to explain what they are and
normalize them against a new symbolic value (MIPS64_REFILL_INSNS).
CC: David VomLehn <dvomlehn@cisco.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The Alchemy manuals state:
"All pipeline hazards and dependencies are enforced by hardware interlocks
so that any sequence of instructions is guaranteed to execute correctly.
Therefore, it is not necessary to pad legacy MIPS hazards (such as
load delay slots and coprocessor accesses) with NOPs."
Run-tested on Au12x0, without any ill effects.
Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch removes the various CPU_AU1??? model constants in favor of
a single CPU_ALCHEMY one.
All currently existing Alchemy models are identical in terms of cpu
core and cache size/organization. The parts of the mips kernel which
need to know the exact CPU revision extract it from the c0_prid register
already; and finally nothing else in-tree depends on those any more.
Should a new variant with slightly different "company options" and/or
"processor revision" bits in c0_prid appear, it will be supported
immediately (minus an exact model string in cpuinfo).
Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Current VR5500 processor support lacks of some functions which are
expected to be configured/synthesized on arch initialization.
Here're some VR5500A spec notes:
* All execution hazards are handled in hardware.
* Once VR5500A stops the operation of the pipeline by WAIT instruction,
it could return from the standby mode only when either a reset, NMI
request, or all enabled interrupts is/are detected. In other words,
if interrupts are disabled by Status.IE=0, it keeps in standby mode
even when interrupts are internally asserted.
Notes on WAIT: The operation of the processor is undefined if WAIT
insn is in the branch delay slot. The operation is also undefined
if WAIT insn is executed when Status.EXL and Status.ERL are set to 1.
* VR5500A core only implements the Load prefetch.
With these changes, it boots fine.
Signed-off-by: Shinya Kuribayashi <shinya.kuribayashi@necel.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Expand the case statement for build_tlb_write_entry so that it does
the right thing on Cavium CPU variants.
Signed-off-by: Tomaso Paoletti <tpaoletti@caviumnetworks.com>
Signed-off-by: Paul Gortmaker <Paul.Gortmaker@windriver.com>
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
trap_init issues flush_icache_range(), which uses ipi functions to
get icache flushing done on all cpus. But this is done before interrupts
are enabled and caused WARN_ON messages. This changeset introduces
a new local_flush_icache_range() and uses it before interrupts (and
additional CPUs) are enabled to avoid this problem.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Treat R4700 like R4600 in build_tlb_probe_entry. Without this fix kernel
will lock up.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Early 4KEc were MIPS32r1 and therefore need some love to get a TLB
refill handler.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch moves the micro-assembler in a separate implementation, as
it is useful for further run-time optimizations. The only change in
behaviour is cutting down printk noise at kernel startup time.
Checkpatch complains about macro parameters which aren't protected by
parentheses. I believe this is a flaw in checkpatch, the paste operator
used in those macros won't work with parenthesised parameters.
Signed-off-by: Thiemo Seufer <ths@networkno.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>