TLB operations only need to be serialized on machines with the Merced
(Stretch) bus. The only machines in this category are L and N class, and
they require a 64-bit PA 2.0 kernel. On these machines, we use local TLB
purges in the tmpalias routines.
TLB purges don't need to be serialized on any other machines. Thus, the
lock/unlock code can be removed when CONFIG_PA20 is not defined.
Further, when CONFIG_PA20 is not defined, alternative patching converts
the TLB purges to local purges when PA 2.0 hardware has been detected.
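As a minimal sketch of the resulting logic in the tmpalias paths (the
purge_data_tlb_entry*() helpers below are illustrative stand-ins for the
pdtlb/pdtlb,l instructions, not real kernel functions):

    /* Illustrative stand-ins for the pdtlb and pdtlb,l instructions. */
    static inline void purge_data_tlb_entry(unsigned long addr) { }
    static inline void purge_data_tlb_entry_local(unsigned long addr) { }

    static inline void purge_tlb_entry_sketch(unsigned long addr)
    {
    #ifdef CONFIG_PA20
            /* A 64-bit kernel always runs on PA 2.0 hardware, so we can
             * purge only the local TLB entry; no serialization needed. */
            purge_data_tlb_entry_local(addr);       /* pdtlb,l */
    #else
            /* A 32-bit kernel never runs on a Merced-bus machine, so the
             * tlb_lock()/tlb_unlock() around this purge can go away.
             * Alternative patching rewrites the instruction to a local
             * purge (pdtlb,l) once PA 2.0 hardware has been detected. */
            purge_data_tlb_entry(addr);             /* pdtlb */
    #endif
    }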
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Tested-By: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
The attached patch implements three optimizations:
1) Loops in flush_user_dcache_range_asm, flush_kernel_dcache_range_asm,
purge_kernel_dcache_range_asm, flush_user_icache_range_asm, and
flush_kernel_icache_range_asm are unrolled to reduce branch overhead.
2) The static branch prediction for cmpb instructions in pacache.S has
been reviewed and the operand order adjusted where necessary.
3) For flush routines in cache.c, we purge rather than flush when we have no
context. The pdc instruction at level 0 is not required to write back
dirty lines to memory. This provides a performance improvement over the
fdc instruction if the feature is implemented.
Version 2 adds alternative patching.
The patch provides an average improvement of about 2%.
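To illustrate optimization 1), an unrolled flush-by-line loop looks roughly
like the sketch below (the real code is assembler in pacache.S; fdc_line()
is a hypothetical stand-in for a single fdc instruction):

    /* Hypothetical stand-in for one fdc (flush data cache) instruction. */
    static inline void fdc_line(unsigned long addr) { }

    static void flush_dcache_range_sketch(unsigned long start,
                                          unsigned long end,
                                          unsigned long line_size)
    {
            unsigned long addr = start;

            /* Main loop: four cache lines per iteration, so only one
             * branch is executed per four flushes. */
            while (addr + 4 * line_size <= end) {
                    fdc_line(addr);
                    fdc_line(addr + line_size);
                    fdc_line(addr + 2 * line_size);
                    fdc_line(addr + 3 * line_size);
                    addr += 4 * line_size;
            }

            /* Tail loop for any remaining lines. */
            while (addr < end) {
                    fdc_line(addr);
                    addr += line_size;
            }
    }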
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
This patch adds the necessary code to patch a running kernel at runtime
to improve performance.
The current implementation offers a few optimization variants:
- When running an SMP kernel on a single-processor (UP) machine, unwanted
  assembler statements like locking functions are overwritten with NOPs.
  When multiple instructions need to be skipped, one branch instruction is
  used instead of multiple nop instructions.
- In the UP case, some pdtlb and pitlb instructions are patched to
  become pdtlb,l and pitlb,l, which only flush the CPU-local TLB
  entries instead of broadcasting the flush to other CPUs in the system
  and thus may improve performance.
- fic and fdc instructions are skipped if no I- or D-caches are
installed. This should speed up qemu emulation and cacheless systems.
- If no cache coherence is needed for IO operations, the relevant fdc
and sync instructions in the sba and ccio drivers are replaced by
nops.
- On systems which share I- and D-TLBs and thus don't have a separate
instruction TLB, the pitlb instruction is replaced by a nop.
Live-patching is done early in the boot process, just after having run
the system inventory. No drivers are running and thus no external
interrupts should arrive. So the hope is that no TLB exceptions will
occur during the patching. If this turns out to be wrong we will
probably need to do the patching in real-mode.
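Conceptually, the patcher walks a table of candidate sites recorded at
build time and rewrites the instruction when the matching condition holds;
a simplified sketch (not the actual parisc data layout or helper names):

    #include <linux/types.h>

    struct alt_entry_sketch {
            u32 *addr;              /* instruction to rewrite */
            u32 cond;               /* e.g. "running UP" or "no D-cache" */
            u32 replacement;        /* NOP, branch over N insns, pdtlb,l, ... */
    };

    static void apply_alternatives_sketch(struct alt_entry_sketch *start,
                                          struct alt_entry_sketch *end,
                                          u32 active_conds)
    {
            struct alt_entry_sketch *e;

            for (e = start; e < end; e++) {
                    if (!(e->cond & active_conds))
                            continue;
                    *e->addr = e->replacement;
                    /* The real code also takes care of I-cache/TLB
                     * maintenance so the new instruction is fetched. */
            }
    }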
Signed-off-by: Helge Deller <deller@gmx.de>
This patchset fixes and improves stack unwinding a lot:
1. Show backward stack traces with up to 30 callsites
2. Add callinfo to ENTRY_CFI() such that every assembler function will get an
entry in the unwind table
3. Use constants instead of numbers in call_on_stack()
4. Do not depend on CONFIG_KALLSYMS to generate backtraces.
5. Speed up backtrace generation
Make sure you have this patch to GNU as installed:
https://sourceware.org/ml/binutils/2018-07/msg00474.html
Without this patch, unwind info in the kernel is often wrong for various
functions.
Signed-off-by: Helge Deller <deller@gmx.de>
For years I thought all parisc machines executed loads and stores in
order. However, Jeff Law recently indicated on gcc-patches that this is
not correct. There are various degrees of out-of-order execution all the
way back to the PA7xxx processor series (hit-under-miss). The PA8xxx
series has full out-of-order execution for both integer operations, and
loads and stores.
This is described in the following article:
http://web.archive.org/web/20040214092531/http://www.cpus.hp.com/technical_references/advperf.shtml
For this reason, we need to define mb() and insert a memory barrier
before the store that unlocks a spinlock. This ensures that all memory
accesses are complete prior to unlocking. The ldcw instruction performs
the same function on entry.
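A simplified sketch of the resulting unlock path (not the actual
arch/parisc spinlock implementation; on parisc the lock word is non-zero
when free and ldcw atomically loads it and stores zero):

    #include <asm/barrier.h>        /* mb() */

    static inline void example_spin_unlock(volatile unsigned int *lock_word)
    {
            mb();                   /* complete all prior loads/stores ... */
            *lock_word = 1;         /* ... before the releasing store */
    }

On the acquire side, the ldcw instruction itself provides the ordering, so
no extra barrier is added there.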
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Helge Deller <deller@gmx.de>
The change to flush_kernel_vmap_range() wasn't sufficient to avoid the
SMP stalls. The problem is some drivers call these routines with
interrupts disabled. Interrupts need to be enabled for flush_tlb_all()
and flush_cache_all() to work. This version adds checks to ensure
interrupts are not disabled before calling routines that need IPI
interrupts. When interrupts are disabled, we now drop into slower code.
The attached change fixes the ordering of cache and TLB flushes in
several cases. When we flush the cache using the existing PTE/TLB
entries, we need to flush the TLB after doing the cache flush. We don't
need to do this when we flush the entire instruction and data caches as
these flushes don't use the existing TLB entries. The same is true for
tmpalias region flushes.
The flush_kernel_vmap_range() and invalidate_kernel_vmap_range()
routines have been updated.
Secondly, we added a new purge_kernel_dcache_range_asm() routine to
pacache.S and use it in invalidate_kernel_vmap_range(). Nominally,
purges are faster than flushes as the cache lines don't have to be
written back to memory.
Hopefully, this is sufficient to resolve the remaining problems due to
cache speculation. So far, testing indicates that this is the case. I
did work up a patch using tmpalias flushes, but there is a performance
hit because we need the physical address for each page, and we also need
to sequence access to the tmpalias flush code. This increases the
probability of stalls.
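The resulting ordering in invalidate_kernel_vmap_range() is sketched below
(simplified; the actual routine in arch/parisc/kernel/cache.c has more
detail, and the name here carries a _sketch suffix to make that clear):

    void invalidate_kernel_vmap_range_sketch(void *vaddr, int size)
    {
            unsigned long start = (unsigned long)vaddr;
            unsigned long end = start + size;

            /* Purge: invalidate the lines without writing dirty data
             * back (pdc), which is cheaper than a flush (fdc). */
            purge_kernel_dcache_range_asm(start, end);

            /* The TLB flush comes after the cache operation, because
             * the cache operation relied on the existing TLB entries. */
            flush_tlb_kernel_range(start, end);
    }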
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Helge Deller <deller@gmx.de>
Qemu for PARISC reported strange failures on a 32-bit SMP parisc kernel:
"Not-handled unaligned insn 0x0e8011d6 and 0x0c2011c9."
Those opcodes evaluate to the ldcw() assembly instruction, which (on
32-bit) requires an alignment of 16 bytes to ensure atomicity.
As it turns out, qemu is correct: our assembly code in entry.S and
pacache.S does not pay attention to the required alignment.
This patch fixes the problem by aligning the lock offset in assembly
code in the same manner as we do in our C-code.
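For reference, the C side already encodes the same requirement by keeping
the lock word inside a 16-byte aligned area, roughly like the sketch below
(type and field names are illustrative):

    /* ldcw always operates on the first word; the alignment attribute
     * guarantees the 16-byte alignment that PA 1.x requires, which is
     * what the assembly lock offsets must now also respect. */
    typedef struct {
            volatile unsigned int lock[4] __attribute__((aligned(16)));
    } pa_spinlock_sketch_t;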
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v4.0+
We have four routines in pacache.S that use temporary alias pages:
copy_user_page_asm(), clear_user_page_asm(), flush_dcache_page_asm() and
flush_icache_page_asm(). copy_user_page_asm() and clear_user_page_asm()
don't purge the TLB entry used for the operation.
flush_dcache_page_asm() and flush_icache_page_asm() do purge the entry.
Presumably, this was thought to optimize TLB use. However, the
operation is quite heavyweight on PA 1.X processors, as we need to take
the TLB lock and a TLB broadcast is sent to all processors.
This patch removes the purges from flush_dcache_page_asm() and
flush_icache_page_asm().
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: <stable@vger.kernel.org> # v3.16+
Signed-off-by: Helge Deller <deller@gmx.de>
This is the second issue I noticed in reviewing the parisc TLB code.
The fic instruction may use either the instruction or data TLB in
flushing the instruction cache. Thus, on machines with a split TLB, we
should also flush the data TLB after setting up the temporary alias
registers.
Although this has no functional impact, I changed the pdtlb and pitlb
instructions to consistently use the index register %r0. These
instructions do not support integer displacements.
Tested on rp3440 and c8000.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: <stable@vger.kernel.org> # v3.16+
Signed-off-by: Helge Deller <deller@gmx.de>
The attached patch describes the current implementation of
copy_user_page_asm(). It is possible to implement this routine using
either the kernel page mappings or equivalent aliases. I tested both
and decided the former was more efficient.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
The comment at the start of pacache.S states that the base and index
registers used for fdc, fic, and pdc instructions should not be shadowed
registers. Although this is probably unnecessary for tmpalias flushes,
there is also no reason not to comply.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
This patch partly fixes PAGE_SIZEs of 16K or 64K by adjusting the
assembler PTE lookup code and the assembler TEMPALIAS code. Furthermore,
some data alignments for PAGE_SIZE have been limited to 4K (or less) so
as not to waste too much memory with larger page sizes. As a side note,
the palo loader can (currently) only handle up to 10 ELF segments, which
the tighter alignment addresses as well.
My testing indicated that the ldci command in the sba iommu code
needed adjustment by the PAGE_SHIFT value and that the I/O PDIR page
size was only set to 4K for my machine (C3000).
All this partly fixes the boot, but there are still quite a few caching
problems left. For example, the Symbios Logic driver fails:
sym0: <896> rev 0x7 at pci 0000:00:0f.0 irq 69
sym0: PA-RISC Firmware, ID 7, Fast-40, SE, parity checking
CACHE TEST FAILED: DMA error (dstat=0x81).sym0: CACHE INCORRECTLY CONFIGURED.
and the tulip network driver doesn't seem to work correctly either:
Sending BOOTP requests .net eth0: Setting full-duplex based on MII#1
link partner capability of 05e1
..... timed out!
Besides those kernel fixes, glibc will need fixes too to be able to
handle >4K page sizes.
Signed-off-by: Helge Deller <deller@gmx.de>
Implement clear_page_asm and copy_page_asm. These are optimized routines to
clear and copy a page. I tested prefetch optimizations in clear_page_asm and
copy_page_asm but didn't see any significant performance improvement on rp3440.
I'm not sure if these routines are significantly faster than memset and/or
memcpy, but they are there for further performance evaluation.
TLB purge operations on PA 1.X SMP machines are now serialized with the help of
the new tlb_lock() and tlb_unlock() macros, since on some PA-RISC machines TLB
purges need to be serialized in software. Obviously, the lock isn't needed in UP
kernels. On PA 2.0 machines, there is a local TLB purge instruction which is much
less disruptive to the memory subsystem. No lock is needed for local purges.
Loops are also unrolled in flush_instruction_cache_local and
flush_data_cache_local.
The implementation of what used to be copy_user_page (now copy_user_page_asm)
is now fixed. Additionally, 64-bit support has been added. I left the preceding
comment in the code unchanged, but note that it is now inaccurate.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
As pointed out by several people, PA 1.1 only has a type 26 instruction,
meaning that the space register must be explicitly encoded. Not giving an
explicit space means that the compiler uses the type 24 version, which is
PA 2.0 only, resulting in an illegal instruction crash.
This regression was caused by
commit f311847c2f
Author: James Bottomley <James.Bottomley@HansenPartnership.com>
Date: Wed Dec 22 10:22:11 2010 -0600
parisc: flush pages through tmpalias space
Reported-by: Helge Deller <deller@gmx.de>
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org #2.6.39+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Fix style of flush_user_dcache_range_asm procedure declaration in
arch/parisc/kernel/pacache.S to be consistent with other assembly
procedures.
Signed-off-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The kernel has an 8M tmpalias space (originally designed for copying
and clearing pages but now only used for clearing). The idea is
to place zeros into the cache above a physical page rather than into
the physical page and flush the cache, because often the zeros end up
being replaced quickly anyway.
We can also use the tmpalias space for flushing a page. The difference
here is that we have to do tmpalias processing in the non-access data and
instruction traps. The principle is the same: as long as we know the physical
address and have a virtual address congruent to the real one, the flush will
be effective.
In order to use the tmpalias space, the icache miss path has to be enhanced to
check for the alias region to make the fic instruction effective.
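The address computation behind this can be sketched as follows
(TMPALIAS_MAP_START is the base of the alias area; the colour mask shown is
an illustrative constant, not the exact value used by the assembler code):

    /* Pick a virtual address inside the tmpalias area whose low bits
     * match the user mapping, so it falls into the same cache congruence
     * class.  The non-access fault handlers then wire the known physical
     * page to this address, and fdc/fic through it hit the right lines. */
    static inline unsigned long tmpalias_vaddr_sketch(unsigned long user_vaddr)
    {
            unsigned long colour_mask = 0x003ff000UL;   /* illustrative */

            return TMPALIAS_MAP_START | (user_vaddr & colour_mask);
    }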
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This trivial patch fixes the following section warnings on PARISC:
> WARNING: vmlinux.o (.text.1): unexpected section name.
> The (.[number]+) following section name are ld generated and not expected.
> Did you forget to use "ax"/"aw" in a .S file?
> Note that for example <linux/init.h> contains
> section definitions for use in .S files.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
- this macro unifies the code to add exception table entries
- additionally use ENTRY()/ENDPROC() in more places
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
More work towards supporting multiple page sizes on 64-bit. Convert
some assumptions that 64-bit uses 3-level page tables into testing
PT_NLEVELS. Also some BUG() to BUG_ON() conversions and some cleanups
to assembler.
Signed-off-by: Helge Deller <deller@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
We need to do a little renaming of our original syntax because
of the difference in arguments.
Signed-off-by: James Bottomley <jejb@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Specify sr4 when flushing kernel space (we could equally well use sr5-7,
but must not use sr0).
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Replace the use of "0" with "%r0", since PA 1.1 I/D flush ops only take a
general register and not an immediate value for the index field.
This just forces the code to always be PA 1.1 "clean".
From: Joel Soete <soete.joel@tiscali.be>
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2.6.12-rc2-pa3 fix copy_user_page_asm to NOT access past end of page.
My bad. /o\
Lamont confirmed that instructions following a conditional branch are
*always* executed regardless of whether the branch is taken or not,
unless they are nullified (which was missing in this case).
He also noted:
Conditional branches nullify on forward taken branch, and on
non-taken backward branch. Note that .+4 is a backwards branch.
This makes a lot more sense than the gibberish in the PA20 arch book.
Compiles and boots on both 64-bit (a500) and 32-bit (j6k).
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2.6.12-rc4-pa3 : first pass at making sure use of RFI conforms to
PA 2.0 arch pages F-4 and F-5, and PA 1.1 arch pages 3-19 and 3-20.
The discussion revolves around all the rules for clearing PSW Q-bit.
The hard part is meeting all the rules for "relied upon translation".
The .align directive is used to guarantee that the critical sequence ends
more than 8 instructions (32 bytes) from the end of the page.
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!