Patch f598282f51 exposed a problem in
powerpc MSI-X functionality, making network interfaces such as ixgbe
and cxgb3 stop to work when MSI-X is enabled. RX interrupts were not
being generated.
The problem was caused because MSI irq was not being effectively
unmasked after device initialization.
Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Doing so causes xtime to be negative which crashes the timekeeping
code in funny ways when doing suspend/resume
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
I inadvertently left that debug code enabled, causing the number of
contexts to be clamped to 31 which is going to slow things down on
4xx and just plain breaks 8xx
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Based on an original patch by Valentine Barshak <vbarshak@ru.mvista.com>
Use preempt_schedule_irq to prevent infinite irq-entry and
eventual stack overflow problems with fast-paced IRQ sources.
This kind of problems has been observed on the PASemi Electra IDE
controller. We have to make sure we are soft-disabled before calling
preempt_schedule_irq and hard disable interrupts after that
to avoid unrecoverable exceptions.
This patch also moves the "clrrdi r9,r1,THREAD_SHIFT" out of
the #ifdef CONFIG_PPC_BOOK3E scope, since r9 is clobbered
and has to be restored in both cases.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fix the following 3 issues:
arch/powerpc/kernel/process.c: In function 'arch_randomize_brk':
arch/powerpc/kernel/process.c:1183: error: 'mmu_highuser_ssize' undeclared (first use in this function)
arch/powerpc/kernel/process.c:1183: error: (Each undeclared identifier is reported only once
arch/powerpc/kernel/process.c:1183: error: for each function it appears in.)
arch/powerpc/kernel/process.c:1183: error: 'MMU_SEGSIZE_1T' undeclared (first use in this function)
In file included from arch/powerpc/kernel/setup_64.c:60:
arch/powerpc/include/asm/mmu-hash64.h:132: error: redefinition of 'struct mmu_psize_def'
arch/powerpc/include/asm/mmu-hash64.h:159: error: expected identifier or '(' before numeric constant
arch/powerpc/include/asm/mmu-hash64.h:396: error: conflicting types for 'mm_context_t'
arch/powerpc/include/asm/mmu-book3e.h:184: error: previous declaration of 'mm_context_t' was here
cc1: warnings being treated as errors
arch/powerpc/kernel/pci_64.c: In function 'pcibios_unmap_io_space':
arch/powerpc/kernel/pci_64.c💯 error: unused variable 'res'
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This defconfig's purpose at this time is to help catch compile errors
between Book-3S and Book-3E support in ppc64. It is based on the
ppc64_defconfig with some things disabled that we dont support on
Book-3E right now (hugetlbfs, slices, etc.)
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Prior to the arch/ppc -> arch/powerpc transition, xmon had support for single
stepping on 4xx boards. The functionality was lost when arch/ppc was removed.
This patch restores single step support for 44x boards, and Book-E in general.
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The ABI specifies a 64K alignment, we need to map the vDSO accordingly
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Due to missing segment assignments the .text section was put in the NOTES
segment (and marked as NOTE section), and the .got was put in the DYNAMIC
segment.
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The creation of the flattened device tree depended on the compiler
putting the constant strings for an object in a section with a
particular name. This was changed with recent compilers. Do this
explicitly instead.
Without this patch, iseries kernels may silently not boot when built with
some compilers.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The architecture defines that if MSR PR is set we are in problem state
irrespective of the HV bit. This fixes perf events to reflect this.
Also, on bare metal systems, samples taken in Linux will now be reported
as kernel rather than hypervisor.
Signed-off-by: Michael Neuling <mikey@neuling.org>
CC: paulus@samba.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The 'fsl5200-clocking'-property was dropped since
0d1cde2358. Remove all occurences
in dts-files.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
- serial Console on PSC1
- 64MB SDRAM
- MTD CFI Flash
- Ethernet FEC
- IDE support
Signed-off-by: Heiko Schocher <hs@denx.de>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
- serial Console on PSC1
- 64MB SDRAM
- MTD CFI Flash
- Ethernet FEC
- I2C with PCF8563 and Temp. Sensor ADM9240
- IDE support
Signed-off-by: Heiko Schocher <hs@denx.de>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
making a powerpc target with PCI support, shows the
following warning:
MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x10430): Section mismatch in reference from the
function pcibios_allocate_bus_resources() to the function .init.text:reparent_resources()
The function pcibios_allocate_bus_resources() references
the function __init reparent_resources().
This is often because pcibios_allocate_bus_resources lacks a __init
annotation or the annotation of reparent_resources is wrong.
This patch fix this warning by removing the __init
annotation before reparent_resources.
Signed-off-by: Heiko Schocher <hs@denx.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Here's a patch that adds the ppc750 CL cpu as supported by oprofile.
Signed-off-by: Dragos Tatulea <dtatulea@ixiacom.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We need to align before the output section. Having the align inside
the output section causes the linker to put some filler in there,
which makes it a non-empty section, but this section isn't assigned to
a segment so you get a warning from the linker.
Signed-off-by: Sean MacLennan <smaclennan@pikatech.com>
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
'acc' isn't used anywhere and thus triggers gcc warning, which causes
build error with CONFIG_PPC_DISABLE_WERROR=n (default):
cc1: warnings being treated as errors
arch/powerpc/kernel/kgdb.c: In function 'gdb_regs_to_pt_regs':
arch/powerpc/kernel/kgdb.c:289: warning: unused variable 'acc'
make[1]: *** [arch/powerpc/kernel/kgdb.o] Error 1
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Profiling of a page fault scalability microbenchmark shows flush_hash_range
is not calling the batch hpte invalidate hcall (H_BULK_REMOVE).
It turns out we have a duplicate firmware feature for hcall-bulk and the
current setup code stops after finding the first match. This meant we never
batch and always do individual invalidates.
The patch below removes the duplicate and shifts FW_FEATURE_CMO to close
the gap. With the patch applied the single threaded page fault rate improves
from 217169 to 238755 per second on a POWER5 test box, a 10% improvement.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On pSeries, we always force the IO space to be mapped using 4K
pages even with a 64K base page size to cope with some limitations
in the HV interface to some devices.
However, the SLB miss handler code to discriminate between vmalloc
and ioremap space uses a CPU feature section such that the code
is nop'ed out when the processor support large pages non-cachable
mappings.
Thus, we end up always using the ioremap page size for vmalloc
segments on such processors, causing a discrepency between the
segment and the hash table, and thus a hang continously hashing
the page.
It works for the first segment of the vmalloc space since that
segment is "bolted" in by C code correctly, and thankfully we
almost never use the vmalloc space beyond the first segment,
but the new percpu code made the bug happen.
This fixes it by removing the feature section from the assembly,
we now always do the comparison between vmalloc and ioremap.
Signed-off-by; Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
cppcheck found a memory leak in axon_msi, if dcr_base or dcr_len are zero,
we have already allocated msic, so we should free it in the error path.
Signed-off-by: Eric Sesterhenn <eric.sesterhenn@lsexperts.de>
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Since the change of how interrupts are disabled during suspend,
certain PowerBook models started exhibiting various issues during
suspend or resume from sleep.
I finally tracked it down to the code that runs various "platform"
functions (kind of little scripts extracted from the device-tree),
which uses our i2c and PMU drivers expecting interrutps to work,
and at a time where with the new scheme, they have been disabled.
This causes timeouts internally which for some reason results in
the PMU being unable to see the trackpad, among other issues, really
it depends on the machine. Most of the time, we fail to properly adjust
some clocks for suspend/resume so the results are not always
predictable.
This patch fixes it by using IRQF_TIMER for both the PMU and the I2C
interrupts. I prefer doing it this way than moving the call sites since
I really want those platform functions to still be called after all
drivers (and before sysdevs).
We also do a slight cleanup to via-pmu.c driver to make sure the
ADB autopoll mask is handled correctly when doing bus resets
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The mod_return_to_handler needs to switch to the kernel TOC before
jumping to a the kernel code. It currently does this by looking
at the kernel function data and retrieves the TOC that way.
Not only is this inefficient, it also breaks with a relocatable kernel.
The PACA contains the kernel TOC and we can easily retrieve it that
way.
Reported-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
When the function graph tracer is enabled, it replaces the return address
with a hook back to the tracer. This makes back traces see the hook instead
of the actual return address.
The current code also shows the real address by checking if the return
address jumps to the return_to_handler. If it is, is also prints out
the saved real return address.
On powerpc64, some modules may return to mod_return_to_handler, which
is not checked. This patch will also show the real address if a return
is to mod_return_to_handler as well.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* mark struct vm_area_struct::vm_ops as const
* mark vm_ops in AGP code
But leave TTM code alone, something is fishy there with global vm_ops
being used.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
Fix build of cpm_uart due to core changes
powerpc/8xx: Fix regression introduced by cache coherency rewrite
powerpc/4xx: Fix erroneous xmon warning on PowerPC 4xx
powerpc/mm: Fix 40x and 8xx vs. _PAGE_SPECIAL
powerpc: Cleanup linker script using new linker script macros.
powerpc: Fix ibm,client-architecture-support printout
powerpc: Increase NODES_SHIFT on 64bit from 4 to 8
powerpc/perf_counter: Fix vdso detection
powerpc: Move 64bit heap above 1TB on machines with 1TB segments
powerpc: Change archdata dma_data to a union
powerpc: Rename get_dma_direct_offset get_dma_offset
powerpc/mm: Remove duplicated #include
powerpc/book3e-64: Remove duplicated #include
powerpc: Check for unsupported relocs when using CONFIG_RELOCATABLE
powerpc/pmc: Don't access lppaca on Book3E
powerpc: kmalloc failure ignored in vio_build_iommu_table()
hvc_console: Provide (un)locked version for hvc_resize()
* 'for-linus' of git://neil.brown.name/md: (97 commits)
md: raid-1/10: fix RW bits manipulation
md: remove unnecessary memset from multipath.
md: report device as congested when suspended
md: Improve name of threads created by md_register_thread
md: remove sparse warnings about lock context.
md: remove sparse waring "symbol xxx shadows an earlier one"
async_tx/raid6: add missing dma_unmap calls to the async fail case
ioat3: fix uninitialized var warnings
drivers/dma/ioat/dma_v2.c: fix warnings
raid6test: fix stack overflow
ioat2: clarify ring size limits
md/raid6: cleanup ops_run_compute6_2
md/raid6: eliminate BUG_ON with side effect
dca: module load should not be an error message
ioat: driver version 4.0
dca: registering requesters in multiple dca domains
async_tx: remove HIGHMEM64G restriction
dmaengine: sh: Add Support SuperH DMA Engine driver
dmaengine: Move all map_sg/unmap_sg for slave channel to its client
fsldma: Add DMA_SLAVE support
...
After upgrading to the latest kernel on my mpc875 userspace started
running incredibly slow (hours to get to a shell, even!).
I tracked it down to commit 8d30c14cab,
that patch removed a work-around for the 8xx. Adding it
back makes my problem go away.
Signed-off-by: Rex Feany <rfeany@mrv.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The xmon code relies on MSR_RI being non-zero to indicate that an exception
is recoverable. If it is not, it prints a warning message. However, the
PowerPC 4xx cores do not have an MSR_RI bit and this warning is produced for
every xmon event.
This introduces an unrecoverable_excp function to determine if an exception
is recoverable or not. This gets rid of the erroneous warnings on 4xx.
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The test to check whether we have _PAGE_SPECIAL defined is broken,
since we always define it, just not always to a meaninful value :-)
That broke 8xx and 40x under some circumstances.
This fixes it by adding _PAGE_SPECIAL for both of these since they
had a free PTE bit, and removing the condition around advertising
it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@ozlabs.org
Cc: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On machines without the ibm,client-architecture-support call we were missing a
newline. We may as well print the full name in all its glory too - its
ibm,client-architecture-support, not ibm,client-architecture as I mistakenly
wrote (a name only an IBM architect could love).
For my penance I will write out ibm,client-architecture-support 100 times.
Before:
Calling ibm,client-architecture...command line: root=/dev/sda6 console=hvc0 quiet
After:
Calling ibm,client-architecture-support... not implemented
command line: root=/dev/sda6 console=hvc0
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Some System p configurations can already have more than 16 nodes so we
need to increase NODES_SHIFT. I chose 256 to give us some room to grow in the
future, although we can look at something smaller if the memory bloat is
considered too much.
Unless we clamp MAX_ACTIVE_REGIONS we end up with 300kB of extra bloat in
early_node_map in mm/page_alloc.c:
< 6144 early_node_map
> 307200 early_node_map
due to:
#if MAX_NUMNODES >= 32
/* If there can be many nodes, allow up to 50 holes per node */
#define MAX_ACTIVE_REGIONS (MAX_NUMNODES*50)
#else
/* By default, allow up to 256 distinct regions */
#define MAX_ACTIVE_REGIONS 256
Since our memory is mostly contiguous it seems reasonable to keep this
at 256 for now. I also set 32bit to 32 to save space (is there any chance
a 32bit system will have more than 32 discontiguous memory ranges?).
Even with that fixed we have a few data structures that grow:
< 896 bootmem_node_data
> 14336 bootmem_node_data
< 1280 node_devices
> 20480 node_devices
< 25088 kmalloc_caches
> 59648 kmalloc_caches
< 1632 hstates
> 21792 hstates
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
perf_counter uses arch_vma_name() to detect a vdso region which in turn uses
current->mm->context.vdso_base. We need to initialise this before doing
the mmap or else we fail to detect the vdso.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
If we are using 1TB segments and we are allowed to randomise the heap, we can
put it above 1TB so it is backed by a 1TB segment. Otherwise the heap will be
in the bottom 1TB which always uses 256MB segments and this may result in a
performance penalty.
This functionality is disabled when heap randomisation is turned off:
echo 1 > /proc/sys/kernel/randomize_va_space
which may be useful when trying to allocate the maximum amount of 16M or 16G
pages.
On a microbenchmark that repeatedly touches 32GB of memory with a stride of
256MB + 4kB (designed to stress 256MB segments while still mapping nicely into
the L1 cache), we see the improvement:
Force malloc to use heap all the time:
# export MALLOC_MMAP_MAX_=0 MALLOC_TRIM_THRESHOLD_=-1
Disable heap randomization:
# echo 1 > /proc/sys/kernel/randomize_va_space
# time ./test
12.51s
Enable heap randomization:
# echo 2 > /proc/sys/kernel/randomize_va_space
# time ./test
1.70s
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Sometimes this is used to hold a simple offset, and sometimes
it is used to hold a pointer. This patch changes it to a union containing
void * and dma_addr_t. get/set accessors are also provided, because it was
getting a bit ugly to get to the actual data.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The former is no longer really accurate with the swiotlb case now
a possibility. I also move it into dma-mapping.h - it no longer
needs to be in dma.c, and there are about to be some more accessors
that should all end up in the same place. A comment is added to
indicate that this function is not used in configs where there is no
simple dma offset, such as the iommu case.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When using CONFIG_RELOCATABLE, we build the kernel as a position
independent executable. The kernel then uses a little bit of relocation
code to relocate itself. That code only deals with R_PPC64_RELATIVE
relocations though. If for some reason you use assembly constructs
such as LOAD_REG_IMMEDIATE() to load the address of a symbol, you'll
generate different kinds of relocations that won't be processed properly
and bad things will happen. (We have 2 such bugs today).
The perl script tries to filter out "known" bad ones. It's possible
that we are missing some in the case of a weak function that nobody
implements, we'll see if we get false positive and fix it.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (39 commits)
cpumask: Move deprecated functions to end of header.
cpumask: remove unused deprecated functions, avoid accusations of insanity
cpumask: use new-style cpumask ops in mm/quicklist.
cpumask: use mm_cpumask() wrapper: x86
cpumask: use mm_cpumask() wrapper: um
cpumask: use mm_cpumask() wrapper: mips
cpumask: use mm_cpumask() wrapper: mn10300
cpumask: use mm_cpumask() wrapper: m32r
cpumask: use mm_cpumask() wrapper: arm
cpumask: Use accessors for cpu_*_mask: um
cpumask: Use accessors for cpu_*_mask: powerpc
cpumask: Use accessors for cpu_*_mask: mips
cpumask: Use accessors for cpu_*_mask: m32r
cpumask: remove arch_send_call_function_ipi
cpumask: arch_send_call_function_ipi_mask: s390
cpumask: arch_send_call_function_ipi_mask: powerpc
cpumask: arch_send_call_function_ipi_mask: mips
cpumask: arch_send_call_function_ipi_mask: m32r
cpumask: arch_send_call_function_ipi_mask: alpha
cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: ia64
...
* remove asm/atomic.h inclusion from linux/utsname.h --
not needed after kref conversion
* remove linux/utsname.h inclusion from files which do not need it
NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however
due to some personality stuff it _is_ needed -- cowardly leave ELF-related
headers and files alone.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>