Move some foo_kern.c files to foo.c now that the old foo.c files are out
of the way.
Also cleaned up some whitespace and an emacs formatting comment.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
fork on UML has always somewhat subtle. The underlying cause has been the
need to initialize a stack for the new process. The only portable way to
initialize a new stack is to set it as the alternate signal stack and take a
signal. The signal handler does whatever initialization is needed and jumps
back to the original stack, where the fork processing is finished. The basic
context switching mechanism is a jmp_buf for each process. You switch to a
new process by longjmping to its jmp_buf.
Now that UML has its own implementation of setjmp and longjmp, and I can poke
around inside a jmp_buf without fear that libc will change the structure, a
much simpler mechanism is possible. The jmpbuf can simply be initialized by
hand.
This eliminates -
the need to set up and remove the alternate signal stack
sending and handling a signal
the signal blocking needed around the stack switching, since
there is no stack switching
setting up the jmp_buf needed to jump back to the original
stack after the new one is set up
In addition, since jmp_buf is now defined by UML, and not by libc, it can be
embedded in the thread struct. This makes it unnecessary to have it exist on
the stack, where it used to be. It also simplifies interfaces, since the
switch jmp_buf used to be a void * inside the thread struct, and functions
which took it as an argument needed to define a jmp_buf variable and assign it
from the void *.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Mark a symbol and file as being tt-mode only. This shrinks the binary
slightly when tt mode support is compiled out and makes it easier to identity
stuff when tt mode is removed.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
BB noticed that we had the wrong bus error handler.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make __bb_init_func weak in order to avoid a link failure with some libcs
and/or gccs.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The UML/x86_64 headers were missing ptrace support for some segment registers.
The underlying problem was that the x86_64 kernel uses user_regs_struct
rather than the ptrace register definitions in ptrace. This patch switches
UML/x86_64 to using user_regs_struct for its definitions of the host's
registers.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
ZONE_DMA might become dependent on CONFIG_ZONE_DMA, which UML doesn't define
(we're still arguing about this) So, let's change ZONE_DMA to ZONE_NORMAL.
This is prompted by optional-zone_dma-in-the-vm.patch, but should be harmless
on its own.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make lots of structures const in order to make it obvious that they need no
locking.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This spinlock can be taken on interrupt too, so spin_lock_irq[save] must be
used.
However, Documentation/networking/netdevices.txt explains we are called with
rtnl_lock() held - so we don't need to care about other concurrent opens.
Verified also in LDD3 and by direct checking. Also verified that the network
layer (through a state machine) guarantees us that nobody will close the
interface while it's being used. Please correct me if I'm wrong.
Also, we must check we don't sleep with irqs disabled!!! But anyway, this is
not news - we already can't sleep while holding a spinlock. Who says this is
guaranted really by the present code?
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We have never used this flag and recently one user experienced a complaining
warning about this (there was a symbol in the positive half of the address space
IIRC). So fix it.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Arch-independent zone-sizing determines the size of a node
(pgdat->node_spanned_pages) based on the physical memory that was
registered by the architecture. However, when
CONFIG_MEMORY_HOTPLUG_RESERVE is set, the architecture expects that the
spanned_pages will be much larger and that mem_map will be allocated that
is used lated on memory hot-add.
This patch allows an architecture that sets CONFIG_MEMORY_HOTPLUG_RESERVE
to call push_node_boundaries() which will set the node beginning and end to
at *least* the requested boundary.
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Andi Kleen <ak@muc.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Keith Mannthey" <kmannth@gmail.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
absent_pages_in_range() made the assumption that users of the API would not
care about holes beyound the end of physical memory. This was not the
case. This patch will account for ranges outside of physical memory as
holes correctly.
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Andi Kleen <ak@muc.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Keith Mannthey" <kmannth@gmail.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The x86_64 code accounted for memmap and some portions of the the DMA zone as
holes. This was because those areas would never be reclaimed and accounting
for them as memory affects min watermarks. This patch will account for the
memmap as a memory hole. Architectures may optionally use set_dma_reserve()
if they wish to account for a portion of memory in ZONE_DMA as a hole.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Andi Kleen <ak@muc.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Keith Mannthey" <kmannth@gmail.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix build error introduced by 3212fe1594
Non-NUMA case should be handled.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Since do_bad_area() always takes the currently active task and
(supposed to) take the currently active MM, there's no point passing
them to this function. Instead, obtain references to them inside
do_bad_area().
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
mm-armv.c now only contains the pgd allocation/freeing code, so
rename it to have a more sensible filename.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
If we're going to have mmu.c for code which is specific to the MMU
machines, we might as well move the other MMU initialisation
specific code from mm-armv.c into this new file. This also allows
us to make some functions static.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There's no point to rewrite some logic to parse command line
to pass initrd parameters or to declare a user memory area.
We could use instead parse_early_param() that does the same
thing.
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
There's no point to inline any functions in setup.c. Let's GCC
doing its job, it's good enough for that now.
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This function although doing simple thing is hard to follow. It's
mainly due to:
- a lot of #ifdef
- bad local names
- redundant tests
So this patch try to address these issues. It also do not use
max_pfn global which is marked as an unused exported symbol.
As a bonus side, it's now really easy to see what part of the
code is for no-numa system.
There's also no point to make this function inline.
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This array was used to 'cache' some frame info about scheduler
functions to speed up get_wchan(). This array was 1Ko size and
was only used when CONFIG_KALLSYMS was set but declared for all
configs.
Rather than make the array statement conditional, this patches
removes this array and its uses. Indeed the common case doesn't
seem to use this array and get_wchan() is not a critical path
anyways.
It results in a smaller bss and a smaller/cleaner code:
text data bss dec hex filename
2543808 254148 139296 2937252 2cd1a4 vmlinux-new-get-wchan
2544080 254148 143392 2941620 2ce2b4 vmlinux~old
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch adds 2 sanity checks.
The first one test that the start address of the function to analyze has been
set by the caller. If not return an error since nothing usefull can be done
without.
The second one checks that the function's size has been set. A null size can
happen if CONFIG_KALLSYMS is not set and it means that we don't know the size
of the function to analyze. In this case, we make it equal to 128 instructions
by default.
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
While working on a glibc patch to support the fstatat() functions[1],
I noticed that the o32 implementation behaves differently on 32-bit and
64-bit kernels; the former provides a stat64 while the latter provides
a plain (o32) stat. I think the former is what's intended, as there is
no separate fstatat64. It's also what x86 does.
I think this is just a case of a compat too far.
[1] I've seen Khem's patch, but I don't think it's right.
Signed-off-by: Richard Sandiford <richard@codesourcery.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch introduces a number of configuration variables. These allow to
specify presence/absence of integrated peripherals found on the MIPS
RM9xxx processor family, based on the particular processor model used.
Signed-off-by: Thomas Koeller <thomas.koeller@baslerweb.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
excite_fpga.h, like all platform headers, really belongs in the
platform header directory.
Signed-off-by: Thomas Koeller <thomas.koeller@baslerweb.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The excite platform exports hardware resources for device drivers to use.
Any driver wanting to use these resources will look up them by their names.
Since these resources are declared to have static linkage, but are not
used in the source file defining them, the compiler used to emit an
'unused' warning, which this patch suppresses.
Signed-off-by: Thomas Koeller <thomas.koeller@baslerweb.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
With more recent compilers inline doesn't necessarily means a function
will always be inlined. So leave that decission to the compiler and
make the function as __init.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The following change updates the Atlas interrupt handling to match that
of Malta. Tested with a 5Kc and a 34Kf successfully.
Signed-off-by: Maciej W. Rozycki <macro@mips.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
In hooking up the perf counter overflow interrupt to the experimental
deprecated-real-soon-now /proc/perf interface last night, I had to
revisit arch/mips/mips-boards/generic/time.c, and discovered that
when the 2.6.9-based SMTC prototype was merged with the more
recent tree, it was missed that arch/mips/kernel/time.c had changed
so that even in SMP kernels, timer_interrupt() calls
local_timer_interrupt(), so there is no longer a need to invoke it
directly from mips_timer_interrupt() in those cases where
timer_interrupt() has been called. So I got rid of that, and added the
invocation of perf_irq() if Cause.PCI is set, more-or-less following the
same logic as in the non-SMTC case, with the modifications that (a) a
runtime check for Release 2 isn't done, because it's redundant in SMTC),
and (b) we check for a clock interrupt regardless of the value returned
by the perf counter service - I don't understand why we'd want to control
that with perf_irq(), but maybe one of you knows the story. I also got
rid of the stupid warning about the unused variable when compiled for
SMTC (another artifact of the merge). The result hasn't been beaten to
death, but boots, seems stable, and supports extended precision event
counting.
Signed-off-by: Kevin D. Kissell <kevink@mips.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
If a thread became runnable between need_resched() and the WAIT
instruction, switching to the thread will delay until a next interrupt.
Some CPUs can execute the WAIT instruction with interrupt disabled, so
we can get rid of this race on them (at least UP case).
Original Patch by Atsushi with fixing up for MIPS Technology's cores by
Ralf based on feedback from the RTL designers.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
I've encountered a serious problem with PCI config space access on Au1x000
platforms with recent 2.6.x-kernel. With 2.4.31 the same hardware works fine.
So I was looking for the differences:
Symptoms:
- no PCI-device is seen on bootup though two or three cards are present
- lspci output is empty
- OR: lspci shows 20 times the same device
(- OR: in some slot-configurations it worked anyhow)
System(s):
1. platform with Au1500 and three PCI-devices (actually a mycable XXS1500
with backplane for three PCI-devices)
2. platform with Au1550 and two PCI-devices (custom board)
Debugging:
I digged down to the config_access() of the au1xxx-processors in
arch/mips/pci/ops-au1000.c and switched on DEBUG.
The code of config_access() seems to be almost the same as of the
2.4.x-kernel. But the "pci_cfg_vm->addr" returned by get_vm_area(0x2000, 0)
once on booting is different. That's of course not forbidden. But the
alignment seems to be wrong. In my case, I received:
2.4.31: pci_cfg_vm->addr = c0000000
2.6.18-rc5: pci_cfg_vm->addr = c0101000
To make it short: With 2.6.x it fails on the first config-access with:
"PCI ERR detected: status 83a00356".
Fixup:
My fix is now, to use the VM_IOREMAP-flag in the get_vm_area call. This flag
seems to be introduced in mm/vmalloc.c a long time ago (in 2.6.7-bk13, I
found in gitweb).
Now, the returned address is pci_cfg_vm->addr = c0104000 and everything works
fine.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Since lmo commit 323a380bf9e1a1679a774a2b053e3c1f2aa3f179 ("Simplify
dump_stack()") made prepare_frametrace() always inlined, using $2 (v0)
in __asm__ is not safe anymore. We can use $1 (at) instead. Also we
should use "dla" instead of "la" for 64-bit kernel.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
c-r4k.c and c-sb1.c use drop_mmu_context() to flush virtually tagged
I-caches, but this does not work for flushing other task's icache. This
is for example triggered by copy_to_user_page() called from ptrace(2).
Use indexed flush for such cases.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
XTC can only be set if VPA is clear, which it may not be. There is
also the possibility of a back to back c0 register access hazard to
take care of.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
dclz() expects its 64-bit argument being passed as a single register
but on 32-bit kernels it'll actually be in a register pair.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
In entry.S resume_userspace ... jal do_notify_resume form a loop through
which the kernel will iterate as long as work is pending. If we
iterate through this loop more than once with no signal pending for at
least one but the last iteration we will take do the syscall restarting
multiple times resulting in a syscall return prior to the the syscall
instruction in userspace. This may happen when debugging a multithreaded
program.
Debugging and original fix by Maciej; extended to other ABIs by me.
Signed-off-by: Maciej W. Rozycki <macro@mips.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Recent 34Ks come out of reset with WP enabled on VPE 1 so we take an
immediate exception when starting the second VPE.
Signed-off-by: Chris Dearman <chris@mips.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch allows unwind_stack() to return ra for leaf function.
But it tries to detects cases where get_frame_info() wrongly
consider nested function as a leaf one.
It also pass 'unsinged long *sp' instead of 'unsigned long **sp'
as second parameter. The code looks cleaner.
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Now get_frame_info() wants to detect move sp instruction first. It
assumes that the save ra in the stack instruction can't happen
before allocating frame size space into the stack.
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Make dump_stack() code not depend on CONFIG_KALLSYMS.
It also make prepare_frametrace() always inlined to get
less false entries reported by show_raw_backtrace().
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We usually use backtrace term for dumping a call tree during
debug. Therefore this patch renames show_frametrace() into
show_backtrace() and show_trace() into show_raw_backtrace().
It also uses the new function print_ip_sym().
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Instead of dump all possible address in the stack, unwind the stack frame
based on prologue code analysis, as like as get_wchan() does. While the
code analysis might fail for some reason, there is a new kernel option
"raw_show_trace" to disable this feature.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Print call-trace in show_stack() (like on other archs). Also make
show_trace() static and simplify its argument list.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
IRQs disabling in flush_cache_4096 for cache purge. Under certain
workloads we would get an IRQ in the middle of a purge operation,
and the cachelines would remain in an inconsistent state, leading
to occasional stack corruption.
Signed-off-by: Takeo Takahashi <takahashi.takeo@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Set the SHM alignment at runtime, based off of probed cache desc.
Optimize get_unmapped_area() to only colour align shared mappings.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This implements initial support for the vsyscall page on SH.
At the moment we leave it configurable due to having nommu
to support from the same code base. We hook it up for the
signal trampoline return at present, with more to be added
later, once uClibc catches up.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
flush_cache_mm() wraps in to flush_cache_all(), which is rather
excessive given that the number of PTEs within the specified context
are generally quite low. Optimize for walking the mm's VMA list and
selectively flushing the VMA ranges from the dcache. Invalidate the
icache only if a VMA sets VM_EXEC.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Figure out the cache desc entry_mask at runtime, and remove
hard-coded assumption about the cacheline size.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This enables support for 4K stacks on SH.
Currently this depends on DEBUG_KERNEL, but likely all boards
will switch to this as the default in the future.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Some more machvec overhauling and setup code cleanup. Kill off
get_system_type() and platform_setup(), we can do these both
through the machvec. While we're add it, kill off more useless
mach.c's and drop some legacy cruft from setup.c.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
nommu does not require the page table manipulation code in the
bootmem initialisation paths. Move this into separate inline
functions.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The generic hardirq layer already takes care of a lot of the
appropriate locking and disabling for us, no need to duplicate
it in the handlers..
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
None of these have been maintained in years, and no one seems to
be interested in doing so, so just get rid of them.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds support for the aforementioned CPU subtypes, and cleans
up some build issues encountered as a result.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The ARM Versatile board PCI config space read routines are broken for byte
accesses. The access uses a byte read, so masking the bottom two bits of the
address is wrong.
I guess this is a cut/paste error from the the halfword code which uses
aligned word access+shift+mask.
Signed-off-by: Paul Brook <paul@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The ARM XIP_KERNEL map created in devicemaps_init() is wrong.
The map.pfn is rounded down to an even 1MiB section boundary
which results in va/pa translations errors when XIP_PHYS_ADDR
starts on an odd 1MiB boundary and this causes the kernel to
hang. This patch fixes ARM XIP_KERNEL translation errors for
the odd 1MiB XIP_PHYS_ADDR boundary case.
Signed-off-by: George G. Davis <gdavis@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add S3C2412 power management code, and move the
core register saving in from s3c2412.c
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add the AML M5900 series to the list of supported machines in the
arch/arm/mach-s3c2410 directory. This ensures the core peripherals
are registered, and the timer source is configured. if selected in
the kernel config the framebuffer registers and mtd partition
information are set. This version of the patch has corrected
formatting and removed the legacy procfs directory entry.
Signed-off-by: David Anders <danders@amltd.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This adds some simple setup code for most of the CPU subtypes,
primarily simple platform device registration.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This fixes up some of the various outstanding nommu bugs on
SH.
Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
nommu needs to be able to shift PAGE_OFFSET, so we switch it to a
non-user-visible CONFIG_PAGE_OFFSET and use that in the few places
where it matters.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Drop TIF_USERSPACE and add addr_limit to the thread_info struct.
Subsequently, use that for address checking in strnlen_user() to
ward off bogus -EFAULTs.
Make __strnlen_user() return 0 on exception, rather than -EFAULT.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This fixes a long-standing FIXME for G2 DMA, where we finally
wire up the IRQ handler and allow for sampling remaining bytes
while in-flight.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds the VoyagerGX UART to the RTS7751R2D setup
code, and cleans up a few build issues.
Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Inhibit mapping through page tables in __ioremap() for PCI memory
apertures on SH7751 and SH7780-style PCI controllers, translation is
not possible for these areas. For other users that map a small window
in P1/P2 space, ioremap() traps that already, and should never make
it to __ioremap().
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This cleans up quite a lot of the PCI mess that we
currently have, and attempts to consolidate the
duplication in the SH7780 and SH7751 PCI controllers.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Some kgdb cleanup. Move hexchars/highhex/lowhex to the header, so it can
be reused by sh-sci. Also drop silly ctrl_inl/outl() overloading being
done by the kgdb stub.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds some simple PM stubs and the basic APM interfaces,
primarily for use by hp6xx, where the existing userland
expects it.
Signed-off-by: Andriy Skulysh <askulysh@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Rewrite the store queue API for a per-cpu interface in the driver
model. The old miscdevice is dropped, due to TASK_SIZE limitations,
and no one was using it anyways.
Carve up and allocate store queue space with a bitmap, back sq
mapping objects with a slab cache, and let userspace worry about
its own prefetching.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
There was a bug that got introduced when the split ptlock changes
went in where mm could be unintialized for user mappings, this
fixes it up..
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
ioremap() overhaul. Add support for transparent PMB mapping, get rid of
p3_ioremap(), etc. Also drop ioremap() and iounmap() routines from the
machvec, as everyone can use the generic ioremap() API instead. For PCI
memory apertures and other special cases, use the pci_iomap() API, as
boards are already required to get the mapping right there.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
There is an ancient and totally incorrect sanity check being
done on the ramdisk location. The check assumes that the
kernel is always loaded to physical address zero, which is
wrong. It was trying to validate the ramdisk value by saying that
if it fell within the kernel image address range it must be wrong.
Anyways, kill this because it actually creates problems. The
'ramdisk_image' should always be adjusted down by KERNBASE.
SILO can easily put the ramdisk in a location which causes
this test to trigger, breaking things.
[ Based almost entirely upon a patch from Ben Collins. ]
Signed-off-by: David S. Miller <davem@davemloft.net>
Cleanup of page table allocators, using generic folded PMD and PUD
helpers. TLB flushing operations are moved to a more sensible spot.
The page fault handler is also optimized slightly, we no longer waste
cycles on IRQ disabling for flushing of the page from the ITLB, since
we're already under CLI protection by the initial exception handler.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
A synco is needed before we jump to start_kernel().
While we're at it, also move the sh_cpu_init() jump until after
we've zeroed BSS, as this has caused some undesirable results
in sh_cpu_init().
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Currently when making changes to control registers, we
typically need some time for changes to take effect (8
nops, generally). However, for sh4a we simply need to
do an icbi..
This is a simple patch for implementing a general purpose
ctrl_barrier() which functions as a control register write
barrier. There's some additional documentation in the patch
itself, but it's pretty self explanatory.
There were also some places where we were not doing the
barrier, which didn't seem to have any adverse effects on
legacy parts, but certainly did on sh4a. It's safer to have
the barrier in place for legacy parts as well in these cases,
though this does make flush_tlb_all() more expensive (by an
order of 8 nops). We can ifdef around the flush_tlb_all()
case for now if it's clear that all legacy parts won't have
a problem with this.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
There's a bug in the Hitachi SuperH csum_partial_copy_generic()
implementation. If the supplied length is 1 (and several alignment
conditions are met), the function immediately branches to label 4.
However, the assembly at label 4 expects the length to be stored in
register r2. Since this has not occurred, subsequent behavior is
undefined.
This can cause bad payload checksums in TCP connections.
I've fixed the problem by initializing register r2 prior to the branch
instruction.
Signed-off-by: Ollie Wild <aaw@rincewind.tv>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
We had a pretty interesting oops happening, where copy_user_page()
was down()'ing p3map_sem[] with a bogus offset (particularly, an
offset that hadn't been initialized with sema_init(), due to the
mismatch between cpu_data->dcache.n_aliases and what was assumed
based off of the old CACHE_ALIAS value).
Luckily, spinlock debugging caught this for us, and so we drop
the old hardcoded CACHE_ALIAS for sh4 completely and rely on the
run-time probed cpu_data->dcache.alias_mask. This in turn gets
the p3map_sem[] index right, and everything works again.
While we're at it, also convert to 4-level page tables..
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Allow multiple early printk consoles via earlyprintk=.
With this change earlyprintk is no longer enabled by default,
it must be specified on the kernel command line. Optionally
with ,keep to prevent unreg by tty_io.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Minor sign-extension bug in SH-specific memset()..
Signed-off-by: Toshinobu Sugioka <sugioka@itonet.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This reworks some of the SH-4 cache handling code to more easily
accomodate newer-style caches (particularly for the > direct-mapped
case), as well as optimizing some of the old code.
Signed-off-by: Richard Curnow <richard.curnow@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Various cleanups for HS7751RVoIP. Mostly just getting
rid of the old mach.c and splitting codec configuration
in to its own Kconfig.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
With the I/O rework for hd64461 we're down to a single header,
so move it by itself and get rid of the directory.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Some of these have suffered some bitrot, and so there is
some degree of dead code that has been left sitting around,
clean it up..
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
For some of the larger sizes we permitted spanning pages
across several PTEs, but this turned out to not be generally
useful. This reverts the sh hugetlbpage interface to something
more sensible using huge pages at single PTE granularity.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Some minor cleanups for the updated consolidated hp6xx
mach-type.
Signed-off-by: Andriy Skulysh <askulysh@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
We had quite a bit of whitespace damage, clean most of it up..
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Arthur Othieno <a.othieno@bluewin.ch>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Please ignore previous message.
This patch is adding support for CPU connected to CLE266
chipset. For older CPU this is only way. For "Powersaver"
processor this way will be used if ACPI C3 isn't supported.
I have tested it. It seems to work exacly like ACPI.
But it is less safe. On CLE266 chipset port 0x22 is
blocking processor access to PCI bus too.
Signed-off-by: Rafa³ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
Make the sections proper and get rid of section mismatch warnings.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
These were previously sprinkled in machine_power_off(),
though missed being updated when the rest of the boards
switched over.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
We had a special .stack section in the ld script that
was being used to position r15 initially. This is
nonsensical, as we can just use a THREAD_SIZE offset
from the init_thread_union instead (as every other arch
does).
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
in_nmi shifted down a few labels, so we were inadvertently
clearing the lower byte of do_syscall_trace, badness ensues.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Move the syscall table in to its own file, as per sh64. The entry.S
bits will end up being considerably different in the sh2/sh2a cases,
so this lets us keep things in sync somewhat..
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
flush_cache_range() wasn't page aligning the end of the range,
we can't assume that it will always be page aligned, and we
ended up getting unaligned faults in some rare call paths.
Additionally, we add a small optimization to just purge the
dcache entirely if the range is large enough that the page
table walking will take longer. We use an arbitrary value of
64 pages for the large range size, as per sh64.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch prevents pcibios_disable_device() from disabling interrupts
of devices which is not enabled.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: MUNEDA Takahiro <muneda.takahiro@jp.fujitsu.com>
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
0x08 is the HT capability, while PCI_CAP_ID_HT_IRQCONF would be
the subtype 0x80 that mpic_scan_ht_pic() uses.
Rename PCI_CAP_ID_HT_IRQCONF into PCI_CAP_ID_HT.
And by the way, use it in the ipath driver instead of defining its
own HT_CAPABILITY_ID.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Minor reformatting to vmlinux.lds.S to make it 80-column usable,
in accordance with Linux coding style.
Signed-off-by: Al Stone <ahs3@fc.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch reverses the order of fetching log from SAL and
checking poll threshold. This will fix following trivial issues:
- If SAL_GET_SATE_INFO is unbelievably slow (due to huge system
or just its silly implementation) and if it takes more than
1/5 sec, CMCI/CPEI will never switch to CMCP/CPEP.
- Assuming terrible flood of interrupt (continuous corrected
errors let all CPUs enter to handler at once and bind them
in it), CPUs will be serialized by IA64_LOG_LOCK(*).
Now we check the poll threshold after the lock and log fetch,
so we need to call SAL_GET_STATE_INFO (num_online_cpus() + 4)
times in the worst case.
if we can check the threshold before the lock, we can shut up
interrupts quickly without waiting preceding log fetches, and
the number of times will be reduced to (num_online_cpus()) in
the same situation.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
When entering the kernel due to an MCA or INIT, ar.fpsr (ar40)
was not getting set to the kernel default value (remaining
at the user value). The effect depends on the user setting
of ar.fpsr. In the test case, the effect was addresses
printing with strange hex values.
Setting ar.fpsr in ia64_set_kernel_registers sets it for both
the MCA and INIT paths. The user value of ar.fpsr is correctly
saved (in ia64_state_save) and restored (in ia64_state_restore).
Below is an example of output with very strange hex values.
Anyone know the value of hex 'g'? :-)
Processes interrupted by INIT - 0 (cpu 14 task 0xdfffg55g7a4c6gA)
Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
Printing message to console from MCA/INIT handler is useful,
however doing oops_in_progress = 1 in them exactly makes
something in kernel wrong. Especially it sounds ugly if
system goes wrong after returning from recoverable MCA.
This patch adds ia64_mca_printk() function that collects
messages into temporary-not-so-large message buffer during
in MCA/INIT environment and print them out later, after
returning to normal context or when handlers determine to
down the system.
Also this print function is exported for use in extensional
MCA handler. It would be useful to describe detail about
recovery.
NOTE:
I don't think it is sane thing if temporary message buffer
is enlarged enough to hold whole stack dumps from INIT, so
buffering is disabled during stack dump from INIT-monarch
(= default_monarch_init_process). please fix it in future.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cut the number of lines of memory info output per node from five
to one line.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Use the default sysrq printk level for printing show_mem() output both
for disconfig and contig versions. This is consistent with the printk
level used on other architectures (well ia32 at least).
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
MCA dispatch code take physical address of GP passed from SAL, then call
DATA_PA_TO_VA twice on GP before call into C code. The first time is
in ia64_set_kernel_register, the second time is in VIRTUAL_MODE_ENTER.
The gp is changed to a virtual address in region 7 because DATA_PA_TO_VA
is implemented by dep instruction.
However when notify blocks were called from MCA handler code, because
notify blocks are supported by callback function pointers, gp value
value was switched to region 5 again.
The patch set gp register to kernel gp of region 5 at entry of MCA
dispatch.
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This reverts commit 2636255488.
Jakub Jelinek provided the missing futex_atomic_cmpxchg_inatomic()
function, so now it should be safe to re-enable these syscalls.
Signed-off-by: Tony Luck <tony.luck@intel.com>
The GFP_DMA option usually does nothing on SN2 since all of memory is in thei
DMA zone and the BTE has always been capable of addressing all of memory.
So there is no need to get memory from a restricted range of memory (which
is what GFP_DMA is for).
Remove useless __GFP_DMA option.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (225 commits)
[PATCH] Don't set calgary iommu as default y
[PATCH] i386/x86-64: New Intel feature flags
[PATCH] x86: Add a cumulative thermal throttle event counter.
[PATCH] i386: Make the jiffies compares use the 64bit safe macros.
[PATCH] x86: Refactor thermal throttle processing
[PATCH] Add 64bit jiffies compares (for use with get_jiffies_64)
[PATCH] Fix unwinder warning in traps.c
[PATCH] x86: Allow disabling early pci scans with pci=noearly or disallowing conf1
[PATCH] x86: Move direct PCI scanning functions out of line
[PATCH] i386/x86-64: Make all early PCI scans dependent on CONFIG_PCI
[PATCH] Don't leak NT bit into next task
[PATCH] i386/x86-64: Work around gcc bug with noreturn functions in unwinder
[PATCH] Fix some broken white space in ia32_signal.c
[PATCH] Initialize argument registers for 32bit signal handlers.
[PATCH] Remove all traces of signal number conversion
[PATCH] Don't synchronize time reading on single core AMD systems
[PATCH] Remove outdated comment in x86-64 mmconfig code
[PATCH] Use string instructions for Core2 copy/clear
[PATCH] x86: - restore i8259A eoi status on resume
[PATCH] i386: Split multi-line printk in oops output.
...
This patch renders thread_struct->pmcs[] and thread_struct->pmds[]
OBSOLETE. The actual table is moved to pfm_context structure which
saves space in thread_struct (in turn saving space in task_struct
which frees up more space for kernel stacks).
Signed-off-by: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add is_multithreading_enabled() to check whether multi-threading
is enabled independently of which cpu is currently online
Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
If the user-specified kprobe handler causes the page fault when accessing
user space address, fixup this fault since do_page_fault() should not
continue as the kprobe handler are run with preemption disabled.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
On IA64 instruction opcode must be 16 bytes alignment, in kprobe structure
there is one element to save original instruction, currently saved opcode
is not statically allocated in kprobe structure, that can not assure
16 bytes alignment. This patch dynamically allocated kprobe instruction
opcode to assure 16 bytes alignment.
Signed-off-by: bibo mao <bibo.mao@intel.com>
Acked-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Convert cmm's usage of kernel_thread to kthread_run. Also create the
cmmthread at module load time, so it is possible to check if creation of
the thread fails.
In addition the cmmthread now gets terminated when the module gets unloaded
instead of leaving a stale kernel thread. Also check the return values of
other registration functions at module load and handle their return values
appropriately.
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ensure current->signal->tty doesn't get freed during log_exec().
Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The KSTK_* macros used an inordinate amount of stack. In order to overcome
an impedance mismatch between their interface, which just returns a single
register value, and the interface of get_thread_regs, which took a full
pt_regs, the implementation created an on-stack pt_regs, filled it in, and
returned one field. do_task_stat calls KSTK_* twice, resulting in two
local pt_regs, blowing out the stack.
This patch changes the interface (and name) of get_thread_regs to just
return a single register from a jmp_buf.
The include of archsetjmp.h" in registers.h to get the definition of
jmp_buf exposed a bogus include of <setjmp.h> in start_up.c. <setjmp.h>
shouldn't be used anywhere any more since UML uses the klibc
setjmp/longjmp.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clean set_ether_mac usage. Maybe could also be removed, but surely it can't
be a global function taking a void* argument.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Acked-by: Jeff Dike <jdike@addtoit.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
timer_irq_inited was useless, so it is removed.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
set_interval returns an error instead of panicing if setitimer fails. Some of
its callers now check the return.
enable_timer is largely tt-mode-specific, so it is marked as such, and the
only skas-mode caller is made to call set-interval instead.
user_time_init was a no-value-added wrapper around set_interval, so it is
gone.
Since set_interval is now called from kernel code, callers no longer pass
ITIMER_* to it. Instead, they pass a flag which is converted into ITIMER_REAL
or ITIMER_VIRTUAL.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Have most signals go through an arch-provided handler which recovers the
sigcontext and then calls a generic handler. This replaces the
ARCH_GET_SIGCONTEXT macro, which was somewhat fragile. On x86_64, recovering
%rdx (which holds the sigcontext pointer) must be the first thing that
happens. sig_handler duly invokes that first, but there is no guarantee that
I can see that instructions won't be reordered such that %rdx is used before
that. Having the arch provide the handler seems much more robust.
Some signals in some parts of UML require their own handlers - these places
don't call set_handler any more. They call sigaction or signal themselves.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Various cleanups in the sigio code.
- Removed explicit zero-initializations of a few structures.
- Improved some error messages.
- An API change - there was an asymmetry between reactivate_fd calling
maybe_sigio_broken, which goes through all the machinery of figuring out if
a file descriptor supports SIGIO and applying the workaround to it if not,
and deactivate_fd, which just turns off the descriptor.
This is changed so that only activate_fd calls maybe_sigio_broken, when
the descriptor is first seen. reactivate_fd now calls add_sigio_fd, which
is symmetric with ignore_sigio_fd.
This removes a recursion which makes a critical section look more critical
than it really was, obsoleting a big comment to that effect. This requires
keeping track of all descriptors which are getting the SIGIO treatment, not
just the ones being polled at any given moment, so that reactivate_fd,
through add_sigio_fd, doesn't try to tell the SIGIO thread about descriptors
it doesn't care about.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
UML can get a SIGBUS anywhere if the tmpfs mount being used for its memory
runs out of space. This patch adds a printk before the panic to provide a
clue as to what likely went wrong.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There were some bugs in handling failures to exec helper programs. errno was
passed back from the child with the wrong sign. It was also ignored. In the
case where it mattered, the errno from the (successful) read in the parent was
used instead.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
arch/um/kernel/tlb.c had some pretty serious whitespace problems. I also
fixed some returns.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Stack randomization needs to be conditional on the personality allowing it.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There were a bunch of missed ARRAY_SIZE opportunities.
Also, some formatting fixes in the affected areas of code.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds an implementation of setjmp and longjmp to UML, allowing
access to the inside of a jmpbuf without needing the access macros formerly
provided by libc.
The implementation is stolen from klibc. I copy the relevant files into
arch/um. I have another patch which avoids the copying, but requires klibc be
in the tree.
setjmp and longjmp users required some tweaking. Includes of <setjmp.h> were
removed and includes of the UML longjmp.h were added where necessary. There
are also replacements of siglongjmp with UML_LONGJMP which I somehow missed
earlier.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Detect the situations in which the time after a resume from disk would be
earlier than the time before the suspend and prevent them from happening on
i386.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: John Stultz <johnstul@us.ibm.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The name of the pagedir_nosave variable does not make sense any more, so it
seems reasonable to change it to something more meaningful.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On x86_64 machines with more than 2 GB of RAM there are large memory gaps
(with no corresponding kernel virtual addresses) and reserved memory
regions between areas of usable physical RAM. Moreover, if CONFIG_FLATMEM
is set, they appear within the normal zone. swsusp should not try to save
them, so the corresponding page structs have to be marked as 'nosave'.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There appears to be a typo in the EV56 config option. NORITAKE and PRIMO are
be able to set a variation of either.
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The functions prepare_set and post_set in kernel/cpu/mtrr/generic.c wrap
the spinlock set_atomicity_lock: prepare_set returns with the lock held,
and post_set releases the lock without acquiring it. Add lock annotations
to these two functions so that sparse can check callers for lock pairing,
and so that sparse will not complain about these functions since they
intentionally use locks in this manner.
Signed-off-by: Josh Triplett <josh@freedesktop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove all references to xtime in i386 and replace them w/
get/set_timeofday(). Requires some ugly and uncertain changes to APM, but
has been lightly tested to work.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Voyager fiddles with current->signal.tty without locking. It turns out
that the code in question has already cleared current->signal.tty correctly
because daemonize() does the right thing already.
The signal handling also appears to be incorrect as it does an unprotected
sigfillset that also appears unneccessary. As I don't have a bowtie and am
therefore not a qualified voyager maintainer I leave that to James.
Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If we're going to implement smp_call_function_single() on three architecture
with the same prototype then it should have a declaration in a
non-arch-specific header file.
Move it into <linux/smp.h>.
Cc: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Continiung the series of small patches necessary for the perfmon subsystem,
here is a patch that adds support for the smp_call_function_single()
function for i386. It exists for almost all other architectures but i386.
The perfmon subsystem needs it in one case to free some state on a
designated remote CPU.
Signed-off-by: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The current VMSPLIT Kconfig option is disabled whenever highmem is on.
This is a bit screwy because the people who need to change VMSPLIT the most
tend to be the ones with highmem and constrained lowmem.
So, remove the highmem dependency. But, re-include the dependency for the
"full 1GB of lowmem" option. You can't have the full 1GB of lowmem and
highmem because of the need for the vmalloc(), kmap(), etc... areas.
I thought there would be at least a bit of tweaking to do to
get it to work, but everything seems OK.
Boot tested on a 4GB x86 machine, and a 12GB 3-node NUMA-Q:
elm3b82:~# cat /proc/meminfo
MemTotal: 3695412 kB
MemFree: 3659540 kB
...
LowTotal: 2909008 kB
LowFree: 2892324 kB
...
elm3b82:~# zgrep PAE /proc/config.gz
CONFIG_X86_PAE=y
larry:~# cat /proc/meminfo
MemTotal: 11845900 kB
MemFree: 11786748 kB
...
LowTotal: 2855180 kB
LowFree: 2830092 kB
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch will pack any .note.* section into a PT_NOTE segment in the output
file.
To do this, we tell ld that we need a PT_NOTE segment. This requires us to
start explicitly mapping sections to segments, so we also need to explicitly
create PT_LOAD segments for text and data, and map the sections to them
appropriately. Fortunately, each section will default to its previous
section's segment, so it doesn't take many changes to vmlinux.lds.S.
This only changes i386 for now, but I presume the corresponding changes for
other architectures will be as simple.
This change also adds <linux/elfnote.h>, which defines C and Assembler macros
for actually creating ELF notes.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a boot parameter to reserve high linear address space for hypervisors.
This is necessary to allow dynamically loaded hypervisor modules, which might
not happen until userspace is already running, and also provides a useful tool
to benchmark the performance impact of reduced lowmem address space.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make __FIXADDR_TOP a variable, so that it can be set to not get in the way of
address space a hypervisor may want to reserve.
Original patch by Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
arch/i386/kernel/reboot.c defines its own struct to describe an ldt entry: it
should use struct Xgt_desc_struct (currently load_ldt is a macro, so doesn't
complain: paravirt patches make it warn).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clean up module initalization for apm.c. I had started by auditing for
proper return code checks in misc_register, but I found that in the event
of an initalization failure, a proc file and a kernel thread were left
hanging out. this patch properly cleans up those loose ends on any
initalization failure.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
show_registers() tries to dump failing code starting 43 bytes before the
offending instruction, but this address can be bad, for example in a device
driver where the failing instruction is less than 43 bytes from the start
of the driver's code. When that happens, try to dump code starting at the
failing instruction instead of printing no code at all.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Keith Owens <kaos@ocs.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
To prevent the emulated RTC timer from stopping when interrupts are delayed
for too long, disable interrupts around all of the register initialization,
and check that the interrupt handler did not schedule the next interrupt in
the past.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Vojtech Pavlik <vojtech@suse.cz>
Cc: Robert Picco <Robert.Picco@hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
FRegister a platform device for the AT49BV6416 NOR flash chip on the ATSTK1000
development board for use by the physmap MTD driver.
The SMC timings are set up before the platform device is registered so that no
board-specific mapping driver is necessary.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patchset adds the necessary drivers and infrastructure to access the
external flash on the ATSTK1000 board through the MTD subsystem. With this
stuff in place, it will be possible to use a jffs2 filesystem stored in the
external flash as a root filesystem. It might also be possible to update the
boot loader if you drop the write protection of partition 0.
As suggested by David Woodhouse, I reworked the patches to use the physmap
driver instead of introducing a separate mapping driver for the ATSTK1000.
I've also cleaned up the hsmc header by removing useless comments and
converting spaces to tabs (my headerfile generator needs some work.)
Unfortunately, I couldn't unlock the flash in fixup_use_atmel_lock because the
erase regions hadn't been set up yet, so I had to do it from cfi_amdstd_setup
instead.
This patch:
This adds a simple API for configuring the static memory controller along with
an implementation for the Atmel HSMC.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds support for the Atmel AVR32 architecture as well as the AT32AP7000
CPU and the AT32STK1000 development board.
AVR32 is a new high-performance 32-bit RISC microprocessor core, designed for
cost-sensitive embedded applications, with particular emphasis on low power
consumption and high code density. The AVR32 architecture is not binary
compatible with earlier 8-bit AVR architectures.
The AVR32 architecture, including the instruction set, is described by the
AVR32 Architecture Manual, available from
http://www.atmel.com/dyn/resources/prod_documents/doc32000.pdf
The Atmel AT32AP7000 is the first CPU implementing the AVR32 architecture. It
features a 7-stage pipeline, 16KB instruction and data caches and a full
Memory Management Unit. It also comes with a large set of integrated
peripherals, many of which are shared with the AT91 ARM-based controllers from
Atmel.
Full data sheet is available from
http://www.atmel.com/dyn/resources/prod_documents/doc32003.pdf
while the CPU core implementation including caches and MMU is documented by
the AVR32 AP Technical Reference, available from
http://www.atmel.com/dyn/resources/prod_documents/doc32001.pdf
Information about the AT32STK1000 development board can be found at
http://www.atmel.com/dyn/products/tools_card.asp?tool_id=3918
including a BSP CD image with an earlier version of this patch, development
tools (binaries and source/patches) and a root filesystem image suitable for
booting from SD card.
Alternatively, there's a preliminary "getting started" guide available at
http://avr32linux.org/twiki/bin/view/Main/GettingStarted which provides links
to the sources and patches you will need in order to set up a cross-compiling
environment for avr32-linux.
This patch, as well as the other patches included with the BSP and the
toolchain patches, is actively supported by Atmel Corporation.
[dmccr@us.ibm.com: Fix more pxx_page macro locations]
[bunk@stusta.de: fix `make defconfig']
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Dave McCracken <dmccr@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The third argument of au1xxx_dbdma_chan_alloc's callback function is not
used anywhere.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Permit __do_IRQ() to be dispensed with based on a configuration option.
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Improve FRV's use of generic IRQ handling:
(*) Use generic_handle_irq() rather than __do_IRQ() as the latter is obsolete.
(*) Don't implement enable() and disable() ops as these will fall back to
using unmask() and mask().
(*) Provide mask_ack() functions to avoid a call each to mask() and ack().
(*) Make the cascade handlers always return IRQ_HANDLED.
(*) Implement the mask() and unmask() functions in the same order as they're
listed in the ops table.
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make the FRV arch use the generic IRQ code rather than having its own
routines for doing so.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There are many places where we need to determine the node of a zone.
Currently we use a difficult to read sequence of pointer dereferencing.
Put that into an inline function and use throughout VM. Maybe we can find
a way to optimize the lookup in the future.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove the atomic counter for slab_reclaim_pages and replace the counter
and NR_SLAB with two ZVC counter that account for unreclaimable and
reclaimable slab pages: NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE.
Change the check in vmscan.c to refer to to NR_SLAB_RECLAIMABLE. The
intend seems to be to check for slab pages that could be freed.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
One of the changes necessary for shared page tables is to standardize the
pxx_page macros. pte_page and pmd_page have always returned the struct
page associated with their entry, while pte_page_kernel and pmd_page_kernel
have returned the kernel virtual address. pud_page and pgd_page, on the
other hand, return the kernel virtual address.
Shared page tables needs pud_page and pgd_page to return the actual page
structures. There are very few actual users of these functions, so it is
simple to standardize their usage.
Since this is basic cleanup, I am submitting these changes as a standalone
patch. Per Hugh Dickins' comments about it, I am also changing the
pxx_page_kernel macros to pxx_page_vaddr to clarify their meaning.
Signed-off-by: Dave McCracken <dmccr@us.ibm.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In many places we will need to use the same combination of flags. Specify
a single GFP_THISNODE definition for ease of use in gfp.h.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The uncached allocator manages per node pools. Specify __GFP_THISNODE in
order to force allocation on the indicated node or fail. The uncached
allocator has already logic to deal with failing allocations.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a notifer chain to the out of memory killer. If one of the registered
callbacks could release some memory, do not kill the process but return and
retry the allocation that forced the oom killer to run.
The purpose of the notifier is to add a safety net in the presence of
memory ballooners. If the resource manager inflated the balloon to a size
where memory allocations can not be satisfied anymore, it is better to
deflate the balloon a bit instead of killing processes.
The implementation for the s390 ballooner is included.
[akpm@osdl.org: cleanups]
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We cannot check MAX_NR_ZONES since it not defined in the preprocessor
anymore.
So remove the check.
The maximum number of zones per node for i386 is 3 since i386 does not
support ZONE_DMA32.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
eventcounters: Do not display counters for zones that are not available on an
arch
Do not define or display counters for the DMA32 and the HIGHMEM zone if such
zones were not configured.
[akpm@osdl.org: s390 fix]
[heiko.carstens@de.ibm.com: s390 fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make ZONE_HIGHMEM optional
- ifdef out code and definitions related to CONFIG_HIGHMEM
- __GFP_HIGHMEM falls back to normal allocations if there is no
ZONE_HIGHMEM
- GFP_ZONEMASK becomes 0x01 if there is no DMA32 and no HIGHMEM
zone.
[jdike@addtoit.com: build fix]
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make ZONE_DMA32 optional
- Add #ifdefs around ZONE_DMA32 specific code and definitions.
- Add CONFIG_ZONE_DMA32 config option and use that for x86_64
that alone needs this zone.
- Remove the use of CONFIG_DMA_IS_DMA32 and CONFIG_DMA_IS_NORMAL
for ia64 and fix up the way per node ZVCs are calculated.
- Fall back to prior GFP_ZONEMASK of 0x03 if there is no
DMA32 zone.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Move totalhigh_pages and nr_free_highpages() into highmem.c/.h
Move the totalhigh_pages definition into highmem.c/.h. Move the
nr_free_highpages function into highmem.c
[yoichi_yuasa@tripeaks.co.jp: build fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix array initialization in lots of arches
The number of zones may now be reduced from 4 to 2 for many arches. Fix the
array initialization for the zones array for all architectures so that it is
not initializing a fixed number of elements.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I keep seeing zones on various platforms that are never used and wonder why we
compile support for them into the kernel. Counters show up for HIGHMEM and
DMA32 that are alway zero.
This patch allows the removal of ZONE_DMA32 for non x86_64 architectures and
it will get rid of ZONE_HIGHMEM for arches not using highmem (like 64 bit
architectures). If an arch does not define CONFIG_HIGHMEM then ZONE_HIGHMEM
will not be defined. Similarly if an arch does not define CONFIG_ZONE_DMA32
then ZONE_DMA32 will not be defined.
No current architecture uses all the 4 zones (DMA,DMA32,NORMAL,HIGH) that we
have now. The patchset will reduce the number of zones for all platforms.
On many platforms that do not have DMA32 or HIGHMEM this will reduce the
number of zones by 50%. F.e. ia64 only uses DMA and NORMAL.
Large amounts of memory can be saved for larger systemss that may have a few
hundred NUMA nodes.
With ZONE_DMA32 and ZONE_HIGHMEM support optional MAX_NR_ZONES will be 2 for
many non i386 platforms and even for i386 without CONFIG_HIGHMEM set.
Tested on ia64, x86_64 and on i386 with and without highmem.
The patchset consists of 11 patches that are following this message.
One could go even further than this patchset and also make ZONE_DMA optional
because some platforms do not need a separate DMA zone and can do DMA to all
of memory. This could reduce MAX_NR_ZONES to 1. Such a patchset will
hopefully follow soon.
This patch:
Fix strange uses of MAX_NR_ZONES
Sometimes we use MAX_NR_ZONES - x to refer to a zone. Make that explicit.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Address a long standing issue of booting with an initrd on an i386 numa
system. Currently (and always) the numa kva area is mapped into low memory
by finding the end of low memory and moving that mark down (thus creating
space for the kva). The issue with this is that Grub loads initrds into
this similar space so when the kernel check the initrd it finds it outside
max_low_pfn and disables it (it thinks the initrd is not mapped into usable
memory) thus initrd enabled kernels can't boot i386 numa :(
My solution to the problem just converts the numa kva area to use the
bootmem allocator to save it's area (instead of moving the end of low
memory). Using bootmem allows the kva area to be mapped into more diverse
addresses (not just the end of low memory) and enables the kva area to be
mapped below the initrd if present.
I have tested this patch on numaq(no initrd) and summit(initrd) i386 numa
based systems.
[akpm@osdl.org: cleanups]
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>