a recent fix:
commit ce28b9864b
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Wed Feb 20 23:57:30 2008 +0100
x86: fix vsyscall wreckage
removed the broken /kernel/vsyscall64 handler completely.
This triggers the following debug check:
sysctl table check failed: /kernel/vsyscall64 No proc_handler
Restore the sane part of the proc handler.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix a kernel bug (vmware boot problem) reported by Tomasz Grobelny,
which occurs with certain .config variants and gccs.
The x86 TLS cleanup in commit efd1ca52d0
made the sys_set_thread_area and sys_get_thread_area functions ripe for
tail call optimization. If the compiler chooses to use it for them, it
can clobber the user trap frame because these are asmlinkage functions.
Reported-by: Tomasz Grobelny <tomasz@grobelny.oswiecenia.net>
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
> Diffing dmesg between git7 and git8 doesn't sched any light since
> git8 also removed the printouts of the x86 caps as they were being
> initialised and updated. I'm currently adding those printouts back
> in the hope of seeing where and when the caps get broken.
That turned out to be very illuminating:
--- dmesg-2.6.24-git7 2008-02-24 18:01:25.295851000 +0100
+++ dmesg-2.6.24-git8 2008-02-24 18:01:25.530358000 +0100
...
CPU: After generic identify, caps: 00000003 00000000 00000000 00000000 00000000 00000000 00000000 00000000
CPU: After all inits, caps: 00000003 00000000 00000000 00000000 00000000 00000000 00000000 00000000
+CPU: After applying cleared_cpu_caps, caps: 00000013 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Notice how the TSC cap bit goes from Off to On.
(The first two lines are printout loops from -git7 forward-ported
to -git8, the third line is the same printout loop added just after
the xor-with-cleared_cpu_caps[] loop.)
Here's how the breakage occurs:
1. arch/x86/kernel/tsc_32.c:tsc_init() sees !cpu_has_tsc,
so bails and calls setup_clear_cpu_cap(X86_FEATURE_TSC).
2. include/asm-x86/cpufeature.h:setup_clear_cpu_cap(bit) clears
the bit in boot_cpu_data and sets it in cleared_cpu_caps
3. arch/x86/kernel/cpu/common.c:identify_cpu() XORs all caps
in with cleared_cpu_caps
HOWEVER, at this point c->x86_capability correctly has TSC
Off, cleared_cpu_caps has TSC On, so the XOR incorrectly
sets TSC to On in c->x86_capability, with disastrous results.
The real bug is that clearing bits with XOR only works if the
bits are known to be 1 prior to the XOR, and that's not true here.
A simple fix is to convert the XOR to AND-NOT instead. The following
patch does that, and allows my 486 to boot 2.6.25-rc kernels again.
[ mingo@elte.hu: fixed a similar bug in setup_64.c as well. ]
The breakage was introduced via commit 7d851c8d3d.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently, c_idle is declared in the stack, and thus, have no static address.
Peter Zijlstra points out this simple solution, in which c_idle.work
is initializated separatedly. Note that the INIT_WORK macro has a static
declaration of a key inside.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Acked-by: Peter Zijlstra <pzijlstr@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently, there is no way for print_stack_trace() to determine whether
a given stack trace entry was deemed reliable or not, simply because
save_stack_trace() does not record this information. (Perhaps needless
to say, this makes the saved stack traces A LOT harder to read, and
probably with no other benefits, since debugging features that use
save_stack_trace() most likely also require frame pointers, etc.)
This patch reverts to the old behaviour of only recording the reliable trace
entries for saved stack traces.
Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There doesn't seem to be any reason for swapper_pg_pmd being global.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Inside a KVM virtual machine the MTRRs are usually blank. This confuses Linux
and causes a warning message at boot. This patch removes that warning message
when running Linux as a KVM guest.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
pointed out by pageexec@freemail.hu:
> what happens here is that gcc treats the argument area as owned by the
> callee, not the caller and is allowed to do certain tricks. for ssp it
> will make a copy of the struct passed by value into the local variable
> area and pass *its* address down, and it won't copy it back into the
> original instance stored in the argument area.
>
> so once sys_execve returns, the pt_regs passed by value hasn't at all
> changed and its default content will cause a nice double fault (FWIW,
> this part took me the longest to debug, being down with cold didn't
> help it either ;).
To fix this we pass in pt_regs by pointer.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
based on a report from Arne Georg Gleditsch about user-space apps
misbehaving after toggling /proc/sys/kernel/vsyscall64, a review
of the code revealed that the "NOP patching" done there is
fundamentally unsafe for a number of reasons:
1) the patching code runs without synchronizing other CPUs
2) it inserts NOPs even if there is no clock source which provides vread
3) when the clock source changes to one without vread we run in
exactly the same problem as in #2
4) if nobody toggles the proc entry from 1 to 0 and to 1 again, then
the syscall is not patched out
as a result it is possible to break user-space via this patching.
The only safe thing for now is to remove the patching.
This code was broken since v2.6.21.
Reported-by: Arne Georg Gleditsch <arne.gleditsch@dolphinics.no>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The KERNEL_TEXT_SIZE constant was mis-named, as we not only map the kernel
text but data, bss and init sections as well.
That name led me on the wrong path with the KERNEL_TEXT_SIZE regression,
because i knew how big of _text_ my images have and i knew about the 40 MB
"text" limit so i wrongly thought to be on the safe side of the 40 MB limit
with my 29 MB of text, while the total image size was slightly above 40 MB.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
recently the 64-bit allyesconfig bzImage kernel started spontaneously
rebooting during early bootup.
after a few fun hours spent with early init debugging, it turns out
that we've got this rather annoying limit on the size of the kernel
image:
#define KERNEL_TEXT_SIZE (40*1024*1024)
which limit my vmlinux just happened to pass:
text data bss dec hex filename
29703744 4222751 8646224 42572719 2899baf vmlinux
40 MB is 42572719 bytes, so my vmlinux was just 1.5% above this limit :-/
So it happily crashed right in head_64.S, which - as we all know - is
the most debuggable code in the whole architecture ;-)
So increase the limit to allow an up to 128MB kernel image to be mapped.
(should anyone be that crazy or lazy)
We have a full 4K of pagetable (level2_kernel_pgt) allocated for these
mappings already, so there's no RAM overhead and the limit was rather
pointless and arbitrary.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
notsc is ignored in 32-bit kernels if CONFIG_X86_TSC is on.. which is
bad, fix it.
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix mtrr kernel-doc warning:
Warning(linux-2.6.24-git12//arch/x86/kernel/cpu/mtrr/main.c:677): No description found for parameter 'end_pfn'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We have been promoting Transmeta TM3x00/TM5x00 chips to i686-class
based on the notion that they contain all the user-space visible
features of an i686-class chip. However, this is not actually true:
they lack the EA-taking long NOPs (0F 1F /0). Since this is a
userspace-visible incompatibility, downgrade these CPUs to the
manufacturer-defined i586 level.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Simple typo fix for regression introduced by the user_regset changes.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove redundant irq_desc[NR_IRQS] element initialization in
init_ISA_irqs(). irq_desc[NR_IRQS] is already statically
initialized with the same values in kernel/irq/handle.c .
besides the clean-up value this also saves some space:
text data bss dec hex filename
1389 356 14 1759 6df i8259_32.o.before
1325 356 14 1695 69f i8259_32.o.after
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Though we use PDA for regular task stack but that
is not acceptable for init_task wich is special
one. We still have to allocate init_task's stack
in that manner.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It's much better to use PAGE_SIZE then magic 4096
(though it's almost synonym in most cases on x86 but
not for *all* cases ;)
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
initial_code are initially used to hold a function pointer
from __init and later from __cpuinit. This confuses modpost
and changing initial_code to REFDATA silence the warning.
(But now we do not discard the variable anymore).
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
arch_register_cpu() is only defined for HOTPLUG_CPU code
so simple fix is to ignore references by annotating the
function __ref.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
srat_detect_node() is only used by __cpuinit init_intel().
So the trivial fix is to annotate srat_detect_node() with __cpuinit.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
nearby_node() were only used by __cpuinit amd_detect_cmp()
So annotating nearby_node() __cpuinit was the trivial fix.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
arch/x86/kernel/nmi_64.c:50: warning: 'unknown_nmi_panic_callback' declared 'static' but never defined
This patch also fixes nmi_32.c
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Yes, it should.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
arch/x86/kernel/efi_32.c:42:6: warning: symbol 'efi_call_phys_prelog' was not declared. Should it be static?
arch/x86/kernel/efi_32.c:84:6: warning: symbol 'efi_call_phys_epilog' was not declared. Should it be static?
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
arch/x86/kernel/kprobes.c:584:16: warning: symbol 'kretprobe_trampoline_holder' was not declared. Should it be static?
arch/x86/kernel/kprobes.c:676:6: warning: symbol 'trampoline_handler' was not declared. Should it be static?
Make them static and add the __used attribute, approach taken from the
arm kprobes implementation.
kretprobe_trampoline_holder uses inline assemly to define the global
symbol kretprobe_trampoline, but nothing ever calls the holder explicitly.
trampoline handler is only called from inline assembly in the same file,
mark it used and static.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch removes the mca-pentium boot option that was a noop.
besides the source code cleanup factor, this saves some text as well:
arch/x86/kernel/cpu/bugs.o:
text data bss dec hex filename
651 77 4 732 2dc bugs.o.before
631 53 4 688 2b0 bugs.o.after
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
drivers/lguest/x86/switcher_32.S:(.text+0x3815f8):
undefined reference to `LGUEST_PAGES_regs_trapnum'
This problem was caused by asm-offsets.c only having the offsets when
lguest *guest* support was set, not lguest host (host support used to
imply guest support, so now they're separate these bugs come out).
Lguest guest support and host support are separate config options:
they used to be tied together. Sort out which parts of asm-offsets are
needed for Guest and Host.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The early boot code maps KERNEL_TEXT_SIZE (currently 40MB) starting
from __START_KERNEL_map. The kernel itself only needs _text to _end
mapped in the high alias. On relocatible kernels the ASM setup code
adjusts the compile time created high mappings to the relocation. This
creates invalid pmd entries for negative offsets:
0xffffffff80000000 -> pmd entry: ffffffffff2001e3
It points outside of the physical address space and is marked present.
This starts at the virtual address __START_KERNEL_map and goes up to
the point where the first valid physical address (0x0) is mapped.
Zap the mappings before _text and after _end right away in early
boot. This removes also the invalid entries.
Furthermore it simplifies the range check for high aliases.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPI: DMI: quirk for FSC ESPRIMO Mobile V5505
ACPI: DMI blacklist updates
pnpacpi: __initdata is not an identifier
ACPI: static acpi_chain_head
ACPI: static acpi_find_dsdt_initrd()
ACPI: static acpi_no_initrd_override_setup()
thinkpad_acpi: static
ACPI suspend: Execute _WAK with the right argument
cpuidle: Add Documentation
ACPI, cpuidle: Clarify C-state description in sysfs
ACPI: fix suspend regression due to idle update
extern should not appear in C files. Also, the definitions
do not match the prototype currently, not sure what way you
want to go with this, I've switched the prototype to return
int, but I can see going to the void return as well.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When the GART table is unmapped from the kernel direct mappings
during early bootup, make sure we have no leftover cachelines in it.
Note: the clflush done by set_memory_np() was not enough, because
clflush does not work on unmapped pages.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The EFI-runtime mapping code changed a larger memory area than it
should have, due to a pages/bytes parameter mixup.
noticed by Andi Kleen.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jiri Kosina reported the following deadlock scenario with
show_unhandled_signals enabled:
[ 68.379022] gnome-settings-[2941] trap int3 ip:3d2c840f34
sp:7fff36f5d100 error:0<3>BUG: sleeping function called from invalid
context at kernel/rwsem.c:21
[ 68.379039] in_atomic():1, irqs_disabled():0
[ 68.379044] no locks held by gnome-settings-/2941.
[ 68.379050] Pid: 2941, comm: gnome-settings- Not tainted 2.6.25-rc1 #30
[ 68.379054]
[ 68.379056] Call Trace:
[ 68.379061] <#DB> [<ffffffff81064883>] ? __debug_show_held_locks+0x13/0x30
[ 68.379109] [<ffffffff81036765>] __might_sleep+0xe5/0x110
[ 68.379123] [<ffffffff812f2240>] down_read+0x20/0x70
[ 68.379137] [<ffffffff8109cdca>] print_vma_addr+0x3a/0x110
[ 68.379152] [<ffffffff8100f435>] do_trap+0xf5/0x170
[ 68.379168] [<ffffffff8100f52b>] do_int3+0x7b/0xe0
[ 68.379180] [<ffffffff812f4a6f>] int3+0x9f/0xd0
[ 68.379203] <<EOE>>
[ 68.379229] in libglib-2.0.so.0.1505.0[3d2c800000+dc000]
and tracked it down to:
commit 03252919b7
Author: Andi Kleen <ak@suse.de>
Date: Wed Jan 30 13:33:18 2008 +0100
x86: print which shared library/executable faulted in segfault etc. messages
the problem is that we call down_read() from an atomic context.
Solve this by returning from print_vma_addr() if the preempt count is
elevated. Update preempt_conditional_sti / preempt_conditional_cli to
unconditionally lift the preempt count even on !CONFIG_PREEMPT.
Reported-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a new sysfs entry under cpuidle states. desc - can be used by driver to
communicate to userspace any specific information about the state.
This helps in identifying the exact hardware C-states behind the ACPI C-state
definition.
Idea is to export this through powertop, which will help to map the C-state
reported by powertop to actual hardware C-state.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
arch/x86/kernel/i8253.c:98:27: warning: symbol 'pit_clockevent' was not declared. Should it be static?
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch enhances EFI runtime code memory mapping as following:
- Move __supported_pte_mask & _PAGE_NX checking before invoking
runtime_code_page_mkexec(). This makes it possible for compiler to
eliminate runtime_code_page_mkexec() on machine without NX support.
- Use set_memory_x/nx in early_mapping_set_exec(). This eliminates the
duplicated implementation.
This patch has been tested on Intel x86_64 platform with EFI64/32
firmware.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Andi Kleen pointed out that the cache attribute logic is reverse in
efi_enter_virtual_mode(). This problem alone is harmless as we do not
(yet) do cache attribute conflict resolution. (This bug was not present
in the original EFI submission - I introduced it while fixing up rejects.)
While reviewing this code I noticed a second, worse problem: the use of
uninitialized md->virt_addr.
Fix both problems.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When reboot_32.c and reboot_64.c were unified (commit 4d022e35fd...),
the machine_ops code was broken, leading to xen pvops kernels failing
to properly halt/poweroff/reboot etc. This fixes that up.
Signed-off-by: Jody Belka <knew-linux@pimb.org>
Cc: Miguel Boton <mboton@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since we may not have a pci_dev for the device we need to access, we can't
use pci_read_config_word. But raw_pci_read is an internal implementation
detail; it's better to use the architected pci_bus_read_config_word
interface. Using PCI_DEVFN instead of a mysterious constant helps
reassure everyone that we really do intend to access device 8.
[ Thanks to Grant Grundler for pointing out to me that this is exactly
what the write immediately above this is doing -- enabling device 8 to
respond to config space cycles.
- Matthew
Grant also says:
"Can you also add a comment which points at the Intel
documentation?
The 'Intel E7320 Memory Controller Hub (MCH) Datasheet' at
http://download.intel.com/design/chipsets/datashts/30300702.pdf
Page 69 documents register F4h (DEVPRES1).
And I just doubled checked that the 0xf4 register value is
restored later in the quirk (obvious when you look at the code
but not from the patch"
so here it is.
- Linus ]
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We want to allow different implementations of pci_raw_ops for standard
and extended config space on x86. Rather than clutter generic code with
knowledge of this, we make pci_raw_ops private to x86 and use it to
implement the new raw interface -- raw_pci_read() and raw_pci_write().
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move arch/x86/kernel/suspend_64.c to arch/x86/power .
Move arch/x86/kernel/suspend_asm_64.S to arch/x86/power
as hibernate_asm_64.S .
Update purpose and copyright information in
arch/x86/power/suspend_64.c and
arch/x86/power/hibernate_asm_64.S .
Update the Makefiles in arch/x86, arch/x86/kernel and
arch/x86/power to reflect the above changes.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Specifically the boot time page tables in a CONFIG_X86_PAE=y enabled
kernel are in PAE format.
early_ioremap is updated to use the standard page table accessors.
Clear any mappings beyond max_low_pfn from the boot page tables in
native_pagetable_setup_start because the initial mappings can extend
beyond the range of physical memory and into the vmalloc area.
Derived from patches by Eric Biederman and H. Peter Anvin.
[ jeremy@goop.org: PAE swapper_pg_dir needs to be page-sized fix ]
Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Mika Penttilä <mika.penttila@kolumbus.fi>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use a common irq_return entry point for all the iret places, which
need the paravirt INTERRUPT return wrapper.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Each AMD Geode MFGPT timer interrupt output is paired with another
timer; esentially the interrupt goes if either timer fires. This
is okay, but the handlers need to be aware of this. Make sure in
the timer tick handler that our timer really did expire.
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We *really* don't want to be reading MFGPTx_SETUP and writing back those
values. What we want to be doing is clearing CMP1 and CMP2 unconditionally;
otherwise, we have races where CMP1 and/or CMP2 fire after we've read
MFGPTx_SETUP. They can also fire between when we've written ~CNTEN to
the register, and when the new register values get copied to the timer's
version of the register. By clearing both fields, we're okay.
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
There isn't much value to always detecting the MFGPT timers on
Geode platforms; detection is only needed when something wants
to use the timers. Move the detection code so that it gets
called the first time a timer is allocated.
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We need to be called from elsewhere, and this gets some #ifdefs out
of the .c file.
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Drop F_AVAIL and the 'flags' field, replacing with an 'avail' bit. This
looks more understandable to me.
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We had planned to use the 'owner' field for allowing re-allocation of
MFGPTs; however, doing it by module owner name isn't flexible enough. So,
drop this for now. If it turns out that we need timers in modules, we'll
need to come up with a scheme that matches the write-once fields of the
MFGPTx_SETUP register, and drops ponies from the sky.
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The GEODE MFGPT code assumed that 32kHz was 32000 Hz while the boards
run on a 32.768 kHz digital watch crystal. In practise, it will not
change the timer's frequency as the skew was only 2.4%, but it
should provide more accurate intervals.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
- uninline timer functions; the compiler knows better than we do
whether or not to inline these.
- mfgpt_start_timer() had an unused 'clock' argument, drop it.
From both Jordan and myself.
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Suppress A.OUT library support if CONFIG_ARCH_SUPPORTS_AOUT is not set.
Not all architectures support the A.OUT binfmt, so the ELF binfmt should not
be permitted to go looking for A.OUT libraries to load in such a case. Not
only that, but under such conditions A.OUT core dumps are not produced either.
To make this work, this patch also does the following:
(1) Makes the existence of the contents of linux/a.out.h contingent on
CONFIG_ARCH_SUPPORTS_AOUT.
(2) Renames dump_thread() to aout_dump_thread() as it's only called by A.OUT
core dumping code.
(3) Moves aout_dump_thread() into asm/a.out-core.h and makes it inline. This
is then included only where needed. This means that this bit of arch
code will be stored in the appropriate A.OUT binfmt module rather than
the core kernel.
(4) Drops A.OUT support for Blackfin (according to Mike Frysinger it's not
needed) and FRV.
This patch depends on the previous patch to move STACK_TOP[_MAX] out of
asm/a.out.h and into asm/processor.h as they're required whether or not A.OUT
format is available.
[jdike@addtoit.com: uml: re-remove accidentally restored code]
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix typo in comments.
BTW: I have to fix coding style in arch/ia64/kernel/time.c also, otherwise
checkpatch.pl will be complaining.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] Add missing printk levels to e_powersaver
[CPUFREQ] Fix sparse warning in powernow-k8
[CPUFREQ] Support Model D parts and newer in e_powersaver
[CPUFREQ] Powernow-k8: Update to support the latest Turion processors
[CPUFREQ] fix configuration help message
[CPUFREQ] powernow-k8 print pstate instead of fid/did for family 10h
[CPUFREQ] Eliminate cpufreq_userspace scaling_setspeed deadlock
[CPUFREQ] gx-suspmod.c: use boot_cpu_data instead of current_cpu_data
[CPUFREQ] fix incorrect comment on show_available_freqs() in freq_table.c
[CPUFREQ] drivers/cpufreq: Add missing "space"
[CPUFREQ] arch/x86: Add missing "space"
[CPUFREQ] Remove pointless Kconfig dependancy
This patch fixes the configuration dependencies in the vmcoreinfo data.
i386's "node_data" is defined in arch/x86/mm/discontig_32.c,
and x86_64's one is defined in arch/x86/mm/numa_64.c.
They depend on CONFIG_NUMA:
arch/x86/mm/Makefile_32:7
obj-$(CONFIG_NUMA) += discontig_32.o
arch/x86/mm/Makefile_64:7
obj-$(CONFIG_NUMA) += numa_64.o
ia64's "pgdat_list" is defined in arch/ia64/mm/discontig.c,
and it depends on CONFIG_DISCONTIGMEM and CONFIG_SPARSEMEM:
arch/ia64/mm/Makefile:9-10
obj-$(CONFIG_DISCONTIGMEM) += discontig.o
obj-$(CONFIG_SPARSEMEM) += discontig.o
ia64's "node_memblk" is defined in arch/ia64/mm/numa.c,
and it depends on CONFIG_NUMA:
arch/ia64/mm/Makefile:8
obj-$(CONFIG_NUMA) += numa.o
Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Acked-by: Simon Horman <horms@verge.net.au>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use the BOOTMEM_EXCLUSIVE, introduced in the previous patch, to avoid
conflicts while reserving the memory for the kdump capture kernel
(crashkernel=).
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: <linux-arch@vger.kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patchset adds a flags variable to reserve_bootmem() and uses the
BOOTMEM_EXCLUSIVE flag in crashkernel reservation code to detect collisions
between crashkernel area and already used memory.
This patch:
Change the reserve_bootmem() function to accept a new flag BOOTMEM_EXCLUSIVE.
If that flag is set, the function returns with -EBUSY if the memory already
has been reserved in the past. This is to avoid conflicts.
Because that code runs before SMP initialisation, there's no race condition
inside reserve_bootmem_core().
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix powerpc build]
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: <linux-arch@vger.kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
arch/x86/kernel/cpu/cpufreq/powernow-k8.c:1238:9: warning: symbol '__ptr' shadows an earlier one
arch/x86/kernel/cpu/cpufreq/powernow-k8.c:1238:9: originally declared here
Signed-off-by: Dave Jones <davej@redhat.com>
Patch by VIA that updates e_powersaver.c to work with our model D parts
and newer.
From: Jesse Ahrens <jahrens@centtech.com>
Signed-off-by: Dave Jones <davej@redhat.com>
The latest series of Turion X2 processors have a new XFAM
model. Add support for them to powernow-k8.h.
Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
In preemptible kernel will report BUG: using smp_processor_id() in
preemptible, so use boot_cpu_data instead of current_cpu_data.
discussion in :
http://lkml.org/lkml/2007/7/25/32
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
CC: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Jones <davej@redhat.com>
Pavel Emelyanov reported that his networking card did not work
and bisected it down to:
"
The commit
093af8d7f0
x86_32: trim memory by updating e820
broke my e1000 card: on loading driver says that
e1000: probe of 0000:04:03.0 failed with error -5
and the interface doesn't appear.
"
on a 32-bit kernel, base will overflow when try to do PAGE_SHIFT,
and highest_addr will always less 4G.
So use pfn instead of address to avoid the overflow when more than
4g RAM is installed on a 32-bit kernel.
Many thanks to Pavel Emelyanov for reporting and testing it.
Bisected-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Tested-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The .rodata section shouldn't just be read-only,
but also non-executable. This is free since we've broken
up the 2MB page already anyway.
also update test_nx to check for this.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This change broke recovery of exceptions in iret:
commit 72fe485854
Author: Glauber de Oliveira Costa <gcosta@redhat.com>
x86: replace privileged instructions with paravirt macros
The ENTRY(native_iret) macro adds alignment padding before the iretq
instruction, so "iret_label" no longer points exactly at the instruction.
It was sloppy to leave the old "iret_label" label behind when replacing
its nearby use. Removing it would have revealed the other use of the
label later in the file, and upon noticing that use, anyone exercising
the minimum of attention to detail expected of anyone touching this
subtle code would realize it needed to change as well.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
arch/x86/kernel/cpu/cpufreq/powernow-k8.c:830:7: warning: symbol 'hi' shadows an earlier one
arch/x86/kernel/cpu/cpufreq/powernow-k8.c:824:6: originally declared here
arch/x86/kernel/cpu/cpufreq/powernow-k8.c:830:15: warning: symbol 'lo' shadows an earlier one
arch/x86/kernel/cpu/cpufreq/powernow-k8.c:824:14: originally declared here
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This was being used to ensure the proper alignment of the FXSAVE/FXRSTOR data.
This would create a sparse error in the _correct_ cases, hiding further
warnings. Use BUILD_BUG_ON instead.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In my revamp of the x86 ptrace code for setting register values,
I accidentally omitted a check that was there in the old code.
Allowing %cs to be 0 causes a bad crash in recovery from iret failure.
This patch fixes that regression against 2.6.24, and adds a comment
that should help prevent this subtlety from being overlooked again.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Unify the x86-64 behavior for 32-bit processes that set
bogus %cs/%ss values (the only ones that can fault in iret)
match what the native i386 behavior is. (do not kill the task
via do_exit but generate a SIGSEGV signal)
[ tglx@linutronix.de: build fix ]
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
calibrate_delay() must be __cpuinit, not __{dev,}init.
I've verified that this is correct for all users.
While doing the latter, I also did the following cleanups:
- remove pointless additional prototypes in C files
- ensure all users #include <linux/delay.h>
This fixes the following section mismatches with CONFIG_HOTPLUG=n,
CONFIG_HOTPLUG_CPU=y:
WARNING: vmlinux.o(.text+0x1128d): Section mismatch: reference to .init.text.1:calibrate_delay (between 'check_cx686_slop' and 'set_cx86_reorder')
WARNING: vmlinux.o(.text+0x25102): Section mismatch: reference to .init.text.1:calibrate_delay (between 'smp_callin' and 'cpu_coregroup_map')
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Christian Zankel <chris@zankel.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>