Impact: do not expose a control that has no effect
Fix to prevent sched_mc_power_saving from being exported through sysfs
on single-socket systems. (Say multicore single socket (Laptop))
CPU core map of the boot cpu should be equal to possible number
of cpus for single socket system.
This fix has been developed at FOSS.in kernel workout.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
attr_smt_snooze_delay is only defined for CONFIG_PPC64, so protect the
attribute removal with the same condition. This fixes this build error
on 32-bit SMP configurations:
/data/home/miltonm/next.git/arch/powerpc/kernel/sysfs.c: In function ‘unregister_cpu_online’:
/data/home/miltonm/next.git/arch/powerpc/kernel/sysfs.c:722: error: ‘attr_smt_snooze_delay’ undeclared (first use in this function)
/data/home/miltonm/next.git/arch/powerpc/kernel/sysfs.c:722: error: (Each undeclared identifier is reported only once
/data/home/miltonm/next.git/arch/powerpc/kernel/sysfs.c:722: error: for each function it appears in.)
Signed-off-by: Paul Mackerras <paulus@samba.org>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc: Fix system calls on Cell entered with XER.SO=1
powerpc/cell: Fix GDB watchpoints, again
powerpc/mpic: Don't reset affinity for secondary MPIC on boot
powerpc/cell/axon-msi: Retry on missing interrupt
powerpc: Fix boot freeze on machine with empty memory node
powerpc: Fix IRQ assignment for some PCIe devices
powerpc/spufs: Fix spinning in spufs_ps_fault on signal
powerpc/mpc832x_rdb: fix swapped ethernet ids
powerpc: Use generic PHY driver for Marvell 88E1111 PHY on GE Fanuc SBC610
powerpc/85xx: L2 cache size wrong in 8572DS dts
powerpc/virtex: Update defconfigs
powerpc/52xx: update defconfigs
xsysace: Fix driver to use resource_size_t instead of unsigned long
powerpc/virtex: fix various format/casting printk mismatches
powerpc/mpc5200: fix bestcomm Kconfig dependencies
powerpc/44x: Fix 460EX/460GT machine check handling
powerpc/40x: Limit allocable DRAM during early mapping
* master.kernel.org:/home/rmk/linux-2.6-arm:
Allow architectures to override copy_user_highpage()
[ARM] pxa/palmtx: misc fixes to use generic GPIO API
ARM: OMAP: Fixes for suspend / resume GPIO wake-up handling
[ARM] pxa/corgi: update default config to exclude tosa from being built
[ARM] pxa/pcm990: use negative number for an invalid GPIO in camera data
ARM: OMAP: Typo fix for clock_allow_idle
ARM: OMAP: Remove broken LCD driver for SX1
[ARM] 5335/1: pxa25x_udc: Fix is_vbus_present to return 1 or 0
[ARM] pxa/MioA701: bluetooth resume fix
[ARM] pxa/MioA701: fix memory corruption.
It turns out that on Cell, on a kernel with CONFIG_VIRT_CPU_ACCOUNTING
= y, if a program sets the SO (summary overflow) bit in the XER and
then does a system call, the SO bit in CR0 will be set on return
regardless of whether the system call detected an error. Since CR0.SO
is used as the error indication from the system call, this means that
all system calls appear to fail.
The reason is that the workaround for the timebase bug on Cell uses a
compare instruction. With CONFIG_VIRT_CPU_ACCOUNTING = y, the
ACCOUNT_CPU_USER_ENTRY macro reads the timebase, so we end up doing a
compare instruction, which copies XER.SO to CR0.SO. Since we were
doing this in the system call entry patch after clearing CR0.SO but
before saving the CR, this meant that the saved CR image had CR0.SO
set if XER.SO was set on entry.
This fixes it by moving the clearing of CR0.SO to after the
ACCOUNT_CPU_USER_ENTRY call in the system call entry path.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
An earlier patch from Jens Osterkamp attempted to fix GDB
watchpoints by enabling the DABRX register at boot time.
Unfortunately, this did not work on SMP setups, where
secondary CPUs were still using the power-on DABRX value.
This introduces the same change for secondary CPUs on cell
as well.
Reported-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Tested-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Kexec/kdump currently fails on the IBM QS2x blades when the kexec happens
on a CPU other than the initial boot CPU. It turns out that this is the
result of mpic_init trying to set affinity of each interrupt vector to the
current boot CPU.
As far as I can tell, the same problem is likely to exist on any
secondary MPIC, because they have to deliver interrupts to the first
output all the time. There are two potential solutions for this: either
not set up affinity at all for secondary MPICs, or assume that a single
CPU output is connected to the upstream interrupt controller and hardcode
affinity to that per architecture.
This patch implements the second approach, defaulting to the first output.
Currently, all known secondary MPICs are routed to their upstream port
using the first destination, so we hardcode that.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The MSI capture logic for the axon bridge can sometimes
lose interrupts in case of high DMA and interrupt load,
when it signals an MSI interrupt to the MPIC interrupt
controller while we are already handling another MSI.
Each MSI vector gets written into a FIFO buffer in main
memory using DMA, and that DMA access is normally flushed
by the actual interrupt packet on the IOIF. An MMIO
register in the MSIC holds the position of the last
entry in the FIFO buffer that was written. However,
reading that position does not flush the DMA, so that
we can observe stale data in the buffer.
In a stress test, we have observed the DMA to arrive
up to 14 microseconds after reading the register.
This patch works around this problem by retrying the
access to the FIFO buffer.
We can reliably detect the conditioning by writing
an invalid MSI vector into the FIFO buffer after
reading from it, assuming that all MSIs we get
are valid. After detecting an invalid MSI vector,
we udelay(1) in the interrupt cascade for up to
100 times before giving up.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
I got a bug report about a distro kernel not booting on a particular
machine. It would freeze during boot:
> ...
> Could not find start_pfn for node 1
> [boot]0015 Setup Done
> Built 2 zonelists in Node order, mobility grouping on. Total pages: 123783
> Policy zone: DMA
> Kernel command line:
> [boot]0020 XICS Init
> [boot]0021 XICS Done
> PID hash table entries: 4096 (order: 12, 32768 bytes)
> clocksource: timebase mult[7d0000] shift[22] registered
> Console: colour dummy device 80x25
> console handover: boot [udbg0] -> real [hvc0]
> Dentry cache hash table entries: 1048576 (order: 7, 8388608 bytes)
> Inode-cache hash table entries: 524288 (order: 6, 4194304 bytes)
> freeing bootmem node 0
I've reproduced this on 2.6.27.7. It is caused by commit
8f64e1f2d1 ("powerpc: Reserve in bootmem
lmb reserved regions that cross NUMA nodes").
The problem is that Jon took a loop which was (in pseudocode):
for_each_node(nid)
NODE_DATA(nid) = careful_alloc(nid);
setup_bootmem(nid);
reserve_node_bootmem(nid);
and broke it up into:
for_each_node(nid)
NODE_DATA(nid) = careful_alloc(nid);
setup_bootmem(nid);
for_each_node(nid)
reserve_node_bootmem(nid);
The issue comes in when the 'careful_alloc()' is called on a node with
no memory. It falls back to using bootmem from a previously-initialized
node. But, bootmem has not yet been reserved when Jon's patch is
applied. It gives back bogus memory (0xc000000000000000) and pukes
later in boot.
The following patch collapses the loop back together. It also breaks
the mark_reserved_regions_for_nid() code out into a function and adds
some comments. I think a huge part of introducing this bug is because
for loop was too long and hard to read.
The actual bug fix here is the:
+ if (end_pfn <= node->node_start_pfn ||
+ start_pfn >= node_end_pfn)
+ continue;
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently, some PCIe devices on POWER6 machines do not get interrupts
assigned correctly. The problem is that OF doesn't create an
"interrupt" property for them. The fix is for of_irq_map_pci to fall
back to using the value in the PCI interrupt-pin register in config
space, as we do when there is no OF device-tree node for the device.
I have verified that this works fine with a pair of Squib-E SAS
adapter on a P6-570.
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6:
parisc: struct device - replace bus_id with dev_name(), dev_set_name()
parisc: fix kernel crash when unwinding a userspace process
parisc: __kernel_time_t is always long
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
irq.h: fix missing/extra kernel-doc
genirq: __irq_set_trigger: change pr_warning to pr_debug
irq: fix typo
x86: apic honour irq affinity which was set in early boot
genirq: fix the affinity setting in setup_irq
genirq: keep affinities set from userspace across free/request_irq()
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: always define DECLARE_PCI_UNMAP* macros
x86: fixup config space size of CPU functions for AMD family 11h
x86, bts: fix wrmsr and spinlock over kmalloc
x86, pebs: fix PEBS record size configuration
x86, bts: turn macro into static inline function
x86, bts: exclude ds.c from build when disabled
arch/x86/kernel/pci-calgary_64.c: change simple_strtol to simple_strtoul
x86: use limited register constraint for setnz
xen: pin correct PGD on suspend
x86: revert irq number limitation
x86: fixing __cpuinit/__init tangle, xsave_cntxt_init()
x86: fix __cpuinit/__init tangle in init_thread_xstate()
oprofile: fix an overflow in ppro code
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] powernow-k8: ignore out-of-range PstateStatus value
[CPUFREQ] Documentation: Add Blackfin to list of supported processors
Compress a set of consecutive switch cases into a case-range.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All architectures now use the generic compat_sys_ptrace, as should every
new architecture that needs 32bit compat (if we'll ever get another).
Remove the now superflous __ARCH_WANT_COMPAT_SYS_PTRACE define, and also
kill a comment about __ARCH_SYS_PTRACE that was added after
__ARCH_SYS_PTRACE was already gone.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
... so get xen-ops.h in agreement with xen/smp.c
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
usual "introduce .text.head, put it in front of TEXT_TEXT in vmlinux.lds.S,
make the stuff up to jump to start_kernel live in it", same as on other
targets.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
switch to __init for those; unlike powerpc sparc has no hotplug support
for that stuff and their ->probe() tends to call __init functions while
being declared __devinit.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All noise since we don't have CPU hotplug there. However, they
did expose something very odd-looking in there - poke_viking()
does a bunch of identical btfixup each time it's called (i.e.
for each CPU). That one is left alone for now; just the trivial
misannotation fixes.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
called only from __init, calls __init. Incidentally, it ought to be static
in file.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
pure noise - alpha doesn't have CPU hotplug
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Devices like b44 ethernet can't dma from addresses above 1GB. The driver
handles this cases by falling back to GFP_DMA allocation. But for detecting
the problem it needs to get an indication from dma_mapping_error.
The bug is triggered by using a VMSPLIT option of 2G/2G.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: remove dead code
If we take a closer look at the rff_trace/rff_action ret_from_fork code,
we have to realize that it does all the wrong things: for example it
checks the TIF flag - while later on jumping back to the ret-from-syscall
path - duplicating the check needlessly.
But checking for _TIF_SYSCALL_TRACE is completely unnecessary here because
we clear that flag for every freshly forked task. So the whole "tracing"
code here, for which there is a out of line jump optimization that makes
it even harder to read, is in reality completely dead code ...
Reported-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Cyrill Gorcunov <gorcunov@gmail.com>
We merge this branch because x86/debug touches code that we started
cleaning up in x86/irq. The two branches started out independent,
but as unexpected amount of activity went into x86/irq, they became
dependent. Resolve that by this cross-merge.
child_rip is called not by its name but indirectly
rather so make it global and aligned.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix boot crash on AMD IOMMU if CONFIG_GART_IOMMU is off
Currently these macros evaluate to a no-op except the kernel is compiled
with GART or Calgary support. But we also need these macros when we have
SWIOTLB, VT-d or AMD IOMMU in the kernel. Since we always compile at
least with SWIOTLB we can define these macros always.
This patch is also for stable backport for the same reason the SWIOTLB
default selection patch is.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
entry_32.S is now the only user of KPROBE_ENTRY / KPROBE_END,
treewide. This patch reorders entry_64.S and explicitly generates
a separate section for functions that need the protection. The
generated code before and after the patch is equal.
The KPROBE_ENTRY and KPROBE_END macro's are removed too.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: clean up assembly macros and annotations - with some object impact
entry_64.S is the only user of KPROBE_ENTRY / KPROBE_END on
x86_64. This patch reorders entry_64.S and explicitly generates
a separate section for functions that need the protection. The
generated code before and after the patch is equal.
Implicitly changing sections in assembly files makes it more
difficult to follow why the assembler is doing certain things.
For example,
.p2align 5
KPROBE_ENTRY(...)
was not doing what you would expect. Other section changes
(__ex_table, .fixup, .init.rodata) are done explicitly already.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We need an alignment of 16384 bytes for the initial kernel stack if
the kernel is configured for 16384 bytes stacks but the linker script
currently guarantees only an alignment of 8192 bytes.
So fix this and simply use THREAD_SIZE as alignment value which will
always do the right thing.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When running several kvm processes with lots of memory overcommitment,
we have seen an oops during process shutdown:
------------[ cut here ]------------
Kernel BUG at 0000000000193434 [verbose debug info unavailable]
addressing exception: 0005 [#1] PREEMPT SMP
Modules linked in: kvm sunrpc qeth_l2 dm_mod qeth ccwgroup
CPU: 10 Not tainted 2.6.28-rc4-kvm-bigiron-00521-g0ccca08-dirty #8
Process kuli (pid: 14460, task: 0000000149822338, ksp: 0000000024f57650)
Krnl PSW : 0704e00180000000 0000000000193434 (unmap_vmas+0x884/0xf10)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 EA:3
Krnl GPRS: 0000000000000002 0000000000000000 000000051008d000 000003e05e6034e0
00000000001933f6 00000000000001e9 0000000407259e0a 00000002be88c400
00000200001c1000 0000000407259608 0000000407259e08 0000000024f577f0
0000000407259e09 0000000000445fa8 00000000001933f6 0000000024f577f0
Krnl Code: 0000000000193426: eb22000c000d sllg %r2,%r2,12
000000000019342c: a7180000 lhi %r1,0
0000000000193430: b2290012 iske %r1,%r2
>0000000000193434: a7110002 tmll %r1,2
0000000000193438: a7840006 brc 8,193444
000000000019343c: 9602c000 oi 0(%r12),2
0000000000193440: 96806000 oi 0(%r6),128
0000000000193444: a7110004 tmll %r1,4
Call Trace:
([<00000000001933f6>] unmap_vmas+0x846/0xf10)
[<0000000000199680>] exit_mmap+0x210/0x458
[<000000000012a8f8>] mmput+0x54/0xfc
[<000000000012f714>] exit_mm+0x134/0x144
[<000000000013120c>] do_exit+0x240/0x878
[<00000000001318dc>] do_group_exit+0x98/0xc8
[<000000000013e6b0>] get_signal_to_deliver+0x30c/0x358
[<000000000010bee0>] do_signal+0xec/0x860
[<0000000000112e30>] sysc_sigpending+0xe/0x22
[<000002000013198a>] 0x2000013198a
INFO: lockdep is turned off.
Last Breaking-Event-Address:
[<00000000001a68d0>] free_swap_and_cache+0x1a0/0x1a4
<4>---[ end trace bc19f1d51ac9db7c ]---
The faulting instruction is the storage key operation (iske) in
ptep_rcp_copy (called by pte_clear, called by unmap_vmas). iske
reads dirty and reference bit information for a physical page and
requires a valid physical address. Since we are in pte_clear, we
cannot rely on the pte containing a valid address. Fortunately we
dont need these information in pte_clear - after all there is no
mapping. The best fix is to remove the needless call to ptep_rcp_copy
that contains the iske.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
CONFIG_PRINTK_TIME reveals that sched_clock has a wrong offset during boot:
..
[ 0.000000] Movable zone: 0 pages used for memmap
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 775679
[ 0.000000] Kernel command line: dasd=4b6c root=/dev/dasda1 ro noinitrd
[ 0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes)
[6920575.975232] console [ttyS0] enabled
[6920575.987586] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
[6920575.991404] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
..
The s390 implementation of sched_clock uses the store clock instruction and
subtracts jiffies_timer_cc.
jiffies_timer_cc is a local variable in arch/s390/kernel/time.c and only used
for sched_clock and monotonic clock. For historical reasons there is an offset
on that value. With todays code this offset is unnecessary. By removing that
offset we can get a sched_clock which returns the nanoseconds after time_init.
This improves CONFIG_PRINTK_TIME.
Since sched_clock is the only user, I have also renamed jiffies_timer_cc to
sched_clock_base_cc. In addition, the local variable init_timer_cc is redundant
and can be romved as well.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
syscall_get_nr() currently returns a valid result only if the call
chain of the traced process includes do_syscall_trace_enter(). But
collect_syscall() can be called for any sleeping task, the result of
syscall_get_nr() in general is completely bogus.
To make syscall_get_nr() work for any sleeping task the traps field
in pt_regs is replace with svcnr - the system call number the process
is executing. If svcnr == 0 the process is not on a system call path.
The syscall_get_arguments and syscall_set_arguments use regs->gprs[2]
for the first system call parameter. This is incorrect since gprs[2]
may have been overwritten with the system call number if the call
chain includes do_syscall_trace_enter. Use regs->orig_gprs2 instead.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use the correct wake-up enable register, and make it
work with 34xx also.
Signed-off-by: Tero Kristo <tero.kristo@nokia.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
(I did not compile or test it, please let me know, or help fixing
it, if something is wrong with the conversion)
This patch is part of a larger patch series which will remove
the "char bus_id[20]" name string from struct device. The device
name is managed in the kobject anyway, and without any size
limitation, and just needlessly copied into "struct device".
To set and read the device name dev_name(dev) and dev_set_name(dev)
must be used. If your code uses static kobjects, which it shouldn't
do, "const char *init_name" can be used to statically provide the
name the registered device should have. At registration time, the
init_name field is cleared, to enforce the use of dev_name(dev) to
access the device name at a later time.
We need to get rid of all occurrences of bus_id in the entire tree
to be able to enable the new interface. Please apply this patch,
and possibly convert any remaining remaining occurrences of bus_id.
We want to submit a patch to -next, which will remove bus_id from
"struct device", to find the remaining pieces to convert, and finally
switch over to the new api, which will remove the 20 bytes array
and does no longer have a size limitation.
Thanks,
Kay
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: linux-parisc@vger.kernel.org
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Any user on existing parisc 32- and 64bit-kernels can easily crash
the kernel and as such enforce a DSO.
A simple testcase is available here:
http://gsyprf10.external.hp.com/~deller/crash.tgz
The problem is introduced by the fact, that the handle_interruption()
crash handler calls the show_regs() function, which in turn tries to
unwind the stack by calling parisc_show_stack(). Since the stack contains
userspace addresses, a try to unwind the stack is dangerous and useless
and leads to the crash.
The fix is trivial: For userspace processes
a) avoid to unwind the stack, and
b) avoid to resolve userspace addresses to kernel symbol names.
While touching this code, I converted print_symbol() to %pS
printk formats and made parisc_show_stack() static.
An initial patch for this was written by Kyle McMartin back in August:
http://marc.info/?l=linux-parisc&m=121805168830283&w=2
Compile and run-tested with a 64bit parisc kernel.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x, 2.6.27.x, earlier...]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
__kernel_time_t is always long on PA-RISC, irrespective of CONFIG_64BIT,
hence move it out of the #ifdef CONFIG_64BIT / #else / #endif block.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
It is possible for a shadow page to have a parent link
pointing to a freed page. When zapping a high level table,
kvm_mmu_page_unlink_children fails to remove the parent_pte link.
For that to happen, the child must be unreachable via the shadow
tree, which can happen in shadow_walk_entry if the guest pte was
modified in between walk() and fetch(). Remove the parent pte
reference in such case.
Possible cause for oops in bug #2217430.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
0 is a valid GPIO number, use a negative number to specify, that this camera
doesn't have a GPIO for bus-width switching.
Signed-off-by: Guennadi Liakhovetski <lg@denx.de>
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Impact: extend allowed configuration space access on 11h CPUs from 256 to 4K
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The second clk_deny_idle instance should be clk_allow_idle instead.
Signed-off-by: Amit Kucheria <amit.kucheria@verdurent.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
A workaround for AMD CPU family 11h erratum 311 might cause that the
P-state Status Register shows a "current P-state" which is larger than
the "current P-state limit" in P-state Current Limit Register. For the
wrong P-state value there is no ACPI _PSS object defined and
powernow-k8/cpufreq can't determine the proper CPU frequency for that
state.
As a consequence this can cause a panic during boot (potentially with
all recent kernel versions -- at least I have reproduced it with
various 2.6.27 kernels and with the current .28 series), as an
example:
powernow-k8: Found 1 AMD Turion(tm)X2 Ultra DualCore Mobile ZM-82 processors (2 \
)
powernow-k8: 0 : pstate 0 (2200 MHz)
powernow-k8: 1 : pstate 1 (1100 MHz)
powernow-k8: 2 : pstate 2 (600 MHz)
BUG: unable to handle kernel paging request at ffff88086e7528b8
IP: [<ffffffff80486361>] cpufreq_stats_update+0x4a/0x5f
PGD 202063 PUD 0
Oops: 0002 [#1] SMP
last sysfs file:
CPU 1
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.28-rc3-dirty #16
RIP: 0010:[<ffffffff80486361>] [<ffffffff80486361>] cpufreq_stats_update+0x4a/0\
f
Synaptics claims to have extended capabilities, but I'm not able to read them.<6\
6
RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88006e7528c0
RDX: 00000000ffffffff RSI: ffff88006e54af00 RDI: ffffffff808f056c
RBP: 00000000fffee697 R08: 0000000000000003 R09: ffff88006e73f080
R10: 0000000000000001 R11: 00000000002191c0 R12: ffff88006fb83c10
R13: 00000000ffffffff R14: 0000000000000001 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88006fb50740(0000) knlGS:0000000000000000
Unable to initialize Synaptics hardware.
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: ffff88086e7528b8 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff88006fb82000, task ffff88006fb816d0)
Stack:
ffff88006e74da50 0000000000000000 ffff88006e54af00 ffffffff804863c7
ffff88006e74da50 0000000000000000 00000000ffffffff 0000000000000000
ffff88006fb83c10 ffffffff8024b46c ffffffff808f0560 ffff88006fb83c10
Call Trace:
[<ffffffff804863c7>] ? cpufreq_stat_notifier_trans+0x51/0x83
[<ffffffff8024b46c>] ? notifier_call_chain+0x29/0x4c
[<ffffffff8024b561>] ? __srcu_notifier_call_chain+0x46/0x61
[<ffffffff8048496d>] ? cpufreq_notify_transition+0x93/0xa9
[<ffffffff8021ab8d>] ? powernowk8_target+0x1e8/0x5f3
[<ffffffff80486687>] ? cpufreq_governor_performance+0x1b/0x20
[<ffffffff80484886>] ? __cpufreq_governor+0x71/0xa8
[<ffffffff80484b21>] ? __cpufreq_set_policy+0x101/0x13e
[<ffffffff80485bcd>] ? cpufreq_add_dev+0x3f0/0x4cd
[<ffffffff8048577a>] ? handle_update+0x0/0x8
[<ffffffff803c2062>] ? sysdev_driver_register+0xb6/0x10d
[<ffffffff8056592c>] ? powernowk8_init+0x0/0x7e
[<ffffffff8048604c>] ? cpufreq_register_driver+0x8f/0x140
[<ffffffff80209056>] ? _stext+0x56/0x14f
[<ffffffff802c2234>] ? proc_register+0x122/0x17d
[<ffffffff802c23a0>] ? create_proc_entry+0x73/0x8a
[<ffffffff8025c259>] ? register_irq_proc+0x92/0xaa
[<ffffffff8025c2c8>] ? init_irq_proc+0x57/0x69
[<ffffffff807fc85f>] ? kernel_init+0x116/0x169
[<ffffffff8020cc79>] ? child_rip+0xa/0x11
[<ffffffff807fc749>] ? kernel_init+0x0/0x169
[<ffffffff8020cc6f>] ? child_rip+0x0/0x11
Code: 05 c5 83 36 00 48 c7 c2 48 5d 86 80 48 8b 04 d8 48 8b 40 08 48 8b 34 02 48\
RIP [<ffffffff80486361>] cpufreq_stats_update+0x4a/0x5f
RSP <ffff88006fb83b20>
CR2: ffff88086e7528b8
---[ end trace 0678bac75e67a2f7 ]---
Kernel panic - not syncing: Attempted to kill init!
In short, aftereffect of the wrong P-state is that
cpufreq_stats_update() uses "-1" as index for some array in
cpufreq_stats_update (unsigned int cpu)
{
...
if (stat->time_in_state)
stat->time_in_state[stat->last_index] =
cputime64_add(stat->time_in_state[stat->last_index],
cputime_sub(cur_time, stat->last_time));
...
}
Fortunately, the wrong P-state value is returned only if the core is
in P-state 0. This fix solves the problem by detecting the
out-of-range P-state, ignoring it, and using "0" instead.
Cc: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Impact: fix sleeping-with-spinlock-held bugs/crashes
- Turn a wrmsr to write the DS_AREA MSR into a wrmsrl.
- Use irqsave variants of spinlocks.
- Do not allocate memory while holding spinlocks.
Reported-by: Stephane Eranian <eranian@googlemail.com>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix DS hw enablement on 64-bit x86
Fix the PEBS record size in the DS configuration.
Reported-by: Stephane Eranian <eranian@googlemail.com>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Replace a macro with a static inline function.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Move the CONFIG guard from the .c file into the makefile.
Reported-by: Andi Kleen <andi-suse@firstfloor.org>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix theoretical option string parsing overflow
Since bridge is unsigned, it would seem better to use simple_strtoul that
simple_strtol.
A simplified version of the semantic patch that makes this change is as
follows: (http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@r2@
long e;
position p;
@@
e = simple_strtol@p(...)
@@
position p != r2.p;
type T;
T e;
@@
e =
- simple_strtol@p
+ simple_strtoul
(...)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: muli@il.ibm.com
Cc: jdmason@kudzu.us
Cc: discuss@x86-64.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: build fix with certain compilers
GCC can decide to use %dil when "r" is used, which is not valid for
setnz.
This bug was brought out by Stephen Rothwell's merging of the
branch tracer into linux-next.
[ Thanks to Uros Bizjak for recommending 'q' over 'Q' ]
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When the VM exits, we must call put_page() for every page referenced in the
shadow TLB.
Without this patch, we usually leak 30-50 host pages (120 - 200 KiB with 4 KiB
pages). The maximum number of pages leaked is the size of our shadow TLB, 64
pages.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Often we do things like put BUG() in the default clause of a case
statement. Since it was not declared __noreturn, this could sometimes
lead to bogus compiler warnings that variables were used
uninitialized.
There is a small problem in that we have to put a magic while(1); loop to
fool GCC into really thinking it is noreturn. This makes the new
BUG() function 3 instructions long instead of just 1, but I think it
is worth it as it is now unnecessary to do extra work to silence the
'used uninitialized' warnings.
I also re-wrote BUG_ON so that if it is given a constant condition, it
just does BUG() instead of loading a constant value in to a register
and testing it.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Impact: improve backtrace quality
avoid the confusion in call trace because of the lack of padding at the
tail of function.
When do_exit gets called, the return address behind call instruction is
pushed into stack. If something get wrong in do_exit, for x86_64, the
entry "kernel_execve +0x00/0xXX" rather than "child_rip +0xYY/0xZZ" is
in the call trace.
That looks confusing, so add a u2d to make the return address still part
of the original call site. (This also catches any instances of us returning
from that function somehow.)
Signed-off-by: jia zhang <jia.zhang2008@gmail.com>
Acked-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: make ENTRY()/END() macros more capable
It's usefull to catch unbalanced or messed or mixed declarations of ENTRY and
KPROBES. These macros would help a bit.
For example the following code would compile without problems
ENTRY_X86(mcount)
retq
END_X86(mcount)
But if you forget and mess the following form
ENTRY_X86(mcount)
retq
END(mcount)
ENTRY_X86(ftrace_caller)
The assembler will issue the following message:
Error: ENTRY_X86/KPROBE_X86 unbalanced,missed,mixed
Actually the checking is performed at every _X86 macro
so maybe it's good idea to put ENTRY_KPROBE_FINAL_X86
at the end of .S file to be sure you didn't miss anything.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Alexander van Heukelum <heukelum@mailshack.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
During page sync, if a pagetable contains a self referencing pte (that
points to the pagetable), the corresponding spte may be marked as
writable even though all mappings are supposed to be write protected.
Fix by clearing page unsync before syncing individual sptes.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Impact: move some code out of .kprobes.text
KPROBE_ENTRY switches code generation to .kprobes.text, and KPROBE_END
uses .popsection to get back to the previous section (.text, normally).
Also replace ENDPROC by END, for consistency.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup of entry_64.S
Except for the order and the place of the functions, this
patch should not change the generated code.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
PAL_VPS_RESUME_HANDLER should use r26 to hold vac fields according to SDM.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Use CFLAGS_vcpu.o, not EXTRA_CFLAGS, to provide fixed register information
to the compiler.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
If an interrupt cannot be injected for some reason (say, page fault
when fetching the IDT descriptor), the interrupt is marked for
reinjection. However, if an NMI is queued at this time, the NMI
will be injected instead and the NMI will be lost.
Fix by deferring the NMI injection until the interrupt has been
injected successfully.
Analyzed by Jan Kiszka.
Signed-off-by: Avi Kivity <avi@redhat.com>
We can get an exit for instructions starting with 0xae, even if the guest is
in userspace. Lets make sure, that the signal processor handler is only called
in guest supervisor mode. Otherwise, send a program check.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Impact: cleanup
Move recently introduced dwarf2 macros to dwarf2.h file.
It allow us to not duplicate them in assembly files.
Active usage of _cfi macros don't make assembly files
more obvious to understand but we already have a lot of
macros there which requires to search the definitions
of them *anyway*. But at least it make every cfi usage
one line shorter.
Also some code alignment is done.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix MSIx not enough irq numbers available regression
The manual revert of the sparse_irq patches missed to bring the number
of possible irqs back to the .27 status. This resulted in a regression
when two multichannel network cards were placed in a system with only
one IO_APIC - causing the networking driver to not have the right
IRQ and the device not coming up.
Remove the dynamic allocation logic leftovers and simply return
NR_IRQS in probe_nr_irqs() for now.
Fixes: http://lkml.org/lkml/2008/11/19/354
Reported-by: Jesper Dangaard Brouer <hawk@diku.dk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jesper Dangaard Brouer <hawk@diku.dk>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Make the headers portion of signal_32.c and signal_64.c the same.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Simplify the irq-sampled stack overflow debug check:
- eliminate an #idef
- use WARN_ONCE() to emit a single warning (all bets are off
after the first such warning anyway)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: make stack overflow debug check and printout narrower
stack_overflow_check() should consider the stack usage of pt_regs, and
thus it could warn us in advance. Additionally, it looks better for
the warning time to start at INITIAL_JIFFIES.
Assuming that rsp gets close to the check point before interrupt
arrives: when interrupt really happens, thread_info will be partly
overrode.
Signed-off-by: jia zhang <jia.zhang2008@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The G3IPL expects the value at RAM address 0xa020b020 to be
exactly 1 to setup the bluetooth GPIOs properly. The actual
code got a value from gpio_get_value() which was not 1, but
a "not equal to 0" integer.
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Eric Miao <eric.miao@marvell.com>
In the resume bootstrap, the early disable address is wrong.
Fix it to RAM address 0xa020b000 instead of 0xa0200000, and
make it consistent with RESUME_ENABLE_ADDR in mioa701.c.
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Impact: fix bootup crash
Even though it tested fine for me, there was still a bug in the
first patch: I have overlooked a call to ptregscall_common. This
patch fixes that, I think, but the code is never executed for
me while running a debian install... (I tested this by putting
an "1:jmp 1b" in there.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
DISABLE_INTERRUPTS(CLBR_NONE)/TRACE_IRQS_OFF is now always
executed just before paranoid_exit. Move it there.
Split out paranoidzeroentry, paranoiderrorentry, and
paranoidzeroentry_ist to get more readable macro's.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>