linux

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	26ff6801f7	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: DT: Cleanup namespace and call irq_set_irq_type() unconditional x86: DT: Fix return condition in irq_create_of_mapping() x86, mpparse: Move check_slot into CONFIG_X86_IO_APIC context	2011-03-25 17:51:51 -07:00
Jason Wessel	21431c2900	kgdb,x86_64: fix compile warning found with sparse Fix sparse warning: arch/x86/kernel/kgdb.c:123:9: warning: switch with no cases Reported-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>	2011-03-25 16:37:31 -05:00
Ingo Molnar	45daae575e	perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue Eric Dumazet reported that hardware PMU events do not work on his system, due to the BIOS corrupting PMU state: Performance Events: PEBS fmt0+, Core2 events, Broken BIOS detected, using software events only. [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 186 is 43003c) Linus suggested that we continue in the face of such BIOS-induced CPU state corruption: http://lkml.org/lkml/2011/3/24/608 Such BIOSes will have to be fixed - Linux developers rely on a working and fully capable PMU and the BIOS interfering with the CPU's PMU state is simply not acceptable. So this patch changes perf to continue when it detects such BIOS interaction, some hardware events may be unreliable due to the BIOS writing and re-writing them - there's not much the kernel can do about that but to detect the corruption and report it. Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-03-25 11:23:41 +01:00
Thomas Gleixner	07611dbda5	x86: DT: Cleanup namespace and call irq_set_irq_type() unconditional That call escaped the name space cleanup. Fix it up. We really want to call there. The chip might have changed since the irq was setup initially. So let the core code and the chip decide what to do. The status is just an unreliable snapshot. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>	2011-03-24 23:17:56 +01:00
Thomas Gleixner	00a30b254b	x86: DT: Fix return condition in irq_create_of_mapping() The xlate() function returns 0 or a negative error code. Returning the error code blindly will be seen as an huge irq number by the calling function because irq_create_of_mapping() returns an unsigned value. Return 0 (NO_IRQ) as required. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>	2011-03-24 23:17:56 +01:00
Don Zickus	242214f9c1	perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows The read of a proper MSR register was missed and instead of counter the configration register was tested (it has ARCH_P4_UNFLAGGED_BIT always cleared) leading to unknown NMI hitting the system. As result the user may obtain "Dazed and confused, but trying to continue" message. Fix it by reading a proper MSR register. When an NMI happens on a P4, the perf nmi handler checks the configuration register to see if the overflow bit is set or not before taking appropriate action. Unfortunately, various P4 machines had a broken overflow bit, so a backup mechanism was implemented. This mechanism checked to see if the counter rolled over or not. A previous commit that implemented this backup mechanism was broken. Instead of reading the counter register, it used the configuration register to determine if the counter rolled over or not. Reading that bit would give incorrect results. This would lead to 'Dazed and confused' messages for the end user when using the perf tool (or if the nmi watchdog is running). The fix is to read the counter register before determining if the counter rolled over or not. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Lin Ming <ming.m.lin@intel.com> LKML-Reference: <4D8BAB49.3080701@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-03-24 21:40:01 +01:00
Andi Kleen	914a76ca5e	oprofile, x86: Allow setting EDGE/INV/CMASK for counter events For some performance events it's useful to set the EDGE and INV bits and the CMASK mask in the counter control register. The list of predefined events Intel releases for each CPU has some events which require these settings to get more "natural" to use higher level events. oprofile currently doesn't allow this. This patch adds new extra configuration fields for them, so that they can be specified in oprofilefs. An updated oprofile daemon can then make use of this to set them. v2: Write back masked extra value to variable. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Robert Richter <robert.richter@amd.com>	2011-03-24 18:45:44 +01:00
Linus Torvalds	047f61c5d1	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (42 commits) ACPI: minor printk format change in acpi_pad ACPI: make acpi_pad /sys output more readable ACPICA: Update version to 20110316 ACPICA: Header support for SLIC table ACPI: Make sure the FADT is at least rev 2 before using the reset register ACPI: Bug compatibility for Windows on the ACPI reboot vector ACPICA: Fix access width for reset vector ACPI battery: fribble sysfs files from a resume notifier ACPI button: remove unused procfs I/F ACPI, APEI, Add PCIe AER error information printing support PCIe, AER, use pre-generated prefix in error information printing ACPI, APEI, Add ERST record ID cache ACPI: Use syscore_ops instead of sysdev class and sysdev ACPI: Remove the unused EC sysdev class ACPI: use __cpuinit for the acpi_processor_set_pdc() call tree ACPI: use __init where possible in processor driver Thermal_Framework-Fix_crash_during_hwmon_unregister ACPICA: Update version to 20110211. ACPICA: Add mechanism to defer _REG methods for some installed handlers ACPICA: Add support for FunctionalFixedHW in acpi_ut_get_region_name ...	2011-03-24 08:25:15 -07:00
Linus Torvalds	a6a1d6485e	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6 * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: (90 commits) mfd: Push byte swaps out of wm8994 bulk read path mfd: Rename ab8500 gpadc header mfd: Constify WM8994 write path mfd: Push byte swap out of WM8994 bulk I/O mfd: Avoid copying data in WM8994 I2C write mfd: Remove copy from WM831x I2C write function mfd: Staticise WM8994 PM ops regulator: Add a subdriver for TI TPS6105x regulator portions v2 mfd: Add a core driver for TI TPS61050/TPS61052 chips v2 gpio: Add Tunnel Creek support to sch_gpio mfd: Add Tunnel Creek support to lpc_sch pci_ids: Add Intel Tunnel Creek LPC Bridge device ID. regulator: MAX8997/8966 support mfd: Add WM8994 bulk register write operation mfd: Append additional read write on 88pm860x mfd: Adopt mfd_data in 88pm860x input driver mfd: Adopt mfd_data in 88pm860x regulator mfd: Adopt mfd_data in 88pm860x led mfd: Adopt mfd_data in 88pm860x backlight mfd: Fix MAX8997 Kconfig entry typos ...	2011-03-24 07:59:01 -07:00
Daniel De Graaf	b254244d26	xen/p2m: Allocate p2m tracking pages on override It is possible to add a p2m override on pages that are currently mapped to INVALID_P2M_ENTRY; in particular, this will happen when using ballooned pages in gntdev. This means that set_phys_to_machine must be used instead of __set_phys_to_machine. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-03-24 10:36:01 -04:00
Namhyung Kim	71f9e59800	x86, dumpstack: Use %pB format specifier for stack trace Improve noreturn function entries in call traces: Before: Call Trace: [<ffffffff812a8502>] panic+0x8c/0x18d [<ffffffffa000012a>] deep01+0x0/0x38 [test_panic] <--- bad [<ffffffff81104666>] proc_file_write+0x73/0x8d [<ffffffff811000b3>] proc_reg_write+0x8d/0xac [<ffffffff810c7d32>] vfs_write+0xa1/0xc5 [<ffffffff810c7e0f>] sys_write+0x45/0x6c [<ffffffff8f02943b>] system_call_fastpath+0x16/0x1b After: Call Trace: [<ffffffff812bce69>] panic+0x8c/0x18d [<ffffffffa000012a>] panic_write+0x20/0x20 [test_panic] <--- good [<ffffffff81115fab>] proc_file_write+0x73/0x8d [<ffffffff81111a5f>] proc_reg_write+0x8d/0xac [<ffffffff810d90ee>] vfs_write+0xa1/0xc5 [<ffffffff810d91cb>] sys_write+0x45/0x6c [<ffffffff812c07fb>] system_call_fastpath+0x16/0x1b Signed-off-by: Namhyung Kim <namhyung@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <1300934550-21394-2-git-send-email-namhyung@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-03-24 08:36:10 +01:00
Linus Torvalds	b81a618dcd	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: deal with races in /proc//{syscall,stack,personality} proc: enable writing to /proc/pid/mem proc: make check_mem_permission() return an mm_struct on success proc: hold cred_guard_mutex in check_mem_permission() proc: disable mem_write after exec mm: implement access_remote_vm mm: factor out main logic of access_process_vm mm: use mm_struct to resolve gate vma's in __get_user_pages mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm mm: arch: make in_gate_area take an mm_struct instead of a task_struct mm: arch: make get_gate_vma take an mm_struct instead of a task_struct x86: mark associated mm when running a task in 32 bit compatibility mode x86: add context tag to mark mm when running a task in 32-bit compatibility mode auxv: require the target to be tracable (or yourself) close race in /proc//environ report errors in /proc//map* sanely pagemap: close races with suid execve make sessionid permissions in /proc//task/ match those in /proc/* fix leaks in path_lookupat() Fix up trivial conflicts in fs/proc/base.c	2011-03-23 20:51:42 -07:00
Olaf Hering	93a72052be	crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn The Xen PV drivers in a crashed HVM guest can not connect to the dom0 backend drivers because both frontend and backend drivers are still in connected state. To run the connection reset function only in case of a crashdump, the is_kdump_kernel() function needs to be available for the PV driver modules. Consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn into kernel/crash_dump.c Also export elfcorehdr_addr to make is_kdump_kernel() usable for modules. Leave 'elfcorehdr' as early_param(). This changes powerpc from __setup() to early_param(). It adds an address range check from x86 also on ia64 and powerpc. [akpm@linux-foundation.org: additional #includes] [akpm@linux-foundation.org: remove elfcorehdr_addr export] [akpm@linux-foundation.org: fix for Tejun's mm/nobootmem.c changes] Signed-off-by: Olaf Hering <olaf@aepfle.de> Cc: Russell King <rmk@arm.linux.org.uk> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-23 19:47:19 -07:00
FUJITA Tomonori	8547727756	remove dma64_addr_t There is no user now. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: David Miller <davem@davemloft.net> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-23 19:47:18 -07:00
Alexandre Bounine	388b78adc9	rapidio: modify configuration to support PCI-SRIO controller 1. Add an option to include RapidIO support if the PCI is available. 2. Add FSL_RIO configuration option to enable controller selection. 3. Add RapidIO support option into x86 and MIPS architectures. Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Acked-by: Kumar Gala <galak@kernel.crashing.org> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Li Yang <leoli@freescale.com> Cc: Thomas Moll <thomas.moll@sysgo.com> Cc: Micha Nelissen <micha@neli.hopto.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-23 19:46:42 -07:00
Akinobu Mita	61f2e7b0f4	bitops: remove minix bitops from asm/bitops.h minix bit operations are only used by minix filesystem and useless by other modules. Because byte order of inode and block bitmaps is different on each architecture like below: m68k: big-endian 16bit indexed bitmaps h8300, microblaze, s390, sparc, m68knommu: big-endian 32 or 64bit indexed bitmaps m32r, mips, sh, xtensa: big-endian 32 or 64bit indexed bitmaps for big-endian mode little-endian bitmaps for little-endian mode Others: little-endian bitmaps In order to move minix bit operations from asm/bitops.h to architecture independent code in minix filesystem, this provides two config options. CONFIG_MINIX_FS_BIG_ENDIAN_16BIT_INDEXED is only selected by m68k. CONFIG_MINIX_FS_NATIVE_ENDIAN is selected by the architectures which use native byte order bitmaps (h8300, microblaze, s390, sparc, m68knommu, m32r, mips, sh, xtensa). The architectures which always use little-endian bitmaps do not select these options. Finally, we can remove minix bit operations from asm/bitops.h for all architectures. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Greg Ungerer <gerg@uclinux.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: "David S. Miller" <davem@davemloft.net> Cc: Hirokazu Takata <takata@linux-m32r.org> Acked-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-23 19:46:22 -07:00
Akinobu Mita	f312eff816	bitops: remove ext2 non-atomic bitops from asm/bitops.h As the result of conversions, there are no users of ext2 non-atomic bit operations except for ext2 filesystem itself. Now we can put them into architecture independent code in ext2 filesystem, and remove from asm/bitops.h for all architectures. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-23 19:46:21 -07:00
Akinobu Mita	861b5ae7cd	bitops: introduce little-endian bitops for most architectures Introduce little-endian bit operations to the big-endian architectures which do not have native little-endian bit operations and the little-endian architectures. (alpha, avr32, blackfin, cris, frv, h8300, ia64, m32r, mips, mn10300, parisc, sh, sparc, tile, x86, xtensa) These architectures can just include generic implementation (asm-generic/bitops/le.h). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Matthew Wilcox <willy@debian.org> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Zankel <chris@zankel.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Acked-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-23 19:46:15 -07:00
Rafael J. Wysocki	d47d81c0e9	Introduce ARCH_NO_SYSDEV_OPS config option (v2) Introduce Kconfig option allowing architectures where sysdev operations used during system suspend, resume and shutdown have been completely replaced with struct sycore_ops operations to avoid building sysdev code that will never be used. Make callbacks in struct sys_device and struct sysdev_driver depend on ARCH_NO_SYSDEV_OPS to allows us to verify if all of the references have been actually removed from the code the given architecture depends on. Make x86 select ARCH_NO_SYSDEV_OPS. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2011-03-23 22:16:41 +01:00
Rafael J. Wysocki	f3c6ea1b06	x86: Use syscore_ops instead of sysdev classes and sysdevs Some subsystems in the x86 tree need to carry out suspend/resume and shutdown operations with one CPU on-line and interrupts disabled and they define sysdev classes and sysdevs or sysdev drivers for this purpose. This leads to unnecessarily complicated code and excessive memory usage, so switch them to using struct syscore_ops objects for this purpose instead. Generally, there are three categories of subsystems that use sysdevs for implementing PM operations: (1) subsystems whose suspend/resume callbacks ignore their arguments entirely (the majority), (2) subsystems whose suspend/resume callbacks use their struct sys_device argument, but don't really need to do that, because they can be implemented differently in an arguably simpler way (io_apic.c), and (3) subsystems whose suspend/resume callbacks use their struct sys_device argument, but the value of that argument is always the same and could be ignored (microcode_core.c). In all of these cases the subsystems in question may be readily converted to using struct syscore_ops objects for power management and shutdown. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu>	2011-03-23 22:15:54 +01:00
Stephen Wilson	cae5d39032	mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm Now that gate vma's are referenced with respect to a particular mm and not a particular task it only makes sense to propagate the change to this predicate as well. Signed-off-by: Stephen Wilson <wilsons@start.ca> Reviewed-by: Michel Lespinasse <walken@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-03-23 16:36:55 -04:00
Stephen Wilson	83b964bbf8	mm: arch: make in_gate_area take an mm_struct instead of a task_struct Morally, the question of whether an address lies in a gate vma should be asked with respect to an mm, not a particular task. Moreover, dropping the dependency on task_struct will help make existing and future operations on mm's more flexible and convenient. Signed-off-by: Stephen Wilson <wilsons@start.ca> Reviewed-by: Michel Lespinasse <walken@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-03-23 16:36:54 -04:00
Stephen Wilson	31db58b3ab	mm: arch: make get_gate_vma take an mm_struct instead of a task_struct Morally, the presence of a gate vma is more an attribute of a particular mm than a particular task. Moreover, dropping the dependency on task_struct will help make both existing and future operations on mm's more flexible and convenient. Signed-off-by: Stephen Wilson <wilsons@start.ca> Reviewed-by: Michel Lespinasse <walken@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-03-23 16:36:54 -04:00
Stephen Wilson	375906f876	x86: mark associated mm when running a task in 32 bit compatibility mode This patch simply follows the same practice as for setting the TIF_IA32 flag. In particular, an mm is marked as holding 32-bit tasks when a 32-bit binary is exec'ed. Both ELF and a.out formats are updated. Signed-off-by: Stephen Wilson <wilsons@start.ca> Reviewed-by: Michel Lespinasse <walken@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-03-23 16:36:53 -04:00
Stephen Wilson	c2ef45df3b	x86: add context tag to mark mm when running a task in 32-bit compatibility mode This tag is intended to mirror the thread info TIF_IA32 flag. Will be used to identify mm's which support 32 bit tasks running in compatibility mode without requiring a reference to the task itself. Signed-off-by: Stephen Wilson <wilsons@start.ca> Reviewed-by: Michel Lespinasse <walken@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-03-23 16:36:52 -04:00
Andres Salomon	f77289ac25	mfd: Rename mfd_shared_cell_{en,dis}able to drop the "shared" part As requested by Samuel, there's not really any reason to have "shared" in the name. This also modifies the only user of the function, as well. Signed-off-by: Andres Salomon <dilinger@queued.net> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2011-03-23 10:42:03 +01:00
Andres Salomon	1310e6d638	mfd: Add sharing for cs5535 acpi/pms cells This enables sharing of cs5535-mfd cells via the new mfd_shared_* API. Hooks for enable/disble of resources are added, with refcounting of resources being automatically handled so that cs5535_mfd_res_enable/disable are only called when necessary. Clients of cs5535-mfd (in this case, olpc-xo1.c) are also modified to use the mfd_shared API. The platform drivers are also renamed to olpc-xo1-{pms,acpi}, and resource enabling/disabling is replaced with mfd_shared API calls. Signed-off-by: Andres Salomon <dilinger@queued.net> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2011-03-23 10:41:58 +01:00
Len Brown	02e2407858	Merge branch 'linus' into release Conflicts: arch/x86/kernel/acpi/sleep.c Signed-off-by: Len Brown <len.brown@intel.com>	2011-03-23 02:34:54 -04:00
Len Brown	5c129a8600	Merge commit 'v2.6.38' into release	2011-03-23 02:33:54 -04:00
David Rientjes	1c00f0161f	x86: allow CONFIG_ISA_DMA_API to be disabled Not all 64-bit systems require ISA-style DMA, so allow it to be configurable. x86 utilizes the generic ISA DMA allocator from kernel/dma.c, so require it only when CONFIG_ISA_DMA_API is enabled. Disabling CONFIG_ISA_DMA_API is dependent on x86_64 since those machines do not have ISA slots and benefit the most from disabling the option (and on CONFIG_EXPERT as required by H. Peter Anvin). When disabled, this also avoids declaring claim_dma_lock(), release_dma_lock(), request_dma(), and free_dma() since those interfaces will no longer be provided. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-22 17:44:16 -07:00
David Rientjes	8df3bd9e18	x86: only compile floppy driver if CONFIG_ISA_DMA_API is enabled The generic floppy disk driver utilizies the interface provided by CONFIG_ISA_DMA_API, specifically claim_dma_lock(), release_dma_lock(), request_dma(), and free_dma(). Thus, there's a strict dependency on the config option and the driver should only be loaded if the kernel supports ISA-style DMA. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-22 17:44:16 -07:00
David Rientjes	4061d68e1a	x86: only compile 8237A if CONFIG_ISA_DMA_API is enabled 8237A utilizes the interface provided by CONFIG_ISA_DMA_API, specifically claim_dma_lock() and release_dma_lock(). Thus, there's a strict dependency on the config option and the module should only be loaded if the kernel supports ISA-style DMA. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-22 17:44:16 -07:00
Olaf Hering	d404ab0a11	move x86 specific oops=panic to generic code The oops=panic cmdline option is not x86 specific, move it to generic code. Update documentation. Signed-off-by: Olaf Hering <olaf@aepfle.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-22 17:44:11 -07:00
FUJITA Tomonori	3e50594e8e	add the common dma_addr_t typedef to include/linux/types.h All architectures can use the common dma_addr_t typedef now. We can remove the arch specific dma_addr_t. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Richard Henderson <rth@twiddle.net> Cc: Matt Turner <mattst88@gmail.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-22 17:44:09 -07:00
Eric Dumazet	b6a84016bd	mm: NUMA aware alloc_thread_info_node() Add a node parameter to alloc_thread_info(), and change its name to alloc_thread_info_node() This change is needed to allow NUMA aware kthread_create_on_cpu() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Reviewed-by: Andi Kleen <ak@linux.intel.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Tejun Heo <tj@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: David Howells <dhowells@redhat.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-22 17:44:01 -07:00
Linus Torvalds	73d5a8675f	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: xen: update mask_rw_pte after kernel page tables init changes xen: set max_pfn_mapped to the last pfn mapped x86: Cleanup highmap after brk is concluded Fix up trivial onflict (added header file includes) in arch/x86/mm/init_64.c	2011-03-22 10:41:36 -07:00
Rakib Mullick	cbb84c4cc1	x86, mpparse: Move check_slot into CONFIG_X86_IO_APIC context When CONFIG_X86_MPPARSE=y and CONFIG_X86_IO_APIC=n, then we get the following warning: arch/x86/kernel/mpparse.c:723: warning: 'check_slot' defined but not used So, put check_slot into CONFIG_X86_IO_APIC context. Its only called from CONFIG_X86_IO_APIC=y context. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> LKML-Reference: <AANLkTinsUfGc=NG_GeH_B+zFVu+DXJzZbJKdQLscqfuH@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-03-22 13:00:22 +01:00
Len Brown	25076246e8	Merge branch 'apei-release' into release	2011-03-22 01:41:47 -04:00
Huang Ying	885b976fad	ACPI, APEI, Add ERST record ID cache APEI ERST firmware interface and implementation has no multiple users in mind. For example, if there is four records in storage with ID: 1, 2, 3 and 4, if two ERST readers enumerate the records via GET_NEXT_RECORD_ID as follow, reader 1 reader 2 1 2 3 4 -1 -1 where -1 signals there is no more record ID. Reader 1 has no chance to check record 2 and 4, while reader 2 has no chance to check record 1 and 3. And any other GET_NEXT_RECORD_ID will return -1, that is, other readers will has no chance to check any record even they are not cleared by anyone. This makes raw GET_NEXT_RECORD_ID not suitable for used by multiple users. To solve the issue, an in-memory ERST record ID cache is designed and implemented. When enumerating record ID, the ID returned by GET_NEXT_RECORD_ID is added into cache in addition to be returned to caller. So other readers can check the cache to get all record ID available. Signed-off-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2011-03-21 22:59:06 -04:00
Sage Weil	b7ed78f565	introduce sys_syncfs to sync a single file system It is frequently useful to sync a single file system, instead of all mounted file systems via sync(2): - On machines with many mounts, it is not at all uncommon for some of them to hang (e.g. unresponsive NFS server). sync(2) will get stuck on those and may never get to the one you do care about (e.g., /). - Some applications write lots of data to the file system and then want to make sure it is flushed to disk. Calling fsync(2) on each file introduces unnecessary ordering constraints that result in a large amount of sub-optimal writeback/flush/commit behavior by the file system. There are currently two ways (that I know of) to sync a single super_block: - BLKFLSBUF ioctl on the block device: That also invalidates the bdev mapping, which isn't usually desirable, and doesn't work for non-block file systems. - 'mount -o remount,rw' will call sync_filesystem as an artifact of the current implemention. Relying on this little-known side effect for something like data safety sounds foolish. Both of these approaches require root privileges, which some applications do not have (nor should they need?) given that sync(2) is an unprivileged operation. This patch introduces a new system call syncfs(2) that takes an fd and syncs only the file system it references. Maybe someday we can $ sync /some/path and not get sync: ignoring all arguments The syscall is motivated by comments by Al and Christoph at the last LSF. syncfs(2) seems like an appropriate name given statfs(2). A similar ioctl was also proposed a while back, see http://marc.info/?l=linux-fsdevel&m=127970513829285&w=2 Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-03-21 00:40:29 -04:00
Stefano Stabellini	d8aa5ec338	xen: update mask_rw_pte after kernel page tables init changes After "x86-64, mm: Put early page table high" already existing kernel page table pages can be mapped using early_ioremap too so we need to update mask_rw_pte to make sure these pages are still mapped RO. The reason why we have to do that is explain by the commit message of fef5ba797991f9335bcfc295942b684f9bf613a1: "Xen requires that all pages containing pagetable entries to be mapped read-only. If pages used for the initial pagetable are already mapped then we can change the mapping to RO. However, if they are initially unmapped, we need to make sure that when they are later mapped, they are also mapped RO. ..SNIP.. the pagetable setup code early_ioremaps the pages to write their entries, so we must make sure that mappings created in the early_ioremap fixmap area are mapped RW. (Those mappings are removed before the pages are presented to Xen as pagetable pages.)" We accomplish all this in mask_rw_pte by mapping RO all the pages mapped using early_ioremap apart from the last one that has been allocated because it is not a page table page yet (it has not been hooked into the page tables yet). Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> LKML-Reference: <alpine.DEB.2.00.1103171739050.3382@kaball-desktop> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2011-03-19 11:58:28 -07:00
Stefano Stabellini	14988a4d35	xen: set max_pfn_mapped to the last pfn mapped Do not set max_pfn_mapped to the end of the initial memory mappings, that also contain pages that don't belong in pfn space (like the mfn list). Set max_pfn_mapped to the last real pfn mapped in the initial memory mappings that is the pfn backing _end. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> LKML-Reference: <alpine.DEB.2.00.1103171739050.3382@kaball-desktop> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2011-03-19 11:58:25 -07:00
Yinghai Lu	e5f15b45dd	x86: Cleanup highmap after brk is concluded Now cleanup_highmap actually is in two steps: one is early in head64.c and only clears above _end; a second one is in init_memory_mapping() and tries to clean from _brk_end to _end. It should check if those boundaries are PMD_SIZE aligned but currently does not. Also init_memory_mapping() is called several times for numa or memory hotplug, so we really should not handle initial kernel mappings there. This patch moves cleanup_highmap() down after _brk_end is settled so we can do everything in one step. Also we honor max_pfn_mapped in the implementation of cleanup_highmap. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> LKML-Reference: <alpine.DEB.2.00.1103171739050.3382@kaball-desktop> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2011-03-19 11:58:19 -07:00
Stephane Eranian	fc66c5210e	perf, x86: Fix Intel fixed counters base initialization The following patch solves the problems introduced by Robert's commit `41bf498` and reported by Arun Sharma. This commit gets rid of the base + index notation for reading and writing PMU msrs. The problem is that for fixed counters, the new calculation for the base did not take into account the fixed counter indexes, thus all fixed counters were read/written from fixed counter 0. Although all fixed counters share the same config MSR, they each have their own counter register. Without: $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds 242202299 unhalted_core_cycles (0.00% scaling, ena=1000790892, run=1000790892) 2389685946 instructions_retired (0.00% scaling, ena=1000790892, run=1000790892) 49473 baclears (0.00% scaling, ena=1000790892, run=1000790892) With: $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds 2392703238 unhalted_core_cycles (0.00% scaling, ena=1000840809, run=1000840809) 2389793744 instructions_retired (0.00% scaling, ena=1000840809, run=1000840809) 47863 baclears (0.00% scaling, ena=1000840809, run=1000840809) Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: ming.m.lin@intel.com Cc: robert.richter@amd.com Cc: asharma@fb.com Cc: perfmon2-devel@lists.sf.net LKML-Reference: <20110319172005.GB4978@quad> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-03-19 19:00:49 +01:00
Len Brown	05534c9ffc	Merge branch 'acpica' into release	2011-03-18 18:06:08 -04:00
Linus Torvalds	99759619b2	Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: label: remove #include of ACPI header to avoid warnings PCI: label: Fix compilation error when CONFIG_ACPI is unset PCI: pre-allocate additional resources to devices only after successful allocation of essential resources. PCI: introduce reset_resource() PCI: data structure agnostic free list function PCI: refactor io size calculation code PCI: do not create quirk I/O regions below PCIBIOS_MIN_IO for ICH PCI hotplug: acpiphp: set current_state to D0 in register_slot PCI: Export ACPI _DSM provided firmware instance number and string name to sysfs PCI: add more checking to ICH region quirks PCI: aer-inject: Override PCIe AER Mask Registers PCI: fix tlan build when CONFIG_PCI is not enabled PCI: remove quirk for pre-production systems PCI: Avoid potential NULL pointer dereference in pci_scan_bridge PCI/lpc: irq and pci_ids patch for Intel DH89xxCC DeviceIDs PCI: sysfs: Fix failure path for addition of "vpd" attribute	2011-03-18 10:56:44 -07:00
Linus Torvalds	f2e1fbb5f2	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Flush TLB if PGD entry is changed in i386 PAE mode x86, dumpstack: Correct stack dump info when frame pointer is available x86: Clean up csum-copy_64.S a bit x86: Fix common misspellings x86: Fix misspelling and align params x86: Use PentiumPro-optimized partial_csum() on VIA C7	2011-03-18 10:45:21 -07:00
Linus Torvalds	619297855a	Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits) trace, filters: Initialize the match variable in process_ops() properly trace, documentation: Fix branch profiling location in debugfs oprofile, s390: Cleanups oprofile, s390: Remove hwsampler_files.c and merge it into init.c perf: Fix tear-down of inherited group events perf: Reorder & optimize perf_event_context to remove alignment padding on 64 bit builds perf: Handle stopped state with tracepoints perf: Fix the software events state check perf, powerpc: Handle events that raise an exception without overflowing perf, x86: Use INTEL_*_CONSTRAINT() for all PEBS event constraints perf, x86: Clean up SandyBridge PEBS events perf lock: Fix sorting by wait_min perf tools: Version incorrect with some versions of grep perf evlist: New command to list the names of events present in a perf.data file perf script: Add support for H/W and S/W events perf script: Add support for dumping symbols perf script: Support custom field selection for output perf script: Move printing of 'common' data from print_event and rename perf tracing: Remove print_graph_cpu and print_graph_proc from trace-event-parse perf script: Change process_event prototype ...	2011-03-18 10:38:34 -07:00
Linus Torvalds	e16b396ce3	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (47 commits) doc: CONFIG_UNEVICTABLE_LRU doesn't exist anymore Update cpuset info & webiste for cgroups dcdbas: force SMI to happen when expected arch/arm/Kconfig: remove one to many l's in the word. asm-generic/user.h: Fix spelling in comment drm: fix printk typo 'sracth' Remove one to many n's in a word Documentation/filesystems/romfs.txt: fixing link to genromfs drivers:scsi Change printk typo initate -> initiate serial, pch uart: Remove duplicate inclusion of linux/pci.h header fs/eventpoll.c: fix spelling mm: Fix out-of-date comments which refers non-existent functions drm: Fix printk typo 'failled' coh901318.c: Change initate to initiate. mbox-db5500.c Change initate to initiate. edac: correct i82975x error-info reported edac: correct i82975x mci initialisation edac: correct commented info fs: update comments to point correct document target: remove duplicate include of target/target_core_device.h from drivers/target/target_core_hba.c ... Trivial conflict in fs/eventpoll.c (spelling vs addition)	2011-03-18 10:37:40 -07:00
Shaohua Li	4981d01ead	x86: Flush TLB if PGD entry is changed in i386 PAE mode According to intel CPU manual, every time PGD entry is changed in i386 PAE mode, we need do a full TLB flush. Current code follows this and there is comment for this too in the code. But current code misses the multi-threaded case. A changed page table might be used by several CPUs, every such CPU should flush TLB. Usually this isn't a problem, because we prepopulate all PGD entries at process fork. But when the process does munmap and follows new mmap, this issue will be triggered. When it happens, some CPUs keep doing page faults: http://marc.info/?l=linux-kernel&m=129915020508238&w=2 Reported-by: Yasunori Goto<y-goto@jp.fujitsu.com> Tested-by: Yasunori Goto<y-goto@jp.fujitsu.com> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Shaohua Li<shaohua.li@intel.com> Cc: Mallick Asit K <asit.k.mallick@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-mm <linux-mm@kvack.org> Cc: stable <stable@kernel.org> LKML-Reference: <1300246649.2337.95.camel@sli10-conroe> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-03-18 11:44:01 +01:00
Namhyung Kim	e8e999cf3c	x86, dumpstack: Correct stack dump info when frame pointer is available Current stack dump code scans entire stack and check each entry contains a pointer to kernel code. If CONFIG_FRAME_POINTER=y it could mark whether the pointer is valid or not based on value of the frame pointer. Invalid entries could be preceded by '?' sign. However this was not going to happen because scan start point was always higher than the frame pointer so that they could not meet. Commit `9c0729dc80` ("x86: Eliminate bp argument from the stack tracing routines") delayed bp acquisition point, so the bp was read in lower frame, thus all of the entries were marked invalid. This patch fixes this by reverting above commit while retaining stack_frame() helper as suggested by Frederic Weisbecker. End result looks like below: before: [ 3.508329] Call Trace: [ 3.508551] [<ffffffff814f35c9>] ? panic+0x91/0x199 [ 3.508662] [<ffffffff814f3739>] ? printk+0x68/0x6a [ 3.508770] [<ffffffff81a981b2>] ? mount_block_root+0x257/0x26e [ 3.508876] [<ffffffff81a9821f>] ? mount_root+0x56/0x5a [ 3.508975] [<ffffffff81a98393>] ? prepare_namespace+0x170/0x1a9 [ 3.509216] [<ffffffff81a9772b>] ? kernel_init+0x1d2/0x1e2 [ 3.509335] [<ffffffff81003894>] ? kernel_thread_helper+0x4/0x10 [ 3.509442] [<ffffffff814f6880>] ? restore_args+0x0/0x30 [ 3.509542] [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2 [ 3.509641] [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10 after: [ 3.522991] Call Trace: [ 3.523351] [<ffffffff814f35b9>] panic+0x91/0x199 [ 3.523468] [<ffffffff814f3729>] ? printk+0x68/0x6a [ 3.523576] [<ffffffff81a981b2>] mount_block_root+0x257/0x26e [ 3.523681] [<ffffffff81a9821f>] mount_root+0x56/0x5a [ 3.523780] [<ffffffff81a98393>] prepare_namespace+0x170/0x1a9 [ 3.523885] [<ffffffff81a9772b>] kernel_init+0x1d2/0x1e2 [ 3.523987] [<ffffffff81003894>] kernel_thread_helper+0x4/0x10 [ 3.524228] [<ffffffff814f6880>] ? restore_args+0x0/0x30 [ 3.524345] [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2 [ 3.524445] [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10 -v5: * fix build breakage with oprofile -v4: * use 0 instead of regs->bp * separate out printk changes -v3: * apply comment from Frederic * add a couple of printk fixes Signed-off-by: Namhyung Kim <namhyung@gmail.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Soren Sandmann <ssp@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Robert Richter <robert.richter@amd.com> LKML-Reference: <1300416006-3163-1-git-send-email-namhyung@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-03-18 10:51:42 +01:00
Ingo Molnar	2c76397bdd	x86: Clean up csum-copy_64.S a bit The many stray whitespaces and other uncleanlinesses made this code almost unreadable to me - so fix those. No changes to the code. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-03-18 10:44:26 +01:00
Lucas De Marchi	0d2eb44f63	x86: Fix common misspellings They were generated by 'codespell' and then manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Cc: trivial@kernel.org LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-03-18 10:39:30 +01:00
Ingo Molnar	8dd8997d2c	Merge branch 'linus' into x86/urgent Merge reason: Merge upstream commits to avoid conflicts in upcoming patches. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-03-18 10:39:00 +01:00
Linus Torvalds	ec0afc9311	Merge branch 'kvm-updates/2.6.39' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/2.6.39' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (55 commits) KVM: unbreak userspace that does not sets tss address KVM: MMU: cleanup pte write path KVM: MMU: introduce a common function to get no-dirty-logged slot KVM: fix rcu usage in init_rmode_* functions KVM: fix kvmclock regression due to missing clock update KVM: emulator: Fix permission checking in io permission bitmap KVM: emulator: Fix io permission checking for 64bit guest KVM: SVM: Load %gs earlier if CONFIG_X86_32_LAZY_GS=n KVM: x86: Remove useless regs_page pointer from kvm_lapic KVM: improve comment on rcu use in irqfd_deassign KVM: MMU: remove unused macros KVM: MMU: cleanup page alloc and free KVM: MMU: do not record gfn in kvm_mmu_pte_write KVM: MMU: move mmu pages calculated out of mmu lock KVM: MMU: set spte accessed bit properly KVM: MMU: fix kvm_mmu_slot_remove_write_access dropping intermediate W bits KVM: Start lock documentation KVM: better readability of efer_reserved_bits KVM: Clear async page fault hash after switching to real mode KVM: VMX: Initialize vm86 TSS only once. ...	2011-03-17 18:40:35 -07:00
Linus Torvalds	5a39837f76	Merge branches 'stable/irq.fairness' and 'stable/irq.ween_of_nr_irqs' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/irq.fairness' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: events: Remove redundant clear of l2i at end of round-robin loop xen: events: Make round-robin scan fairer by snapshotting each l2 word once only xen: events: Clean up round-robin evtchn scan. xen: events: Make last processed event channel a per-cpu variable. xen: events: Process event channels notifications in round-robin order. * 'stable/irq.ween_of_nr_irqs' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: events: Fix compile error if CONFIG_SMP is not defined. xen: events: correct locking in xen_irq_from_pirq xen: events: propagate irq allocation failure instead of panicking xen: events: do not workaround too-small nr_irqs xen: events: remove use of nr_irqs as upper bound on number of pirqs xen: events: dynamically allocate irq info structures xen: events: maintain a list of Xen interrupts xen: events: push setup of irq<->{evtchn,ipi,virq,pirq} maps into irq_info init functions xen: events: turn irq_info constructors into initialiser functions xen: events: use per-cpu variable for cpu_evtchn_mask xen: events: refactor GSI pirq bindings functions xen: events: rename restore_cpu_pirqs -> restore_pirqs xen: events: remove unused public functions xen: events: fix xen_map_pirq_gsi error return xen: events: simplify comment xen: events: separate two unrelated halves of if condition Fix up trivial conflicts in drivers/xen/events.c	2011-03-17 18:27:49 -07:00
Linus Torvalds	514af9f790	Merge branches 'stable/hvc-console', 'stable/gntalloc.v6' and 'stable/balloon' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/hvc-console' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/hvc: Disable probe_irq_on/off from poking the hvc-console IRQ line. * 'stable/gntalloc.v6' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: gntdev: fix build warning xen/p2m/m2p/gnttab: do not add failed grant maps to m2p override xen-gntdev: Add cast to pointer xen-gntdev: Fix incorrect use of zero handle xen: change xen/[gntdev/gntalloc] to default m xen-gntdev: prevent using UNMAP_NOTIFY_CLEAR_BYTE on read-only mappings xen-gntdev: Avoid double-mapping memory xen-gntdev: Avoid unmapping ranges twice xen-gntdev: Use map->vma for checking map validity xen-gntdev: Fix unmap notify on PV domains xen-gntdev: Fix memory leak when mmap fails xen/gntalloc,gntdev: Add unmap notify ioctl xen-gntalloc: Userspace grant allocation driver xen-gntdev: Support mapping in HVM domains xen-gntdev: Add reference counting to maps xen-gntdev: Use find_vma rather than iterating our vma list manually xen-gntdev: Change page limit to be global instead of per-open * 'stable/balloon' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (24 commits) xen-gntdev: Use ballooned pages for grant mappings xen-balloon: Add interface to retrieve ballooned pages xen-balloon: Move core balloon functionality out of module xen/balloon: Remove pr_info's and don't alter retry_count xen/balloon: Protect against CPU exhaust by event/x process xen/balloon: Migration from mod_timer() to schedule_delayed_work() xen/balloon: Removal of driver_pages	2011-03-17 18:16:36 -07:00
Linus Torvalds	61ef46fd45	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] pcc-cpufreq: remove duplicate statements [CPUFREQ] Remove the pm_message_t argument from driver suspend [CPUFREQ] Remove unneeded locks [CPUFREQ] Remove old, deprecated per cpu ondemand/conservative sysfs files [CPUFREQ] Remove deprecated sysfs file sampling_rate_max [CPUFREQ] powernow-k8: The table index is not worth displaying [CPUFREQ] calculate delay after dbs_check_cpu [CPUFREQ] Add documentation for sampling_down_factor [CPUFREQ] drivers/cpufreq: Remove unnecessary semicolons	2011-03-17 17:42:14 -07:00
Linus Torvalds	978ca164bd	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (38 commits) amd64_edac: Fix decode_syndrome types amd64_edac: Fix DCT argument type amd64_edac: Fix ranges signedness amd64_edac: Drop local variable amd64_edac: Fix PCI config addressing types amd64_edac: Fix DRAM base macros amd64_edac: Fix node id signedness amd64_edac: Drop redundant declarations amd64_edac: Enable driver on F15h amd64_edac: Adjust ECC symbol size to F15h amd64_edac: Simplify scrubrate setting PCI: Rename CPU PCI id define amd64_edac: Improve DRAM address mapping amd64_edac: Sanitize ->read_dram_ctl_register amd64_edac: Adjust sys_addr to chip select conversion routine to F15h amd64_edac: Beef up early exit reporting amd64_edac: Revamp online spare handling amd64_edac: Fix channel interleave removal amd64_edac: Correct node interleaving removal amd64_edac: Add support for interleaved region swapping ... Fix up trivial conflict in include/linux/pci_ids.h due to AMD_15H_NB_MISC being renamed as AMD_15H_NB_F3 next to the new AMD_15H_NB_LINK entry.	2011-03-17 17:21:32 -07:00
Gleb Natapov	776e58ea3d	KVM: unbreak userspace that does not sets tss address Commit 6440e5967bc broke old userspaces that do not set tss address before entering vcpu. Unbreak it by setting tss address to a safe value on the first vcpu entry. New userspaces should set tss address, so print warning in case it doesn't. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:35 -03:00
Xiao Guangrong	0f53b5b1c0	KVM: MMU: cleanup pte write path This patch does: - call vcpu->arch.mmu.update_pte directly - use gfn_to_pfn_atomic in update_pte path The suggestion is from Avi. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:35 -03:00
Xiao Guangrong	5d163b1c9d	KVM: MMU: introduce a common function to get no-dirty-logged slot Cleanup the code of pte_prefetch_gfn_to_memslot and mapping_level_dirty_bitmap Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:34 -03:00
Xiao Guangrong	40dcaa9f69	KVM: fix rcu usage in init_rmode_* functions fix: [ 3494.671786] stack backtrace: [ 3494.671789] Pid: 10527, comm: qemu-system-x86 Not tainted 2.6.38-rc6+ #23 [ 3494.671790] Call Trace: [ 3494.671796] [] ? lockdep_rcu_dereference+0x9d/0xa5 [ 3494.671826] [] ? kvm_memslots+0x6b/0x73 [kvm] [ 3494.671834] [] ? gfn_to_memslot+0x16/0x4f [kvm] [ 3494.671843] [] ? gfn_to_hva+0x16/0x27 [kvm] [ 3494.671851] [] ? kvm_write_guest_page+0x31/0x83 [kvm] [ 3494.671861] [] ? kvm_clear_guest_page+0x1a/0x1c [kvm] [ 3494.671867] [] ? vmx_set_tss_addr+0x83/0x122 [kvm_intel] and: [ 8328.789599] stack backtrace: [ 8328.789601] Pid: 18736, comm: qemu-system-x86 Not tainted 2.6.38-rc6+ #23 [ 8328.789603] Call Trace: [ 8328.789609] [] ? lockdep_rcu_dereference+0x9d/0xa5 [ 8328.789621] [] ? kvm_memslots+0x6b/0x73 [kvm] [ 8328.789628] [] ? gfn_to_memslot+0x16/0x4f [kvm] [ 8328.789635] [] ? gfn_to_hva+0x16/0x27 [kvm] [ 8328.789643] [] ? kvm_write_guest_page+0x31/0x83 [kvm] [ 8328.789699] [] ? kvm_clear_guest_page+0x1a/0x1c [kvm] [ 8328.789713] [] ? vmx_create_vcpu+0x316/0x3c8 [kvm_intel] Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:34 -03:00
Nikola Ciprich	1aa8ceef03	KVM: fix kvmclock regression due to missing clock update commit 387b9f97750444728962b236987fbe8ee8cc4f8c moved kvm_request_guest_time_update(vcpu), breaking 32bit SMP guests using kvm-clock. Fix this by moving (new) clock update function to proper place. Signed-off-by: Nikola Ciprich <nikola.ciprich@linuxbox.cz> Acked-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:33 -03:00
Gleb Natapov	399a40c92d	KVM: emulator: Fix permission checking in io permission bitmap Currently if io port + len crosses 8bit boundary in io permission bitmap the check may allow IO that otherwise should not be allowed. The patch fixes that. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:33 -03:00
Gleb Natapov	5601d05b8c	KVM: emulator: Fix io permission checking for 64bit guest Current implementation truncates upper 32bit of TR base address during IO permission bitmap check. The patch fixes this. Reported-and-tested-by: Francis Moreau <francis.moro@gmail.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:33 -03:00
Avi Kivity	831ca6093c	KVM: SVM: Load %gs earlier if CONFIG_X86_32_LAZY_GS=n With CONFIG_CC_STACKPROTECTOR, we need a valid %gs at all times, so disable lazy reload and do an eager reload immediately after the vmexit. Reported-by: IVAN ANGELOV <ivangotoy@gmail.com> Acked-By: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:33 -03:00
Takuya Yoshikawa	afc20184b7	KVM: x86: Remove useless regs_page pointer from kvm_lapic Access to this page is mostly done through the regs member which holds the address to this page. The exceptions are in vmx_vcpu_reset() and kvm_free_lapic() and these both can easily be converted to using regs. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:33 -03:00
Xiao Guangrong	676646ee4b	KVM: MMU: remove unused macros These macros are not used, so removed Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:32 -03:00
Xiao Guangrong	842f22ed9b	KVM: MMU: cleanup page alloc and free Using __get_free_page instead of alloc_page and page_address, using free_page instead of __free_page and virt_to_page Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:32 -03:00
Xiao Guangrong	49b26e26e4	KVM: MMU: do not record gfn in kvm_mmu_pte_write No need to record the gfn to verifier the pte has the same mode as current vcpu, it's because we only speculatively update the pte only if the pte and vcpu have the same mode Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:32 -03:00
Xiao Guangrong	48c0e4e906	KVM: MMU: move mmu pages calculated out of mmu lock kvm_mmu_calculate_mmu_pages need to walk all memslots and it's protected by kvm->slots_lock, so move it out of mmu spinlock Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:32 -03:00
Xiao Guangrong	1b7fd45c32	KVM: MMU: set spte accessed bit properly Set spte accessed bit only if guest_initiated == 1 that means the really accessed Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:32 -03:00
Xiao Guangrong	da8dc75f0c	KVM: MMU: fix kvm_mmu_slot_remove_write_access dropping intermediate W bits Only remove write access in the last sptes. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:32 -03:00
Lai Jiangshan	1260edbe7d	KVM: better readability of efer_reserved_bits use EFER_SCE, EFER_LME and EFER_LMA instead of magic numbers. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:31 -03:00
Lai Jiangshan	d170c41906	KVM: Clear async page fault hash after switching to real mode The hash array of async gfns may still contain some left gfns after kvm_clear_async_pf_completion_queue() called, need to clear them. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:31 -03:00
Gleb Natapov	93ea5388ea	KVM: VMX: Initialize vm86 TSS only once. Currently vm86 task is initialized on each real mode entry and vcpu reset. Initialization is done by zeroing TSS and updating relevant fields. But since all vcpus are using the same TSS there is a race where one vcpu may use TSS while other vcpu is initializing it, so the vcpu that uses TSS will see wrong TSS content and will behave incorrectly. Fix that by initializing TSS only once. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:31 -03:00
Gleb Natapov	a8ba6c2622	KVM: VMX: update live TR selector if it changes in real mode When rmode.vm86 is active TR descriptor is updated with vm86 task values, but selector is left intact. vmx_set_segment() makes sure that if TR register is written into while vm86 is active the new values are saved for use after vm86 is deactivated, but since selector is not updated on vm86 activation/deactivation new value is lost. Fix this by writing new selector into vmcs immediately. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:31 -03:00
Lai Jiangshan	a3b5ba49a8	KVM: VMX: add the __noclone attribute to vmx_vcpu_run The changelog of `104f226` said "adds the __noclone attribute", but it was missing in its patch. I think it is still needed. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:31 -03:00
Jan Kiszka	038f8c110e	KVM: x86: Convert tsc_write_lock to raw_spinlock Code under this lock requires non-preemptibility. Ensure this also over -rt by converting it to raw spinlock. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:30 -03:00
Gleb Natapov	7049467b53	KVM: remove isr_ack logic from PIC isr_ack logic was added by `e48258009d` to avoid unnecessary IPIs. Back then it made sense, but now the code checks that vcpu is ready to accept interrupt before sending IPI, so this logic is no longer needed. The patch removes it. Fixes a regression with Debian/Hurd. Signed-off-by: Gleb Natapov <gleb@redhat.com> Reported-and-tested-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:30 -03:00
Joseph Cihula	23f3e99132	KVM: VMX: fix detection of BIOS disabling VMX This patch fixes the logic used to detect whether BIOS has disabled VMX, for the case where VMX is enabled only under SMX, but tboot is not active. Signed-off-by: Joseph Cihula <joseph.cihula@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:30 -03:00
Jan Kiszka	e935b8372c	KVM: Convert kvm_lock to raw_spinlock Code under this lock requires non-preemptibility. Ensure this also over -rt by converting it to raw spinlock. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:30 -03:00
Avi Kivity	bd3d1ec3d2	KVM: SVM: check for progress after IRET interception When we enable an NMI window, we ask for an IRET intercept, since the IRET re-enables NMIs. However, the IRET intercept happens before the instruction executes, while the NMI window architecturally opens afterwards. To compensate for this mismatch, we only open the NMI window in the following exit, assuming that the IRET has by then executed; however, this assumption is not always correct; we may exit due to a host interrupt or page fault, without having executed the instruction. Fix by checking for forward progress by recording and comparing the IRET's rip. This is somewhat of a hack, since an unchaging rip does not mean that no forward progress has been made, but is the simplest fix for now. Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:30 -03:00
Avi Kivity	f86368493e	KVM: Fix race between nmi injection and enabling nmi window The interrupt injection logic looks something like if an nmi is pending, and nmi injection allowed inject nmi if an nmi is pending request exit on nmi window the problem is that "nmi is pending" can be set asynchronously by the PIT; if it happens to fire between the two if statements, we will request an nmi window even though nmi injection is allowed. On SVM, this has disasterous results, since it causes eflags.TF to be set in random guest code. The fix is simple; make nmi_pending synchronous using the standard vcpu->requests mechanism; this ensures the code above is completely synchronous wrt nmi_pending. Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:30 -03:00
Avi Kivity	4005996e42	KVM: Drop ad-hoc vendor specific instruction restriction Use the new support in the emulator, and drop the ad-hoc code in x86.c. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:28 -03:00
Avi Kivity	d867162c6d	KVM: x86 emulator: vendor specific instructions Mark some instructions as vendor specific, and allow the caller to request emulation only of vendor specific instructions. This is useful in some circumstances (responding to a #UD fault). Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:28 -03:00
Avi Kivity	3e90943907	KVM: Drop bogus x86_decode_insn() error check x86_decode_insn() doesn't return X86EMUL_* values, so the check for X86EMUL_PROPOGATE_FAULT will always fail. There is a proper check later on, so there is no need for a replacement for this code. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:28 -03:00
Jan Kiszka	0bb8865979	KVM: x86: Drop obsolete warning about INIT on runnable VCPU This warning was once used for debugging QEMU user space. Though uncommon, it is actually possible to send an INIT request to a running VCPU. So better drop this warning before someone misuses it to flood kernel logs this way. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:28 -03:00
Glauber Costa	12f9a48f7b	KVM: x86: release kvmclock page on reset When a vcpu is reset, kvmclock page keeps being written to this days. This is wrong and inconsistent: a cpu reset should take it to its initial state. Signed-off-by: Glauber Costa <glommer@redhat.com> CC: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:28 -03:00
john cooper	91c9c3eda4	KVM: x86: handle guest access to BBL_CR_CTL3 MSR A correction to Intel cpu model CPUID data (patch queued) caused winxp to BSOD when booted with a Penryn model. This was traced to the CPUID "model" field correction from 6 -> 23 (as is proper for a Penryn class of cpu). Only in this case does the problem surface. The cause for this failure is winxp accessing the BBL_CR_CTL3 MSR which is unsupported by current kvm, appears to be a legacy MSR not fully characterized yet existing in current silicon, and is apparently carried forward in MSR space to accommodate vintage code as here. It is not yet conclusive whether this MSR implements any of its legacy functionality or is just an ornamental dud for compatibility. While I found no silicon version specific documentation link to this MSR, a general description exists in Intel's developer's reference which agrees with the functional behavior of other bootloader/kernel code I've examined accessing BBL_CR_CTL3. Regrettably winxp appears to be setting bit #19 called out as "reserved" in the above document. So to minimally accommodate this MSR, kvm msr get will provide the equivalent mock data and kvm msr write will simply toss the guest passed data without interpretation. While this treatment of BBL_CR_CTL3 addresses the immediate problem, the approach may be modified pending clarification from Intel. Signed-off-by: john cooper <john.cooper@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:27 -03:00
Xiao Guangrong	6b7e2d0991	KVM: Add "exiting guest mode" state Currently we keep track of only two states: guest mode and host mode. This patch adds an "exiting guest mode" state that tells us that an IPI will happen soon, so unless we need to wait for the IPI, we can avoid it completely. Also 1: No need atomically to read/write ->mode in vcpu's thread 2: reorganize struct kvm_vcpu to make ->mode and ->requests in the same cache line explicitly Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:26 -03:00
Jan Kiszka	9ca5231831	KVM: x86: Remove user space triggerable MCE error message This case is a pure user space error we do not need to record. Moreover, it can be misused to flood the kernel log. Remove it. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:26 -03:00
Xiao Guangrong	63f42e023e	KVM: fix rcu usage warning in kvm_arch_vcpu_ioctl_set_sregs() Fix: [ 1001.499596] =================================================== [ 1001.499599] [ INFO: suspicious rcu_dereference_check() usage. ] [ 1001.499601] --------------------------------------------------- [ 1001.499604] include/linux/kvm_host.h:301 invoked rcu_dereference_check() without protection! ...... [ 1001.499636] Pid: 6035, comm: qemu-system-x86 Not tainted 2.6.37-rc6+ #62 [ 1001.499638] Call Trace: [ 1001.499644] [] lockdep_rcu_dereference+0x9d/0xa5 [ 1001.499653] [] gfn_to_memslot+0x8d/0xc8 [kvm] [ 1001.499661] [] gfn_to_hva+0x16/0x3f [kvm] [ 1001.499669] [] kvm_read_guest_page+0x1e/0x5e [kvm] [ 1001.499681] [] kvm_read_guest_page_mmu+0x53/0x5e [kvm] [ 1001.499699] [] load_pdptrs+0x3f/0x9c [kvm] [ 1001.499705] [] ? vmx_set_cr0+0x507/0x517 [kvm_intel] [ 1001.499717] [] kvm_arch_vcpu_ioctl_set_sregs+0x1f3/0x3c0 [kvm] [ 1001.499727] [] kvm_vcpu_ioctl+0x6a5/0xbc5 [kvm] Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:26 -03:00
Avi Kivity	40712faeb8	KVM: VMX: Avoid atomic operation in vmx_vcpu_run Instead of exchanging the guest and host rcx, have separate storage for each. This allows us to avoid using the xchg instruction, which is is a little slower than normal operations. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:26 -03:00
Avi Kivity	1c696d0e1b	KVM: VMX: Simplify saving guest rcx in vmx_vcpu_run Change push top-of-stack pop guest-rcx pop dummy to pop guest-rcx which is the same thing, only simpler. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:25 -03:00
Rik van Riel	00c25bce02	KVM: VMX: increase ple_gap default to 128 On some CPUs, a ple_gap of 41 is simply insufficient to ever trigger PLE exits, even with the minimalistic PLE test from kvm-unit-tests. http://git.kernel.org/?p=virt/kvm/kvm-unit-tests.git;a=commitdiff;h=eda71b28fa122203e316483b35f37aaacd42f545 For example, the Xeon X5670 CPU needs a ple_gap of at least 48 in order to get pause loop exits: # modprobe kvm_intel ple_gap=47 # taskset 1 /usr/local/bin/qemu-system-x86_64 \ -device testdev,chardev=log -chardev stdio,id=log \ -kernel x86/vmexit.flat -append ple-round-robin -smp 2 VNC server running on `::1:5900' enabling apic enabling apic ple-round-robin 58298446 # rmmod kvm_intel # modprobe kvm_intel ple_gap=48 # taskset 1 /usr/local/bin/qemu-system-x86_64 \ -device testdev,chardev=log -chardev stdio,id=log \ -kernel x86/vmexit.flat -append ple-round-robin -smp 2 VNC server running on `::1:5900' enabling apic enabling apic ple-round-robin 36616 Increase the ple_gap to 128 to be on the safe side. Signed-off-by: Rik van Riel <riel@redhat.com> Acked-by: Zhai, Edwin <edwin.zhai@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:25 -03:00
Joerg Roedel	3781c01c15	KVM: SVM: Add support for perf-kvm This patch adds the necessary code to run perf-kvm on AMD machines. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:25 -03:00
Avi Kivity	a917949935	KVM: VMX: Avoid leaking fake realmode state to userspace When emulating real mode, we fake some state: - tr.base points to a fake vm86 tss - segment registers are made to conform to vm86 restrictions change vmx_get_segment() not to expose this fake state to userspace; instead, return the original state. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:25 -03:00
Avi Kivity	d0ba64f9b4	KVM: VMX: Save and restore tr selector across mode switches When emulating real mode we play with tr hidden state, but leave tr.selector alone. That works well, except for save/restore, since loading TR writes it to the hidden state in vmx->rmode. Fix by also saving and restoring the tr selector; this makes things more consistent and allows migration to work during the early boot stages of Windows XP. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:25 -03:00

1 2 3 4 5 ...

12905 Commits