linux_old1

Commit Graph

Author	SHA1	Message	Date
Ingo Molnar	006b20fe4c	Merge branch 'perf/urgent' into perf/core Merge reason: We want to apply a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-12-16 11:22:27 +01:00
Rusty Russell	da32dac101	lguest: populate initial_page_table Two x86 patches broke lguest: 1) v2.6.35-492-g72d7c3b, which changed x86 to use the memblock allocator. In lguest, the host places linear page tables at the top of mem, which used to be enough to get us up to the swapper_pg_dir page tables. With the first patch, the direct mapping tables used that memory: Before: kernel direct mapping tables up to 4000000 @ 7000-1a000 After: kernel direct mapping tables up to 4000000 @ 3fed000-4000000 I initially fixed this by lying about the amount of memory we had, so the kernel wouldn't blatt the lguest boot pagetables (yuk!), but then... 2) v2.6.36-rc8-54-gb40827f, which made x86 boot use initial_page_table. This was initialized in a part of head_32.S which isn't executed by lguest; it is then copied into swapper_pg_dir. So we have to initialize it; and anyway we switch to it before we blatt the old tables, so that fixes the previous damage as well. For the moment, I cut & pasted the code into lguest's boot code, but next merge window I will merge them. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> To: x86@kernel.org	2010-12-16 17:03:15 +10:30
Rusty Russell	bb4093deb2	lguest: restore boot speed lguest is dumb and drops all the pagetables for set_pte (which is only used for kernel mapping manipulation, so it's OK without highmem). But it's used a lot in boot, too. As a guest optimization, we suppressed this flushing until the first page switch. Now we have initial_page_table, that happens much earlier, so extend the heuristic to wait until we switch to something other than the swapper_pg_dir or initial_page_table. As measured on my laptop under kvm, this dropped the time-to-mount-root from 48 seconds to 4.3 seconds. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-12-16 17:03:15 +10:30
Rusty Russell	bb6f1d9a99	lguest: fix crash lguest_time_init `fe25c7fc2e` "x86: lguest: Convert to new irq chip functions" converted enable_lguest_irq() to take a struct irq_data *, but didn't fix the one internal caller. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> To: x86@kernel.org	2010-12-16 17:03:14 +10:30
Randy Dunlap	52f6c5ad43	crypto: ghash-intel - ghash-clmulni-intel_glue needs err.h Add missing header file: arch/x86/crypto/ghash-clmulni-intel_glue.c:256: error: implicit declaration of function 'IS_ERR' arch/x86/crypto/ghash-clmulni-intel_glue.c:257: error: implicit declaration of function 'PTR_ERR' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-12-15 19:44:08 +08:00
Kenji Kaneshige	7f7fbf45c6	x86: Enable the intr-remap fault handling after local APIC setup Interrupt-remapping gets enabled very early in the boot, as it determines the apic mode that the processor can use. And the current code enables the vt-d fault handling before the setup_local_APIC(). And hence the APIC LDR registers and data structure in the memory may not be initialized. So the vt-d fault handling in logical xapic/x2apic modes were broken. Fix this by enabling the vt-d fault handling in the end_local_APIC_setup() A cleaner fix of enabling fault handling while enabling intr-remapping will be addressed for v2.6.38. [ Enabling intr-remapping determines the usage of x2apic mode and the apic mode determines the fault-handling configuration. ] Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> LKML-Reference: <20101201062244.541996375@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: stable@kernel.org [v2.6.32+] Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-12-13 16:53:32 -08:00
Kenji Kaneshige	086e8ced65	x86, vt-d: Fix the vt-d fault handling irq migration in the x2apic mode In x2apic mode, we need to set the upper address register of the fault handling interrupt register of the vt-d hardware. Without this irq migration of the vt-d fault handling interrupt is broken. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> LKML-Reference: <1291225233.2648.39.camel@sbsiddha-MOBL3> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: stable@kernel.org [v2.6.32+] Acked-by: Chris Wright <chrisw@sous-sol.org> Tested-by: Takao Indoh <indou.takao@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-12-13 16:52:52 -08:00
Suresh Siddha	3fb82d56ad	x86, suspend: Avoid unnecessary smp alternatives switch during suspend/resume During suspend, we disable all the non boot cpus. And during resume we bring them all back again. So no need to do alternatives_smp_switch() in between. On my core 2 based laptop, this speeds up the suspend path by 15msec and the resume path by 5 msec (suspend/resume speed up differences can be attributed to the different P-states that the cpu is in during suspend/resume). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1290557500.4946.8.camel@sbsiddha-MOBL3.sc.intel.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-12-13 16:23:56 -08:00
Suresh Siddha	10340ae130	x86, xsave: Use alloc_bootmem_align() instead of alloc_bootmem() Alignment of alloc_bootmem() depends on the value of L1_CACHE_SHIFT. What we need here, however, is 64 byte alignment. Use alloc_bootmem_align() and explicitly specify the alignment instead. This fixes a kernel boot crash reported by Jody when the cpu in .config is set to MPENTIUMII but the kernel is booted on a xsave-capable CPU. Reported-by: Jody Bruchon <jody@nctritech.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20101116212442.059967454@sbsiddha-MOBL3.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@kernel.org>	2010-12-13 16:13:11 -08:00
H. Peter Anvin	de2a8cf98e	x86, gcc-4.6: Use gcc -m options when building vdso The vdso Makefile passes linker-style -m options not to the linker but to gcc. This happens to work with earlier gcc, but fails with gcc 4.6. Pass gcc-style -m options, instead. Note: all currently supported versions of gcc supports -m32, so there is no reason to conditionalize it any more. Reported-by: H. J. Lu <hjl.tools@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <tip-*@git.kernel.org> Cc: <stable@kernel.org>	2010-12-13 16:08:37 -08:00
Don Zickus	5f29805a4f	x86, watchdog: Compile fix when CONFIG_LOCAL_APIC not enabled When adjusting the code to handle removing the old nmi watchdog, I forgot to consider the compile case when the local apic is not enabled. This change fixes the following build error: arch/x86/kernel/apic/hw_nmi.c:28:6: error: redefinition of ‘touch_nmi_watchdog’ Signed-off-by: Don Zickus <dzickus@redhat.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Rakib Mullick <rakib.mullick@gmail.com> LKML-Reference: <20101213153719.GD18577@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-12-13 18:23:23 +01:00
Thomas Gleixner	f1c18071ad	x86: HPET: Chose a paranoid safe value for the ETIME check commit `995bd3bb5` (x86: Hpet: Avoid the comparator readback penalty) chose 8 HPET cycles as a safe value for the ETIME check, as we had the confirmation that the posted write to the comparator register is delayed by two HPET clock cycles on Intel chipsets which showed readback problems. After that patch hit mainline we got reports from machines with newer AMD chipsets which seem to have an even longer delay. See http://thread.gmane.org/gmane.linux.kernel/1054283 and http://thread.gmane.org/gmane.linux.kernel/1069458 for further information. Boris tried to come up with an ACPI based selection of the minimum HPET cycles, but this failed on a couple of test machines. And of course we did not get any useful information from the hardware folks. For now our only option is to chose a paranoid high and safe value for the minimum HPET cycles used by the ETIME check. Adjust the minimum ns value for the HPET clockevent accordingly. Reported-Bistected-and-Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <alpine.LFD.2.00.1012131222420.2653@localhost6.localdomain6> Cc: Simon Kirby <sim@hostway.ca> Cc: Borislav Petkov <bp@alien8.de> Cc: Andreas Herrmann <Andreas.Herrmann3@amd.com> Cc: John Stultz <johnstul@us.ibm.com>	2010-12-13 13:42:44 +01:00
Thomas Gleixner	a8760eca6c	x86: Check tsc available/disabled in the delayed init function The delayed TSC init function does not check whether the system has no TSC or TSC is disabled at the kernel command line, which results in a crash in the work queue based extended calibration due to division by zero because the basic calibration never happened. Add the missing checks and do not touch TSC when not available or disabled. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <johnstul@us.ibm.com>	2010-12-13 11:35:05 +01:00
Tejun Heo	0aa002fe60	x86: apic: Cleanup and simplify setup_local_APIC() setup_local_APIC() is used to setup local APIC early during CPU initialization and already assumes that preemption is disabled on entry. However, The function unnecessarily disables and enables preemption and uses smp_processor_id() multiple times in and out of the nested preemption disabled section. This gives the wrong impression that the function might be able to handle being called with preemption enabled and/or migrated to another processor in the middle. Make it clear that the function is always called with preemption disabled, drop the confusing preemption disable block and call smp_processor_id() once at the beginning of the function. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Cyrill Gorcunov <gorcunov@gmail.com> Reviewed-by: Pekka Enberg <penberg@kernel.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: brgerst@gmail.com LKML-Reference: <4D00B3B9.7060702@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-10 13:46:26 +01:00
Don Zickus	5dc3055879	x86, NMI: Add back unknown_nmi_panic and nmi_watchdog sysctls Originally adapted from Huang Ying's patch which moved the unknown_nmi_panic to the traps.c file. Because the old nmi watchdog was deleted before this change happened, the unknown_nmi_panic sysctl was lost. This re-adds it. Also, the nmi_watchdog sysctl was re-implemented and its documentation updated accordingly. Patch-inspired-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: fweisbec@gmail.com LKML-Reference: <1291068437-5331-3-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-12-10 00:01:06 +01:00
Don Zickus	96a84c20d6	lockup detector: Compile fixes from removing the old x86 nmi watchdog My patch that removed the old x86 nmi watchdog broke other arches. This change reverts a piece of that patch and puts the change in the correct spot. Signed-off-by: Don Zickus <dzickus@redhat.com> Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com> Cc: fweisbec@gmail.com Cc: yinghai@kernel.org LKML-Reference: <1291068437-5331-2-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-12-10 00:01:06 +01:00
Feng Tang	0e3fa13f4e	x86: Further simplify mp_irq info handling assign_to_mp_irq() is copying the struct mpc_intsrc members one by one. That's silly. Use memcpy() and let the compiler figure it out. Same for the identical function assign_to_mpc_intsrc() mp_irq_mpc_intsrc_cmp() is comparing the struct members one by one, but no caller ever checks the different return codes. Use memcmp() instead. Remove the extra printk in MP_ioapic_info() Signed-off-by: Feng Tang <feng.tang@linux.intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: "Alan Cox <alan@linux.intel.com> Cc: Len Brown <len.brown@intel.com> LKML-Reference: <20101208151857.212f0018@feng-i7> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-09 21:52:06 +01:00
Feng Tang	2d8009ba67	x86: Unify 3 similar ways of saving mp_irqs info There are 3 places defining similar functions of saving IRQ vector info into mp_irqs[] array: mmparse/acpi/mrst. Replace the redundant code by a common function in io_apic.c as it's only called when CONFIG_X86_IO_APIC=y Signed-off-by: Feng Tang <feng.tang@intel.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Len Brown <len.brown@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <20101207133204.4d913c5a@feng-i7> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-09 21:52:06 +01:00
Yinghai Lu	60d79fd99f	x86, ioapic: Avoid writing io_apic id if already correct For 32bit mptable path, setup_ids_from_mpc() always writes the io_apic id register, even there is no change needed. Skip the write, when readout and mptable match. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> LKML-Reference: <4CFDF785.7010401@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-09 21:52:05 +01:00
Yinghai Lu	0450193bff	x86, x2apic: Don't map lapic addr for preenabled x2apic systems If x2apic is preenabled and used by the kernel, we don't need to map the lapic address. That mapping will never be used. So just skip that in register_lapic_address() Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> LKML-Reference: <4CFDF69C.9070501@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-09 21:52:05 +01:00
Yinghai Lu	53301f36f3	x86, sfi: Use register_lapic_address() register_lapic_address() and mp_sfi_register_lapic_address() are almost identical. Use the common function. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Len Brown <lenb@kernel.org> LKML-Reference: <4CFDF693.6000908@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-09 21:52:05 +01:00
Yinghai Lu	326a2e6bae	x86, apic: Use register_lapic_address() in init_apic_mapping() Remove the printk as well, we don't want to print when nothing changed. We print in register_lapic_address() already. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> LKML-Reference: <4CFDF68A.7020902@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-09 21:52:05 +01:00
Yinghai Lu	f115714163	x86, apic: Remove early_init_lapic_mapping() It is almost the same as smp_register_lapic_addr(). We just need to let smp_read_mpc() call smp_register_lapic_addr() when early==1. Add the apic_printk to smp_register_lapic_address() Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> LKML-Reference: <4CFDF681.3030509@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-09 21:52:04 +01:00
Yinghai Lu	c0104d38a7	x86, apic: Unify identical register_lapic_address() functions They are the same, move the common function to apic.c to allow further cleanups. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Len Brown <lenb@kernel.org> LKML-Reference: <4CFDF675.4060305@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-09 21:52:04 +01:00
Thomas Gleixner	51ddafcbc7	Merge branch 'x86/platform' into x86/apic-cleanups Reason: apic cleanup series depends on x86/apic, x86/amd-nb and x86/platform Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-09 18:19:21 +01:00
Thomas Gleixner	d834a9dcec	Merge branch 'x86/amd-nb' into x86/apic-cleanups Reason: apic cleanup series depends on x86/apic, x86/amd-nb x86/platform Conflicts: arch/x86/include/asm/io_apic.h Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-09 18:17:25 +01:00
Thomas Gleixner	4720dd1b38	x86: io_apic: Avoid unused variable warning when CONFIG_GENERIC_PENDING_IRQ=n arch/x86/kernel/apic/io_apic.c: In function 'ack_apic_level': arch/x86/kernel/apic/io_apic.c:2433: warning: unused variable 'desc' Signed-off-by: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <201010272107.o9RL7rse018212@imap1.linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-09 17:43:21 +01:00
Rakib Mullick	2c6cb1053a	x86: Address 'unused' warning in hw_nmi.c again arch/x86/kernel/apic/hw_nmi.c:29: warning: backtrace_mask defined but not used commit 0e2af2a9(x86, hw_nmi: Move backtrace_mask declaration under ARCH_HAS_NMI_WATCHDOG) addressed this warning, but it was reintroduced by commit 5f2b0ba4(x86, nmi_watchdog: Remove the old nmi_watchdog). Move backtrace_mask into the #ifdef arch_trigger_all_cpu_backtrace section again. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <AANLkTi=rcc38QzoKa6LFy4m++-p_9=Zt4_kDQE=GeKxf@mail.gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-09 16:06:52 +01:00
Peter Zijlstra	c079c791c5	perf, amd: Remove the nb lock Since all the hotplug stuff is serialized by the hotplug mutex, do away with the amd_nb_lock. Cc: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-12-08 20:16:30 +01:00
Andre Przywara	73c1160ce3	KVM: enlarge number of possible CPUID leaves Currently the number of CPUID leaves KVM handles is limited to 40. My desktop machine (AthlonII) already has 35 and future CPUs will expand this well beyond the limit. Extend the limit to 80 to make room for future processors. KVM-Stable-Tag. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-12-08 17:28:38 +02:00
Joerg Roedel	24d1b15f72	KVM: SVM: Do not report xsave in supported cpuid To support xsave properly for the guest the SVM module need software support for it. As long as this is not present do not report the xsave as supported feature in cpuid. As a side-effect this patch moves the bit() helper function into the x86.h file so that it can be used in svm.c too. KVM-Stable-Tag. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-12-08 17:28:37 +02:00
Sheng Yang	3ea3aa8cf6	KVM: Fix OSXSAVE after migration CPUID's OSXSAVE is a mirror of CR4.OSXSAVE bit. We need to update the CPUID after migration. KVM-Stable-Tag. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-12-08 17:28:31 +02:00
Linus Torvalds	6313e3c217	Merge branches 'x86-fixes-for-linus', 'perf-fixes-for-linus' and 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86/pvclock: Zero last_value on resume * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf record: Fix eternal wait for stillborn child perf header: Don't assume there's no attr info if no sample ids is provided perf symbols: Figure out start address of kernel map from kallsyms perf symbols: Fix kallsyms kernel/module map splitting * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: nohz: Fix printk_needs_cpu() return value on offline cpus printk: Fix wake_up_klogd() vs cpu hotplug	2010-12-08 06:40:59 -08:00
Ingo Molnar	10a18d7dc0	Merge commit 'v2.6.37-rc5' into perf/core Merge reason: Pick up the latest -rc. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-12-07 07:49:51 +01:00
Feng Tang	991cfffa7c	x86, earlyprintk: Move mrst early console to platform/ and fix a typo Move the code to arch/x86/platform/mrst/. Also fix a typo to use the correct config option: ONFIG_EARLY_PRINTK_MRST Signed-off-by: Feng Tang <feng.tang@intel.com> Cc: alan@linux.intel.com LKML-Reference: <1291348298-21263-1-git-send-email-feng.tang@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-06 20:52:04 +01:00
Masami Hiramatsu	f984ba4eb5	kprobes: Use text_poke_smp_batch for unoptimizing Use text_poke_smp_batch() on unoptimization path for reducing the number of stop_machine() issues. If the number of unoptimizing probes is more than MAX_OPTIMIZE_PROBES(=256), kprobes unoptimizes first MAX_OPTIMIZE_PROBES probes and kicks optimizer for remaining probes. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: 2nddept-manager@sdl.hitachi.co.jp Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20101203095434.2961.22657.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-12-06 17:59:32 +01:00
Masami Hiramatsu	cd7ebe2298	kprobes: Use text_poke_smp_batch for optimizing Use text_poke_smp_batch() in optimization path for reducing the number of stop_machine() issues. If the number of optimizing probes is more than MAX_OPTIMIZE_PROBES(=256), kprobes optimizes first MAX_OPTIMIZE_PROBES probes and kicks optimizer for remaining probes. Changes in v5: - Use kick_kprobe_optimizer() instead of directly calling schedule_delayed_work(). - Rescheduling optimizer outside of kprobe mutex lock. Changes in v2: - Allocate code buffer and parameters in arch_init_kprobes() instead of using static arraies. - Merge previous max optimization limit patch into this patch. So, this patch introduces upper limit of optimization at once. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: 2nddept-manager@sdl.hitachi.co.jp Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20101203095428.2961.8994.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-12-06 17:59:31 +01:00
Masami Hiramatsu	7deb18dcf0	x86: Introduce text_poke_smp_batch() for batch-code modifying Introduce text_poke_smp_batch(). This function modifies several text areas with one stop_machine() on SMP. Because calling stop_machine() is heavy task, it is better to aggregate text_poke requests. ( Note: I've talked with Rusty about this interface, and he would not like to expand stop_machine() interface, since it is not for generic use. ) Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Jan Beulich <jbeulich@novell.com> Cc: 2nddept-manager@sdl.hitachi.co.jp LKML-Reference: <20101203095422.2961.51217.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-12-06 17:59:31 +01:00
Masami Hiramatsu	6274de4984	kprobes: Support delayed unoptimizing Unoptimization occurs when a probe is unregistered or disabled, and is heavy because it recovers instructions by using stop_machine(). This patch delays unoptimization operations and unoptimize several probes at once by using text_poke_smp_batch(). This can avoid unexpected system slowdown coming from stop_machine(). Changes in v5: - Split this patch into several cleanup patches and this patch. - Fix some text_mutex lock miss. - Use bool instead of int for behavior flags. - Add additional comment for (un)optimizing path. Changes in v2: - Use dynamic allocated buffers and params. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: 2nddept-manager@sdl.hitachi.co.jp LKML-Reference: <20101203095409.2961.82733.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-12-06 17:59:30 +01:00
Feng Tang	e4d2ebcab1	x86, apbt: Setup affinity for apb timers acting as per-cpu timer Commit `a5ef2e70` "x86: Sanitize apb timer interrupt handling" forgot the affinity setup when cleaning up the code, this patch just adds the forgotten part Signed-off-by: Feng Tang <feng.tang@intel.com> Cc: Jacob Pan <jacob.jun.pan@intel.com> Cc: Alan Cox <alan@linux.intel.com> LKML-Reference: <1291348298-21263-2-git-send-email-feng.tang@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-06 15:58:26 +01:00
Dirk Brandewie	5ec6960f6f	ce4100: Add errata fixes for UART on CE4100 This patch enables the UART on the CE4100. The UART has a couple of issues that need to be worked around. First the UART is mostly PC compatible except that it is clocked eight times faster than a standard PC so the default configuration provided in arch/x86/include/asm/serial.h needs to be overridden. Second the TX interrupt may not be set correctly all the time. Lastly accessing the UART via I/O space for early_prink() hangs the chip when the IOAPIC is enabled. A custom mem_serial_in() is provided to work around the TX interrupt issue. The configuration issues are dealt with in the call back registered with the 8250 driver via serial8250_set_isa_configurator() Signed-off-by: Dirk Brandewie <dirk.brandewie@gmail.com> LKML-Reference: <1290436128-17958-1-git-send-email-dirk.brandewie@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-06 15:58:26 +01:00
Sebastian Andrzej Siewior	a38c5380ef	x86: io_apic: Split setup_ioapic_ids_from_mpc() Sodaville needs to setup the IO_APIC ids as the boot loader leaves them uninitialized. Split out the setter function so it can be called unconditionally from the sodaville board code. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <20101126165020.GA26361@www.tglx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-12-06 14:30:28 +01:00
Linus Torvalds	11e8896474	Merge branch '2.6.37-rc4-pvhvm-fixes' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm * '2.6.37-rc4-pvhvm-fixes' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm: xen: unplug the emulated devices at resume time xen: fix save/restore for PV on HVM guests with pirq remapping xen: resume the pv console for hvm guests too xen: fix MSI setup and teardown for PV on HVM guests xen: use PHYSDEVOP_get_free_pirq to implement find_unbound_pirq	2010-12-03 11:30:57 -08:00
Linus Torvalds	8338fded13	Merge branches 'upstream/core' and 'upstream/bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'upstream/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: xen: allocate irq descs on any NUMA node xen: prevent crashes with non-HIGHMEM 32-bit kernels with largeish memory xen: use default_idle xen: clean up "extra" memory handling some more * 'upstream/bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: xen: x86/32: perform initial startup on initial_page_table xen: don't bother to stop other cpus on shutdown/reboot	2010-12-03 10:08:52 -08:00
John Stultz	08ec0c58fb	x86: Improve TSC calibration using a delayed workqueue Boot to boot the TSC calibration may vary by quite a large amount. While normal variance of 50-100ppm can easily be seen, the quick calibration code only requires 500ppm accuracy, which is the limit of what NTP can correct for. This can cause problems for systems being used as NTP servers, as every time they reboot it can take hours for them to calculate the new drift error caused by the calibration. The classic trade-off here is calibration accuracy vs slow boot times, as during the calibration nothing else can run. This patch uses a delayed workqueue to calibrate the TSC over the period of a second. This allows very accurate calibration (in my tests only varying by 1khz or 0.4ppm boot to boot). Additionally this refined calibration step does not block the boot process, and only delays the TSC clocksoure registration by a few seconds in early boot. If the refined calibration strays 1% from the early boot calibration value, the system will fall back to already calculated early boot calibration. Credit to Andi Kleen who suggested using a timer quite awhile back, but I dismissed it thinking the timer calibration would be done after the clocksource was registered (which would break things). Forgive me for my short-sightedness. This patch has worked very well in my testing, but TSC hardware is quite varied so it would probably be good to get some extended testing, possibly pushing inclusion out to 2.6.39. Signed-off-by: John Stultz <johnstul@us.ibm.com> LKML-Reference: <1289003985-29060-1-git-send-email-johnstul@us.ibm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@elte.hu> CC: Martin Schwidefsky <schwidefsky@de.ibm.com> CC: Clark Williams <williams@redhat.com> CC: Andi Kleen <andi@firstfloor.org>	2010-12-02 16:48:37 -08:00
John Stultz	b0f969009f	Merge remote branch 'tip/x86/tsc' into fortglx/2.6.38/tip/x86/tsc Conflicts: Documentation/kernel-parameters.txt	2010-12-02 16:47:52 -08:00
Jeremy Fitzhardinge	64141da587	vmalloc: eagerly clear ptes on vunmap On stock 2.6.37-rc4, running: # mount lilith:/export /mnt/lilith # find /mnt/lilith/ -type f -print0 \| xargs -0 file crashes the machine fairly quickly under Xen. Often it results in oops messages, but the couple of times I tried just now, it just hung quietly and made Xen print some rude messages: (XEN) mm.c:2389:d80 Bad type (saw 7400000000000001 != exp 3000000000000000) for mfn 1d7058 (pfn 18fa7) (XEN) mm.c:964:d80 Attempt to create linear p.t. with write perms (XEN) mm.c:2389:d80 Bad type (saw 7400000000000010 != exp 1000000000000000) for mfn 1d2e04 (pfn 1d1fb) (XEN) mm.c:2965:d80 Error while pinning mfn 1d2e04 Which means the domain tried to map a pagetable page RW, which would allow it to map arbitrary memory, so Xen stopped it. This is because vm_unmap_ram() left some pages mapped in the vmalloc area after NFS had finished with them, and those pages got recycled as pagetable pages while still having these RW aliases. Removing those mappings immediately removes the Xen-visible aliases, and so it has no problem with those pages being reused as pagetable pages. Deferring the TLB flush doesn't upset Xen because it can flush the TLB itself as needed to maintain its invariants. When unmapping a region in the vmalloc space, clear the ptes immediately. There's no point in deferring this because there's no amortization benefit. The TLBs are left dirty, and they are flushed lazily to amortize the cost of the IPIs. This specific motivation for this patch is an oops-causing regression since 2.6.36 when using NFS under Xen, triggered by the NFS client's use of vm_map_ram() introduced in `56e4ebf877` ("NFS: readdir with vmapped pages") . XFS also uses vm_map_ram() and could cause similar problems. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Bryan Schumaker <bjschuma@netapp.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Alex Elder <aelder@sgi.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-12-02 14:51:15 -08:00
Stefano Stabellini	512b109ec9	xen: unplug the emulated devices at resume time Early after being resumed we need to unplug again the emulated devices. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2010-12-02 14:40:53 +00:00
Stefano Stabellini	af42b8d12f	xen: fix MSI setup and teardown for PV on HVM guests When remapping MSIs into pirqs for PV on HVM guests, qemu is responsible for doing the actual mapping and unmapping. We only give qemu the desired pirq number when we ask to do the mapping the first time, after that we should be reading back the pirq number from qemu every time we want to re-enable the MSI. This fixes a bug in xen_hvm_setup_msi_irqs that manifests itself when trying to enable the same MSI for the second time: the old MSI to pirq mapping is still valid at this point but xen_hvm_setup_msi_irqs would try to assign a new pirq anyway. A simple way to reproduce this bug is to assign an MSI capable network card to a PV on HVM guest, if the user brings down the corresponding ethernet interface and up again, Linux would fail to enable MSIs on the device. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2010-12-02 14:34:25 +00:00
Ian Campbell	805e3f4950	xen: x86/32: perform initial startup on initial_page_table Only make swapper_pg_dir readonly and pinned when generic x86 architecture code (which also starts on initial_page_table) switches to it. This helps ensure that the generic setup paths work on Xen unmodified. In particular clone_pgd_range writes directly to the destination pgd entries and is used to initialise swapper_pg_dir so we need to ensure that it remains writeable until the last possible moment during bring up. This is complicated slightly by the need to avoid sharing kernel PMD entries when running under Xen, therefore the Xen implementation must make a copy of the kernel PMD (which is otherwise referred to by both intial_page_table and swapper_pg_dir) before switching to swapper_pg_dir. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-11-29 17:07:53 -08:00
Jeremy Fitzhardinge	31e323cca9	xen: don't bother to stop other cpus on shutdown/reboot Xen will shoot all the VCPUs when we do a shutdown hypercall, so there's no need to do it manually. In any case it will fail because all the IPI irqs have been pulled down by this point, so the cross-CPU calls will simply hang forever. Until change `76fac077db` the function calls were not synchronously waited for, so this wasn't apparent. However after that change the calls became synchronous leading to a hang on shutdown on multi-VCPU guests. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Cc: Alok Kataria <akataria@vmware.com>	2010-11-29 14:16:53 -08:00
Linus Torvalds	a9e40a2493	Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf: Fix the software context switch counter perf, x86: Fixup Kconfig deps x86, perf, nmi: Disable perf if counters are not accessible perf: Fix inherit vs. context rotation bug	2010-11-28 12:25:02 -08:00
Jeremy Fitzhardinge	e7a3481c02	x86/pvclock: Zero last_value on resume If the guest domain has been suspend/resumed or migrated, then the system clock backing the pvclock clocksource may revert to a smaller value (ie, can be non-monotonic across the migration/save-restore). Make sure we zero last_value in that case so that the domain continues to see clock updates. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-28 09:33:20 +01:00
Linus Torvalds	fbe6c4047f	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: dmar, x86: Use function stubs when CONFIG_INTR_REMAP is disabled x86-64: Fix and clean up AMD Fam10 MMCONF enabling x86: UV: Address interrupt/IO port operation conflict x86: Use online node real index in calulate_tbl_offset() x86, asm: Fix binutils 2.15 build failure	2010-11-27 07:28:47 +09:00
Linus Torvalds	d2f30c73ab	Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf symbols: Remove incorrect open-coded container_of() perf record: Handle restrictive permissions in /proc/{kallsyms,modules} x86/kprobes: Prevent kprobes to probe on save_args() irq_work: Drop cmpxchg() result perf: Fix owner-list vs exit x86, hw_nmi: Move backtrace_mask declaration under ARCH_HAS_NMI_WATCHDOG tracing: Fix recursive user stack trace perf,hw_breakpoint: Initialize hardware api earlier x86: Ignore trap bits on single step exceptions tracing: Force arch_local_irq_* notrace for paravirt tracing: Fix module use of trace_bprintk()	2010-11-27 07:28:17 +09:00
Cyrill Gorcunov	af86da5318	perf, x86: P4 PMU - describe config format Add description of .config in a sake of RAW events. At least this should bring some light to those who will be reading this code. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Reviewed-by: Stephane Eranian <eranian@google.com> Cc: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-26 15:14:57 +01:00
Peter Zijlstra	004417a6d4	perf, arch: Cleanup perf-pmu init vs lockup-detector The perf hardware pmu got initialized at various points in the boot, some before early_initcall() some after (notably arch_initcall). The problem is that the NMI lockup detector is ran from early_initcall() and expects the hardware pmu to be present. Sanitize this by moving all architecture hardware pmu implementations to initialize at early_initcall() and move the lockup detector to an explicit initcall right after that. Cc: paulus <paulus@samba.org> Cc: davem <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1290707759.2145.119.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-26 15:14:56 +01:00
Andi Kleen	5ef428c4b5	x86: Set cpu masks before calling CPU_STARTING notifiers When booting up a CPU set the various topology masks before calling the CPU_STARTING notifier. This way the notifier can actually use the masks. This is needed for a perf change. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1290077254-12165-2-git-send-email-andi@firstfloor.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-26 15:14:56 +01:00
Franck Bui-Huu	6c7e550f13	perf: Introduce is_sampling_event() and use it when appropriate. Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1290525705-6265-1-git-send-email-fbuihuu@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-26 15:14:54 +01:00
Ingo Molnar	6c869e772c	Merge branch 'perf/urgent' into perf/core Conflicts: arch/x86/kernel/apic/hw_nmi.c Merge reason: Resolve conflict, queue up dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-26 15:07:02 +01:00
Ingo Molnar	e4e91ac410	Merge commit 'v2.6.37-rc3' into perf/core Merge reason: Pick up latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-26 15:04:47 +01:00
Peter Zijlstra	cc2067a514	perf, x86: Fixup Kconfig deps This leads to a Kconfig dep inversion, x86 selects PERF_EVENT (due to a hw_breakpoint dep) but doesn't unconditionally provide HAVE_PERF_EVENT. (This can cause build failures on M386/M486 kernel .config's.) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20101117222055.982965150@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-26 15:00:58 +01:00
Don Zickus	33c6d6a7ad	x86, perf, nmi: Disable perf if counters are not accessible In a kvm virt guests, the perf counters are not emulated. Instead they return zero on a rdmsrl. The perf nmi handler uses the fact that crossing a zero means the counter overflowed (for those counters that do not have specific interrupt bits). Therefore on kvm guests, perf will swallow all NMIs thinking the counters overflowed. This causes problems for subsystems like kgdb which needs NMIs to do its magic. This problem was discovered by running kgdb tests. The solution is to write garbage into a perf counter during the initialization and hopefully reading back the same number. On kvm guests, the value will be read back as zero and we disable perf as a result. Reported-by: Jason Wessel <jason.wessel@windriver.com> Patch-inspired-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1290462923-30734-1-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-26 15:00:57 +01:00
Linus Torvalds	8a3fbc9fdb	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: remove duplicated #include xen: x86/32: perform initial startup on initial_page_table	2010-11-25 08:35:53 +09:00
Andrew Morton	91d95fda85	arch/x86/include/asm/fixmap.h: mark __set_fixmap_offset as __always_inline When compiling arch/x86/kernel/early_printk_mrst.c with i386 allmodconfig, gcc-4.1.0 generates an out-of-line copy of __set_fixmap_offset() which contains a reference to __this_fixmap_does_not_exist which the compiler cannot elide. Marking __set_fixmap_offset() as __always_inline prevents this. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Feng Tang <feng.tang@intel.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-25 06:50:49 +09:00
Huang Weiyi	e6d4a76dbf	xen: remove duplicated #include Remove duplicated #include('s) in arch/x86/xen/setup.c Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-11-24 12:07:45 -05:00
Ian Campbell	5b5c1af104	xen: x86/32: perform initial startup on initial_page_table Only make swapper_pg_dir readonly and pinned when generic x86 architecture code (which also starts on initial_page_table) switches to it. This helps ensure that the generic setup paths work on Xen unmodified. In particular clone_pgd_range writes directly to the destination pgd entries and is used to initialise swapper_pg_dir so we need to ensure that it remains writeable until the last possible moment during bring up. This is complicated slightly by the need to avoid sharing kernel PMD entries when running under Xen, therefore the Xen implementation must make a copy of the kernel PMD (which is otherwise referred to by both intial_page_table and swapper_pg_dir) before switching to swapper_pg_dir. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-11-24 12:07:44 -05:00
Linus Torvalds	a4ec046c98	Merge branch 'upstream/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'upstream/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: (23 commits) xen/events: Use PIRQ instead of GSI value when unmapping MSI/MSI-X irqs. xen: set IO permission early (before early_cpu_init()) xen: re-enable boot-time ballooning xen/balloon: make sure we only include remaining extra ram xen/balloon: the balloon_lock is useless xen: add extra pages to balloon xen: make evtchn's name less generic xen/evtchn: the evtchn device is non-seekable Revert "xen/privcmd: create address space to allow writable mmaps" xen/events: use locked set\|clear_bit() for cpu_evtchn_mask xen/evtchn: clear secondary CPUs' cpu_evtchn_mask[] after restore xen/xenfs: update xenfs_mount for new prototype xen: fix header export to userspace xen: implement XENMEM_machphys_mapping xen: set vma flag VM_PFNMAP in the privcmd mmap file_op xen: xenfs: privcmd: check put_user() return code xen/evtchn: add missing static xen/evtchn: Fix name of Xen event-channel device xen/evtchn: don't do unbind_from_irqhandler under spinlock xen/evtchn: remove spurious barrier ...	2010-11-24 08:23:18 +09:00
Jeremy Fitzhardinge	bc15fde77f	xen: use default_idle We just need the idle loop to drop into safe_halt, which default_idle() is perfectly capable of doing. There's no need to duplicate it. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-11-22 17:19:34 -08:00
Jeremy Fitzhardinge	c2d0879112	xen: clean up "extra" memory handling some more Make sure that extra_pages is added for all E820_RAM regions beyond mem_end - completely excluded regions as well as the remains of partially included regions. Also makes sure the extra region is not unnecessarily high, and simplifies the logic to decide which regions should be added. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-11-22 16:34:28 -08:00
Jeremy Fitzhardinge	9b8321531a	Merge branches 'upstream/core', 'upstream/xenfs' and 'upstream/evtchn' into upstream/for-linus * upstream/core: xen/events: Use PIRQ instead of GSI value when unmapping MSI/MSI-X irqs. xen: set IO permission early (before early_cpu_init()) xen: re-enable boot-time ballooning xen/balloon: make sure we only include remaining extra ram xen/balloon: the balloon_lock is useless xen: add extra pages to balloon xen/events: use locked set\|clear_bit() for cpu_evtchn_mask xen/evtchn: clear secondary CPUs' cpu_evtchn_mask[] after restore xen: implement XENMEM_machphys_mapping * upstream/xenfs: Revert "xen/privcmd: create address space to allow writable mmaps" xen/xenfs: update xenfs_mount for new prototype xen: fix header export to userspace xen: set vma flag VM_PFNMAP in the privcmd mmap file_op xen: xenfs: privcmd: check put_user() return code * upstream/evtchn: xen: make evtchn's name less generic xen/evtchn: the evtchn device is non-seekable xen/evtchn: add missing static xen/evtchn: Fix name of Xen event-channel device xen/evtchn: don't do unbind_from_irqhandler under spinlock xen/evtchn: remove spurious barrier xen/evtchn: ports start enabled xen/evtchn: dynamically allocate port_user array xen/evtchn: track enabled state for each port	2010-11-22 12:22:42 -08:00
Konrad Rzeszutek Wilk	ec35a69c46	xen: set IO permission early (before early_cpu_init()) This patch is based off "xen dom0: Set up basic IO permissions for dom0." by Juan Quintela <quintela@redhat.com>. On AMD machines when we boot the kernel as Domain 0 we get this nasty: mapping kernel into physical memory Xen: setup ISA identity maps about to get started... (XEN) traps.c:475:d0 Unhandled general protection fault fault/trap [#13] on VCPU 0 [ec=0000] (XEN) domain_crash_sync called from entry.S (XEN) Domain 0 (vcpu#0) crashed on cpu#0: (XEN) ----[ Xen-4.1-101116 x86_64 debug=y Not tainted ]---- (XEN) CPU: 0 (XEN) RIP: e033:[<ffffffff8130271b>] (XEN) RFLAGS: 0000000000000282 EM: 1 CONTEXT: pv guest (XEN) rax: 000000008000c068 rbx: ffffffff8186c680 rcx: 0000000000000068 (XEN) rdx: 0000000000000cf8 rsi: 000000000000c000 rdi: 0000000000000000 (XEN) rbp: ffffffff81801e98 rsp: ffffffff81801e50 r8: ffffffff81801eac (XEN) r9: ffffffff81801ea8 r10: ffffffff81801eb4 r11: 00000000ffffffff (XEN) r12: ffffffff8186c694 r13: ffffffff81801f90 r14: ffffffffffffffff (XEN) r15: 0000000000000000 cr0: 000000008005003b cr4: 00000000000006f0 (XEN) cr3: 0000000221803000 cr2: 0000000000000000 (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033 (XEN) Guest stack trace from rsp=ffffffff81801e50: RIP points to read_pci_config() function. The issue is that we don't set IO permissions for the Linux kernel early enough. The call sequence used to be: xen_start_kernel() x86_init.oem.arch_setup = xen_setup_arch; setup_arch: - early_cpu_init - early_init_amd - read_pci_config - x86_init.oem.arch_setup [ xen_arch_setup ] - set IO permissions. We need to set the IO permissions earlier on, which this patch does. Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-11-22 12:10:31 -08:00
Lin Ming	691513f70d	x86: Resume trampoline must be executable commit 5bd5a452(x86: Add NX protection for kernel data) marked the trampoline area NX - which unsurprisingly breaks resume and cpu hotplug. Revert the portion of that commit, which touches the trampoline. Originally-from: Lin Ming <ming.m.lin@intel.com> LKML-Reference: <1290410581.2405.24.camel@minggr.sh.intel.com> Cc: Matthieu Castet <castet.matthieu@free.fr> Cc: Siarhei Liakh <sliakh.lkml@gmail.com> Cc: Xuxian Jiang <jiang@cs.ncsu.edu> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Andi Kleen <andi@firstfloor.org> Tested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-11-22 14:38:52 +01:00
Thomas Gleixner	9cdca86972	x86: platform: Move iris to x86/platform where it belongs Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-11-20 10:37:05 +01:00
Jeremy Fitzhardinge	d2a817130c	xen: re-enable boot-time ballooning Now that the balloon driver doesn't stumble over non-RAM pages, we can enable the extra space for ballooning. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-11-19 23:28:08 -08:00
Vasiliy Kulikov	5ca9afdb9f	x86, mrst: Check platform_device_register() return code platform_device_register() may fail, if so propagate the return code from mrst_device_create(). Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> LKML-Reference: <1290104207-31279-1-git-send-email-segoon@openwall.com> Acked-by: Alan Cox <alan@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-11-18 13:45:46 -08:00
Ingo Molnar	ae51ce9061	Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core	2010-11-18 20:07:12 +01:00
Linus Torvalds	fb3ff69d13	Merge branch 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: VMX: Fix host userspace gsbase corruption KVM: Correct ordering of ldt reload wrt fs/gs reload	2010-11-18 09:45:47 -08:00
Linus Torvalds	2d42dc3feb	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb: kgdb,ppc: Fix regression in evr register handling kgdb,x86: fix regression in detach handling kdb: fix crash when KDB_BASE_CMD_MAX is exceeded kdb: fix memory leak in kdb_main.c	2010-11-18 08:24:58 -08:00
Hans Rosenfeld	f658bcfb26	x86, cacheinfo: Cleanup L3 cache index disable support Adaptions to the changes of the AMD northbridge caching code: instead of a bool in each l3 struct, use a flag in amd_northbridges.flags to indicate L3 cache index disable support; use a pointer to the whole northbridge instead of the misc device in the l3 struct; simplify the initialisation; dynamically generate sysfs attribute array. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-11-18 15:53:06 +01:00
Hans Rosenfeld	9653a5c76c	x86, amd-nb: Cleanup AMD northbridge caching code Support more than just the "Misc Control" part of the northbridges. Support more flags by turning "gart_supported" into a single bit flag that is stored in a flags member. Clean up related code by using a set of functions (amd_nb_num(), amd_nb_has_feature() and node_to_amd_nb()) instead of accessing the NB data structures directly. Reorder the initialization code and put the GART flush words caching in a separate function. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-11-18 15:53:05 +01:00
Hans Rosenfeld	eec1d4fa00	x86, amd-nb: Complete the rename of AMD NB and related code Not only the naming of the files was confusing, it was even more so for the function and variable names. Renamed the K8 NB and NUMA stuff that is also used on other AMD platforms. This also renames the CONFIG_K8_NUMA option to CONFIG_AMD_NUMA and the related file k8topology_64.c to amdtopology_64.c. No functional changes intended. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-11-18 15:53:04 +01:00
Soeren Sandmann Pedersen	9c0729dc80	x86: Eliminate bp argument from the stack tracing routines The various stack tracing routines take a 'bp' argument in which the caller is supposed to provide the base pointer to use, or 0 if doesn't have one. Since bp is garbage whenever CONFIG_FRAME_POINTER is not defined, this means all callers in principle should either always pass 0, or be conditional on CONFIG_FRAME_POINTER. However, there are only really three use cases for stack tracing: (a) Trace the current task, including IRQ stack if any (b) Trace the current task, but skip IRQ stack (c) Trace some other task In all cases, if CONFIG_FRAME_POINTER is not defined, bp should just be 0. If it _is_ defined, then - in case (a) bp should be gotten directly from the CPU's register, so the caller should pass NULL for regs, - in case (b) the caller should should pass the IRQ registers to dump_trace(), - in case (c) bp should be gotten from the top of the task's stack, so the caller should pass NULL for regs. Hence, the bp argument is not necessary because the combination of task and regs is sufficient to determine an appropriate value for bp. This patch introduces a new inline function stack_frame(task, regs) that computes the desired bp. This function is then called from the two versions of dump_stack(). Signed-off-by: Soren Sandmann <ssp@redhat.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arjan van de Ven <arjan@infradead.org>, Cc: Frederic Weisbecker <fweisbec@gmail.com>, Cc: Arnaldo Carvalho de Melo <acme@redhat.com>, LKML-Reference: <m3oc9rop28.fsf@dhcp-100-3-82.bos.redhat.com>> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-11-18 14:37:34 +01:00
Jan Beulich	37db6c8f1d	x86-64: Fix and clean up AMD Fam10 MMCONF enabling Candidate memory ranges were not calculated properly (start addresses got needlessly rounded down, and end addresses didn't get rounded up at all), address comparison for secondary CPUs was done on only part of the address, and disabled status wasn't tracked properly. Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <4CE24DF40200007800022737@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-18 13:41:35 +01:00
Masami Hiramatsu	de31ec8a31	x86/kprobes: Prevent kprobes to probe on save_args() Prevent kprobes to probe on save_args() since this function will be called from breakpoint exception handler. That will cause infinit loop on breakpoint handling. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: 2nddept-manager@sdl.hitachi.co.jp Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20101118101655.2779.2816.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-18 13:40:19 +01:00
matthieu castet	84e1c6bb38	x86: Add RO/NX protection for loadable kernel modules This patch is a logical extension of the protection provided by CONFIG_DEBUG_RODATA to LKMs. The protection is provided by splitting module_core and module_init into three logical parts each and setting appropriate page access permissions for each individual section: 1. Code: RO+X 2. RO data: RO+NX 3. RW data: RW+NX In order to achieve proper protection, layout_sections() have been modified to align each of the three parts mentioned above onto page boundary. Next, the corresponding page access permissions are set right before successful exit from load_module(). Further, free_module() and sys_init_module have been modified to set module_core and module_init as RW+NX right before calling module_free(). By default, the original section layout and access flags are preserved. When compiled with CONFIG_DEBUG_SET_MODULE_RONX=y, the patch will page-align each group of sections to ensure that each page contains only one type of content and will enforce RO/NX for each group of pages. -v1: Initial proof-of-concept patch. -v2: The patch have been re-written to reduce the number of #ifdefs and to make it architecture-agnostic. Code formatting has also been corrected. -v3: Opportunistic RO/NX protection is now unconditional. Section page-alignment is enabled when CONFIG_DEBUG_RODATA=y. -v4: Removed most macros and improved coding style. -v5: Changed page-alignment and RO/NX section size calculation -v6: Fixed comments. Restricted RO/NX enforcement to x86 only -v7: Introduced CONFIG_DEBUG_SET_MODULE_RONX, added calls to set_all_modules_text_rw() and set_all_modules_text_ro() in ftrace -v8: updated for compatibility with linux 2.6.33-rc5 -v9: coding style fixes -v10: more coding style fixes -v11: minor adjustments for -tip -v12: minor adjustments for v2.6.35-rc2-tip -v13: minor adjustments for v2.6.37-rc1-tip Signed-off-by: Siarhei Liakh <sliakh.lkml@gmail.com> Signed-off-by: Xuxian Jiang <jiang@cs.ncsu.edu> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Reviewed-by: James Morris <jmorris@namei.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <ak@muc.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Dave Jones <davej@redhat.com> Cc: Kees Cook <kees.cook@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4CE2F914.9070106@free.fr> [ minor cleanliness edits, -v14: build failure fix ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-18 13:32:56 +01:00
Matthieu Castet	5bd5a45266	x86: Add NX protection for kernel data This patch expands functionality of CONFIG_DEBUG_RODATA to set main (static) kernel data area as NX. The following steps are taken to achieve this: 1. Linker script is adjusted so .text always starts and ends on a page bound 2. Linker script is adjusted so .rodata always start and end on a page boundary 3. NX is set for all pages from _etext through _end in mark_rodata_ro. 4. free_init_pages() sets released memory NX in arch/x86/mm/init.c 5. bios rom is set to x when pcibios is used. The results of patch application may be observed in the diff of kernel page table dumps: pcibios: -- data_nx_pt_before.txt 2009-10-13 07:48:59.000000000 -0400 ++ data_nx_pt_after.txt 2009-10-13 07:26:46.000000000 -0400 0x00000000-0xc0000000 3G pmd ---[ Kernel Mapping ]--- -0xc0000000-0xc0100000 1M RW GLB x pte +0xc0000000-0xc00a0000 640K RW GLB NX pte +0xc00a0000-0xc0100000 384K RW GLB x pte -0xc0100000-0xc03d7000 2908K ro GLB x pte +0xc0100000-0xc0318000 2144K ro GLB x pte +0xc0318000-0xc03d7000 764K ro GLB NX pte -0xc03d7000-0xc0600000 2212K RW GLB x pte +0xc03d7000-0xc0600000 2212K RW GLB NX pte 0xc0600000-0xf7a00000 884M RW PSE GLB NX pmd 0xf7a00000-0xf7bfe000 2040K RW GLB NX pte 0xf7bfe000-0xf7c00000 8K pte No pcibios: -- data_nx_pt_before.txt 2009-10-13 07:48:59.000000000 -0400 ++ data_nx_pt_after.txt 2009-10-13 07:26:46.000000000 -0400 0x00000000-0xc0000000 3G pmd ---[ Kernel Mapping ]--- -0xc0000000-0xc0100000 1M RW GLB x pte +0xc0000000-0xc0100000 1M RW GLB NX pte -0xc0100000-0xc03d7000 2908K ro GLB x pte +0xc0100000-0xc0318000 2144K ro GLB x pte +0xc0318000-0xc03d7000 764K ro GLB NX pte -0xc03d7000-0xc0600000 2212K RW GLB x pte +0xc03d7000-0xc0600000 2212K RW GLB NX pte 0xc0600000-0xf7a00000 884M RW PSE GLB NX pmd 0xf7a00000-0xf7bfe000 2040K RW GLB NX pte 0xf7bfe000-0xf7c00000 8K pte The patch has been originally developed for Linux 2.6.34-rc2 x86 by Siarhei Liakh <sliakh.lkml@gmail.com> and Xuxian Jiang <jiang@cs.ncsu.edu>. -v1: initial patch for 2.6.30 -v2: patch for 2.6.31-rc7 -v3: moved all code into arch/x86, adjusted credits -v4: fixed ifdef, removed credits from CREDITS -v5: fixed an address calculation bug in mark_nxdata_nx() -v6: added acked-by and PT dump diff to commit log -v7: minor adjustments for -tip -v8: rework with the merge of "Set first MB as RW+NX" Signed-off-by: Siarhei Liakh <sliakh.lkml@gmail.com> Signed-off-by: Xuxian Jiang <jiang@cs.ncsu.edu> Signed-off-by: Matthieu CASTET <castet.matthieu@free.fr> Cc: Arjan van de Ven <arjan@infradead.org> Cc: James Morris <jmorris@namei.org> Cc: Andi Kleen <ak@muc.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Dave Jones <davej@redhat.com> Cc: Kees Cook <kees.cook@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4CE2F82E.60601@free.fr> [ minor cleanliness edits ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-18 12:52:04 +01:00
matthieu castet	64edc8ed5f	x86: Fix improper large page preservation This patch fixes a bug in try_preserve_large_page() which may result in improper large page preservation and improper application of page attributes to the memory area outside of the original change request. More specifically, the problem manifests itself when set_memory_*() is called for several pages at the beginning of the large page and try_preserve_large_page() erroneously concludes that the change can be applied to whole large page. The fix consists of 3 parts: 1. Addition of "required" protection attributes in static_protections(), so .data and .bss can be guaranteed to stay "RW" 2. static_protections() is now called for every small page within large page to determine compatibility of new protection attributes (instead of just small pages within the requested range). 3. Large page can be preserved only if attribute change is large-page-aligned and covers whole large page. -v1: Try_preserve_large_page() patch for Linux 2.6.34-rc2 -v2: Replaced pfn check with address check for kernel rw-data Signed-off-by: Siarhei Liakh <sliakh.lkml@gmail.com> Signed-off-by: Xuxian Jiang <jiang@cs.ncsu.edu> Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: James Morris <jmorris@namei.org> Cc: Andi Kleen <ak@muc.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Dave Jones <davej@redhat.com> Cc: Kees Cook <kees.cook@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4CE2F7F3.8030809@free.fr> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-18 12:52:04 +01:00
Dimitri Sivanich	8191c9f692	x86: UV: Address interrupt/IO port operation conflict This patch for SGI UV systems addresses a problem whereby interrupt transactions being looped back from a local IOH, through the hub to a local CPU can (erroneously) conflict with IO port operations and other transactions. To workaound this we set a high bit in the APIC IDs used for interrupts. This bit appears to be ignored by the sockets, but it avoids the conflict in the hub. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20101116222352.GA8155@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> ___ arch/x86/include/asm/uv/uv_hub.h \| 4 ++++ arch/x86/include/asm/uv/uv_mmrs.h \| 19 ++++++++++++++++++- arch/x86/kernel/apic/x2apic_uv_x.c \| 25 +++++++++++++++++++++++-- arch/x86/platform/uv/tlb_uv.c \| 2 +- arch/x86/platform/uv/uv_time.c \| 4 +++- 5 files changed, 49 insertions(+), 5 deletions(-)	2010-11-18 10:41:25 +01:00
Ingo Molnar	fcf48a725a	Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/urgent	2010-11-18 10:37:51 +01:00
Yinghai Lu	9223081f54	x86: Use online node real index in calulate_tbl_offset() Found a NUMA system that doesn't have RAM installed at the first socket which hangs while executing init scripts. bisected it to: \| commit `9329672021` \| Author: Shaohua Li <shaohua.li@intel.com> \| Date: Wed Oct 20 11:07:03 2010 +0800 \| \| x86: Spread tlb flush vector between nodes It turns out when first socket is not online it could have cpus on node1 tlb_offset set to bigger than NUM_INVALIDATE_TLB_VECTORS. That could affect systems like 4 sockets, but socket 2 doesn't have installed, sockets 3 will get too big tlb_offset. Need to use real online node idx. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Shaohua Li <shaohua.li@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4CDEDE59.40603@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-18 10:10:50 +01:00
Shérab	82148d1d0b	x86/platform: Add Eurobraille/Iris power off support The Iris machines from Eurobraille do not have APM or ACPI support to shut themselves down properly. A special I/O sequence is needed to do so. This modle runs this I/O sequence at kernel shutdown when its force parameter is set to 1. Signed-off-by: Shérab <Sebastien.Hinderer@ens-lyon.org> Acked-by: "H. Peter Anvin" <hpa@zytor.com> [ did minor coding style edits ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-18 10:03:24 +01:00
Kees Cook	79250af2d5	x86: Fix included-by file reference comments Adjust the paths for files that are including verify_cpu.S. Reported-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Kees Cook <kees.cook@canonical.com> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> LKML-Reference: <1289931004-16066-1-git-send-email-kees.cook@canonical.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-18 09:58:54 +01:00
Tetsuo Handa	96e612ffc3	x86, asm: Fix binutils 2.15 build failure Add parentheses around one pushl_cfi argument. Commit `df5d1874` "x86: Use {push,pop}{l,q}_cfi in more places" caused GNU assembler 2.15 (Debian Sarge) to fail. It is still failing as of commit `07bd8516` "x86, asm: Restore parentheses around one pushl_cfi argument". This patch solves build failure with GNU assembler 2.15. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Jan Beulich <jbeulich@novell.com> Cc: heukelum@fastmail.fm Cc: hpa@linux.intel.com LKML-Reference: <201011160445.oAG4jGif079860@www262.sakura.ne.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-18 09:25:11 +01:00
Rakib Mullick	0e2af2a9ab	x86, hw_nmi: Move backtrace_mask declaration under ARCH_HAS_NMI_WATCHDOG backtrace_mask has been used under the code context of ARCH_HAS_NMI_WATCHDOG. So put it into that context. We were warned by the following warning: arch/x86/kernel/apic/hw_nmi.c:21: warning: ‘backtrace_mask’ defined but not used Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Signed-off-by: Don Zickus <dzickus@redhat.com> LKML-Reference: <1289573455-3410-2-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-18 09:15:12 +01:00
Don Zickus	072b198a4a	x86, nmi_watchdog: Remove all stub function calls from old nmi_watchdog Now that the bulk of the old nmi_watchdog is gone, remove all the stub variables and hooks associated with it. This touches lots of files mainly because of how the io_apic nmi_watchdog was implemented. Now that the io_apic nmi_watchdog is forever gone, remove all its fingers. Most of this code was not being exercised by virtue of nmi_watchdog != NMI_IO_APIC, so there shouldn't be anything to risky here. Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: fweisbec@gmail.com Cc: gorcunov@openvz.org LKML-Reference: <1289578944-28564-3-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-18 09:08:23 +01:00
Don Zickus	5f2b0ba4d9	x86, nmi_watchdog: Remove the old nmi_watchdog Now that we have a new nmi_watchdog that is more generic and sits on top of the perf subsystem, we really do not need the old nmi_watchdog any more. In addition, the old nmi_watchdog doesn't really work if you are using the default clocksource, hpet. The old nmi_watchdog code relied on local apic interrupts to determine if the cpu is still alive. With hpet as the clocksource, these interrupts don't increment any more and the old nmi_watchdog triggers false postives. This piece removes the old nmi_watchdog code and stubs out any variables and functions calls. The stubs are the same ones used by the new nmi_watchdog code, so it should be well tested. Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: fweisbec@gmail.com Cc: gorcunov@openvz.org LKML-Reference: <1289578944-28564-2-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-18 09:08:23 +01:00
Ingo Molnar	a89d4bd055	Merge branch 'tip/perf/urgent-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent	2010-11-18 08:07:36 +01:00
Avi Kivity	c8770e7ba6	KVM: VMX: Fix host userspace gsbase corruption We now use load_gs_index() to load gs safely; unfortunately this also changes MSR_KERNEL_GS_BASE, which we managed separately. This resulted in confusion and breakage running 32-bit host userspace on a 64-bit kernel. Fix by - saving guest MSR_KERNEL_GS_BASE before we we reload the host's gs - doing the host save/load unconditionally, instead of only when in guest long mode Things can be cleaned up further, but this is the minmal fix for now. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-11-17 19:48:05 -02:00
Avi Kivity	0a77fe4c18	KVM: Correct ordering of ldt reload wrt fs/gs reload If fs or gs refer to the ldt, they must be reloaded after the ldt. Reorder the code to that effect. Userspace code that uses the ldt with kvm is nonexistent, so this doesn't fix a user-visible bug. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-11-17 19:47:59 -02:00
Jason Wessel	10a6e67648	kgdb,x86: fix regression in detach handling The fix from `ba773f7c51` (x86,kgdb: Fix hw breakpoint regression) was not entirely complete. The kgdb_remove_all_hw_break() function also needs to call the hw_break_release_slot() or else a breakpoint can get activated again after the debugger has detached. The kgdb test suite exposes the behavior in the form of either a hang or repetitive failure. The kernel config that exposes the problem contains all of the following: CONFIG_DEBUG_RODATA=y CONFIG_KGDB_TESTS=y CONFIG_KGDB_TESTS_ON_BOOT=y CONFIG_KGDB_TESTS_BOOT_STRING="V1F100" Reported-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Tested-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-11-17 13:54:57 -06:00
Arnd Bergmann	451a3c24b0	BKL: remove extraneous #include <smp_lock.h> The big kernel lock has been removed from all these files at some point, leaving only the #include. Remove this too as a cleanup. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-17 08:59:32 -08:00
Randy Dunlap	ad02519a0d	x86, mrst: Fix dependencies of "select INTEL_SCU_IPC" commit `b9fc71f47` (x86, mrst: The shutdown for MRST requires the SCU IPC mechanism) introduced the following warning: warning: (X86_MRST && PCI && PCI_GOANY && X86_32 && X86_EXTENDED_PLATFORM && X86_IO_APIC) selects INTEL_SCU_IPC which has unmet direct dependencies (X86 && X86_PLATFORM_DEVICES && X86_MRST) which is due to the hierarchical menu structure. Select X86_PLATFORM_DEVICES as well. Originally-from: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <20101115101406.77e072ef.randy.dunlap@oracle.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>	2010-11-17 13:41:44 +01:00
Alan Cox	b9fc71f47d	x86, mrst: The shutdown for MRST requires the SCU IPC mechanism Fix the build failure reported by Randy. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Alan Cox <alan@linux.intel.com> LKML-Reference: <20101115173110.6877.83958.stgit@localhost.localdomain> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-11-17 13:27:44 +01:00
Jeremy Fitzhardinge	20b4755e4f	Merge commit 'v2.6.37-rc2' into upstream/xenfs * commit 'v2.6.37-rc2': (10093 commits) Linux 2.6.37-rc2 capabilities/syslog: open code cap_syslog logic to fix build failure i2c: Sanity checks on adapter registration i2c: Mark i2c_adapter.id as deprecated i2c: Drivers shouldn't include <linux/i2c-id.h> i2c: Delete unused adapter IDs i2c: Remove obsolete cleanup for clientdata include/linux/kernel.h: Move logging bits to include/linux/printk.h Fix gcc 4.5.1 miscompiling drivers/char/i8k.c (again) hwmon: (w83795) Check for BEEP pin availability hwmon: (w83795) Clear intrusion alarm immediately hwmon: (w83795) Read the intrusion state properly hwmon: (w83795) Print the actual temperature channels as sources hwmon: (w83795) List all usable temperature sources hwmon: (w83795) Expose fan control method hwmon: (w83795) Fix fan control mode attributes hwmon: (lm95241) Check validity of input values hwmon: Change mail address of Hans J. Koch PCI: sysfs: fix printk warnings GFS2: Fix inode deallocation race ...	2010-11-16 11:06:22 -08:00
Linus Torvalds	e5c13537b0	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: sysfs: fix printk warnings PCI: fix pci_bus_alloc_resource() hang, prefer positive decode PCI: read current power state at enable time PCI: fix size checks for mmap() on /proc/bus/pci files x86/PCI: coalesce overlapping host bridge windows PCI hotplug: ibmphp: Add check to prevent reading beyond mapped area	2010-11-15 14:01:33 -08:00
Linus Torvalds	891cbd30ef	Merge branch 'upstream/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'upstream/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: xen: do not release any memory under 1M in domain 0 xen: events: do not unmask event channels on resume xen: correct size of level2_kernel_pgt	2010-11-12 16:01:55 -08:00
Linus Torvalds	b5c5510436	Merge branch 'stable/xen-pcifront-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/xen-pcifront-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: MAINTAINERS: Mark XEN lists as moderated xen-pcifront: fix PCI reference leak xen-pcifront: Remove duplicate inclusion of headers. xen: fix memory leak in Xen PCI MSI/MSI-X allocator. MAINTAINERS: Update mailing list name for Xen pieces.	2010-11-12 15:54:39 -08:00
Ian Campbell	7e77506a59	xen: implement XENMEM_machphys_mapping This hypercall allows Xen to specify a non-default location for the machine to physical mapping. This capability is used when running a 32 bit domain 0 on a 64 bit hypervisor to shrink the hypervisor hole to exactly the size required. [ Impact: add Xen hypercall definitions ] Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2010-11-12 15:00:06 -08:00
Linus Torvalds	25a34554d6	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, pvclock: Remove leftover scale_delta() function x86, apic: Remove double #include x86: Adjust section annotations in AMD Fam10 MMCONF enabling code x86, UV: Update node controller MMRs x86: Remove unnecessary casts of void ptr returning alloc function return values x86: Address gcc4.6 "set but not used" warnings in apic.h x86, mm: Fix section mismatch in tlb.c	2010-11-12 08:40:23 -08:00
Frederic Weisbecker	6c0aca288e	x86: Ignore trap bits on single step exceptions When a single step exception fires, the trap bits, used to signal hardware breakpoints, are in a random state. These trap bits might be set if another exception will follow, like a breakpoint in the next instruction, or a watchpoint in the previous one. Or there can be any junk there. So if we handle these trap bits during the single step exception, we are going to handle an exception twice, or we are going to handle junk. Just ignore them in this case. This fixes https://bugzilla.kernel.org/show_bug.cgi?id=21332 Reported-by: Michael Stefaniuc <mstefani@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Maciej Rutecki <maciej.rutecki@gmail.com> Cc: Alexandre Julliard <julliard@winehq.org> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: All since 2.6.33.x <stable@kernel.org>	2010-11-12 14:51:01 +01:00
Dirk Brandewie	37bc9f5078	x86: Ce4100: Add reboot_fixup() for CE4100 This patch adds the CE4100 reboot fixup to reboot_fixups_32.c [ tglx: Moved PCI id to reboot_fixups_32.c ] Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> LKML-Reference: <5bdcfb4f0206fa721570504e95659a03b815bc5e.1289331834.git.dirk.brandewie@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-11-12 00:45:41 +01:00
Dirk Brandewie	91d8037f56	ce4100: Add PCI register emulation for CE4100 This patch provides access methods for PCI registers that mis-behave on the CE4100. Each register can be assigned a private init, read and write routine. The exception to this is the bridge device. The bridge device is the only device on bus zero (0) that requires any fixup so it is a special case. [ tglx: minor coding style cleanups, __init annotation and simplification of ce4100_conf_read/write ] Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> LKML-Reference: <40b6751381c2275dc359db5a17989cce22ad8db7.1289331834.git.dirk.brandewie@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-11-12 00:45:41 +01:00
Thomas Gleixner	c751e17b53	x86: Add CE4100 platform support Add CE4100 platform support. CE4100 needs early setup like moorestown. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dirk Brandewie <dirk.brandewie@gmail.com> LKML-Reference: <94720fd7f5564a12ebf202cf2c4f4c0d619aab35.1289331834.git.dirk.brandewie@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-11-12 00:45:41 +01:00
Stefano Stabellini	e060e7af98	xen: set vma flag VM_PFNMAP in the privcmd mmap file_op Set VM_PFNMAP in the privcmd mmap file_op, rather than later in xen_remap_domain_mfn_range when it is too late because vma_wants_writenotify has already been called and vm_page_prot has already been modified. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-11-11 12:37:43 -08:00
Bjorn Helgaas	4723d0f2f9	x86/PCI: coalesce overlapping host bridge windows Some BIOSes provide PCI host bridge windows that overlap, e.g., pci_root PNP0A03:00: host bridge window [mem 0xb0000000-0xffffffff] pci_root PNP0A03:00: host bridge window [mem 0xafffffff-0xdfffffff] pci_root PNP0A03:00: host bridge window [mem 0xf0000000-0xffffffff] If we simply insert these as children of iomem_resource, the second window fails because it conflicts with the first, and the third is inserted as a child of the first, i.e., b0000000-ffffffff PCI Bus 0000:00 f0000000-ffffffff PCI Bus 0000:00 When we claim PCI device resources, this can cause collisions like this if we put them in the first window: pci 0000:00:01.0: address space collision: [mem 0xff300000-0xff4fffff] conflicts with PCI Bus 0000:00 [mem 0xf0000000-0xffffffff] Host bridge windows are top-level resources by definition, so it doesn't make sense to make the third window a child of the first. This patch coalesces any host bridge windows that overlap. For the example above, the result is this single window: pci_root PNP0A03:00: host bridge window [mem 0xafffffff-0xffffffff] This fixes a 2.6.34 regression. Reference: https://bugzilla.kernel.org/show_bug.cgi?id=17011 Reported-and-tested-by: Anisse Astier <anisse@astier.eu> Reported-and-tested-by: Pramod Dematagoda <pmd.lotr.gandalf@gmail.com> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2010-11-11 09:34:31 -08:00
Feng Tang	6f207e9bb4	x86: mrst: Set vRTC's IRQ to level trigger type When setting up the mpc_intsrc structure for vRTC's IRQ, we need to set its irqflag to level trigger, otherwise it will be taken as edge triggered and the vRTC IRQ will fire only once, as there is never a EOI issued from the IA core for it. The original code worked in previous kernel. This is because it was configured to level trigger type by luck. It fell into the default PCI trigger category which is level triggered. Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> LKML-Reference: <20101111155019.12924.569.stgit@localhost.localdomain> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-11-11 17:43:18 +01:00
Vinod Koul	86071535f8	x86: mrst: Add audio driver bindings This patch adds the sound card bindings for Moorestown (pmic_audio) and the Medfield platform (msic_audio) as IPC devices. This ensures they will be created at the right time. Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> LKML-Reference: <20101110174044.11340.78008.stgit@localhost.localdomain> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-11-11 11:34:28 +01:00
Feng Tang	0146f26145	rtc: Add drivers/rtc/rtc-mrst.c Provide the standard kernel rtc driver interface on top of the vrtc layer added in the previous patch. Signed-off-by: Feng Tang <feng.tang@intel.com> LKML-Reference: <20101110172911.3311.20593.stgit@localhost.localdomain> [Fixed swapped arguments on IPC] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> [Cleaned up and the device creation moved to arch/x86/platform] Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-11-11 11:34:27 +01:00
Feng Tang	7309282c90	x86: mrst: Add vrtc driver which serves as a wall clock device Moorestown platform doesn't have a m146818 RTC device like traditional x86 PC, but a firmware emulated virtual RTC device(vrtc), which provides some basic RTC functions like get/set time. vrtc serves as the only wall clock device on Moorestown platform. [ tglx: Changed the exports to _GPL ] Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> LKML-Reference: <20101110172837.3311.40483.stgit@localhost.localdomain> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-11-11 11:34:27 +01:00
Alek Du	cfb505a7eb	x86: mrst: Add Moorestown specific reboot/shutdown support Moorestowns needs to use a special IPC command to reboot or shutdown the platform. Signed-off-by: Alek Du <alek.du@intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> LKML-Reference: <20101110164928.6365.94243.stgit@localhost.localdomain> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-11-11 11:34:27 +01:00
Steven Rostedt	b590854853	tracing: Force arch_local_irq_* notrace for paravirt When running ktest.pl randconfig tests, I would sometimes trigger a lockdep annotation bug (possible reason: unannotated irqs-on). This triggering happened right after function tracer self test was executed. After doing a config bisect I found that this was caused with having function tracer, paravirt guest, prove locking, and rcu torture all enabled. The rcu torture just enhanced the likelyhood of triggering the bug. Prove locking was needed, since it was the thing that was bugging. Function tracer would trace and disable interrupts in all sorts of funny places. paravirt guest would turn arch_local_irq_* into functions that would be traced. Besides the fact that tracing arch_local_irq_* is just a bad idea, this is what is happening. The bug happened simply in the local_irq_restore() code: if (raw_irqs_disabled_flags(flags)) { \ raw_local_irq_restore(flags); \ trace_hardirqs_off(); \ } else { \ trace_hardirqs_on(); \ raw_local_irq_restore(flags); \ } \ The raw_local_irq_restore() was defined as arch_local_irq_restore(). Now imagine, we are about to enable interrupts. We go into the else case and call trace_hardirqs_on() which tells lockdep that we are enabling interrupts, so it sets the current->hardirqs_enabled = 1. Then we call raw_local_irq_restore() which calls arch_local_irq_restore() which gets traced! Now in the function tracer we disable interrupts with local_irq_save(). This is fine, but flags is stored that we have interrupts disabled. When the function tracer calls local_irq_restore() it does it, but this time with flags set as disabled, so we go into the if () path. This keeps interrupts disabled and calls trace_hardirqs_off() which sets current->hardirqs_enabled = 0. When the tracer is finished and proceeds with the original code, we enable interrupts but leave current->hardirqs_enabled as 0. Which now breaks lockdeps internal processing. Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2010-11-10 22:29:49 -05:00
Ian Campbell	9ec23a7f6d	xen: do not release any memory under 1M in domain 0 We already deliberately setup a 1-1 P2M for the region up to 1M in order to allow code which assumes this region is already mapped to work without having to convert everything to ioremap. Domain 0 should not return any apparently unused memory regions (reserved or otherwise) in this region to Xen since the e820 may not accurately reflect what the BIOS has stashed in this region. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-11-10 17:19:25 -08:00
Kees Cook	6036f373ea	x86, cpu: Only CPU features determine NX capabilities Fix the NX feature boot warning when NX is missing to correctly reflect that BIOSes cannot disable NX now. Signed-off-by: Kees Cook <kees.cook@canonical.com> LKML-Reference: <1289414154-7829-5-git-send-email-kees.cook@canonical.com> Acked-by: Pekka Enberg <penberg@kernel.org> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-11-10 15:43:15 -08:00
Kees Cook	ebba638ae7	x86, cpu: Call verify_cpu during 32bit CPU startup The XD_DISABLE-clearing side-effect needs to happen for both 32bit and 64bit, but the 32bit init routines were not calling verify_cpu() yet. This adds that call to gain the side-effect. The longmode/SSE tests being performed in verify_cpu() need to happen very early for 64bit but not for 32bit. Instead of including it in two places for 32bit, we can just include it once in arch/x86/kernel/head_32.S. Signed-off-by: Kees Cook <kees.cook@canonical.com> LKML-Reference: <1289414154-7829-4-git-send-email-kees.cook@canonical.com> Acked-by: Pekka Enberg <penberg@kernel.org> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-11-10 15:43:09 -08:00
Kees Cook	ae84739c27	x86, cpu: Clear XD_DISABLED flag on Intel to regain NX Intel CPUs have an additional MSR bit to indicate if the BIOS was configured to disable the NX cpu feature. This bit was traditionally used for operating systems that did not understand how to handle the NX bit. Since Linux understands this, this BIOS flag should be ignored by default. In a review[1] of reported hardware being used by Ubuntu bug reporters, almost 10% of systems had an incorrectly configured BIOS, leaving their systems unable to use the NX features of their CPU. This change will clear the MSR_IA32_MISC_ENABLE_XD_DISABLE bit so that NX cannot be inappropriately controlled by the BIOS on Intel CPUs. If, under very strange hardware configurations, NX actually needs to be disabled, "noexec=off" can be used to restore the prior behavior. [1] http://www.outflux.net/blog/archives/2010/02/18/data-mining-for-nx-bit/ Signed-off-by: Kees Cook <kees.cook@canonical.com> LKML-Reference: <1289414154-7829-3-git-send-email-kees.cook@canonical.com> Acked-by: Pekka Enberg <penberg@kernel.org> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-11-10 15:42:54 -08:00
Kees Cook	c5cbac6942	x86, cpu: Rename verify_cpu_64.S to verify_cpu.S The code is 32bit already, and can be used in 32bit routines. Signed-off-by: Kees Cook <kees.cook@canonical.com> LKML-Reference: <1289414154-7829-2-git-send-email-kees.cook@canonical.com> Acked-by: Pekka Enberg <penberg@kernel.org> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-11-10 15:42:42 -08:00
Peter Zijlstra	034c6efa46	perf, amd: Use kmalloc_node(,__GFP_ZERO) for northbridge structure allocation Jasper suggested we use the zeroing capability of the allocators instead of calling memset ourselves. Add node affinity while we're at it. Reported-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-10 22:58:40 +01:00
Borislav Petkov	c7657ac0c3	x86, microcode, AMD: Cleanup code a bit get_ucode_data is a memcpy() wrapper which always returns 0. Move it into the header and make it an inline. Remove all code checking its return value and turn it into a void. There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-11-10 14:54:54 +01:00
Jesper Juhl	1ea6be212e	x86, microcode, AMD: Replace vmalloc+memset with vzalloc We don't have to do memset() ourselves after vmalloc() when we have vzalloc(), so change that in arch/x86/kernel/microcode_amd.c::get_next_ucode(). Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-11-10 14:48:57 +01:00
Kusanagi Kouichi	1f523bf367	x86, pvclock: Remove leftover scale_delta() function Commit 92580d64e16402762e2acc3022f065397c780425 ("x86: pvclock: Move scale_delta into common header") forgot to remove scale_delta. Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Cc: Zachary Amsden <zamsden@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Glauber Costa <glommer@redhat.com> LKML-Reference: <20101105110444.BAF6D6FC03B@msa105.auone-net.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-10 10:32:15 +01:00
Jesper Juhl	2a8dcbd6cd	x86, apic: Remove double #include Remove the second <asm/atomic.h> inclusion. Signed-off-by: Jesper Juhl <jj@chaosbits.net> LKML-Reference: <alpine.LNX.2.00.1011072253360.26247@swampdragon.chaosbits.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-10 10:21:16 +01:00
Jan Beulich	2f62bf7d23	x86: Adjust section annotations in AMD Fam10 MMCONF enabling code check_enable_amd_mmconf_dmi() gets called only for the BSP, hence everything hanging off of it can be __init*. Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4CD2DE1E0200007800020990@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-10 10:08:26 +01:00
Jack Steiner	62b0cfc240	x86, UV: Update node controller MMRs A new version of the SGI UV hub node controller is being developed. A few of the MMRs (control registers) that exist on the current hub no longer exist on the new hub. Fortunately, there are alternate MMRs that are are functionally equivalent and that exist on both hubs. This patch changes the UV code to use MMRs that exist in BOTH versions of the hub node controller. Signed-off-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <20101106204056.GA27584@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-10 10:06:38 +01:00
Jesper Juhl	8e5e9521c1	x86: Remove unnecessary casts of void ptr returning alloc function return values The [vk][cmz]alloc(_node) family of functions return void pointers which it's completely unnecessary/pointless to cast to other pointer types since that happens implicitly. This patch removes such casts from arch/x86. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Cc: trivial@kernel.org Cc: amd64-microcode@amd64.org Cc: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <alpine.LNX.2.00.1011082310220.23697@swampdragon.chaosbits.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-10 09:13:00 +01:00
Andi Kleen	0059b2436a	x86: Address gcc4.6 "set but not used" warnings in apic.h native_apic_msr_read() and x2apic_enabled() use rdmsr(msr, low, high), but only use the low part. gcc4.6 complains about this: .../apic.h:144:11: warning: variable 'high' set but not used [-Wunused-but-set-variable] rdmsr() is just a wrapper around rdmsrl() which splits the 64bit value into low and high, so using rdmsrl() directly solves this. [tglx: Changed the variables to u64 as suggested by Cyrill. It's less confusing and has no code impact as this is 64bit only anyway. Massaged changelog as well. ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: x86@kernel.org Cc: Cyrill Gorcunov <gorcunov@gmail.com> LKML-Reference: <1289251229-19589-1-git-send-email-andi@firstfloor.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-11-09 18:40:30 +01:00
Jacob Pan	7f05dec3dd	x86: mrst: Parse SFI timer table for all timer configs Penwell has APB timer based watchdog timers, it requires platform code to parse SFI MTMR tables in order to claim its timer. This patch will always parse SFI MTMR regardless of system timer configuration choices. Otherwise, SFI MTMR table may not get parsed if running on Medfield with always-on local APIC timers and constant TSC. Watchdog timer driver will then not get a timer to use. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> LKML-Reference: <20101109112800.20591.10802.stgit@localhost.localdomain> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-11-09 14:45:52 +01:00
Feng Tang	1da4b1c6a4	x86/mrst: Add SFI platform device parsing code SFI provides a series of tables. These describe the platform devices present including SPI and I²C devices, as well as various sensors, keypads and other glue as well as interfaces provided via the SCU IPC mechanism (intel_scu_ipc.c) This patch is a merge of the core elements and relevant fixes from the Intel development code by Feng, Alek, myself into a single coherent patch for upstream submission. It provides the needed infrastructure to register I2C, SPI and platform devices described by the tables, as well as handlers for some of the hardware already supported in kernel. The 0.8 firmware also provides GPIO tables. Devices are created at boot time or if they are SCU dependant at the point an SCU is discovered. The existing Linux device mechanisms will then handle the device binding. At an abstract level this is an SFI to Linux device translator. Device/platform specific setup/glue is in this file. This is done so that the drivers for the generic I²C and SPI bus devices remain cross platform as they should. (Updated from RFC version to correct the emc1403 name used by the firmware and a wrongly used #define) Signed-off-by: Alek Du <alek.du@linux.intel.com> LKML-Reference: <20101109112158.20013.6158.stgit@localhost.localdomain> [Clean ups, removal of 0.7 support] Signed-off-by: Feng Tang <feng.tang@linux.intel.com> [Clean ups] Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-11-09 14:45:52 +01:00
Jiri Slaby	07cf2a64c2	xen: fix memory leak in Xen PCI MSI/MSI-X allocator. Stanse found that xen_setup_msi_irqs leaks memory when xen_allocate_pirq fails. Free the memory in that fail path. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: xen-devel@lists.xensource.com Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org	2010-11-08 11:30:00 -05:00
Jan Kiszka	453d9c57e2	KVM: x86: Issue smp_call_function_many with preemption disabled smp_call_function_many is specified to be called only with preemption disabled. Fulfill this requirement. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-11-05 14:42:27 -02:00
Vasiliy Kulikov	97e69aa62f	KVM: x86: fix information leak to userland Structures kvm_vcpu_events, kvm_debugregs, kvm_pit_state2 and kvm_clock_data are copied to userland with some padding and reserved fields unitialized. It leads to leaking of contents of kernel stack memory. We have to initialize them to zero. In patch v1 Jan Kiszka suggested to fill reserved fields with zeros instead of memset'ting the whole struct. It makes sense as these fields are explicitly marked as padding. No more fields need zeroing. KVM-Stable-Tag. Signed-off-by: Vasiliy Kulikov <segooon@gmail.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-11-05 14:42:27 -02:00
Marcelo Tosatti	eb45fda45f	KVM: MMU: fix rmap_remove on non present sptes drop_spte should not attempt to rmap_remove a non present shadow pte. This fixes a BUG_ON seen on kvm-autotest. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Reported-by: Lucas Meneghel Rodrigues <lmr@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-11-05 14:42:26 -02:00
Michael S. Tsirkin	edde99ce05	KVM: Write protect memory after slot swap I have observed the following bug trigger: 1. userspace calls GET_DIRTY_LOG 2. kvm_mmu_slot_remove_write_access is called and makes a page ro 3. page fault happens and makes the page writeable fault is logged in the bitmap appropriately 4. kvm_vm_ioctl_get_dirty_log swaps slot pointers a lot of time passes 5. guest writes into the page 6. userspace calls GET_DIRTY_LOG At point (5), bitmap is clean and page is writeable, thus, guest modification of memory is not logged and GET_DIRTY_LOG returns an empty bitmap. The rule is that all pages are either dirty in the current bitmap, or write-protected, which is violated here. It seems that just moving kvm_mmu_slot_remove_write_access down to after the slot pointer swap should fix this bug. KVM-Stable-Tag. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-11-05 14:42:25 -02:00
Rakib Mullick	cf38d0ba7e	x86, mm: Fix section mismatch in tlb.c Mark tlb_cpuhp_notify as __cpuinit. It's basically a callback function, which is called from __cpuinit init_smp_flash(). So - it's safe. We were warned by the following warning: WARNING: arch/x86/mm/built-in.o(.text+0x356d): Section mismatch in reference from the function tlb_cpuhp_notify() to the function .cpuinit.text:calculate_tlb_offset() The function tlb_cpuhp_notify() references the function __cpuinit calculate_tlb_offset(). This is often because tlb_cpuhp_notify lacks a __cpuinit annotation or the annotation of calculate_tlb_offset is wrong. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Shaohua Li <shaohua.li@intel.com> LKML-Reference: <AANLkTinWQRG=HA9uB3ad0KAqRRTinL6L_4iKgF84coph@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-01 10:09:07 +01:00
Linus Torvalds	f02a38d86a	Merge branches 'perf-fixes-for-linus' and 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: jump label: Add work around to i386 gcc asm goto bug x86, ftrace: Use safe noops, drop trap test jump_label: Fix unaligned traps on sparc. jump label: Make arch_jump_label_text_poke_early() optional jump label: Fix error with preempt disable holding mutex oprofile: Remove deprecated use of flush_scheduled_work() oprofile: Fix the hang while taking the cpu offline jump label: Fix deadlock b/w jump_label_mutex vs. text_mutex jump label: Fix module __init section race * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Check irq_remapped instead of remapping_enabled in destroy_irq()	2010-10-30 11:43:26 -07:00
Yinghai Lu	7b79462a20	x86: Check irq_remapped instead of remapping_enabled in destroy_irq() Russ Anderson reported: \| There is a regression that is causing a NULL pointer dereference \| in free_irte when shutting down xpc. git bisect narrowed it down \| to git commit d585d06(intr_remap: Simplify the code further), which \| changed free_irte(). Reverse applying the patch fixes the problem. We need to use irq_remapped() for each irq instead of checking only intr_remapping_enabled as there might be non remapped irqs even when remapping is enabled. [ tglx: use cfg instead of retrieving it again. Massaged changelog ] Reported-bisected-and-tested-by: Russ Anderson <rja@sgi.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <4CCBD511.40607@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-10-30 10:28:31 +02:00
Linus Torvalds	2d10d8737c	Merge branches 'x86-fixes-for-linus' and 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, alternative: Call stop_machine_text_poke() on all cpus x86-32: Restore irq stacks NUMA-aware allocations x86, memblock: Fix early_node_mem with big reserved region. * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, uv: More Westmere support on SGI UV x86, uv: Enable Westmere support on SGI UV	2010-10-29 18:58:00 -07:00
Jason Baron	404ba5d7bb	x86, alternative: Call stop_machine_text_poke() on all cpus Currently, text_poke_smp() passes a NULL as the third argument to __stop_machine(), which will only run stop_machine_text_poke() on 1 cpu. Change NULL -> cpu_online_mask, as stop_machine_text_poke() is intended to be run on all cpus. I actually didn't notice any problems with stop_machine_text_poke() only being called on 1 cpu, but found this via code inspection. Signed-off-by: Jason Baron <jbaron@redhat.com> LKML-Reference: <20101028152026.GB2875@redhat.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-10-29 16:42:58 -07:00
Ian Campbell	a2d771c036	xen: correct size of level2_kernel_pgt sizeof(pmd_t *) is 4 bytes on 32-bit PAE leading to an allocation of only 2048 bytes. The correct size is sizeof(pmd_t) giving us a full page allocation. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-10-29 12:23:57 -07:00
Linus Torvalds	1e431a9d64	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb: kgdb,ppc: Individual register get/set for ppc kgdbts: prevent re-entry to kgdbts before it unregisters debug_core,x86,blackfin: Clean up hw debug disable API kdb: Fix early debugging crash regression kgdb,arm: fix register dump kdb: fix per_cpu command to remove supress mask kdb: Add kdb kernel module sample	2010-10-29 11:49:38 -07:00

1 2 3 4 5 ...

12190 Commits