linux

Commit Graph

Author	SHA1	Message	Date
Stefani Seibold	89240ba059	x86, fs: Fix x86 procfs stack information for threads on 64-bit This patch fixes two issues in the procfs stack information on x86-64 linux. The 32 bit loader compat_do_execve did not store stack start. (this was figured out by Alexey Dobriyan). The stack information on a x64_64 kernel always shows 0 kbyte stack usage, because of a missing implementation of the KSTK_ESP macro which always returned -1. The new implementation now returns the right value. Signed-off-by: Stefani Seibold <stefani@seibold.net> Cc: Americo Wang <xiyou.wangcong@gmail.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <1257240160.4889.24.camel@wall-e> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-04 13:25:03 +01:00
Ingo Molnar	a2e7127153	Merge commit 'v2.6.32-rc6' into perf/core Conflicts: tools/perf/Makefile Merge reason: Resolve the conflict, merge to upstream and merge in perf fixes so we can add a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-04 11:59:45 +01:00
Ingo Molnar	1d87cff407	Merge branch 'iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent	2009-11-03 16:54:14 +01:00
Herbert Xu	3b0d65969b	crypto: ghash-intel - Add PSHUFB macros Add PSHUFB macros instead of repeating byte sequences, suggested by Ingo. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Ingo Molnar <mingo@elte.hu>	2009-11-03 09:11:15 -05:00
Joerg Roedel	342688f9db	Merge branches 'amd-iommu/fixes' and 'dma-debug/fixes' into iommu/fixes	2009-11-03 12:05:40 +01:00
Mike Galbraith	6b9de613ae	sched: Disable SD_PREFER_LOCAL at node level Yanmin Zhang reported that SD_PREFER_LOCAL induces an order of magnitude increase in select_task_rq_fair() overhead while running heavy wakeup benchmarks (tbench and vmark). Since SD_BALANCE_WAKE is off at node level, turn SD_PREFER_LOCAL off as well pending further investigation. Reported-by: Zhang, Yanmin <yanmin_zhang@linux.intel.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-03 07:24:07 +01:00
Suresh Siddha	502f660466	x86, cpa: Fix kernel text RO checks in static_protection() Steven Rostedt reported that we are unconditionally making the kernel text mapping as read-only. i.e., if someone does cpa() to the kernel text area for setting/clearing any page table attribute, we unconditionally clear the read-write attribute for the kernel text mapping that is set at compile time. We should delay (to forbid the write attribute) and enforce only after the kernel has mapped the text as read-only. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20091029024820.996634347@sbs-t61.sc.intel.com> [ marked kernel_set_to_readonly as __read_mostly ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-02 17:16:35 +01:00
Suresh Siddha	a5e74b8419	x86: Force irq complete move during cpu offline When a cpu goes offline, fixup_irqs() try to move irq's currently destined to the offline cpu to a new cpu. But this attempt will fail if the irq is recently moved to this cpu and the irq still hasn't arrived at this cpu (for non intr-remapping platforms this is when we free the vector allocation at the previous destination) that is about to go offline. This will endup with the interrupt subsystem still pointing the irq to the offline cpu, causing that irq to not work any more. Fix this by forcing the irq to complete its move (its been a long time we moved the irq to this cpu which we are offlining now) and then move this irq to a new cpu before this cpu goes offline. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Gary Hade <garyhade@us.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> LKML-Reference: <20091026230001.848830905@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-02 15:56:36 +01:00
Suresh Siddha	23359a88e7	x86: Remove move_cleanup_count from irq_cfg move_cleanup_count for each irq in irq_cfg is keeping track of the total number of cpus that need to free the corresponding vectors associated with the irq which has now been migrated to new destination. As long as this move_cleanup_count is non-zero (i.e., as long as we have n't freed the vector allocations on the old destinations) we were preventing the irq's further migration. This cleanup count is unnecessary and it is enough to not allow the irq migration till we send the cleanup vector to the previous irq destination, for which we already have irq_cfg's move_in_progress. All we need to make sure is that we free the vector at the old desintation but we don't need to wait till that gets freed. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Gary Hade <garyhade@us.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> LKML-Reference: <20091026230001.752968906@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-02 15:56:35 +01:00
Linus Torvalds	6e958d73c2	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Do less agressive buddy clearing sched: Disable SD_PREFER_LOCAL for MC/CPU domains	2009-10-29 08:10:38 -07:00
Tejun Heo	0f5e4816db	percpu: remove some sparse warnings Make the following changes to remove some sparse warnings. * Make DEFINE_PER_CPU_SECTION() declare __pcpu_unique_* before defining it. * Annotate pcpu_extend_area_map() that it is entered with pcpu_lock held, releases it and then reacquires it. * Make percpu related macros use unique nested variable names. * While at it, add pcpu prefix to __size_call[_return]() macros as to-be-implemented sparse annotations will add percpu specific stuff to these macros. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Cc: Rusty Russell <rusty@rustcorp.com.au>	2009-10-29 22:34:12 +09:00
Masami Hiramatsu	e0e492e99b	x86: AVX instruction set decoder support Add Intel AVX(Advanced Vector Extensions) instruction set support to x86 instruction decoder. This adds insn.vex_prefix field for storing VEX prefixes, and introduces some original tags for expressing opcodes attributes. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> LKML-Reference: <20091027204226.30545.23451.stgit@harusame> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-29 08:47:46 +01:00
Masami Hiramatsu	04d46c1b13	x86: Merge INAT_REXPFX into INAT_PFX_* Merge INAT_REXPFX into INAT_PFX_* macro and rename it to INAT_PFX_REX. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> LKML-Reference: <20091027204211.30545.58090.stgit@harusame> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-29 08:47:45 +01:00
Ingo Molnar	4331595650	Merge branch 'perf/core' into perf/probes Conflicts: tools/perf/Makefile Merge reason: - fix the conflict - pick up the pr_*() infrastructure to queue up dependent patch Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-23 08:23:20 +02:00
Robin Holt	02dd0a0613	x86, UV: Set DELIVERY_MODE=4 for vector=NMI_VECTOR in uv_hub_send_ipi() When sending a NMI_VECTOR IPI using the UV_HUB_IPI_INT register, we need to ensure the delivery mode field of that register has NMI delivery selected. This makes those IPIs true NMIs, instead of flat IPIs. It matters to reboot sequences and KGDB, both of which use NMI IPIs. Signed-off-by: Robin Holt <holt@sgi.com> Acked-by: Jack Steiner <steiner@sgi.com> Cc: Martin Hicks <mort@sgi.com> Cc: <stable@kernel.org> LKML-Reference: <20091020193620.877322000@alcatraz.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-21 13:31:13 +02:00
Suresh Siddha	74e081797b	x86-64: align RODATA kernel section to 2MB with CONFIG_DEBUG_RODATA CONFIG_DEBUG_RODATA chops the large pages spanning boundaries of kernel text/rodata/data to small 4KB pages as they are mapped with different attributes (text as RO, RODATA as RO and NX etc). On x86_64, preserve the large page mappings for kernel text/rodata/data boundaries when CONFIG_DEBUG_RODATA is enabled. This is done by allowing the RODATA section to be hugepage aligned and having same RWX attributes for the 2MB page boundaries Extra Memory pages padding the sections will be freed during the end of the boot and the kernel identity mappings will have different RWX permissions compared to the kernel text mappings. Kernel identity mappings to these physical pages will be mapped with smaller pages but large page mappings are still retained for kernel text,rodata,data mappings. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20091014220254.190119924@sbs-t61.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-10-20 14:46:00 +09:00
Huang Ying	0e1227d356	crypto: ghash - Add PCLMULQDQ accelerated implementation PCLMULQDQ is used to accelerate the most time-consuming part of GHASH, carry-less multiplication. More information about PCLMULQDQ can be found at: http://software.intel.com/en-us/articles/carry-less-multiplication-and-its-usage-for-computing-the-gcm-mode/ Because PCLMULQDQ changes XMM state, its usage must be enclosed with kernel_fpu_begin/end, which can be used only in process context, the acceleration is implemented as crypto_ahash. That is, request in soft IRQ context will be defered to the cryptd kernel thread. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-10-19 11:53:06 +09:00
Frederic Weisbecker	0f8f86c7bd	Merge commit 'perf/core' into perf/hw-breakpoint Conflicts: kernel/Makefile kernel/trace/Makefile kernel/trace/trace.h samples/Makefile Merge reason: We need to be uptodate with the perf events development branch because we plan to rewrite the breakpoints API on top of perf events.	2009-10-18 01:12:33 +02:00
Ingo Molnar	bb3c3e8071	Merge commit 'v2.6.32-rc5' into perf/probes Conflicts: kernel/trace/trace_event_profile.c Merge reason: update to -rc5 and resolve conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-17 09:58:25 +02:00
Robin Holt	1d21e6e3ff	x86, UV: Fix and clean up bau code to use uv_gpa_to_pnode() Create an inline function to extract the pnode from a global physical address and then convert the broadcast assist unit to use the newly created uv_gpa_to_pnode function. The open-coded code was wrong as well - it might explain a few of our unexplained bau hangs. Signed-off-by: Robin Holt <holt@sgi.com> Acked-by: Cliff Whickman <cpw@sgi.com> Cc: linux-mm@kvack.org Cc: Jack Steiner <steiner@sgi.com> LKML-Reference: <20091016112920.GZ8903@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-16 14:51:53 +02:00
Borislav Petkov	5e09954a9a	x86, mce: Fix up MCE naming nomenclature Prefix global/setup routines with "mcheck_" thus differentiating from the internal facilities prefixed with "mce_". Also, prefix the per cpu calls with mcheck_cpu and rename them to reflect the MCE setup hierarchy of calls better. There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <1255689093-26921-1-git-send-email-borislav.petkov@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-16 14:46:49 +02:00
Ingo Molnar	6b50f5c7c7	Merge branches 'x86/mce' and 'x86/urgent' into perf/mce Merge reason: Put all MCE changes into this branch, we are queueing up a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-16 14:42:25 +02:00
Ingo Molnar	b226f744d4	Merge branch 'linus' into perf/core Merge reason: pick up tools/perf/ changes from upstream. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-15 08:44:44 +02:00
Dimitri Sivanich	4a4de9c7d7	x86: UV RTC: Rename generic_interrupt to x86_platform_ipi Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20091014142257.GE11048@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 18:27:11 +02:00
Ingo Molnar	7ec13187ef	x86, apic: Fix prototype in hw_irq.h This warning: In file included from arch/x86/include/asm/ipi.h:23, from arch/x86/kernel/apic/apic_noop.c:27: arch/x86/include/asm/hw_irq.h:105: warning: ‘struct irq_desc’ declared inside parameter list arch/x86/include/asm/hw_irq.h:105: warning: its scope is only this definition or declaration, which is probably not what you want triggers because irq_desc is defined after hw_irq.h is included in irq.h. Since it's pointer reference only, a forward declaration of the type will solve the problem. LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 15:06:42 +02:00
Peter Zijlstra	799e2205ec	sched: Disable SD_PREFER_LOCAL for MC/CPU domains Yanmin reported that both tbench and hackbench were significantly hurt by trying to keep tasks local on these domains, esp on small cache machines. So disable it in order to promote spreading outside of the cache domains. Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> CC: Mike Galbraith <efault@gmx.de> LKML-Reference: <1255083400.8802.15.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 15:02:34 +02:00
Dimitri Sivanich	9338ad6ffb	x86, apic: Move SGI UV functionality out of generic IO-APIC code Move UV specific functionality out of the generic IO-APIC code. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20091013203236.GD20543@sgi.com> [ Cleaned up the code some more in their new places. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 09:17:09 +02:00
Dimitri Sivanich	6c2c502910	x86: SGI UV: Fix irq affinity for hub based interrupts This patch fixes handling of uv hub irq affinity. IRQs with ALL or NODE affinity can be routed to cpus other than their originally assigned cpu. Those with CPU affinity cannot be rerouted. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20090930160259.GA7822@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 09:17:01 +02:00
Cyrill Gorcunov	9844ab11c7	x86, apic: Introduce the NOOP apic driver Introduce NOOP APIC driver. We should use it in case if apic was disabled due to hardware of software/firmware problems (including user requested to disable it case). The driver is attempting to catch any inappropriate apic operation call with warning issue. Also it is possible to use some apic operation like IPI calls, read/write without checking for apic presence which should make callers code easier. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: yinghai@kernel.org Cc: macro@linux-mips.org LKML-Reference: <20091013201022.534682104@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 09:17:00 +02:00
Ingo Molnar	9dbdd6c41c	Merge commit 'v2.6.32-rc4' into perf/core Merge reason: we were on an -rc1 base, merge up to -rc4. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-13 09:31:34 +02:00
Jeremy Fitzhardinge	71999d9862	x86/paravirt: Use normal calling sequences for irq enable/disable Bastian Blank reported a boot crash with stackprotector enabled, and debugged it back to edx register corruption. For historical reasons irq enable/disable/save/restore had special calling sequences to make them more efficient. With the more recent introduction of higher-level and more general optimisations this is no longer necessary so we can just use the normal PVOP_ macros. This fixes some residual bugs in the old implementations which left edx liable to inadvertent clobbering. Also, fix some bugs in __PVOP_VCALLEESAVE which were revealed by actual use. Reported-by: Bastian Blank <bastian@waldi.eu.org> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Cc: Xen-devel <xen-devel@lists.xensource.com> LKML-Reference: <4AD3BC9B.7040501@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-13 09:22:01 +02:00
Arnaldo Carvalho de Melo	a2e2725541	net: Introduce recvmmsg socket syscall Meaning receive multiple messages, reducing the number of syscalls and net stack entry/exit operations. Next patches will introduce mechanisms where protocols that want to optimize this operation will provide an unlocked_recvmsg operation. This takes into account comments made by: . Paul Moore: sock_recvmsg is called only for the first datagram, sock_recvmsg_nosec is used for the rest. . Caitlin Bestler: recvmmsg now has a struct timespec timeout, that works in the same fashion as the ppoll one. If the underlying protocol returns a datagram with MSG_OOB set, this will make recvmmsg return right away with as many datagrams (+ the OOB one) it has received so far. . Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen datagrams and then recvmsg returns an error, recvmmsg will return the successfully received datagrams, store the error and return it in the next call. This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg, where we will be able to acquire the lock only at batch start and end, not at every underlying recvmsg call. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-10-12 23:40:10 -07:00
David Rientjes	8716273cae	x86: Export srat physical topology This is the counterpart to "x86: export k8 physical topology" for SRAT. It is not as invasive because the acpi code already seperates node setup into detection and registration steps, with the exception of registering e820 active regions in acpi_numa_memory_affinity_init(). This is now moved to acpi_scan_nodes() if NUMA emulation is disabled or deferred. acpi_numa_init() now returns a value which specifies whether an underlying SRAT was located. If so, that topology can be used by the emulation code to interleave emulated nodes over physical nodes or to register the nodes for ACPI. acpi_get_nodes() may now be used to export the srat physical topology of the machine for NUMA emulation. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Ankita Garg <ankita@in.ibm.com> Cc: Len Brown <len.brown@intel.com> LKML-Reference: <alpine.DEB.1.00.0909251518580.14754@chino.kir.corp.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-12 22:56:46 +02:00
David Rientjes	8ee2debce3	x86: Export k8 physical topology To eventually interleave emulated nodes over physical nodes, we need to know the physical topology of the machine without actually registering it. This does the k8 node setup in two parts: detection and registration. NUMA emulation can then used the physical topology detected to setup the address ranges of emulated nodes accordingly. If emulation isn't used, the k8 nodes are registered as normal. Two formals are added to the x86 NUMA setup functions: `acpi' and `k8'. These represent whether ACPI or K8 NUMA has been detected; both cannot be true at the same time. This specifies to the NUMA emulation code whether an underlying physical NUMA topology exists and which interface to use. This patch deals solely with separating the k8 setup path into Northbridge detection and registration steps and leaves the ACPI changes for a subsequent patch. The `acpi' formal is added here, however, to avoid touching all the header files again in the next patch. This approach also ensures emulated nodes will not span physical nodes so the true memory latency is not misrepresented. k8_get_nodes() may now be used to export the k8 physical topology of the machine for NUMA emulation. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Ankita Garg <ankita@in.ibm.com> Cc: Len Brown <len.brown@intel.com> LKML-Reference: <alpine.DEB.1.00.0909251518400.14754@chino.kir.corp.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-12 22:56:45 +02:00
Brian Gerst	ae24ffe5ec	x86, 64-bit: Move K8 B step iret fixup to fault entry asm Move the handling of truncated %rip from an iret fault to the fault entry path. This allows x86-64 to use the standard search_extable() function. Signed-off-by: Brian Gerst <brgerst@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jan Beulich <jbeulich@novell.com> LKML-Reference: <1255357103-5418-1-git-send-email-brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-12 18:29:46 +02:00
Borislav Petkov	fb2531953f	mce, edac: Use an atomic notifier for MCEs decoding Add an atomic notifier which ensures proper locking when conveying MCE info to EDAC for decoding. The actual notifier call overrides a default, negative priority notifier. Note: make sure we register the default decoder only once since mcheck_init() runs on each CPU. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20091003065752.GA8935@liondog.tnic> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-12 12:24:45 +02:00
H. Peter Anvin	a6f05a6a0a	x86-64: make compat_start_thread() match start_thread() For no real good reason, compat_start_thread() was embedded inline in <asm/elf.h> whereas the native start_thread() lives in process_.c. Move compat_start_thread() to process_64.c, remove gratuitious differences, and fix a few items which mostly look like bit rot. In particular, compat_start_thread() didn't do free_thread_xstate(), which means it was hanging on to the xstate store area even when it was not needed. It was also not setting old_rsp, but it looks like that generally shouldn't matter for a 32-bit process. Note: compat_start_thread has* to be a macro, since it is tested with start_thread_ia32() as the out of line function name. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>	2009-10-09 16:26:38 -07:00
Joerg Roedel	c5cca146aa	x86/amd-iommu: Workaround for erratum 63 There is an erratum for IOMMU hardware which documents undefined behavior when forwarding SMI requests from peripherals and the DTE of that peripheral has a sysmgt value of 01b. This problem caused weird IO_PAGE_FAULTS in my case. This patch implements the suggested workaround for that erratum into the AMD IOMMU driver. The erratum is documented with number 63. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-10-09 18:37:46 +02:00
Peter Zijlstra	f3834b9ef6	x86: Generate cmpxchg build failures Rework the x86 cmpxchg() implementation to generate build failures when used on improper types. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <1254771187.21044.22.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-09 15:57:00 +02:00
Stephane Eranian	04a705df47	perf_events: Check for filters on fixed counter events Intel fixed counters do not support all the filters possible with a generic counter. Thus, if a fixed counter event is passed but with certain filters set, then the fixed_mode_idx() function must fail and the event must be measured in a generic counter instead. Reject filters are: inv, edge, cnt-mask. Signed-off-by: Stephane Eranian <eranian@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1254840129-6198-2-git-send-email-eranian@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-09 15:56:10 +02:00
Linus Torvalds	624235c5b3	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, pci: Correct spelling in a comment x86: Simplify bound checks in the MTRR code x86: EDAC: carve out AMD MCE decoding logic initcalls: Add early_initcall() for modules x86: EDAC: MCE: Fix MCE decoding callback logic	2009-10-08 12:06:36 -07:00
Izik Eidus	3da0dd433d	KVM: add support for change_pte mmu notifiers this is needed for kvm if it want ksm to directly map pages into its shadow page tables. [marcelo: cast pfn assignment to u64] Signed-off-by: Izik Eidus <ieidus@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-10-04 17:04:53 +02:00
Christoph Lameter	30ed1a79f5	this_cpu: Implement X86 optimized this_cpu operations Basically the existing percpu ops can be used for this_cpu variants that allow operations also on dynamically allocated percpu data. However, we do not pass a reference to a percpu variable in. Instead a dynamically or statically allocated percpu variable is provided. Preempt, the non preempt and the irqsafe operations generate the same code. It will always be possible to have the requires per cpu atomicness in a single RMW instruction with segment override on x86. 64 bit this_cpu operations are not supported on 32 bit. Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2009-10-03 19:48:22 +09:00
Arjan van de Ven	63312b6a6f	x86: Add a Kconfig option to turn the copy_from_user warnings into errors For automated testing it is useful to have the option to turn the warnings on copy_from_user() etc checks into errors: In function ‘copy_from_user’, inlined from ‘fd_copyin’ at drivers/block/floppy.c:3080, inlined from ‘fd_ioctl’ at drivers/block/floppy.c:3503: linux/arch/x86/include/asm/uaccess_32.h:213: error: call to ‘copy_from_user_overflow’ declared with attribute error: copy_from_user buffer size is not provably correct Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <20091002075050.4e9f7641@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-02 19:01:42 +02:00
Ingo Molnar	f436f8bb73	x86: EDAC: MCE: Fix MCE decoding callback logic Make decoding of MCEs happen only on AMD hardware by registering a non-default callback only on CPU families which support it. While looking at the interaction of decode_mce() with the other MCE code i also noticed a few other things and made the following cleanups/fixes: - Fixed the mce_decode() weak alias - a weak alias is really not good here, it should be a proper callback. A weak alias will be overriden if a piece of code is built into the kernel - not good, obviously. - The patch initializes the callback on AMD family 10h and 11h. - Added the more correct fallback printk of: No support for human readable MCE decoding on this CPU type. Transcribe the message and run it through 'mcelog --ascii' to decode. On CPUs that dont have a decoder. - Made the surrounding code more readable. Note that the callback allows us to have a default fallback - without having to check the CPU versions during the printout itself. When an EDAC module registers itself, it can install the decode-print function. (there's no unregister needed as this is core code.) version -v2 by Borislav Petkov: - add K8 to the set of supported CPUs - always build in edac_mce_amd since we use an early_initcall now - fix checkpatch warnings Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <20091001141432.GA11410@aftab> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-02 15:42:18 +02:00
Samuel Thibault	392d814daf	x86: fix csum_ipv6_magic asm memory clobber Just like ip_fast_csum, the assembly snippet in csum_ipv6_magic needs a memory clobber, as it is only passed the address of the buffer, not a memory reference to the buffer itself. This caused failures in Hurd's pfinetv4 when we tried to compile it with gcc-4.3 (bogus checksums). Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: "David S. Miller" <davem@davemloft.net> Cc: Andi Kleen <andi@firstfloor.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-01 16:11:12 -07:00
Avi Kivity	7c68af6e32	core, x86: Add user return notifiers Add a general per-cpu notifier that is called whenever the kernel is about to return to userspace. The notifier uses a thread_info flag and existing checks, so there is no impact on user return or context switch fast paths. This will be used initially to speed up KVM task switching by lazily updating MSRs. Signed-off-by: Avi Kivity <avi@redhat.com> LKML-Reference: <1253342422-13811-1-git-send-email-avi@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-10-01 12:12:18 -07:00
Arjan van de Ven	4a31276930	x86: Turn the copy_from_user check into an (optional) compile time warning A previous patch added the buffer size check to copy_from_user(). One of the things learned from analyzing the result of the previous patch is that in general, gcc is really good at proving that the code contains sufficient security checks to not need to do a runtime check. But that for those cases where gcc could not prove this, there was a relatively high percentage of real security issues. This patch turns the case of "gcc cannot prove" into a compile time warning, as long as a sufficiently new gcc is in use that supports this. The objective is that these warnings will trigger developers checking new cases out before a security hole enters a linux kernel release. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: James Morris <jmorris@namei.org> Cc: Jan Beulich <jbeulich@novell.com> LKML-Reference: <20090930130523.348ae6c4@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-01 11:31:04 +02:00
Samuel Thibault	d1716a60a8	x86: Fix csum_ipv6_magic asm memory clobber Just like ip_fast_csum, the assembly snippet in csum_ipv6_magic needs a memory clobber, as it is only passed the address of the buffer, not a memory reference to the buffer itself. This caused failures in Hurd's pfinetv4 when we tried to compile it with gcc-4.3 (bogus checksums). Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: Andi Kleen <andi@firstfloor.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-01 08:11:26 +02:00
Arjan van de Ven	79e1dd05d1	x86: Provide an alternative() based cmpxchg64() cmpxchg64() today generates, to quote Linus, "barf bag" code. cmpxchg64() is about to get used in the scheduler to fix a bug there, but it's a prerequisite that cmpxchg64() first be made non-sucking. This patch turns cmpxchg64() into an efficient implementation that uses the alternative() mechanism to just use the raw instruction on all modern systems. Note: the fallback is NOT smp safe, just like the current fallback is not SMP safe. (Interested parties with i486 based SMP systems are welcome to submit fix patches for that.) Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> [ fixed asm constraint bug ] Fixed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090930170754.0886ff2e@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-30 22:55:59 +02:00
Arjan van de Ven	ff60fab71b	x86: Use __builtin_memset and __builtin_memcpy for memset/memcpy GCC provides reasonable memset/memcpy functions itself, with __builtin_memset and __builtin_memcpy. For the "unknown" cases, it'll fall back to our current existing functions, but for fixed size versions it'll inline something smart. Quite often that will be the same as we have now, but sometimes it can do something smarter (for example, if the code then sets the first member of a struct, it can do a shorter memset). In addition, and this is more important, gcc knows which registers and such are not clobbered (while for our asm version it pretty much acts like a compiler barrier), so for various cases it can avoid reloading values. The effect on codesize is shown below on my typical laptop .config: text data bss dec hex filename 5605675 2041100 6525148 14171923 d83f13 vmlinux.before 5595849 2041668 6525148 14162665 d81ae9 vmlinux.after Due to some not-so-good behavior in the gcc 3.x series, this change is only done for GCC 4.x and above. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> LKML-Reference: <20090928142122.6fc57e9c@infradead.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-09-28 16:43:15 -07:00
Linus Torvalds	5bb241b325	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Remove redundant non-NUMA topology functions x86: early_printk: Protect against using the same device twice x86: Reduce verbosity of "PAT enabled" kernel message x86: Reduce verbosity of "TSC is reliable" message x86: mce: Use safer ways to access MCE registers x86: mce, inject: Use real inject-msg in raise_local x86: mce: Fix thermal throttling message storm x86: mce: Clean up thermal throttling state tracking code x86: split NX setup into separate file to limit unstack-protected code xen: check EFER for NX before setting up GDT mapping x86: Cleanup linker script using new linker script macros. x86: Use section .data.page_aligned for the idt_table. x86: convert to use __HEAD and HEAD_TEXT macros. x86: convert compressed loader to use __HEAD and HEAD_TEXT macros. x86: fix fragile computation of vsyscall address	2009-09-26 10:13:35 -07:00
Arjan van de Ven	9f0cf4adb6	x86: Use __builtin_object_size() to validate the buffer size for copy_from_user() gcc (4.x) supports the __builtin_object_size() builtin, which reports the size of an object that a pointer point to, when known at compile time. If the buffer size is not known at compile time, a constant -1 is returned. This patch uses this feature to add a sanity check to copy_from_user(); if the target buffer is known to be smaller than the copy size, the copy is aborted and a WARNing is emitted in memory debug mode. These extra checks compile away when the object size is not known, or if both the buffer size and the copy length are constants. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> LKML-Reference: <20090926143301.2c396b94@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-26 16:25:41 +02:00
Linus Torvalds	b7f21bb2e2	Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (21 commits) x86/PCI: make 32 bit NUMA node array int, not unsigned char x86/PCI: default pcibus cpumask to all cpus if it lacks affinity MAINTAINTERS: remove hotplug driver entries PCI: pciehp: remove slot capabilities definitions PCI: pciehp: remove error message definitions PCI: pciehp: remove number field PCI: pciehp: remove hpc_ops PCI: pciehp: remove pci_dev field PCI: pciehp: remove crit_sect mutex PCI: pciehp: remove slot_bus field PCI: pciehp: remove first_slot field PCI: pciehp: remove slot_device_offset field PCI: pciehp: remove hp_slot field PCI: pciehp: remove device field PCI: pciehp: remove bus field PCI: pciehp: remove slot_num_inc field PCI: pciehp: remove num_slots field PCI: pciehp: remove slot_list field PCI: fix VGA arbiter header file PCI: Disable AER with pci=nomsi ... Fixed up trivial conflicts in MAINTAINERS	2009-09-24 09:57:08 -07:00
Alexey Dobriyan	8d65af789f	sysctl: remove "struct file *" argument of ->proc_handler It's unused. It isn't needed -- read or write flag is already passed and sysctl shouldn't care about the rest. It _was_ used in two places at arch/frv for some reason. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "David S. Miller" <davem@davemloft.net> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-24 07:21:04 -07:00
Rusty Russell	b0c6fbe458	x86: Remove redundant non-NUMA topology functions arch/x86/include/asm/topology.h declares inline fns cpu_to_node and cpumask_of_node for !NUMA, even though they are then declared as macros by asm-generic/topology.h, which is #included just below. The macros (which are the same) end up being used; these functions are just confusing. Noticed-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: "Greg Kroah-Hartman" <gregkh@suse.de> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Tejun Heo <tj@kernel.org> LKML-Reference: <200909241748.45629.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-24 14:16:15 +02:00
Ingo Molnar	d2ff6de537	Merge branch 'linus' into x86/urgent Merge reason: Queueing up dependent early-printk fix. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-24 12:59:18 +02:00
Linus Torvalds	94a8d5caba	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (39 commits) cpumask: Move deprecated functions to end of header. cpumask: remove unused deprecated functions, avoid accusations of insanity cpumask: use new-style cpumask ops in mm/quicklist. cpumask: use mm_cpumask() wrapper: x86 cpumask: use mm_cpumask() wrapper: um cpumask: use mm_cpumask() wrapper: mips cpumask: use mm_cpumask() wrapper: mn10300 cpumask: use mm_cpumask() wrapper: m32r cpumask: use mm_cpumask() wrapper: arm cpumask: Use accessors for cpu__mask: um cpumask: Use accessors for cpu__mask: powerpc cpumask: Use accessors for cpu__mask: mips cpumask: Use accessors for cpu__mask: m32r cpumask: remove arch_send_call_function_ipi cpumask: arch_send_call_function_ipi_mask: s390 cpumask: arch_send_call_function_ipi_mask: powerpc cpumask: arch_send_call_function_ipi_mask: mips cpumask: arch_send_call_function_ipi_mask: m32r cpumask: arch_send_call_function_ipi_mask: alpha cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: ia64 ...	2009-09-23 18:14:11 -07:00
Rusty Russell	78f1c4d6b0	cpumask: use mm_cpumask() wrapper: x86 Makes code futureproof against the impending change to mm->cpu_vm_mask (to be a pointer). It's also a chance to use the new cpumask_ ops which take a pointer (the older ones are deprecated, but there's no hurry for arch code). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-09-24 09:34:52 +09:30
Rusty Russell	0748bd0177	cpumask: remove arch_send_call_function_ipi Now everyone is converted to arch_send_call_function_ipi_mask, remove the shim and the #defines. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-09-24 09:34:47 +09:30
Linus Torvalds	c37efa9325	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next * git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next: (30 commits) Use macros for .data.page_aligned section. Use macros for .bss.page_aligned section. Use new __init_task_data macro in arch init_task.c files. kbuild: Don't define ALIGN and ENTRY when preprocessing linker scripts. arm, cris, mips, sparc, powerpc, um, xtensa: fix build with bash 4.0 kbuild: add static to prototypes kbuild: fail build if recordmcount.pl fails kbuild: set -fconserve-stack option for gcc 4.5 kbuild: echo the record_mcount command gconfig: disable "typeahead find" search in treeviews kbuild: fix cc1 options check to ensure we do not use -fPIC when compiling checkincludes.pl: add option to remove duplicates in place markup_oops: use modinfo to avoid confusion with underscored module names checkincludes.pl: provide usage helper checkincludes.pl: close file as soon as we're done with it ctags: usability fix kernel hacking: move STRIP_ASM_SYMS from General gitignore usr/initramfs_data.cpio.bz2 and usr/initramfs_data.cpio.lzma kbuild: Check if linker supports the -X option kbuild: introduce ld-option ... Fix trivial conflict in scripts/basic/fixdep.c	2009-09-23 15:37:02 -07:00
Frederic Weisbecker	d7a4b414ee	Merge commit 'linus/master' into tracing/kprobes Conflicts: kernel/trace/Makefile kernel/trace/trace.h kernel/trace/trace_event_types.h kernel/trace/trace_export.c Merge reason: Sync with latest significant tracing core changes.	2009-09-23 23:08:43 +02:00
Linus Torvalds	c11f6c8258	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (119 commits) ACPI: don't pass handle for fixed hardware notifications ACPI: remove null pointer checks in deferred execution path ACPI: simplify deferred execution path acerhdf: additional BIOS versions acerhdf: convert to dev_pm_ops acerhdf: fix fan control for AOA150 model thermal: add missing Kconfig dependency acpi: switch /proc/acpi/{debug_layer,debug_level} to seq_file hp-wmi: fix rfkill memory leak on unload ACPI: remove unnecessary #ifdef CONFIG_DMI ACPI: linux/acpi.h should not include linux/dmi.h hwmon driver for ACPI 4.0 power meters topstar-laptop: add new driver for hotkeys support on Topstar N01 thinkpad_acpi: fix rfkill memory leak on unload thinkpad-acpi: report brightness events when required thinkpad-acpi: don't poll by default any of the reserved hotkeys thinkpad-acpi: Fix procfs hotkey reset command thinkpad-acpi: deprecate hotkey_bios_mask thinkpad-acpi: hotkey poll fixes thinkpad-acpi: be more strict when detecting a ThinkPad ...	2009-09-23 09:32:11 -07:00
Ingo Molnar	14c93e8eba	Merge branch 'bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen into x86/urgent	2009-09-23 14:35:10 +02:00
Roland McGrath	18c1e2c80d	x86: syscall_get_nr returns int Make syscall_get_nr() return int, so we always sign-extend the low 32 bits of orig_ax in checks. Signed-off-by: Roland McGrath <roland@redhat.com>	2009-09-22 19:57:51 -07:00
Jeremy Fitzhardinge	c44c9ec0f3	x86: split NX setup into separate file to limit unstack-protected code Move the NX setup into a separate file so that it can be compiled without stack-protection while leaving the rest of the mm/init code protected. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-09-21 13:56:58 -07:00
Linus Torvalds	43c1266ce4	Merge branch 'perfcounters-rename-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-rename-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf: Tidy up after the big rename perf: Do the big rename: Performance Counters -> Performance Events perf_counter: Rename 'event' to event_id/hw_event perf_counter: Rename list_entry -> group_entry, counter_list -> group_list Manually resolved some fairly trivial conflicts with the tracing tree in include/trace/ftrace.h and kernel/trace/trace_syscalls.c.	2009-09-21 09:15:07 -07:00
Ingo Molnar	cdd6c482c9	perf: Do the big rename: Performance Counters -> Performance Events Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f \| grep -vE 'oprofile\|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N \| sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: Stephane Eranian <eranian@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-21 14:28:04 +02:00
Tim Abbott	abe1ee3a22	Use macros for .data.page_aligned section. This patch changes the remaining direct references to .data.page_aligned in C and assembly code to use the macros in include/linux/linkage.h. Signed-off-by: Tim Abbott <tabbott@ksplice.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2009-09-21 06:27:08 +02:00
Ingo Molnar	bfefb7a0c6	Merge branch 'linus' into x86/urgent Merge reason: Bring in changes that the next patch will depend on. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-20 20:25:03 +02:00
Sergey Senozhatsky	4fe487828b	x86: Fix uaccess_32.h typo Trivial: correct "that the we don't" typo. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> LKML-Reference: <20090917125401.GU3717@localdomain.by> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-20 20:19:34 +02:00
Felipe Contreras	878f4f533e	x86: Trivial whitespace cleanups Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Cc: Vegard Nossum <vegardno@ifi.uio.no> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Alok N Kataria <akataria@vmware.com> Cc: "Tan Wei Chong" <wei.chong.tan@intel.com> Cc: Len Brown <len.brown@intel.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Bob Moore <robert.moore@intel.com> LKML-Reference: <1253137123-18047-2-git-send-email-felipe.contreras@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-20 20:18:57 +02:00
Cyrill Gorcunov	8312136fa8	x86, apic: Fix missed handling of discrete apics In case of discrete (pretty old) apics we may have cpu_has_apic bit not set but have to check if smp_found_config (MP spec) is there and apic was not disabled. Also don't forget to print apic/io-apic for such case as well. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: "Maciej W. Rozycki" <macro@linux-mips.org> Cc: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <20090915071230.GA10604@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-20 20:18:07 +02:00
Len Brown	003d6a38ce	Merge branch 'sfi-base' into release Conflicts: drivers/acpi/power.c Signed-off-by: Len Brown <len.brown@intel.com>	2009-09-19 00:37:13 -04:00
Linus Torvalds	78f28b7c55	Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (38 commits) x86: Move get/set_wallclock to x86_platform_ops x86: platform: Fix section annotations x86: apic namespace cleanup x86: Distangle ioapic and i8259 x86: Add Moorestown early detection x86: Add hardware_subarch ID for Moorestown x86: Add early platform detection x86: Move tsc_init to late_time_init x86: Move tsc_calibration to x86_init_ops x86: Replace the now identical time_32/64.c by time.c x86: time_32/64.c unify profile_pc x86: Move calibrate_cpu to tsc.c x86: Make timer setup and global variables the same in time_32/64.c x86: Remove mca bus ifdef from timer interrupt x86: Simplify timer_ack magic in time_32.c x86: Prepare unification of time_32/64.c x86: Remove do_timer hook x86: Add timer_init to x86_init_ops x86: Move percpu clockevents setup to x86_init_ops x86: Move xen_post_allocator_init into xen_pagetable_setup_done ... Fix up conflicts in arch/x86/include/asm/io_apic.h	2009-09-18 14:05:47 -07:00
Linus Torvalds	a03fdb7612	Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (34 commits) time: Prevent 32 bit overflow with set_normalized_timespec() clocksource: Delay clocksource down rating to late boot clocksource: clocksource_select must be called with mutex locked clocksource: Resolve cpu hotplug dead lock with TSC unstable, fix crash timers: Drop a function prototype clocksource: Resolve cpu hotplug dead lock with TSC unstable timer.c: Fix S/390 comments timekeeping: Fix invalid getboottime() value timekeeping: Fix up read_persistent_clock() breakage on sh timekeeping: Increase granularity of read_persistent_clock(), build fix time: Introduce CLOCK_REALTIME_COARSE x86: Do not unregister PIT clocksource on PIT oneshot setup/shutdown clocksource: Avoid clocksource watchdog circular locking dependency clocksource: Protect the watchdog rating changes with clocksource_mutex clocksource: Call clocksource_change_rating() outside of watchdog_lock timekeeping: Introduce read_boot_clock timekeeping: Increase granularity of read_persistent_clock() timekeeping: Update clocksource with stop_machine timekeeping: Add timekeeper read_clock helper functions timekeeping: Move NTP adjusted clock multiplier to struct timekeeper ... Fix trivial conflict due to MIPS lemote -> loongson renaming.	2009-09-18 09:15:24 -07:00
David Rientjes	7715a1e887	x86/PCI: default pcibus cpumask to all cpus if it lacks affinity The early initialization of the pci bus to node mapping leaves all busses with a node id of -1 if it lacks memory affinity. Thus, cpumask_of_pcibus must return all online cpus for such busses. Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-09-18 08:51:10 -07:00
Jack Steiner	8dc579e868	x86: SGI UV: Add volatile semantics to macros that access chipset registers Add volatile-semantics to the SGI UV read/write macros that are used to access chipset memory mapped registers. No direct references to volatile are made. Instead the readq/writeq macros are used. Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: linux-mm@kvack.org Cc: dwalker@fifo99.com Cc: cfriesen@nortel.com LKML-Reference: <20090910143149.GA14273@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-18 14:05:32 +02:00
Jack Steiner	d2374aecda	x86: SGI UV: Fix IPI macros The UV BIOS has changed the way interrupt remapping is being done. This affects the id used for sending IPIs. The upper id bits no longer need to be masked off. Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: <stable@kernel.org> LKML-Reference: <20090909154104.GA25083@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-18 14:04:25 +02:00
Linus Torvalds	df58bee21e	Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (21 commits) x86, mce: Fix compilation with !CONFIG_DEBUG_FS in mce-severity.c x86, mce: CE in last bank prevents panic by unknown MCE x86, mce: Fake panic support for MCE testing x86, mce: Move debugfs mce dir creating to mce.c x86, mce: Support specifying raise mode for software MCE injection x86, mce: Support specifying context for software mce injection x86, mce: fix reporting of Thermal Monitoring mechanism enabled x86, mce: remove never executed code x86, mce: add missing __cpuinit tags x86, mce: fix "mce" boot option handling for CONFIG_X86_NEW_MCE x86, mce: don't log boot MCEs on Pentium M (model == 13) CPUs x86: mce: Lower maximum number of banks to architecture limit x86: mce: macros to compute banks MSRs x86: mce: Move per bank data in a single datastructure x86: mce: Move code in mce.c x86: mce: Rename CONFIG_X86_NEW_MCE to CONFIG_X86_MCE x86: mce: Remove old i386 machine check code x86: mce: Update X86_MCE description in x86/Kconfig x86: mce: Make CONFIG_X86_ANCIENT_MCE dependent on CONFIG_X86_MCE x86, mce: use atomic_inc_return() instead of add by 1 ... Manually fixed up trivial conflicts: Documentation/feature-removal-schedule.txt arch/x86/kernel/cpu/mcheck/mce.c	2009-09-17 21:07:08 -07:00
Linus Torvalds	dcbf77b9e8	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (37 commits) sched: Fix SD_POWERSAVING_BALANCE\|SD_PREFER_LOCAL vs SD_WAKE_AFFINE sched: Stop buddies from hogging the system sched: Add new wakeup preemption mode: WAKEUP_RUNNING sched: Fix TASK_WAKING & loadaverage breakage sched: Disable wakeup balancing sched: Rename flags to wake_flags sched: Clean up the load_idx selection in select_task_rq_fair sched: Optimize cgroup vs wakeup a bit sched: x86: Name old_perf in a unique way sched: Implement a gentler fair-sleepers feature sched: Add SD_PREFER_LOCAL sched: Add a few SYNC hint knobs to play with sched: Fix sync wakeups again sched: Add WF_FORK sched: Rename sync arguments sched: Rename select_task_rq() argument sched: Feature to disable APERF/MPERF cpu_power x86: sched: Provide arch implementations using aperf/mperf x86: Add generic aperf/mperf code x86: Move APERF/MPERF into a X86_FEATURE ... Fix up trivial conflict in arch/x86/include/asm/processor.h due to nearby addition of amd_get_nb_id() declaration from the EDAC merge.	2009-09-17 21:00:02 -07:00
Linus Torvalds	ca043a66ae	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, pat: don't use rb-tree based lookup in reserve_memtype() x86: Increase MIN_GAP to include randomized stack	2009-09-17 20:58:11 -07:00
Linus Torvalds	1218259b2d	Merge branch 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (44 commits) vsnprintf: remove duplicate comment of vsnprintf softirq: add BLOCK_IOPOLL to softirq_to_name oprofile: fix oprofile regression: select RING_BUFFER_ALLOW_SWAP tracing: switch function prints from %pf to %ps vsprintf: add %ps that is the same as %pS but is like %pf tracing: Fix minor bugs for __unregister_ftrace_function_probe tracing: remove notrace from __kprobes annotation tracing: optimize global_trace_clock cachelines MAINTAINERS: Update tracing tree details ftrace: document function and function graph implementation tracing: make testing syscall events a separate configuration tracing: remove some unused macros ftrace: add compile-time check on F_printk() tracing: fix F_printk() typos tracing: have TRACE_EVENT macro use __flags to not shadow parameter tracing: add static to generated TRACE_EVENT functions ring-buffer: typecast cmpxchg to fix PowerPC warning tracing: add filter event logic to special, mmiotrace and boot tracers tracing: remove trace_event_types.h tracing: use the new trace_entries.h to create format files ...	2009-09-17 20:56:37 -07:00
H. Peter Anvin	3bb045f1e2	Merge branch 'x86/pat' into x86/urgent Merge reason: Suresh Siddha (1): x86, pat: don't use rb-tree based lookup in reserve_memtype() ... requires previous x86/pat commits already pushed to Linus. Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-09-17 14:40:49 -07:00
Ingo Molnar	45bd00d31d	Merge branch 'linus' into tracing/core Merge reason: Pick up kernel/softirq.c update for dependent fix. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-17 20:53:10 +02:00
Linus Torvalds	de55a8958f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: amd64_edac: check NB MCE bank enable on the current node properly amd64_edac: Rewrite unganged mode code of f10_early_channel_count amd64_edac: cleanup amd64_check_ecc_enabled x86, EDAC: Provide function to return NodeId of a CPU amd64_edac: build driver only on AMD hardware	2009-09-17 09:55:52 -07:00
Linus Torvalds	4406c56d0a	Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (75 commits) PCI hotplug: clean up acpi_run_hpp() PCI hotplug: acpiphp: use generic pci_configure_slot() PCI hotplug: shpchp: use generic pci_configure_slot() PCI hotplug: pciehp: use generic pci_configure_slot() PCI hotplug: add pci_configure_slot() PCI hotplug: clean up acpi_get_hp_params_from_firmware() interface PCI hotplug: acpiphp: don't cache hotplug_params in acpiphp_bridge PCI hotplug: acpiphp: remove superfluous _HPP/_HPX evaluation PCI: Clear saved_state after the state has been restored PCI PM: Return error codes from pci_pm_resume() PCI: use dev_printk in quirk messages PCI / PCIe portdrv: Fix pcie_portdrv_slot_reset() PCI Hotplug: convert acpi_pci_detect_ejectable() to take an acpi_handle PCI Hotplug: acpiphp: find bridges the easy way PCI: pcie portdrv: remove unused variable PCI / ACPI PM: Propagate wake-up enable for devices w/o ACPI support ACPI PM: Replace wakeup.prepared with reference counter PCI PM: Introduce device flag wakeup_prepared PCI / ACPI PM: Rework some debug messages PCI PM: Simplify PCI wake-up code ... Fixed up conflict in arch/powerpc/kernel/pci_64.c due to OF device tree scanning having been moved and merged for the 32- and 64-bit cases. The 'needs_freset' initialization added in `6e19314cc` ("PCI/powerpc: support PCIe fundamental reset") is now in arch/powerpc/kernel/pci_of_scan.c.	2009-09-16 07:49:54 -07:00
Peter Zijlstra	182a85f8a1	sched: Disable wakeup balancing Sysbench thinks SD_BALANCE_WAKE is too agressive and kbuild doesn't really mind too much, SD_BALANCE_NEWIDLE picks up most of the slack. On a dual socket, quad core, dual thread nehalem system: sysbench (--num_threads=16): SD_BALANCE_WAKE-: 13982 tx/s SD_BALANCE_WAKE+: 15688 tx/s kbuild (-j16): SD_BALANCE_WAKE-: 47.648295846 seconds time elapsed ( +- 0.312% ) SD_BALANCE_WAKE+: 47.608607360 seconds time elapsed ( +- 0.026% ) (same within noise) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-16 16:44:33 +02:00
Feng Tang	7bd867dfb4	x86: Move get/set_wallclock to x86_platform_ops get/set_wallclock() have already a set of platform dependent implementations (default, EFI, paravirt). MRST will add another variant. Moving them to platform ops simplifies the existing code and minimizes the effort to integrate new variants. Signed-off-by: Feng Tang <feng.tang@intel.com> LKML-Reference: <new-submission> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-09-16 14:34:50 +02:00
Andreas Herrmann	6a8126911a	x86, EDAC: Provide function to return NodeId of a CPU Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: H. Peter Anvin <hpa@zytor.com>	2009-09-16 11:33:40 +02:00
Linus Torvalds	ada3fa1505	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (46 commits) powerpc64: convert to dynamic percpu allocator sparc64: use embedding percpu first chunk allocator percpu: kill lpage first chunk allocator x86,percpu: use embedding for 64bit NUMA and page for 32bit NUMA percpu: update embedding first chunk allocator to handle sparse units percpu: use group information to allocate vmap areas sparsely vmalloc: implement pcpu_get_vm_areas() vmalloc: separate out insert_vmalloc_vm() percpu: add chunk->base_addr percpu: add pcpu_unit_offsets[] percpu: introduce pcpu_alloc_info and pcpu_group_info percpu: move pcpu_lpage_build_unit_map() and pcpul_lpage_dump_cfg() upward percpu: add @align to pcpu_fc_alloc_fn_t percpu: make @dyn_size mandatory for pcpu_setup_first_chunk() percpu: drop @static_size from first chunk allocators percpu: generalize first chunk allocator selection percpu: build first chunk allocators selectively percpu: rename 4k first chunk allocator to page percpu: improve boot messages percpu: fix pcpu_reclaim() locking ... Fix trivial conflict as by Tejun Heo in kernel/sched.c	2009-09-15 09:39:44 -07:00
Linus Torvalds	227423904c	Merge branch 'x86-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, pat: Fix cacheflush address in change_page_attr_set_clr() mm: remove !NUMA condition from PAGEFLAGS_EXTENDED condition set x86: Fix earlyprintk=dbgp for machines without NX x86, pat: Sanity check remap_pfn_range for RAM region x86, pat: Lookup the protection from memtype list on vm_insert_pfn() x86, pat: Add lookup_memtype to get the current memtype of a paddr x86, pat: Use page flags to track memtypes of RAM pages x86, pat: Generalize the use of page flag PG_uncached x86, pat: Add rbtree to do quick lookup in memtype tracking x86, pat: Add PAT reserve free to io_mapping* APIs x86, pat: New i/f for driver to request memtype for IO regions x86, pat: ioremap to follow same PAT restrictions as other PAT users x86, pat: Keep identity maps consistent with mmaps even when pat_disabled x86, mtrr: make mtrr_aps_delayed_init static bool x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT init generic-ipi: Allow cpus not yet online to call smp_call_function with irqs disabled x86: Fix an incorrect argument of reserve_bootmem() x86: Fix system crash when loading with "reservetop" parameter	2009-09-15 09:19:38 -07:00
Linus Torvalds	1aaf2e5913	Merge branch 'x86-txt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-txt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, intel_txt: clean up the impact on generic code, unbreak non-x86 x86, intel_txt: Handle ACPI_SLEEP without X86_TRAMPOLINE x86, intel_txt: Fix typos in Kconfig help x86, intel_txt: Factor out the code for S3 setup x86, intel_txt: tboot.c needs <asm/fixmap.h> intel_txt: Force IOMMU on for Intel TXT launch x86, intel_txt: Intel TXT Sx shutdown support x86, intel_txt: Intel TXT reboot/halt shutdown support x86, intel_txt: Intel TXT boot support	2009-09-15 09:19:20 -07:00
Linus Torvalds	66a4fe0cb8	Merge branch 'agp-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6 * 'agp-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6: agp/intel: remove restore in resume agp: fix uninorth build intel-agp: Set dma mask for i915 agp: kill phys_to_gart() and gart_to_phys() intel-agp: fix sglist allocation to avoid vmalloc() intel-agp: Move repeated sglist free into separate function agp: Switch agp_{un,}map_page() to take struct page * argument agp: tidy up handling of scratch pages w.r.t. DMA API intel_agp: Use PCI DMA API correctly on chipsets new enough to have IOMMU agp: Add generic support for graphics dma remapping agp: Switch mask_memory() method to take address argument again, not page	2009-09-15 09:18:07 -07:00
Peter Zijlstra	5cbc19a983	x86: Add generic aperf/mperf code Move some of the aperf/mperf code out from the cpufreq driver thingy so that other people can enjoy it too. Cc: H. Peter Anvin <hpa@zytor.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Dave Jones <davej@redhat.com> Cc: Len Brown <len.brown@intel.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: cpufreq@vger.kernel.org Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-15 16:51:26 +02:00
Peter Zijlstra	a8303aaf2b	x86: Move APERF/MPERF into a X86_FEATURE Move the APERFMPERF capacility into a X86_FEATURE flag so that it can be used outside of the acpi cpufreq driver. Cc: H. Peter Anvin <hpa@zytor.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Dave Jones <davej@redhat.com> Cc: Len Brown <len.brown@intel.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: cpufreq@vger.kernel.org Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-15 16:51:25 +02:00
Peter Zijlstra	b8a543ea5a	sched: Reduce forkexec_idx If we're looking to place a new task, we might as well find the idlest position _now_, not 1 tick ago. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-15 16:51:23 +02:00
Mike Galbraith	0ec9fab3d1	sched: Improve latencies and throughput Make the idle balancer more agressive, to improve a x264 encoding workload provided by Jason Garrett-Glaser: NEXT_BUDDY NO_LB_BIAS encoded 600 frames, 252.82 fps, 22096.60 kb/s encoded 600 frames, 250.69 fps, 22096.60 kb/s encoded 600 frames, 245.76 fps, 22096.60 kb/s NO_NEXT_BUDDY LB_BIAS encoded 600 frames, 344.44 fps, 22096.60 kb/s encoded 600 frames, 346.66 fps, 22096.60 kb/s encoded 600 frames, 352.59 fps, 22096.60 kb/s NO_NEXT_BUDDY NO_LB_BIAS encoded 600 frames, 425.75 fps, 22096.60 kb/s encoded 600 frames, 425.45 fps, 22096.60 kb/s encoded 600 frames, 422.49 fps, 22096.60 kb/s Peter pointed out that this is better done via newidle_idx, not via LB_BIAS, newidle balancing should look for where there is load _now_, not where there was load 2 ticks ago. Worst-case latencies are improved as well as no buddies means less vruntime spread. (as per prior lkml discussions) This change improves kbuild-peak parallelism as well. Reported-by: Jason Garrett-Glaser <darkshikari@gmail.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1253011667.9128.16.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-15 16:51:16 +02:00
Peter Zijlstra	78e7ed53c9	sched: Tweak wake_idx When merging select_task_rq_fair() and sched_balance_self() we lost the use of wake_idx, restore that and set them to 0 to make wake balancing more aggressive. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-15 16:01:07 +02:00
Peter Zijlstra	c88d591089	sched: Merge select_task_rq_fair() and sched_balance_self() The problem with wake_idle() is that is doesn't respect things like cpu_power, which means it doesn't deal well with SMT nor the recent RT interaction. To cure this, it needs to do what sched_balance_self() does, which leads to the possibility of merging select_task_rq_fair() and sched_balance_self(). Modify sched_balance_self() to: - update_shares() when walking up the domain tree, (it only called it for the top domain, but it should have done this anyway), which allows us to remove this ugly bit from try_to_wake_up(). - do wake_affine() on the smallest domain that contains both this (the waking) and the prev (the wakee) cpu for WAKE invocations. Then use the top-down balance steps it had to replace wake_idle(). This leads to the dissapearance of SD_WAKE_BALANCE and SD_WAKE_IDLE_FAR, with SD_WAKE_IDLE replaced with SD_BALANCE_WAKE. SD_WAKE_AFFINE needs SD_BALANCE_WAKE to be effective. Touch all topology bits to replace the old with new SD flags -- platforms might need re-tuning, enabling SD_BALANCE_WAKE conditionally on a NUMA distance seems like a good additional feature, magny-core and small nehalem systems would want this enabled, systems with slow interconnects would not. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-15 16:01:05 +02:00
Ingo Molnar	dca2d6ac09	Merge branch 'linus' into tracing/hw-breakpoints Conflicts: arch/x86/kernel/process_64.c Semantic conflict fixed in: arch/x86/kvm/x86.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-15 12:18:15 +02:00
Borislav Petkov	b8a4754147	x86, msr: Unify rdmsr_on_cpus/wrmsr_on_cpus Since rdmsr_on_cpus and wrmsr_on_cpus are almost identical, unify them into a common __rwmsr_on_cpus helper thus avoiding code duplication. While at it, convert cpumask_t's to const struct cpumask *. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-15 09:15:54 +02:00
Linus Torvalds	f86054c245	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: (23 commits) at_hdmac: Rework suspend_late()/resume_early() PM: Reset transition_started at dpm_resume_noirq PM: Update kerneldoc comments in drivers/base/power/main.c PM: Add convenience macro to make switching to dev_pm_ops less error-prone hp-wmi: Switch driver to dev_pm_ops floppy: Switch driver to dev_pm_ops PM: Trivial fixes PM / Hibernate / Memory hotplug: Always use for_each_populated_zone() PM/Hibernate: Do not try to allocate too much memory too hard (rev. 2) PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2) PM/Hibernate: Rework shrinking of memory PM: Fix typo in label name s/Platofrm_finish/Platform_finish/ PM: Run-time PM platform device bus support PM: Introduce core framework for run-time PM of I/O devices (rev. 17) Driver Core: Make PM operations a const pointer PM: Remove platform device suspend_late()/resume_early() V2 USB: Rework musb suspend()/resume_early() I2C: Rework i2c-s3c2410 suspend_late()/resume() V2 I2C: Rework i2c-pxa suspend_late()/resume_early() DMA: Rework txx9dmac suspend_late()/resume_early() ... Fix trivial conflict in drivers/base/platform.c (due to same constification patch being merged in both sides, along with some other PM work in the PM branch)	2009-09-14 20:03:54 -07:00
Linus Torvalds	69def9f05d	Merge branch 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (202 commits) MAINTAINERS: update KVM entry KVM: correct error-handling code KVM: fix compile warnings on s390 KVM: VMX: Check cpl before emulating debug register access KVM: fix misreporting of coalesced interrupts by kvm tracer KVM: x86: drop duplicate kvm_flush_remote_tlb calls KVM: VMX: call vmx_load_host_state() only if msr is cached KVM: VMX: Conditionally reload debug register 6 KVM: Use thread debug register storage instead of kvm specific data KVM guest: do not batch pte updates from interrupt context KVM: Fix coalesced interrupt reporting in IOAPIC KVM guest: fix bogus wallclock physical address calculation KVM: VMX: Fix cr8 exiting control clobbering by EPT KVM: Optimize kvm_mmu_unprotect_page_virt() for tdp KVM: Document KVM_CAP_IRQCHIP KVM: Protect update_cr8_intercept() when running without an apic KVM: VMX: Fix EPT with WP bit change during paging KVM: Use kvm_{read,write}_guest_virt() to read and write segment descriptors KVM: x86 emulator: Add adc and sbb missing decoder flags KVM: Add missing #include ...	2009-09-14 17:43:43 -07:00
Rafael J. Wysocki	ac8d513a68	Merge branch 'master' into for-linus	2009-09-14 20:26:05 +02:00
Linus Torvalds	55e0715f61	Merge branch 'x86-percpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-percpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, percpu: Collect hot percpu variables into one cacheline x86, percpu: Fix DECLARE/DEFINE_PER_CPU_PAGE_ALIGNED() x86, percpu: Add 'percpu_read_stable()' interface for cacheable accesses	2009-09-14 08:01:28 -07:00
Linus Torvalds	7dfd54a905	Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, highmem_32.c: Clean up comment x86, pgtable.h: Clean up types x86: Clean up dump_pagetable()	2009-09-14 07:59:32 -07:00
Linus Torvalds	625037cc40	Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-64: move clts into batch cpu state updates when preloading fpu x86-64: move unlazy_fpu() into lazy cpu state part of context switch x86-32: make sure clts is batched during context switch x86: split out core __math_state_restore	2009-09-14 07:58:08 -07:00
Linus Torvalds	c7208de304	Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits) x86: Fix code patching for paravirt-alternatives on 486 x86, msr: change msr-reg.o to obj-y, and export its symbols x86: Use hard_smp_processor_id() to get apic id for AMD K8 cpus x86, sched: Workaround broken sched domain creation for AMD Magny-Cours x86, mcheck: Use correct cpumask for shared bank4 x86, cacheinfo: Fixup L3 cache information for AMD multi-node processors x86: Fix CPU llc_shared_map information for AMD Magny-Cours x86, msr: Fix msr-reg.S compilation with gas 2.16.1, on 32-bit too x86: Move kernel_fpu_using to irq_fpu_usable in asm/i387.h x86, msr: fix msr-reg.S compilation with gas 2.16.1 x86, msr: Export the register-setting MSR functions via /dev/*/msr x86, msr: Create _on_cpu helpers for {rw,wr}msr_safe_regs() x86, msr: Have the _safe MSR functions return -EIO, not -EFAULT x86, msr: CFI annotations, cleanups for msr-reg.S x86, asm: Make _ASM_EXTABLE() usable from assembly code x86, asm: Add 32-bit versions of the combined CFI macros x86, AMD: Disable wrongly set X86_FEATURE_LAHF_LM CPUID bit x86, msr: Rewrite AMD rd/wrmsr variants x86, msr: Add rd/wrmsr interfaces with preset registers x86: add specific support for Intel Atom architecture ...	2009-09-14 07:57:32 -07:00
Linus Torvalds	9b74aec028	Merge branch 'x86-asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: remove all now-duplicate header files x86: convert termios.h to the asm-generic version x86: convert almost generic headers to asm-generic version x86: convert trivial headers to asm-generic version x86: add copies of some headers to convert to asm-generic	2009-09-14 07:55:37 -07:00
Linus Torvalds	b581af5110	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86/i386: Put aligned stack-canary in percpu shared_aligned section x86/i386: Make sure stack-protector segment base is cache aligned x86: Detect stack protector for i386 builds on x86_64 x86: allow "=rm" in native_save_fl() x86: properly annotate alternatives.c x86: Introduce GDT_ENTRY_INIT(), initialize bad_bios_desc statically x86, 32-bit: Use generic sys_pipe() x86: Introduce GDT_ENTRY_INIT(), fix APM x86: Introduce GDT_ENTRY_INIT() x86: Introduce set_desc_base() and set_desc_limit() x86: Remove unused patch_espfix_desc() x86: Use get_desc_base()	2009-09-14 07:53:49 -07:00
Linus Torvalds	ffaf854b01	Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (24 commits) ACPI, x86: expose some IO-APIC routines when CONFIG_ACPI=n x86, apic: Slim down stack usage in early_init_lapic_mapping() x86, ioapic: Get rid of needless check and simplify ioapic_setup_resources() x86, ioapic: Define IO_APIC_DEFAULT_PHYS_BASE constant x86: Fix x86_model test in es7000_apic_is_cluster() x86, apic: Move dmar_table_init() out of enable_IR() x86, ioapic: Panic on irq-pin binding only if needed x86/apic: Enable x2APIC without interrupt remapping under KVM x86, apic: Drop redundant bit assignment x86, ioapic: Throw BUG instead of NULL dereference x86, ioapic: Introduce for_each_irq_pin() helper x86: Remove superfluous NULL pointer check in destroy_irq() x86/ioapic.c: unify ioapic_retrigger_irq() x86/ioapic.c: convert __target_IO_APIC_irq to conventional for() loop x86/ioapic.c: clean up replace_pin_at_irq_node logic and comments x86/ioapic.c: convert replace_pin_at_irq_node to conventional for() loop x86/ioapic.c: simplify add_pin_to_irq_node() x86/ioapic.c: convert io_apic_level_ack_pending loop to normal for() loop x86/ioapic.c: move lost comment to what seems like appropriate place x86/ioapic.c: remove redundant declaration of irq_pin_list ...	2009-09-14 07:51:20 -07:00
Linus Torvalds	483e3cd6a3	Merge branch 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (105 commits) ring-buffer: only enable ring_buffer_swap_cpu when needed ring-buffer: check for swapped buffers in start of committing tracing: report error in trace if we fail to swap latency buffer tracing: add trace_array_printk for internal tracers to use tracing: pass around ring buffer instead of tracer tracing: make tracing_reset safe for external use tracing: use timestamp to determine start of latency traces tracing: Remove mentioning of legacy latency_trace file from documentation tracing/filters: Defer pred allocation, fix memory leak tracing: remove users of tracing_reset tracing: disable buffers and synchronize_sched before resetting tracing: disable update max tracer while reading trace tracing: print out start and stop in latency traces ring-buffer: disable all cpu buffers when one finds a problem ring-buffer: do not count discarded events ring-buffer: remove ring_buffer_event_discard ring-buffer: fix ring_buffer_read crossing pages ring-buffer: remove unnecessary cpu_relax ring-buffer: do not swap buffers during a commit ring-buffer: do not reset while in a commit ...	2009-09-11 13:24:03 -07:00
Linus Torvalds	774a694f8c	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (64 commits) sched: Fix sched::sched_stat_wait tracepoint field sched: Disable NEW_FAIR_SLEEPERS for now sched: Keep kthreads at default priority sched: Re-tune the scheduler latency defaults to decrease worst-case latencies sched: Turn off child_runs_first sched: Ensure that a child can't gain time over it's parent after fork() sched: enable SD_WAKE_IDLE sched: Deal with low-load in wake_affine() sched: Remove short cut from select_task_rq_fair() sched: Turn on SD_BALANCE_NEWIDLE sched: Clean up topology.h sched: Fix dynamic power-balancing crash sched: Remove reciprocal for cpu_power sched: Try to deal with low capacity, fix update_sd_power_savings_stats() sched: Try to deal with low capacity sched: Scale down cpu_power due to RT tasks sched: Implement dynamic cpu_power sched: Add smt_gain sched: Update the cpu_power sum during load-balance sched: Add SD_PREFER_SIBLING ...	2009-09-11 13:23:18 -07:00
Linus Torvalds	4f0ac85416	Merge branch 'perfcounters-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits) perf tools: Avoid unnecessary work in directory lookups perf stat: Clean up statistics calculations a bit more perf stat: More advanced variance computation perf stat: Use stddev_mean in stead of stddev perf stat: Remove the limit on repeat perf stat: Change noise calculation to use stddev x86, perf_counter, bts: Do not allow kernel BTS tracing for now x86, perf_counter, bts: Correct pointer-to-u64 casts x86, perf_counter, bts: Fail if BTS is not available perf_counter: Fix output-sharing error path perf trace: Fix read_string() perf trace: Print out in nanoseconds perf tools: Seek to the end of the header area perf trace: Fix parsing of perf.data perf trace: Sample timestamps as well perf_counter: Introduce new (non-)paranoia level to allow raw tracepoint access perf trace: Sample the CPU too perf tools: Work around strict aliasing related warnings perf tools: Clean up warnings list in the Makefile perf tools: Complete support for dynamic strings ...	2009-09-11 13:22:43 -07:00
Linus Torvalds	a66a50054e	Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (59 commits) x86/gart: Do not select AGP for GART_IOMMU x86/amd-iommu: Initialize passthrough mode when requested x86/amd-iommu: Don't detach device from pt domain on driver unbind x86/amd-iommu: Make sure a device is assigned in passthrough mode x86/amd-iommu: Align locking between attach_device and detach_device x86/amd-iommu: Fix device table write order x86/amd-iommu: Add passthrough mode initialization functions x86/amd-iommu: Add core functions for pd allocation/freeing x86/dma: Mark iommu_pass_through as __read_mostly x86/amd-iommu: Change iommu_map_page to support multiple page sizes x86/amd-iommu: Support higher level PTEs in iommu_page_unmap x86/amd-iommu: Remove old page table handling macros x86/amd-iommu: Use 2-level page tables for dma_ops domains x86/amd-iommu: Remove bus_addr check in iommu_map_page x86/amd-iommu: Remove last usages of IOMMU_PTE_L0_INDEX x86/amd-iommu: Change alloc_pte to support 64 bit address space x86/amd-iommu: Introduce increase_address_space function x86/amd-iommu: Flush domains if address space size was increased x86/amd-iommu: Introduce set_dte_entry function x86/amd-iommu: Add a gneric version of amd_iommu_flush_all_devices ...	2009-09-11 13:16:37 -07:00
Linus Torvalds	989aa44a5f	Merge branch 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: debug lockups: Improve lockup detection, fix generic arch fallback debug lockups: Improve lockup detection	2009-09-11 13:15:55 -07:00
Michal Hocko	80938332d8	x86: Increase MIN_GAP to include randomized stack Currently we are not including randomized stack size when calculating mmap_base address in arch_pick_mmap_layout for topdown case. This might cause that mmap_base starts in the stack reserved area because stack is randomized by 1GB for 64b (8MB for 32b) and the minimum gap is 128MB. If the stack really grows down to mmap_base then we can get silent mmap region overwrite by the stack values. Let's include maximum stack randomization size into MIN_GAP which is used as the low bound for the gap in mmap. Signed-off-by: Michal Hocko <mhocko@suse.cz> LKML-Reference: <1252400515-6866-1-git-send-email-mhocko@suse.cz> Acked-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Stable Team <stable@kernel.org>	2009-09-10 17:00:12 -07:00
Ben Hutchings	5367b6887e	x86: Fix code patching for paravirt-alternatives on 486 As reported in <http://bugs.debian.org/511703> and <http://bugs.debian.org/515982>, kernels with paravirt-alternatives enabled crash in text_poke_early() on at least some 486-class processors. The problem is that text_poke_early() itself uses inline functions affected by paravirt-alternatives and so will modify instructions that have already been prefetched. Pentium and later processors will invalidate the prefetched instructions in this case, but 486-class processors do not. Change sync_core() to limit prefetching on 486-class (and 386-class) processors, and move the call to sync_core() above the call to the modifiable local_irq_restore(). Signed-off-by: Ben Hutchings <ben@decadent.org.uk> LKML-Reference: <1252547631.3423.134.camel@localhost> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-09-10 16:50:19 -07:00
Frederic Weisbecker	8f8ffe2485	Merge commit 'tracing/core' into tracing/kprobes Conflicts: kernel/trace/trace_export.c kernel/trace/trace_kprobe.c Merge reason: This topic branch lacks an important build fix in tracing/core: 0dd7b74787eaf7858c6c573353a83c3e2766e674: tracing: Fix double CPP substitution in TRACE_EVENT_FN that prevents from multiple tracepoint headers inclusion crashes. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2009-09-11 01:09:23 +02:00
Steven Rostedt	fc06b8520b	x86/tracing: comment need for atomic nop The dynamic function tracer relys on the macro P6_NOP5 always being an atomic NOP. If for some reason it is changed to be two operations (like a nop2 nop3) it can faults within the kernel when the function tracer modifies the code. This patch adds a comment to note that the P6_NOPs are expected to be atomic. This will hopefully prevent anyone from changing that. Reported-by: Mathieu Desnoyer <mathieu.desnoyers@polymtl.ca> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2009-09-10 17:22:44 -04:00
Avi Kivity	0a79b00952	KVM: VMX: Check cpl before emulating debug register access Debug registers may only be accessed from cpl 0. Unfortunately, vmx will code to emulate the instruction even though it was issued from guest userspace, possibly leading to an unexpected trap later. Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-09-10 18:11:10 +03:00
Avi Kivity	3d53c27d05	KVM: Use thread debug register storage instead of kvm specific data Instead of saving the debug registers from the processor to a kvm data structure, rely in the debug registers stored in the thread structure. This allows us not to save dr6 and dr7. Reduces lightweight vmexit cost by 350 cycles, or 11 percent. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 18:11:04 +03:00
Avi Kivity	fa6870c6b6	KVM: Add missing #include Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 10:46:49 +03:00
Avi Kivity	56e8231841	KVM: Rename x86_emulate.c to emulate.c We're in arch/x86, what could we possibly be emulating? Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 10:46:45 +03:00
Joerg Roedel	344f414fa0	KVM: report 1GB page support to userspace If userspace knows that the kernel part supports 1GB pages it can enable the corresponding cpuid bit so that guests actually use GB pages. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:19 +03:00
Joerg Roedel	04326caacf	KVM: MMU: enable gbpages by increasing nr of pagesizes Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:19 +03:00
Joerg Roedel	7e4e4056f7	KVM: MMU: shadow support for 1gb pages This patch adds support for shadow paging to the 1gb page table code in KVM. With this code the guest can use 1gb pages even if the host does not support them. [ Marcelo: fix shadow page collision on pmd level if a guest 1gb page is mapped with 4kb ptes on host level ] Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:19 +03:00
Joerg Roedel	852e3c19ac	KVM: MMU: make direct mapping paths aware of mapping levels Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:18 +03:00
Sheng Yang	b927a3cec0	KVM: VMX: Introduce KVM_SET_IDENTITY_MAP_ADDR ioctl Now KVM allow guest to modify guest's physical address of EPT's identity mapping page. (change from v1, discard unnecessary check, change ioctl to accept parameter address rather than value) Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-09-10 08:33:16 +03:00
Gleb Natapov	a1b37100d9	KVM: Reduce runnability interface with arch support code Remove kvm_cpu_has_interrupt() and kvm_arch_interrupt_allowed() from interface between general code and arch code. kvm_arch_vcpu_runnable() checks for interrupts instead. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:13 +03:00
Gleb Natapov	0b71785dc0	KVM: Move kvm_cpu_get_interrupt() declaration to x86 code It is implemented only by x86. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:13 +03:00
Beth Kon	e9f4275732	KVM: PIT support for HPET legacy mode When kvm is in hpet_legacy_mode, the hpet is providing the timer interrupt and the pit should not be. So in legacy mode, the pit timer is destroyed, but the state of the pit is maintained. So if kvm or the guest tries to modify the state of the pit, this modification is accepted, except that the timer isn't actually started. When we exit hpet_legacy_mode, the current state of the pit (which is up to date since we've been accepting modifications) is used to restart the pit timer. The saved_mode code in kvm_pit_load_count temporarily changes mode to 0xff in order to destroy the timer, but then restores the actual value, again maintaining "current" state of the pit for possible later reenablement. [avi: add some reserved storage in the ioctl; make SET_PIT2 IOW] [marcelo: fix memory corruption due to reserved storage] Signed-off-by: Beth Kon <eak@us.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:12 +03:00
Gleb Natapov	fc61b800f9	KVM: Add Directed EOI support to APIC emulation Directed EOI is specified by x2APIC, but is available even when lapic is in xAPIC mode. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:07 +03:00
Joerg Roedel	ec04b2604c	KVM: Prepare memslot data structures for multiple hugepage sizes [avi: fix build on non-x86] Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:02 +03:00
Marcelo Tosatti	229456fc34	KVM: convert custom marker based tracing to event traces This allows use of the powerful ftrace infrastructure. See Documentation/trace/ for usage information. [avi, stephen: various build fixes] [sheng: fix control register breakage] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:59 +03:00
Alexander Graf	0367b4330e	x86: Add definition for IGNNE MSR Hyper-V accesses MSR_IGNNE while running under KVM. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:58 +03:00
Marcelo Tosatti	e799794e02	KVM: VMX: more MSR_IA32_VMX_EPT_VPID_CAP capability bits Required for EPT misconfiguration handler. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:55 +03:00
Avi Kivity	7ffd92c53c	KVM: VMX: Move rmode structure to vmx-specific code rmode is only used in vmx, so move it to vmx.c Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:50 +03:00
Nitin A Kamble	3a624e29c7	KVM: VMX: Support Unrestricted Guest feature "Unrestricted Guest" feature is added in the VMX specification. Intel Westmere and onwards processors will support this feature. It allows kvm guests to run real mode and unpaged mode code natively in the VMX mode when EPT is turned on. With the unrestricted guest there is no need to emulate the guest real mode code in the vm86 container or in the emulator. Also the guest big real mode code works like native. The attached patch enhances KVM to use the unrestricted guest feature if available on the processor. It also adds a new kernel/module parameter to disable the unrestricted guest feature at the boot time. Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:49 +03:00
Avi Kivity	6de4f3ada4	KVM: Cache pdptrs Instead of reloading the pdptrs on every entry and exit (vmcs writes on vmx, guest memory access on svm) extract them on demand. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:46 +03:00
Huang Ying	890ca9aefa	KVM: Add MCE support The related MSRs are emulated. MCE capability is exported via extension KVM_CAP_MCE and ioctl KVM_X86_GET_MCE_CAP_SUPPORTED. A new vcpu ioctl command KVM_X86_SETUP_MCE is used to setup MCE emulation such as the mcg_cap. MCE is injected via vcpu ioctl command KVM_X86_SET_MCE. Extended machine-check state (MCG_EXT_P) and CMCI are not implemented. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:39 +03:00
Jaswinder Singh Rajput	af24a4e4ae	KVM: Replace MSR_IA32_TIME_STAMP_COUNTER with MSR_IA32_TSC of msr-index.h Use standard msr-index.h's MSR declaration. MSR_IA32_TSC is better than MSR_IA32_TIME_STAMP_COUNTER as it also solves 80 column issue. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:32:38 +03:00
Rafael J. Wysocki	bf992fa2bc	Merge branch 'master' into for-linus	2009-09-10 00:02:02 +02:00
Alex Chiang	a7db504052	PCI: remove pcibios_scan_all_fns() This was #define'd as 0 on all platforms, so let's get rid of it. This change makes pci_scan_slot() slightly easier to read. Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Tony Luck <tony.luck@intel.com> Cc: David Howells <dhowells@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Acked-by: Russell King <linux@arm.linux.org.uk> Acked-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-09-09 13:29:18 -07:00
Peter Zijlstra	a8fae3ec5f	sched: enable SD_WAKE_IDLE Now that SD_WAKE_IDLE doesn't make pipe-test suck anymore, enable it by default for MC, CPU and NUMA domains. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-07 22:00:17 +02:00
Ingo Molnar	a1922ed661	Merge branch 'tracing/core' into tracing/hw-breakpoints Conflicts: arch/Kconfig kernel/trace/trace.h Merge reason: resolve the conflicts, plus adopt to the new ring-buffer APIs. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-07 08:19:51 +02:00
Ingo Molnar	ed011b22ce	Merge commit 'v2.6.31-rc9' into tracing/core Merge reason: move from -rc5 to -rc9. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-06 06:11:42 +02:00
Ingo Molnar	695a461296	Merge branch 'amd-iommu/2.6.32' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into core/iommu	2009-09-04 14:44:16 +02:00
Ingo Molnar	840a065310	sched: Turn on SD_BALANCE_NEWIDLE Start the re-tuning of the balancer by turning on newidle. It improves hackbench performance and parallelism on a 4x4 box. The "perf stat --repeat 10" measurements give us: domain0 domain1 ....................................... -SD_BALANCE_NEWIDLE -SD_BALANCE_NEWIDLE: 2041.273208 task-clock-msecs # 9.354 CPUs ( +- 0.363% ) +SD_BALANCE_NEWIDLE -SD_BALANCE_NEWIDLE: 2086.326925 task-clock-msecs # 11.934 CPUs ( +- 0.301% ) +SD_BALANCE_NEWIDLE +SD_BALANCE_NEWIDLE: 2115.289791 task-clock-msecs # 12.158 CPUs ( +- 0.263% ) Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 11:52:54 +02:00
Ingo Molnar	47734f89be	sched: Clean up topology.h Re-organize the flag settings so that it's visible at a glance which sched-domains flags are set and which not. With the new balancer code we'll need to re-tune these details anyway, so make it cleaner to make fewer mistakes down the road ;-) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 11:52:53 +02:00
Jeremy Fitzhardinge	53f824520b	x86/i386: Put aligned stack-canary in percpu shared_aligned section Pack aligned things together into a special section to minimize padding holes. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Tejun Heo <tj@kernel.org> LKML-Reference: <4AA035C0.9070202@goop.org> [ queued up in tip:x86/asm because it depends on this commit: x86/i386: Make sure stack-protector segment base is cache aligned ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 07:10:31 +02:00
Andreas Herrmann	4a376ec3a2	x86: Fix CPU llc_shared_map information for AMD Magny-Cours Construct entire NodeID and use it as cpu_llc_id. Thus internal node siblings are stored in llc_shared_map. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-09-03 15:09:59 -07:00
Jeremy Fitzhardinge	1ea0d14e48	x86/i386: Make sure stack-protector segment base is cache aligned The Intel Optimization Reference Guide says: In Intel Atom microarchitecture, the address generation unit assumes that the segment base will be 0 by default. Non-zero segment base will cause load and store operations to experience a delay. - If the segment base isn't aligned to a cache line boundary, the max throughput of memory operations is reduced to one [e]very 9 cycles. [...] Assembly/Compiler Coding Rule 15. (H impact, ML generality) For Intel Atom processors, use segments with base set to 0 whenever possible; avoid non-zero segment base address that is not aligned to cache line boundary at all cost. We can't avoid having a non-zero base for the stack-protector segment, but we can make it cache-aligned. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: <stable@kernel.org> LKML-Reference: <4AA01893.6000507@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-03 21:30:51 +02:00
Joerg Roedel	2b681fafcc	Merge branch 'amd-iommu/pagetable' into amd-iommu/2.6.32 Conflicts: arch/x86/kernel/amd_iommu.c	2009-09-03 17:14:57 +02:00
Joerg Roedel	03362a05c5	Merge branch 'amd-iommu/passthrough' into amd-iommu/2.6.32 Conflicts: arch/x86/kernel/amd_iommu.c arch/x86/kernel/amd_iommu_init.c	2009-09-03 16:34:23 +02:00
Joerg Roedel	85da07c409	Merge branches 'gart/fixes', 'amd-iommu/fixes+cleanups' and 'amd-iommu/fault-handling' into amd-iommu/2.6.32	2009-09-03 16:32:00 +02:00
Joerg Roedel	0feae533dd	x86/amd-iommu: Add passthrough mode initialization functions When iommu=pt is passed on kernel command line the devices should run untranslated. This requires the allocation of a special domain for that purpose. This patch implements the allocation and initialization path for iommu=pt. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:15:42 +02:00
Joerg Roedel	abdc5eb3d6	x86/amd-iommu: Change iommu_map_page to support multiple page sizes This patch adds a map_size parameter to the iommu_map_page function which makes it generic enough to handle multiple page sizes. This also requires a change to alloc_pte which is also done in this patch. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:11:17 +02:00
Joerg Roedel	a6b256b413	x86/amd-iommu: Support higher level PTEs in iommu_page_unmap This patch changes fetch_pte and iommu_page_unmap to support different page sizes too. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:11:08 +02:00
Joerg Roedel	674d798a80	x86/amd-iommu: Remove old page table handling macros These macros are not longer required. So remove them. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:03:49 +02:00
Joerg Roedel	50020fb632	x86/amd-iommu: Introduce increase_address_space function This function will be used to increase the address space size of a protection domain. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:03:46 +02:00
Joerg Roedel	04bfdd8406	x86/amd-iommu: Flush domains if address space size was increased Thist patch introduces the update_domain function which propagates the larger address space of a protection domain to the device table and flushes all relevant DTEs and the domain TLB. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:03:45 +02:00
Joerg Roedel	9355a08186	x86/amd-iommu: Make fetch_pte aware of dynamic mapping levels This patch changes the fetch_pte function in the AMD IOMMU driver to support dynamic mapping levels. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 16:03:34 +02:00
Joerg Roedel	b26e81b871	x86/amd-iommu: Panic if IOMMU command buffer reset fails To prevent the driver from doing recursive command buffer resets, just panic when that recursion happens. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 15:55:35 +02:00
Joerg Roedel	93f1cc67cf	x86/amd-iommu: Add reset function for command buffers This patch factors parts of the command buffer initialization code into a seperate function which can be used to reset the command buffer later. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 15:55:34 +02:00
Joerg Roedel	4c6f40d4e0	x86/amd-iommu: replace "AMD IOMMU" by "AMD-Vi" This patch replaces the "AMD IOMMU" printk strings with the official name for the hardware: "AMD-Vi". Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-09-03 15:49:56 +02:00
Ingo Molnar	f76bd108e5	Merge branch 'perfcounters/urgent' into perfcounters/core Merge reason: We are going to modify a place modified by perfcounters/urgent. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-02 21:42:59 +02:00
Ingo Molnar	936e894a97	Merge commit 'v2.6.31-rc8' into x86/txt Conflicts: arch/x86/kernel/reboot.c security/Kconfig Merge reason: resolve the conflicts, bump up from rc3 to rc8. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-02 08:17:56 +02:00
Huang Ying	ae4b688db2	x86: Move kernel_fpu_using to irq_fpu_usable in asm/i387.h This function measures whether the FPU/SSE state can be touched in interrupt context. If the interrupted code is in user space or has no valid FPU/SSE context (CR0.TS == 1), FPU/SSE state can be used in IRQ or soft_irq context too. This is used by AES-NI accelerated AES implementation and PCLMULQDQ accelerated GHASH implementation. v3: - Renamed to irq_fpu_usable to reflect the purpose of the function. v2: - Renamed to irq_is_fpu_using to reflect the real situation. Signed-off-by: Huang Ying <ying.huang@intel.com> CC: H. Peter Anvin <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-09-01 21:39:15 -07:00
Shane Wang	69575d3886	x86, intel_txt: clean up the impact on generic code, unbreak non-x86 Move tboot.h from asm to linux to fix the build errors of intel_txt patch on non-X86 platforms. Remove the tboot code from generic code init/main.c and kernel/cpu.c. Signed-off-by: Shane Wang <shane.wang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-09-01 18:25:07 -07:00
Ingo Molnar	c931aaf0e1	Merge branch 'x86/paravirt' into x86/cpu Conflicts: arch/x86/include/asm/paravirt.h Manual merge: arch/x86/include/asm/paravirt_types.h Merge reason: x86/paravirt conflicts non-trivially with x86/cpu, resolve it. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-01 12:13:30 +02:00
H. Peter Anvin	ff55df53df	x86, msr: Export the register-setting MSR functions via /dev/*/msr Make it possible to access the all-register-setting/getting MSR functions via the MSR driver. This is implemented as an ioctl() on the standard MSR device node. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <petkovbb@gmail.com>	2009-08-31 16:16:04 -07:00
H. Peter Anvin	8b956bf1f0	x86, msr: Create _on_cpu helpers for {rw,wr}msr_safe_regs() Create _on_cpu helpers for {rw,wr}msr_safe_regs() analogously with the other MSR functions. This will be necessary to add support for these to the MSR driver. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <petkovbb@gmail.com>	2009-08-31 16:15:57 -07:00
H. Peter Anvin	0cc0213e73	x86, msr: Have the _safe MSR functions return -EIO, not -EFAULT For some reason, the _safe MSR functions returned -EFAULT, not -EIO. However, the only user which cares about the return code as anything other than a boolean is the MSR driver, which wants -EIO. Change it to -EIO across the board. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Alok Kataria <akataria@vmware.com> Cc: Rusty Russell <rusty@rustcorp.com.au>	2009-08-31 15:15:23 -07:00
H. Peter Anvin	709972b1f6	x86, asm: Make _ASM_EXTABLE() usable from assembly code We have had this convenient macro _ASM_EXTABLE() to generate exception table entry in inline assembly. Make it also usable for pure assembly. Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-31 15:14:30 -07:00
H. Peter Anvin	fe9b4e4e40	x86, asm: Add 32-bit versions of the combined CFI macros Add 32-bit versions of the combined CFI macros, equivalent to the 64-bit ones except, obviously, operating on 32-bit stack words. Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-31 15:14:29 -07:00
Borislav Petkov	177fed1ee8	x86, msr: Rewrite AMD rd/wrmsr variants Switch them to native_{rd,wr}msr_safe_regs and remove pv_cpu_ops.read_msr_amd. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> LKML-Reference: <1251705011-18636-2-git-send-email-petkovbb@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-31 15:14:28 -07:00
Borislav Petkov	132ec92f3f	x86, msr: Add rd/wrmsr interfaces with preset registers native_{rdmsr,wrmsr}_safe_regs are two new interfaces which allow presetting of a subset of eight x86 GPRs before executing the rd/wrmsr instructions. This is needed at least on AMD K8 for accessing an erratum workaround MSR. Originally based on an idea by H. Peter Anvin. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> LKML-Reference: <1251705011-18636-1-git-send-email-petkovbb@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-31 15:14:26 -07:00
Thomas Gleixner	e11dadabf4	x86: apic namespace cleanup boot_cpu_physical_apicid is a global variable and used as function argument as well. Rename the function arguments to avoid confusion. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 21:30:47 +02:00
Thomas Gleixner	bc07844a33	x86: Distangle ioapic and i8259 The proposed Moorestown support patches use an extra feature flag mechanism to make the ioapic work w/o an i8259. There is a much simpler solution. Most i8259 specific functions are already called dependend on the irq number less than NR_IRQS_LEGACY. Replacing that constant by a read_mostly variable which can be set to 0 by the platform setup code allows us to achieve the same without any special feature flags. That trivial change allows us to proceed with MRST w/o doing a full blown overhaul of the ioapic code which would delay MRST unduly. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 19:23:09 +02:00
Thomas Gleixner	3f4110a48a	x86: Add Moorestown early detection Moorestown MID devices need to be detected early in the boot process to setup and do not call x86_default_early_setup as there is no EBDA region to reserve. [ Copied the minimal code from Jacobs latest MRST series ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jacob Pan <jacob.jun.pan@intel.com>	2009-08-31 11:09:40 +02:00
Pan, Jacob jun	162bc7ab01	x86: Add hardware_subarch ID for Moorestown x86 bootprotocol 2.07 has introduced hardware_subarch ID in the boot parameters provided by FW. We use it to identify Moorestown platforms. [ tglx: Cleanup and paravirt fix ] Signed-off-by: Jacob Pan <jacob.jun.pan@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 11:09:40 +02:00
Thomas Gleixner	47a3d5da70	x86: Add early platform detection Platforms like Moorestown require early setup and want to avoid the call to reserve_ebda_region. The x86_init override is too late when the MRST detection happens in setup_arch. Move the default i386 x86_init overrides and the call to reserve_ebda_region into a separate function which is called as the default of a switch case depending on the hardware_subarch id in boot params. This allows us to add a case for MRST and let MRST have its own early setup function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 11:09:40 +02:00
Thomas Gleixner	2d826404f0	x86: Move tsc_calibration to x86_init_ops TSC calibration is modified by the vmware hypervisor and paravirt by separate means. Moorestown wants to add its own calibration routine as well. So make calibrate_tsc a proper x86_init_ops function and override it by paravirt or by the early setup of the vmware hypervisor. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 09:35:47 +02:00
Thomas Gleixner	08047c4f17	x86: Move calibrate_cpu to tsc.c Move the code where it's only user is. Also we need to look whether this hardwired hackery might interfere with perfcounters. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 09:35:46 +02:00
Thomas Gleixner	64fcbac1f3	x86: Simplify timer_ack magic in time_32.c Let the compiler optimize the timer_ack magic away in the 32bit timer interrupt and put the same code into time_64.c. It's optimized out for CONFIG_X86_IO_APIC on 32bit and for 64bit because timer_ack is const 0 in both cases. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 09:35:46 +02:00
Thomas Gleixner	ecce85089e	x86: Remove do_timer hook This is a left over of the old x86 sub arch support. Remove it and open code it like we do in time_64.c Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 09:35:46 +02:00
Thomas Gleixner	845b3944bb	x86: Add timer_init to x86_init_ops The timer init code is convoluted with several quirks and the paravirt timer chooser. Figuring out which code path is actually taken is not for the faint hearted. Move the numaq TSC quirk to tsc_pre_init x86_init_ops function and replace the paravirt time chooser and the remaining x86 quirk with a simple x86_init_ops function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 09:35:46 +02:00
Thomas Gleixner	736decac64	x86: Move percpu clockevents setup to x86_init_ops paravirt overrides the setup of the default apic timers as per cpu timers. Moorestown needs to override that as well. Move it to x86_init_ops setup and create a separate x86_cpuinit struct which holds the function for the secondary evtl. hotplugabble CPUs. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 09:35:46 +02:00
Thomas Gleixner	f1d7062a23	x86: Move xen_post_allocator_init into xen_pagetable_setup_done We really do not need two paravirt/x86_init_ops functions which are called in two consecutive source lines. Move the only user of post_allocator_init into the already existing pagetable_setup_done function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 09:35:45 +02:00
Thomas Gleixner	030cb6c00d	x86: Move paravirt pagetable_setup to x86_init_ops Replace more paravirt hackery by proper x86_init_ops. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 09:35:45 +02:00
Thomas Gleixner	6f30c1ac3f	x86: Move paravirt banner printout to x86_init_ops Replace another obscure paravirt magic and move it to x86_init_ops. Such a hook is also useful for embedded and special hardware. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 09:35:45 +02:00
Thomas Gleixner	42bbdb43b1	x86: Replace ARCH_SETUP by a proper x86_init_ops ARCH_SETUP is a horrible leftover from the old arch/i386 mach support code. It still has a lonely user in xen. Move it to x86_init_ops. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 09:35:45 +02:00
Thomas Gleixner	428cf9025b	x86: Move traps_init to x86_init_ops Replace the quirks by a simple x86_init_ops function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 09:35:45 +02:00
Thomas Gleixner	66bcaf0bde	x86: Move irq_init to x86_init_ops irq_init is overridden by x86_quirks and by paravirts. Unify the whole mess and make it an unconditional x86_init_ops function which defaults to the standard function and can be overridden by the early platform code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 09:35:45 +02:00
Thomas Gleixner	d9112f4302	x86: Move pre_intr_init to x86_init_ops Replace the quirk machinery by a x86_init_ops function which defaults to the standard implementation. This is also a preparatory patch for Moorestown support which needs to replace the default init_ISA_irqs as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 09:35:45 +02:00
Thomas Gleixner	b3f1b617f4	x86: Move get/find_smp_config to x86_init_ops Replace the quirk machinery by a x86_init_ops function which defaults to the standard implementation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-31 09:35:45 +02:00
Ingo Molnar	eebc57f73d	Merge branch 'for-ingo' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-sfi-2.6 into x86/apic Merge reason: the SFI (Simple Firmware Interface) feature in the ACPI tree needs this cleanup, pull it into the APIC branch as well so that there's no interactions. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-29 09:31:47 +02:00
Feng Tang	e55a5999ff	ACPI: Handle CONFIG_ACPI=n better from linux/acpi.h linux/acpi.h is the top level header for interfacing with the ACPI sub-system, so acpi_disabled should be up there instead of down in asm/acpi.h -- particularly since asm/acpi.h doesn't exist for all architectures. Same story for acpi_table_parse(), which is a top-level API to Linux/ACPI. This is necessary for building some code that used to always depend on CONFIG_ACPI=y, but will soon also need to build with CONFIG_ACPI=n. Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2009-08-28 19:57:29 -04:00
Feng Tang	2a4ab640d3	ACPI, x86: expose some IO-APIC routines when CONFIG_ACPI=n Some IO-APIC routines are ACPI specific now, but need to be exposed when CONFIG_ACPI=n for the benefit of SFI. Remove #ifdef ACPI around these routines: io_apic_get_unique_id(int ioapic, int apic_id); io_apic_get_version(int ioapic); io_apic_get_redir_entries(int ioapic); Move these routines from ACPI-specific boot.c to io_apic.c: uniq_ioapic_id(u8 id) mp_find_ioapic() mp_find_ioapic_pin() mp_register_ioapic() Also, since uniq_ioapic_id() is now no longer static, re-name it to io_apic_unique_id() for consistency with the other public io_apic routines. For simplicity, do not #ifdef the resulting code ACPI \|\| SFI, thought that could be done in the future if it is important to optimize the !ACPI !SFI IO-APIC x86 kernel for size. Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Cc: x86@kernel.org	2009-08-28 19:57:27 -04:00
Suresh Siddha	c8bc6f3c80	x86: arch specific support for remapping HPET MSIs x86 arch support for remapping HPET MSI's by associating the HPET timer block with the interrupt-remapping HW unit and setting up appropriate irq_chip Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Jay Fenlason <fenlason@redhat.com> LKML-Reference: <20090804190729.630510000@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-27 23:33:20 +02:00
Thomas Gleixner	90e1c6969d	x86: Move oem_bus_info to x86_init_ops Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-27 17:12:53 +02:00
Thomas Gleixner	52fdb56846	x86: Move mpc_oem_pci_bus to x86_init_ops Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-27 17:12:52 +02:00
Thomas Gleixner	72302142e1	x86: Move smp_read_mpc_oem to x86_init_ops. Move smp_read_mpc_oem from quirks to x86_init. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-27 17:12:52 +02:00
Thomas Gleixner	fd6c666149	x86: Move mpc_apic_id to x86_init_ops The mpc_apic_id setup is handled by a x86_quirk. Make it a x86_init_ops function with a default implementation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-27 17:12:52 +02:00
Thomas Gleixner	de93410310	x86: Move ioapic_ids_setup to x86_init_ops 32bit and also the numaq code have special requirements on the ioapic_id setup. Convert it to a x86_init_ops function and get rid of the quirks and #ifdefs Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-27 17:12:52 +02:00
Thomas Gleixner	f4848472cd	x86: Sanitize smp_record and move it to x86_init_ops The x86 quirkification introduced an extra ugly hackery with a variable pointer in the mpparse code. If the pointer is initialized then it is dereferenced and the variable set to 0 or incremented. Create a x86_init_ops function and let the affected numaq code hold the function. Default init is a setup noop. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-27 17:12:52 +02:00
Thomas Gleixner	6b18ae3e2f	x86: Move memory_setup to x86_init_ops memory_setup is overridden by x86_quirks and by paravirts with weak functions and quirks. Unify the whole mess and make it an unconditional x86_init_ops function which defaults to the standard function and can be overridden by the early platform code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-27 17:12:52 +02:00
Thomas Gleixner	816c25e7d4	x86: Add reserve_ebda_region to x86_init_ops reserve_ebda_region needs to be called befor start_kernel. Moorestown needs to override it. Make it a x86_init_ops function and initialize it with the default reserve_ebda_region. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-27 17:12:52 +02:00
Thomas Gleixner	8fee697d99	x86: Add request_standard_resources to x86_init The 32bit and the 64bit code are slighty different in the reservation of standard resources. Also the upcoming Moorestown support needs its own version of that. Add it to x86_init_ops and initialize it with the 64bit default. 32bit overrides it in early boot. Now moorestown can add it's own override w/o sprinkling the code with more #ifdefs Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-27 17:12:52 +02:00
Thomas Gleixner	f7cf5a5b8c	x86: Add probe_roms to x86_init probe_roms is only used on 32bit. Add it to the x86_init ops and remove the #ifdefs. Default initializer is x86_init_noop() which is overridden in the 32bit boot code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-27 17:12:52 +02:00
Thomas Gleixner	57844a8f8e	x86: Add x86_init infrastructure The upcoming Moorestown support brings the embedded world to x86. The setup code of x86 has already a couple of hooks which are either x86_quirks or paravirt ops. Some of those setup hooks are pretty convoluted like the timer setup and the tsc calibration code. But there are other places which could do with a cleanup. Instead of having inline functions/macros which are modified at compile time I decided to introduce x86_init ops which are unconditional in the code and make it clear that they can be changed either during compile time or in the early boot process. The function pointers are initialized by default functions which can be noops so that the pointer can be called unconditionally in the most cases. This also allows us to remove 32bit/64bit, paravirt and other #ifdeffery. paravirt guests are just a hardware platform in the setup code, so we should treat them as such and not hide all behind multiple layers of indirection and compile time dependencies. It's more obvious that x86_init.timers.timer_init() is a function pointer than the late_time_init = choose_time_init() obscurity. It's also way simpler to grep for x86_init.timers.timer_init and find all the places which modify that function pointer instead of analyzing weak functions, macros and paravirt indirections. Note. This is not a general paravirt_ops replacement. It just will move setup related hooks which are potentially useful for other platform setup purposes as well out of the paravirt domain. Add the base infrastructure without any functionality. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-27 17:12:52 +02:00
Thomas Gleixner	4152f93508	Merge branch 'sched/clock' into x86/cleanups Reason: The tsc init cleanup depends on sched_clock_init moving past late_time_init. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-27 17:06:24 +02:00
Thomas Gleixner	6effcd9245	Merge branch 'x86/paravirt' into x86/cleanups Reason: The setup cleanups conflict with the paravirt cleanups. Avoid a rather large merge conflict Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-27 16:42:32 +02:00
H. Peter Anvin	b855192c08	Merge branch 'x86/urgent' into x86/pat Reason: Change to is_new_memtype_allowed() in x86/urgent Resolved semantic conflicts in: arch/x86/mm/pat.c arch/x86/mm/ioremap.c Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-26 17:24:28 -07:00
Venkatesh Pallipadi	f584174096	x86, pat: Use page flags to track memtypes of RAM pages Change reserve_ram_pages_type and free_ram_pages_type to use 2 page flags to track UC_MINUS, WC, WB and default types. Previous RAM tracking just tracked WB or NonWB, which was not complete and did not allow tracking of RAM fully and there was no way to get the actual type reserved by looking at the page flags. We use the memtype_lock spinlock for atomicity in dealing with memtype tracking in struct page. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-26 15:41:24 -07:00
Venkatesh Pallipadi	9e36fda0b3	x86, pat: Add PAT reserve free to io_mapping* APIs io_mapping_* interfaces were added, mainly for graphics drivers. Make this interface go through the PAT reserve/free, instead of hardcoding WC mapping. This makes sure that there are no aliases due to unconditional WC setting. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-26 15:41:16 -07:00
Venkatesh Pallipadi	9fd126bc74	x86, pat: New i/f for driver to request memtype for IO regions Add new routines to request memtype for IO regions. This will currently be a backend for io_mapping_* routines. But, it can also be made available to drivers directly in future, in case it is needed. reserve interface reserves the memory, makes sure we have a compatible memory type available and keeps the identity map in sync when needed. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-26 15:41:10 -07:00
Masami Hiramatsu	b1cf540f0e	x86: Add pt_regs register and stack access APIs Add following APIs for accessing registers and stack entries from pt_regs. These APIs are required by kprobes-based event tracer on ftrace. Some other debugging tools might be able to use it too. - regs_query_register_offset(const char name) Query the offset of "name" register. - regs_query_register_name(unsigned int offset) Query the name of register by its offset. - regs_get_register(struct pt_regs regs, unsigned int offset) Get the value of a register by its offset. - regs_within_kernel_stack(struct pt_regs regs, unsigned long addr) Check the address is in the kernel stack. - regs_get_kernel_stack_nth(struct pt_regs reg, unsigned int nth) Get Nth entry of the kernel stack. (N >= 0) - regs_get_argument_nth(struct pt_regs *reg, unsigned int nth) Get Nth argument at function call. (N >= 0) Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: linux-arch@vger.kernel.org Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Avi Kivity <avi@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jason Baron <jbaron@redhat.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Przemysław Pawełczyk <przemyslaw@pawelczyk.it> Cc: Roland McGrath <roland@redhat.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Vegard Nossum <vegard.nossum@gmail.com> LKML-Reference: <20090813203444.31965.26374.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2009-08-27 00:35:57 +02:00
Masami Hiramatsu	eb13296cfa	x86: Instruction decoder API Add x86 instruction decoder to arch-specific libraries. This decoder can decode x86 instructions used in kernel into prefix, opcode, modrm, sib, displacement and immediates. This can also show the length of instructions. This version introduces instruction attributes for decoding instructions. The instruction attribute tables are generated from the opcode map file (x86-opcode-map.txt) by the generator script(gen-insn-attr-x86.awk). Currently, the opcode maps are based on opcode maps in Intel(R) 64 and IA-32 Architectures Software Developers Manual Vol.2: Appendix.A, and consist of below two types of opcode tables. 1-byte/2-bytes/3-bytes opcodes, which has 256 elements, are written as below; Table: table-name Referrer: escaped-name opcode: mnemonic\|GrpXXX [operand1[,operand2...]] [(extra1)[,(extra2)...] [\| 2nd-mnemonic ...] (or) opcode: escape # escaped-name EndTable Group opcodes, which has 8 elements, are written as below; GrpTable: GrpXXX reg: mnemonic [operand1[,operand2...]] [(extra1)[,(extra2)...] [\| 2nd-mnemonic ...] EndTable These opcode maps include a few SSE and FP opcodes (for setup), because those opcodes are used in the kernel. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Jim Keniston <jkenisto@us.ibm.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Avi Kivity <avi@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jason Baron <jbaron@redhat.com> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Przemysław Pawełczyk <przemyslaw@pawelczyk.it> Cc: Roland McGrath <roland@redhat.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Vegard Nossum <vegard.nossum@gmail.com> LKML-Reference: <20090813203413.31965.49709.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2009-08-27 00:35:56 +02:00
Jason Baron	117226d158	tracing: Remove FTRACE_SYSCALL_MAX definitions Remove the FTRACE_SYSCALL_MAX definitions now that we have converted the syscall event tracing code to use NR_syscalls. Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Jiaying Zhang <jiayingz@google.com> Cc: Martin Bligh <mbligh@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Josh Stone <jistone@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anwin <hpa@zytor.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> LKML-Reference: <f2240cdc8f0b1ca7617390c8f5ec90ba2bd348cf.1251146513.git.jbaron@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2009-08-26 21:30:39 +02:00
Jason Baron	a5a2f8e2ac	tracing: Define NR_syscalls for x86_64 Express the available number of syscalls in a standard way by defining NR_syscalls. The common way to define it is to place its definition in asm/unistd.h However, the number of syscalls is defined using __NR_syscall_max in x86-64 after building a dynamic header file "asm-offsets.h" The source file that generates this header, asm-offsets-64.c includes unistd.h, then if we want to express NR_syscalls from __NR_syscall_max in unistd.h only after generating the dynamic header file, we need a watchguard. If unistd.h is included from asm-offsets-64.c, then we are generating asm-offset.h which defines __NR_syscall_max. At this time, we don't want to (we can't) define NR_syscalls, then we do nothing. Otherwise we define NR_syscalls because we know asm-offsets.h has been generated. Signed-off-by: Jason Baron <jbaron@redhat.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Jiaying Zhang <jiayingz@google.com> Cc: Martin Bligh <mbligh@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Josh Stone <jistone@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anwin <hpa@zytor.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> LKML-Reference: <20090826160910.GB2658@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2009-08-26 21:29:58 +02:00
Jason Baron	dd86dda24c	tracing: Define NR_syscalls for x86 (32) Add a NR_syscalls #define for x86. This is used in the syscall events tracing code. Todo: make it dynamic like x86_64. NR_syscalls is the usual name used to determine the number of syscalls supported by the current arch. We want to unify the use of this number across archs that support the syscall tracing. This also prepare to move some of the arch code to core code in the syscall tracing area. Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Jiaying Zhang <jiayingz@google.com> Cc: Martin Bligh <mbligh@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Josh Stone <jistone@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anwin <hpa@zytor.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> LKML-Reference: <0f33c0f96d198fccc3ddd9ff7f5334ff5cb42706.1251146513.git.jbaron@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2009-08-26 21:29:55 +02:00
Cyrill Gorcunov	8f3e1df48b	x86, ioapic: Define IO_APIC_DEFAULT_PHYS_BASE constant We already have APIC_DEFAULT_PHYS_BASE so just to be consistent. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> LKML-Reference: <20090824175550.927946757@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-26 08:16:38 +02:00
H. Peter Anvin	ab94fcf528	x86: allow "=rm" in native_save_fl() This is a partial revert of `f1f029c7bf`. "=rm" is allowed in this context, because "pop" is explicitly defined to adjust the stack pointer before it evaluates its effective address, if it has one. Thus, we do end up writing to the correct address even if we use an on-stack memory argument. The original reporter for `f1f029c7bf` was apparently using a broken x86 simulator. [ Impact: performance ] Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Gabe Black <spamforgabe@umich.edu>	2009-08-25 16:47:16 -07:00
H. Peter Anvin	e8a2eb47e6	Merge commit 'origin/x86/urgent' into x86/asm	2009-08-25 15:40:29 -07:00
Josh Stone	6670000119	tracing: Rename FTRACE_SYSCALLS for tracepoints s/HAVE_FTRACE_SYSCALLS/HAVE_SYSCALL_TRACEPOINTS/g s/TIF_SYSCALL_FTRACE/TIF_SYSCALL_TRACEPOINT/g The syscall enter/exit tracing is no longer specific to just ftrace, so they now have names that reflect their tie to tracepoints instead. Signed-off-by: Josh Stone <jistone@redhat.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Jiaying Zhang <jiayingz@google.com> Cc: Martin Bligh <mbligh@google.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> LKML-Reference: <1251150194-1713-2-git-send-email-jistone@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2009-08-26 00:17:35 +02:00
Tobias Doerffel	366d19e181	x86: add specific support for Intel Atom architecture Add another option when selecting CPU family so the kernel can be optimized for Intel Atom CPUs. If GCC supports tuning options for Intel Atom they will be used. Signed-off-by: Tobias Doerffel <tobias.doerffel@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <1251018457-19157-1-git-send-email-tobias.doerffel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-23 11:20:02 +02:00
H. Peter Anvin	5400743db5	x86, mtrr: make mtrr_aps_delayed_init static bool mtr_aps_delayed_init was declared u32 and made global, but it only ever takes boolean values and is only ever used in arch/x86/kernel/cpu/mtrr/main.c. Declare it "static bool" and remove external references. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com>	2009-08-21 17:00:02 -07:00
Suresh Siddha	d0af9eed5a	x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT init SDM Vol 3a section titled "MTRR considerations in MP systems" specifies the need for synchronizing the logical cpu's while initializing/updating MTRR. Currently Linux kernel does the synchronization of all cpu's only when a single MTRR register is programmed/updated. During an AP online (during boot/cpu-online/resume) where we initialize all the MTRR/PAT registers, we don't follow this synchronization algorithm. This can lead to scenarios where during a dynamic cpu online, that logical cpu is initializing MTRR/PAT with cache disabled (cr0.cd=1) etc while other logical HT sibling continue to run (also with cache disabled because of cr0.cd=1 on its sibling). Starting from Westmere, VMX transitions with cr0.cd=1 don't work properly (because of some VMX performance optimizations) and the above scenario (with one logical cpu doing VMX activity and another logical cpu coming online) can result in system crash. Fix the MTRR initialization by doing rendezvous of all the cpus. During boot and resume, we delay the MTRR/PAT init for APs till all the logical cpu's come online and the rendezvous process at the end of AP's bringup, will initialize the MTRR/PAT for all AP's. For dynamic single cpu online, we synchronize all the logical cpus and do the MTRR/PAT init on the AP that is coming online. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-21 16:25:55 -07:00
Jan Beulich	8b5a10fc6f	x86: properly annotate alternatives.c Some of the NOPs tables aren't used on 64-bits, quite some code and data is needed post-init for module loading only, and a couple of functions aren't used outside that file (i.e. can be static, and don't need to be exported). The change to __INITDATA/__INITRODATA is needed to avoid an assembler warning. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4A8BC8A00200007800010823@vpn.id2.novell.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-21 15:30:12 -07:00
Jan Beulich	5946fa3d5c	x86, hpet: Simplify the HPET code On 64-bits, using unsigned long when unsigned int suffices needlessly creates larger code (due to the need for REX prefixes), and most of the logic in hpet.c really doesn't need 64-bit operations. At once this avoids the need for a couple of type casts. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Shaohua Li <shaohua.li@intel.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <4A8BC9780200007800010832@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-21 21:55:25 +02:00
john stultz	da15cfdae0	time: Introduce CLOCK_REALTIME_COARSE After talking with some application writers who want very fast, but not fine-grained timestamps, I decided to try to implement new clock_ids to clock_gettime(): CLOCK_REALTIME_COARSE and CLOCK_MONOTONIC_COARSE which returns the time at the last tick. This is very fast as we don't have to access any hardware (which can be very painful if you're using something like the acpi_pm clocksource), and we can even use the vdso clock_gettime() method to avoid the syscall. The only trade off is you only get low-res tick grained time resolution. This isn't a new idea, I know Ingo has a patch in the -rt tree that made the vsyscall gettimeofday() return coarse grained time when the vsyscall64 sysctrl was set to 2. However this affects all applications on a system. With this method, applications can choose the proper speed/granularity trade-off for themselves. Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: nikolag@ca.ibm.com Cc: Darren Hart <dvhltc@us.ibm.com> Cc: arjan@infradead.org Cc: jonathan@jonmasters.org LKML-Reference: <1250734414.6897.5.camel@localhost.localdomain> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-21 21:43:46 +02:00
Rafael J. Wysocki	39cf0518d8	Merge branch 'master' into for-linus	2009-08-20 20:24:33 +02:00
Suresh Siddha	1adcaafe74	x86, pat: Allow ISA memory range uncacheable mapping requests Max Vozeler reported: > Bug 13877 - bogl-term broken with CONFIG_X86_PAT=y, works with =n > > strace of bogl-term: > 814 mmap2(NULL, 65536, PROT_READ\|PROT_WRITE, MAP_SHARED, 4, 0) > = -1 EAGAIN (Resource temporarily unavailable) > 814 write(2, "bogl: mmaping /dev/fb0: Resource temporarily unavailable\n", > 57) = 57 PAT code maps the ISA memory range as WB in the PAT attribute, so that fixed range MTRR registers define the actual memory type (UC/WC/WT etc). But the upper level is_new_memtype_allowed() API checks are failing, as the request here is for UC and the return tracked type is WB (Tracked type is WB as MTRR type for this legacy range potentially will be different for each 4k page). Fix is_new_memtype_allowed() by always succeeding the ISA address range checks, as the null PAT (WB) and def MTRR fixed range register settings satisfy the memory type needs of the applications that map the ISA address range. Reported-and-Tested-by: Max Vozeler <xam@debian.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-17 14:12:44 -07:00
Cliff Wickman	3ef12c3c97	x86: Fix UV BAU destination subnode id The SGI UV Broadcast Assist Unit is used to send TLB shootdown messages to remote nodes of the system. The header of the message must contain the subnode id of the block in the receiving hub that handles such messages. It should always be 0x10, the id of the "LB" block. It had previously been documented as a "must be zero" field. Signed-off-by: Cliff Wickman <cpw@sgi.com> Acked-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <E1Mc1x7-0005Ce-6t@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-15 11:58:02 +02:00
Tejun Heo	384be2b18a	Merge branch 'percpu-for-linus' into percpu-for-next Conflicts: arch/sparc/kernel/smp_64.c arch/x86/kernel/cpu/perf_counter.c arch/x86/kernel/setup_percpu.c drivers/cpufreq/cpufreq_ondemand.c mm/percpu.c Conflicts in core and arch percpu codes are mostly from commit ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many num_possible_cpus() with nr_cpu_ids. As for-next branch has moved all the first chunk allocators into mm/percpu.c, the changes are moved from arch code to mm/percpu.c. Signed-off-by: Tejun Heo <tj@kernel.org>	2009-08-14 14:45:31 +09:00
Jason Baron	9daa77e2e9	tracing: Update FTRACE_SYSCALL_MAX update FTRACE_SYSCALL_MAX to the current number of syscalls FTRACE_SYSCALL_MAX is a temporary solution to get the number of syscalls supported by the arch until we find a more dynamic way to get this number. Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Jiaying Zhang <jiayingz@google.com> Cc: Martin Bligh <mbligh@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2009-08-11 20:35:27 +02:00
Huang Ying	0dcc66851f	x86, mce: Support specifying raise mode for software MCE injection Raise mode include raising as exception or raising as poll, it is specified via the mce.inject_flags field. This can be used to specify raise mode of UCNA, which is UC error but raised not as exception. And this can be used to test the filter code of poll handler or exception handler too. For example, enforce a poll raise mode for a fatal MCE. ChangeLog: v2: - Re-base on latest x86-tip.git/mce3 Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-10 13:58:41 -07:00
Huang Ying	5b7e88edc6	x86, mce: Support specifying context for software mce injection The cpu context is specified via the new mce.inject_flags fields. This allows more realistic machine check testing in different situations. "RANDOM" context is implemented via NMI broadcasting to add randomization to testing. AK: Fix NMI broadcasting check. Fix 32-bit building. Some race fixes. Move to module. Various changes ChangeLog: v3: - Re-based on latest x86-tip.git/mce4 - Fix 32-bit building v2: - Re-base on latest x86-tip.git/mce3 Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-08-10 13:58:27 -07:00
Markus Metzger	30dd568c91	x86, perf_counter, bts: Add BTS support to perfcounters Implement a performance counter with: attr.type = PERF_TYPE_HARDWARE attr.config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS attr.sample_period = 1 Using branch trace store (BTS) on x86 hardware, if available. The from and to address for each branch can be sampled using: PERF_SAMPLE_IP for the from address PERF_SAMPLE_ADDR for the to address [ v2: address review feedback, fix bugs ] Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-09 13:04:14 +02:00
Akinobu Mita	1e5de18278	x86: Introduce GDT_ENTRY_INIT() GDT_ENTRY_INIT is static initializer of desc_struct. We already have similar macro GDT_ENTRY() but it's static initializer for u64 and it cannot be used for desc_struct. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> LKML-Reference: <20090718151219.GD11294@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-08 17:44:11 +02:00
Rafael J. Wysocki	c00aafcd49	Merge branch 'master' into for-linus	2009-08-05 23:56:54 +02:00
Gleb Natapov	ce69a78450	x86/apic: Enable x2APIC without interrupt remapping under KVM KVM would like to provide x2APIC interface to a guest without emulating interrupt remapping device. The reason KVM prefers guest to use x2APIC is that x2APIC interface is better virtualizable and provides better performance than mmio xAPIC interface: - msr exits are faster than mmio (no page table walk, emulation) - no need to read back ICR to look at the busy bit - one 64 bit ICR write instead of two 32 bit writes - shared code with the Hyper-V paravirt interface Included patch changes x2APIC enabling logic to enable it even if IR initialization failed, but kernel runs under KVM and no apic id is greater than 255 (if there is one spec requires BIOS to move to x2apic mode before starting an OS). -v2: fix build -v3: fix bug causing compiler warning Signed-off-by: Gleb Natapov <gleb@redhat.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Sheng Yang <sheng@linux.intel.com> Cc: "avi@redhat.com" <avi@redhat.com> LKML-Reference: <20090720122417.GR5638@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-05 14:28:50 +02:00
Dave Airlie	b7f3158428	Merge git://git.infradead.org/~dwmw2/iommu-agp into agp-next	2009-08-05 10:16:57 +10:00
Linus Torvalds	067e18133f	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Work around compilation warning in arch/x86/kernel/apm_32.c x86, UV: Complete IRQ interrupt migration in arch_enable_uv_irq() x86, 32-bit: Fix double accounting in reserve_top_address() x86: Don't use current_cpu_data in x2apic phys_pkg_id x86, UV: Fix UV apic mode x86, UV: Fix macros for accessing large node numbers x86, UV: Delete mapping of MMR rangs mapped by BIOS x86, UV: Handle missing blade-local memory correctly x86: fix assembly constraints in native_save_fl() x86, msr: execute on the correct CPU subset x86: Fix assert syntax in vmlinux.lds.S x86: Make 64-bit efi_ioremap use ioremap on MMIO regions x86: Add quirk to make Apple MacBook5,2 use reboot=pci x86: Fix CPA memtype reserving in the set_pages_array*() cases x86, pat: Fix set_memory_wc related corruption x86: fix section mismatch for i386 init code	2009-08-04 15:28:59 -07:00
Jack Steiner	67e83f309e	x86, UV: Fix macros for accessing large node numbers The UV chipset automatically supplies the upper bits on nodes being referenced by MMR accesses. These bit can be deleted from the hub addressing macros. Signed-off-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <20090727143808.GA8076@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-04 16:19:14 +02:00
Jack Steiner	6c7184b774	x86, UV: Handle missing blade-local memory correctly UV blades may not have any blade-local memory. Add a field (nid) to the UV blade structure to indicates whether the node has local memory. This is needed by the GRU driver (pushed separately). Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: linux-mm@kvack.org LKML-Reference: <20090727143507.GA7006@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-04 16:18:01 +02:00
H. Peter Anvin	f1f029c7bf	x86: fix assembly constraints in native_save_fl() From Gabe Black in bugzilla 13888: native_save_fl is implemented as follows: 11static inline unsigned long native_save_fl(void) 12{ 13 unsigned long flags; 14 15 asm volatile("# __raw_save_flags\n\t" 16 "pushf ; pop %0" 17 : "=g" (flags) 18 : /* no input */ 19 : "memory"); 20 21 return flags; 22} If gcc chooses to put flags on the stack, for instance because this is inlined into a larger function with more register pressure, the offset of the flags variable from the stack pointer will change when the pushf is performed. gcc doesn't attempt to understand that fact, and address used for pop will still be the same. It will write to somewhere near flags on the stack but not actually into it and overwrite some other value. I saw this happen in the ide_device_add_all function when running in a simulator I work on. I'm assuming that some quirk of how the simulated hardware is set up caused the code path this is on to be executed when it normally wouldn't. A simple fix might be to change "=g" to "=r". Reported-by: Gabe Black <spamforgabe@umich.edu> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Stable Team <stable@kernel.org>	2009-08-03 16:36:17 -07:00

... 3 4 5 6 7 ...

1622 Commits