linux

Commit Graph

Author	SHA1	Message	Date
Yan, Zheng	ebb6cc0359	perf/x86: Fixes for Nehalem-EX uncore driver This patch includes following fixes and update: - Only some events in the Sbox and Mbox can use the match/mask registers, add code to check this. - The format definitions for xbr_mm_cfg and xbr_match registers in the Rbox are wrong, xbr_mm_cfg should use 32 bits, xbr_match should use 64 bits. - Cleanup the Rbox code. Compute the addresses extra registers in the enable_event function instead of the hw_config function. This simplifies the code in nhmex_rbox_alter_er(). Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1344229882-3907-2-git-send-email-zheng.z.yan@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-08-13 19:01:03 +02:00
Borislav Petkov	cffa59baa5	perf, x86: Fix uncore_types_exit section mismatch Fix the following section mismatch: WARNING: arch/x86/kernel/cpu/built-in.o(.text+0x7ad9): Section mismatch in reference from the function uncore_types_exit() to the function .init.text:uncore_type_exit() The function uncore_types_exit() references the function __init uncore_type_exit(). This is often because uncore_types_exit lacks a __init annotation or the annotation of uncore_type_exit is wrong. caused by `14371cce03` ("perf: Add generic PCI uncore PMU device support"). Cc: Zheng Yan <zheng.z.yan@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1339741902-8449-8-git-send-email-zheng.z.yan@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-08-13 19:01:03 +02:00
Chen Gong	55babd8f41	x86/mce: Add CMCI poll mode On Intel systems corrected machine check interrupts (CMCI) may be sent to multiple logical processors; possibly to all processors on the affected socket (SDM Volume 3B "15.5.1 CMCI Local APIC Interface"). This means that a persistent error (such as a stuck bit in ECC memory) may cause a storm of interrupts that greatly hinders or prevents forward progress (probably on many processors). To solve this we keep track of the rate at which each processor sees CMCI. If we exceed a threshold, we disable CMCI delivery and switch to polling the machine check banks. If the storm subsides (none of the affected processors see any more errors for a complete poll interval) we re-enable CMCI. [Tony: Added console messages when storm begins/ends and increased storm threshold from 5 to 15 so we have a few more logged entries before we disable interrupts and start dropping reports] Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-08-09 11:44:51 -07:00
Tony Luck	4670a300a2	x86/mce: Make cmci_discover() quiet cmci_discover() works out which machine check banks support CMCI, and which of those are shared by multiple logical processors. It uses this information to ensure that exactly one cpu is designated the owner of each bank so that when interrupts are broadcast to multiple cpus, only one of them will look in a shared bank to log the error and clear the bank. At boot time cmci_discover() performs this task silently. But during certain cpu hotplug operations it prints out a set of summary lines like this: CPU 35 MCA banks CMCI:0 CMCI:1 CMCI:3 CMCI:5 CMCI:6 CMCI:7 CMCI:8 CMCI:9 CMCI:10 CMCI:11 CPU 1 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 39 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 38 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 32 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 37 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 36 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 34 MCA banks CMCI:0 CMCI:1 CMCI:3 The value of these messages seems very low. A user might painstakingly cross-check against the data sheet for a processor to ensure that all CMCI supported banks are correctly reported, but this seems improbable. If users really wanted to do this, we should print the information at boot time too. Remove the messages. Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-08-09 10:59:21 -07:00
Suresh Siddha	c6fd893da9	x86, avx: don't use avx instructions with "noxsave" boot param Clear AVX, AVX2 features along with clearing XSAVE feature bits, as part of the parsing "noxsave" parameter. Fixes the kernel boot panic with "noxsave" boot parameter. We could have checked cpu_has_osxsave along with cpu_has_avx etc, but Peter mentioned clearing the feature bits will be better for uses like static_cpu_has() etc. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1343755754.2041.2.camel@sbsiddha-desk.sc.intel.com Cc: <stable@vger.kernel.org> # v3.5 Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-08-08 13:41:42 -07:00
Borislav Petkov	057237bb35	x86, cpu: Preset default tlb_flushall_shift on AMD Run the mprotect.c microbenchmark on all our families >= K8 and preset the flushall shift variable accordingly. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/1344272439-29080-5-git-send-email-bp@amd64.org Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-08-06 19:18:39 -07:00
Borislav Petkov	b46882e4c4	x86, cpu: Add AMD TLB size detection Read I- and DTLB entries count from CPUID on AMD. Handle all the different family-specific cases. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/1344272439-29080-4-git-send-email-bp@amd64.org Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-08-06 19:18:34 -07:00
Borislav Petkov	5b556332c3	x86, cpu: Push TLB detection CPUID check down Push the max CPUID leaf check into the ->detect_tlb function and remove general test case from the generic path. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/1344272439-29080-3-git-send-email-bp@amd64.org Acked-by: Alex Shi <alex.shi@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-08-06 19:18:29 -07:00
Borislav Petkov	a9ad773e0d	x86, cpu: Fixup tlb_flushall_shift formatting The TLB characteristics appeared like this in dmesg: [ 0.065817] Last level iTLB entries: 4KB 512, 2MB 1024, 4MB 512 [ 0.065817] Last level dTLB entries: 4KB 1024, 2MB 1024, 4MB 512 [ 0.065817] tlb_flushall_shift is 0xffffffff where tlb_flushall_shift is actually -1 but dumped as a hex number. However, the Kconfig option CONFIG_DEBUG_TLBFLUSH and the rest of the code treats this as a signed decimal and states "If you set it to -1, the code flushes the whole TLB unconditionally." So, fix its formatting in accordance with the other references to it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/1344272439-29080-2-git-send-email-bp@amd64.org Acked-by: Alex Shi <alex.shi@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-08-06 19:18:09 -07:00
Thomas Gleixner	1a65f970d1	x86: mce: Remove the frozen cases in the hotplug code No point in having double cases if we can simply mask the FROZEN bit out. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-08-03 11:46:50 -07:00
Thomas Gleixner	26c3c283c5	x86: mce: Split timer init Split timer init function into the init and the start part, so the start part can replace the open coded version in CPU_DOWN_FAILED. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-08-03 11:46:29 -07:00
Thomas Gleixner	b5975917a3	x86: mce: Serialize mce injection raise_mce() fiddles with global state, but lacks any kind of serialization. Add a mutex around the raise_mce() call, so concurrent writers do not stomp on each other toes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-08-03 11:45:56 -07:00
Thomas Gleixner	ea22571c8f	x86: mce: Disable preemption when calling raise_local() raise_mce() has a code path which does not disable preemption when the raise_local() is called. The per cpu variable access in raise_local() depends on preemption being disabled to be functional. So that code path was either never tested or never tested with CONFIG_DEBUG_PREEMPT enabled. Add the missing preempt_disable/enable() pair around the call. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-08-03 11:45:20 -07:00
Linus Torvalds	1ca0049f2c	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Various fixes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86-64, kcmp: The kcmp system call can be common arch/x86/kernel/kdebugfs.c: Ensure a consistent return value in error case x86/mce: Add quirk for instruction recovery on Sandy Bridge processors x86/mce: Move MCACOD defines from mce-severity.c to <asm/mce.h> x86/ioapic: Fix NULL pointer dereference on CPU hotplug after disabling irqs x86, nops: Missing break resulting in incorrect selection on Intel x86: CONFIG_CC_STACKPROTECTOR=y is no longer experimental	2012-08-03 10:59:36 -07:00
Linus Torvalds	bd463a0606	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Fix merge window fallout and fix sleep profiling (this was always broken, so it's not a fix for the merge window - we can skip this one from the head of the tree)." * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/trace: Add ability to set a target task for events perf/x86: Fix USER/KERNEL tagging of samples properly perf/x86/intel/uncore: Make UNCORE_PMU_HRTIMER_INTERVAL 64-bit	2012-08-03 10:57:20 -07:00
Linus Torvalds	bca1a5c0ea	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "The biggest changes are Intel Nehalem-EX PMU uncore support, uprobes updates/cleanups/fixes from Oleg and diverse tooling updates (mostly fixes) now that Arnaldo is back from vacation." * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) uprobes: __replace_page() needs munlock_vma_page() uprobes: Rename vma_address() and make it return "unsigned long" uprobes: Fix register_for_each_vma()->vma_address() check uprobes: Introduce vaddr_to_offset(vma, vaddr) uprobes: Teach build_probe_list() to consider the range uprobes: Remove insert_vm_struct()->uprobe_mmap() uprobes: Remove copy_vma()->uprobe_mmap() uprobes: Fix overflow in vma_address()/find_active_uprobe() uprobes: Suppress uprobe_munmap() from mmput() uprobes: Uprobe_mmap/munmap needs list_for_each_entry_safe() uprobes: Clean up and document write_opcode()->lock_page(old_page) uprobes: Kill write_opcode()->lock_page(new_page) uprobes: __replace_page() should not use page_address_in_vma() uprobes: Don't recheck vma/f_mapping in write_opcode() perf/x86: Fix missing struct before structure name perf/x86: Fix format definition of SNB-EP uncore QPI box perf/x86: Make bitfield unsigned perf/x86: Fix LLC-* and node-* events on Intel SandyBridge perf/x86: Add Intel Nehalem-EX uncore support perf/x86: Fix typo in format definition of uncore PCU filter ...	2012-07-31 15:34:13 -07:00
Peter Zijlstra	d07bdfd322	perf/x86: Fix USER/KERNEL tagging of samples properly Some PMUs don't provide a full register set for their sample, specifically 'advanced' PMUs like AMD IBS and Intel PEBS which provide 'better' than regular interrupt accuracy. In this case we use the interrupt regs as basis and over-write some fields (typically IP) with different information. The perf core however uses user_mode() to distinguish user/kernel samples, user_mode() relies on regs->cs. If the interrupt skid pushed us over a boundary the new IP might not be in the same domain as the interrupt. Commit `ce5c1fe9a9` ("perf/x86: Fix USER/KERNEL tagging of samples") tried to fix this by making the perf core use kernel_ip(). This however is wrong (TM), as pointed out by Linus, since it doesn't allow for VM86 and non-zero based segments in IA32 mode. Therefore, provide a new helper to set the regs->ip field, set_linear_ip(), which massages the regs into a suitable state assuming the provided IP is in fact a linear address. Also modify perf_instruction_pointer() and perf_callchain_user() to deal with segments base offsets. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1341910954.3462.102.camel@twins Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-31 17:02:04 +02:00
Andrew Morton	7740dfc036	perf/x86/intel/uncore: Make UNCORE_PMU_HRTIMER_INTERVAL 64-bit i386 allmodconfig: arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function 'uncore_pmu_hrtimer': arch/x86/kernel/cpu/perf_event_intel_uncore.c:728: warning: integer overflow in expression arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function 'uncore_pmu_start_hrtimer': arch/x86/kernel/cpu/perf_event_intel_uncore.c:735: warning: integer overflow in expression Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-h84qlqj02zrojmxxybzmy9hi@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-31 17:02:03 +02:00
Linus Torvalds	4cb38750d4	Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/mm changes from Peter Anvin: "The big change here is the patchset by Alex Shi to use INVLPG to flush only the affected pages when we only need to flush a small page range. It also removes the special INVALIDATE_TLB_VECTOR interrupts (32 vectors!) and replace it with an ordinary IPI function call." Fix up trivial conflicts in arch/x86/include/asm/apic.h (added code next to changed line) * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tlb: Fix build warning and crash when building for !SMP x86/tlb: do flush_tlb_kernel_range by 'invlpg' x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR x86/tlb: enable tlb flush range support for x86 mm/mmu_gather: enable tlb flush range in generic mmu_gather x86/tlb: add tlb_flushall_shift knob into debugfs x86/tlb: add tlb_flushall_shift for specific CPU x86/tlb: fall back to flush all when meet a THP large page x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range x86/tlb_info: get last level TLB entry number of CPU x86: Add read_mostly declaration/definition to variables from smp.h x86: Define early read-mostly per-cpu macros	2012-07-26 13:17:17 -07:00
Linus Torvalds	79071638ce	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes from Ingo Molnar: "The biggest change is a performance improvement on SMP systems: \| 4 socket 40 core + SMT Westmere box, single 30 sec tbench \| runs, higher is better: \| \| clients 1 2 4 8 16 32 64 128 \|.......................................................................... \| pre 30 41 118 645 3769 6214 12233 14312 \| post 299 603 1211 2418 4697 6847 11606 14557 \| \| A nice increase in performance. which speedup is particularly noticeable on heavily interacting few-tasks workloads, so the changes should help desktop-style Xorg workloads and interactivity as well, on multi-core CPUs. There are also cpuset suspend behavior fixes/restructuring and various smaller tweaks." * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix race in task_group() sched: Improve balance_cpu() to consider other cpus in its group as target of (pinned) task sched: Reset loop counters if all tasks are pinned and we need to redo load balance sched: Reorder 'struct lb_env' members to reduce its size sched: Improve scalability via 'CPU buddies', which withstand random perturbations cpusets: Remove/update outdated comments cpusets, hotplug: Restructure functions that are invoked during hotplug cpusets, hotplug: Implement cpuset tree traversal in a helper function CPU hotplug, cpusets, suspend: Don't modify cpusets during suspend/resume sched/x86: Remove broken power estimation	2012-07-26 13:08:01 -07:00
Tony Luck	61b0fccd7f	x86/mce: Add quirk for instruction recovery on Sandy Bridge processors Sandy Bridge processors follow the SDM (Vol 3B, Table 15-20) and set both the RIPV and EIPV bits in the MCG_STATUS register to zero for machine checks during instruction fetch. This is more than a little counter-intuitive and means that Linux cannot recover from these errors. Rather than insert special case code at several places in mce.c and mce-severity.c, we pretend the EIPV bit was set for just this case early in processing the machine check. Acked-by: Borislav Petkov <bp@amd64.org> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Chen Gong <gong.chen@linux.intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Link: http://lkml.kernel.org/r/180a06f3f357cf9f78259ae443a082b14a29535b.1343078495.git.tony.luck@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-26 15:05:47 +02:00
Tony Luck	736edce5f3	x86/mce: Move MCACOD defines from mce-severity.c to <asm/mce.h> We will need some of these values in mce.c. Move them to the appropriate header file so they are available. Acked-by: Borislav Petkov <bp@amd64.org> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Chen Gong <gong.chen@linux.intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Link: http://lkml.kernel.org/r/0ccfb1af5fe35e537b7cd8e4d448bf7d851dbfb9.1343078495.git.tony.luck@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-26 15:05:47 +02:00
Yan, Zheng	c1ece48cf7	perf/x86: Fix format definition of SNB-EP uncore QPI box The event control register of SNB-EP uncore QPI box has a one bit extension at bit position 21. Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1343097850-4348-1-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-26 12:23:14 +02:00
Peter Zijlstra	597ed953d7	perf/x86: Make bitfield unsigned Fix: arch/x86/kernel/cpu/perf_event.h:377:43: sparse: dubious one-bit signed bitfield Cc: Borislav Petkov <bp@amd64.org> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-2jxkmktkppkclj1qe6qxd7ah@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-26 12:23:13 +02:00
Yan, Zheng	74e6543fdc	perf/x86: Fix LLC-* and node-* events on Intel SandyBridge LLC-* and node-* events require using the OFFCORE_RESPONSE events on SandyBridge, but the hw_cache_extra_regs is left uninitialized. This patch adds the missing extra register configure table for SandyBridge. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1342517275-2875-1-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-26 12:23:12 +02:00
Yan, Zheng	254298c726	perf/x86: Add Intel Nehalem-EX uncore support The uncore subsystem in Nehalem-EX consists of 7 components (U-Box, C-Box, B-Box, S-Box, R-Box, M-Box and W-Box). This patch is large because the way to program these boxes is diverse. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4FF534F1.3030307@intel.com [ Improved the code. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-26 12:23:11 +02:00
Yan, Zheng	4f3f713fc7	perf/x86: Fix typo in format definition of uncore PCU filter The format definition of uncore PCU filter should be filter_band* instead of filter_brand*. Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1343024611-4692-1-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-26 12:23:11 +02:00
Linus Torvalds	62c4d9afa4	Features: * Performance improvement to lower the amount of traps the hypervisor has to do 32-bit guests. Mainly for setting PTE entries and updating TLS descriptors. * MCE polling driver to collect hypervisor MCE buffer and present them to /dev/mcelog. * Physical CPU online/offline support. When an privileged guest is booted it is present with virtual CPUs, which might have an 1:1 to physical CPUs but usually don't. This provides mechanism to offline/online physical CPUs. Bug-fixes for: * Coverity found fixes in the console and ACPI processor driver. * PVonHVM kexec fixes along with some cleanups. * Pages that fall within E820 gaps and non-RAM regions (and had been released to hypervisor) would be populated back, but potentially in non-RAM regions. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJQDWcvAAoJEFjIrFwIi8fJ6GAH/iFIkOC5wseD8qZ9nV4VI46t 0GYvBFC4F91NvC7CNfoAySr84v+ZORIZzMcdyDF8H/tLO9MaOY/Mwn0S5ZSqmYMi rhskvK3InBaVkYtceOHugNGM7mB0c3STIm7OsjW6gbVzohmTN25rbQR+X5iWAtVA cTUtDyH3AU15mwuVT3U+VC4IulHpnNJz4pHoq3Sn61/UK1LYmhLXYd5fveA0D0B8 lRZTAvNMsYDJDDmkWNrs8RczKkQ86DTSjfGawm0YG+Gf94GgD5yMHWbiHh2Gy93e u7sHK0RrKbP5BY/MV6vVJxkoV5NoWgCc0tcjBcYwdyvwzxDS75UhV6uoVHC3Ao8= =drt2 -----END PGP SIGNATURE----- Merge tag 'stable/for-linus-3.6-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen update from Konrad Rzeszutek Wilk: "Features: * Performance improvement to lower the amount of traps the hypervisor has to do 32-bit guests. Mainly for setting PTE entries and updating TLS descriptors. * MCE polling driver to collect hypervisor MCE buffer and present them to /dev/mcelog. * Physical CPU online/offline support. When an privileged guest is booted it is present with virtual CPUs, which might have an 1:1 to physical CPUs but usually don't. This provides mechanism to offline/online physical CPUs. Bug-fixes for: * Coverity found fixes in the console and ACPI processor driver. * PVonHVM kexec fixes along with some cleanups. * Pages that fall within E820 gaps and non-RAM regions (and had been released to hypervisor) would be populated back, but potentially in non-RAM regions." * tag 'stable/for-linus-3.6-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: populate correct number of pages when across mem boundary (v2) xen PVonHVM: move shared_info to MMIO before kexec xen: simplify init_hvm_pv_info xen: remove cast from HYPERVISOR_shared_info assignment xen: enable platform-pci only in a Xen guest xen/pv-on-hvm kexec: shutdown watches from old kernel xen/x86: avoid updating TLS descriptors if they haven't changed xen/x86: add desc_equal() to compare GDT descriptors xen/mm: zero PTEs for non-present MFNs in the initial page table xen/mm: do direct hypercall in xen_set_pte() if batching is unavailable xen/hvc: Fix up checks when the info is allocated. xen/acpi: Fix potential memory leak. xen/mce: add .poll method for mcelog device driver xen/mce: schedule a workqueue to avoid sleep in atomic context xen/pcpu: Xen physical cpus online/offline sys interface xen/mce: Register native mce handler as vMCE bounce back point x86, MCE, AMD: Adjust initcall sequence for xen xen/mce: Add mcelog support for Xen platform	2012-07-24 13:14:03 -07:00
Linus Torvalds	5fecc9d8f5	KVM updates for the 3.6 merge window -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJQDRDNAAoJEI7yEDeUysxlkl8P/3C2AHx2webOU8sVzhfU6ONZ ZoGevwBjyZIeJEmiWVpFTTEew1l0PXtpyOocXGNUXIddVnhXTQOKr/Scj4uFbmx8 ROqgK8NSX9+xOGrBPCoN7SlJkmp+m6uYtwYkl2SGnsEVLWMKkc7J7oqmszCcTQvN UXMf7G47/Ul2NUSBdv4Yvizhl4kpvWxluiweDw3E/hIQKN0uyP7CY58qcAztw8nG csZBAnnuPFwIAWxHXW3eBBv4UP138HbNDqJ/dujjocM6GnOxmXJmcZ6b57gh+Y64 3+w9IR4qrRWnsErb/I8inKLJ1Jdcf7yV2FmxYqR4pIXay2Yzo1BsvFd6EB+JavUv pJpixrFiDDFoQyXlh4tGpsjpqdXNMLqyG4YpqzSZ46C8naVv9gKE7SXqlXnjyDlb Llx3hb9Fop8O5ykYEGHi+gIISAK5eETiQl4yw9RUBDpxydH4qJtqGIbLiDy8y9wi Xyi8PBlNl+biJFsK805lxURqTp/SJTC3+Zb7A7CzYEQm5xZw3W/CKZx1ZYBfpaa/ pWaP6tB7JwgLIVXi4HQayLWqMVwH0soZIn9yazpOEFv6qO8d5QH5RAxAW2VXE3n5 JDlrajar/lGIdiBVWfwTJLb86gv3QDZtIWoR9mZuLKeKWE/6PRLe7HQpG1pJovsm 2AsN5bS0BWq+aqPpZHa5 =pECD -----END PGP SIGNATURE----- Merge tag 'kvm-3.6-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Avi Kivity: "Highlights include - full big real mode emulation on pre-Westmere Intel hosts (can be disabled with emulate_invalid_guest_state=0) - relatively small ppc and s390 updates - PCID/INVPCID support in guests - EOI avoidance; 3.6 guests should perform better on 3.6 hosts on interrupt intensive workloads) - Lockless write faults during live migration - EPT accessed/dirty bits support for new Intel processors" Fix up conflicts in: - Documentation/virtual/kvm/api.txt: Stupid subchapter numbering, added next to each other. - arch/powerpc/kvm/booke_interrupts.S: PPC asm changes clashing with the KVM fixes - arch/s390/include/asm/sigp.h, arch/s390/kvm/sigp.c: Duplicated commits through the kvm tree and the s390 tree, with subsequent edits in the KVM tree. * tag 'kvm-3.6-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (93 commits) KVM: fix race with level interrupts x86, hyper: fix build with !CONFIG_KVM_GUEST Revert "apic: fix kvm build on UP without IOAPIC" KVM guest: switch to apic_set_eoi_write, apic_write apic: add apic_set_eoi_write for PV use KVM: VMX: Implement PCID/INVPCID for guests with EPT KVM: Add x86_hyper_kvm to complete detect_hypervisor_platform check KVM: PPC: Critical interrupt emulation support KVM: PPC: e500mc: Fix tlbilx emulation for 64-bit guests KVM: PPC64: booke: Set interrupt computation mode for 64-bit host KVM: PPC: bookehv: Add ESR flag to Data Storage Interrupt KVM: PPC: bookehv64: Add support for std/ld emulation. booke: Added crit/mc exception handler for e500v2 booke/bookehv: Add host crit-watchdog exception support KVM: MMU: document mmu-lock and fast page fault KVM: MMU: fix kvm_mmu_pagetable_walk tracepoint KVM: MMU: trace fast page fault KVM: MMU: fast path of handling guest page fault KVM: MMU: introduce SPTE_MMU_WRITEABLE bit KVM: MMU: fold tlb flush judgement into mmu_spte_update ...	2012-07-24 12:01:20 -07:00
Peter Zijlstra	ee08d1284e	sched/x86: Remove broken power estimation The x86 sched power implementation has been broken forever and gets in the way of other stuff, remove it. [ For archaeological interest, fixing this code would require dealing with the cross-cpu calling of these functions and more importantly, we need to filter idle time out of the a/m-perf stuff because the ratio will go down to 0 when idle, giving a 0 capacity which is not what we'd want. ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Link: http://lkml.kernel.org/r/1339594110.8980.38.camel@twins Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-24 13:53:00 +02:00
Linus Torvalds	5b160bd426	Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/mce changes from Ingo Molnar: "This tree improves the AMD thresholding bank code and includes a memory fault signal handling fixlet." * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Fix siginfo_t->si_addr value for non-recoverable memory faults x86, MCE, AMD: Update copyrights and boilerplate x86, MCE, AMD: Give proper names to the thresholding banks x86, MCE, AMD: Make error_count read only x86, MCE, AMD: Cleanup reading of error_count x86, MCE, AMD: Print decimal thresholding values x86, MCE, AMD: Move shared bank to node descriptor x86, MCE, AMD: Remove local_allocate_... wrapper x86, MCE, AMD: Remove shared banks sysfs linking x86, amd_nb: Export model 0x10 and later PCI id	2012-07-22 16:07:45 -07:00
Linus Torvalds	3fad0953a1	Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull debug-for-linus git tree from Ingo Molnar. Fix up trivial conflict in arch/x86/kernel/cpu/perf_event_intel.c due to a printk() having changed to a pr_info() differently in the two branches. * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Move call to print_modules() out of show_regs() x86/mm: Mark free_initrd_mem() as __init x86/microcode: Mark microcode_id[] as __initconst x86/nmi: Clean up register_nmi_handler() usage x86: Save cr2 in NMI in case NMIs take a page fault (for i386) x86: Remove cmpxchg from i386 NMI nesting code x86: Save cr2 in NMI in case NMIs take a page fault x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level>	2012-07-22 12:04:44 -07:00
Linus Torvalds	a065de0d25	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/asm changes from Ingo Molnar: "Assorted single-commit improvements, as usual" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm/mtrr: Slightly simplify print_mtrr_state() x86/mm/mtrr: Fix alignment determination in range_to_mtrr() x86/copy_user_generic: Optimize copy_user_generic with CPU erms feature x86/alternatives: Use atomic_xchg() instead atomic_dec_and_test() for stop_machine_text_poke()	2012-07-22 11:42:28 -07:00
Liu, Jinsong	a8fccdb061	x86, MCE, AMD: Adjust initcall sequence for xen there are 3 funcs which need to be _initcalled in a logic sequence: 1. xen_late_init_mcelog 2. mcheck_init_device 3. threshold_init_device xen_late_init_mcelog must register xen_mce_chrdev_device before native mce_chrdev_device registration if running under xen platform; mcheck_init_device should be inited before threshold_init_device to initialize mce_device, otherwise a a NULL ptr dereference will cause panic. so we use following _initcalls 1. device_initcall(xen_late_init_mcelog); 2. device_initcall_sync(mcheck_init_device); 3. late_initcall(threshold_init_device); when running under xen, the initcall order is 1,2,3; on baremetal, we skip 1 and we do only 2 and 3. Acked-and-tested-by: Borislav Petkov <bp@amd64.org> Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-07-19 15:51:37 -04:00
Liu, Jinsong	cef12ee52b	xen/mce: Add mcelog support for Xen platform When MCA error occurs, it would be handled by Xen hypervisor first, and then the error information would be sent to initial domain for logging. This patch gets error information from Xen hypervisor and convert Xen format error into Linux format mcelog. This logic is basically self-contained, not touching other kernel components. By using tools like mcelog tool users could read specific error information, like what they did under native Linux. To test follow directions outlined in Documentation/acpi/apei/einj.txt Acked-and-tested-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Ke, Liping <liping.ke@intel.com> Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-07-19 15:51:36 -04:00
Avi Kivity	d63d3e6217	x86, hyper: fix build with !CONFIG_KVM_GUEST Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-07-18 17:01:48 -03:00
Ingo Molnar	bb65a764de	Merge branch 'mce-ripvfix' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/mce Merge memory fault handling fix from Tony Luck. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-11 22:37:48 +02:00
Tony Luck	6751ed65dc	x86/mce: Fix siginfo_t->si_addr value for non-recoverable memory faults In commit `dad1743e59` ("x86/mce: Only restart instruction after machine check recovery if it is safe") we fixed mce_notify_process() to force a signal to the current process if it was not restartable (RIPV bit not set in MCG_STATUS). But doing it here means that the process doesn't get told the virtual address of the fault via siginfo_t->si_addr. This would prevent application level recovery from the fault. Make a new MF_MUST_KILL flag bit for memory_failure() et al. to use so that we will provide the right information with the signal. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Cc: stable@kernel.org # 3.4+	2012-07-11 10:20:47 -07:00
Prarit Bhargava	fc73373b33	KVM: Add x86_hyper_kvm to complete detect_hypervisor_platform check While debugging I noticed that unlike all the other hypervisor code in the kernel, kvm does not have an entry for x86_hyper which is used in detect_hypervisor_platform() which results in a nice printk in the syslog. This is only really a stub function but it does make kvm more consistent with the other hypervisors. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Marcelo Tostatti <mtosatti@redhat.com> Cc: kvm@vger.kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>	2012-07-11 19:33:32 +03:00
Ingo Molnar	92254d3144	Linux 3.5-rc6 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJP+NMmAAoJEHm+PkMAQRiGPxEH/18YQN8FAzEIjcC10ytA3RC3 KzPv31jXgJGZDy1UqmpKtJ7GDwb92AhqZxVnJimMa+6d1uA8NsZQq5EMOPPiX8Qi 8P4AEaw5kSMmR/6zxsxguCGdbDLU3xZ1nJZkHyMgjo2UJbMU0jBPneb/79heWPhe 0HOkLzN5VA6Yx3Nt70sWQ1zsuj0Ji5jCGO0iNTCBmTiv4J9ZlOx3xJQn4aK6JscO /3QRTM43GG0j6zToEOCTHrn8ajOq6rHQQkG0bPVR723nFrSGLoaCT6QVBXYug+AZ 9Xay7zVNvrq2oH5x5jADG2t2vyaG+nEJpSrVjXznzxgDnK7tWjYqiuG5zqKhAq8= =IMfr -----END PGP SIGNATURE----- Merge tag 'v3.5-rc6' into x86/mce Merge Linux 3.5-rc6 before merging more code. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-11 09:41:37 +02:00
Jan Beulich	a7101d1526	x86/mm/mtrr: Slightly simplify print_mtrr_state() high_width can be easily calculated in a single expression when making use of __ffs64(). Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/4FF71053020000780008E1B5@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-10 10:38:15 +02:00
Jan Beulich	1ba9a29414	x86/mm/mtrr: Fix alignment determination in range_to_mtrr() With the variable operated on being of "unsigned long" type, neither ffs() nor fls() are suitable to use on them, as those truncate their arguments to 32 bits. Using __ffs() and __fls() respectively at once eliminates the need to subtract 1 from their results. Additionally, with the alignment value subsequently used as a shift count, it must be enforced to be less than BITS_PER_LONG (and on 64-bit there's no need for it to be any smaller). Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/4FF70D54020000780008E179@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-10 10:38:14 +02:00
Pekka Enberg	c3b7cdf180	perf/x86: Fix intel_perfmon_event_mapformatting Use tabs for "intel_perfmon_event_map" formatting in perf_event_intel.c. Signed-off-by: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1341568786-7045-1-git-send-email-penberg@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-06 13:16:15 +02:00
Yan, Zheng	6a67943a18	perf/x86: Uncore filter support for SandyBridge-EP This patch adds C-Box and PCU filter support for SandyBridge-EP uncore. We can filter C-Box events by thread/core ID and filter PCU events by frequency/voltage. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1341381616-12229-5-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:56:01 +02:00
Yan, Zheng	4208969724	perf/x86: Detect number of instances of uncore CBox The CBox manages the interface between the core and the LLC, so the instances of uncore CBox is equal to number of cores. Reported-by: Andrew Cooks <acooks@gmail.com> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1341381616-12229-4-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:56:00 +02:00
Yan, Zheng	3b19e4c98c	perf/x86: Fix event constraint for SandyBridge-EP C-Box The constraint for C-Box event 0x1f should have overlap flag set. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1340866596-22502-2-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:55:59 +02:00
Yan, Zheng	eca26c9950	perf/x86: Use 0xff as pseudo code for fixed uncore event Stephane Eranian suggestted using 0xff as pseudo code for fixed uncore event and using the umask value to determine which of the fixed events we want to map to. So far there is at most one fixed counter in a uncore PMU. So just change the definition of UNCORE_FIXED_EVENT to 0xff. Suggested-by: Stephane Eranian <eranian@google.com> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1340780953-21130-1-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:55:58 +02:00
Peter Zijlstra	3e0091e2b6	perf/x86: Save a few bytes in 'struct x86_pmu' All these are basically boolean flags, use a bitfield to save a few bytes. Suggested-by: Borislav Petkov <bp@amd64.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-vsevd5g8lhcn129n3s7trl7r@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:55:58 +02:00
Peter Zijlstra	c93dc84cbe	perf/x86: Add a microcode revision check for SNB-PEBS Recent Intel microcode resolved the SNB-PEBS issues, so conditionally enable PEBS on SNB hardware depending on the microcode revision. Thanks to Stephane for figuring out the various microcode revisions. Suggested-by: Stephane Eranian <eranian@google.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-v3672ziwh9damwqwh1uz3krm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:55:57 +02:00
Robert Richter	f285f92f7e	perf/x86: Improve debug output in check_hw_exists() It might be of interest which perfctr msr failed. Signed-off-by: Robert Richter <robert.richter@amd.com> [ added hunk to avoid GCC warn ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1340217996-2254-5-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:19:42 +02:00
Robert Richter	b1dc3c4820	perf/x86/amd: Unify AMD's generic and family 15h pmus There is no need for keeping separate pmu structs. We can enable amd_{get,put}_event_constraints() functions also for family 15h event. The advantage is that there is only a single pmu struct for all AMD cpus. This patch introduces functions to setup the pmu to enabe core performance counters or counter constraints. Also, cpuid checks are used instead of family checks where possible. Thus, it enables the code independently of cpu families if the feature flag is set. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1340217996-2254-4-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:19:41 +02:00
Robert Richter	a1eac7ac90	perf/x86: Move Intel specific code to intel_pmu_init() There is some Intel specific code in the generic x86 path. Move it to intel_pmu_init(). Since p4 and p6 pmus don't have fixed counters we may skip the check in case such a pmu is detected. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1340217996-2254-3-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:19:40 +02:00
Robert Richter	15c7ad51ad	perf/x86: Rename Intel specific macros There are macros that are Intel specific and not x86 generic. Rename them into INTEL_*. This patch removes X86_PMC_IDX_GENERIC and does: $ sed -i -e 's/X86_PMC_MAX_/INTEL_PMC_MAX_/g' \ arch/x86/include/asm/kvm_host.h \ arch/x86/include/asm/perf_event.h \ arch/x86/kernel/cpu/perf_event.c \ arch/x86/kernel/cpu/perf_event_p4.c \ arch/x86/kvm/pmu.c $ sed -i -e 's/X86_PMC_IDX_FIXED/INTEL_PMC_IDX_FIXED/g' \ arch/x86/include/asm/perf_event.h \ arch/x86/kernel/cpu/perf_event.c \ arch/x86/kernel/cpu/perf_event_intel.c \ arch/x86/kernel/cpu/perf_event_intel_ds.c \ arch/x86/kvm/pmu.c $ sed -i -e 's/X86_PMC_MSK_/INTEL_PMC_MSK_/g' \ arch/x86/include/asm/perf_event.h \ arch/x86/kernel/cpu/perf_event.c Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1340217996-2254-2-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:19:39 +02:00
Ingo Molnar	b0338e99b2	Merge branch 'x86/cpu' into perf/core Merge this branch because we changed the wrmsr*_safe() API and there's a conflict. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:12:11 +02:00
Ingo Molnar	90574ebb7e	Merge branch 'perf/urgent' into perf/core Merge this branch to pick up a fixlet and to update to a more recent base. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:10:23 +02:00
Peter Zijlstra	ce5c1fe9a9	perf/x86: Fix USER/KERNEL tagging of samples Several perf interrupt handlers (PEBS,IBS,BTS) re-write regs->ip but do not update the segment registers. So use an regs->ip based test instead of an regs->cs/regs->flags based test. Reported-and-tested-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-xxrt0a1zronm1sm36obwc2vy@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 20:59:07 +02:00
Alex Shi	c4211f42d3	x86/tlb: add tlb_flushall_shift for specific CPU Testing show different CPU type(micro architectures and NUMA mode) has different balance points between the TLB flush all and multiple invlpg. And there also has cases the tlb flush change has no any help. This patch give a interface to let x86 vendor developers have a chance to set different shift for different CPU type. like some machine in my hands, balance points is 16 entries on Romely-EP; while it is at 8 entries on Bloomfield NHM-EP; and is 256 on IVB mobile CPU. but on model 15 core2 Xeon using invlpg has nothing help. For untested machine, do a conservative optimization, same as NHM CPU. Signed-off-by: Alex Shi <alex.shi@intel.com> Link: http://lkml.kernel.org/r/1340845344-27557-5-git-send-email-alex.shi@intel.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-06-27 19:29:10 -07:00
Alex Shi	e0ba94f14f	x86/tlb_info: get last level TLB entry number of CPU For 4KB pages, x86 CPU has 2 or 1 level TLB, first level is data TLB and instruction TLB, second level is shared TLB for both data and instructions. For hupe page TLB, usually there is just one level and seperated by 2MB/4MB and 1GB. Although each levels TLB size is important for performance tuning, but for genernal and rude optimizing, last level TLB entry number is suitable. And in fact, last level TLB always has the biggest entry number. This patch will get the biggest TLB entry number and use it in furture TLB optimizing. Accroding Borislav's suggestion, except tlb_ll[i/d]_* array, other function and data will be released after system boot up. For all kinds of x86 vendor friendly, vendor specific code was moved to its specific files. Signed-off-by: Alex Shi <alex.shi@intel.com> Link: http://lkml.kernel.org/r/1340845344-27557-2-git-send-email-alex.shi@intel.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-06-27 19:28:24 -07:00
H. Peter Anvin	1b6b7c9ff3	x86, cpufeature: Remove stray %s, add -w to mkcapflags.pl There was a stray %s left from testing, remove it. Add -w to the #! line (which is parsed by Perl even if the Perl interpreter is invoked explicitly on the command line) to catch these kinds of errors in the future. Reported-by: Jean Delvare <khali@linux-fr.org> Link: http://lkml.kernel.org/r/20120626143246.0c9bf301@endymion.delvare Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-06-26 08:02:48 -07:00
H. Peter Anvin	55f6cb9d0b	x86, cpufeature: Catch duplicate CPU feature strings We had a case of duplicate CPU feature strings, a user space ABI violation, for almost two years. Make it a build error so that doesn't happen again. Link: http://lkml.kernel.org/r/4FE34BCB.5050305@linux.intel.com Cc: Jan Beulich <JBeulich@suse.com> Cc: Jean Delvare <khali@linux-fr.org>	2012-06-25 09:02:13 -07:00
H. Peter Anvin	4ad3341130	x86, cpufeature: Rename X86_FEATURE_DTS to X86_FEATURE_DTHERM It makes sense to label "Digital Thermal Sensor" as "DTS", but unfortunately the string "dts" was already used for "Debug Store", and /proc/cpuinfo is a user space ABI. Therefore, rename this to "dtherm". This conflict went into mainline via the hwmon tree without any x86 maintainer ack, and without any kind of hint in the subject. `a4659053` x86/hwmon: fix initialization of coretemp Reported-by: Jean Delvare <khali@linux-fr.org> Link: http://lkml.kernel.org/r/4FE34BCB.5050305@linux.intel.com Cc: Jan Beulich <JBeulich@suse.com> Cc: <stable@vger.kernel.org> v2.6.36..v3.4 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-06-25 09:01:15 -07:00
Robert Richter	357398e96d	perf/x86: Fix section mismatch in uncore_pci_init() Fix section mismatch in uncore_pci_init(): WARNING: vmlinux.o(.init.text+0x9246): Section mismatch in reference from the function uncore_pci_init() to the function .devexit.text:uncore_pci_remove() The function __init uncore_pci_init() references a function __devexit uncore_pci_remove(). [...] Signed-off-by: Robert Richter <robert.richter@amd.com> Cc: <a.p.zijlstra@chello.nl> Cc: <zheng.z.yan@intel.com> Link: http://lkml.kernel.org/r/20120620163927.GI5046@erda.amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-25 10:32:21 +02:00
Ingo Molnar	6a991accee	Merge commit 'v3.5-rc3' into x86/debug Merge it in to pick up a fix that we are going to clean up in this branch. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-20 14:22:34 +02:00
Peter Zijlstra	2992c542fc	perf/x86: Lowercase uncore PMU event names Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-ucnds8gkve4x3s4biuukyph3@git.kernel.org [ Trivial build fix ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 15:55:52 +02:00
Yan, Zheng	7c94ee2e09	perf/x86: Add Intel Nehalem and Sandy Bridge-EP uncore support The uncore subsystem in Sandy Bridge-EP consists of 8 components: Ubox, Cacheing Agent, Home Agent, Memory controller, Power Control, QPI Link Layer, R2PCIe, R3QPI. Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1339741902-8449-9-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 12:13:23 +02:00
Yan, Zheng	14371cce03	perf: Add generic PCI uncore PMU device support This patch adds generic support for uncore PMUs presented as PCI devices. (These come in addition to the CPU/MSR based uncores.) Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1339741902-8449-8-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 12:13:23 +02:00
Yan, Zheng	fcde10e916	perf/x86: Add Intel Nehalem and Sandy Bridge uncore PMU support Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1339741902-8449-7-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 12:13:22 +02:00
Yan, Zheng	087bfbb032	perf/x86: Add generic Intel uncore PMU support This patch adds the generic Intel uncore PMU support, including helper functions that add/delete uncore events, a hrtimer that periodically polls the counters to avoid overflow and code that places all events for a particular socket onto a single cpu. The code design is based on the structure of Sandy Bridge-EP's uncore subsystem, which consists of a variety of components, each component contains one or more "boxes". (Tooling support follows in the next patches.) Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1339741902-8449-6-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 12:13:22 +02:00
Yan, Zheng	4b4969b144	perf: Export perf_assign_events() Export perf_assign_events() so the uncore code can use it to schedule events. Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1339741902-8449-2-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 12:13:20 +02:00
Robert Richter	76958a61e4	perf/x86/amd: Fix RDPMC index calculation for AMD family 15h The RDPMC index calculation is wrong for AMD family 15h (X86_FEATURE_ PERFCTR_CORE set). This leads to a #GP when accessing the counter: Pid: 2237, comm: syslog-ng Not tainted 3.5.0-rc1-perf-x86_64-standard-g130ff90 #135 AMD Pike/Pike RIP: 0010:[<ffffffff8100dc33>] [<ffffffff8100dc33>] x86_perf_event_update+0x27/0x66 While the msr address offset is (index << 1) we must use index to select the correct rdpmc. Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Vince Weaver <vweaver1@eecs.utk.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 11:14:35 +02:00
Shuah Khan	e2b297fcf1	perf/x86: Convert obsolete simple_strtoul() usage to kstrtoul() Signed-off-by: Shuah Khan <shuahkhan@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1339384421.3025.8.camel@lorien2 Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-11 10:52:12 +02:00
Ingo Molnar	c3e228d59b	Linux 3.5-rc2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJP0qm4AAoJEHm+PkMAQRiG62QIAJRNJFyVB0ZrsMPgdwLnlX4O 5I86H7GaYXoOK/KMb2s5h4KiFggIODnyEkZi+/39tJOgGo0KrMcDlsh0owB1Iggw LE6iyze9I1z9wQze0+SXe7VAcvUYvsx2vgpOKvoNi97Qgn3B6onL+SAi5U+NAqJl 0NdKmveEd42UIm7JfChHlxl8bm8YB+WcU38OkMGpRpJ/Moz9EbSjYVQg3oHrzJjy duiX6SD/OV4m5yCcXXmu+f41pN+SG7xENJ5r4enyi2ZF8mAyVz2goIyL2bA0AJX2 +GbpD1sxUHkZ6yPg4tf2bmJOj0PkfZNAi8YpFxZDlP4y1pKuCTEDTBp8O2id43w= =Jyn8 -----END PGP SIGNATURE----- Merge tag 'v3.5-rc2' into perf/core Merge in Linux 3.5-rc2 - to pick up fixes. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-11 10:51:35 +02:00
Linus Torvalds	0b35d326f8	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar. * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/nmi: Fix section mismatch warnings on 32-bit x86/uv: Fix UV2 BAU legacy mode x86/mm: Only add extra pages count for the first memory range during pre-allocation early page table space x86, efi stub: Add .reloc section back into image x86/ioapic: Fix NULL pointer dereference on CPU hotplug after disabling irqs x86/reboot: Fix a warning message triggered by stop_other_cpus() x86/intel/moorestown: Change intel_scu_devices_create() to __devinit x86/numa: Set numa_nodes_parsed at acpi_numa_memory_affinity_init() x86/gart: Fix kmemleak warning x86: mce: Add the dropped timer interval init back x86/mce: Fix the MCE poll timer logic	2012-06-08 09:26:55 -07:00
Linus Torvalds	106544d81d	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "A bit larger than what I'd wish for - half of it is due to hw driver updates to Intel Ivy-Bridge which info got recently released, cycles:pp should work there now too, amongst other things. (but we are generally making exceptions for hardware enablement of this type.) There are also callchain fixes in it - responding to mostly theoretical (but valid) concerns. The tooling side sports perf.data endianness/portability fixes which did not make it for the merge window - and various other fixes as well." * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits) perf/x86: Check user address explicitly in copy_from_user_nmi() perf/x86: Check if user fp is valid perf: Limit callchains to 127 perf/x86: Allow multiple stacks perf/x86: Update SNB PEBS constraints perf/x86: Enable/Add IvyBridge hardware support perf/x86: Implement cycles:p for SNB/IVB perf/x86: Fix Intel shared extra MSR allocation x86/decoder: Fix bsr/bsf/jmpe decoding with operand-size prefix perf: Remove duplicate invocation on perf_event_for_each perf uprobes: Remove unnecessary check before strlist__delete perf symbols: Check for valid dso before creating map perf evsel: Fix 32 bit values endianity swap for sample_id_all header perf session: Handle endianity swap on sample_id_all header data perf symbols: Handle different endians properly during symbol load perf evlist: Pass third argument to ioctl explicitly perf tools: Update ioctl documentation for PERF_IOC_FLAG_GROUP perf tools: Make --version show kernel version instead of pull req tag perf tools: Check if callchain is corrupted perf callchain: Make callchain cursors TLS ...	2012-06-08 09:14:46 -07:00
Ingo Molnar	707ecec1dc	AMD thresholding fixes for 3.6 Those are a bunch of patches which give the MCE thresholding code a hard look and a scrubbing to remove a couple of annoyances like sysfs warnings when running CPU off-/online tests and the threshold_bank4 node under /sys/devices/system/machinecheck/ is a symlink. It also gives proper names to the thresholding banks instead of simply enumerating them, like this: /sys/devices/system/machinecheck/machinecheck0/ \|-- bank0 \|-- bank1 \|-- bank2 \|-- bank3 \|-- bank4 \|-- bank5 \|-- bank6 \|-- check_interval \|-- cmci_disabled \|-- combined_unit \| \|-- combined_unit \| \|-- error_count \| \|-- threshold_limit \|-- dont_log_ce \|-- execution_unit \| \|-- execution_unit \| \|-- error_count \| \|-- threshold_limit \|-- ignore_ce \|-- insn_fetch \| \|-- insn_fetch \| \|-- error_count \| \|-- threshold_limit \|-- load_store \| \|-- load_store \| \|-- error_count \| \|-- threshold_limit \|-- monarch_timeout \|-- northbridge \| \|-- dram \| \| \|-- error_count \| \| \|-- interrupt_enable \| \| \|-- threshold_limit \| \|-- ht_links \| \| \|-- error_count \| \| \|-- interrupt_enable \| \| \|-- threshold_limit \| \|-- l3_cache \| \|-- error_count \| \|-- interrupt_enable \| \|-- threshold_limit ... It is tested on all our families >= K8. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJP0Jw9AAoJEBLB8Bhh3lVKMa8P/1ZPWkFZVFIdilyViQdSR/1/ 6MPy6BcZAACBl4rgrvjtFhmNv8C2dCGoPYRksHiO9sjgsilhQe/L92rmORifrNB4 kvqR1QfKH2Hw2X1B/0fWXthh7UV37h1TdrVNJNlzhmi3wO+MHlX54iZcwpsaceFx QdzSqdHbaKfkfttojxIdgSfl7M2aCRnkmMOUG4X9HCsIK0C3ChdHLhJDnLT0xYb8 fdA8dkXMktli0GC+KfevOXILZGLhUQuigu4iYKRm689N98N1Ejfa7BvMCVqLr0kF 4fNmC+BtZmdw8MYd7EiuYXhA0Unu+CAg23ADQpn0AEyGQcM5h7/9/4GKvgjjsV1h /2r1WU+UVGZSUQ3FRDbzD37QVAa9FoOv967Gks6Fa31K7kEPC8yIRhWl72wXQXpa hFk+Hf3RlKtaO06iH/2RD2JA+W6xntiFo8CZ+AUMoLWfIQaYSAFP039lpjJp/Hzd CDdNWKCchAaMYI1MBmbRZ65mSgsVLLioNrf55+kdWT/CbuXJua95YxRRmllNFv5k MHjPoTajL0WKZhYxUSjCH87rqZHyNBH5s8iZlIt7wR//kqBGYfmRvGSDe31yMrL8 PH/MgEIBVmrLQSWcojF+pU6ep+sQELVNsbu1+doZd/ruD/hzsZeu+MANWtJgrrVs +rsPRDWTcC3ca/V5Y1UO =XN3W -----END PGP SIGNATURE----- Merge tag 'amd-thresholding-fixes-for-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/mce Pull in AMD MCE thresholding fixes for v3.6, from Borislav Petkov: " Those are a bunch of patches which give the MCE thresholding code a hard look and a scrubbing to remove a couple of annoyances like sysfs warnings when running CPU off-/online tests and the threshold_bank4 node under /sys/devices/system/machinecheck/ is a symlink. It also gives proper names to the thresholding banks instead of simply enumerating them, like this: /sys/devices/system/machinecheck/machinecheck0/ \|-- bank0 \|-- bank1 \|-- bank2 \|-- bank3 \|-- bank4 \|-- bank5 \|-- bank6 \|-- check_interval \|-- cmci_disabled \|-- combined_unit \| \|-- combined_unit \| \|-- error_count \| \|-- threshold_limit \|-- dont_log_ce \|-- execution_unit \| \|-- execution_unit \| \|-- error_count \| \|-- threshold_limit \|-- ignore_ce \|-- insn_fetch \| \|-- insn_fetch \| \|-- error_count \| \|-- threshold_limit \|-- load_store \| \|-- load_store \| \|-- error_count \| \|-- threshold_limit \|-- monarch_timeout \|-- northbridge \| \|-- dram \| \| \|-- error_count \| \| \|-- interrupt_enable \| \| \|-- threshold_limit \| \|-- ht_links \| \| \|-- error_count \| \| \|-- interrupt_enable \| \| \|-- threshold_limit \| \|-- l3_cache \| \|-- error_count \| \|-- interrupt_enable \| \|-- threshold_limit ... It is tested on all our families >= K8." Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-08 12:29:47 +02:00
H. Peter Anvin	715c85b1fc	x86, cpu: Rename checking_wrmsrl() to wrmsrl_safe() Rename checking_wrmsrl() to wrmsrl_safe(), to match the naming convention used by all the other MSR access functions/macros. Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-06-07 13:32:04 -07:00
Borislav Petkov	2c929ce6f1	x86, cpu, amd: Deprecate AMD-specific MSR variants Now that all users of {rd,wr}msr_amd_safe have been fixed, deprecate its use by making them private to amd.c and adding warnings when used on anything else beside K8. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/1338562358-28182-5-git-send-email-bp@amd64.org Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-06-07 11:43:30 -07:00
Andre Przywara	169e9cbd77	x86, cpu, amd: Fix crash as Xen Dom0 on AMD Trinity systems `f7f286a910` ("x86/amd: Re-enable CPU topology extensions in case BIOS has disabled it") wrongfully added code which used the AMD-specific {rd,wr}msr variants for no real reason. This caused boot panics on xen which wasn't initializing the {rd,wr}msr_safe_regs pv_ops members properly. This, in turn, caused a heated discussion leading to us reviewing all uses of the AMD-specific variants and removing them where unneeded (almost everywhere except an obscure K8 BIOS fix, see `6b0f43ddfa`). Finally, this patch switches to the standard {rd,wr}msr_safe variants which should've been used in the first place anyway and avoided unneeded excitation with xen. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Link: http://lkml.kernel.org/r/1338562358-28182-4-git-send-email-bp@amd64.org Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Link: <http://lkml.kernel.org/r/1338383402-3838-1-git-send-email-andre.przywara@amd.com> [Boris: correct and expand commit message] Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-06-07 11:43:30 -07:00
Borislav Petkov	ecd431d95a	x86, cpu: Fix show_msr MSR accessing function There's no real reason why, when showing the MSRs on a CPU at boottime, we should be using the AMD-specific variant. Simply use the generic safe one which handles #GPs just fine. Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/1338562358-28182-3-git-send-email-bp@amd64.org Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-06-07 11:41:28 -07:00
Borislav Petkov	1112257019	x86, MCE, AMD: Update copyrights and boilerplate Jacob is doing something else now so add myself as the loser who provides support. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-06-07 12:43:50 +02:00
Borislav Petkov	336d335a96	x86, MCE, AMD: Give proper names to the thresholding banks Having the banks numbered is ok but having real names which mean something to the user makes a lot more sense: /sys/devices/system/machinecheck/machinecheck0/ \|-- bank0 \|-- bank1 \|-- bank2 \|-- bank3 \|-- bank4 \|-- bank5 \|-- bank6 \|-- check_interval \|-- cmci_disabled \|-- combined_unit \| \|-- combined_unit \| \|-- error_count \| \|-- threshold_limit \|-- dont_log_ce \|-- execution_unit \| \|-- execution_unit \| \|-- error_count \| \|-- threshold_limit \|-- ignore_ce \|-- insn_fetch \| \|-- insn_fetch \| \|-- error_count \| \|-- threshold_limit \|-- load_store \| \|-- load_store \| \|-- error_count \| \|-- threshold_limit \|-- monarch_timeout \|-- northbridge \| \|-- dram \| \| \|-- error_count \| \| \|-- interrupt_enable \| \| \|-- threshold_limit \| \|-- ht_links \| \| \|-- error_count \| \| \|-- interrupt_enable \| \| \|-- threshold_limit \| \|-- l3_cache \| \|-- error_count \| \|-- interrupt_enable \| \|-- threshold_limit ... Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-06-07 12:43:48 +02:00
Borislav Petkov	6e927361bd	x86, MCE, AMD: Make error_count read only Until now, writing to error count caused the code to reset the thresholding bank to the current thresholding limit and start counting errors from the beginning. This is misleading and unclear, and can be accomplished by writing the old thresholding limit into ->threshold_limit. Make error_count read-only with the functionality to show the current error count. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-06-07 12:43:47 +02:00
Borislav Petkov	2c9c42fa98	x86, MCE, AMD: Cleanup reading of error_count We have rdmsr_on_cpu() now so remove locally defined solution in favor of the generic one. No functionality change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-06-07 12:43:46 +02:00
Borislav Petkov	18c20f373b	x86, MCE, AMD: Print decimal thresholding values If one sets the threshold limit, say to 25: $ echo 25 > machinecheck0/threshold_bank4/misc0/threshold_limit and then reads it back again, it gives $ cat machinecheck0/threshold_bank4/misc0/threshold_limit 19 which is actually 0x19 but we don't know that. Make all output decimal. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-06-07 12:43:45 +02:00
Borislav Petkov	019f34fccf	x86, MCE, AMD: Move shared bank to node descriptor Well, instead of having a real bank 4 on the BSP of each node and symlinks on the remaining cores, we push it up into the amd_northbridge descriptor which now contains a pointer to the northbridge bank 4 because the bank is one per northbridge and, as such, belongs in the NB descriptor anyway. Each time we hotplug CPUs, we use the northbridge pointer to copy the shared bank into the per-CPU array of threshold_banks pointers, or destroy it when the last CPU on the node goes offline, or create it when the first comes online. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-06-07 12:43:44 +02:00
Borislav Petkov	26ab256eaa	x86, MCE, AMD: Remove local_allocate_... wrapper It is unneeded now so drop it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-06-07 12:43:43 +02:00
Borislav Petkov	92e26e2a1a	x86, MCE, AMD: Remove shared banks sysfs linking The code used to create a symlink on all non-BSP cores of a node when the MCi_MISCj bank is present once per node. (This is generally the case with bank 4 on AMD). However, these sysfs links cause a bunch of problems with cpu off-/onlining testing and are, as such, a bit overengineered. IOW, there's nothing wrong with having normal sysfs files for the shared banks since the corresponding MSRs are replicated across each core anyway. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-06-07 12:43:42 +02:00
Andi Kleen	70ab7003de	perf/x86: Don't assume there can be only 4 PEBS events On Sandy Bridge in non HT mode there are 8 counters available. Since every counter can write a PEBS record assuming there are 4 max is incorrect. Use the reported counter number -- with an upper limit for a static array -- instead. Also I made the warning messages a bit more informational. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1338944211-28275-2-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-06 17:23:40 +02:00
Vince Weaver	c48b60538c	perf/x86: Use rdpmc() rather than rdmsr() when possible in the kernel The rdpmc instruction is faster than the equivelant rdmsr call, so use it when possible in the kernel. The perfctr kernel patches did this, after extensive testing showed rdpmc to always be faster (One can look in etc/costs in the perfctr-2.6 package to see a historical list of the overhead). I have done some tests on a 3.2 kernel, the kernel module I used was included in the first posting of this patch: rdmsr rdpmc Core2 T9900: 203.9 cycles 30.9 cycles AMD fam0fh: 56.2 cycles 9.8 cycles Atom 6/28/2: 129.7 cycles 50.6 cycles The speedup of using rdpmc is large. [ It's probably possible (and desirable) to do this without requiring a new field in the hw_perf_event structure, but the fixed events make this tricky. ] Signed-off-by: Vince Weaver <vweaver1@eecs.utk.edu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1203011724030.26934@cl320.eecs.utk.edu Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-06 17:23:35 +02:00
Peter Zijlstra	1c2ac3fde3	perf/x86: Fix wrmsrl() debug wrapper Move the wrmslr() debug wrapper to the common header now that all the include games are gone. Also clean it up a bit to avoid multiple evaluation of the argument. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-l4gkfnivwv4yi5mqxjlovymx@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-06 17:23:22 +02:00
Arun Sharma	bc6ca7b342	perf/x86: Check if user fp is valid Signed-off-by: Arun Sharma <asharma@fb.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1334961696-19580-4-git-send-email-asharma@fb.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-06 17:08:01 +02:00
Arun Sharma	302fa4b58a	perf/x86: Allow multiple stacks Without this patch, applications with two different stack regions (eg: native stack vs JIT stack) get truncated callchains even when RBP chaining is present. GDB shows proper stack traces and the frame pointer chaining is intact. This patch disables the (fp < RSP) check, hoping that other checks in the code save the day for us. In our limited testing, this didn't seem to break anything. In the long term, we could potentially have userspace advise the kernel on the range of valid stack addresses, so we don't spend a lot of time unwinding from bogus addresses. Signed-off-by: Arun Sharma <asharma@fb.com> CC: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1334961696-19580-2-git-send-email-asharma@fb.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-06 17:07:58 +02:00
Peter Zijlstra	8440ccb43f	perf/x86: Update SNB PEBS constraints Afaict there's no need to (incompletely) iterate the MEM_UOPS_RETIRED.* umask state. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1338884803.28282.153.camel@twins Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-06 16:59:52 +02:00
Peter Zijlstra	b6db437ba8	perf/x86: Enable/Add IvyBridge hardware support Implement rudimentary IVB perf support. The SDM states its identical to SNB with exception of the exact event tables, but a quick look suggests they're similar enough. Also mark SNB-EP as broken for now. Requested-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1338884803.28282.153.camel@twins Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-06 16:59:49 +02:00
Peter Zijlstra	cccb9ba9e4	perf/x86: Implement cycles:p for SNB/IVB Now that there's finally a chip with working PEBS (IvyBridge), we can enable the hardware and implement cycles:p for SNB/IVB. Cc: Stephane Eranian <eranian@google.com> Requested-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1338884803.28282.153.camel@twins Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-06 16:59:47 +02:00
Peter Zijlstra	b430f7c470	perf/x86: Fix Intel shared extra MSR allocation Zheng Yan reported that event group validation can wreck event state when Intel extra_reg allocation changes event state. Validation shouldn't change any persistent state. Cloning events in validate_{event,group}() isn't really pretty either, so add a few special cases to avoid modifying the event state. The code is restructured to minimize the special case impact. Reported-by: Zheng Yan <zheng.z.yan@linux.intel.com> Acked-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1338903031.28282.175.camel@twins Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-06 16:59:44 +02:00
Thomas Gleixner	1a87fc1ec7	x86: mce: Add the dropped timer interval init back commit `82f7af09` ("x86/mce: Cleanup timer mess) dropped the initialization of the per cpu timer interval. Duh :( Restore the previous behaviour. Reported-by: Chen Gong <gong.chen@linux.intel.com> Cc: bp@amd64.org Cc: tony.luck@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-06-06 11:33:21 +02:00
Joe Perches	c767a54ba0	x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level> Use a more current logging style: - Bare printks should have a KERN_<LEVEL> for consistency's sake - Add pr_fmt where appropriate - Neaten some macro definitions - Convert some Ok output to OK - Use "%s: ", __func__ in pr_fmt for summit - Convert some printks to pr_<level> Message output is not identical in all cases. Signed-off-by: Joe Perches <joe@perches.com> Cc: levinsasha928@gmail.com Link: http://lkml.kernel.org/r/1337655007.24226.10.camel@joe2Laptop [ merged two similar patches, tidied up the changelog ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-06 09:17:22 +02:00
Chen Gong	958fb3c512	x86/mce: Fix the MCE poll timer logic In commit `82f7af09` ("x86/mce: Cleanup timer mess), Thomas just forgot the "/ 2" there while cleaning up. Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: bp@amd64.org Cc: tony.luck@intel.com Link: http://lkml.kernel.org/r/1338863702-9245-1-git-send-email-gong.chen@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-06 08:28:21 +02:00
Linus Torvalds	eea5b5510f	Typo/thinko in a cleanup caused a semantic change. Fix it. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPzkYHAAoJEKurIx+X31iBeZ8QAK44Watxy0Ib0IUGrx2wcds8 dEgVNyv9qrT28/PCBQiFtoRB83vsWWOzdNDWH6s4AAB6cFRbS8dOHvjUTsFfwgJ1 lgD9URJmegBFInmkwSBVhY/MpwNm/pw0CNIs2ymqAvSlAgj8I8zF5xWsxTZfJcYn gxDNTxbuaEd3sPQO8wjBrw8NhCNpNwzEzZUXh31tM92bgpscIsXsJl/cRna5B1NU Z5LiSZev1W0/lf+Ys94ZsOSRT9zfjTI+mjXwv/lu8DlgeRQYyXixRjROgWvCbywx hlepvAQHtss9z5YTiGhRlPnR/CZ0fEUMQRtyRsp4qxG8BrgkWAdB++3ZMVbQYdom i98TQh1HZU3zzxueIwKwfjPKhG9q2Ee1XzE0ow7sXinBJQgiGrXiEv1tcX7001P7 vpkyqVon2KKSYknxdHtbc6XnwjbzGDEoS0fqZf0boKoHee7KWBmFyX9JXLvZjtY1 ef4FqTZNccYWL/5Hi0slZWAucC5iPleeV6sm9y4xG/gJFbTIw+joq1dc3pBZJ2uR rHhxD5tWTwbovsq1igcjAbrh9davwiFWiufW3Y5GdTAZJJ1tF6YjCmg18QbvaHJj 9uplEUBUA4N6UUqPCjdKRdPaxPwkNOmjYH3YQIajSmLcn0YoI9NYw4Xn0sp2jCBo zGnaJvG2IIC9LNgaVohz =LQWb -----END PGP SIGNATURE----- Merge tag 'please-pull-mce' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull MCE regression fix from Tony Luck: "Typo/thinko in a cleanup caused a semantic change. Fix it." * tag 'please-pull-mce' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: x86/mce: Fix the MCE poll timer logic	2012-06-05 15:15:04 -07:00
Chen Gong	c2238f10e0	x86/mce: Fix the MCE poll timer logic In commit `82f7af09` (x86/mce: Cleanup timer mess), Thomas just forgot the "/ 2" there while cleaning up. Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-06-05 10:15:07 -07:00
H. Peter Anvin	40b46a7d29	Merge remote-tracking branch 'rostedt/tip/perf/urgent-2' into x86-urgent-for-linus	2012-06-01 15:55:31 -07:00
Steven Rostedt	f8988175fd	x86: Allow nesting of the debug stack IDT setting When the NMI handler runs, it checks if it preempted a debug handler and if that handler is using the debug stack. If it is, it changes the IDT table not to update the stack, otherwise it will reset the debug stack and corrupt the debug handler it preempted. Now that ftrace uses breakpoints to change functions from nops to callers, many more places may hit a breakpoint. Unfortunately this includes some of the calls that lockdep performs. Which causes issues with the debug stack. It too needs to change the debug stack before tracing (if called from the debug handler). Allow the debug_stack_set_zero() and debug_stack_reset() to be nested so that the debug handlers can take advantage of them too. [ Used this_cpu_*() over __get_cpu_var() as suggested by H. Peter Anvin ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-31 23:12:21 -04:00
Linus Torvalds	2d117403b3	One more mce cleanup before the 3.5 merge window closes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPxpZ1AAoJEKurIx+X31iBACgP/jZiXAdu9JjYqvPA8/ACsntt S0fpbn40kKftn58x6Ddqemu/rg8Iy/ezsB7Fm93TYfBzDb8RGdkSiLj/KoO39Jzy Tmod/vJ0XMsBFgDob5GSKSOYMsnkO7//6jtnM09A8wDlxUwE3LV/3/84EqOrQH0n LR3Ltd6wnCc/HVasdDZX/814QcHe5KSoFMY2jUHWf0suOKcI22X57PQt9831bKky tvvsqFKSaOaerW9F1atwB2Qx37f0pUCu/Qo4KmBB0EVwSapRpHSDu657byft7VWQ FJ1eLfZF7lnVaYHxyCqz+wgTVBTsBPDt1xGkIJqSMMHtziG5iP3m0NSYLCJ5XJYx AcmF55hIx4yMCrIdiKc7wWs2K4U59+FiGJ2LgFCBZ3XLJoAuZnhf+WOh4ws/wi/j qDk9vK8KG3VBYIwZnTWb6N1bRtzznp1ElXjZR1byh/Cu2Ne/EWL64VydNT7ts0Ga 3mGF88keAlw04qHYtU/7y5WrHRKGV5LBjGujVV2e2a5dC6p7LO11UCLptEv11DWS LcbIegbrimWmMxVScJQkL12GEzZKHpZZvrFRIKiWkTA15R6R1OTC7VrywA2GjDbd h5mWKd7zV6Ankjmq9SuvnA1UazhC/r+uQ58INVpFVM+aFor32HVd6L7VRns7sv2Z WbQNYxv5O+D3J2K+dQNy =4tLS -----END PGP SIGNATURE----- Merge tag 'please-pull-mce' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull mce cleanup from Tony Luck: "One more mce cleanup before the 3.5 merge window closes" * tag 'please-pull-mce' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: x86/mce: Cleanup timer mess	2012-05-31 10:53:37 -07:00
Thomas Gleixner	82f7af09e6	x86/mce: Cleanup timer mess Use unsigned long for dealing with jiffies not int. Rename the callback to something sensible. Use __this_cpu_read/write for accessing per cpu data. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-05-30 14:40:01 -07:00
zhenzhong.duan	2da06af810	x86, mtrr: Fix a type overflow in range_to_mtrr func When boot on sun G5+ with 4T mem, see an overflow in mtrr cleanup as below. BADgran_size: 2G chunk_size: 2G num_reg: 10 lose cover RAM: -18014398505283592M This is because 1<<31 sign extended. Use an unsigned long constant to fix it. Useful for mem larger than or equal to 4T. -v2: Use 64bit constant instead of explicit type conversion as suggested by Yinghai. Description updated too. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Link: http://lkml.kernel.org/r/4FC5A77F.6060505@oracle.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-05-30 14:37:00 -07:00
H. Peter Anvin	bbd771474e	Merge branch 'x86/trampoline' into x86/urgent x86/trampoline contains an urgent commit which is necessarily on a newer baseline. Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-05-30 12:11:32 -07:00
Ingo Molnar	403e1c5b74	Merge branch 'x86/mce' into x86/urgent Merge in these fixlets. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-30 14:12:06 +02:00
Linus Torvalds	786f02b719	x86/mce merge window patches (including two that make error_context() checks less sucky) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPv7OiAAoJEKurIx+X31iBQxEP/RV8YO4nrozhHY597qabzfJc 4YoCOma+wUyhXPDmZI80XvrIlcCq7TEJL1HTaAyA5rnyYvRHpM+uXDCRmbJDI4e0 gA42/Y8+lbmR6BLY8sdptCXIWxw/d8wYEKK2BgNsPhkJxODGW/gVAws93erist/v yepq+GwI0QGAeRlO6AYgE7NwQmHXK5AdfH3phHUTVABVIUGH5+Zp6471FTB+hYy5 aNvUnL0hw8vxrbpDL/Le359etPqC6wsELCIQ9wVtWCD0/UJM6Yd3j0+CKQ7q/KHU 7zMcP+OCGTJ3koMhEbFIOnxuswWDGq5y/qIzSXMEEemGqgxqFUvX3wUeZ3HFAFNx nJ7ZaA813t7Bud4G4WwESxMGQpxI7FTvxnF1ow3IlRsMtV4ffvAS9xvWi0GQJrVY xixK7G87PGAm6fP9Zbb/lQlRO8gD498j4rfI9MOsUuY9QgFNcH2eg6c4O0iHDpos WkSgUaM49Q610JslrxsXp+BZZLBF/wbcjcFiQGFAWOIKTKgRQ99+dXAQY7fw9CIf /wNl9MkbOvJdPL9OfLTmAYAMXyaXbOX8qcvItwqcBsUT0AV863NdIXtS4BXBOrMs 5u16CDX1ieFAlA2dzhynvE0Zd1Ws6wfe5W/BgtQ+H175uHFr8pHAxsBTX8GSNrXG /bSWWrR3CIBRHoWCJMmH =kG4e -----END PGP SIGNATURE----- Merge tag 'x86-mce-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull x86/mce merge window patches from Tony Luck: "Including two that make error_context() checks less sucky" * tag 'x86-mce-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: x86/mce: Add instruction recovery signatures to mce-severity table x86/mce: Fix check for processor context when machine check was taken. MCE: Fix vm86 handling for 32bit mce handler x86/mce Add validation check before GHES error is recorded x86/mce: Avoid reading every machine check bank register twice.	2012-05-25 16:14:12 -07:00
Tony Luck	37c3459b67	x86/mce: Add instruction recovery signatures to mce-severity table Instruction recovery cases are very similar to the data recovery one we already have. Just trade out for a new MCACOD value. Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-05-23 14:24:11 -07:00
Tony Luck	875e26648c	x86/mce: Fix check for processor context when machine check was taken. Linus pointed out that there was no value is checking whether m->ip was zero - because zero is a legimate value. If we have a reliable (or faked in the VM86 case) "m->cs" we can use it to tell whether we were in user mode or kernelwhen the machine check hit. Reported-by: Linus Torvalds <torvalds@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-05-23 14:22:44 -07:00
Andi Kleen	a129a7c845	MCE: Fix vm86 handling for 32bit mce handler When running on 32bit the mce handler could misinterpret vm86 mode as ring 0. This can affect whether it does recovery or not; it was possible to panic when recovery was actually possible. Fix this by always forcing vm86 to look like ring 3. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-05-23 14:22:37 -07:00
Linus Torvalds	56edab3159	Merge branches 'perf-urgent-for-linus' and 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: - Leftover AMD PMU driver fix fix from the end of the v3.4 stabilization cycle. - Late tools/perf/ changes that missed the first round: * endianness fixes * event parsing improvements * libtraceevent fixes factored out from trace-cmd * perl scripting engine fixes related to libtraceevent, * testcase improvements * perf inject / pipe mode fixes * plus a kernel side fix * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Update event scheduling constraints for AMD family 15h models * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "sched, perf: Use a single callback into the scheduler" perf evlist: Show event attribute details perf tools: Bump default sample freq to 4 kHz perf buildid-list: Work better with pipe mode perf tools: Fix piped mode read code perf inject: Fix broken perf inject -b perf tools: rename HEADER_TRACE_INFO to HEADER_TRACING_DATA perf tools: Add union u64_swap type for swapping u64 data perf tools: Carry perf_event_attr bitfield throught different endians perf record: Fix documentation for branch stack sampling perf target: Add cpu flag to sample_type if target has cpu perf tools: Always try to build libtraceevent perf tools: Rename libparsevent to libtraceevent in Makefile perf script: Rename struct event to struct event_format in perl engine perf script: Explicitly handle known default print arg type perf tools: Add hardcoded name term for pmu events perf tools: Separate 'mem:' event scanner bits perf tools: Use allocated list for each parsed event perf tools: Add support for displaying event parser debug info perf test: Move parse event automated tests to separated object	2012-05-23 12:12:49 -07:00
Linus Torvalds	70311aaa8a	Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull MCE updates from Ingo Molnar: "This tree updates/fixes MCE hardware support, it makes the APIC LVT thresholding interrupt optional because a subset of AMD F15h models don't support it." * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, MCE, AMD: Disable error thresholding bank 4 on some models x86, MCE, AMD: Hide interrupt_enable sysfs node x86, MCE, AMD: Make APIC LVT thresholding interrupt optional	2012-05-23 11:01:52 -07:00
Linus Torvalds	19bec32d7f	Merge branches 'x86-asm-for-linus', 'x86-cleanups-for-linus', 'x86-cpu-for-linus', 'x86-debug-for-linus' and 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull initial trivial x86 stuff from Ingo Molnar. Various random cleanups and trivial fixes. * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86-64: Eliminate dead ia32 syscall handlers * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/pci-calgary_64.c: Remove obsoleted simple_strtoul() usage x86: Don't continue booting if we can't load the specified initrd x86: kernel/dumpstack.c simple_strtoul cleanup x86: kernel/check.c simple_strtoul cleanup debug: Add CONFIG_READABLE_ASM x86: spinlock.h: Remove REG_PTR_MODE * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cache_info: Fix setup of l2/l3 ids * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Avoid double stack traces with show_regs() * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, microcode: microcode_core.c simple_strtoul cleanup	2012-05-23 10:09:50 -07:00
Borislav Petkov	80f033610f	x86/mce: Fix 32-bit build Got bitten again by the BIT() macro: arch/x86/kernel/cpu/mcheck/mce.c: In function '__mcheck_cpu_apply_quirks': arch/x86/kernel/cpu/mcheck/mce.c:1453:6: warning: left shift count >= width of type arch/x86/kernel/cpu/mcheck/mce.c:1454:7: warning: left shift count >= width of type Fix it already. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Frank Arnold <frank.arnold@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1337684026-19740-2-git-send-email-bp@amd64.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-23 17:16:43 +02:00
Linus Torvalds	e8650a0823	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial updates from Jiri Kosina: "As usual, it's mostly typo fixes, redundant code elimination and some documentation updates." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (57 commits) edac, mips: don't change code that has been removed in edac/mips tree xtensa: Change mail addresses of Hannes Weiner and Oskar Schirmer lib: Change mail address of Oskar Schirmer net: Change mail address of Oskar Schirmer arm/m68k: Change mail address of Sebastian Hess i2c: Change mail address of Oskar Schirmer net: Fix tcp_build_and_update_options comment in struct tcp_sock atomic64_32.h: fix parameter naming mismatch Kconfig: replace "--- help ---" with "---help---" c2port: fix bogus Kconfig "default no" edac: Fix spelling errors. qla1280: Remove redundant NULL check before release_firmware() call remoteproc: remove redundant NULL check before release_firmware() qla2xxx: Remove redundant NULL check before release_firmware() call. aic94xx: Get rid of redundant NULL check before release_firmware() call tehuti: delete redundant NULL check before release_firmware() qlogic: get rid of a redundant test for NULL before call to release_firmware() bna: remove redundant NULL test before release_firmware() tg3: remove redundant NULL test before release_firmware() call typhoon: get rid of redundant conditional before all to release_firmware() ...	2012-05-22 19:22:50 -07:00
Linus Torvalds	2ff2b289a6	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf changes from Ingo Molnar: "Lots of changes: - (much) improved assembly annotation support in perf report, with jump visualization, searching, navigation, visual output improvements and more. - kernel support for AMD IBS PMU hardware features. Notably 'perf record -e cycles:p' and 'perf top -e cycles:p' should work without skid now, like PEBS does on the Intel side, because it takes advantage of IBS transparently. - the libtracevents library: it is the first step towards unifying tracing tooling and perf, and it also gives a tracing library for external tools like powertop to rely on. - infrastructure: various improvements and refactoring of the UI modules and related code - infrastructure: cleanup and simplification of the profiling targets code (--uid, --pid, --tid, --cpu, --all-cpus, etc.) - tons of robustness fixes all around - various ftrace updates: speedups, cleanups, robustness improvements. - typing 'make' in tools/ will now give you a menu of projects to build and a short help text to explain what each does. - ... and lots of other changes I forgot to list. The perf record make bzImage + perf report regression you reported should be fixed." * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (166 commits) tracing: Remove kernel_lock annotations tracing: Fix initial buffer_size_kb state ring-buffer: Merge separate resize loops perf evsel: Create events initially disabled -- again perf tools: Split term type into value type and term type perf hists: Fix callchain ip printf format perf target: Add uses_mmap field ftrace: Remove selecting FRAME_POINTER with FUNCTION_TRACER ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code() ftrace: Make ftrace_modify_all_code() global for archs to use ftrace: Return record ip addr for ftrace_location() ftrace: Consolidate ftrace_location() and ftrace_text_reserved() ftrace: Speed up search by skipping pages by address ftrace: Remove extra helper functions ftrace: Sort all function addresses, not just per page tracing: change CPU ring buffer state from tracing_cpumask tracing: Check return value of tracing_dentry_percpu() ring-buffer: Reset head page before running self test ring-buffer: Add integrity check at end of iter read ring-buffer: Make addition of pages in ring buffer atomic ...	2012-05-22 18:18:55 -07:00
Linus Torvalds	f5c101892f	Merge branch 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu updates from Tejun Heo: "Contains Alex Shi's three patches to remove percpu_xxx() which overlap with this_cpu_xxx(). There shouldn't be any functional change." * 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: remove percpu_xxx() functions x86: replace percpu_xxx funcs with this_cpu_xxx net: replace percpu_xxx funcs with this_cpu_xxx or __this_cpu_xxx	2012-05-22 17:37:47 -07:00
Arnaldo Carvalho de Melo	16ee6576e2	Merge remote-tracking branch 'tip/perf/urgent' into perf/core Merge reason: We are going to queue up a dependent patch: "perf tools: Move parse event automated tests to separated object" That depends on: commit `e7c72d8` perf tools: Add 'G' and 'H' modifiers to event parsing Conflicts: tools/perf/builtin-stat.c Conflicted with the recent 'perf_target' patches when checking the result of perf_evsel open routines to see if a retry is needed to cope with older kernels where the exclude guest/host perf_event_attr bits were not used. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2012-05-18 13:13:33 -03:00
Robert Richter	5bcdf5e4fe	perf/x86: Update event scheduling constraints for AMD family 15h models This update is for newer family 15h cpu models from 0x02 to 0x1f. Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: stable@vger.kernel.org # v2.6.39+ Link: http://lkml.kernel.org/r/1337337642-1621-1-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-18 14:07:46 +02:00
Tony Luck	dad1743e59	x86/mce: Only restart instruction after machine check recovery if it is safe Section 15.3.1.2 of the software developer manual has this to say about the RIPV bit in the IA32_MCG_STATUS register: RIPV (restart IP valid) flag, bit 0 — Indicates (when set) that program execution can be restarted reliably at the instruction pointed to by the instruction pointer pushed on the stack when the machine-check exception is generated. When clear, the program cannot be reliably restarted at the pushed instruction pointer. We need to save the state of this bit in do_machine_check() and use it in mce_notify_process() to force a signal; even if memory_failure() says it made a complete recovery ... e.g. replaced a clean LRU page. Acked-by: Borislav Petkov <bp@amd64.org> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-05-14 15:07:48 -07:00
Alex Shi	c6ae41e7d4	x86: replace percpu_xxx funcs with this_cpu_xxx Since percpu_xxx() serial functions are duplicated with this_cpu_xxx(). Removing percpu_xxx() definition and replacing them by this_cpu_xxx() in code. There is no function change in this patch, just preparation for later percpu_xxx serial function removing. On x86 machine the this_cpu_xxx() serial functions are same as __this_cpu_xxx() without no unnecessary premmpt enable/disable. Thanks for Stephen Rothwell, he found and fixed a i386 build error in the patch. Also thanks for Andrew Morton, he kept updating the patchset in Linus' tree. Signed-off-by: Alex Shi <alex.shi@intel.com> Acked-by: Christoph Lameter <cl@gentwo.org> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2012-05-14 14:15:31 -07:00
Robert Richter	8b1e13638d	perf/x86-ibs: Fix usage of IBS op current count The value of IbsOpCurCnt rolls over when it reaches IbsOpMaxCnt. Thus, it is reset to zero by hardware. To get the correct count we need to add the max count to it in case we received an ibs sample (valid bit set). Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-13-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:17 +02:00
Robert Richter	fc5fb2b5e1	perf/x86-ibs: Catch spurious interrupts after stopping IBS After disabling IBS there could be still incomming NMIs with samples that even have the valid bit cleared. Mark all this NMIs as handled to avoid spurious interrupt messages. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-12-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:16 +02:00
Robert Richter	c9574fe0bd	perf/x86-ibs: Implement workaround for IBS erratum #420 When disabling ibs there might be the case where hardware continuously generates interrupts. This is described in erratum #420 (Instruction- Based Sampling Engine May Generate Interrupt that Cannot Be Cleared). To avoid this we must clear the counter mask first and then clear the enable bit. This patch implements this. See Revision Guide for AMD Family 10h Processors, Publication #41322. Note: We now keep track of the last read ibs config value which is then used to disable ibs. To update the config value we pass now a pointer to the functions reading it. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-11-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:16 +02:00
Robert Richter	7caaf4d824	perf/x86-ibs: Extend hw period that triggers overflow If the last hw period is too short we might hit the irq handler which biases the results. Thus try to have a max last period that triggers the sw overflow. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-10-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:15 +02:00
Robert Richter	fc006cf7cc	perf/x86-ibs: Trigger overflow if remaining period is too small There are cases where the remaining period is smaller than the minimal possible value. In this case the counter is restarted with the minimal period. This is of no use as the interrupt handler will trigger immediately again and most likely hits itself. This biases the results. So, if the remaining period is within the min range, we better do not restart the counter and instead trigger the overflow. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-9-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:15 +02:00
Robert Richter	98112d2e95	perf/x86-ibs: Rename some variables Simple patch that just renames some variables for better understanding. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-8-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:14 +02:00
Robert Richter	450bbd493d	perf/x86-ibs: Precise event sampling with IBS for AMD CPUs This patch adds support for precise event sampling with IBS. There are two counting modes to count either cycles or micro-ops. If the corresponding performance counter events (hw events) are setup with the precise flag set, the request is redirected to the ibs pmu: perf record -a -e cpu-cycles:p ... # use ibs op counting cycle count perf record -a -e r076:p ... # same as -e cpu-cycles:p perf record -a -e r0C1:p ... # use ibs op counting micro-ops Each ibs sample contains a linear address that points to the instruction that was causing the sample to trigger. With ibs we have skid 0. Thus, ibs supports precise levels 1 and 2. Samples are marked with the PERF_EFLAGS_EXACT flag set. In rare cases the rip is invalid when IBS was not able to record the rip correctly. Then the PERF_EFLAGS_EXACT flag is cleared and the rip is taken from pt_regs. V2: * don't drop samples in precise level 2 if rip is invalid, instead support the PERF_EFLAGS_EXACT flag Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120502103309.GP18810@erda.amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:14 +02:00
Robert Richter	d47e8238cd	perf/x86-ibs: Take instruction pointer from ibs sample Each IBS sample contains a linear address of the instruction that caused the sample to trigger. This address is more precise than the rip that was taken from the interrupt handler's stack. Update the rip with that address. We use this in the next patch to implement precise-event sampling on AMD systems using IBS. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-6-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:13 +02:00
Robert Richter	6accb9cf76	perf/x86-ibs: Fix frequency profiling Fixing profiling at a fixed frequency, in this case the freq value and sample period was setup incorrectly. Since sampling periods are adjusted we also allow periods that have lower 4 bits set. Another fix is the setup of the hw counter: If we modify hwc->sample_period, we also need to update hwc->last_period and hwc->period_left. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-5-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:13 +02:00
Robert Richter	7bf352384f	perf/x86-ibs: Enable ibs op micro-ops counting mode Allow enabling ibs op micro-ops counting mode. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-4-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:12 +02:00
Robert Richter	fd0d000b2c	perf: Pass last sampling period to perf_sample_data_init() We always need to pass the last sample period to perf_sample_data_init(), otherwise the event distribution will be wrong. Thus, modifiyng the function interface with the required period as argument. So basically a pattern like this: perf_sample_data_init(&data, ~0ULL); data.period = event->hw.last_period; will now be like that: perf_sample_data_init(&data, ~0ULL, event->hw.last_period); Avoids unininitialized data.period and simplifies code. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-3-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:12 +02:00
Robert Richter	c75841a398	perf/x86-ibs: Fix update of period The last sw period was not correctly updated on overflow and thus led to wrong distribution of events. We always need to properly initialize data.period in struct perf_sample_data. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-2-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:11 +02:00
Ingo Molnar	ad8537cda6	Merge branch 'perf/x86-ibs' into perf/core	2012-05-09 15:22:23 +02:00
Shai Fultheim	ddc5681ed3	x86/cache_info: Fix setup of l2/l3 ids On some architectures (such as vSMP), it is possible to have CPUs with a different number of cores sharing the same cache. The current implementation implicitly assumes that all CPUs will have the same number of cores sharing caches, and as a result, different CPUs can end up with the same l2/l3 ids. Fix this by masking out the shared cache bits, instead of shifting the APICID. By doing so, it is guaranteed that the generated cache ids are always unique. Signed-off-by: Shai Fultheim <shai@scalemp.com> [ rebased, simplified, and reworded the commit message] Signed-off-by: Ido Yariv <ido@wizery.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Mike Travis <travis@sgi.com> Cc: Dave Jones <davej@redhat.com> Link: http://lkml.kernel.org/r/1334873351-31142-1-git-send-email-ido@wizery.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-07 15:27:37 +02:00
Borislav Petkov	575203b474	x86, MCE, AMD: Disable error thresholding bank 4 on some models Turn off MC4_MISC thresholding banks on models which have them but that particular processor implementation does not supply applicable error sources to be counted. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-04-30 13:22:54 +02:00
Borislav Petkov	d26ecc4894	x86, MCE, AMD: Hide interrupt_enable sysfs node Depending on whether the box supports the APIC LVT interrupt for thresholding, we want to show the 'interrupt_enable' sysfs node or not. Make that the case by adding it to the default sysfs attributes only if it is supported. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-04-30 13:22:44 +02:00
Borislav Petkov	f227d4306c	x86, MCE, AMD: Make APIC LVT thresholding interrupt optional Currently, the APIC LVT interrupt for error thresholding is implicitly enabled. However, there are models in the F15h range which do not enable it. Make the code machinery which sets up the APIC interrupt support an optional setting and add an ->interrupt_capable member to the bank representation mirroring that capability and enable the interrupt offset programming only if it is true. Simplify code and fixup comment style while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-04-30 13:22:22 +02:00
Andreas Herrmann	f7f286a910	x86/amd: Re-enable CPU topology extensions in case BIOS has disabled it BIOS will switch off the corresponding feature flag on family 15h models 10h-1fh non-desktop CPUs. The topology extension CPUID leafs are required to detect which cores belong to the same compute unit. (thread siblings mask is set accordingly and also correct information about L1i and L2 cache sharing depends on this). W/o this patch we wouldn't see which cores belong to the same compute unit and also cache sharing information for L1i and L2 would be incorrect on such systems. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-27 16:43:09 +02:00
Robert Richter	5f09fc6889	perf/x86: Fix cmpxchg() usage in amd_put_event_constraints() Now the return value of cmpxchg() is used to match an event. The change removes the duplicate event comparison and traverses the list until an event was removed. This also fixes the following warning: arch/x86/kernel/cpu/perf_event_amd.c:170: warning: value computed is not used Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333643084-26776-3-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-26 13:52:51 +02:00
Robert Richter	10c250234c	perf: Trivial cleanup of duplicate code Removing duplicate code. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333643084-26776-2-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-26 13:52:50 +02:00
Ingo Molnar	fab06992de	perf/x86: Clean up register_nmi_handler() usage A function name represents the pointer to it - no need to take the address of it. (Fixing this helps us introduce some macro magic around register_nmi_handler() in the future.) Cc: Robert Richter <robert.richter@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-25 12:56:26 +02:00
Ingo Molnar	cd32b1616b	A small L3 cache index disable fix from Srivatsa Bhat which unifies the way the code checks for already disabled indices. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJPkEFNAAoJEBLB8Bhh3lVKEigP/1hrNM0oE3IQHejNYkZ2kbj0 DL4WkALiHCgWJVcr/lk9PvTaY7aIAo8HFgvbUKF4PcBvnFZvu6rckJPIsWgsE1wD +OHDaO27vyYKMaA9ImvqYneB8hcl528EDlZ7ssfTepQXPyx99SoTYc9ToT1+znVp qQUC+d67MTLl/eHL+i6rbQfYdVaXGaVTuAIpzQn3oTqUDLrmXKe+60oTgnC0zgSW kQ63Vo8MHi9CpzCe0JhZwvK3d8sPzrBktNisaEbdHe+0Gk5zvcEiPzE2nIyvLXjz cf0QoQFQHzniCdFQoYjzLafT3aItBsN1v29gOrydPT6LxMZ8wO5k+8Suu1NcyI9R RLJR61wKwSzi4JQ/1+LAqoDQHrldATKPCM74BLYiNTi8OGeqda+10COJQmLFITzM 9mn9fg/ZPg8V+z+0e2zObYcFVRtUCIs+XRWIzjhuPICdRRnwoNT+b9HyrLZ5JY1X BH3nHon0W4Bm5jEBhOb02XLdzBMtF3vRM4mNYCzpoS/tDFRszyOuV63FXkPurVbe hT9ZKhDZ63YV8ycwx2jqZgZAFuySi719kG3aUMDNMJYbUEi6D0QRKH9ngE6JAvwH rw1hX65sNX9jbBOAvcDTq2780SpV3dpO1TywLip9uSWIk6DYyEVHbr8AdWwnLlfb fdq9y8V/3+NC9aTNf/e3 =VwUL -----END PGP SIGNATURE----- Merge tag 'l3-fix-for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/urgent A small L3 cache index disable fix from Srivatsa Bhat which unifies the way the code checks for already disabled indices. ( Pulling it into v3.4 despite the v3.5 tag - the fix is small and we better keep the same code across kernel versions for such user facing interfaces. ) Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-25 12:24:16 +02:00
Chen Gong	8571723a69	x86/mce Add validation check before GHES error is recorded When GHES error record is logged into mcelog kernel buffer, a validation check for physical address is necessary, which prevents reporting an invalid physical address. [Since physical address is the only useful element in this error record, we drop generating the record completely if we don't have a valid address] Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-04-20 16:02:05 -07:00
Srivatsa S. Bhat	a720b2dd24	x86, intel_cacheinfo: Fix error return code in amd_set_l3_disable_slot() If the L3 disable slot is already in use, return -EEXIST instead of -EINVAL. The caller, store_cache_disable(), checks this return value to print an appropriate warning. Also, we want to signal with -EEXIST that the current index we're disabling has actually been already disabled on the node: $ echo 12 > /sys/devices/system/cpu/cpu3/cache/index3/cache_disable_0 $ echo 12 > /sys/devices/system/cpu/cpu3/cache/index3/cache_disable_0 -bash: echo: write error: File exists $ echo 12 > /sys/devices/system/cpu/cpu3/cache/index3/cache_disable_1 -bash: echo: write error: File exists $ echo 12 > /sys/devices/system/cpu/cpu5/cache/index3/cache_disable_1 -bash: echo: write error: File exists The old code would say -bash: echo: write error: Invalid argument for disable slot 1 when playing the example above with no output in dmesg, which is clearly misleading. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120419070053.GB16645@elgon.mountain [Boris: add testing for the other index too] Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-04-19 18:30:28 +02:00
Tony Luck	95022b8cf6	x86/mce: Avoid reading every machine check bank register twice. Reading machine check bank registers is slow. There is a trend of increasing the number of banks, and the number of cores. The main section of do_machine_check() is a serialized section where each cpu in turn checks every bank. Even on a little two socket SandyBridge-EP system that multiplies out as: 2 sockets * 8 cores * 2 hyperthreads * 20 banks = 640 MSRs We already scan the banks in parallel in mce_no_way_out() to see if there is a fatal error anywhere in the system. If we build a cache of VALID bits during this scan, we can avoid uselessly re-reading banks that have no data. Note that this cache is only a hint. If the valid bit is set in a shared bank, all cpus that share that bank will see it during the parallel scan, but the first to find it in the sequential scan will (usually) clear the bank. Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-04-19 09:12:43 -07:00
Andreas Herrmann	68894632af	x86/platform: Remove incorrect error message in x86_default_fixup_cpu_id() It's only called from amd.c:srat_detect_node(). The introduced condition for calling the fixup code is true for all AMD multi-node processors, e.g. Magny-Cours and Interlagos. There we have 2 NUMA nodes on one socket. Thus there are cores having different numa-node-id but with equal phys_proc_id. There is no point to print error messages in such a situation. The confusing/misleading error message was introduced with commit `64be4c1c24` ("x86: Add x86_init platform override to fix up NUMA core numbering"). Remove the default fixup function (especially the error message) and replace it by a NULL pointer check, move the Numascale-specific condition for calling the fixup into the fixup-function itself and slightly adapt the comment. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Cc: <stable@kernel.org> Cc: <sp@numascale.com> Cc: <bp@amd64.org> Cc: <daniel@numascale-asia.com> Link: http://lkml.kernel.org/r/20120402160648.GR27684@alberich.amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-16 20:43:43 +02:00
Josh Triplett	a7e0e4e99f	x86: Fix typo in MODULE_DEVICE_TABLE example: s/x86_cpu/x86cpu/ Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2012-04-16 14:20:19 +02:00
Andreas Herrmann	d7de8649f3	x86/amd: Remove broken links from comment and kernel message Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Link: http://lkml.kernel.org/r/20120411151238.GA4794@alberich.amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-16 08:53:43 +02:00
Linus Torvalds	66cfb32772	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar. * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/p4: Add format attributes tracing, sched, vfs: Fix 'old_pid' usage in trace_sched_process_exec()	2012-04-04 10:04:42 -07:00
Peter Zijlstra	7b8e6da46b	perf/x86/p4: Add format attributes Steven reported his P4 not booting properly, the missing format attributes cause a NULL ptr deref. Cure this by adding the missing format specification. I took the format description out of the comment near p4_config_pack*() and hope that comment is still relatively accurate. Reported-by: Steven Rostedt <rostedt@goodmis.org> Reported-by: Bruno Prémont <bonbons@linux-vserver.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1332859842.16159.227.camel@twins Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-03 08:33:38 +02:00
Linus Torvalds	f187e9fd68	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates and fixes from Ingo Molnar: "It's mostly fixes, but there's also two late items: - preliminary GTK GUI support for perf report - PMU raw event format descriptors in sysfs, to be parsed by tooling The raw event format in sysfs is a new ABI. For example for the 'CPU' PMU we have: aldebaran:~> ll /sys/bus/event_source/devices/cpu/format/* -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/any -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/cmask -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/edge -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/event -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/inv -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/offcore_rsp -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/pc -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/umask those lists of fields contain a specific format: aldebaran:~> cat /sys/bus/event_source/devices/cpu/format/offcore_rsp config1:0-63 So, those who wish to specify raw events can now use the following event format: -e cpu/cmask=1,event=2,umask=3 Most people will not want to specify any events (let alone raw events), they'll just use whatever default event the tools use. But for more obscure PMU events that have no cross-architecture generic events the above syntax is more usable and a bit more structured than specifying hex numbers." * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) perf tools: Remove auto-generated bison/flex files perf annotate: Fix off by one symbol hist size allocation and hit accounting perf tools: Add missing ref-cycles event back to event parser perf annotate: addr2line wants addresses in same format as objdump perf probe: Finder fails to resolve function name to address tracing: Fix ent_size in trace output perf symbols: Handle NULL dso in dso__name_len perf symbols: Do not include libgen.h perf tools: Fix bug in raw sample parsing perf tools: Fix display of first level of callchains perf tools: Switch module.h into export.h perf: Move mmap page data_head offset assertion out of header perf: Fix mmap_page capabilities and docs perf diff: Fix to work with new hists design perf tools: Fix modifier to be applied on correct events perf tools: Fix various casting issues for 32 bits perf tools: Simplify event_read_id exit path tracing: Fix ftrace stack trace entries tracing: Move the tracing_on/off() declarations into CONFIG_TRACING perf report: Add a simple GTK2-based 'perf report' browser ...	2012-03-31 13:34:04 -07:00
Linus Torvalds	a591afc01d	Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x32 support for x86-64 from Ingo Molnar: "This tree introduces the X32 binary format and execution mode for x86: 32-bit data space binaries using 64-bit instructions and 64-bit kernel syscalls. This allows applications whose working set fits into a 32 bits address space to make use of 64-bit instructions while using a 32-bit address space with shorter pointers, more compressed data structures, etc." Fix up trivial context conflicts in arch/x86/{Kconfig,vdso/vma.c} * 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits) x32: Fix alignment fail in struct compat_siginfo x32: Fix stupid ia32/x32 inversion in the siginfo format x32: Add ptrace for x32 x32: Switch to a 64-bit clock_t x32: Provide separate is_ia32_task() and is_x32_task() predicates x86, mtrr: Use explicit sizing and padding for the 64-bit ioctls x86/x32: Fix the binutils auto-detect x32: Warn and disable rather than error if binutils too old x32: Only clear TIF_X32 flag once x32: Make sure TS_COMPAT is cleared for x32 tasks fs: Remove missed ->fds_bits from cessation use of fd_set structs internally fs: Fix close_on_exec pointer in alloc_fdtable x32: Drop non-__vdso weak symbols from the x32 VDSO x32: Fix coding style violations in the x32 VDSO code x32: Add x32 VDSO support x32: Allow x32 to be configured x32: If configured, add x32 system calls to system call tables x32: Handle process creation x32: Signal-related system calls x86: Add #ifdef CONFIG_COMPAT to <asm/sys_ia32.h> ...	2012-03-29 18:12:23 -07:00
Linus Torvalds	6b8212a313	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 updates from Ingo Molnar. This touches some non-x86 files due to the sanitized INLINE_SPIN_UNLOCK config usage. Fixed up trivial conflicts due to just header include changes (removing headers due to cpu_idle() merge clashing with the <asm/system.h> split). * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic/amd: Be more verbose about LVT offset assignments x86, tls: Off by one limit check x86/ioapic: Add io_apic_ops driver layer to allow interception x86/olpc: Add debugfs interface for EC commands x86: Merge the x86_32 and x86_64 cpu_idle() functions x86/kconfig: Remove CONFIG_TR=y from the defconfigs x86: Stop recursive fault in print_context_stack after stack overflow x86/io_apic: Move and reenable irq only when CONFIG_GENERIC_PENDING_IRQ=y x86/apic: Add separate apic_id_valid() functions for selected apic drivers locking/kconfig: Simplify INLINE_SPIN_UNLOCK usage x86/kconfig: Update defconfigs x86: Fix excessive MSR print out when show_msr is not specified	2012-03-29 14:28:26 -07:00
Linus Torvalds	0195c00244	Disintegrate and delete asm/system.h -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIVAwUAT3NKzROxKuMESys7AQKElw/+JyDxJSlj+g+nymkx8IVVuU8CsEwNLgRk 8KEnRfLhGtkXFLSJYWO6jzGo16F8Uqli1PdMFte/wagSv0285/HZaKlkkBVHdJ/m u40oSjgT013bBh6MQ0Oaf8pFezFUiQB5zPOA9QGaLVGDLXCmgqUgd7exaD5wRIwB ZmyItjZeAVnDfk1R+ZiNYytHAi8A5wSB+eFDCIQYgyulA1Igd1UnRtx+dRKbvc/m rWQ6KWbZHIdvP1ksd8wHHkrlUD2pEeJ8glJLsZUhMm/5oMf/8RmOCvmo8rvE/qwl eDQ1h4cGYlfjobxXZMHqAN9m7Jg2bI946HZjdb7/7oCeO6VW3FwPZ/Ic75p+wp45 HXJTItufERYk6QxShiOKvA+QexnYwY0IT5oRP4DrhdVB/X9cl2MoaZHC+RbYLQy+ /5VNZKi38iK4F9AbFamS7kd0i5QszA/ZzEzKZ6VMuOp3W/fagpn4ZJT1LIA3m4A9 Q0cj24mqeyCfjysu0TMbPtaN+Yjeu1o1OFRvM8XffbZsp5bNzuTDEvviJ2NXw4vK 4qUHulhYSEWcu9YgAZXvEWDEM78FXCkg2v/CrZXH5tyc95kUkMPcgG+QZBB5wElR FaOKpiC/BuNIGEf02IZQ4nfDxE90QwnDeoYeV+FvNj9UEOopJ5z5bMPoTHxm4cCD NypQthI85pc= =G9mT -----END PGP SIGNATURE----- Merge tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system Pull "Disintegrate and delete asm/system.h" from David Howells: "Here are a bunch of patches to disintegrate asm/system.h into a set of separate bits to relieve the problem of circular inclusion dependencies. I've built all the working defconfigs from all the arches that I can and made sure that they don't break. The reason for these patches is that I recently encountered a circular dependency problem that came about when I produced some patches to optimise get_order() by rewriting it to use ilog2(). This uses bitops - and on the SH arch asm/bitops.h drags in asm-generic/get_order.h by a circuituous route involving asm/system.h. The main difficulty seems to be asm/system.h. It holds a number of low level bits with no/few dependencies that are commonly used (eg. memory barriers) and a number of bits with more dependencies that aren't used in many places (eg. switch_to()). These patches break asm/system.h up into the following core pieces: (1) asm/barrier.h Move memory barriers here. This already done for MIPS and Alpha. (2) asm/switch_to.h Move switch_to() and related stuff here. (3) asm/exec.h Move arch_align_stack() here. Other process execution related bits could perhaps go here from asm/processor.h. (4) asm/cmpxchg.h Move xchg() and cmpxchg() here as they're full word atomic ops and frequently used by atomic_xchg() and atomic_cmpxchg(). (5) asm/bug.h Move die() and related bits. (6) asm/auxvec.h Move AT_VECTOR_SIZE_ARCH here. Other arch headers are created as needed on a per-arch basis." Fixed up some conflicts from other header file cleanups and moving code around that has happened in the meantime, so David's testing is somewhat weakened by that. We'll find out anything that got broken and fix it.. * tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits) Delete all instances of asm/system.h Remove all #inclusions of asm/system.h Add #includes needed to permit the removal of asm/system.h Move all declarations of free_initmem() to linux/mm.h Disintegrate asm/system.h for OpenRISC Split arch_align_stack() out from asm-generic/system.h Split the switch_to() wrapper out of asm-generic/system.h Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h Create asm-generic/barrier.h Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h Disintegrate asm/system.h for Xtensa Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt] Disintegrate asm/system.h for Tile Disintegrate asm/system.h for Sparc Disintegrate asm/system.h for SH Disintegrate asm/system.h for Score Disintegrate asm/system.h for S390 Disintegrate asm/system.h for PowerPC Disintegrate asm/system.h for PA-RISC Disintegrate asm/system.h for MN10300 ...	2012-03-28 15:58:21 -07:00
David Howells	f05e798ad4	Disintegrate asm/system.h for X86 Disintegrate asm/system.h for X86. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: H. Peter Anvin <hpa@zytor.com> cc: x86@kernel.org	2012-03-28 18:11:12 +01:00
Ingo Molnar	7fd52392c5	Merge branch 'linus' into perf/urgent Merge reason: we need to fix a non-trivial merge conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-26 17:19:03 +02:00
Linus Torvalds	ed2d265d12	The following text was taken from the original review request: "[RFC - PATCH 0/7] consolidation of BUG support code." https://lkml.org/lkml/2012/1/26/525 -- The changes shown here are to unify linux's BUG support under the one <linux/bug.h> file. Due to historical reasons, we have some BUG code in bug.h and some in kernel.h -- i.e. the support for BUILD_BUG in linux/kernel.h predates the addition of linux/bug.h, but old code in kernel.h wasn't moved to bug.h at that time. As a band-aid, kernel.h was including <asm/bug.h> to pseudo link them. This has caused confusion[1] and general yuck/WTF[2] reactions. Here is an example that violates the principle of least surprise: CC lib/string.o lib/string.c: In function 'strlcat': lib/string.c:225:2: error: implicit declaration of function 'BUILD_BUG_ON' make[2]: *** [lib/string.o] Error 1 $ $ grep linux/bug.h lib/string.c #include <linux/bug.h> $ We've included <linux/bug.h> for the BUG infrastructure and yet we still get a compile fail! [We've not kernel.h for BUILD_BUG_ON.] Ugh - very confusing for someone who is new to kernel development. With the above in mind, the goals of this changeset are: 1) find and fix any include/.h files that were relying on the implicit presence of BUG code. 2) find and fix any C files that were consuming kernel.h and hence relying on implicitly getting some/all BUG code. 3) Move the BUG related code living in kernel.h to <linux/bug.h> 4) remove the asm/bug.h from kernel.h to finally break the chain. During development, the order was more like 3-4, build-test, 1-2. But to ensure that git history for bisect doesn't get needless build failures introduced, the commits have been reorderd to fix the problem areas in advance. [1] https://lkml.org/lkml/2012/1/3/90 [2] https://lkml.org/lkml/2012/1/17/414 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPbNwpAAoJEOvOhAQsB9HWrqYP/A0t9VB0nK6e42F0OR2P14MZ GJFtf1B++wwioIrx+KSWSRfSur1C5FKhDbxLR3I/pvkAYl4+T4JvRdMG6xJwxyip CC1kVQQNDjWVVqzjz2x6rYkOffx6dUlw/ERyIyk+OzP+1HzRIsIrugMqbzGLlX0X y0v2Tbd0G6xg1DV8lcRdp95eIzcGuUvdb2iY2LGadWZczEOeSXx64Jz3QCFxg3aL LFU4oovsg8Nb7MRJmqDvHK/oQf5vaTm9WSrS0pvVte0msSQRn8LStYdWC0G9BPCS GwL86h/eLXlUXQlC5GpgWg1QQt5i2QpjBFcVBIG0IT5SgEPMx+gXyiqZva2KwbHu LKicjKtfnzPitQnyEV/N6JyV1fb1U6/MsB7ebU5nCCzt9Gr7MYbjZ44peNeprAtu HMvJ/BNnRr4Ha6nPQNu952AdASPKkxmeXFUwBL1zUbLkOX/bK/vy1ujlcdkFxCD7 fP3t7hghYa737IHk0ehUOhrE4H67hvxTSCKioLUAy/YeN1IcfH/iOQiCBQVLWmoS AqYV6ou9cqgdYoyila2UeAqegb+8xyubPIHt+lebcaKxs5aGsTg+r3vq5juMDAPs iwSVYUDcIw9dHer1lJfo7QCy3QUTRDTxh+LB9VlHXQICgeCK02sLBOi9hbEr4/H8 Ko9g8J3BMxcMkXLHT9ud =PYQT -----END PGP SIGNATURE----- Merge tag 'bug-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Pull <linux/bug.h> cleanup from Paul Gortmaker: "The changes shown here are to unify linux's BUG support under the one <linux/bug.h> file. Due to historical reasons, we have some BUG code in bug.h and some in kernel.h -- i.e. the support for BUILD_BUG in linux/kernel.h predates the addition of linux/bug.h, but old code in kernel.h wasn't moved to bug.h at that time. As a band-aid, kernel.h was including <asm/bug.h> to pseudo link them. This has caused confusion[1] and general yuck/WTF[2] reactions. Here is an example that violates the principle of least surprise: CC lib/string.o lib/string.c: In function 'strlcat': lib/string.c:225:2: error: implicit declaration of function 'BUILD_BUG_ON' make[2]: ** [lib/string.o] Error 1 $ $ grep linux/bug.h lib/string.c #include <linux/bug.h> $ We've included <linux/bug.h> for the BUG infrastructure and yet we still get a compile fail! [We've not kernel.h for BUILD_BUG_ON.] Ugh - very confusing for someone who is new to kernel development. With the above in mind, the goals of this changeset are: 1) find and fix any include/.h files that were relying on the implicit presence of BUG code. 2) find and fix any C files that were consuming kernel.h and hence relying on implicitly getting some/all BUG code. 3) Move the BUG related code living in kernel.h to <linux/bug.h> 4) remove the asm/bug.h from kernel.h to finally break the chain. During development, the order was more like 3-4, build-test, 1-2. But to ensure that git history for bisect doesn't get needless build failures introduced, the commits have been reorderd to fix the problem areas in advance. [1] https://lkml.org/lkml/2012/1/3/90 [2] https://lkml.org/lkml/2012/1/17/414" Fix up conflicts (new radeon file, reiserfs header cleanups) as per Paul and linux-next. tag 'bug-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: kernel.h: doesn't explicitly use bug.h, so don't include it. bug: consolidate BUILD_BUG_ON with other bug code BUG: headers with BUG/BUG_ON etc. need linux/bug.h bug.h: add include of it to various implicit C users lib: fix implicit users of kernel.h for TAINT_WARN spinlock: macroize assert_spin_locked to avoid bug.h dependency x86: relocate get/set debugreg fcns to include/asm/debugreg.	2012-03-24 10:08:39 -07:00
Akinobu Mita	307b1cd7ec	bitops: rename for_each_set_bit_cont() in favor of analogous list.h function This renames for_each_set_bit_cont() to for_each_set_bit_from() because it is analogous to list_for_each_entry_from() in list.h rather than list_for_each_entry_continue(). This doesn't remove for_each_set_bit_cont() for now. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-23 16:58:33 -07:00
Yinghai Lu	0b8b8078cb	x86: Fix excessive MSR print out when show_msr is not specified Dave found: \| During bootup, I now have 162 messages like this.. \| [ 0.227346] MSR0000001b: 00000000fee00900 \| [ 0.227465] MSR00000021: 0000000000000001 \| [ 0.227584] MSR0000002a: 00000000c1c81400 \| \| commit `21c3fcf3e3` looks suspect. \| It claims that it will only print these out if show_msr= is \| passed, but that doesn't seem to be the case. Fix it by changing to the version that checks the index. Reported-and-tested-by: Dave Jones <davej@redhat.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1332477103-4595-1-git-send-email-yinghai@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-03-23 09:55:00 +01:00
Peter Zijlstra	c7206205d0	perf: Fix mmap_page capabilities and docs Complete the syscall-less self-profiling feature and address all complaints, namely: - capabilities, so we can detect what is actually available at runtime Add a capabilities field to perf_event_mmap_page to indicate what is actually available for use. - on x86: RDPMC weirdness due to being 40/48 bits and not sign-extending properly. - ABI documentation as to how all this stuff works. Also improve the documentation for the new features. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vweaver1@eecs.utk.edu> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1332433596.2487.33.camel@twins Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-03-23 09:52:16 +01:00
Linus Torvalds	28f23d1f3b	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 "urgent" leftovers from Ingo Molnar: "Pending x86/urgent bits that were not high prio enough to warrant -rc-less v3.3-final inclusion." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, efi: Fix pointer math issue in handle_ramdisks() x86/ioapic: Add register level checks to detect bogus io-apic entries x86, mce: Fix rcu splat in drain_mce_log_buffer() x86, memblock: Move mem_hole_size() to .init	2012-03-22 09:44:50 -07:00
Linus Torvalds	754b980077	Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull MCE changes from Ingo Molnar. * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Fix return value of mce_chrdev_read() when erst is disabled x86/mce: Convert static array of pointers to per-cpu variables x86/mce: Replace hard coded hex constants with symbolic defines x86/mce: Recognise machine check bank signature for data path error x86/mce: Handle "action required" errors x86/mce: Add mechanism to safely save information in MCE handler x86/mce: Create helper function to save addr/misc when needed HWPOISON: Add code to handle "action required" errors. HWPOISON: Clean up memory_failure() vs. __memory_failure()	2012-03-22 09:42:04 -07:00
Linus Torvalds	35cb8d9e18	Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/fpu changes from Ingo Molnar. * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: i387: Split up <asm/i387.h> into exported and internal interfaces i387: Uninline the generic FP helpers that we expose to kernel modules	2012-03-22 09:41:22 -07:00
Linus Torvalds	4c64616bb5	Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/debug changes from Ingo Molnar. * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Fix section warnings x86-64: Fix CFI data for common_interrupt() x86: Properly _init-annotate NMI selftest code x86/debug: Fix/improve the show_msr=<cpus> debug print out	2012-03-22 09:30:39 -07:00
Linus Torvalds	4a52246302	driver core merge for 3.4-rc1 Here's the big driver core merge for 3.4-rc1. Lots of various things here, sysfs fixes/tweaks (with the nlink breakage reverted), dynamic debugging updates, w1 drivers, hyperv driver updates, and a variety of other bits and pieces, full information in the shortlog. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iEYEABECAAYFAk9neCsACgkQMUfUDdst+ylyQwCfY2eizvzw5HhjQs8gOiBRDADe yrgAnj1Zan2QkoCnQIFJNAoxqNX9yAhd =biH6 -----END PGP SIGNATURE----- Merge tag 'driver-core-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core patches for 3.4-rc1 from Greg KH: "Here's the big driver core merge for 3.4-rc1. Lots of various things here, sysfs fixes/tweaks (with the nlink breakage reverted), dynamic debugging updates, w1 drivers, hyperv driver updates, and a variety of other bits and pieces, full information in the shortlog." * tag 'driver-core-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (78 commits) Tools: hv: Support enumeration from all the pools Tools: hv: Fully support the new KVP verbs in the user level daemon Drivers: hv: Support the newly introduced KVP messages in the driver Drivers: hv: Add new message types to enhance KVP regulator: Support driver probe deferral Revert "sysfs: Kill nlink counting." uevent: send events in correct order according to seqnum (v3) driver core: minor comment formatting cleanups driver core: move the deferred probe pointer into the private area drivercore: Add driver probe deferral mechanism DS2781 Maxim Stand-Alone Fuel Gauge battery and w1 slave drivers w1_bq27000: Only one thread can access the bq27000 at a time. w1_bq27000 - remove w1_bq27000_write w1_bq27000: remove unnecessary NULL test. sysfs: Fix memory leak in sysfs_sd_setsecdata(). intel_idle: Revert change of auto_demotion_disable_flags for Nehalem w1: Fix w1_bq27000 driver-core: documentation: fix up Greg's email address powernow-k6: Really enable auto-loading powernow-k7: Fix CPU family number ...	2012-03-20 11:16:20 -07:00
Jiri Olsa	641cc93881	perf: Adding sysfs group format attribute for pmu device Adding sysfs group 'format' attribute for pmu device that contains a syntax description on how to construct raw events. The event configuration is described in following struct pefr_event_attr attributes: config config1 config2 Each sysfs attribute within the format attribute group, describes mapping of name and bitfield definition within one of above attributes. eg: "/sys/...<dev>/format/event" contains "config:0-7" "/sys/...<dev>/format/umask" contains "config:8-15" "/sys/...<dev>/format/usr" contains "config:16" the attribute value syntax is: line: config ':' bits config: 'config' \| 'config1' \| 'config2" bits: bits ',' bit_term \| bit_term bit_term: VALUE '-' VALUE \| VALUE Adding format attribute definitions for x86 cpu pmus. Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/n/tip-vhdk5y2hyype9j63prymty36@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2012-03-16 14:06:06 -03:00
Ingo Molnar	ea281a9eba	Two miscellaneous MCE fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iQIcBAABAgAGBQJPX7q/AAoJEKurIx+X31iBIHIQAIjXmz+5cCp/sDjtcoiPvcvu 0OLspJbT4jswGdiFP+Xvn7MEJH2Q+Bj8ejejCrpyXG4R1v/K9zmzDwyAGaJbLiGw T5I9L9/GesYhCoIzyT7IfyNCtjsXMRmdmxLbxvyWxXiw/tT355Reb8Pqsq3Xy9dA 0swO69MC3oJcYG19pLgv+QrtZaK79aiJqe5YmDIIoLYo8jIhnza2bKn3CKvQLV34 8jIKTyo31MOTVZW9qy+pt5kNrnUgtmMBtdEsfVQdrAxKyEXwBd0rFRLHZ0c8YQit QcI0/MyK5la3swtiGnEiJQOvGGP/VIfPklVdw3ahLAthYUC+mof6zDBqpSGheJBQ f0Tbm0k3uWzVVij6Au2i3xi+nKNiPtUlOCcE1EPFvOYh0QOSDy4TPdxRDCZHHPeA hc5nbLTqwxAc8BE3vnDySztCGoeA3/UEOba1Kc4meKyLma7pJ4eNEL9j40XInn/z JRNbfj4BuSgmpV5nZbsb9jU/f15OfOuwZ+UMz//PYuIEI6gK3aYXojc5x1Z++Dot G6GIF4zs1zkzOnvaNeB8KZkmtWvRl/7dZuEM5SMDU+V0lj5xDs3nM4AhYPYSAvX2 knmj//Gr/NE/DpAH0+bZIqzz6eduGc8HV5Y0nWlWvPKO+qs4n9uE0/8VL+5lZcQ8 kojVLi2WTLvhCsnxUV3z =G63l -----END PGP SIGNATURE----- Merge tag 'mce-for-tip' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/mce Apply two miscellaneous MCE fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-14 07:44:48 +01:00
Ingo Molnar	cd593accdc	Linux 3.3-rc7 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJPW8yUAAoJEHm+PkMAQRiGhFIH/RGUPxGmUkJv8EP5I4HDA4dJ c6/PrzZCHs8rxzYzvn7ojXqZGXTOAA5ZgS9A6LkJ2sxMFvgMnkpFi6B4CwMzizS3 vLWo/HNxbiTCNGFfQrhQB8O58uNI8wOBa87lrQfkXkDqN0cFhdjtIxeY1BD9LXIo qbWysGxCcZhJWHapsQ3NZaVJQnIK5vA/+mhyYP4HzbcHI3aWnbIEZ8GQKeY28Ch0 +pct5UQBjZavV5SujaW0Xd65oIiycm8XHAQw6FxQy//DfaabauWgFteR162Q/oew xxUBDOHF3nO1bdteHHaYqxig0j1MbIHsqxTnE/neR8UryF04//1SFF7DYuY/1pg= =SV5V -----END PGP SIGNATURE----- Merge tag 'v3.3-rc7' into x86/mce Merge reason: Update from an ancient -rc1 base to an almost-final stable kernel. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-14 07:44:11 +01:00
Ingo Molnar	bea95c152d	Merge branch 'perf/hw-branch-sampling' into perf/core Merge reason: The 'perf record -b' hardware branch sampling feature is ready for upstream. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-12 20:47:05 +01:00
Peter Zijlstra	f9b4eeb809	perf/x86: Prettify pmu config literals I got somewhat tired of having to decode hex numbers.. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Stephane Eranian <eranian@google.com> Cc: Robert Richter <robert.richter@amd.com> Link: http://lkml.kernel.org/n/tip-0vsy1sgywc4uar3mu1szm0rg@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-12 20:44:54 +01:00
Ingo Molnar	35239e23c6	Merge branch 'perf/urgent' into perf/core Merge reason: We are going to queue up a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-12 20:44:11 +01:00
Peter Zijlstra	87e24f4b67	perf/x86: Fix local vs remote memory events for NHM/WSM Verified using the below proglet.. before: [root@westmere ~]# perf stat -e node-stores -e node-store-misses ./numa 0 remote write Performance counter stats for './numa 0': 2,101,554 node-stores 2,096,931 node-store-misses 5.021546079 seconds time elapsed [root@westmere ~]# perf stat -e node-stores -e node-store-misses ./numa 1 local write Performance counter stats for './numa 1': 501,137 node-stores 199 node-store-misses 5.124451068 seconds time elapsed After: [root@westmere ~]# perf stat -e node-stores -e node-store-misses ./numa 0 remote write Performance counter stats for './numa 0': 2,107,516 node-stores 2,097,187 node-store-misses 5.012755149 seconds time elapsed [root@westmere ~]# perf stat -e node-stores -e node-store-misses ./numa 1 local write Performance counter stats for './numa 1': 2,063,355 node-stores 165 node-store-misses 5.082091494 seconds time elapsed #define _GNU_SOURCE #include <sched.h> #include <stdio.h> #include <errno.h> #include <sys/mman.h> #include <sys/types.h> #include <dirent.h> #include <signal.h> #include <unistd.h> #include <numaif.h> #include <stdlib.h> #define SIZE (3210241024) volatile int done; void sig_done(int sig) { done = 1; } int main(int argc, char *argv) { cpu_set_t mask, mask2; size_t size; int i, err, t; int nrcpus = 1024; char mem; unsigned long nodemask = 0x01; /* node 0 / DIR node; struct dirent de; int read = 0; int local = 0; if (argc < 2) { printf("usage: %s [0-3]\n", argv[0]); printf(" bit0 - local/remote\n"); printf(" bit1 - read/write\n"); exit(0); } switch (atoi(argv[1])) { case 0: printf("remote write\n"); break; case 1: printf("local write\n"); local = 1; break; case 2: printf("remote read\n"); read = 1; break; case 3: printf("local read\n"); local = 1; read = 1; break; } mask = CPU_ALLOC(nrcpus); size = CPU_ALLOC_SIZE(nrcpus); CPU_ZERO_S(size, mask); node = opendir("/sys/devices/system/node/node0/"); if (!node) perror("opendir"); while ((de = readdir(node))) { int cpu; if (sscanf(de->d_name, "cpu%d", &cpu) == 1) CPU_SET_S(cpu, size, mask); } closedir(node); mask2 = CPU_ALLOC(nrcpus); CPU_ZERO_S(size, mask2); for (i = 0; i < size; i++) CPU_SET_S(i, size, mask2); CPU_XOR_S(size, mask2, mask2, mask); // invert if (!local) mask = mask2; err = sched_setaffinity(0, size, mask); if (err) perror("sched_setaffinity"); mem = mmap(0, SIZE, PROT_READ\|PROT_WRITE, MAP_PRIVATE\|MAP_ANONYMOUS, -1, 0); err = mbind(mem, SIZE, MPOL_BIND, &nodemask, 8sizeof(nodemask), MPOL_MF_MOVE); if (err) perror("mbind"); signal(SIGALRM, sig_done); alarm(5); if (!read) { while (!done) { for (i = 0; i < SIZE; i++) mem[i] = 0x01; } } else { while (!done) { for (i = 0; i < SIZE; i++) t += (volatile char )(mem + i); } } return 0; } Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/n/tip-tq73sxus35xmqpojf7ootxgs@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-12 20:43:41 +01:00
Greg Kroah-Hartman	263a5c8e16	Merge 3.3-rc6 into driver-core-next This was done to resolve a conflict in the drivers/base/cpu.c file. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-03-09 12:35:53 -08:00
Robert Richter	db98c5faf8	perf/x86: Implement 64-bit counter support for IBS This patch implements 64 bit counter support for IBS. The sampling period is no longer limited to the hw counter width. The functions perf_event_set_period() and perf_event_try_update() can be used as generic functions. They can replace similar code that is duplicate across architectures. Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1323968199-9326-5-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-08 11:35:22 +01:00
Robert Richter	4db2e8e650	perf/x86: Implement IBS pmu control ops Add code to control the IBS pmu. We need to maintain per-cpu states. Since some states are used and changed by the nmi handler, access to these states must be atomic. Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1323968199-9326-4-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-08 11:35:22 +01:00
Robert Richter	b7074f1fbd	perf/x86: Implement IBS interrupt handler This patch implements code to handle ibs interrupts. If ibs data is available a raw perf_event data sample is created and sent back to the userland. This patch only implements the storage of ibs data in the raw sample, but this could be extended in a later patch by generating generic event data such as the rip from the ibs sampling data. Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1323968199-9326-3-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-08 11:35:21 +01:00
Robert Richter	510419435c	perf/x86: Implement IBS event configuration This patch implements perf configuration for AMD IBS. The IBS pmu is selected using the type attribute in sysfs. There are two types of ibs pmus, for instruction fetch (IBS_FETCH) and for instruction execution (IBS_OP): /sys/bus/event_source/devices/ibs_fetch/type /sys/bus/event_source/devices/ibs_op/type Except for the sample period IBS can only be set up with raw config values and raw data samples. The event attributes for the syscall should be programmed like this (IBS_FETCH): type = get_pmu_type("/sys/bus/event_source/devices/ibs_fetch/type"); memset(&attr, 0, sizeof(attr)); attr.type = type; attr.sample_type = PERF_SAMPLE_CPU \| PERF_SAMPLE_RAW; attr.config = IBS_FETCH_CONFIG_DEFAULT; This implementation does not yet support 64 bit counters. It is limited to the hardware counter bit width which is 20 bits. 64 bit support can be added later. Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1323968199-9326-2-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-08 11:35:21 +01:00
Srivatsa S. Bhat	b11e3d782b	x86, mce: Fix rcu splat in drain_mce_log_buffer() While booting, the following message is seen: [ 21.665087] =============================== [ 21.669439] [ INFO: suspicious RCU usage. ] [ 21.673798] 3.2.0-0.0.0.28.36b5ec9-default #2 Not tainted [ 21.681353] ------------------------------- [ 21.685864] arch/x86/kernel/cpu/mcheck/mce.c:194 suspicious rcu_dereference_index_check() usage! [ 21.695013] [ 21.695014] other info that might help us debug this: [ 21.695016] [ 21.703488] [ 21.703489] rcu_scheduler_active = 1, debug_locks = 1 [ 21.710426] 3 locks held by modprobe/2139: [ 21.714754] #0: (&__lockdep_no_validate__){......}, at: [<ffffffff8133afd3>] __driver_attach+0x53/0xa0 [ 21.725020] #1: [ 21.725323] ioatdma: Intel(R) QuickData Technology Driver 4.00 [ 21.733206] (&__lockdep_no_validate__){......}, at: [<ffffffff8133afe1>] __driver_attach+0x61/0xa0 [ 21.743015] #2: (i7core_edac_lock){+.+.+.}, at: [<ffffffffa01cfa5f>] i7core_probe+0x1f/0x5c0 [i7core_edac] [ 21.753708] [ 21.753709] stack backtrace: [ 21.758429] Pid: 2139, comm: modprobe Not tainted 3.2.0-0.0.0.28.36b5ec9-default #2 [ 21.768253] Call Trace: [ 21.770838] [<ffffffff810977cd>] lockdep_rcu_suspicious+0xcd/0x100 [ 21.777366] [<ffffffff8101aa41>] drain_mcelog_buffer+0x191/0x1b0 [ 21.783715] [<ffffffff8101aa78>] mce_register_decode_chain+0x18/0x20 [ 21.790430] [<ffffffffa01cf8db>] i7core_register_mci+0x2fb/0x3e4 [i7core_edac] [ 21.798003] [<ffffffffa01cfb14>] i7core_probe+0xd4/0x5c0 [i7core_edac] [ 21.804809] [<ffffffff8129566b>] local_pci_probe+0x5b/0xe0 [ 21.810631] [<ffffffff812957c9>] __pci_device_probe+0xd9/0xe0 [ 21.816650] [<ffffffff813362e4>] ? get_device+0x14/0x20 [ 21.822178] [<ffffffff81296916>] pci_device_probe+0x36/0x60 [ 21.828061] [<ffffffff8133ac8a>] really_probe+0x7a/0x2b0 [ 21.833676] [<ffffffff8133af23>] driver_probe_device+0x63/0xc0 [ 21.839868] [<ffffffff8133b01b>] __driver_attach+0x9b/0xa0 [ 21.845718] [<ffffffff8133af80>] ? driver_probe_device+0xc0/0xc0 [ 21.852027] [<ffffffff81339168>] bus_for_each_dev+0x68/0x90 [ 21.857876] [<ffffffff8133aa3c>] driver_attach+0x1c/0x20 [ 21.863462] [<ffffffff8133a64d>] bus_add_driver+0x16d/0x2b0 [ 21.869377] [<ffffffff8133b6dc>] driver_register+0x7c/0x160 [ 21.875220] [<ffffffff81296bda>] __pci_register_driver+0x6a/0xf0 [ 21.881494] [<ffffffffa01fe000>] ? 0xffffffffa01fdfff [ 21.886846] [<ffffffffa01fe047>] i7core_init+0x47/0x1000 [i7core_edac] [ 21.893737] [<ffffffff810001ce>] do_one_initcall+0x3e/0x180 [ 21.899670] [<ffffffff810a9b95>] sys_init_module+0xc5/0x220 [ 21.905542] [<ffffffff8149bc39>] system_call_fastpath+0x16/0x1b Fix this by using ACCESS_ONCE() instead of rcu_dereference_check_mce() over mcelog.next. Since the access to each entry is controlled by the ->finished field, ACCESS_ONCE() should work just fine. An rcu_dereference is unnecessary here. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-03-07 11:44:29 +01:00
Stephane Eranian	d010b3326c	perf: Add callback to flush branch_stack on context switch With branch stack sampling, it is possible to filter by priv levels. In system-wide mode, that means it is possible to capture only user level branches. The builtin SW LBR filter needs to disassemble code based on LBR captured addresses. For that, it needs to know the task the addresses are associated with. Because of context switches, the content of the branch stack buffer may contain addresses from different tasks. We need a callback on context switch to either flush the branch stack or save it. This patch adds a new callback in struct pmu which is called during context switches. The callback is called only when necessary. That is when a system-wide context has, at least, one event which uses PERF_SAMPLE_BRANCH_STACK. The callback is never called for per-thread context. In this version, the Intel x86 code simply flushes (resets) the LBR on context switches (fills it with zeroes). Those zeroed branches are then filtered out by the SW filter. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-11-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-05 14:55:42 +01:00
Stephane Eranian	2481c5fa6d	perf: Disable PERF_SAMPLE_BRANCH_* when not supported PERF_SAMPLE_BRANCH_* is disabled for: - SW events (sw counters, tracepoints) - HW breakpoints - ALL but Intel x86 architecture - AMD64 processors Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-10-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-05 14:55:42 +01:00
Stephane Eranian	3e702ff6d1	perf/x86: Add LBR software filter support for Intel CPUs This patch adds an internal sofware filter to complement the (optional) LBR hardware filter. The software filter is necessary: - as a substitute when there is no HW LBR filter (e.g., Atom, Core) - to complement HW LBR filter in case of errata (e.g., Nehalem/Westmere) - to provide finer grain filtering (e.g., all processors) Sometimes the LBR HW filter cannot distinguish between two types of branches. For instance, to capture syscall as CALLS, it is necessary to enable the LBR_FAR filter which will also capture JMP instructions. Thus, a second pass is necessary to filter those out, this is what the SW filter can do. The SW filter is built on top of the internal x86 disassembler. It is a best effort filter especially for user level code. It is subject to the availability of the text page of the program. The SW filter is enabled on all Intel processors. It is bypassed when the user is capturing all branches at all priv levels. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-9-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-05 14:55:42 +01:00
Stephane Eranian	60ce0fbd07	perf/x86: Implement PERF_SAMPLE_BRANCH for Intel CPUs This patch implements PERF_SAMPLE_BRANCH support for Intel x86processors. It connects PERF_SAMPLE_BRANCH to the actual LBR. The patch adds the hooks in the PMU irq handler to save the LBR on counter overflow for both regular and PEBS modes. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-8-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-05 14:55:41 +01:00
Stephane Eranian	88c9a65e13	perf/x86: Disable LBR support for older Intel Atom processors The patch adds a restriction for Intel Atom LBR support. Only steppings 10 (PineView) and more recent are supported. Older models do not have a functional LBR. Their LBR does not freeze on PMU interrupt which makes LBR unusable in the context of perf_events. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-7-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-05 14:55:41 +01:00
Stephane Eranian	c5cc2cd906	perf/x86: Add Intel LBR mappings for PERF_SAMPLE_BRANCH filters This patch adds the mappings from the generic PERF_SAMPLE_BRANCH_* filters to the actual Intel x86LBR filters, whenever they exist. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-6-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-05 14:55:41 +01:00
Stephane Eranian	ff3fb511ba	perf/x86: Sync branch stack sampling with precise_sampling If precise sampling is enabled on Intel x86 then perf_event uses PEBS. To correct for the off-by-one error of PEBS, perf_event uses LBR when precise_sample > 1. On Intel x86 PERF_SAMPLE_BRANCH_STACK is implemented using LBR, therefore both features must be coordinated as they may not configure LBR the same way. For PEBS, LBR needs to capture all branches at the priv level of the associated event. This patch checks that the branch type and priv level of BRANCH_STACK is compatible with that of the PEBS LBR requirement, thereby allowing: $ perf record -b any,u -e instructions:upp .... But: $ perf record -b any_call,u -e instructions:upp Is not possible. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-5-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-05 14:55:40 +01:00
Stephane Eranian	b36817e886	perf/x86: Add Intel LBR sharing logic The Intel LBR on some recent processor is capable of filtering branches by type. The filter is configurable via the LBR_SELECT MSR register. There are limitation on how this register can be used. On Nehalem/Westmere, the LBR_SELECT is shared by the two HT threads when HT is on. It is private to each core when HT is off. On SandyBridge, the LBR_SELECT register is private to each thread when HT is on. It is private to each core when HT is off. The kernel must manage the sharing of LBR_SELECT. It allows multiple users on the same logical CPU to use LBR_SELECT as long as they program it with the same value. Across sibling CPUs (HT threads), the same restriction applies on NHM/WSM. This patch implements this sharing logic by leveraging the mechanism put in place for managing the offcore_response shared MSR. We modify __intel_shared_reg_get_constraints() to cause x86_get_event_constraint() to be called because LBR may be associated with events that may be counter constrained. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-4-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-05 14:55:40 +01:00
Stephane Eranian	225ce53910	perf/x86: Add Intel LBR MSR definitions This patch adds the LBR definitions for NHM/WSM/SNB and Core. It also adds the definitions for the architected LBR MSR: LBR_SELECT, LBRT_TOS. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-3-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-05 14:55:39 +01:00
Stephane Eranian	bce38cd53e	perf: Add generic taken branch sampling support This patch adds the ability to sample taken branches to the perf_event interface. The ability to capture taken branches is very useful for all sorts of analysis. For instance, basic block profiling, call counts, statistical call graph. This new capability requires hardware assist and as such may not be available on all HW platforms. On Intel x86 it is implemented on top of the Last Branch Record (LBR) facility. To enable taken branches sampling, the PERF_SAMPLE_BRANCH_STACK bit must be set in attr->sample_type. Sampled taken branches may be filtered by type and/or priv levels. The patch adds a new field, called branch_sample_type, to the perf_event_attr structure. It contains a bitmask of filters to apply to the sampled taken branches. Filters may be implemented in HW. If the HW filter does not exist or is not good enough, some arch may also implement a SW filter. The following generic filters are currently defined: - PERF_SAMPLE_USER only branches whose targets are at the user level - PERF_SAMPLE_KERNEL only branches whose targets are at the kernel level - PERF_SAMPLE_HV only branches whose targets are at the hypervisor level - PERF_SAMPLE_ANY any type of branches (subject to priv levels filters) - PERF_SAMPLE_ANY_CALL any call branches (may incl. syscall on some arch) - PERF_SAMPLE_ANY_RET any return branches (may incl. syscall returns on some arch) - PERF_SAMPLE_IND_CALL indirect call branches Obviously filter may be combined. The priv level bits are optional. If not provided, the priv level of the associated event are used. It is possible to collect branches at a priv level different from the associated event. Use of kernel, hv priv levels is subject to permissions and availability (hv). The number of taken branch records present in each sample may vary based on HW, the type of sampled branches, the executed code. Therefore each sample contains the number of taken branches it contains. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-2-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-05 14:55:39 +01:00
Ingo Molnar	737f24bda7	Merge branch 'perf/urgent' into perf/core Conflicts: tools/perf/builtin-record.c tools/perf/builtin-top.c tools/perf/perf.h tools/perf/util/top.h Merge reason: resolve these cherry-picking conflicts. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-05 09:20:08 +01:00
Joerg Roedel	1018faa6cf	perf/x86/kvm: Fix Host-Only/Guest-Only counting with SVM disabled It turned out that a performance counter on AMD does not count at all when the GO or HO bit is set in the control register and SVM is disabled in EFER. This patch works around this issue by masking out the HO bit in the performance counter control register when SVM is not enabled. The GO bit is not touched because it is only set when the user wants to count in guest-mode only. So when SVM is disabled the counter should not run at all and the not-counting is the intended behaviour. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Avi Kivity <avi@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: David Ahern <dsahern@gmail.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Robert Richter <robert.richter@amd.com> Cc: stable@vger.kernel.org # v3.2 Link: http://lkml.kernel.org/r/1330523852-19566-1-git-send-email-joerg.roedel@amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-02 12:16:39 +01:00
H. Peter Anvin	b263b31e8a	x86, mtrr: Use explicit sizing and padding for the 64-bit ioctls Specify the data structures for the 64-bit ioctls with explicit sizing and padding so that the x32 kernel will correctly use the 64-bit forms of these ioctls. Note that these ioctls are bogus in both forms on both 32 and 64 bits; even on 64 bits the maximum MTRR size is only 44 bits long. Note that nothing really is supposed to use these ioctls and that the preferred interface is text strings on /proc/mtrr, or better yet, nothing at all (use /sys/bus/pci/devices//resource_wc for write combining; that uses PAT not MTRRs.) Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: H. J. Lu <hjl.tools@gmail.com> Tested-by: Nitin A. Kamble <nitin.a.kamble@intel.com> Link: http://lkml.kernel.org/n/tip-vwvnlu3hjmtkwvij4qxtm90l@git.kernel.org Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-03-01 12:48:52 -08:00
Paul Gortmaker	f649e9388c	x86: relocate get/set debugreg fcns to include/asm/debugreg. Since we already have a debugreg.h header file, move the assoc. get/set functions to it. In addition to it being the logical home for them, it has a secondary advantage. The functions that are moved use BUG(). So we really need to have linux/bug.h in scope. But asm/processor.h is used about 600 times, vs. only about 15 for debugreg.h -- so adding bug.h to the latter reduces the amount of time we'll be processing it during a compile. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Ingo Molnar <mingo@elte.hu> CC: Thomas Gleixner <tglx@linutronix.de> CC: "H. Peter Anvin" <hpa@zytor.com>	2012-02-28 17:48:04 -05:00
Linus Torvalds	e25bda5642	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce/AMD: Fix UP build error x86: Specify a size for the cmp in the NMI handler x86/nmi: Test saved %cs in NMI to determine nested NMI case x86/amd: Fix L1i and L2 cache sharing information for AMD family 15h processors x86/microcode: Remove noisy AMD microcode warning	2012-02-27 07:55:51 -08:00
Ingo Molnar	11b91d6fe7	Symbolic defines for architectural MCACOD constants -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iQIcBAABAgAGBQJPRpJQAAoJEKurIx+X31iBpdUQAJlKRxgGXcaJOykfzIys4kaF PtyqUaV4nmlDCkwoP0MBn3192IAxPRX+PN7u6VCmniiB7H8nGtXHiAK7yhJ1UhDB zDCnNmJ5hliICZ5nORolr9zpWgoZ/RkIest//OZUdO7hjnpEqZ2MP7wCHkpA6/Xu M6OeU/R671pTea9JluS+8AiVl+szxyVoFzbDJ4xXl5Dr2xYCP7tZfNkD1odf0LNk cun5jUov6+UmPTJGm/ve61/eJeOjvrEYDBrWpYtCh7cZkSbLC+Qg9XrHNNQxBEIx 7wFvdhEXX5SiKV2p5DkMzDf6yDlLnf/qerbY3UhUqybbGVXrTmuUI1NAJY5nteIJ InXRJWeEuzpqSqbozE7V9kovmxMO8O6LPLhe9R/cAJAkt5qxL1sQc9FgWZ4c7VCb tFmyVzRmjijdw9EE13U+5GTBvHr28WaOKaveI5uMZakfE93Pt1rHbpxWHSyq05gM 3nr+Vm4VjVeQRww+lsRJ3TZ/F/ThwMRmdwS0QXjuncUkVcU5iOioNsWFeupjPsbg DjjaVLQd4SYXBzvOcsEEAOGtmY1jiLYJnxGTAkM/gnN9uNyLSWCA5YnnrYx74yds HXpm155BiCtir1A5do+QbcrMaTig/zt12tpnHnoYiifKNYhO5AqLbsKy5ZtDWGXz 7yYTV5WNwUzQfy7S2f8B =Gu3A -----END PGP SIGNATURE----- Merge tag 'mce-recovery-for-tip' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/mce Add symbolic defines for architectural MCACOD constants	2012-02-24 16:26:39 +01:00
Naoya Horiguchi	fadd85f16a	x86/mce: Fix return value of mce_chrdev_read() when erst is disabled Current kernel MCE code reads ERST at the first reading of /dev/mcelog (maybe in starting mcelogd,) even if the system does not support ERST, which results in a fake "no such device" message (as described in [1].) This problem is not critical, but can confuse system admins. This patch fixes it by filtering the return value from lower (ACPI) layer. [1] http://thread.gmane.org/gmane.linux.kernel/1060250 Reported by: Jon Masters <jonathan@jonmasters.org> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Huang Ying <ying.huang@intel.com> Link: https://lkml.org/lkml/2012/1/23/299 Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-02-22 13:14:16 -08:00
Greg Kroah-Hartman	d6126ef5f3	x86/mce: Convert static array of pointers to per-cpu variables When I previously fixed up the mce_device code, I used a static array of the pointers. It was (rightfully) pointed out to me that I should be using the per_cpu code instead. This patch converts the code over to that structure, moving the variable back into the per_cpu area, like it used to be for 3.2 and earlier. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Link: https://lkml.org/lkml/2012/1/27/165 Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-02-22 12:58:06 -08:00
Borislav Petkov	3f806e5098	x86/mce/AMD: Fix UP build error `141168c36c` ("x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86'") removed a bunch of CONFIG_SMP ifdefs around code touching struct cpuinfo_x86 members but also caused the following build error with Randy's randconfigs: mce_amd.c:(.cpuinit.text+0x4723): undefined reference to `cpu_llc_shared_map' Restore the #ifdef in threshold_create_bank() which creates symlinks on the non-BSP CPUs. There's a better patch series being worked on by Kevin Winchester which will solve this in a cleaner fashion, but that series is too ambitious for v3.3 merging - so we first queue up this trivial fix and then do the rest for v3.4. Signed-off-by: Borislav Petkov <bp@alien8.de> Acked-by: Kevin Winchester <kjwinchester@gmail.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Nick Bowler <nbowler@elliptictech.com> Link: http://lkml.kernel.org/r/20120203191801.GA2846@x1.osrc.amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-02-22 13:36:30 +01:00
Linus Torvalds	1361b83a13	i387: Split up <asm/i387.h> into exported and internal interfaces While various modules include <asm/i387.h> to get access to things we actually intend for them to use, most of that header file was really pretty low-level internal stuff that we really don't want to expose to others. So split the header file into two: the small exported interfaces remain in <asm/i387.h>, while the internal definitions that are only used by core architecture code are now in <asm/fpu-internal.h>. The guiding principle for this was to expose functions that we export to modules, and leave them in <asm/i387.h>, while stuff that is used by task switching or was marked GPL-only is in <asm/fpu-internal.h>. The fpu-internal.h file could be further split up too, especially since arch/x86/kvm/ uses some of the remaining stuff for its module. But that kvm usage should probably be abstracted out a bit, and at least now the internal FPU accessor functions are much more contained. Even if it isn't perhaps as contained as it _could_ be. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1202211340330.5354@i5.linux-foundation.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-02-21 14:12:54 -08:00
Linus Torvalds	8546c00892	i387: Uninline the generic FP helpers that we expose to kernel modules Instead of exporting the very low-level internals of the FPU state save/restore code (ie things like 'fpu_owner_task'), we should export the higher-level interfaces. Inlining these things is pointless anyway: sure, sometimes the end result is small, but while 'stts()' can result in just three x86 instructions, those are not cheap instructions (writing %cr0 is a serializing instruction and a very slow one at that). So the overhead of a function call is not noticeable, and we really don't want random modules mucking about with our internal state save logic anyway. So this unexports 'fpu_owner_task', and instead uninlines and exports the actual functions that modules can use: fpu_kernel_begin/end() and unlazy_fpu(). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1202211339590.5354@i5.linux-foundation.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-02-21 14:12:46 -08:00
Linus Torvalds	27e74da980	i387: export 'fpu_owner_task' per-cpu variable (And define it properly for x86-32, which had its 'current_task' declaration in separate from x86-64) Bitten by my dislike for modules on the machines I use, and the fact that apparently nobody else actually wanted to test the patches I sent out. Snif. Nobody else cares. Anyway, we probably should uninline the 'kernel_fpu_begin()' function that is what modules actually use and that references this, but this is the minimal fix for now. Reported-by: Josh Boyer <jwboyer@gmail.com> Reported-and-tested-by: Jongman Heo <jongman.heo@samsung.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-02-20 19:34:10 -08:00
H. Peter Anvin	d1a797f388	x32: Handle process creation Allow an x32 process to be started. Originally-by: H. J. Lu <hjl.tools@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>	2012-02-20 12:52:05 -08:00
Linus Torvalds	7e16838d94	i387: support lazy restore of FPU state This makes us recognize when we try to restore FPU state that matches what we already have in the FPU on this CPU, and avoids the restore entirely if so. To do this, we add two new data fields: - a percpu 'fpu_owner_task' variable that gets written any time we update the "has_fpu" field, and thus acts as a kind of back-pointer to the task that owns the CPU. The exception is when we save the FPU state as part of a context switch - if the save can keep the FPU state around, we leave the 'fpu_owner_task' variable pointing at the task whose FP state still remains on the CPU. - a per-thread 'last_cpu' field, that indicates which CPU that thread used its FPU on last. We update this on every context switch (writing an invalid CPU number if the last context switch didn't leave the FPU in a lazily usable state), so we know that that thread has done nothing else with the FPU since. These two fields together can be used when next switching back to the task to see if the CPU still matches: if 'fpu_owner_task' matches the task we are switching to, we know that no other task (or kernel FPU usage) touched the FPU on this CPU in the meantime, and if the current CPU number matches the 'last_cpu' field, we know that this thread did no other FP work on any other CPU, so the FPU state on the CPU must match what was saved on last context switch. In that case, we can avoid the 'f[x]rstor' entirely, and just clear the CR0.TS bit. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-02-20 10:58:54 -08:00
Ben Hutchings	5467bdda4a	x86/cpu: Clean up modalias feature matching We currently include commas on both sides of the feature ID in a modalias, but this prevents the lowest numbered feature of a CPU from being matched. Since all feature IDs have the same length, we do not need to worry about substring matches, so omit commas from the modalias entirely. Avoid generating multiple adjacent wildcards when there is no feature ID to match. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Thomas Renninger <trenn@suse.de> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-02-13 15:24:26 -08:00
Ben Hutchings	70142a9dd1	x86/cpu: Fix overrun check in arch_print_cpu_modalias() snprintf() does not return a negative value when truncating. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Thomas Renninger <trenn@suse.de> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-02-13 15:24:26 -08:00
Yinghai Lu	21c3fcf3e3	x86/debug: Fix/improve the show_msr=<cpus> debug print out Found out that show_msr=<cpus> is broken, when I asked a user to use it to capture debug info about broken MTRR's whose MTRR settings are probably different between CPUs. Only the first CPUs MSRs are printed, but that is not enough to track down the suspected bug. For years we called print_cpu_msr from print_cpu_info(), but this commit: \| commit `2eaad1fddd` \| Author: Mike Travis <travis@sgi.com> \| Date: Thu Dec 10 17:19:36 2009 -0800 \| \| x86: Limit the number of processor bootup messages removed the print_cpu_info() call from all APs. Put it back - it will only print MSRs when the user specifically requests them via show_msr=<cpus>. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Mike Travis <travis@sgi.com> Link: http://lkml.kernel.org/r/1329069237-11483-1-git-send-email-yinghai@kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-02-12 19:12:21 +01:00
Andreas Herrmann	32c3233885	x86/amd: Fix L1i and L2 cache sharing information for AMD family 15h processors For L1 instruction cache and L2 cache the shared CPU information is wrong. On current AMD family 15h CPUs those caches are shared between both cores of a compute unit. This fixes https://bugzilla.kernel.org/show_bug.cgi?id=42607 Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Petkov Borislav <Borislav.Petkov@amd.com> Cc: Dave Jones <davej@redhat.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/20120208195229.GA17523@alberich.amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-02-09 09:38:15 +01:00
Stephane Eranian	f39d47ff81	perf: Fix double start/stop in x86_pmu_start() The following patch fixes a bug introduced by the following commit: `e050e3f0a7` ("perf: Fix broken interrupt rate throttling") The patch caused the following warning to pop up depending on the sampling frequency adjustments: ------------[ cut here ]------------ WARNING: at arch/x86/kernel/cpu/perf_event.c:995 x86_pmu_start+0x79/0xd4() It was caused by the following call sequence: perf_adjust_freq_unthr_context.part() { stop() if (delta > 0) { perf_adjust_period() { if (period > 8*...) { stop() ... start() } } } start() } Which caused a double start and a double stop, thus triggering the assert in x86_pmu_start(). The patch fixes the problem by avoiding the double calls. We pass a new argument to perf_adjust_period() to indicate whether or not the event is already stopped. We can't just remove the start/stop from that function because it's called from __perf_event_overflow where the event needs to be reloaded via a stop/start back-toback call. The patch reintroduces the assertion in x86_pmu_start() which was removed by commit: `84f2b9b` ("perf: Remove deprecated WARN_ON_ONCE()") In this second version, we've added calls to disable/enable PMU during unthrottling or frequency adjustment based on bug report of spurious NMI interrupts from Eric Dumazet. Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Stephane Eranian <eranian@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: markus@trippelsdorf.de Cc: paulus@samba.org Link: http://lkml.kernel.org/r/20120207133956.GA4932@quad [ Minor edits to the changelog and to the code ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-02-07 16:58:56 +01:00
Borislav Petkov	c98fdeaa92	x86/sched/perf/AMD: Set sched_clock_stable Stephane Eranian reported that doing a scheduler latency measurements with perf on AMD doesn't work out as expected due to the fact that the sched_clock() granularity is too coarse, i.e. done in jiffies due to the sched_clock_stable not set, which, if set, would mean that we get to use the TSC as sample source which would give us much higher precision. However, there's no reason not to set sched_clock_stable on AMD because all families from F10h and upwards do have an invariant TSC and have the CPUID flag to prove (CPUID_8000_0007_EDX[8]). Make it so, #1. Signed-off-by: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@amd64.org> Cc: Venki Pallipadi <venki@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Link: http://lkml.kernel.org/r/20120206132546.GA30854@quad [ Should any non-standard system break the TSC, we should mark them so explicitly, in their platform init handler, or in a DMI quirk. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-02-07 13:12:08 +01:00
Arnaldo Carvalho de Melo	5ddf146f70	Merge branch 'perf/urgent' into perf/core So that we can get the perf bench exec stack fixes and then apply the remaining fix for the files added after what is in perf/urgent. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2012-02-06 19:11:02 -02:00
Stephane Eranian	84f2b9b2ed	perf: Remove deprecated WARN_ON_ONCE() With the new throttling/unthrottling code introduced with commit: `e050e3f0a7` ("perf: Fix broken interrupt rate throttling") we occasionally hit two WARN_ON_ONCE() checks in: - intel_pmu_pebs_enable() - intel_pmu_lbr_enable() - x86_pmu_start() The assertions are no longer problematic. There is a valid path where they can trigger but it is harmless. The assertion can be triggered with: $ perf record -e instructions:pp .... Leading to paths: intel_pmu_pebs_enable intel_pmu_enable_event x86_perf_event_set_period x86_pmu_start perf_adjust_freq_unthr_context perf_event_task_tick scheduler_tick And: intel_pmu_lbr_enable intel_pmu_enable_event x86_perf_event_set_period x86_pmu_start perf_adjust_freq_unthr_context. perf_event_task_tick scheduler_tick cpuc->enabled is always on because when we get to perf_adjust_freq_unthr_context() the PMU is not totally disabled. Furthermore when we need to adjust a period, we only stop the event we need to change and not the entire PMU. Thus, when we re-enable, cpuc->enabled is already set. Note that when we stop the event, both pebs and lbr are stopped if necessary (and possible). Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Link: http://lkml.kernel.org/r/20120202110401.GA30911@quad Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-02-03 08:24:40 +01:00
Ingo Molnar	44a6839711	Merge branch 'perf/fast' into perf/core Merge reason: Lets ready it for v3.4 Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-01-27 12:08:09 +01:00
Thomas Renninger	fad12ac8c8	CPU: Introduce ARCH_HAS_CPU_AUTOPROBE and X86 parts This patch is based on Andi Kleen's work: Implement autoprobing/loading of modules serving CPU specific features (x86cpu autoloading). And Kay Siever's work to get rid of sysdev cpu structures and making use of struct device instead. Before, the cpuid driver had to be loaded to get the x86cpu autoloading feature. With this patch autoloading works through the /sys/devices/system/cpu object Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Dave Jones <davej@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Huang Ying <ying.huang@intel.com> Cc: Len Brown <lenb@kernel.org> Acked-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Renninger <trenn@suse.de> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2012-01-26 16:49:08 -08:00
Thomas Renninger	2f1e097e24	X86: Introduce HW-Pstate scattered cpuid feature It is rather similar to CPB (boot capability) feature and exists since fam10h (can be looked up in AMD's BKDG). The feature is needed for powernow-k8 to cleanup init functions and to provide proper autoloading matching with the new x86cpu modalias feature. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Dave Jones <davej@redhat.com> Cc: Borislav Petkov <bp@amd64.org> Signed-off-by: Thomas Renninger <trenn@suse.de> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2012-01-26 16:49:06 -08:00
Andi Kleen	644e9cbbe3	Add driver auto probing for x86 features v4 There's a growing number of drivers that support a specific x86 feature or CPU. Currently loading these drivers currently on a generic distribution requires various driver specific hacks and it often doesn't work. This patch adds auto probing for drivers based on the x86 cpuid information, in particular based on vendor/family/model number and also based on CPUID feature bits. For example a common issue is not loading the SSE 4.2 accelerated CRC module: this can significantly lower the performance of BTRFS which relies on fast CRC. Another issue is loading the right CPUFREQ driver for the current CPU. Currently distributions often try all all possible driver until one sticks, which is not really a good way to do this. It works with existing udev without any changes. The code exports the x86 information as a generic string in sysfs that can be matched by udev's pattern matching. This scheme does not support numeric ranges, so if you want to handle e.g. ranges of model numbers they have to be encoded in ASCII or simply all models or families listed. Fixing that would require changing udev. Another issue is that udev will happily load all drivers that match, there is currently no nice way to stop a specific driver from being loaded if it's not needed (e.g. if you don't need fast CRC) But there are not that many cpu specific drivers around and they're all not that bloated, so this isn't a particularly serious issue. Originally this patch added the modalias to the normal cpu sysdevs. However sysdevs don't have all the infrastructure needed for udev, so it couldn't really autoload drivers. This patch instead adds the CPU modaliases to the cpuid devices, which are real devices with full support for udev. This implies that the cpuid driver has to be loaded to use this. This patch just adds infrastructure, some driver conversions in followups. Thanks to Kay for helping with some sysfs magic. v2: Constifcation, some updates v4: (trenn@suse.de): - Use kzalloc instead of kmalloc to terminate modalias buffer - Use uppercase hex values to match correctly against hex values containing letters Cc: Dave Jones <davej@redhat.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Jen Axboe <axboe@kernel.dk> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Huang Ying <ying.huang@intel.com> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Renninger <trenn@suse.de> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2012-01-26 16:44:41 -08:00
Tony Luck	08dda402d6	x86/mce: Replace hard coded hex constants with symbolic defines Magic constants like 0x0134 in code just invite questions on where they come from, what they mean, can they be changed. Provide #defines for the architecturally defined MCACOD values with a reference to the Intel Software Developers manual which describes them. Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-01-26 16:02:22 -08:00
Ingo Molnar	4e9f44ba29	MCE recovery (data path only) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iQIcBAABAgAGBQJPHy7VAAoJEKurIx+X31iBHcsP/1VcGuxFAL5i/zBqUqhbS7BL s+4or1j3NOcxIePQ9egg1L/sLzD+jmo37ObFMTzFOLwuLeodtJF6e0DXQhR7bMKz UqOS4WAhNxRBtZtUqIbIiMoDG4Vny1atdqxDQKzmV88ulTG2+JE5U6sGjfTdWvX7 gZA6Vj31Dz7p6scPT2j8tnLjFV+XvVJSBp/2rgi2Nw81UzBeIRZRiWZrBMLemPCU T82OEffnIpSdn60sktMN/ht99yGQO31zT0c+/72Z0ysZAPlTjFbW7CZJHPZmLIVB tPkoTRFOf4iwjy2pZNzs9bB8ord/As3IyTxAsfYUin4N2bX27n058uTQ3CqbgEz+ pa6C5N0ZrV9plYa9BbgCHmNIkhEONIb3WtH27uh/hZOztDA2CXzPT5mm4FOzmrJ7 DGVBqmXth6g2jYJNT/K2QgmVMZM0CeXQnoDJP54sXzv7F4dEM5P64Lz6E1kCd5Jf x9O1orDnEVXssgEPVtF/eEjIQK/vF7s1BUUlMBZJwdAyTwCiD8RvueG87bApnA2z eO8VS62akqjpDt5sHboAGJrjcuhqnkbgtG2dn0EqONzk8DJPnhFXVLmSbvH+KuTC OguH2LC5N7n9Wjr5a9Duw2DdIj8njvzFrKVzo/l6r3m99u/Jby54vGk2cPLwfGvp /9Y+SK2Ou6LSbPiRU4dP =ofSb -----END PGP SIGNATURE----- Merge tag 'mce-recovery-for-tip' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/mce Implement MCE recovery for the data load error path and assorted cleanups. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-01-26 11:40:13 +01:00
Greg Kroah-Hartman	e032d80774	mce: fix warning messages about static struct mce_device When suspending, there was a large list of warnings going something like: Device 'machinecheck1' does not have a release() function, it is broken and must be fixed This patch turns the static mce_devices into dynamically allocated, and properly frees them when they are removed from the system. It solves the warning messages on my laptop here. Reported-by: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Djalal Harouni <tixxdz@opendz.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Borislav Petkov <bp@amd64.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-01-16 17:08:42 -08:00
Linus Torvalds	83c2f912b4	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits) perf tools: Fix compile error on x86_64 Ubuntu perf report: Fix --stdio output alignment when --showcpuutilization used perf annotate: Get rid of field_sep check perf annotate: Fix usage string perf kmem: Fix a memory leak perf kmem: Add missing closedir() calls perf top: Add error message for EMFILE perf test: Change type of '-v' option to INCR perf script: Add missing closedir() calls tracing: Fix compile error when static ftrace is enabled recordmcount: Fix handling of elf64 big-endian objects. perf tools: Add const.h to MANIFEST to make perf-tar-src-pkg work again perf tools: Add support for guest/host-only profiling perf kvm: Do guest-only counting by default perf top: Don't update total_period on process_sample perf hists: Stop using 'self' for struct hist_entry perf hists: Rename total_session to total_period x86: Add counter when debug stack is used with interrupts enabled x86: Allow NMIs to hit breakpoints in i386 x86: Keep current stack in NMI breakpoints ...	2012-01-15 11:26:35 -08:00
Srivatsa S. Bhat	a3301b751b	x86/mce: Fix CPU hotplug and suspend regression related to MCE Commit `8a25a2fd12` ("cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem") changed how things are dealt with in the MCE subsystem. Some of the things that got broken due to this are CPU hotplug and suspend/hibernate. MCE uses per_cpu allocations of struct device. So, when a CPU goes offline and comes back online, in order to ensure that we start from a clean slate with respect to the MCE subsystem, zero out the entire per_cpu device structure to 0 before using it. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-01-13 19:11:35 -08:00
Ingo Molnar	636f0c70f2	Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core	2012-01-08 12:31:24 +01:00
Linus Torvalds	7affca3537	Merge branch 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core * 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (73 commits) arm: fix up some samsung merge sysdev conversion problems firmware: Fix an oops on reading fw_priv->fw in sysfs loading file Drivers:hv: Fix a bug in vmbus_driver_unregister() driver core: remove __must_check from device_create_file debugfs: add missing #ifdef HAS_IOMEM arm: time.h: remove device.h #include driver-core: remove sysdev.h usage. clockevents: remove sysdev.h arm: convert sysdev_class to a regular subsystem arm: leds: convert sysdev_class to a regular subsystem kobject: remove kset_find_obj_hinted() m86k: gpio - convert sysdev_class to a regular subsystem mips: txx9_sram - convert sysdev_class to a regular subsystem mips: 7segled - convert sysdev_class to a regular subsystem sh: dma - convert sysdev_class to a regular subsystem sh: intc - convert sysdev_class to a regular subsystem power: suspend - convert sysdev_class to a regular subsystem power: qe_ic - convert sysdev_class to a regular subsystem power: cmm - convert sysdev_class to a regular subsystem s390: time - convert sysdev_class to a regular subsystem ... Fix up conflicts with 'struct sysdev' removal from various platform drivers that got changed: - arch/arm/mach-exynos/cpu.c - arch/arm/mach-exynos/irq-eint.c - arch/arm/mach-s3c64xx/common.c - arch/arm/mach-s3c64xx/cpu.c - arch/arm/mach-s5p64x0/cpu.c - arch/arm/mach-s5pv210/common.c - arch/arm/plat-samsung/include/plat/cpu.h - arch/powerpc/kernel/sysfs.c and fix up cpu_is_hotpluggable() as per Greg in include/linux/cpu.h	2012-01-07 12:03:30 -08:00
Ingo Molnar	03f70388c3	Merge branch 'tip/x86/core-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core	2012-01-07 13:25:49 +01:00
Linus Torvalds	edf7c8148e	Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: add IRQ context simulation in module mce-inject x86, mce, therm_throt: Don't report power limit and package level thermal throttle events in mcelog x86, MCE: Drain mcelog buffer x86, mce: Add wrappers for registering on the decode chain	2012-01-06 15:02:37 -08:00
Linus Torvalds	2c364faabb	Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, centaur: Enable cx8 for VIA Eden too	2012-01-06 14:00:44 -08:00
Linus Torvalds	cf3f33551b	Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Use "do { } while(0)" for empty lock_cmos()/unlock_cmos() macros x86: Use "do { } while(0)" for empty flush_tlb_fix_spurious_fault() macro x86, CPU: Drop superfluous get_cpu_cap() prototype arch/x86/mm/pageattr.c: Quiet sparse noise; local functions should be static arch/x86/kernel/ptrace.c: Quiet sparse noise x86: Use kmemdup() in copy_thread(), rather than duplicating its implementation x86: Replace the EVT_TO_HPET_DEV() macro with an inline function	2012-01-06 14:00:12 -08:00
Linus Torvalds	69734b644b	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) x86: Fix atomic64_xxx_cx8() functions x86: Fix and improve cmpxchg_double{,_local}() x86_64, asm: Optimise fls(), ffs() and fls64() x86, bitops: Move fls64.h inside __KERNEL__ x86: Fix and improve percpu_cmpxchg{8,16}b_double() x86: Report cpb and eff_freq_ro flags correctly x86/i386: Use less assembly in strlen(), speed things up a bit x86: Use the same node_distance for 32 and 64-bit x86: Fix rflags in FAKE_STACK_FRAME x86: Clean up and extend do_int3() x86: Call do_notify_resume() with interrupts enabled x86/div64: Add a micro-optimization shortcut if base is power of two x86-64: Cleanup some assembly entry points x86-64: Slightly shorten line system call entry and exit paths x86-64: Reduce amount of redundant code generated for invalidate_interruptNN x86-64: Slightly shorten int_ret_from_sys_call x86, efi: Convert efi_phys_get_time() args to physical addresses x86: Default to vsyscall=emulate x86-64: Set siginfo and context on vsyscall emulation faults x86: consolidate xchg and xadd macros ...	2012-01-06 13:59:14 -08:00
Linus Torvalds	67b0243131	Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Skip cpus with apic-ids >= 255 in !x2apic_mode x86, x2apic: Allow "nox2apic" to disable x2apic mode setup by BIOS x86, x2apic: Fallback to xapic when BIOS doesn't setup interrupt-remapping x86, acpi: Skip acpi x2apic entries if the x2apic feature is not present x86, apic: Add probe() for apic_flat x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86' x86: Convert per-cpu counter icr_read_retry_count into a member of irq_stat x86: Add per-cpu stat counter for APIC ICR read tries pci, x86/io-apic: Allow PCI_IOAPIC to be user configurable on x86 x86: Fix the !CONFIG_NUMA build of the new CPU ID fixup code support x86: Add NumaChip support x86: Add x86_init platform override to fix up NUMA core numbering x86: Make flat_init_apic_ldr() available	2012-01-06 13:58:21 -08:00
Greg Kroah-Hartman	ff4b8a57f0	Merge branch 'driver-core-next' into Linux 3.2 This resolves the conflict in the arch/arm/mach-s3c64xx/s3c6400.c file, and it fixes the build error in the arch/x86/kernel/microcode_core.c file, that the merge did not catch. The microcode_core.c patch was provided by Stephen Rothwell <sfr@canb.auug.org.au> who was invaluable in the merge issues involved with the large sysdev removal process in the driver-core tree. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2012-01-06 11:42:52 -08:00
Linus Torvalds	35b740e466	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (106 commits) perf kvm: Fix copy & paste error in description perf script: Kill script_spec__delete perf top: Fix a memory leak perf stat: Introduce get_ratio_color() helper perf session: Remove impossible condition check perf tools: Fix feature-bits rework fallout, remove unused variable perf script: Add generic perl handler to process events perf tools: Use for_each_set_bit() to iterate over feature flags perf tools: Unify handling of features when writing feature section perf report: Accept fifos as input file perf tools: Moving code in some files perf tools: Fix out-of-bound access to struct perf_session perf tools: Continue processing header on unknown features perf tools: Improve macros for struct feature_ops perf: builtin-record: Document and check that mmap_pages must be a power of two. perf: builtin-record: Provide advice if mmap'ing fails with EPERM. perf tools: Fix truncated annotation perf script: look up thread using tid instead of pid perf tools: Look up thread names for system wide profiling perf tools: Fix comm for processes with named threads ...	2012-01-06 08:02:58 -08:00
Linus Torvalds	423d091dfe	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits) cpu: Export cpu_up() rcu: Apply ACCESS_ONCE() to rcu_boost() return value Revert "rcu: Permit rt_mutex_unlock() with irqs disabled" docs: Additional LWN links to RCU API rcu: Augment rcu_batch_end tracing for idle and callback state rcu: Add rcutorture tests for srcu_read_lock_raw() rcu: Make rcutorture test for hotpluggability before offlining CPUs driver-core/cpu: Expose hotpluggability to the rest of the kernel rcu: Remove redundant rcu_cpu_stall_suppress declaration rcu: Adaptive dyntick-idle preparation rcu: Keep invoking callbacks if CPU otherwise idle rcu: Irq nesting is always 0 on rcu_enter_idle_common rcu: Don't check irq nesting from rcu idle entry/exit rcu: Permit dyntick-idle with callbacks pending rcu: Document same-context read-side constraints rcu: Identify dyntick-idle CPUs on first force_quiescent_state() pass rcu: Remove dynticks false positives and RCU failures rcu: Reduce latency of rcu_prepare_for_idle() rcu: Eliminate RCU_FAST_NO_HZ grace-period hang rcu: Avoid needlessly IPIing CPUs at GP end ...	2012-01-06 08:02:40 -08:00
Ingo Molnar	adaf4ed2ab	Merge commit 'v3.2-rc7' into x86/asm Merge reason: Update from -rc4 to -rc7. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-01-04 15:01:28 +01:00
Tony Luck	5f7b88d51e	x86/mce: Recognise machine check bank signature for data path error Action required data path signature is defined in table 15-19 of SDM: +-----------------------------------------------------------------------------+ \| SRAR Error \| Valid \| OVER \| UC \| EN \| MISCV \| ADDRV \| PCC \| S \| AR \| MCACOD \| \| Data Load \| 1 \| 0 \| 1 \| 1 \| 1 \| 1 \| 0 \| 1 \| 1 \| 0x134 \| +-----------------------------------------------------------------------------+ Recognise this, and pass MCE_AR_SEVERITY code back to do_machine_check() if we have the action handler configured (CONFIG_MEMORY_FAILURE=y) Acked-by: Borislav Petkov <bp@amd64.org> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-01-03 12:07:07 -08:00
Tony Luck	a8c321fbf9	x86/mce: Handle "action required" errors All non-urgent actions (reporting low severity errors and handling "action-optional" errors) are now handled by a work queue. This means that TIF_MCE_NOTIFY can be used to block execution for a thread experiencing an "action-required" fault until we get all cpus out of the machine check handler (and the thread that hit the fault into mce_notify_process(). We use the new mce_{save,find,clear}_info() API to get information from do_machine_check() to mce_notify_process(), and then use the newly improved memory_failure(..., MF_ACTION_REQUIRED) to handle the error (possibly signalling the process). Update some comments to make the new code flows clearer. Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-01-03 12:07:01 -08:00
Tony Luck	af104e394e	x86/mce: Add mechanism to safely save information in MCE handler Machine checks on Intel cpus interrupt execution on all cpus, regardless of interrupt masking. We have a need to save some data about the cause of the machine check (physical address) in the machine check handler that can be retrieved later to attempt recovery in a more flexible execution state. Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-01-03 12:06:53 -08:00
Tony Luck	85f92694af	x86/mce: Create helper function to save addr/misc when needed The MCI_STATUS_MISCV and MCI_STATUS_ADDRV bits in the bank status registers define whether the MISC and ADDR registers respectively contain valid data - provide a helper function to check these bits and read the registers when needed. In addition, processors that support software error recovery (as indicated by the MCG_SER_P bit in the MCG_CAP register) may include some undefined bits in the ADDR register - mask these out. Acked-by: Borislav Petkov <bp@amd64.org> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-01-03 12:06:45 -08:00
Tony Luck	cd42f4a3b2	HWPOISON: Clean up memory_failure() vs. __memory_failure() There is only one caller of memory_failure(), all other users call __memory_failure() and pass in the flags argument explicitly. The lone user of memory_failure() will soon need to pass flags too. Add flags argument to the callsite in mce.c. Delete the old memory_failure() function, and then rename __memory_failure() without the leading "__". Provide clearer message when action optional memory errors are ignored. Acked-by: Borislav Petkov <bp@amd64.org> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-01-03 12:06:32 -08:00
Robert Richter	2e64694de2	perf/x86: Fix raw_spin_unlock_irqrestore() usage Use raw_spin_unlock_irqrestore() as equivalent to raw_spin_lock_irqsave(). Signed-off-by: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1324646665-13334-1-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-12-23 17:57:01 +01:00
Kay Sievers	8a25a2fd12	cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem and converts the devices to regular devices. The sysdev drivers are implemented as subsystem interfaces now. After all sysdev classes are ported to regular driver core entities, the sysdev implementation will be entirely removed from the kernel. Userspace relies on events and generic sysfs subsystem infrastructure from sysdev devices, which are made available with this conversion. Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@amd64.org> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Cc: Len Brown <lenb@kernel.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-12-21 14:29:42 -08:00
Steven Rostedt	42181186ad	x86: Add counter when debug stack is used with interrupts enabled Mathieu Desnoyers pointed out a case that can cause issues with NMIs running on the debug stack: int3 -> interrupt -> NMI -> int3 Because the interrupt changes the stack, the NMI will not see that it preempted the debug stack. Looking deeper at this case, interrupts only happen when the int3 is from userspace or in an a location in the exception table (fixup). userspace -> int3 -> interurpt -> NMI -> int3 All other int3s that happen in the kernel should be processed without ever enabling interrupts, as the do_trap() call will panic the kernel if it is called to process any other location within the kernel. Adding a counter around the sections that enable interrupts while using the debug stack allows the NMI to also check that case. If the NMI sees that it either interrupted a task using the debug stack or the debug counter is non-zero, then it will have to change the IDT table to make the int3 not change stacks (which will corrupt the stack if it does). Note, I had to move the debug_usage functions out of processor.h and into debugreg.h because of the static inlined functions to inc and dec the debug_usage counter. __get_cpu_var() requires smp.h which includes processor.h, and would fail to build. Link: http://lkml.kernel.org/r/1323976535.23971.112.camel@gandalf.stny.rr.com Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Paul Turner <pjt@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 15:38:56 -05:00
Steven Rostedt	228bdaa95f	x86: Keep current stack in NMI breakpoints We want to allow NMI handlers to have breakpoints to be able to remove stop_machine from ftrace, kprobes and jump_labels. But if an NMI interrupts a current breakpoint, and then it triggers a breakpoint itself, it will switch to the breakpoint stack and corrupt the data on it for the breakpoint processing that it interrupted. Instead, have the NMI check if it interrupted breakpoint processing by checking if the stack that is currently used is a breakpoint stack. If it is, then load a special IDT that changes the IST for the debug exception to keep the same stack in kernel context. When the NMI is done, it puts it back. This way, if the NMI does trigger a breakpoint, it will keep using the same stack and not stomp on the breakpoint data for the breakpoint it interrupted. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 15:38:55 -05:00
Peter Zijlstra	e3f3541c19	perf: Extend the mmap control page with time (TSC) fields Extend the mmap control page with fields so that userspace can compute time deltas relative to the provided time fields. Currently only implemented for x86 with constant and nonstop TSC. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Arun Sharma <asharma@fb.com> Link: http://lkml.kernel.org/n/tip-3u1jucza77j3wuvs0x2bic0f@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-12-21 11:01:13 +01:00
Peter Zijlstra	0c9d42ed4c	perf, x86: Provide means for disabling userspace RDPMC Allow the disabling of RDPMC via a pmu specific attribute: echo 0 > /sys/bus/event_source/devices/cpu/rdpmc Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Arun Sharma <asharma@fb.com> Link: http://lkml.kernel.org/n/tip-pqeog465zo5hsimtkfz73f27@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-12-21 11:01:11 +01:00
Peter Zijlstra	fe4a330885	perf, x86: Implement user-space RDPMC support, to allow fast, user-space access to self-monitoring counters Implement a correct pmu::event_idx for the x86 counter index rules and set CR4.PCE on CPU_STARTING. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Arun Sharma <asharma@fb.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: http://lkml.kernel.org/n/tip-mwxab34dibqgzk5zywutfnha@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-12-21 11:01:10 +01:00
Stephane Eranian	9c1497ea59	perf events: Add Intel x86 mapping for PERF_COUNT_HW_REF_CPU_CYCLES Add event maps for Intel x86 processors (with architected PMU v2 or later). On AMD, there is frequency scaling but no Turbo. There is no core cycle event not subject to frequency scaling, therefore we do not provide a mapping. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1323559734-3488-4-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-12-21 10:26:39 +01:00
Stephane Eranian	cd09c0c40a	perf events: Enable raw event support for Intel unhalted_reference_cycles event This patch adds the encoding and definitions necessary for the unhalted_reference_cycles event avaialble since Intel Core 2 processors. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1323559734-3488-2-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-12-21 10:26:32 +01:00
Kevin Winchester	141168c36c	x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86' Several fields in struct cpuinfo_x86 were not defined for the !SMP case, likely to save space. However, those fields still have some meaning for UP, and keeping them allows some #ifdef removal from other files. The additional size of the UP kernel from this change is not significant enough to worry about keeping up the distinction: text data bss dec hex filename 4737168 506459 972040 6215667 5ed7f3 vmlinux.o.before 4737444 506459 972040 6215943 5ed907 vmlinux.o.after for a difference of 276 bytes for an example UP config. If someone wants those 276 bytes back badly then it should be implemented in a cleaner way. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Cc: Steffen Persvold <sp@numascale.com> Link: http://lkml.kernel.org/r/1324428742-12498-1-git-send-email-kjwinchester@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-12-21 09:25:09 +01:00
Ingo Molnar	d87f69a16e	Merge commit 'v3.2-rc6' into perf/core Merge reason: Update with the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-12-20 20:32:11 +01:00

... 3 4 5 6 7 ...

2416 Commits