linux/arch/x86/kernel
Peter Zijlstra f4929bd372 perf, x86: Update/fix Intel Nehalem cache events
Change the Nehalem cache events to use retired memory instruction counters
(similar to Westmere); this greatly improves the provided stats.

Using:

int main(void)
{
        int i;

        /* One load from and one store to the top of the stack per iteration. */
        for (i = 0; i < 1000000000; i++) {
                asm("mov (%%rsp), %%rbx;"
                    "mov %%rbx, (%%rsp);" : : : "rbx");
        }
        return 0;
}

We find:

 $ perf stat --repeat 10 -e instructions:u -e l1-dcache-loads:u -e l1-dcache-stores:u ./loop_1b_loads+stores
  Performance counter stats for './loop_1b_loads+stores' (10 runs):
      4,000,081,056 instructions:u           #      0.000 IPC ( +-   0.000% )
      4,999,502,846 l1-dcache-loads:u          ( +-   0.008% )
      1,000,034,832 l1-dcache-stores:u         ( +-   0.000% )
         1.565184942  seconds time elapsed   ( +-   0.005% )

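The 5b l1-dcache-loads count is surprising: the asm performs one load per
iteration, so we'd expect 1b. The raw event r10b (MEM_INST_RETIRED.LOADS)
reports exactly that: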

 $ perf stat --repeat 10 -e instructions:u -e r10b:u -e l1-dcache-stores:u ./loop_1b_loads+stores
  Performance counter stats for './loop_1b_loads+stores' (10 runs):
      4,000,081,054 instructions:u           #      0.000 IPC ( +-   0.000% )
      1,000,021,961 r10b:u                     ( +-   0.000% )
      1,000,030,951 l1-dcache-stores:u         ( +-   0.000% )
         1.565055422  seconds time elapsed   ( +-   0.003% )

This patch fixes that by switching the Nehalem cache events to the retired
memory instruction counters.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Link: http://lkml.kernel.org/n/tip-q9rtru7b7840tws75xzboapv@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-22 13:50:27 +02:00
acpi Merge branch 'linus' into release 2011-03-23 02:34:54 -04:00
apic x86, UV: Fix kdump reboot 2011-03-31 18:44:03 +02:00
cpu perf, x86: Update/fix Intel Nehalem cache events 2011-04-22 13:50:27 +02:00
.gitignore
Makefile x86: only compile 8237A if CONFIG_ISA_DMA_API is enabled 2011-03-22 17:44:16 -07:00
alternative.c x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
amd_iommu.c tree-wide: fix comment/printk typos 2010-11-01 15:38:34 -04:00
amd_iommu_init.c x86: Use syscore_ops instead of sysdev classes and sysdevs 2011-03-23 22:15:54 +01:00
amd_nb.c x86, amd-nb: Rename CPU PCI id define for F4 2011-03-31 08:51:38 +02:00
apb_timer.c x86: apb_timer: Fixup genirq fallout 2011-03-30 00:13:30 +02:00
aperture_64.c x86, gart: Set DISTLBWALKPRB bit always 2011-04-18 09:26:48 -07:00
apm_32.c Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-03-18 10:45:21 -07:00
asm-offsets.c x86, asm: Cleanup unnecssary macros in asm-offsets.c 2011-02-25 16:37:32 -08:00
asm-offsets_32.c x86: Partly unify asm-offsets_{32,64}.c 2011-02-10 13:31:37 +01:00
asm-offsets_64.c x86: Partly unify asm-offsets_{32,64}.c 2011-02-10 13:31:37 +01:00
audit_64.c
bootflag.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
check.c x86: Don't check for BIOS corruption in first 64K when there's no need to 2011-03-09 16:36:41 +01:00
cpuid.c BKL: remove extraneous #include <smp_lock.h> 2010-11-17 08:59:32 -08:00
crash.c x86, UV: Make kdump avoid stack dumps 2010-07-21 11:33:27 -07:00
crash_dump_32.c crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn 2011-03-23 19:47:19 -07:00
crash_dump_64.c crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn 2011-03-23 19:47:19 -07:00
devicetree.c x86: DT: Cleanup namespace and call irq_set_irq_type() unconditional 2011-03-24 23:17:56 +01:00
doublefault_32.c
dumpstack.c Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-03-25 17:52:22 -07:00
dumpstack_32.c x86, dumpstack: Correct stack dump info when frame pointer is available 2011-03-18 10:51:42 +01:00
dumpstack_64.c x86, dumpstack: Correct stack dump info when frame pointer is available 2011-03-18 10:51:42 +01:00
e820.c crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn 2011-03-23 19:47:19 -07:00
early-quirks.c x86, quirk: Fix SB600 revision check 2011-03-16 14:03:32 +01:00
early_printk.c x86, earlyprintk: Move mrst early console to platform/ and fix a typo 2010-12-06 20:52:04 +01:00
entry_32.S Merge branch 'x86-trampoline-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-03-16 10:10:02 -07:00
entry_64.S x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
ftrace.c ftrace/graph: Trace function entry before updating index 2011-03-10 10:34:43 -05:00
head.c x86: Use memblock to replace early_res 2010-08-27 11:12:29 -07:00
head32.c x86, trampoline: Common infrastructure for low memory trampolines 2011-02-17 21:02:43 -08:00
head64.c x86: Cleanup highmap after brk is concluded 2011-03-19 11:58:19 -07:00
head_32.S Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-03-15 20:01:36 -07:00
head_64.S x86, trampoline: Common infrastructure for low memory trampolines 2011-02-17 21:02:43 -08:00
hpet.c x86: Cleanup the genirq name space 2011-03-12 14:12:00 +01:00
hw_breakpoint.c x86: Use this_cpu_ops to optimize code 2010-12-30 12:20:28 +01:00
i386_ksyms_32.c
i387.c x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
i8237.c x86: Use syscore_ops instead of sysdev classes and sysdevs 2011-03-23 22:15:54 +01:00
i8253.c i8253: Convert i8253_lock to raw_spinlock 2010-03-02 10:28:38 +01:00
i8259.c x86: Use syscore_ops instead of sysdev classes and sysdevs 2011-03-23 22:15:54 +01:00
init_task.c Rename .data.cacheline_aligned to .data..cacheline_aligned. 2010-03-03 11:25:58 +01:00
io_delay.c
ioport.c x86: Use bitmap library functions 2011-02-17 14:59:22 +01:00
irq.c x86: Stop including <linux/delay.h> in two asm header files 2011-03-29 09:37:42 +02:00
irq_32.c x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
irq_64.c
irq_work.c irq_work: Add generic hardirq context callbacks 2010-10-18 19:58:50 +02:00
irqinit.c Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-03-15 20:01:36 -07:00
jump_label.c jump label: x86 support 2010-09-22 16:33:03 -04:00
kdebugfs.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
kgdb.c Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb 2011-03-25 21:04:56 -07:00
kprobes.c kprobes: Disabling optimized kprobes for entry text section 2011-03-08 17:22:12 +01:00
kvm.c KVM guest: Fix section mismatch derived from kvm_guest_cpu_online() 2011-03-17 13:08:25 -03:00
kvmclock.c KVM paravirt: Move kvm_smp_prepare_boot_cpu() from kvmclock.c to kvm.c. 2011-01-12 11:23:10 +02:00
ldt.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
machine_kexec_32.c
machine_kexec_64.c x86, cleanups: Use clear_page/copy_page rather than memset/memcpy 2010-09-22 15:36:49 -07:00
mca_32.c x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
microcode_amd.c x86, microcode, AMD: Fix signedness bug in generic_load_microcode() 2011-02-20 14:01:32 +01:00
microcode_core.c x86, microcode: Unregister syscore_ops after microcode unloaded 2011-03-29 11:12:04 +02:00
microcode_intel.c x86/microcode: Fix double vfree() and remove redundant pointer checks before vfree() 2010-12-27 14:33:30 +01:00
mmconf-fam10h_64.c x86-64: Fix and clean up AMD Fam10 MMCONF enabling 2010-11-18 13:41:35 +01:00
module.c mm: unify module_alloc code for vmalloc 2011-01-13 17:32:34 -08:00
mpparse.c x86, mpparse: Move check_slot into CONFIG_X86_IO_APIC context 2011-03-22 13:00:22 +01:00
msr.c BKL: remove extraneous #include <smp_lock.h> 2010-11-17 08:59:32 -08:00
paravirt-spinlocks.c
paravirt.c thp: add pmd paravirt ops 2011-01-13 17:32:39 -08:00
paravirt_patch_32.c
paravirt_patch_64.c
pci-calgary_64.c x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
pci-dma.c x86, iommu: Utilize the IOMMU_INIT macros functionality. 2010-08-26 15:14:52 -07:00
pci-gart_64.c x86, gart: Make sure GART does not map physmem above 1TB 2011-04-18 09:26:49 -07:00
pci-iommu_table.c x86, iommu: Add proper dependency sort routine (and sanity check). 2010-08-26 15:13:19 -07:00
pci-nommu.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
pci-swiotlb.c x86, swiotlb: Make SWIOTLB use IOMMU_INIT_* macros. 2010-08-26 15:13:37 -07:00
pcspeaker.c
probe_roms_32.c
process.c x86, dumpstack: Correct stack dump info when frame pointer is available 2011-03-18 10:51:42 +01:00
process_32.c cpuidle/x86/perf: fix power:cpu_idle double end events and throw cpu_idle events from the cpuidle layer 2011-01-12 18:05:16 -05:00
process_64.c x86: mark associated mm when running a task in 32 bit compatibility mode 2011-03-23 16:36:53 -04:00
ptrace.c ptrace: cleanup arch_ptrace() on x86 2010-10-27 18:03:10 -07:00
pvclock.c x86/pvclock: Zero last_value on resume 2010-11-28 09:33:20 +01:00
quirks.c x86: HPET force enable for CX700 / VIA Epia LT 2010-09-15 16:27:04 +02:00
reboot.c x86: Stop including <linux/delay.h> in two asm header files 2011-03-29 09:37:42 +02:00
reboot_32.S x86, reboot: Fix the use of passed arguments in 32-bit BIOS reboot 2011-02-18 15:47:42 -08:00
reboot_fixups_32.c x86: Ce4100: Add reboot_fixup() for CE4100 2010-11-12 00:45:41 +01:00
relocate_kernel_32.S
relocate_kernel_64.S
resource.c x86: avoid high BIOS area when allocating address space 2010-12-17 10:01:30 -08:00
rtc.c rtc: cmos: Add OF bindings 2011-02-23 22:27:55 +01:00
setup.c x86, hibernate: Initialize mmu_cr4_features during boot 2011-04-06 13:10:02 -07:00
setup_percpu.c x86: Unify CPU -> NUMA node mapping between 32 and 64bit 2011-01-28 14:54:09 +01:00
signal.c
smp.c x86, kexec: Make sure to stop all CPUs before exiting the kernel 2010-10-21 13:30:44 -07:00
smpboot.c Revert "x86, NUMA: Fix fakenuma boot failure" 2011-04-21 11:30:59 +02:00
stacktrace.c x86, dumpstack: Correct stack dump info when frame pointer is available 2011-03-18 10:51:42 +01:00
step.c x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
sys_i386_32.c i386: Make kernel_execve() suitable for stack unwinding 2010-09-03 08:16:02 +02:00
sys_x86_64.c improve sys_newuname() for compat architectures 2010-03-12 15:52:32 -08:00
syscall_64.c
syscall_table_32.S introduce sys_syncfs to sync a single file system 2011-03-21 00:40:29 -04:00
tboot.c thp: pte alloc trans splitting 2011-01-13 17:32:40 -08:00
tce_64.c
test_nx.c
test_rodata.c
time.c x86, nmi_watchdog: Remove all stub function calls from old nmi_watchdog 2010-11-18 09:08:23 +01:00
tls.c
tls.h
topology.c x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
trampoline.c x86, trampoline: Common infrastructure for low memory trampolines 2011-02-17 21:02:43 -08:00
trampoline_32.S x86, trampoline: Common infrastructure for low memory trampolines 2011-02-17 21:02:43 -08:00
trampoline_64.S x86-64, trampoline: Remove unused variable 2011-02-18 15:50:36 -08:00
traps.c x86, NMI: Clean-up default_do_nmi() 2011-01-07 15:08:53 +01:00
tsc.c x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
tsc_sync.c
verify_cpu.S x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
vm86_32.c thp: split_huge_page_mm/vma 2011-01-13 17:32:41 -08:00
vmlinux.lds.S Merge branch 'x86-trampoline-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-03-16 10:10:02 -07:00
vsmp_64.c
vsyscall_64.c timkeeping: Fix update_vsyscall to provide wall_to_monotonic offset 2010-07-27 12:40:54 +02:00
x86_init.c x86/platform: Add a wallclock_init func to x86_init.timers ops 2011-02-14 18:20:43 +01:00
x8664_ksyms_64.c x86-64, mem: Convert memmove() to assembly file and fix return value bug 2011-01-25 16:58:39 -08:00
xsave.c x86: Fix common misspellings 2011-03-18 10:39:30 +01:00