This patch implements nmi_setup_mux() and nmi_shutdown_mux() functions
to setup/shutdown multiplexing. Multiplexing code in nmi_int.c is now
much more separated.
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch moves some multiplexing code to the new function
op_mux_fill_in_addresses(). Also, the whole multiplexing code is now
at a single location.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Variable switch_index must be initialized for each cpu. This patch
fixes the initialization by moving it to the per-cpu init function
nmi_cpu_setup().
Signed-off-by: Robert Richter <robert.richter@amd.com>
__get_cpu_var() calls smp_processor_id(). When the cpu id is already
known, instead use per_cpu() to avoid generating the id again.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Use the corresponding macros when iterating over counter and control
registers. Since NUM_CONTROLS and NUM_COUNTERS are equal for AMD cpus
the fix is more a cosmetical change.
Signed-off-by: Robert Richter <robert.richter@amd.com>
The number of hardware counters is limited. The multiplexing feature
enables OProfile to gather more events than counters are provided by
the hardware. This is realized by switching between events at an user
specified time interval.
A new file (/dev/oprofile/time_slice) is added for the user to specify
the timer interval in ms. If the number of events to profile is higher
than the number of hardware counters available, the patch will
schedule a work queue that switches the event counter and re-writes
the different sets of values into it. The switching mechanism needs to
be implemented for each architecture to support multiplexing. This
patch only implements AMD CPU support, but multiplexing can be easily
extended for other models and architectures.
There are follow-on patches that rework parts of this patch.
Signed-off-by: Jason Yeh <jason.yeh@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch removes the function nmi_save_registers(). Per-cpu code is
now executed only in the function nmi_cpu_setup(). Also, it renames
the per-cpu function nmi_restore_registers() to
nmi_cpu_restore_registers().
Signed-off-by: Robert Richter <robert.richter@amd.com>
When casting the counter value to a 64 bit value in 32 bit mode, sign
extension may lead to broken counter values. This patch fixes this by
casting to (u64) instead of (s64).
Signed-off-by: Robert Richter <robert.richter@amd.com>
The short name of the achitecture is 'arch_perfmon'. This patch
changes the kernel parameter to use this name.
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
op_amd_handle_ibs() should return 0 when IBS is not present or not defined.
Fix compilation warning:
CC [M] arch/x86/oprofile/op_model_amd.o
arch/x86/oprofile/op_model_amd.c: In function ‘op_amd_handle_ibs’:
arch/x86/oprofile/op_model_amd.c:217: warning: no return statement in function returning non-void
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Commit:
e419294 x86/oprofile: moving arch_perfmon counter setup to op_x86_model_spec.init
introduced a bug in the initialization of core_i7 leading to the
incorrect model setup to &op_ppro_spec. This patch fixes this.
Signed-off-by: Robert Richter <robert.richter@amd.com>
The IBS implemention writes 64 bit register values to the cpu buffer
by writing two 32 values using oprofile_add_data(). This patch
introduces oprofile_add_data64() to write a single 64 bit value to the
buffer.
Signed-off-by: Robert Richter <robert.richter@amd.com>
The IBS code internally uses 32 bit values (a low and a high value) to
represent a 64 bit value. This patch changes this and now 64 bit
values are used instead. 64 bit MSR functions can be used now.
No functional changes.
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch removes struct op_saved_msr and replaces it by an u64
variable. This makes code easier and it is possible to use 64 bit MSR
functions.
Signed-off-by: Robert Richter <robert.richter@amd.com>
The patch replaces all CTRL_SET_*ACTIVE macros. 64 bit MSR functions
and 64 bit counter values are used now. The code uses bit masks from
<asm/intel_arch_perfmon.h>.
Signed-off-by: Robert Richter <robert.richter@amd.com>
The patch replaces all CTR_OVERFLOWED macros. 64 bit MSR functions and
64 bit counter values are used now. Thus, it will be easier to later
extend the models to use more than 32 bit width counters.
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch introduces op_x86_get_ctrl() to calculate the value of the
performance control register. This is generic code usable for all
models. The event and reserved masks are model specific and stored in
struct op_x86_model_spec. 64 bit MSR functions are used now. The patch
removes many hard to read macros used for ctrl calculation.
The function op_x86_get_ctrl() is common code and the first step to
further merge performance counter implementations for x86 models.
Signed-off-by: Robert Richter <robert.richter@amd.com>
In follow-on patches the setup_ctrs() functions will need data that
describes the model. This patch extends the function argument list to
pass a pointer of the model to these function.
Signed-off-by: Robert Richter <robert.richter@amd.com>
The use of the macros has no effect. The oprofilefs has to be extended
first to support these features.
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch fixes missing braces around macro parameters. Macro
definitions from intel_arch_perfmon.h are used where possible.
Signed-off-by: Robert Richter <robert.richter@amd.com>
The macros CTRL_READ() and CTRL_WRITE() make the code hard to read and
maintain. This patch replaces them by rdmsr()/wrmsr() functions and
simplifies the code.
Signed-off-by: Robert Richter <robert.richter@amd.com>
The macros CTRL_READ() and CTRL_WRITE() make the code hard to read and
maintain. This patch replaces them by rdmsr()/wrmsr() functions and
simplifies the code.
Signed-off-by: Robert Richter <robert.richter@amd.com>
The macros CTRL_READ() and CTRL_WRITE() make the code hard to read and
maintain. This patch replaces them by rdmsr()/wrmsr() functions and
simplifies the code.
Signed-off-by: Robert Richter <robert.richter@amd.com>
There are duplicate macro implementations in model specific code. This
patch moves all common macros to op_x86_model.h.
Signed-off-by: Robert Richter <robert.richter@amd.com>
* 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
oprofile: introduce module_param oprofile.cpu_type
oprofile: add support for Core i7 and Atom
oprofile: remove undocumented oprofile.p4force option
oprofile: re-add force_arch_perfmon option
The function arch_perfmon_init() in nmi_int.c is model specific. This
patch moves it to op_model_ppro.c by using the init function pointer
in struct op_x86_model_spec.
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
This reverts commit 59512900ba.
arch_perfmon_setup_counters() is actually never called for ppro, so
there is no code that changes the numbers in op_ppro_spec. The patch
as it is has no effect.
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Merge reason: merge almost-rc8 into perfcounters/core, which was -rc6
based - to pick up the latest upstream fixes.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use ®s->sp instead of regs for getting the top of stack in kernel mode.
(on x86-64, regs->sp always points the top of stack)
[ Impact: Oprofile decodes only stack for backtracing on i386 ]
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
[ v2: rename the API to kernel_stack_pointer(), move variable inside ]
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: systemtap@sources.redhat.com
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Jan Blunck <jblunck@suse.de>
Cc: Christoph Hellwig <hch@infradead.org>
LKML-Reference: <20090511210300.17332.67549.stgit@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch removes module_param oprofile.force_arch_perfmon and
introduces oprofile.cpu_type=archperfmon instead. This new parameter
can be reused for other models and architectures.
Currently only archperfmon is supported.
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
The registers are about the same as other Family 6 CPUs
so we only need to add detection.
I'm not completely happy with calling Nehalem Core i7 because
there will be undoubtedly other Nehalem based CPUs
in the future with different marketing names, but it's
the best we got for now.
Requires updated oprofile userland for the new event files.
If you don't want to update right now you can also use
oprofile.force_arch_perfmon=1 (added in the next patch) with 0.9.4
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
There are no new P4s and the oprofile code knows about all existing
ones, so we don't really need the p4force option anymore.
Remove it.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
This re-adds the force_arch_perfmon option that was in the original
arch perfmon patchkit. Originally this was rejected in favour
of a generalized perfmon=name option, but it turned out implementing
the later in a reliable way is hard (and it would have been easy
to crash the kernel if a user gets it wrong)
But now Atom and Core i7 support being readded a user would
need to update their oprofile userland to beyond 0.9.4 to use oprofile again
on Atom or Core i7.
To avoid this problem readd the force_arch_perfmon option.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Merge reason: we have gathered quite a few conflicts, need to merge upstream
Conflicts:
arch/powerpc/kernel/Makefile
arch/x86/ia32/ia32entry.S
arch/x86/include/asm/hardirq.h
arch/x86/include/asm/unistd_32.h
arch/x86/include/asm/unistd_64.h
arch/x86/kernel/cpu/common.c
arch/x86/kernel/irq.c
arch/x86/kernel/syscall_table_32.S
arch/x86/mm/iomap_32.c
include/linux/sched.h
kernel/Makefile
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: reduce per-cpu size for CONFIG_CPUMASK_OFFSTACK=y
In most places it's cleaner to use the accessors cpu_sibling_mask()
and cpu_core_mask() wrappers which already exist.
I couldn't avoid cleaning up the access in oprofile, either.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Impact: fix stuck NMIs and non-working oprofile on certain CPUs
Resetting the counter width of the performance counters on Intel's
Core2 CPUs, breaks the delivery of NMIs, when running in x86_64 mode.
This should fix bug #12395:
http://bugzilla.kernel.org/show_bug.cgi?id=12395
Signed-off-by: Tim Blechmann <tim@klingt.org>
Signed-off-by: Robert Richter <robert.richter@amd.com>
LKML-Reference: <20090303100412.GC10085@erda.amd.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix kernel crash
Both oprofile and perfcounters register an NMI die handler, but only one
can handle the NMI. Conveniently, oprofile unregisters it's notifier
when not actively in use, so setting it's notifier priority higher than
perfcounter's allows oprofile to borrow the NMI for the duration of it's
run. Tested/works both as module and built-in.
While testing, I found that if kerneltop was generating NMIs at very
high frequency, the kernel may panic when oprofile registered it's
handler. This turned out to be because oprofile registers it's handler
before reset_value has been allocated, so if an NMI comes in while it's
still setting up, kabOom. Rather than try more invasive changes, I
followed the lead of other places in op_model_ppro.c, and simply
returned in that highly unlikely event. (debug warnings attached)
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
With oprofile as a module, and unloaded by profiling script,
both oprofile and kerneltop work fine.. unless you leave kerneltop
running when you start profiling, then you may see badness.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch creates the new functions
oprofile_write_reserve()
oprofile_add_data()
oprofile_write_commit()
and makes them part of the oprofile api.
Signed-off-by: Robert Richter <robert.richter@amd.com>
The new ring buffer implementation allows the storage of samples with
different size. This patch implements the usage of the new sample
format to store ibs samples in the cpu buffer. Until now, writing to
the cpu buffer could lead to incomplete sampling sequences since IBS
samples were transfered in multiple samples. Due to a full buffer,
data could be lost at any time. This can't happen any more since the
complete data is reserved in advance and then stored in a single
sample.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Special events such as task or context switches are marked with an
escape code in the cpu buffer followed by an event code or a task
identifier. There is one escape code per event. To make escape
sequences also available for data samples the internal cpu buffer
format must be changed. The current implementation does not allow the
extension of event codes since this would lead to collisions with the
task identifiers. To avoid this, this patch introduces an event mask
that allows the storage of multiple events with one escape code. Now,
task identifiers are stored in the data section of the sample. The
implementation also allows the usage of custom data in a sample. As a
side effect the new code is much more readable and easier to
understand.
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch fixes the PCI device use count for AMD northbridge
devices. In case of an IBS LVT initialization failure, the PCI device
is released now by calling pci_dev_put().
If there are no initialization errors, the devices are released in
pci_get_device() while iterating.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Impact: rename include file
We'll be providing an asm/perf_counter.h to the generic perfcounter code,
so use the already existing x86 file for this purpose and rename it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Implementation of pairwise init/exit funcions for IBS and IBS NMI
setup. There are also some function renames and the removal of forward
function declarations.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Alan Jenkins wrote:
> This is on an EeePC 701, /proc/cpuinfo as attached.
>
> Is this expected? Will the next release work?
>
> Thanks, Alan
>
> # opcontrol --setup --no-vmlinux
> cpu_type 'unset' is not valid
> you should upgrade oprofile or force the use of timer mode
>
> # opcontrol -v
> opcontrol: oprofile 0.9.4 compiled on Nov 29 2008 22:44:10
>
> # cat /dev/oprofile/cpu_type
> i386/p6
> # uname -r
> 2.6.28-rc6eeepc
Hi Alan,
Looking at the kernel driver code for oprofile it can return the "i386/p6" for
the cpu_type. However, looking at the user-space oprofile code there isn't the
matching entry in libop/op_cpu_type.c or the events/unit_mask files in
events/i386 directory.
The Intel AP-485 says this is a "Intel Pentium M processor model D". Seems like
the oprofile kernel driver should be identifying the processor as "i386/p6_mobile"
The driver identification code doesn't look quite right in nmi_init.c
http://git.kernel.org/?p=linux/kernel/git/sfr/linux-next.git;a=blob;f=arch/x86/oprofile/nmi_int.c;h=022cd41ea9b4106e5884277096e80e9088a7c7a9;hb=HEAD
has:
409 case 10 ... 13:
410 *cpu_type = "i386/p6";
411 break;
Referring to the Intel AP-485:
case 10 and 11 should produce "i386/piii"
case 13 should produce "i386/p6_mobile"
I didn't see anything for case 12.
Something like the attached patch. I don't have a celeron machine to verify that
changes in this area of the kernel fix thing.
-Will
Signed-off-by: William Cohen <wcohen@redhat.com>
Tested-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Acked-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
If oprofile statically compiled in kernel, a cpu unplug triggers
a panic in ppro_stop(), because a NULL pointer is dereferenced.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
reset_value was changed from long to u64 in commit
b991702884 (oprofile: Implement Intel
architectural perfmon support)
But dynamic allocation of this array use a wrong type (long instead of
u64)
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Fix the counter overflow check for CPUs with counter width > 32
I had a similar change in a different patch that I didn't submit
and I didn't notice the problem earlier because it was always
tested together.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch adds the logic for enabling additional IBS control bits :
* IBS-Fetch IbsRandEn bit (bit 57)
* IBS-Op IbsOpCntCtl bit (bit 19)
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Discover number of counters for all family 6 models even when not
in arch perfmon mode.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Newer Intel CPUs (Core1+) have support for architectural
events described in CPUID 0xA. See the IA32 SDM Vol3b.18 for details.
The advantage of this is that it can be done without knowing about
the specific CPU, because the CPU describes by itself what
performance events are supported. This is only a fallback
because only a limited set of 6 events are supported.
This allows to do profiling on Nehalem and on Atom systems
(later not tested)
This patch implements support for that in oprofile's Intel
Family 6 profiling module. It also has the advantage of supporting
an arbitary number of events now as reported by the CPU.
Also allow arbitary counter widths >32bit while we're at it.
Requires a patched oprofile userland to support the new
architecture.
v2: update for latest oprofile tree
remove force_arch_perfmon
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
This essentially reverts Linus' earlier 4b9f12a377
commit. Nehalem is not core_2, so it shouldn't be reported as such.
However with the earlier arch perfmon patch it will fall back to
arch perfmon mode now, so there is no need to fake it as core_2.
The only drawback is that Linus will need to patch the arch perfmon
support into his oprofile binary now, but I think he can do that.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Vegard Nossum reported oprofile + hibernation problems:
> Now some warnings:
>
> ------------[ cut here ]------------
> WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/kernel/smp.c:328 s
> mp_call_function_mask+0x194/0x1a0()
The usual problem: the suspend function when interrupts are
already disabled calls smp_call_function which is not allowed with
interrupt off. But at this point all the other CPUs should be already
down anyways, so it should be enough to just drop that.
This patch should fix that problem at least by fixing cpu hotplug&
suspend support.
[ mingo@elte.hu: fixed 5 coding style errors. ]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch introduces multiplexing support for the Oprofile kernel
module. It basically adds a new function pointer in oprofile_operator
allowing each architecture to supply its callback to switch between
different sets of event when the timer expires. Userspace tools can
modify the time slice through /dev/oprofile/time_slice.
It also modifies the number of counters exposed to the userspace through
/dev/oprofile. For example, the number of counters for AMD CPUs are
changed to 32 and multiplexed in the sets of 4.
Signed-off-by: Jason Yeh <jason.yeh@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: oprofile-list <oprofile-list@lists.sourceforge.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>