The number of hardware counters is limited. The multiplexing feature
enables OProfile to gather more events than the hardware provides
counters for. This is done by switching between events at a
user-specified time interval.
A new file (/dev/oprofile/time_slice) is added for the user to specify
the timer interval in ms. If the number of events to profile is higher
than the number of hardware counters available, the patch schedules
a work queue that switches the event counters and rewrites them with
the values of the next set of events. The switching mechanism needs to
be implemented for each architecture to support multiplexing. This
patch only implements AMD CPU support, but multiplexing can easily be
extended to other models and architectures.
There are follow-on patches that rework parts of this patch.
Signed-off-by: Jason Yeh <jason.yeh@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
When casting the counter value to a 64 bit value in 32 bit mode, sign
extension may lead to broken counter values. This patch fixes this by
casting to (u64) instead of (s64).
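
As a generic illustration (not the code this patch touches), the
following sketch shows how widening a 32 bit value through a signed
type fills the upper half with ones, while an unsigned widening leaves
it zero. The helper names are made up for this example.

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/*
 * Widen through a signed type: bit 31 is copied into bits 32..63.
 * (The uint32_t -> int32_t conversion is implementation-defined for
 * large values; common compilers such as gcc wrap modulo 2^32.)
 */
static uint64_t widen_signed(uint32_t raw)
{
	return (uint64_t)(int64_t)(int32_t)raw;
}

/* Widen through an unsigned type: the upper 32 bits stay zero. */
static uint64_t widen_unsigned(uint32_t raw)
{
	return (uint64_t)raw;
}
```

Both helpers agree as long as bit 31 is clear; they diverge only for
values with the top bit set.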
Signed-off-by: Robert Richter <robert.richter@amd.com>
op_amd_handle_ibs() should return 0 when IBS is not present or not defined.
Fix compilation warning:
CC [M] arch/x86/oprofile/op_model_amd.o
arch/x86/oprofile/op_model_amd.c: In function ‘op_amd_handle_ibs’:
arch/x86/oprofile/op_model_amd.c:217: warning: no return statement in function returning non-void
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
The IBS implementation writes 64 bit register values to the cpu buffer
by writing two 32 bit values using oprofile_add_data(). This patch
introduces oprofile_add_data64() to write a single 64 bit value to the
buffer.
Signed-off-by: Robert Richter <robert.richter@amd.com>
The IBS code internally uses 32 bit values (a low and a high value) to
represent a 64 bit value. This patch changes this and now 64 bit
values are used instead. 64 bit MSR functions can be used now.
No functional changes.
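
A minimal sketch of why this is a pure representation change: a
low/high pair and a single 64 bit value carry the same information and
convert losslessly in both directions. The type and helper names are
hypothetical and do not appear in the patch.

```c
#include <stdint.h>

/* Old style: a 64 bit register value kept as two 32 bit halves. */
struct reg_halves {		/* hypothetical type */
	uint32_t low;
	uint32_t high;
};

/* Combine the two halves into one 64 bit value. */
static uint64_t halves_to_u64(struct reg_halves h)
{
	return ((uint64_t)h.high << 32) | h.low;
}

/* Split a 64 bit value back into its two halves. */
static struct reg_halves u64_to_halves(uint64_t val)
{
	struct reg_halves h = {
		.low  = (uint32_t)val,
		.high = (uint32_t)(val >> 32),
	};
	return h;
}
```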
Signed-off-by: Robert Richter <robert.richter@amd.com>
The patch replaces all CTRL_SET_*ACTIVE macros. 64 bit MSR functions
and 64 bit counter values are used now. The code uses bit masks from
<asm/intel_arch_perfmon.h>.
Signed-off-by: Robert Richter <robert.richter@amd.com>
The patch replaces all CTR_OVERFLOWED macros. 64 bit MSR functions and
64 bit counter values are used now. This makes it easier to later
extend the models to use counters wider than 32 bits.
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch introduces op_x86_get_ctrl() to calculate the value of the
performance control register. This is generic code usable for all
models. The event and reserved masks are model specific and stored in
struct op_x86_model_spec. 64 bit MSR functions are used now. The patch
removes many hard to read macros used for ctrl calculation.
The function op_x86_get_ctrl() is common code and the first step to
further merge performance counter implementations for x86 models.
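
A simplified userspace sketch of such a calculation, assuming the usual
x86 event select layout (event select in bits 0..7, unit mask in bits
8..15, USR bit 16, OS bit 17, INT bit 20, extended event select in bits
32..35). The struct and function names are stand-ins for illustration,
and the reserved mask is left out.

```c
#include <stdint.h>

/* Bit positions of the x86 event select register. */
#define EVSEL_USR	(1ULL << 16)	/* count in user mode */
#define EVSEL_OS	(1ULL << 17)	/* count in kernel mode */
#define EVSEL_INT	(1ULL << 20)	/* interrupt on overflow */

struct model_spec {		/* hypothetical stand-in */
	uint16_t event_mask;	/* valid event bits, 0 means 0xFF */
};

struct counter_config {		/* hypothetical stand-in */
	uint16_t event;
	uint8_t  unit_mask;
	int      user;
	int      kernel;
};

/* Build the control register value from a counter configuration. */
static uint64_t get_ctrl(const struct model_spec *model,
			 const struct counter_config *cfg)
{
	uint16_t event = cfg->event;
	uint64_t val = EVSEL_INT;

	val |= cfg->user ? EVSEL_USR : 0;
	val |= cfg->kernel ? EVSEL_OS : 0;
	val |= (uint64_t)cfg->unit_mask << 8;		/* bits 8..15 */
	event &= model->event_mask ? model->event_mask : 0xFF;
	val |= event & 0xFF;				/* bits 0..7 */
	val |= (uint64_t)(event & 0x0F00) << 24;	/* bits 32..35 */

	return val;
}
```

Only the event mask differs between models here, which is what allows
one function to serve all of them.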
Signed-off-by: Robert Richter <robert.richter@amd.com>
In follow-on patches the setup_ctrs() functions will need data that
describes the model. This patch extends the function argument list to
pass a pointer to the model to these functions.
Signed-off-by: Robert Richter <robert.richter@amd.com>
The use of the macros has no effect. The oprofilefs has to be extended
first to support these features.
Signed-off-by: Robert Richter <robert.richter@amd.com>
The macros CTRL_READ() and CTRL_WRITE() make the code hard to read and
maintain. This patch replaces them with rdmsr()/wrmsr() functions and
simplifies the code.
Signed-off-by: Robert Richter <robert.richter@amd.com>
There are duplicate macro implementations in model specific code. This
patch moves all common macros to op_x86_model.h.
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch creates the new functions
oprofile_write_reserve()
oprofile_add_data()
oprofile_write_commit()
and makes them part of the oprofile api.
Signed-off-by: Robert Richter <robert.richter@amd.com>
The new ring buffer implementation allows the storage of samples of
different sizes. This patch implements the usage of the new sample
format to store IBS samples in the cpu buffer. Until now, writing to
the cpu buffer could lead to incomplete sampling sequences since IBS
data was transferred in multiple samples. Due to a full buffer,
data could be lost at any time. This can't happen any more since the
complete data is reserved in advance and then stored in a single
sample.
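
The reserve-then-commit idea can be sketched with a toy buffer. This is
a userspace model under simplifying assumptions (fixed size, no
wrap-around, no concurrency); all names are hypothetical and the kernel
implementation differs.

```c
#include <stddef.h>

#define BUF_SLOTS 8			/* hypothetical buffer size */

struct toy_buffer {			/* hypothetical stand-in */
	unsigned long slots[BUF_SLOTS];
	size_t committed;		/* slots visible to the reader */
};

struct toy_entry {			/* handle for one reserved sample */
	struct toy_buffer *buf;
	size_t start, size, used;
};

/* Reserve room for the whole sample up front; fail if it does not fit. */
static int toy_reserve(struct toy_entry *e, struct toy_buffer *b, size_t size)
{
	if (BUF_SLOTS - b->committed < size)
		return -1;
	e->buf = b;
	e->start = b->committed;
	e->size = size;
	e->used = 0;
	return 0;
}

/*
 * Fill one reserved slot. Returns the number of reserved slots still
 * free, or -1 if the reservation is already full.
 */
static int toy_add_data(struct toy_entry *e, unsigned long val)
{
	if (e->used >= e->size)
		return -1;
	e->buf->slots[e->start + e->used++] = val;
	return (int)(e->size - e->used);
}

/* Publish the sample; only completely filled samples become visible. */
static int toy_commit(struct toy_entry *e)
{
	if (e->used != e->size)
		return -1;
	e->buf->committed = e->start + e->size;
	return 0;
}
```

Because the space check happens once in the reserve step, a sample is
either stored completely or not at all.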
Signed-off-by: Robert Richter <robert.richter@amd.com>
Special events such as task or context switches are marked with an
escape code in the cpu buffer followed by an event code or a task
identifier. There is one escape code per event. To make escape
sequences also available for data samples the internal cpu buffer
format must be changed. The current implementation does not allow the
extension of event codes since this would lead to collisions with the
task identifiers. To avoid this, this patch introduces an event mask
that allows the storage of multiple events with one escape code. Now,
task identifiers are stored in the data section of the sample. The
implementation also allows the usage of custom data in a sample. As a
side effect the new code is much more readable and easier to
understand.
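
A rough sketch of the idea, assuming one escape marker followed by a
bit mask of events and a data word. The flag names, the marker value,
and the sample layout are illustrative assumptions, not the definitions
used by the patch.

```c
/* Hypothetical flag bits sharing a single escape code. */
#define EV_KERNEL_CTX_SWITCH	(1UL << 0)
#define EV_IS_KERNEL		(1UL << 1)
#define EV_TRACE_BEGIN		(1UL << 2)
#define EV_USER_CTX_SWITCH	(1UL << 3)

#define TOY_ESCAPE_CODE		(~0UL)	/* assumed marker value */

struct toy_sample {		/* hypothetical sample layout */
	unsigned long eip;	/* TOY_ESCAPE_CODE marks a special sample */
	unsigned long event;	/* bit mask of EV_* flags */
	unsigned long data[1];	/* e.g. a task identifier */
};

/* Build one escaped sample carrying several events at once. */
static struct toy_sample make_special(unsigned long flags,
				      unsigned long task_id)
{
	struct toy_sample s;

	s.eip = TOY_ESCAPE_CODE;
	s.event = flags;
	s.data[0] = task_id;
	return s;
}

/* Test whether an escaped sample carries a given event flag. */
static int has_event(const struct toy_sample *s, unsigned long flag)
{
	return s->eip == TOY_ESCAPE_CODE && (s->event & flag) != 0;
}
```

Since the task identifier lives in the data word, it can no longer
collide with an event code.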
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch fixes the PCI device use count for AMD northbridge
devices. In case of an IBS LVT initialization failure, the PCI device
is released now by calling pci_dev_put().
If there are no initialization errors, the devices are released in
pci_get_device() while iterating.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Implementation of pairwise init/exit functions for IBS and IBS NMI
setup. There are also some function renames and the removal of forward
function declarations.
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch adds the logic for enabling additional IBS control bits:
* IBS-Fetch IbsRandEn bit (bit 57)
* IBS-Op IbsOpCntCtl bit (bit 19)
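
A small sketch of setting or clearing these two bits in a 64 bit
control value. The bit positions come from the list above; the macro
and helper names are assumptions made for this example.

```c
#include <stdint.h>

#define IBS_FETCH_RAND_EN	(1ULL << 57)	/* IbsRandEn, bit 57 */
#define IBS_OP_CNT_CTL		(1ULL << 19)	/* IbsOpCntCtl, bit 19 */

/* Return the IBS-Fetch control value with IbsRandEn set or cleared. */
static uint64_t ibs_fetch_ctl(uint64_t base, int rand_en)
{
	return rand_en ? (base | IBS_FETCH_RAND_EN)
		       : (base & ~IBS_FETCH_RAND_EN);
}

/* Return the IBS-Op control value with IbsOpCntCtl set or cleared. */
static uint64_t ibs_op_ctl(uint64_t base, int cnt_ctl)
{
	return cnt_ctl ? (base | IBS_OP_CNT_CTL)
		       : (base & ~IBS_OP_CNT_CTL);
}
```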
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch introduces multiplexing support for the OProfile kernel
module. It adds a new function pointer to struct oprofile_operations
that allows each architecture to supply a callback for switching
between different sets of events when the timer expires. Userspace
tools can modify the time slice through /dev/oprofile/time_slice.
It also modifies the number of counters exposed to userspace through
/dev/oprofile. For example, the number of counters for AMD CPUs is
changed to 32, and they are multiplexed in sets of 4.
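
The set switching can be pictured as a round-robin over groups of
hardware counters. The sketch below uses the AMD numbers from the
example above; the helper names and the exact wrap-around rule are
assumptions, and the kernel code may select sets differently.

```c
#define NUM_VIRT_COUNTERS	32	/* counters exposed to userspace */
#define NUM_HW_COUNTERS		4	/* counters provided by hardware */

/*
 * Hypothetical helper: given the first virtual counter of the current
 * set and the number of configured events, return the first virtual
 * counter of the set to program on the next timer tick.
 */
static int next_set(int cur_start, int nr_events)
{
	int next = cur_start + NUM_HW_COUNTERS;

	if (next >= nr_events || next >= NUM_VIRT_COUNTERS)
		return 0;		/* wrap around to the first set */
	return next;
}

/* Map a hardware counter to the virtual counter it currently holds. */
static int virt_index(int set_start, int hw_counter)
{
	return set_start + hw_counter;
}
```

With 10 configured events the sets start at virtual counters 0, 4 and
8 before wrapping back to 0.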
Signed-off-by: Jason Yeh <jason.yeh@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: oprofile-list <oprofile-list@lists.sourceforge.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>