Can the Linux perf profiler be used inside C++ code?
I would like to measure the L1, L2 and L3 cache hit/miss ratios of some parts of my C++ code. I'm not interested in running perf on my entire application. Can perf be used as a library inside C++?
int main() {
    ...
    ...
    start_profiling();
    // The part I'm interested in
    ...
    end_profiling();
    ...
    ...
}
I gave Intel PCM a shot, but I ran into two problems. First, it gave me some strange numbers. Second, it does not support L1 cache profiling.
If this isn't possible with Perf, what's the easiest way to get this information?
It looks like all you're trying to do is read a few performance counters, which is a perfect fit for the PAPI library.
The full list of supported counters is quite long, but it seems that you are most interested in PAPI_L1_TCM, PAPI_L1_TCA and their L2 and L3 counterparts. Note that you can also split read/write accesses, and you can distinguish between instruction and data caches.
Yes, there is per-thread monitoring that allows perf counters to be read from user space. See the man page for perf_event_open(2).
Since perf only has generic events for the L1i, L1d and last-level caches, for anything else you will need to use the raw event type, PERF_TYPE_RAW, with event codes from the manual for your processor.
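As a rough illustration of the counting approach, here is a minimal sketch built only on the perf_event_open(2) system call and the ioctl requests it documents. The helper names open_counter and hw_cache_config, and the fd parameter on start_profiling/end_profiling, are assumptions made for this example and are not part of any library. The same open_counter call can be given the raw type (PERF_TYPE_RAW in linux/perf_event.h) with a processor-specific event code for events such as L2 misses.

```cpp
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cassert>
#include <cstdint>
#include <cstring>

// Encode a generic cache event as described in perf_event_open(2):
// cache id | (operation << 8) | (result << 16).
constexpr std::uint64_t hw_cache_config(std::uint64_t cache, std::uint64_t op,
                                        std::uint64_t result) {
    return cache | (op << 8) | (result << 16);
}

// Hypothetical helper: open a counter for the calling thread on any CPU.
// The counter starts disabled and counts user-space code only.
// Returns the file descriptor, or -1 on failure (e.g. insufficient permissions).
inline int open_counter(std::uint32_t type, std::uint64_t config) {
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = type;
    attr.config = config;
    attr.disabled = 1;
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;
    return static_cast<int>(syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0));
}

// Reset the counter to zero and start counting.
inline void start_profiling(int fd) {
    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
}

// Stop counting and return the counter value, or -1 if it cannot be read.
inline long long end_profiling(int fd) {
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    long long count = -1;
    if (read(fd, &count, sizeof(count)) != static_cast<ssize_t>(sizeof(count)))
        return -1;
    return count;
}

// Usage sketch (L1 data cache read misses around a region of interest):
//   int fd = open_counter(PERF_TYPE_HW_CACHE,
//                         hw_cache_config(PERF_COUNT_HW_CACHE_L1D,
//                                         PERF_COUNT_HW_CACHE_OP_READ,
//                                         PERF_COUNT_HW_CACHE_RESULT_MISS));
//   start_profiling(fd);
//   /* the part I'm interested in */
//   long long misses = end_profiling(fd);
//   close(fd);
```

To get a miss ratio, a second counter opened with PERF_COUNT_HW_CACHE_RESULT_ACCESS can be started and stopped around the same region and the two values divided. Whether opening the counter succeeds depends on the kernel's perf_event_paranoid setting and on the hardware, so the -1 return value should be checked.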
To implement profiling, you need to set a sampling interval (the sample_period field of perf_event_attr), poll/select on the fd or wait for the SIGIO signal, and when that happens, read the sample and the instruction pointer from it. You can then try to resolve the returned instruction pointers to function names using a debugger such as GDB.
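The sampling steps can be sketched roughly as follows, assuming the counter is configured to record only the instruction pointer (sample_type set to PERF_SAMPLE_IP). The helper names make_sampling_attr, ring_copy and collect_ips are hypothetical. The sketch covers the attribute setup and the decoding of sample records; it assumes the caller has already passed the attribute to perf_event_open, mapped the fd with mmap (one metadata page followed by a power-of-two number of data pages), and read data_head from the perf_event_mmap_page at the start of the mapping once poll reports the fd as readable.

```cpp
#include <linux/perf_event.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Attribute for a sampling counter: one sample every `period` events, each
// carrying only the instruction pointer. `config` is a raw event code taken
// from the processor manual.
inline perf_event_attr make_sampling_attr(std::uint64_t config, std::uint64_t period) {
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_RAW;
    attr.config = config;
    attr.sample_period = period;
    attr.sample_type = PERF_SAMPLE_IP;
    attr.wakeup_events = 1;  // make the fd readable after every sample
    attr.disabled = 1;
    attr.exclude_kernel = 1;
    return attr;
}

// Copy len bytes starting at logical offset pos out of a ring buffer whose
// size is a power of two, handling wrap-around at the end of the buffer.
inline void ring_copy(void* dst, const unsigned char* ring, std::uint64_t ring_size,
                      std::uint64_t pos, std::size_t len) {
    auto* out = static_cast<unsigned char*>(dst);
    for (std::size_t i = 0; i < len; ++i)
        out[i] = ring[(pos + i) & (ring_size - 1)];
}

// Walk the records in [tail, head) and collect the instruction pointer of
// every PERF_RECORD_SAMPLE. Relies on sample_type == PERF_SAMPLE_IP, so the
// body of each sample starts with a single 64-bit ip field.
inline std::vector<std::uint64_t> collect_ips(const unsigned char* ring,
                                              std::uint64_t ring_size,
                                              std::uint64_t tail, std::uint64_t head) {
    std::vector<std::uint64_t> ips;
    while (head - tail >= sizeof(perf_event_header)) {
        perf_event_header hdr;
        ring_copy(&hdr, ring, ring_size, tail, sizeof(hdr));
        if (hdr.size < sizeof(hdr) || hdr.size > head - tail)
            break;  // malformed or incomplete record
        if (hdr.type == PERF_RECORD_SAMPLE &&
            hdr.size >= sizeof(hdr) + sizeof(std::uint64_t)) {
            std::uint64_t ip;
            ring_copy(&ip, ring, ring_size, tail + sizeof(hdr), sizeof(ip));
            ips.push_back(ip);
        }
        tail += hdr.size;  // records of other types are skipped
    }
    return ips;
}
```

After the records have been consumed, the new tail position should be written back to data_tail in the metadata page so the kernel can reuse the space. The perf_event_open(2) man page also calls for a read memory barrier after loading data_head, which this sketch leaves to the caller.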
Another option is to use SystemTap. You will need empty implementations of start_profiling() and end_profiling() so that SystemTap can toggle profiling with something like this:
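global traceme, prof;
# Turn sampling on and off when the marker functions are entered.
probe process("/path/to/your/executable").function("start_profiling") {
    traceme = 1;
}
probe process("/path/to/your/executable").function("end_profiling") {
    traceme = 0;
}
# Type 4 is PERF_TYPE_RAW; take one sample every 10000 events.
probe perf.type(4).config(/* RAW value of perf event */).sample(10000) {
    # Only record samples taken between start_profiling() and end_profiling().
    if (traceme)
        prof[usymname(uaddr())] <<< 1;
}
# On exit, print the number of samples per symbol.
probe end {
    foreach([sym+] in prof) {
        printf("%16s %d\n", sym, @count(prof[sym]));
    }
}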