When I use the statistical sampling profiler OProfile to produce a callgraph profile of my C application, the report output includes the following warning multiple times, and it is rather cryptic to me:
warning: dropping hyperspace sample at offset 1af9 >= 2be8 for binary /home/myuser/mybinary
I'm using OProfile in a Xen virtualized environment like this:
modprobe oprofile timer=1
opcontrol --no-vmlinux
opcontrol --start
(wait for profiling data to accumulate)
opcontrol --stop
opreport --session-dir=/var/lib/oprofile --exclude-dependent --demangle=smart \
--symbols /home/myuser/mybinary --callgraph
The complete output from the last command is:
Overflow stats not available
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
warning: dropping hyperspace sample at offset 84d0 >= 79a0 for binary /home/myuser/mybinary
warning: dropping hyperspace sample at offset 7ac0 >= 79a0 for binary /home/myuser/mybinary
warning: dropping hyperspace sample at offset 7d90 >= 79a0 for binary /home/myuser/mybinary
warning: dropping hyperspace sample at offset 7ac0 >= 79a0 for binary /home/myuser/mybinary
warning: dropping hyperspace sample at offset 7d90 >= 79a0 for binary /home/myuser/mybinary
warning: dropping hyperspace sample at offset 8210 >= 79a0 for binary /home/myuser/mybinary
samples % symbol name
-------------------------------------------------------------------------------
After that it prints plausible-looking call graph data.
What does the 'hyperspace' warning mean? What causes it? Does it impact profiling results? How can I fix it?
Maynard Johnson explains this warning in a message to a mailing list:
There have been cases reported where samples recorded by the oprofile
kernel driver appear to be attributed to the wrong binary, in
particular if the sample rate is very high or when doing callgraph
profiling (since callgraph profiling, like a high sample rate, also
results in very high overhead of the oprofile kernel driver and
overflows of its internal sample buffers). I suspect that's what
you're running into. Unfortunately, this is a very insidious bug, and
no one has been able to find the root cause yet. The kernel driver
does report overflow counts of its internal buffers, and the oprofiled
log prints those out. As a convenience, starting with oprofile 0.9.5,
opreport will also print a warning when it finds non-zero overflow
counts and suggests lowering the sampling interval.

I suggest looking in your /var/lib/oprofile/samples/oprofiled.log and
find the overflow statistics for the above profile run (log entries
are timestamped). If you're only seeing a few or a very small
percentage (say, less than 3%), you can probably just ignore the
anomalies.

In general, to avoid/limit this kind of thing, you should
profile at the lowest sampling rate practical, especially when you're
doing callgraph profiling. So what do I mean by "practical"? Well,
as with any sample-based profiler, oprofile is statistical in nature.
And the more data points you have, the more confidence you have in the
data. So for 100% confidence, you should (theoretically) profile with
a count value of '1'. Not too practical, though, since your machine
may appear to lock up because most of the work being done is recording
samples. For cycles event profiling, you can probably use a count
value of several million or so (on today's processors) and still be
pretty confident of the data. For other events, it really depends
on their frequency of occurrence.
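
To check the overflow statistics he refers to, you can search the daemon log directly. The exact wording of the counters differs between oprofile versions, so a loose pattern should catch them; this assumes the default session directory used above:

# look for the buffer overflow / lost-sample counters near your run's timestamp
grep -iE 'overflow|lost' /var/lib/oprofile/samples/oprofiled.log

If the lost/overflow counts are only a tiny fraction of the total samples for your run, the dropped hyperspace samples can most likely be ignored.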
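As for lowering the sampling rate: in timer mode (modprobe oprofile timer=1, as used here under Xen) the rate is fixed by the kernel timer tick, so there is no count value to tune. If your setup exposes hardware performance counters at all, switching to event-based profiling lets you raise the count and cut the driver overhead. The commands below are only a sketch; the event name, count, and callgraph depth are illustrative and depend on your CPU and kernel:

# unload the timer-mode driver first
opcontrol --deinit
# hypothetical setup: sample roughly every 6,000,000 unhalted cycles,
# with a callgraph depth of 10 and no kernel image
opcontrol --no-vmlinux --callgraph=10 --event=CPU_CLK_UNHALTED:6000000
opcontrol --start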