I'm trying to measure a function's performance by measuring the time for each iteration. During the process, I found even if I do nothing, the results still vary quite a bit.
e.g.
volatile long count = 0;
for (int i = 0; i < N; ++i) {
measure.begin();
++count;
measure.end();
}
In measure.end(), I measure the time difference and keep an unordered_map to keep track of the time-count. I've used clock_gettime as well as rdtsc, but there's always about 1% of the data points lie far away from mean, in a 1000 factor.
Here's what the above loop generates:
T: count percentile
18 117563 11.7563%
19 111821 22.9384%
21 201605 43.0989%
22 541095 97.2084%
23 2136 97.422%
24 2783 97.7003%
...
406 1 99.9994%
3678 1 99.9995%
6662 1 99.9996%
17945 1 99.9997%
18148 1 99.9998%
18181 1 99.9999%
22800 1 100%
mean:21
So whether it's ticks or ns, the worst case 22800 is about 1000 times bigger than mean.
I did isolcpus in grub and was running this with taskset. The simple loop almost does nothing, the hash table to do time-count statistics is outside of the time measurements.
What am I missing?
I'm running this on a laptop with ubuntu installed, CPU is Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz