I've been trying to measure/monitor the utilization of all those 60 cores on Xeon Phi (Knights Corner, in-order processors) at a relatively high frequency, say, at least every 0.1s which yields to 10Hz.
I tried the latest PAPI library. But it only supports PAPI_TOT_INS which is the counter of completed instructions. This won't work because I actually need something related to the instructions issued every 0.1s, not finished. Several instructions issued at different cycles may finish at the same cycle. The issue of instructions is influenced by whether the core is halted or not.
Other commands available like 'top' and 'perf' operate at 1Hz which is too slow for my measurement. I need a higher frequency. And, I also need to synchronize the measurement with vital phases of my codes. So, the Intel Vtune Profile does not work for me either.
Is there a possible way for me to monitor the issue of instructions on Xeon Phi or any other activities linked to their utilizations? I understand that those hardware counters are there, but to read them seems very challenging to me. Maybe I can deduce this utilization by measuring the CPU time of each thread?
Thanks.