我写了一个小程序来检查线程的性能,我从获得的结果中发现了几个问题
(我笔记本的cpu是i5 3220M)
1)每次我运行程序时需要抽出 2 个线程的时间。是因为我使用的 omp 计时器还是程序中有一些逻辑错误?
2)如果我使用cpu周期来衡量性能会更好吗?
3)时间随着线程数的增加而不断减少。我知道我的程序很简单,所以可能不需要上下文切换,但额外的性能来自哪里?因为 cpu 本身可以调整到涡轮频率?(根据英特尔网站,普通 2.6MHz,turbo 3.3MHz)
谢谢!
输出 10 亿次加 1
Average Time Elapsed for 1 threads = 3.11565(Check = 5000000000)
Average Time Elapsed for 2 threads = 4.54309(Check = 5000000000)
Average Time Elapsed for 4 threads = 2.19321(Check = 5000000000)
Average Time Elapsed for 8 threads = 2.48927(Check = 5000000000)
Average Time Elapsed for 16 threads = 1.84427(Check = 5000000000)
Average Time Elapsed for 32 threads = 1.30958(Check = 5000000000)
Average Time Elapsed for 64 threads = 1.08472(Check = 5000000000)
Average Time Elapsed for 128 threads = 0.996898(Check = 5000000000)
Average Time Elapsed for 256 threads = 1.01366(Check = 5000000000)
Average Time Elapsed for 512 threads = 0.951436(Check = 5000000000)
Average Time Elapsed for 1024 threads = 0.973331(Check = 4999997440)
程序
#include <iostream>
#include <thread>
#include <algorithm> // for_each
#include <vector>
#include <omp.h> // omp_get_wtime
class Adder{
public:
long sum;
Adder(){};
void operator()(long endVal_i){
sum = 0;
for (long i = 1; i<= endVal_i; i++)
sum++;
};
};
int main()
{
long totalCount = 1000000000;
int maxThread = 1025;
int numSample = 5;
std::vector<std::thread> threads;
Adder adderArray[maxThread];
std::cout << "Adding 1 for " << totalCount/1000000 << " million times\n\n";
for (int numThread = 1; numThread <=maxThread; numThread=numThread*2){
double avgTime=0;
long check = 0;
for (int i = 1; i<=numSample; i++){
double startTime = omp_get_wtime();
long loop = totalCount/numThread;
for (int i = 0; i<numThread;i++)
threads.push_back(std::thread(std::ref(adderArray[i]), loop));
std::for_each(threads.begin(), threads.end(),std::mem_fn(&std::thread::join));
double endTime = omp_get_wtime();
for (int i = 0; i<numThread;i++)
check += adderArray[i].sum;
threads.erase(threads.begin(), threads.end());
avgTime += endTime - startTime;
}
std::cout << "Average Time Elapsed for " << numThread<< " threads = " << avgTime/numSample << "(Check = "<<check<<")\n";
}
}