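I wrote a small program to check the performance of threads, and the results I got raise a few questions.

(My laptop's CPU is an i5 3220M.)

1) The time for 2 threads stands out every time I run the program. Is that because of the omp timer I use, or is there a logic error in the program?

2) Would it be better to measure performance in CPU cycles instead?

3) The time keeps decreasing as the number of threads increases. I know my program is simple enough that it probably needs no context switching, but where does the extra performance come from? Is it because the CPU scales itself up to its turbo frequency? (According to Intel's website: 2.6 GHz normal, 3.3 GHz turbo.)

Thanks!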

Output (adding 1 one billion times)

Average Time Elapsed for 1 threads = 3.11565(Check = 5000000000)
Average Time Elapsed for 2 threads = 4.54309(Check = 5000000000)
Average Time Elapsed for 4 threads = 2.19321(Check = 5000000000)
Average Time Elapsed for 8 threads = 2.48927(Check = 5000000000)
Average Time Elapsed for 16 threads = 1.84427(Check = 5000000000)
Average Time Elapsed for 32 threads = 1.30958(Check = 5000000000)
Average Time Elapsed for 64 threads = 1.08472(Check = 5000000000)
Average Time Elapsed for 128 threads = 0.996898(Check = 5000000000)
Average Time Elapsed for 256 threads = 1.01366(Check = 5000000000)
Average Time Elapsed for 512 threads = 0.951436(Check = 5000000000)
Average Time Elapsed for 1024 threads = 0.973331(Check = 4999997440)
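
A note on the 1024-thread row: its slightly smaller check value is consistent with integer division, since each thread runs `totalCount/numThread` iterations and 1024 does not divide 1000000000 evenly. A minimal sketch of that arithmetic, where `expectedCheck` is a hypothetical helper name that does not appear in my program:

```cpp
#include <cstdint>

// Hypothetical helper: the check value expected for a given thread count,
// assuming each thread runs totalCount / numThread iterations (integer
// division) and the sums are accumulated over numSample samples.
constexpr std::int64_t expectedCheck(std::int64_t totalCount,
                                     std::int64_t numThread,
                                     std::int64_t numSample) {
    return (totalCount / numThread) * numThread * numSample;
}

// 1000000000 / 1024 truncates to 976562, so 512 increments are lost per sample.
static_assert(expectedCheck(1000000000, 1024, 5) == 4999997440,
              "matches the 1024-thread row");
// 512 divides 1000000000 evenly, so nothing is lost.
static_assert(expectedCheck(1000000000, 512, 5) == 5000000000,
              "matches the 512-thread row");
```

So the check mismatch at 1024 threads looks like a rounding effect rather than a race between threads.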

Program

#include <iostream>
#include <thread>
#include <algorithm>    // for_each
#include <functional>   // ref, mem_fn
#include <vector>
#include <omp.h>        // omp_get_wtime

// Functor that counts from 1 to endVal_i, one increment at a time.
class Adder{
public:
    long sum = 0;
    void operator()(long endVal_i){
        sum = 0;
        for (long i = 1; i<= endVal_i; i++)
            sum++;
    }
};

int main()
{
    const long totalCount = 1000000000;
    // const so the array bound below is a constant expression;
    // numThread doubles each round, so the last run uses 1024 threads
    const int maxThread = 1025;
    const int numSample = 5;

    std::vector<std::thread> threads;
    Adder adderArray[maxThread];

    std::cout << "Adding 1 for " << totalCount/1000000 << " million times\n\n";

    for (int numThread = 1; numThread <= maxThread; numThread = numThread*2){

        double avgTime = 0;
        long check = 0;     // accumulated over all samples, so it should be numSample * totalCount

        for (int sample = 1; sample <= numSample; sample++){
            double startTime = omp_get_wtime();

            // each thread gets an equal share; integer division drops any remainder
            long loop = totalCount/numThread;
            for (int t = 0; t < numThread; t++)
                threads.push_back(std::thread(std::ref(adderArray[t]), loop));

            std::for_each(threads.begin(), threads.end(), std::mem_fn(&std::thread::join));

            double endTime = omp_get_wtime();

            for (int t = 0; t < numThread; t++)
                check += adderArray[t].sum;

            threads.clear();

            avgTime += endTime - startTime;
        }

        std::cout << "Average Time Elapsed for " << numThread << " threads = " << avgTime/numSample << "(Check = " << check << ")\n";
    }

}
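
To check whether the omp timer is what makes the 2-thread result stand out (question 1), one option would be to repeat the measurement with `std::chrono::steady_clock` from the standard library. The following is only a sketch under my own assumptions: `timeRun` is a hypothetical helper that is not part of the program above, and each thread counts in a local variable and writes its result once at the end, which differs from how `Adder` updates `sum`.

```cpp
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical helper (not part of the program above): starts numThread
// threads that each increment a local counter `loop` times, stores the
// combined count in `total`, and returns the wall-clock time in seconds
// measured with std::chrono::steady_clock.
double timeRun(int numThread, long loop, long long& total) {
    std::vector<long long> sums(numThread, 0);
    std::vector<std::thread> threads;
    threads.reserve(numThread);

    auto start = std::chrono::steady_clock::now();
    for (int t = 0; t < numThread; ++t) {
        threads.emplace_back([&sums, t, loop] {
            // volatile keeps the compiler from collapsing the loop
            volatile long long local = 0;
            for (long i = 0; i < loop; ++i)
                local = local + 1;
            sums[t] = local;   // one write to shared storage per thread
        });
    }
    for (auto& th : threads)
        th.join();
    auto end = std::chrono::steady_clock::now();

    total = 0;
    for (long long s : sums)
        total += s;
    return std::chrono::duration<double>(end - start).count();
}
```

Calling `timeRun(2, totalCount/2, total)` and comparing the result with the `omp_get_wtime` figure for 2 threads should show whether the two timers agree; if they do, the timer is probably not the cause.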