有时可以在这种情况下提供帮助的一个技巧是按类型对向量进行排序(您应该能够使用 add() 函数中可用类型的知识来强制执行此操作),如果元素的顺序没有,否则事情。如果您主要是为了调用虚函数而遍历向量,这将有助于 CPU 的分支预测器预测调用的目标。或者,您可以在管理器中为每种类型维护单独的向量并依次迭代它们,这具有类似的效果。
您的编译器的优化器也可以帮助您处理此类代码,特别是如果它支持 Profile Guided Optimization (POGO)。编译器可以在某些情况下对调用进行去虚拟化,或者使用 POGO 可以在生成的程序集中做一些事情来帮助 CPU 的分支预测器,比如测试最常见的类型,并对那些回退到间接调用的直接调用不太常见的类型。
这是一个测试程序的结果,它说明了按类型排序的性能优势,Manager 是您的版本,Manager2 维护一个按 typeid 索引的向量哈希表:
Derived1::count = 50043000, Derived2::count = 49957000
class Manager::funcAll took 714ms
Derived1::count = 50043000, Derived2::count = 49957000
class Manager2::funcAll took 274ms
Derived1::count = 50043000, Derived2::count = 49957000
class Manager2::funcAll took 273ms
Derived1::count = 50043000, Derived2::count = 49957000
class Manager::funcAll took 714ms
测试代码:
#include <iostream>
#include <vector>
#include <memory>
#include <random>
#include <unordered_map>
#include <typeindex>
#include <chrono>
using namespace std;
using namespace std::chrono;
static const int instanceCount = 100000;
static const int funcAllIterations = 1000;
static const int numTypes = 2;
struct Base { virtual void func() = 0; };
struct Derived1 : Base { static int count; void func() override { ++count; } };
int Derived1::count = 0;
struct Derived2 : Base { static int count; void func() override { ++count; } };
int Derived2::count = 0;
class Manager {
vector<unique_ptr<Base>> items;
public:
template<class T> void add() { items.emplace_back(new T); }
void funcAll() { for (auto& i : items) i->func(); }
};
class Manager2 {
unordered_map<type_index, vector<unique_ptr<Base>>> items;
public:
template<class T> void add() { items[type_index(typeid(T))].push_back(make_unique<T>()); }
void funcAll() {
for (const auto& type : items) {
for (auto& i : type.second) {
i->func();
}
}
}
};
template<typename Man>
void Test() {
mt19937 engine;
uniform_int_distribution<int> d(0, numTypes - 1);
Derived1::count = 0;
Derived2::count = 0;
Man man;
for (auto i = 0; i < instanceCount; ++i) {
switch (d(engine)) {
case 0: man.add<Derived1>(); break;
case 1: man.add<Derived2>(); break;
}
}
auto startTime = high_resolution_clock::now();
for (auto i = 0; i < funcAllIterations; ++i) {
man.funcAll();
}
auto endTime = high_resolution_clock::now();
cout << "Derived1::count = " << Derived1::count << ", Derived2::count = " << Derived2::count << "\n"
<< typeid(Man).name() << "::funcAll took " << duration_cast<milliseconds>(endTime - startTime).count() << "ms" << endl;
}
int main() {
Test<Manager>();
Test<Manager2>();
Test<Manager2>();
Test<Manager>();
}