c++ - 实体组件系统的自定义堆预分配

Question

我有一个 OOP 实体组件系统，目前的工作方式如下：

// In the component system
struct Component { virtual void update() = 0; }
struct Entity
{
    bool alive{true};
    vector<unique_ptr<Component>> components;
    void update() { for(const auto& c : components) c->update(); }
}

// In the user application
struct MyComp : Component
{
    void update() override { ... }
}

为了创建新的实体和组件，我使用 C++ 的常用newand delete：

// In the component system
struct Manager
{
    vector<unique_ptr<Entity>> entities;
    Entity& createEntity() 
    { 
        auto result(new Entity);
        entities.emplace_back(result);
        return *result;
    }
    template<typename TComp, typename... TArgs>
        TComp& createComponent(Entity& mEntity, TArgs... mArgs)
    {
        auto result(new TComp(forward<TArgs>(mArgs)...));
        mEntity.components.emplace_back(result);
        return result;
    }
    void removeDead() { /* remove all entities with 'alive == false' - 'delete' is called here by the 'unique_ptr' */ }
}

// In the user application
{
    Manager m;
    auto& myEntity(m.createEntity());
    auto& myComp(m.createComponent<MyComp>(myEntity));
    // Do stuff with myEntity and myComp
    m.removeDead();
}

该系统运行良好，我喜欢它的语法和灵活性。但是，当不断向管理器添加和删除实体和组件时，内存分配/释放会减慢应用程序的速度。（我已经分析并确定减速是由newand引起的delete）。

我最近读到可以在 C++ 中预先分配堆内存 - 如何将其应用于我的情况？

期望的结果：

// In the user application
{
    Manager m{1000}; 
    // This manager can hold about 1000 entities with components 
    // (may not be 1000 because of dynamic component size, 
    // since the user can define it's on components, but it's ok for me)

    auto& myEntity(m.createEntity());
    auto& myComp(m.createComponent<MyComp>(myEntity));
    // Do stuff with myEntity and myComp

    m.removeDead(); 
    // No 'delete' is called here! Memory of the 'dead' entities can
    // be reused for new entity creation
}
// Manager goes out of scope: 'delete' is called here

score 6 · Accepted Answer

您可以做一些事情来更好地实现您的设计规模。

Entity在您当前的实现中，每个和有两个内存分配Component。第一个分配一个对象，第二个在对象放入向量时分配。第二种情况发生在向量用完空间并分配更大的数组并将旧元素移动到新数组中时。

在这种情况下，您能做的最好的事情就是使用侵入式列表。也就是说，每个Entity和Component也成为列表节点。然后，在这些被分配之后，不需要额外的内存分配来将对象放入列表中。使用来自Boost.Intrusive的单链表或双链表，或者自己编写。这就是 Linux 内核跟踪许多不同对象的方式。

下一步是预分配Entity和Component元素。预分配可以像全局数组一样简单，也可以更复杂，例如Boost.Pool。有很多方法可以构建对象的内存池。

一旦Entity和Component被预先分配并使用侵入式列表，您就完成了。

使用 boost 组件的示例：

#include <boost/intrusive/list.hpp>
#include <boost/pool/pool_alloc.hpp>
#include <new>

namespace bi = boost::intrusive;

// api.h

//
// Object pooling support begin.
//
template<class T>
struct Pool
{
    static boost::pool_allocator<T> pool;
};

// Singleton. Although it is defined in the header, the linkers
// make sure there is only one instance of it in the application.
// It is instantiated on demand when Pool<T> is used.
template<class T>
boost::pool_allocator<T> Pool<T>::pool;

template<class Derived>
struct Pooled // use it on the most derived class only, not on intermediate base classes
{
    // Automatically use the object pool for plain new/delete.
    static void* operator new(size_t) { return Pool<Derived>::pool.allocate(1); }
    static void operator delete(void* p) { return Pool<Derived>::pool.deallocate(static_cast<Derived*>(p), 1); }
};
//
// Object pooling support end.
//

// Using bi::list_base_hook<bi::link_mode<bi::auto_unlink> > because it automatically
// unlinks from the list when the object is destroyed. No need to manually
// remove the object from the list when an object is about to be destroyed.

struct Component
    : bi::list_base_hook<bi::link_mode<bi::auto_unlink> > // make it an intrusive list node
{
    virtual void update() = 0;
    virtual ~Component() {}
};

struct Entity
    : bi::list_base_hook<bi::link_mode<bi::auto_unlink> > // make it an intrusive list node
    , Pooled<Entity> // optional, make it allocated from the pool
{
    bool active = false;

    bi::list<Component, bi::constant_time_size<false> > components;

    ~Entity() {
        for(auto i = components.begin(), j = components.end(); i != j;)
            delete &*i++; // i++ to make sure i stays valid after the object is destroyed
    }

    void update() {
        for(auto& c : components)
            c.update();
    }
};

struct Manager
{
    bi::list<Entity, bi::constant_time_size<false> > entities;

    ~Manager() {
        for(auto i = entities.begin(), j = entities.end(); i != j;)
            delete &*i++; // i++ to make sure i stays valid after the object is destroyed
    }

    Entity& createEntity() {
        auto result = new Entity;
        entities.push_back(*result);
        return *result;
    }

    template<typename TComp, typename... TArgs>
    TComp& createComponent(Entity& mEntity, TArgs... mArgs)
    {
        auto result = new TComp(std::forward<TArgs>(mArgs)...);
        mEntity.components.push_back(*result);
        return *result;
    }

    void removeDead() {
        for(auto i = entities.begin(), j = entities.end(); i != j;) {
            auto& entity = *i++;
            if(!entity.active)
                delete &entity;
        }
    }
};

// user.cc
struct MyComp
    : Component
    , Pooled<MyComp> // optional, make it allocated from the pool
{
    void update() override {}
};

int main() {
    Manager m;
    auto& myEntity(m.createEntity());
    auto& myComp(m.createComponent<MyComp>(myEntity));
    m.removeDead();
}

在上面的例子中，boost::pool_allocator<T>实际上是new用来分配对象，然后它不断地重用被销毁的对象，而不是调用delete它们。您可以通过预先分配所有对象来做得更好，但是根据您的要求有很多方法可以做到这一点，所以boost::pool_allocator<T>为了简单起见，我在这里使用以避免头发分裂。您可以将实现更改为Pooled<T>类似于Pooled<T, N>whereN代表最大对象数的东西，其余代码保持不变，因为它使用 plainnew/delete恰好被从池中分配的对象覆盖。

score 1 · Accepted Answer

C++ 支持这种类型的特定于类的内存池。通用new/delete对不可避免地在

搜索大小合适的空闲块以满足每个请求所花费的时间
合并空闲块所花费的时间
花费时间维护并可能重新组织内部数据结构以使上述两个操作更快。

获得速度的主要方法是使用自定义分配器完全避免这些折衷 - 正如您所说 - 预分配一大块内存，这些内存被视为一个简单的大小相同的空闲对象数组。最初，这些都链接在一个空闲列表上，其中链接指针占据每个“覆盖”数据最终将到达的块的第一个字节。分配只是从空闲列表的头部解开一个块 - 一个需要大约 2 条指令的“弹出”操作。释放是一个“推：”另外两个指令。在许多情况下，可以将内存硬件设置为在池为空时生成陷阱，这样就没有每次分配的开销来检测这种错误情况。（在 GC 系统中，相同的技巧用于启动收集而没有开销。）

在您的情况下，您需要两个池：一个用于实体，一个用于组件。

定义自己的池分配器并不难，特别是如果您的应用程序是单线程的。有关教程处理，请参阅此文档。

score 0 · Accepted Answer

您的问题之一可以通过在创建向量时分配足够的空间来解决

为了

vector<unique_ptr<Entity>> entities;

在构造函数中提供足够的空间

Manager::Manager() : entities(10000)
{
    //...
}

因此，您可以避免在后期阶段进行重新分配和复制。

第二个问题是unique_ptr<Entity>指针的创建。在这里，由于您将始终使用默认构造的对象，您还可以使用预分配的对象池，从中创建指针。而不是调用 new 你会调用一个自己的类

class EntityPool
{
public:
     EntityPool(unsigned int size = 10000) : pool(size), nextEntity(0)
     {
     }
     Entity* getNext(void)
     {
         if (nextEntity != pool.size()) // if pool is exhausted create new
         {
             pool.emplace_back(Entity());
         }                 
         return pool[nextEntity++];
     }
private:
     vector<Entity> pool;
     unsigned int nextEntity; // index into the vector to the next Entity
};

struct Manager
{
    vector<unique_ptr<Entity>> entities;
    Entity& createEntity() 
    { 
        entities.emplace_back(entityPoolInstance.getNext());
        return *result;
    }
 //...

score 0 · Accepted Answer

或者您可以连接标准的“新位置”。这允许您分配一大块内存以根据您的选择构造（放置）对象。这将在您需要时将块保留在堆上，并允许您将多个短期对象分配到该块中，而不是进行最终导致堆碎片化的昂贵分配和取消分配。这涉及到一些问题，但总而言之，它是一个非常简单的解决方案，而无需遵循自定义内存管理器路线。这是消除一些陷阱的绝佳方法并详细描述新的安置。我使用了像堆栈一样简单的数据结构来跟踪要分配到的下一个空闲块：将即将被删除的块的地址压入堆栈。分配时只需将下一个空闲块从堆栈中弹出并将其用作放置新块的参数。超级简单，超级快！

score 0 · Accepted Answer

使用大多数答案和谷歌作为参考，我在我的SSVUtils 库中实现了一些预分配实用程序。

Prealloc.h

例子：

using MemUnit = char;
using MemUnitPtr = MemUnit*;
using MemSize = decltype(sizeof(MemUnit)); // Should always be 1 byte

class MemBuffer
{
    Uptr<MemUnit[]> buffer;
    MemRange range;

    MemBuffer(MemSize mSize) : ... 
    { 
        // initialize buffer from mSize
    }
};

class PreAllocatorChunk
{
    protected:
        MemSize chunkSize;
        MemBuffer buffer;
        std::stack<MemRange> available;

    public:
        PreAllocatorChunk(MemSize mChunkSize, unsigned int mChunks) : ...
        {
            // Add "chunks" to to available...
        }

        template<typename T, typename... TArgs> T* create(TArgs&&... mArgs)
        {
            // create on first "chunk" using placement new
            auto toUse(available.top().begin); available.pop();
            return new (toUse) T{std::forward<TArgs>(mArgs)...};
        }
};

更多预分配实用程序可用：

PreAllocatorDynamic: 预分配一个大缓冲区，然后，在创建对象时，将缓冲区分成两部分：
- [buffer start, buffer start + obj size)
- [buffer start + obj size, buffer end)
当一个对象被销毁时，其占用的内存范围被设置为“可用”。如果在创建新对象期间没有找到足够大的“块”，则预分配器会在抛出运行时异常之前尝试统一连续的内存块。这个预分配器有时比快new/delete，但它很大程度上取决于预分配缓冲区的大小。
PreAllocatorStatic<T>: 继承自PreAllocatorChunk. 块的大小等于sizeof(T)。最快的预分配器，不太灵活。几乎总是比new/delete.

c++ - 实体组件系统的自定义堆预分配

5 回答 5

Related

Reference