31

Here is the code I normally use to get aligned memory with Visual Studio and GCC

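#include <stdlib.h>   /* posix_memalign, free */
#ifdef _MSC_VER
#include <malloc.h>   /* _aligned_malloc, _aligned_free */
#endif

inline void* aligned_malloc(size_t size, size_t align) {
    void *result;
#ifdef _MSC_VER
    result = _aligned_malloc(size, align);
#else
    /* posix_memalign returns nonzero on failure */
    if (posix_memalign(&result, align, size)) result = 0;
#endif
    return result;
}

inline void aligned_free(void *ptr) {
#ifdef _MSC_VER
    _aligned_free(ptr);
#else
    free(ptr);
#endif
}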

Is this code fine in general? I have also seen people use _mm_malloc, _mm_free. In most cases that I want aligned memory it's to use SSE/AVX. Can I use those functions in general? It would make my code a lot simpler.

Lastly, it's easy to create my own function to align memory (see below). Why then are there so many different common functions to get aligned memory (many of which only work on one platform)?

This code does 16 byte alignment.

// needs <stdlib.h> for malloc/free and <stdint.h> for uintptr_t
float* array = (float*)malloc(SIZE*sizeof(float)+15);

// find the aligned position
// and use this pointer to read or write data into array
// (uintptr_t instead of unsigned long, which is only 32 bits on 64-bit Windows)
float* alignedArray = (float*)(((uintptr_t)array + 15) & ~(uintptr_t)0x0F);

// deallocate the original "array", NOT alignedArray
free(array);
array = alignedArray = 0;

See: http://www.songho.ca/misc/alignment/dataalign.html and How to allocate aligned memory only using the standard library?

Edit: In case anyone cares, I got the idea for my aligned_malloc() function from Eigen (Eigen/src/Core/util/Memory.h)

Edit: I just discovered that posix_memalign is undefined for MinGW. However, _mm_malloc works for Visual Studio 2012, GCC, MinGW, and the Intel C++ compiler so it seems to be the most convenient solution in general. It also requires using its own _mm_free function, although on some implementations you can pass pointers from _mm_malloc to the standard free / delete.
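
For reference, a minimal sketch of what the _mm_malloc route could look like in C++. The names MmFreeDeleter, SseFloats, make_sse_floats and scale_in_place are made up for illustration, and the sketch assumes an x86 target whose <xmmintrin.h> provides _mm_malloc and _mm_free. Wrapping the pointer in a std::unique_ptr with a custom deleter makes sure it always goes back through _mm_free rather than free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <xmmintrin.h>   // SSE intrinsics; assumed to also provide _mm_malloc / _mm_free

// Hypothetical deleter: memory from _mm_malloc must be released with _mm_free.
struct MmFreeDeleter {
    void operator()(void* p) const noexcept { _mm_free(p); }
};
using SseFloats = std::unique_ptr<float[], MmFreeDeleter>;

// Allocates count floats on a 16-byte boundary; holds nullptr on failure.
inline SseFloats make_sse_floats(std::size_t count) {
    void* raw = _mm_malloc(count * sizeof(float), 16);
    return SseFloats(static_cast<float*>(raw));
}

// Multiplies count floats in place, four at a time.
// Assumes data is 16-byte aligned and count is a multiple of 4.
inline void scale_in_place(float* data, std::size_t count, float factor) {
    assert(count % 4 == 0);
    assert(reinterpret_cast<std::uintptr_t>(data) % 16 == 0);
    const __m128 f = _mm_set1_ps(factor);
    for (std::size_t i = 0; i < count; i += 4) {
        _mm_store_ps(data + i, _mm_mul_ps(_mm_load_ps(data + i), f));
    }
}
```

The aligned load and store intrinsics (_mm_load_ps, _mm_store_ps) are the reason the 16-byte alignment matters here.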

4

5 Answers

14

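Your approach is fine as long as you are OK with calling a special function to do the freeing. I would do your #ifdefs the other way around, though: start with the standard-specified options and fall back to the platform-specific ones. For example:

  1. If __STDC_VERSION__ >= 201112L, use aligned_alloc.
  2. If _POSIX_VERSION >= 200112L, use posix_memalign.
  3. If _MSC_VER is defined, use the Windows stuff.
  4. ...
  5. If all else fails, just use malloc/free and disable the SSE/AVX code.

The problem is harder if you want to be able to pass the allocated pointer to free. That is valid with all the standard interfaces, but not on Windows, and not necessarily with the legacy memalign function that some Unix-like systems have.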
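
The fallback order above could be sketched roughly as follows. This is only an illustration under some assumptions: it is written as C++, so it tests __cplusplus >= 201703L where a C build would test __STDC_VERSION__; it skips aligned_alloc on Windows because the Microsoft runtime does not provide it; and every macro and function name (ALIGNED_BACKEND_*, portable_aligned_malloc, portable_aligned_free, is_aligned) is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <stdlib.h>      // global-namespace aligned_alloc / posix_memalign

#if defined(__unix__) || defined(__APPLE__)
#  include <unistd.h>    // defines _POSIX_VERSION where available
#endif

// Pick a backend, standard options first.
#if defined(__cplusplus) && __cplusplus >= 201703L && !defined(_WIN32)
#  define ALIGNED_BACKEND_STD 1      // C11 / C++17 aligned_alloc
#elif defined(_POSIX_VERSION) && _POSIX_VERSION >= 200112L
#  define ALIGNED_BACKEND_POSIX 1    // posix_memalign
#elif defined(_MSC_VER)
#  include <malloc.h>
#  define ALIGNED_BACKEND_MSVC 1     // _aligned_malloc / _aligned_free
#else
#  define ALIGNED_BACKEND_NONE 1     // plain malloc: callers should disable SSE/AVX paths
#endif

// align is assumed to be a power of two and at least sizeof(void*),
// which satisfies the requirements of every backend used here.
inline void* portable_aligned_malloc(std::size_t size, std::size_t align) {
    assert(align >= sizeof(void*) && (align & (align - 1)) == 0);
#if defined(ALIGNED_BACKEND_STD)
    // C11 originally required size to be a multiple of align, so round up.
    std::size_t rounded = (size + align - 1) / align * align;
    return ::aligned_alloc(align, rounded);
#elif defined(ALIGNED_BACKEND_POSIX)
    void* p = nullptr;
    return posix_memalign(&p, align, size) == 0 ? p : nullptr;
#elif defined(ALIGNED_BACKEND_MSVC)
    return _aligned_malloc(size, align);
#else
    (void)align;                     // no alignment guarantee in this branch
    return std::malloc(size);
#endif
}

// Only the Windows backend needs a special free function.
inline void portable_aligned_free(void* p) {
#if defined(ALIGNED_BACKEND_MSVC)
    _aligned_free(p);
#else
    std::free(p);
#endif
}

// Small helper to check the result.
inline bool is_aligned(const void* p, std::size_t align) {
    return reinterpret_cast<std::uintptr_t>(p) % align == 0;
}
```

Code that depends on alignment can then check ALIGNED_BACKEND_NONE at compile time and fall back to unaligned loads or a scalar path.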

answered 2013-05-04T17:47:17.920
5

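The first function you propose will indeed work fine.

Your "homebrew" function also works, but it has the drawback that if the value is already aligned, you have just wasted 15 bytes. Sometimes that does not matter, but the OS may well be able to hand out correctly aligned memory without any waste (and if you need alignment to 256 or 4096 bytes, adding "alignment-1" bytes risks wasting a lot of memory).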
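
To put numbers on that overhead, here is a small sketch of the arithmetic behind the over-allocate-and-round-up scheme. The helper names align_up and padded_request are made up for this example, and the alignment is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rounds addr up to the next multiple of alignment (a power of two).
inline std::uintptr_t align_up(std::uintptr_t addr, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    return (addr + mask) & ~mask;
}

// Bytes that must be requested from malloc so that an aligned block of
// `size` bytes always fits, wherever malloc happens to place the buffer.
constexpr std::size_t padded_request(std::size_t size, std::size_t alignment) {
    return size + (alignment - 1);
}

// 16-byte alignment costs 15 extra bytes per allocation...
static_assert(padded_request(64, 16) == 79, "64 bytes at 16-byte alignment");
// ...but page alignment costs 4095 extra bytes for the same 64-byte payload.
static_assert(padded_request(64, 4096) == 4159, "64 bytes at 4096-byte alignment");
```

An address that is already aligned, such as 0x1000 for 16-byte alignment, is returned unchanged by align_up, so in that case the whole padding goes unused.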

answered 2013-05-04T17:43:11.130
2

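Here is a fixed version of user2093113's sample; the code as posted did not build for me (void* has unknown size, so pointer arithmetic on it fails). I also put it in a template class that overrides operator new/delete, so you don't have to do the allocation and call placement new yourself.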

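#include <cstddef>
#include <cstdlib>   // std::malloc, std::free
#include <cstring>   // std::memcpy
#include <memory>    // std::align
#include <new>       // std::bad_alloc

template<std::size_t Alignment>
class Aligned
{
public:
    void* operator new(std::size_t size)
    {
        // Room for the object, worst-case padding, and the saved original pointer.
        std::size_t space = size + (Alignment - 1);
        void *ptr = std::malloc(space + sizeof(void*));
        if (!ptr) throw std::bad_alloc();
        void *original_ptr = ptr;

        // Skip the slot reserved for the original pointer.
        char *ptr_bytes = static_cast<char*>(ptr);
        ptr_bytes += sizeof(void*);
        ptr = static_cast<void*>(ptr_bytes);

        ptr = std::align(Alignment, size, ptr, space);

        // Store the original pointer just before the aligned block.
        ptr_bytes = static_cast<char*>(ptr);
        ptr_bytes -= sizeof(void*);
        std::memcpy(ptr_bytes, &original_ptr, sizeof(void*));

        return ptr;
    }

    void operator delete(void* ptr)
    {
        if (!ptr) return;

        // Recover the pointer that malloc returned and free that one.
        char *ptr_bytes = static_cast<char*>(ptr);
        ptr_bytes -= sizeof(void*);

        void *original_ptr;
        std::memcpy(&original_ptr, ptr_bytes, sizeof(void*));

        std::free(original_ptr);
    }
};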

Use it like this:

class Camera : public Aligned<16>
{
};

I have not tested this code across platforms yet.

answered 2013-08-03T18:36:48.187
1

If your compiler supports it, C++11 adds a std::align function that does runtime pointer alignment. You could implement your own malloc/free like this (untested):

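#include <cstddef>
#include <cstdlib>   // std::malloc, std::free
#include <cstring>   // std::memcpy
#include <memory>    // std::align

template<std::size_t Align>
void *aligned_malloc(std::size_t size)
{
    // Room for the payload, worst-case padding, and the saved original pointer.
    std::size_t space = size + (Align - 1);
    void *ptr = std::malloc(space + sizeof(void*));
    if (!ptr) return nullptr;
    void *original_ptr = ptr;

    // Skip the slot reserved for the original pointer.
    char *ptr_bytes = static_cast<char*>(ptr);
    ptr_bytes += sizeof(void*);
    ptr = static_cast<void*>(ptr_bytes);

    ptr = std::align(Align, size, ptr, space);

    // Store the original pointer just before the aligned block.
    ptr_bytes = static_cast<char*>(ptr);
    ptr_bytes -= sizeof(void*);
    std::memcpy(ptr_bytes, &original_ptr, sizeof(void*));

    return ptr;
}

inline void aligned_free(void* ptr)
{
    if (!ptr) return;

    // Pointer arithmetic needs char*, not void*.
    char *ptr_bytes = static_cast<char*>(ptr);
    ptr_bytes -= sizeof(void*);

    void *original_ptr;
    std::memcpy(&original_ptr, ptr_bytes, sizeof(void*));

    std::free(original_ptr);
}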

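Then you don't have to keep the original pointer value around in order to free it. I am not sure whether this is 100% portable, but if it is not, I hope someone will correct me!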

answered 2013-05-04T18:43:28.560
0

Here are my 2 cents:

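// needs <cstdint> for std::uintptr_t
temp = new unsigned char*[num];
AlignedBuffers = new unsigned char*[num];
for (int i = 0; i < num; i++)
{
    // over-allocate by 15 bytes so a 16-byte boundary always falls inside the buffer
    temp[i] = new unsigned char[bufferSize + 15];
    // round the address up to a multiple of 16: 16-byte alignment in preparation for SSE
    // (keep temp[i] for delete[]; never delete AlignedBuffers[i])
    AlignedBuffers[i] = reinterpret_cast<unsigned char*>(
        (reinterpret_cast<std::uintptr_t>(temp[i]) + 15) & ~static_cast<std::uintptr_t>(15));
}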
answered 2013-05-11T21:25:00.007