c++ - Why is sizeof...(T) so slow? Implementing C++14 make_index_sequence without sizeof...(T)

Question

I found an implementation of the C++14 make_index_sequence 'algorithm':

template< int ... > struct index_sequence{   using type = index_sequence; };

template< typename T> using invoke = typename T :: type ;

template< typename T, typename U > struct concate;
template< int ...i, int ... j>
struct concate< index_sequence<i...>, index_sequence<j...> >
        : index_sequence< i... ,  (j + sizeof ... (i ) )... > {};
  //                                   \          /
  //                                    ----------
 //                                   I suspect this part is slow.
template< int n>
struct make_index_sequence_help : concate< 
                          invoke< make_index_sequence_help<n/2>>,
                          invoke< make_index_sequence_help<n-n/2>>
                          > {};

template<> struct make_index_sequence_help <0> : index_sequence<>{};
template<> struct make_index_sequence_help <1> : index_sequence<0>{};

template< int n> using make_index_sequence = invoke< make_index_sequence_help<n> >;


int main()
{
    using iseq = make_index_sequence< 1024 > ; // successful
    using jseq = make_index_sequence< 1024 * 16 > ; // takes a very long time to compile
    using kseq = make_index_sequence< 1024 * 64 > ; // fails to compile: memory exhausted
}

However, when I replace sizeof...(i) in 'concate' with a concrete number, make_index_sequence<1024 * 64> compiles very quickly.

template< int s, typename T, typename U > struct concate;
template< int s, int ...i, int ...j >
struct concate< s, index_sequence<i...>, index_sequence<j...> >
 :  index_sequence< i..., ( j + s ) ... > {};

// and 
template< int n >
struct make_index_sequence_help : concate<
                                  n / 2 , 
                          invoke< make_index_sequence_help< n / 2 > >,
                          invoke< make_index_sequence_help< n - n/2 > >
                           >{};
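
For reference, the pieces of the second variant can be assembled into a single translation unit. This is only a sketch of the code in the question: the `sketch` namespace and the `static_assert` checks are additions for illustration, and it is assumed to be built with `-std=c++11` or later.

```cpp
#include <type_traits>

namespace sketch {  // hypothetical namespace, only to keep the names separate from std

template <int...> struct index_sequence { using type = index_sequence; };

template <typename T> using invoke = typename T::type;

// Concatenate two sequences, shifting the second one by an explicit offset s
// instead of evaluating sizeof...(i) inside the pack expansion.
template <int s, typename T, typename U> struct concate;
template <int s, int... i, int... j>
struct concate<s, index_sequence<i...>, index_sequence<j...>>
    : index_sequence<i..., (j + s)...> {};

// Split n into two halves, build each half, then concatenate.
// The left half has n / 2 elements, so that is the offset for the right half.
template <int n>
struct make_index_sequence_help
    : concate<n / 2,
              invoke<make_index_sequence_help<n / 2>>,
              invoke<make_index_sequence_help<n - n / 2>>> {};

// Base cases that stop the recursion.
template <> struct make_index_sequence_help<0> : index_sequence<> {};
template <> struct make_index_sequence_help<1> : index_sequence<0> {};

template <int n>
using make_index_sequence = invoke<make_index_sequence_help<n>>;

// Illustrative check: the result for n = 5 is 0, 1, 2, 3, 4.
static_assert(std::is_same<make_index_sequence<5>,
                           index_sequence<0, 1, 2, 3, 4>>::value,
              "unexpected sequence for n = 5");

}  // namespace sketch
```

The `static_assert` only confirms that the explicit offset produces the same sequence as the `sizeof...` version would; it does not measure compile time.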

Question: why is sizeof...(i) so slow?

Update: I tested with gcc 4.8.1:

First case (1024 and 1024*16 only):

g++  -Wall  -c "ctx_fptr.cpp"   -g  -O2 -std=c++11 -ftime-report
Execution times (seconds)
 garbage collection    :   0.06 ( 1%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 preprocessing         :   0.03 ( 0%) usr   0.04 ( 2%) sys   0.09 ( 1%) wall     293 kB ( 0%) ggc
 parser                :  10.41 (97%) usr   1.61 (95%) sys  12.01 (96%) wall 2829842 kB (99%) ggc
 name lookup           :   0.12 ( 1%) usr   0.04 ( 2%) sys   0.23 ( 2%) wall    7236 kB ( 0%) ggc
 dead store elim1      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 symout                :   0.15 ( 1%) usr   0.00 ( 0%) sys   0.15 ( 1%) wall   12891 kB ( 0%) ggc
 unaccounted todo      :   0.00 ( 0%) usr   0.01 ( 1%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                 :  10.78             1.70            12.55            2850835 kB

Second case (all of 1024, 1024*16 and 1024*64):

g++  -Wall  -c "ctx_fptr.cpp"   -g  -O2 -std=c++11 -ftime-report 
Execution times (seconds)
 preprocessing         :   0.02 ( 2%) usr   0.01 ( 5%) sys   0.05 ( 4%) wall     293 kB ( 0%) ggc
 parser                :   0.54 (45%) usr   0.10 (53%) sys   0.71 (50%) wall   95339 kB (58%) ggc
 name lookup           :   0.47 (39%) usr   0.04 (21%) sys   0.47 (33%) wall   20197 kB (12%) ggc
 tree PRE              :   0.01 ( 1%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       1 kB ( 0%) ggc
 varconst              :   0.00 ( 0%) usr   0.01 ( 5%) sys   0.00 ( 0%) wall      17 kB ( 0%) ggc
 symout                :   0.17 (14%) usr   0.03 (16%) sys   0.18 (13%) wall   47092 kB (29%) ggc
 TOTAL                 :   1.21             0.19             1.41             163493 kB

score 3 · Accepted Answer

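The compile is slow and uses a lot of memory because you are recursively expanding templates. This happens at compile time and creates a large number of types, which consumes a lot of memory. It is not caused by sizeof or by any other individual statement. It is the recursion that makes every bit of the template expansion expensive.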

I ran into the same problem in VC++: I found that my compiles became very slow when I passed large constants to a template function I had written.

Of course, in my case I was trying to make the compiler slow. But maybe this will help:

https://randomascii.wordpress.com/2014/03/10/making-compiles-slow/
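
One way to get a rough feel for how many distinct `make_index_sequence_help<n>` specializations the halving recursion requests is to replay it at run time. The sketch below uses hypothetical helper names that appear in neither the question nor the answer. It models only the number of distinct helper arguments, not the cost of each specialization (whose parameter pack can hold up to n elements), so it is not a measurement of compile time or memory.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical helper: record every distinct n for which
// make_index_sequence_help<n> would be requested by the recursion
// n -> (n / 2, n - n / 2), with 0 and 1 as base cases.
inline void collect_helper_args(int n, std::set<int>& seen) {
    if (!seen.insert(n).second) return;  // already recorded; a compiler reuses the specialization
    if (n <= 1) return;                  // explicit specializations stop the recursion
    collect_helper_args(n / 2, seen);
    collect_helper_args(n - n / 2, seen);
}

// Number of distinct helper specializations requested for a given n.
inline std::size_t count_helper_specializations(int n) {
    std::set<int> seen;
    collect_helper_args(n, seen);
    return seen.size();
}

// Small self-check: for a power of two the chain is n, n/2, ..., 1.
inline void self_check() {
    assert(count_helper_specializations(1024) == 11);
}
```

Under this model, `count_helper_specializations(1024)` is 11, corresponding to the chain 1024, 512, ..., 1.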
