
I wrote a program in C that writes another program as its output.

The goal is to test performance aspects of monolithic programs.

The first test was configured with 10,000 iterations; the resulting program compiled and ran. The second test, with 100,000 iterations, has been compiling on an i7 3770 with 16 GB RAM (plus 16 GB swap) under Ubuntu 12.04 x86_64 for 3,030 minutes so far.

I know that parsing complexity is somewhere between O(n^2) and O(n^3), but this is taking far too long. Going from 10,000 to 100,000 iterations is a tenfold increase in program size, so even the cubic worst case predicts at most a 1,000× increase in compile time.

The compilation is consuming 35.2% of memory and still climbing.

My questions are:

  • Does GCC have a limit on the number of variables per module, or on the size of a module?

  • Is this a bug?

The original program generator is:

#include <stdio.h>

#define MAX_INTERACTION 100000

int main(int argc, char **argv)
{
    FILE *fp = fopen("source.c", "w");

    fprintf(fp, "#include <stdio.h> \n \n \n");
    fprintf(fp, "int main(int argc, char **argv) \n");
    fprintf(fp, "{ \n");

    // local variables and exchange variables
    for (int i = 0; i < MAX_INTERACTION; ++i)
    {
        // passed variable, return label, local variable
        fprintf(fp, " int pv%d , rl%d, loc%d ; \n", i, i, i);
    }

    fprintf(fp, " int pvd =0 ;\n \n \n");

    // code blocks
    for (int i = 0; i < MAX_INTERACTION; ++i)
    {
        fprintf(fp, " block%d : \n", i);
        fprintf(fp, " loc%d = pv%d +1 ; \n", i, i);
        fprintf(fp, " goto  rl%d; \n", i);
    }

    // call blocks (the return labels rl0..rl(N-1) are defined here)
    for (int i = 1; i < MAX_INTERACTION + 1; ++i)
    {
        fprintf(fp, " pvd = pv%d ;\n", i - 1);
        fprintf(fp, " goto block%d; \n", i - 1);
        fprintf(fp, " rl%d: \n", i - 1);
    }

    fprintf(fp, "printf( \"Concluido \\n \"); \n");
    fprintf(fp, "}\n");

    fclose(fp);
    return 0;
}
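For reference, here is roughly what the generated source.c looks like with MAX_INTERACTION set to 2 (whitespace tidied):

#include <stdio.h>

int main(int argc, char **argv)
{
    int pv0, rl0, loc0;
    int pv1, rl1, loc1;
    int pvd = 0;

    block0:
    loc0 = pv0 + 1;
    goto rl0;
    block1:
    loc1 = pv1 + 1;
    goto rl1;

    pvd = pv0;
    goto block0;
    rl0:
    pvd = pv1;
    goto block1;
    rl1:
    printf("Concluido \n");
}

Note that the rlN identifiers are declared as ints but only ever used as labels (legal, since labels live in a separate namespace). At 100,000 iterations this is a single function with 300,001 variables and 200,000 labels.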

1 Answer


I did some timing on a MacBook Pro with 8 GB main memory (2.3 GHz Intel Core i7).

I modified the generator to take a parameter indicating the program size.
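The modified source isn't preserved in the answer; a minimal sketch, assuming the size arrives as argv[1] (defaulting to the original 100,000), might look like this:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    /* Program size from the command line instead of a compile-time macro */
    int n = (argc > 1) ? atoi(argv[1]) : 100000;
    FILE *fp = fopen("source.c", "w");
    if (fp == NULL)
    {
        perror("source.c");
        return 1;
    }

    fprintf(fp, "#include <stdio.h>\n\nint main(int argc, char **argv)\n{\n");

    /* passed variable, return label, local variable */
    for (int i = 0; i < n; ++i)
        fprintf(fp, " int pv%d, rl%d, loc%d;\n", i, i, i);
    fprintf(fp, " int pvd = 0;\n\n");

    /* code blocks */
    for (int i = 0; i < n; ++i)
        fprintf(fp, " block%d:\n loc%d = pv%d + 1;\n goto rl%d;\n", i, i, i, i);

    /* call blocks */
    for (int i = 0; i < n; ++i)
        fprintf(fp, " pvd = pv%d;\n goto block%d;\n rl%d:\n", i, i, i);

    fprintf(fp, " printf(\"Concluido\\n\");\n}\n");
    fclose(fp);
    return 0;
}

I then ran it repeatedly: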

$ for size in 10 100 1000 2000 3000 4000 5000 10000 20000 30000
> do
>     ./generator $size
>     echo $size
>     time make -s source 2>/dev/null
>     sleep 1
> done
10

real    0m0.526s
user    0m0.030s
sys     0m0.029s
100

real    0m0.084s
user    0m0.031s
sys     0m0.018s
1000

real    0m0.333s
user    0m0.235s
sys     0m0.044s
2000

real    0m0.392s
user    0m0.318s
sys     0m0.046s
3000

real    0m0.786s
user    0m0.661s
sys     0m0.070s
4000

real    0m0.657s
user    0m0.599s
sys     0m0.053s
5000

real    0m0.978s
user    0m0.893s
sys     0m0.069s
10000

real    0m3.063s
user    0m2.770s
sys     0m0.149s
20000

real    0m8.109s
user    0m7.315s
sys     0m0.274s
30000

real    0m21.410s
user    0m19.553s
sys     0m0.483s

$

Clearly, at the small sizes, there is overhead simply in getting the compiler running (especially the first run!), reading and writing files, and so on. Redoing the measurements in multiples of 10,000 up to 100,000, I got the results in the table below. The slowdown is appreciable, especially between 20,000 and 30,000. I also saw fairly significant variability in the times for any given size. Making the code configurable, and then working up incrementally when you run into a problem, is only sensible.

                    Compared to 10K
Size   Time (s)     Size Ratio   Time Ratio  Ln Time Ratio
 10K        3.7      1.00          1.00       0.000
 20K        8.1      2.00          2.19       0.784
 30K       25.1      3.00          6.78       1.915
 40K       45.2      4.00         12.22       2.503
 50K       76.7      5.00         20.73       3.032
 60K      110.5      6.00         29.96       3.397
 70K      176.0      7.00         47.57       3.862
 80K      212.0      8.00         57.30       4.048
 90K      292.3      9.00         79.00       4.369
100K      363.3     10.00         98.19       4.587
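As a rough sanity check on the growth rate, each row implies an exponent of ln(time ratio) / ln(size ratio). A small throwaway program (hypothetical, not part of the original measurements) that computes this from the table:

#include <stdio.h>
#include <math.h>

int main(void)
{
    /* Times in seconds from the table above, sizes 10K through 100K */
    double t[] = { 3.7, 8.1, 25.1, 45.2, 76.7, 110.5, 176.0, 212.0, 292.3, 363.3 };
    int n = sizeof t / sizeof t[0];

    for (int i = 1; i < n; ++i)
    {
        double size_ratio = i + 1;          /* size relative to 10K */
        double time_ratio = t[i] / t[0];    /* time relative to 10K */
        printf("%4dK  exponent ~ %.2f\n", (i + 1) * 10,
               log(time_ratio) / log(size_ratio));
    }
    return 0;
}

Compiled with -lm, it shows the exponent climbing from about 1.1 at 20K to roughly 2.0 by 100K, so the measured growth on this machine looks closer to quadratic than cubic.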

Your mileage will vary, of course.

For reference, the GCC I'm using is:

i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.9.00)

The compilation command line is:

/usr/bin/gcc -O3 -g -std=c99 -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes \
       -Wold-style-definition source.c -o source 

A home-built GCC 4.7.1 took 34.6 s for 10K, and 150.9 s for 20K.

answered 2012-08-15T02:04:58.937