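Depending on how the underlying OS handles fprintf(), you might gain some efficiency by using fwrite().
As you pointed out, you can't fwrite() the data directly, but you can use sprintf() to format your CSV text and push it into one big buffer. When the buffer fills up, you fwrite() the whole buffer.
The file I/O implementation in the OS and the C library is usually doing this kind of buffering already, which is why fwrite() may turn out to be no more efficient than fprintf().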
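Because stdio already buffers writes, one thing worth trying before hand-rolling a buffer is simply asking for a larger stdio buffer with the standard setvbuf() call. The following is a minimal sketch, not a measured recommendation: the helper name open_with_big_buffer and the buffer size are made up for illustration, and whether the requested size is honored when no buffer is supplied depends on the C library.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the original answer): open a file for
 * writing and request a fully buffered stream of buf_size bytes.
 * Returns NULL on failure. */
FILE *open_with_big_buffer(const char *path, size_t buf_size)
{
    assert(path != NULL && buf_size > 0);

    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return NULL;

    /* setvbuf() must be called before any other operation on the stream.
     * Passing NULL lets the library allocate the buffer; the C standard
     * only says the size *may* be used in that case, so treat it as a hint. */
    if (setvbuf(fp, NULL, _IOFBF, buf_size) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

With a stream opened this way, plain fprintf() calls in the loop stay unchanged, and the library decides when the data actually goes to the OS.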
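As Eric points out in his answer, the most efficient way to save this data is to write it directly in a binary format. If you can preprocess it to make it smaller, even better.
For example, does your data need full floating-point precision? Could you convert it to 16-bit fixed-point integers and store two data points in each 32-bit unsigned integer, while keeping enough precision for the calculations you are reporting? Treated as signed integers, 16-bit values range from -32768 to 32767, which is between four and five significant decimal digits.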
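To make the fixed-point idea concrete, here is a rough sketch of packing two samples into one 32-bit word. It assumes, purely for illustration, that the samples have been scaled into the range [-1.0, 1.0) and uses the common Q15 layout (15 fractional bits); the function names are hypothetical and your data may need a different scale factor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: samples lie in [-1.0, 1.0) and are stored
 * as Q15 fixed point, i.e. value * 32768 kept in a signed 16-bit int. */
#define Q15_SCALE 32768.0f

/* Convert a float to Q15, clamping out-of-range input and rounding
 * to the nearest representable step. */
int16_t float_to_q15(float x)
{
    float scaled = x * Q15_SCALE;
    if (scaled > 32767.0f)  scaled = 32767.0f;
    if (scaled < -32768.0f) scaled = -32768.0f;
    return (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Convert a Q15 value back to float. */
float q15_to_float(int16_t q)
{
    return (float)q / Q15_SCALE;
}

/* Pack two 16-bit samples into one 32-bit word:
 * the first sample in the low half, the second in the high half. */
uint32_t pack_pair(int16_t first, int16_t second)
{
    return (uint32_t)(uint16_t)first | ((uint32_t)(uint16_t)second << 16);
}

/* Split a packed word back into its two samples. The uint16_t to int16_t
 * conversion relies on the usual two's complement behavior. */
void unpack_pair(uint32_t word, int16_t *first, int16_t *second)
{
    assert(first != NULL && second != NULL);
    *first  = (int16_t)(uint16_t)(word & 0xFFFFu);
    *second = (int16_t)(uint16_t)(word >> 16);
}
```

The resolution in this layout is 1/32768, so whether it is "enough precision" depends entirely on the calculations you run afterwards.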
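If you are doing further processing on this data, you definitely don't want to use Excel or Matlab, because the processing time will get out of hand. If you develop the processing algorithms in C or C++, a binary data format won't be a problem.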
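As a sketch of what the binary route can look like in C, the two helpers below dump an array of floats with a single fwrite() and read it back with fread(). The names save_samples and load_samples are invented for this example, and the file is a raw memory dump: it assumes the reader runs on a machine with the same byte order and float representation as the writer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write count floats to path as raw bytes.
 * Returns the number of samples written, or 0 on failure. */
size_t save_samples(const char *path, const float *samples, size_t count)
{
    assert(path != NULL && samples != NULL);

    FILE *fp = fopen(path, "wb");   /* "b": no newline translation */
    if (fp == NULL)
        return 0;

    size_t written = fwrite(samples, sizeof samples[0], count, fp);
    if (fclose(fp) != 0)
        return 0;
    return written;
}

/* Hypothetical helper: read up to max_count floats from path.
 * Returns the number of samples actually read. */
size_t load_samples(const char *path, float *samples, size_t max_count)
{
    assert(path != NULL && samples != NULL);

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return 0;

    size_t got = fread(samples, sizeof samples[0], max_count, fp);
    fclose(fp);
    return got;
}
```

There is no formatting or parsing step at all here, which is where most of the saving over CSV comes from.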
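If you are graphing this data, the display will effectively downsample it anyway, so you may as well reduce it to something more like 10k points and output statistics that are meaningful for plotting.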
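One possible way to do that reduction is to split the samples into equal-sized buckets and keep the minimum, maximum and mean of each, so that spikes survive the downsampling. This is only a sketch of that idea; the BucketStats type and the downsample function are names made up for the example, and the choice of statistics is an assumption about what your plot needs.

```c
#include <assert.h>
#include <stddef.h>

/* Summary of one bucket of consecutive samples. */
typedef struct {
    float min;
    float max;
    float mean;
} BucketStats;

/* Reduce n samples to at most bucket_count summaries written to out.
 * Returns the number of buckets actually filled. */
size_t downsample(const float *samples, size_t n,
                  BucketStats *out, size_t bucket_count)
{
    assert(samples != NULL && out != NULL && bucket_count > 0);

    /* Never create more buckets than samples, so no bucket is empty. */
    if (n < bucket_count)
        bucket_count = n;

    for (size_t b = 0; b < bucket_count; b++) {
        /* 64-bit arithmetic keeps b * n from overflowing on 32-bit builds. */
        size_t start = (size_t)((unsigned long long)b * n / bucket_count);
        size_t end   = (size_t)((unsigned long long)(b + 1) * n / bucket_count);

        float lo = samples[start];
        float hi = samples[start];
        double sum = 0.0;
        for (size_t i = start; i < end; i++) {
            if (samples[i] < lo) lo = samples[i];
            if (samples[i] > hi) hi = samples[i];
            sum += samples[i];
        }
        out[b].min  = lo;
        out[b].max  = hi;
        out[b].mean = (float)(sum / (double)(end - start));
    }
    return bucket_count;
}
```

Calling it with bucket_count set to 10000 gives roughly the 10k points mentioned above, and only those summaries then need to be written out.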
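Anyway, those are my thoughts. They are deliberately broader than your specific question: you have probably solved your problem already, but others with a similar problem may end up reading this.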
Edit: here is an interesting test I ran. The complete, compilable source is below.
// what's faster, fwrite or fprintf?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   // memset, strlen
#include <windows.h>
#define HUGE_NUMBER 1000
LARGE_INTEGER ticksPerSecond;
LARGE_INTEGER time1;
LARGE_INTEGER time2;
float floatDiffTime;
const int runs = 1000000;
int main(int argc, char* argv[])
{
    // Get the frequency of the high-resolution performance counter
    QueryPerformanceFrequency( &ticksPerSecond );
    printf( "Your computer does %lld ticks per second\n", ticksPerSecond.QuadPart );
    // %lld means type "long long" int, which is the
    // 64 bit int which is what we want here.

    // define some arbitrary values to use
    // in the print statements
    int a = 5;
    double b = 9.2919e92;
    char c = 'x';
    const char * d = "blah blah blah";

    // test start: open a file to write
    FILE *outfile = fopen( "testfile.txt", "w" );
    if(outfile == NULL) {
        perror( "fopen" );
        return 1;
    }
    char buf[HUGE_NUMBER];
    int i;
    int index = 0;

    //Test line-by-line fprintf
    // START timing
    QueryPerformanceCounter( &time1 );
    memset(buf,'\0', HUGE_NUMBER);
    for(i=0; i<runs; i++)
    {
        fprintf(outfile, "blah %i %f %c %s\n", a, b, c, d );
    }
    fflush ( outfile );
    fclose( outfile );
    // STOP timing
    QueryPerformanceCounter( &time2 );
    // get the difference between time1 and time2,
    // and that is how long the for loop took to run.
    // Subtract the 64-bit counters first: casting one of them to float
    // before subtracting would throw away its low-order bits.
    floatDiffTime = (float)(time2.QuadPart - time1.QuadPart)/ticksPerSecond.QuadPart;
    printf( "line-by-line fprintf took %f seconds\n", floatDiffTime );
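
    //Test buffered fprintf
    // The previous test closed the file, so reopen it before writing again.
    outfile = fopen( "testfile.txt", "w" );
    index = 0;
    // START timing
    QueryPerformanceCounter( &time1 );
    memset(buf,'\0', HUGE_NUMBER);
    for(i=0; i<runs; i++)
    {
        // Flush before formatting if the next line might not fit.
        // One line is about 125 characters, so 256 leaves room to spare.
        if(index + 256 > HUGE_NUMBER) {
            fprintf(outfile, "%s", buf );
            index = 0;
        }
        // sprintf returns the number of characters it wrote
        index += sprintf(&buf[index], "blah %i %f %c %s\n", a, b, c, d );
    }
    // write whatever is still sitting in the buffer
    if(index > 0) {
        fprintf(outfile, "%s", buf );
    }
    fflush ( outfile );
    fclose( outfile );
    // STOP timing
    QueryPerformanceCounter( &time2 );
    // get the difference between time1 and time2,
    // and that is how long the for loop took to run.
    floatDiffTime = (float)(time2.QuadPart - time1.QuadPart)/ticksPerSecond.QuadPart;
    printf( "fprintf took %f seconds\n", floatDiffTime );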

    //Test buffered fwrite
    outfile = fopen( "testfile.txt", "w" );
    index = 0;
    /////////////////////
    // START timing
    QueryPerformanceCounter( &time1 );
    memset(buf,'\0', HUGE_NUMBER);
    for(i=0; i<runs; i++)
    {
        // Flush before formatting if the next line might not fit.
        // One line is about 125 characters, so 256 leaves room to spare.
        if(index + 256 > HUGE_NUMBER) {
            fwrite( buf, 1, (size_t)index, outfile );
            index = 0;
        }
        // sprintf returns the number of characters it wrote
        index += sprintf(&buf[index], "blah %i %f %c %s\n", a, b, c, d );
    }
    // write whatever is still sitting in the buffer
    if(index > 0) {
        fwrite( buf, 1, (size_t)index, outfile );
    }
    fflush(outfile);
    fclose( outfile );
    ////////////////////
    // STOP timing
    QueryPerformanceCounter( &time2 );
    // get the difference between time1 and time2,
    // and that is how long the for loop took to run.
    floatDiffTime = (float)(time2.QuadPart - time1.QuadPart)/ticksPerSecond.QuadPart;
    printf( "fwrite took %f seconds\n", floatDiffTime );
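
    //Test buffered WriteFile
    // WriteFile() needs a Win32 HANDLE from CreateFile(); a FILE* from
    // fopen() is not a valid handle, so the file is opened with CreateFileA().
    HANDLE hBuffered = CreateFileA( "testfile.txt", GENERIC_WRITE, 0, NULL,
                                    CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL );
    if(hBuffered == INVALID_HANDLE_VALUE) {
        printf( "CreateFile failed\n" );
        return 1;
    }
    index = 0;
    DWORD bWritten = 0;
    /////////////////////
    // START timing
    QueryPerformanceCounter( &time1 );
    memset(buf,'\0', HUGE_NUMBER);
    for(i=0; i<runs; i++)
    {
        // Flush before formatting if the next line might not fit.
        // One line is about 125 characters, so 256 leaves room to spare.
        if(index + 256 > HUGE_NUMBER) {
            WriteFile( hBuffered, buf, (DWORD)index, &bWritten, NULL );
            index = 0;
        }
        // sprintf returns the number of characters it wrote
        index += sprintf(&buf[index], "blah %i %f %c %s\n", a, b, c, d );
    }
    // write whatever is still sitting in the buffer
    if(index > 0) {
        WriteFile( hBuffered, buf, (DWORD)index, &bWritten, NULL );
    }
    CloseHandle( hBuffered );
    ////////////////////
    // STOP timing
    QueryPerformanceCounter( &time2 );
    // get the difference between time1 and time2,
    // and that is how long the for loop took to run.
    floatDiffTime = (float)(time2.QuadPart - time1.QuadPart)/ticksPerSecond.QuadPart;
    printf( "WriteFile took %f seconds\n", floatDiffTime );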

    //Test line-by-line WriteFile
    // WriteFile() needs a Win32 HANDLE from CreateFile(); a FILE* from
    // fopen() is not a valid handle, so the file is opened with CreateFileA().
    HANDLE hLineByLine = CreateFileA( "testfile.txt", GENERIC_WRITE, 0, NULL,
                                      CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL );
    if(hLineByLine == INVALID_HANDLE_VALUE) {
        printf( "CreateFile failed\n" );
        return 1;
    }
    bWritten = 0;
    /////////////////////
    // START timing
    QueryPerformanceCounter( &time1 );
    memset(buf,'\0', HUGE_NUMBER);
    for(i=0; i<runs; i++)
    {
        // format one line at the start of the buffer and write it immediately
        sprintf(buf, "blah %i %f %c %s\n", a, b, c, d );
        WriteFile( hLineByLine, buf, (DWORD)strlen(buf), &bWritten, NULL );
    }
    CloseHandle( hLineByLine );
    ////////////////////
    // STOP timing
    QueryPerformanceCounter( &time2 );
    // get the difference between time1 and time2,
    // and that is how long the for loop took to run.
    floatDiffTime = (float)(time2.QuadPart - time1.QuadPart)/ticksPerSecond.QuadPart;
    printf( "WriteFile line-by-line took %f seconds\n", floatDiffTime );
    return 0;
}
And the results?
Your computer does 2337929 ticks per second
line-by-line fprintf took 2.970491 seconds
fprintf took 2.345687 seconds
fwrite took 3.456101 seconds
WriteFile took 2.131118 seconds
WriteFile line-by-line took 2.495092 seconds
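On this run, at least, it looks like buffering a large amount of data as a string and then handing it to fprintf() (portable) or to the Windows WriteFile() call (if you are on Windows) is the most efficient way to handle this. The gaps are small and come from a single run on one machine, so it is worth repeating the test on your own system.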
Compiler command:
gcc write_speed_test.c -o wspt
Compiler version:
$ gcc -v
Using built-in specs.
Target: i686-w64-mingw32
Configured with: ../gcc44-svn/configure --target=i686-w64-mingw32 --host=i686-w64-mingw32 --disable-multilib --disable-nls --disable-win32-registry --prefix=/mingw32 --with-gmp=/mingw32 --with-mpfr=/mingw32 --enable-languages=c,c++
Thread model: win32
gcc version 4.4.3 (GCC)