Your argument `myfile` is a variable of type `MPI_File`, not `FILE *`, so you can't use it with `fgets()`, `rewind()`, and so on. That is probably the source of your segfault.
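My suggestion is to take the approach in this answer and read overlapping chunks of the file (to account for the fact that you don't know how long a line is), with each task reading its own chunk and processing its own lines. If you really care that every task ends up with exactly the same number of lines (to the extent possible), you can have the tasks exchange data with each other afterwards to even out the counts.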
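To make the ownership rule concrete, here is a minimal sketch in plain C with no MPI calls, where the whole file is assumed to already sit in one in-memory buffer. The helper names `chunk_range` and `owned_lines` are hypothetical and exist only for this illustration. Each rank reads its nominal share plus `overlap` extra bytes, skips the partial line at the front (unless it is rank 0), and keeps going until the first newline at or after the start of the next rank's share. It assumes no line is longer than `overlap` bytes.

```c
#include <assert.h>

/* Hypothetical helper: inclusive byte range [*start, *end] that `rank`
 * would read, including `overlap` extra bytes past its nominal share. */
void chunk_range(long filesize, int rank, int size, long overlap,
                 long *start, long *end) {
    assert(size > 0 && rank >= 0 && rank < size);
    long nominal = filesize / size;
    *start = rank * nominal;
    *end = *start + nominal - 1 + overlap;
    /* the last rank takes the remainder; nobody reads past the last byte */
    if (rank == size - 1 || *end > filesize - 1) *end = filesize - 1;
}

/* Hypothetical helper: number of lines that `rank` owns, given the whole
 * file in buf. Assumes every line is shorter than `overlap` bytes. */
int owned_lines(const char *buf, long filesize, int rank, int size,
                long overlap) {
    long start, end;
    long nominal = filesize / size;
    chunk_range(filesize, rank, size, overlap, &start, &end);

    /* skip the partial line at the front; the previous rank owns it */
    long first = start;
    if (rank != 0) {
        while (first <= end && buf[first] != '\n') first++;
        first++;
    }

    /* stop at the first newline at or after the next rank's start */
    long last = end;
    if (rank != size - 1) {
        last = start + nominal;
        if (last > end) last = end;
        while (last < end && buf[last] != '\n') last++;
    }

    int n = 0;
    for (long i = first; i <= last; i++)
        if (buf[i] == '\n') n++;
    /* a final line without a trailing newline still counts */
    if (last >= first && buf[last] != '\n') n++;
    return n;
}
```

Because both neighbours search for the same newline, every line is counted by exactly one rank, so summing `owned_lines` over all ranks gives the total number of lines in the buffer.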
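Update: if you really want to do this (note that it would be much easier in a binary format if your inputs are all numbers), here is some code that reads in the text file, partitions it into lines, and then processes each line (e.g. by summing the columns), as a straightforward extension of the answer I linked above: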
#include <stdio.h>
#include <mpi.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>
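
void readlines(MPI_File *in, const int rank, const int size, const int overlap,
               char ***lines, int *nlines) {
    MPI_Offset filesize;
    MPI_Offset nominal;     /* bytes per rank before the overlap is added */
    MPI_Offset localsize;
    MPI_Offset start;
    MPI_Offset end;
    char *chunk;

    /* figure out who reads what */
    MPI_File_get_size(*in, &filesize);
    nominal = filesize/size;
    start = rank * nominal;
    end   = start + nominal - 1;

    /* add overlap to the end of everyone's chunk... */
    end += overlap;

    /* ...except the last processor, of course; never go past the last byte */
    if (rank == size-1 || end > filesize-1) end = filesize - 1;

    localsize = end - start + 1;

    /* allocate memory */
    chunk = malloc((localsize + 1)*sizeof(char));

    /* everyone reads in their part */
    MPI_File_read_at_all(*in, start, chunk, (int)localsize, MPI_CHAR, MPI_STATUS_IGNORE);
    chunk[localsize] = '\0';

    /*
     * everyone calculates what their start and end *really* are by going
     * from the first newline after start to the first newline after the
     * overlap region starts (i.e., at or after start + nominal).
     * This assumes no line is longer than overlap bytes.
     */
    int locstart = 0, locend = (int)localsize - 1;
    if (rank != 0) {
        while (locstart < localsize && chunk[locstart] != '\n') locstart++;
        locstart++;
    }
    if (rank != size-1) {
        locend = (int)nominal;
        if (locend > localsize - 1) locend = (int)localsize - 1;
        while (locend < localsize - 1 && chunk[locend] != '\n') locend++;
    }
    localsize = locend - locstart + 1;
    if (localsize < 0) localsize = 0;   /* no line starts inside this chunk */

    /* Now let's copy our actual data over into a new array, with no overlaps */
    char *data = (char *)malloc((localsize+1)*sizeof(char));
    memcpy(data, &(chunk[locstart]), localsize);
    data[localsize] = '\0';
    free(chunk);

    /* Now we'll count the number of lines; a final line without a
     * trailing newline still counts */
    *nlines = 0;
    for (int i=0; i<localsize; i++)
        if (data[i] == '\n') (*nlines)++;
    if (localsize > 0 && data[localsize-1] != '\n') (*nlines)++;

    /*
     * Now the array lines will point into the data array at the start of
     * each line. lines[0] is always data itself, so the caller can release
     * everything with free(lines[0]); free(lines); even when nlines is 0.
     * Splitting by hand (rather than with strtok) keeps empty lines, so the
     * count above always matches the number of pointers stored.
     */
    *lines = (char **)malloc((*nlines + 1)*sizeof(char *));
    (*lines)[0] = data;
    int n = 0;
    char *linestart = data;
    for (int i=0; i<localsize; i++) {
        if (data[i] == '\n') {
            data[i] = '\0';
            (*lines)[n++] = linestart;
            linestart = data + i + 1;
        }
    }
    if (linestart < data + localsize) (*lines)[n++] = linestart;

    return;
}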

void processlines(char **lines, const int nlines, const int rank) {
    for (int i=0; i<nlines; i++) {
        float a, b;
        /* skip blank or malformed lines instead of printing uninitialized values */
        if (sscanf(lines[i],"%f %f", &a, &b) != 2) continue;
        printf("%d: <%s>: %f + %f = %f\n", rank, lines[i], a, b, a+b);
    }
}

int main(int argc, char **argv) {
    MPI_File in;
    int rank, size;
    int ierr;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (argc != 2) {
        if (rank == 0) fprintf(stderr, "Usage: %s infilename\n", argv[0]);
        MPI_Finalize();
        exit(1);
    }

    ierr = MPI_File_open(MPI_COMM_WORLD, argv[1], MPI_MODE_RDONLY, MPI_INFO_NULL, &in);
    if (ierr) {
        if (rank == 0) fprintf(stderr, "%s: Couldn't open file %s\n", argv[0], argv[1]);
        MPI_Finalize();
        exit(2);
    }

    const int overlap=100;
    char **lines;
    int nlines;
    readlines(&in, rank, size, overlap, &lines, &nlines);

    printf("Rank %d has %d lines\n", rank, nlines);
    processlines(lines, nlines, rank);

    free(lines[0]);
    free(lines);

    MPI_File_close(&in);
    MPI_Finalize();
    return 0;
}
Running it on the data set you provided:
$ mpirun -np 2 ./textio foo2.in
Rank 0 has 4 lines
0: <45.87 13.22>: 45.869999 + 13.220000 = 59.090000
0: <45.71 13.27>: 45.709999 + 13.270000 = 58.980000
0: <45.78 13.21>: 45.779999 + 13.210000 = 58.989998
0: <45.67 13.1>: 45.669998 + 13.100000 = 58.769997
Rank 1 has 3 lines
1: <45.7 13.24>: 45.700001 + 13.240000 = 58.940002
1: <45.81 13.28>: 45.810001 + 13.280000 = 59.090000
1: <45.85 13.32>: 45.849998 + 13.320000 = 59.169998
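In this run the split is already as even as 7 lines across 2 ranks can be (4 and 3). For larger inputs where balancing matters, the target layout could be computed as in the rough sketch below. The helper names `target_count` and `target_first` are hypothetical, and the sketch covers only the arithmetic. In an MPI program the total would presumably come from something like `MPI_Allreduce` over `nlines`, and the lines themselves would still have to be moved between ranks with point-to-point messages, which is not shown here.

```c
#include <assert.h>

/* Hypothetical helper: how many lines `rank` should hold once `total`
 * lines are spread as evenly as possible over `size` ranks. The first
 * (total % size) ranks get one extra line. */
int target_count(int total, int size, int rank) {
    assert(size > 0 && rank >= 0 && rank < size);
    return total / size + (rank < total % size ? 1 : 0);
}

/* Hypothetical helper: global index of the first line `rank` should hold
 * under the same layout. */
int target_first(int total, int size, int rank) {
    assert(size > 0 && rank >= 0 && rank < size);
    int base = total / size;
    int extra = total % size;
    return rank * base + (rank < extra ? rank : extra);
}
```

A rank that currently holds global lines outside the range starting at `target_first` and spanning `target_count` lines would send the surplus to the neighbour that should own them.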