c++ - Writing files independently in parallel with C++ and MPI

Question

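I have implemented a code in C++ and MPI that is supposed to do millions of computations and, for each CPU working on its data, save millions of numbers in about 7 files. I am using about 10,000 cores, which gives a total of 70,000 files with millions of lines to be written in parallel.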

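I used ofstream for the writing, but for some reason the MPI code breaks in the middle and the files appear to be empty. I want each processor to write its 7 files independently of all other processors. According to my search this can be done using MPI, but I have read about it in many resources and I still don't understand how it can be used for independent writing with file names that are specified dynamically during execution. If this is the right approach, can someone please explain it in as much detail as possible? If not, please explain your alternative suggestion in as much detail as possible.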

My current writing code, which does not work, looks like this:

if (rank == 0)
    {

    if(mkdir("Database",0777)==-1)//creating a directory
    {

    }
    rowsCount = fillCombinations(BCombinations,  RCombinations,
                                 BList,               RList,
                                 maxCombinations,        BIndexBegin, 
                                 BIndexEnd,           RIndexBegin, 
                                 RIndexEnd,    
                                 BCombinationsIndex,  RCombinationsIndex
                          );
}

// then broadcast all the arrays that will be used in all of the computations; the root
// sends the index ranges to work on to the workers, and each worker then runs:

for (int cc = BeginIndex ; cc <= EndIndex; cc++)
        {


           // begin by specifying the values that will be used 
           // and making files for each B and R in the list


            BIndex      = betaCombinationsIndex   [cc];
            RIndex     = roughCombinationsIndex  [cc];



            //creating files to save data in and indicating the R and B by their index 
            //specifying files names

           std::string str1;
           std::ostringstream buffer1;
           buffer1 << "Database/";
           str1 = buffer1.str();

           //specifying file names

            std::ostringstream pFileName;
            std::string ppstr2;
            std::ostringstream ppbuffer2;
            ppbuffer2 <<"P_"<<"Beta_"<<(BIndex+1)<<"_Rho_"<<(RIndex+1)<<"_sampledP"<< ".txt";
            ppstr2 = ppbuffer2.str();
            pFileName <<str1.c_str()<<ppstr2.c_str();
            std::string p_file_name = pFileName.str();

            std::ostringstream eFileName;
            std::string eestr2;
            std::ostringstream eebuffer2;
            eebuffer2 <<"E_"<<"Beta_"<<(BIndex+1)<<"_Rho_"<<(RIndex+1)<<"_sampledE"<< ".txt";
            eestr2 = eebuffer2.str();
            eFileName <<str1.c_str()<< eestr2.c_str();
            std::string e_file_name = eFileName.str();

            // and so on for the 7 files .... 


            //creating the files
            ofstream pFile;
            ofstream eFile;

            // and so on for the 7 files .... 

            //opening the files
            pFile      .open (p_file_name.c_str());
            eFile        .open (e_file_name.c_str());

            // and so on for the 7 files .... 
            // then I start the writing in the files and at the end ...



            pFile.close();

            eFile.close();
}
// end of the segment loop

score 3 · Accepted Answer

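The standard C/C++ library is not good enough for accessing that many files. Even the BG/L/P kernel will collapse if you try to access hundreds of thousands of files at the same time, which is quite close to your number. A large number of physical files also stresses the parallel file system with extra metadata.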

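Sophisticated supercomputers usually have a large number of dedicated I/O nodes, so why not use the standard MPI functionality for parallel I/O? That should be enough for the number of files you want to save.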

You can start here: http://www.open-mpi.org/doc/v1.4/man3/MPI_File_open.3.php

Good luck!

score 2 · Accepted Answer

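Do you need to do the I/O yourself? If not, you could give the HDF5 library a try; it has become quite popular among scientists using HPC. It may be worth a look, and it might simplify your work. For example, you can write everything into the same file and avoid having thousands of files. (Note that your performance may also depend on the file system of your cluster.)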

score 1 · Accepted Answer

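Well, create 7 threads or processes, whatever you are using, and append the thread ID / process ID to the name of the file being written. That way there should be no contention.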
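A minimal sketch of that naming scheme, assuming the ID in question is the MPI rank obtained from `MPI_Comm_rank`. The helper names `rankFileName` and `openForWriting` are made up here; the `Beta`/`Rho` parts simply mirror the file names in the question.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: builds a per-rank file name such as
// "Database/P_Beta_1_Rho_3_rank17.txt", so no two ranks share a file.
std::string rankFileName(const std::string& dir, const std::string& prefix,
                         int betaIndex, int rhoIndex, int rank) {
    assert(rank >= 0);
    std::ostringstream name;
    name << dir << '/' << prefix
         << "_Beta_" << (betaIndex + 1)
         << "_Rho_" << (rhoIndex + 1)
         << "_rank" << rank << ".txt";
    return name.str();
}

// Hypothetical helper: opens the stream and reports whether that worked,
// so a failed open is noticed instead of silently producing no output.
bool openForWriting(std::ofstream& out, const std::string& path) {
    out.open(path.c_str());
    return out.is_open();
}
```

For example, `rankFileName("Database", "P", BIndex, RIndex, rank)` could replace the hand-built `p_file_name` in the question. Checking the result of `openForWriting` is worthwhile at this scale, because opens can fail once the file system or the per-process file limit is exhausted.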

score 1 · Accepted Answer

The Blue Gene architecture might only have a few years left, but the problem of how to do "scalable I/O" will remain with us for some time.

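First, MPI-IO is essentially a requirement at this scale, particularly the collective I/O features. Even though this paper was written for /L, the lessons are still relevant: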

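Collective open lets the library set up some optimizations
Collective reads and writes can be transformed into requests that line up nicely with GPFS file system block boundaries (important for lock management and for minimizing overhead)
The selection and placement of "I/O aggregators" can be done in a way that is mindful of the machine's topology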

https://press3.mcs.anl.gov/romio/2006/02/15/romio-on-blue-gene-l/

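Selecting aggregators is quite complicated on /Q, but the idea is that these aggregators are chosen to balance I/O over all the available "system call I/O forwarding" (ciod) links: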

https://press3.mcs.anl.gov/romio/2015/05/15/aggregation-selection-on-blue-gene/
