Святослав Павленко's answer inspired me: using blocking MPI communication to enforce serial-in-time output. That said, Wesley Bland has a point that MPI isn't built for serial output. So if we want to output data, it makes sense either to have each processor output its own (non-clashing) data, or, if the order of the data matters (and it isn't too large), to follow the recommended approach and send it all to one rank (say rank 0), which then formats the data correctly.
To me this seems like slight overkill, especially when the data can be variable-length strings, which is what something like std::cout << "a=" << some_variable << " b=" << some_other_variable
so often produces. So if we want some quick-and-dirty in-order printing, we can exploit Святослав Павленко's answer to build a serial output stream. This solution works fine, but its performance scales badly with many CPUs, so don't use it for data output!
#include <iostream>
#include <sstream>
#include <mpi.h>
MPI housekeeping:
int mpi_size;
int mpi_rank;

void init_mpi(int argc, char * argv[]) {
    MPI_Init(& argc, & argv);
    MPI_Comm_size(MPI_COMM_WORLD, & mpi_size);
    MPI_Comm_rank(MPI_COMM_WORLD, & mpi_rank);
}

void finalize_mpi() {
    MPI_Finalize();
}
A generic class that enables a chain of MPI messages:
template<class T, MPI_Datatype MPI_T> class MPIChain{
    // Uses a chained MPI message (T) to coordinate serial execution of code (the content of the message is irrelevant).
    private:
        T message_out; // The messages aren't really used here
        T message_in;
        int size;
        int rank;

    public:
        void next(){
            // Send message to next core (if there is one)
            if(rank + 1 < size) {
                // MPI_Send - Performs a standard-mode blocking send.
                MPI_Send(& message_out, 1, MPI_T, rank + 1, 0, MPI_COMM_WORLD);
            }
        }

        void wait(int & msg_count) {
            // Waits for message to arrive. Message is well-formed if msg_count = 1.
            MPI_Status status;
            // MPI_Probe - Blocking test for a message.
            MPI_Probe(MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, & status);
            // MPI_Get_count - Gets the number of top level elements.
            MPI_Get_count(& status, MPI_T, & msg_count);
            if(msg_count == 1) {
                // MPI_Recv - Performs a standard-mode blocking receive.
                MPI_Recv(& message_in, msg_count, MPI_T, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, & status);
            }
        }

        MPIChain(T message_init, int c_rank, int c_size): message_out(message_init), size(c_size), rank(c_rank) {}

        int get_rank() const { return rank;}
        int get_size() const { return size;}
};
We can now use our MPIChain class to create a class that manages the output stream:
class ChainStream : public MPIChain<int, MPI_INT> {
    // Uses the MPIChain class to implement an ostream with a serial operator<< implementation.
    private:
        std::ostream & s_out;

    public:
        ChainStream(std::ostream & os, int c_rank, int c_size)
            : MPIChain<int, MPI_INT>(0, c_rank, c_size), s_out(os) {};

        ChainStream & operator<<(const std::string & os){
            if(this->get_rank() == 0) {
                this->s_out << os;
                // Initiate chain of MPI messages
                this->next();
            } else {
                int msg_count;
                // Wait until a message arrives (MPIChain::wait uses a blocking test)
                this->wait(msg_count);
                if(msg_count == 1) {
                    // If the message is well-formed (i.e. only one message is received): output string
                    this->s_out << os;
                    // Pass on to the next member of the chain (if there is one)
                    this->next();
                }
            }
            // Ensure that the chain is resolved before returning the stream
            MPI_Barrier(MPI_COMM_WORLD);
            // Don't output the ostream! That would break the serial-in-time execution.
            return *this;
        };
};
Note the MPI_Barrier at the end of operator<<: it prevents the code from starting a second output chain before the first has finished. Even though it could be moved outside operator<<, I figured I'd put it here, since this is supposed to be serial output anyway...
Putting it all together:
int main(int argc, char * argv[]) {
    init_mpi(argc, argv);

    ChainStream cs(std::cout, mpi_rank, mpi_size);

    std::stringstream str_1, str_2, str_3;
    str_1 << "FIRST: " << "MPI_SIZE = " << mpi_size << " RANK = " << mpi_rank << std::endl;
    str_2 << "SECOND: " << "MPI_SIZE = " << mpi_size << " RANK = " << mpi_rank << std::endl;
    str_3 << "THIRD: " << "MPI_SIZE = " << mpi_size << " RANK = " << mpi_rank << std::endl;

    cs << str_1.str() << str_2.str() << str_3.str();
    // Equivalent to:
    //cs << str_1.str();
    //cs << str_2.str();
    //cs << str_3.str();

    finalize_mpi();
}
Note that we build the strings str_1, str_2 and str_3 before sending them to the ChainStream instance. Normally one would do something like:
std::cout << "a" << "b" << "c" << std::endl;
but this applies operator<< from left to right, and we want each string to be fully formed before it is run, in turn, through every process.
g++-7 -O3 -lmpi serial_io_obj.cpp -o serial_io_obj
mpirun -n 10 ./serial_io_obj
Output:
FIRST: MPI_SIZE = 10 RANK = 0
FIRST: MPI_SIZE = 10 RANK = 1
FIRST: MPI_SIZE = 10 RANK = 2
FIRST: MPI_SIZE = 10 RANK = 3
FIRST: MPI_SIZE = 10 RANK = 4
FIRST: MPI_SIZE = 10 RANK = 5
FIRST: MPI_SIZE = 10 RANK = 6
FIRST: MPI_SIZE = 10 RANK = 7
FIRST: MPI_SIZE = 10 RANK = 8
FIRST: MPI_SIZE = 10 RANK = 9
SECOND: MPI_SIZE = 10 RANK = 0
SECOND: MPI_SIZE = 10 RANK = 1
SECOND: MPI_SIZE = 10 RANK = 2
SECOND: MPI_SIZE = 10 RANK = 3
SECOND: MPI_SIZE = 10 RANK = 4
SECOND: MPI_SIZE = 10 RANK = 5
SECOND: MPI_SIZE = 10 RANK = 6
SECOND: MPI_SIZE = 10 RANK = 7
SECOND: MPI_SIZE = 10 RANK = 8
SECOND: MPI_SIZE = 10 RANK = 9
THIRD: MPI_SIZE = 10 RANK = 0
THIRD: MPI_SIZE = 10 RANK = 1
THIRD: MPI_SIZE = 10 RANK = 2
THIRD: MPI_SIZE = 10 RANK = 3
THIRD: MPI_SIZE = 10 RANK = 4
THIRD: MPI_SIZE = 10 RANK = 5
THIRD: MPI_SIZE = 10 RANK = 6
THIRD: MPI_SIZE = 10 RANK = 7
THIRD: MPI_SIZE = 10 RANK = 8
THIRD: MPI_SIZE = 10 RANK = 9