
I noticed the following behavior when I have a deadlocked MPI program, e.g. wait.c:

#include <stdio.h>
#include <mpi.h>


int main(int argc, char * argv[])
{
    int taskID = -1; 
    int NTasks = -1; 
    int a = 11; 
    int b = 22; 
    MPI_Status Stat;

    /* MPI Initializations */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &taskID);
    MPI_Comm_size(MPI_COMM_WORLD, &NTasks);

    /* Rank 0 sends a single message to rank 1; every other rank posts a
       matching receive. With np > 2, ranks 2 and up therefore block in
       MPI_Recv() forever, which is the deadlock in question. */
    if(taskID == 0)
        MPI_Send(&a, 1, MPI_INT, 1, 66, MPI_COMM_WORLD);
    else //if(taskID == 1)
        MPI_Recv(&b, 1, MPI_INT, 0, 66, MPI_COMM_WORLD, &Stat);

    printf("Task %i :    a: %i    b: %i\n", taskID, a, b); 

    MPI_Finalize();
    return 0;
}

When I compile wait.c with the mvapich2-2.1 library (which itself was compiled using gcc-4.9.2) and run it (e.g. mpirun -np 4 ./a.out), I notice (via top) that all 4 processes are chugging along at 100% CPU.

When I compile wait.c with the openmpi-1.6 library (which itself was compiled using gcc-4.9.2) and run it (e.g. mpirun -np 4 ./a.out), I notice (via top) that 2 processes are chugging along at 100% CPU and 2 sit at 0%.

Presumably the 2 at 0% are the ones that completed communication.

QUESTION: Why is there a difference in CPU usage between openmpi and mvapich2? Is this the expected behavior? When the CPU usage is 100%, is that from constantly checking to see whether a message has arrived?


1 Answer


Both implementations busy-wait in MPI_Recv() in order to minimize latency. That explains why ranks 2 and 3 are at 100% with either of the two MPI implementations.
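
To illustrate what busy-waiting means here, below is a minimal, self-contained C sketch of a polling receive loop. It is not code from the mvapich2 or openmpi sources; progress_engine_poll() is a made-up stand-in for the library's internal progress check, but the spin loop is the reason top reports 100% CPU:

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for an MPI library's internal progress check.
   A real implementation would poke the NIC or a shared-memory queue;
   here the "message" simply arrives after a fixed number of polls. */
static int polls = 0;

static bool progress_engine_poll(void)
{
    return ++polls >= 50000000;
}

int main(void)
{
    /* Busy-wait: spin on the progress check instead of sleeping in the
       kernel. This tight loop is what top reports as 100% CPU. */
    while (!progress_engine_poll())
        ;
    printf("message 'arrived' after %d polls\n", polls);
    return 0;
}

A blocking wait, by contrast, parks the process in the kernel until the OS wakes it, which is why a blocked process shows 0% CPU in top.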

Now, clearly ranks 0 and 1 progress to the call to MPI_Finalize(), and that is where the two implementations differ: mvapich2 busy-waits there while openmpi does not.

To answer your question: yes, the processes are at 100% because they are constantly checking whether a message has been received, and that is the expected behavior.

If you are not on InfiniBand, you can observe this by attaching strace to one of the processes: you should see a number of poll() calls in there.
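
For example, on a Linux machine you could look up the PID of one of the spinning ranks with top or ps and attach with strace -p <pid> (with <pid> being the actual process ID). While the rank is waiting you would expect to see repeated poll() system calls in the trace; if there are none, the library is most likely spinning purely in user space.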

Answered on 2016-02-04T19:19:33.023