
I am programming in C using Open MPI. My code is posted below. Whenever I run this program, I get a segmentation fault. I believe I have isolated the problem using the printf statements: the segmentation fault seems to happen after MPI_Finalize(). Any help is greatly appreciated.

The error I receive is:

[linuxscc003:10019] *** Process received signal ***
[linuxscc003:10019] Signal: Segmentation fault (11)
[linuxscc003:10019] Signal code: Address not mapped (1)

and my code:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h> 

int main(int argc, char** argv) 
{
    int i = 0; //index
    int comm_sz, my_rank;
    int part_sum = 0;
    //size of the array is hard-coded in, 

    MPI_Init(NULL,NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    if(my_rank == 0)
    {
        int* array;
        //generate the array of n elements
        printf("There are %d array elements\n", comm_sz);
        array = (int*)malloc(sizeof(int*)*comm_sz);
        for(i = 0; i < comm_sz; i++)
        {
            //we don't want to count zero in here
            //nobody likes zero
            array[i] = (i+1); 
        }
        for(i = 1; i < comm_sz; i++)
        {
            MPI_Send(&array[i], sizeof(int*), MPI_INT, i, 0, MPI_COMM_WORLD);
        }   
        //part_sum = 1;
        free(array);
        printf("freed array!\n");
    }

    if(my_rank != 0)
    {
        MPI_Recv(&part_sum, sizeof(int*), MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("proc %d out of %d, I have received %d!\n", my_rank, comm_sz, part_sum);

    }



    printf("proc %d signing off!\n",my_rank);
    MPI_Finalize();
    printf("proc %d signed off!\n",my_rank);
    return 0;
} 

1 Answer


In this line:

array = (int*)malloc(sizeof(int*)*comm_sz);

sizeof(int*) should be sizeof(int).
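One way to avoid this kind of mistake is to size the allocation from the pointer itself, so the element type is written only once. A minimal sketch in plain C, where make_sequence is a hypothetical helper name and not something from the question:

```c
#include <stdlib.h>

/* Hypothetical helper: allocate n ints and fill them with 1..n,
 * mirroring the array that rank 0 builds in the question. */
int *make_sequence(size_t n)
{
    /* sizeof *array is sizeof(int) here, and it stays correct
     * even if the element type of array is changed later. */
    int *array = malloc(n * sizeof *array);
    if (array == NULL)
        return NULL; /* allocation failed; the caller must check */

    for (size_t i = 0; i < n; i++)
        array[i] = (int)(i + 1);

    return array; /* the caller is responsible for free() */
}
```

In the question's code this would correspond to writing array = malloc(comm_sz * sizeof *array); the over-allocation with sizeof(int*) is wasteful rather than fatal, but the idiom removes the chance of getting the type wrong.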

In this line:

MPI_Send(&array[i], sizeof(int*), MPI_INT, i, 0, MPI_COMM_WORLD);

sizeof(int*) should be 1. The count argument specifies how many MPI_INTs you are sending, not how many bytes. sizeof(int*) is likely 4 or 8, so you are reading past the end of your buffer.

In this line:

MPI_Recv(&part_sum, sizeof(int*), MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

sizeof(int*) should be 1 here too, for the same reason: part_sum is a single int, so receiving more than one MPI_INT writes past it.
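To get a feel for how far the original calls reach, recall that the count is in elements, so the bytes touched are count times the element size. The following is a rough sketch in plain C (no MPI required). The helpers bytes_touched and overrun_bytes are hypothetical names, and sizeof(int) is assumed to stand in for the size of one MPI_INT:

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes a send or receive touches
 * when asked for `count` elements of type int (assumed here to
 * match the size of MPI_INT). */
size_t bytes_touched(size_t count)
{
    return count * sizeof(int);
}

/* Hypothetical helper: how many bytes beyond a single-int buffer
 * (such as part_sum or &array[i] at the last index) are touched. */
size_t overrun_bytes(size_t count)
{
    size_t touched = bytes_touched(count);
    return touched > sizeof(int) ? touched - sizeof(int) : 0;
}
```

With a count of 1, overrun_bytes returns 0. With a count of sizeof(int*), as in the question, a typical 64-bit system (8-byte pointers, 4-byte ints) would touch 32 bytes, which is 28 bytes beyond the 4-byte part_sum.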

The segfault may actually be happening before MPI_Finalize() on some process, and you believe it happens after because other processes have already completed their MPI_Finalize(). Or it's possible that, because you have overwritten the stack around part_sum, the corrupted stack doesn't cause a problem until MPI_Finalize() is called.

Answered 2013-03-30T22:02:47.117