I'm seeing different epoll and select behavior in two different binaries and was hoping for some debugging help. In the following, epoll_wait and select will be used interchangeably.

I have two processes, one writer and one reader, that communicate over a fifo. The reader performs an epoll_wait to be notified of writes. I would also like to know when the writer closes the fifo, and it appears that epoll_wait should notify me of this as well. The following toy program, which behaves as expected, illustrates what I'm trying to accomplish:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/stat.h>
#include <unistd.h>

int
main(int argc, char** argv)
{
  const char* filename = "tempfile";
  char buf[1024];
  memset(buf, 0, sizeof(buf));

  struct stat statbuf;
  if (!stat(filename, &statbuf))
    unlink(filename);

  mkfifo(filename, S_IRUSR | S_IWUSR);

  pid_t pid = fork();
  if (!pid) {
    int fd = open(filename, O_WRONLY);
    printf("Opened %d for writing\n", fd);
    sleep(3);
    close(fd);
  } else {
    int fd = open(filename, O_RDONLY);
    printf("Opened %d for reading\n", fd);

    static const int MAX_LENGTH = 1;
    struct epoll_event init;
    struct epoll_event evs[MAX_LENGTH];
    int efd = epoll_create(MAX_LENGTH);

    int i;
    for (i = 0; i < MAX_LENGTH; ++i) {
        init.data.u64 = 0;
        init.data.fd = fd;
        /* Plain assignment: init is uninitialized, so |= would OR in stack garbage. */
        init.events = EPOLLIN | EPOLLPRI | EPOLLHUP;
        epoll_ctl(efd, EPOLL_CTL_ADD, fd, &init);
    }

    while (1) {
      int nfds = epoll_wait(efd, evs, MAX_LENGTH, -1);
      printf("%d fds ready\n", nfds);
      /* Leave room for a terminator: read() does not NUL-terminate. */
      ssize_t nread = read(fd, buf, sizeof(buf) - 1);
      if (nread < 0) {
        perror("read");
        exit(1);
      } else if (!nread) {
        printf("Child %d closed the pipe\n", pid);
        break;
      }
      buf[nread] = '\0';
      printf("Reading: %s\n", buf);
    }
  }
  return 0;
}

However, when I do this with another reader (whose code I'm not privileged to post, but which makes the exact same calls--the toy program is modeled on it), the process does not wake when the writer closes the fifo. The toy reader also gives the desired semantics with select. The real reader configured to use select also fails.

What might account for the different behavior of the two? For any provided hypotheses, how can I verify them? I'm running Linux 2.6.38.8.

1 Answer

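strace is a good tool for confirming that the system calls are being made correctly (i.e., that the arguments are passed correctly and that the calls do not return any unexpected errors).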

Beyond that, I would suggest using lsof to check that no other process still has the FIFO open.
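
To see why a leftover open descriptor matters, here is a minimal sketch, assuming Linux FIFO semantics: the reader is only told about the hangup once the last write end is closed. The function names (reader_sees_hup, stray_writer_demo) and the FIFO path are made up for illustration, and dup() stands in for a descriptor that some other process (for example a forked child) is still holding.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the registered fd currently reports
 * EPOLLHUP, 0 if it does not, and -1 if epoll_wait fails. */
static int reader_sees_hup(int efd)
{
  struct epoll_event ev;
  assert(efd >= 0);
  int n = epoll_wait(efd, &ev, 1, 0);  /* timeout 0: sample the current state */
  if (n < 0)
    return -1;
  return (n == 1 && (ev.events & EPOLLHUP)) ? 1 : 0;
}

/* Hypothetical demo: opens a FIFO with one reader and two descriptors for
 * the write end, then records whether the reader sees a hangup
 *   (a) after closing only one write descriptor, and
 *   (b) after closing the last one.
 * Returns 0 on success, -1 on setup failure. */
int stray_writer_demo(const char *path, int *hup_with_stray, int *hup_after_all)
{
  unlink(path);
  if (mkfifo(path, S_IRUSR | S_IWUSR) < 0)
    return -1;

  /* O_NONBLOCK lets a single process open the read end without a writer. */
  int rfd = open(path, O_RDONLY | O_NONBLOCK);
  int wfd = open(path, O_WRONLY);
  int stray = dup(wfd);  /* stand-in for a copy held elsewhere */
  int efd = epoll_create(1);
  if (rfd < 0 || wfd < 0 || stray < 0 || efd < 0)
    return -1;

  struct epoll_event ev;
  memset(&ev, 0, sizeof(ev));
  ev.events = EPOLLIN;  /* EPOLLHUP is always reported; no need to request it */
  ev.data.fd = rfd;
  if (epoll_ctl(efd, EPOLL_CTL_ADD, rfd, &ev) < 0)
    return -1;

  close(wfd);  /* the "real" writer goes away, but a copy is still open */
  *hup_with_stray = reader_sees_hup(efd);

  close(stray);  /* now the last write end is gone */
  *hup_after_all = reader_sees_hup(efd);

  close(efd);
  close(rfd);
  unlink(path);
  return 0;
}
```

Called as stray_writer_demo("/tmp/demo.fifo", &a, &b), this should leave a at 0 and b at 1 on Linux: the hangup only shows up after the stray descriptor is closed. If the real reader never wakes, a write end kept open by another process (or by the reader itself, if it opened the FIFO O_RDWR) would produce exactly that symptom, which is what lsof can confirm.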

Answered 2012-10-27T13:20:55.990