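I am currently using epoll to implement a multi-threaded network client application. My model is simple:

  1. Get a client_fd and write the request to the remote server

  2. Set the fd to non-blocking and add it to epfd (EPOLLIN|EPOLLET|EPOLLONESHOT) to wait for the response

  3. Receive EPOLLIN on the fd, read the whole response, and release the resources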
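
For reference, steps 2 and 3 could be sketched roughly as below. This is only a minimal illustration, not the code from the application: the `io_request` layout (beyond its `fd` member) and the helper names `arm_for_response` and `drain_response` are assumptions made for the example.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

// Hypothetical request record; only the fd member appears in the snippet below.
struct io_request {
    int fd;
    std::string response;
};

// Step 2: make the fd non-blocking and register it for one edge-triggered
// read event. Returns 0 on success, -1 on failure (errno is set).
int arm_for_response(int epoll_fd, io_request* req) {
    assert(req != nullptr);
    int flags = fcntl(req->fd, F_GETFL, 0);
    if (flags == -1 || fcntl(req->fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    epoll_event ev{};
    ev.events = EPOLLIN | EPOLLET | EPOLLONESHOT;
    ev.data.ptr = req;  // data is a union: store either ptr or fd, not both
    return epoll_ctl(epoll_fd, EPOLL_CTL_ADD, req->fd, &ev);
}

// Step 3: with EPOLLET the fd must be read until it would block.
// Returns 1 if the peer closed (EOF), 0 if more data may still arrive,
// -1 on a read error.
int drain_response(io_request* req) {
    char buf[4096];
    for (;;) {
        ssize_t n = read(req->fd, buf, sizeof buf);
        if (n > 0) {
            req->response.append(buf, static_cast<size_t>(n));
            continue;
        }
        if (n == 0) return 1;
        if (errno == EINTR) continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK) return 0;
        return -1;
    }
}
```

With EPOLLONESHOT, the fd stays registered but disabled after one event is delivered, so no further event should be reported for that registration until it is re-armed with EPOLL_CTL_MOD.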

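The problem I am running into is that I sometimes get multiple EPOLLIN events on the same fd (registered with EPOLLIN|EPOLLET|EPOLLONESHOT). Because I release all resources (including client_fd) while handling the first EPOLLIN event, the second event crashes my program.

Any suggestions are much appreciated :)

Here is the code snippet: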

// All worker threads wait on the semaphore, since only one thread (the leader)
// should be in epoll_wait at any time (leader-follower model).
sem_wait(wait_sem);

int nfds = epoll_wait(epoll_fd,evts,max_evt_cnt,wait_time_out);

// The leader now holds the ready fds and removes each one from the epoll set.
for(int i =0; i < nfds; ++i){
    io_request* req = (io_request*)evts[i].data.ptr;
    int sockfd = req->fd;
    if(evts[i].events & EPOLLIN){
        ev.data.fd=sockfd;
        if(0!=epoll_ctl(epoll_fd,EPOLL_CTL_DEL,sockfd,&ev)){
            switch(errno){
                case EBADF:
                    // a repeated EPOLLIN on an fd that was already closed makes EPOLL_CTL_DEL fail
                    WARNING("delete fd failed for EBADF");
                    break;
                default:
                    WARNING("delete fd failed for %d", errno);
            }
         }
         else{
                // current workaround: only process fds whose removal succeeded
                crt_idx.push_back(i);
         }
    }
}

if(crt_idx.size() != (size_t)nfds) // just warn when this happens
    WARNING("crt_idx.size():%zu != nfds:%d there has been some error!!", crt_idx.size(), nfds);

// The current leader wakes up the next leader and becomes a follower.
sem_post(wait_sem);

for(size_t i = 0; i < crt_idx.size(); ++i)
{
    io_request* req = (io_request*)evts[crt_idx[i]].data.ptr;
    ...do business logic...
    ...release the resources & release the client_fd
}
1 Answer

I suspect there is a bug or race condition somewhere in your code. Pay particular attention to where you close the socket.
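
As a starting point for that check: `epoll_ctl` reports EBADF when the fd it is given is not a valid open descriptor, so the EBADF branch in the snippet suggests the fd had already been closed by the time EPOLL_CTL_DEL ran. One way to make a double release detectable is to funnel every close through a single helper that invalidates the stored fd. The sketch below is only an illustration under assumptions: `release_fd_once` is a hypothetical name, the `io_request` layout is guessed from `req->fd`, and the helper assumes a request is owned by one thread at a time.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/epoll.h>
#include <unistd.h>

// Hypothetical request record; only the fd member appears in the question.
struct io_request {
    int fd;
};

// Deregister and close the fd at most once per request.
// Returns true if this call performed the release, false if the request
// had already been released. Not thread-safe on its own: it assumes a
// single thread owns req while this runs.
bool release_fd_once(int epoll_fd, io_request* req) {
    assert(req != nullptr);
    if (req->fd < 0)
        return false;  // already released; the caller can log the duplicate
    // The event argument is ignored for EPOLL_CTL_DEL, but kernels before
    // 2.6.9 required it to be non-null.
    epoll_event unused{};
    // Remove the fd while it is still open; a failure here is ignored
    // because the fd is closed immediately afterwards either way.
    (void)epoll_ctl(epoll_fd, EPOLL_CTL_DEL, req->fd, &unused);
    close(req->fd);
    req->fd = -1;  // mark as released so a second call becomes a no-op
    return true;
}
```

This only helps while the `io_request` object itself is still alive. If the request memory is freed as well, a stale `data.ptr` from a later event would still be dangling, so the lifetime of the request needs the same scrutiny as the lifetime of the fd. It is also worth checking whether the fd number can be reused by another thread (for example by a new connection) between the close and any remaining reference to the old number.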

answered 2012-09-11T18:04:18.640