c++ - async_receive_from stops receiving after a few packets under Linux

Question

I have a setup with multiple peers broadcasting UDP packets (containing images) every 200 ms (5 fps).

Receiving both the local stream and the external streams works fine under Windows, but the same code (except for the socket->cancel(); call on Windows XP, see the comment in the code) behaves rather strangely under Linux:

The first few (5~7) packets sent by another machine (when this machine starts streaming) are received as expected;
After this, the packets from the other machine are received at long, irregular intervals (12s, 5s, 17s, ...) or the receive times out (the timeout is set to 20 seconds). At certain moments, there is again a burst of (3~4) packets received as expected.
The packets sent by the machine itself are still being received as expected.

Using Wireshark, I see both local and external packets arriving as they should, with correct time intervals between consecutive packets. The behavior also occurs when the local machine is only listening to a single other stream, with the local stream disabled.

This is some code from the receiver (with some updates as suggested below, thanks!):

Receiver::Receiver(port p)
{
  this->port = p;
  this->stop = false;
}

int Receiver::run()
{
  io_service io_service;
  boost::asio::ip::udp::socket socket(
    io_service,
    boost::asio::ip::udp::endpoint(boost::asio::ip::udp::v4(),
    this->port));
  while(!stop)
  {
    const int bufflength = 65000;
    int timeout = 20000;
    char sockdata[bufflength];
    boost::asio::ip::udp::endpoint remote_endpoint;
    int rcvd;

    bool read_success = this->receive_with_timeout(
           sockdata, bufflength, &rcvd, &socket, remote_endpoint, timeout);

    if(read_success)
    {
      std::cout << "read success " << remote_endpoint.address().to_string() << std::endl;
    }
    else
    {
      std::cout << "read fail" << std::endl;
    }
  }
  return 0;
}

void handle_receive_from(
  bool* toset, boost::system::error_code error, size_t length, int* outsize)
{
  if(!error || error == boost::asio::error::message_size)
  {
    *toset = length>0?true:false;
    *outsize = length;
  }
  else
  {
    std::cout << error.message() << std::endl;
  }
}

// Update: error check
void handle_timeout( bool* toset, boost::system::error_code error)
{
  if(!error)
  {
    *toset = true;
  }
  else
  {
    std::cout << error.message() << std::endl;
  }
}

bool Receiver::receive_with_timeout(
  char* data, int buffl, int* outsize,
  boost::asio::ip::udp::socket *socket,
  boost::asio::ip::udp::endpoint &sender_endpoint, int msec_tout)
{
  bool timer_overflow = false;
  bool read_result = false;

  deadline_timer timer( socket->get_io_service() );

  timer.expires_from_now( boost::posix_time::milliseconds(msec_tout) );
  timer.async_wait( boost::bind(&handle_timeout, &timer_overflow,
    boost::asio::placeholders::error) );

  socket->async_receive_from(
    boost::asio::buffer(data, buffl), sender_endpoint,
    boost::bind(&handle_receive_from, &read_result,
    boost::asio::placeholders::error,
    boost::asio::placeholders::bytes_transferred, outsize));

  socket->get_io_service().reset();

  while ( socket->get_io_service().run_one())
  {
    if ( read_result )
    {
      timer.cancel();
    }
    else if ( timer_overflow )
    {
      //not to be used on Windows XP, Windows Server 2003, or earlier
      socket->cancel();
      // Update: added run_one()
      socket->get_io_service().run_one();
    }
  }
  // Update: added run_one()
  socket->get_io_service().run_one();
  return read_result;
}

When the 20-second timer expires, the error message "Operation canceled" is printed, but it is difficult to get any other information about what is going on.

Can anyone identify a problem or give me some hints to get some more information about what is going wrong? Any help is appreciated.

score 1 · Accepted Answer

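Okay, what you've got is that when you call receive_with_timeout, you're setting up two asynchronous requests (one for the receive, one for the timeout). When the first one completes, you cancel the other.

However, you never invoke io_service::run_one() again to allow its callback to complete. When you cancel an operation in boost::asio, it still invokes the handler, usually with an error code indicating that the operation has been aborted or canceled. In this case, I believe you are left with a dangling handler once the deadline timer is destroyed, since the handler holds a pointer onto the stack where it stores its result.

The solution is to call run_one() again to process the canceled callback before exiting the function. You should also check the error code being passed to your timeout handler, and only treat it as a timeout if there was no error.

Also, in the case where you do have a timeout, you need to execute run_one() so that the async_receive_from handler can run and report that it was canceled.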

score 1 · Accepted Answer

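After a fresh installation of Xubuntu 12.04 in place of an old Ubuntu 10.04 install, everything now works as expected. Maybe it is because the new install runs a newer kernel, possibly with improved networking? In any case, reinstalling with a newer version of the distribution solved my problem.

If anyone else runs into unexpected network behavior on an older kernel, I would advise trying it on a system with a newer kernel installed.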
