4

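I was under the impression that UDP's unreliability is a property of the physical layer, but it seems that it isn't:

I am trying to send a message over UDP, split into a sequence of packets. Message identification and reordering are done implicitly.

I tested this approach with two applications running on the same computer and expected it to run smoothly. However, even though the data transfer is entirely between two programs on the same machine, packets are being lost, and quite frequently. The losses also seem fairly random: sometimes the whole message gets through, sometimes it doesn't.

Now, the fact that losses occur even on the same machine makes me wonder whether I am doing this right.

Originally, I sent all pieces of the message asynchronously in a single shot, without waiting for one piece to complete before sending the next.

Then I tried sending the next piece from the completion routine of the previous one. This did reduce the packet loss, but did not eliminate it entirely.

If I add a pause (Sleep(...)) between pieces, it works 100% of the time.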

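EDIT: As the answers suggested, the packets are simply being sent too fast, and the OS does only minimal buffering. That is logical.

So, if I want to avoid adding acknowledgement and retransmission to the system (I could just use TCP then), what should I do? What is the best way to reduce packet loss without dropping the data rate below what it could otherwise have been?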

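EDIT 2: It occurred to me that the problem might not be buffer overflow so much as buffer unavailability. I am using asynchronous WSARecvFrom to receive, which takes a buffer that, as I understand it, overrides the default OS buffer. When a datagram is received, it is placed in that buffer and the completion routine is called whether the buffer is full or not.

At that point, there is no buffer at all to handle incoming data until WSARecvFrom is called again from the completion routine.

The question is whether there is a way to create some kind of buffer pool, so that incoming data can be buffered while a different buffer is being processed?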

7 Answers

6

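In your case you are simply sending the packets too quickly for the receiving process to read them. The O/S will only buffer a certain number of received packets before it starts discarding them.

The simplest mechanism to avoid this is to have the receiving process send back a minimal ACK packet, but have the sending process carry on regardless if it hasn't received an ACK within a few milliseconds or so.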
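The "ACK, but carry on regardless" idea could be sketched roughly like this. It is a minimal Python sketch rather than Winsock code; the names `send_with_soft_ack` and `ack_responder`, the one-byte ACK payload and the 5 ms default timeout are all assumptions made for illustration.

```python
import socket

ACK = b"\x06"  # hypothetical one-byte ACK payload


def send_with_soft_ack(sock, dest, chunks, ack_timeout=0.005):
    """Send each chunk, wait briefly for an ACK, then carry on regardless.

    Returns the number of chunks for which an ACK arrived within the window.
    A late ACK for an earlier chunk may be counted against a later one;
    this sketch does not try to match ACKs to chunks.
    """
    sock.settimeout(ack_timeout)
    acked = 0
    for chunk in chunks:
        sock.sendto(chunk, dest)
        try:
            data, _ = sock.recvfrom(16)
            if data == ACK:
                acked += 1
        except socket.timeout:
            pass  # no ACK in time: keep going, no retransmission
    return acked


def ack_responder(sock, count):
    """Receiver side: read `count` datagrams and ACK each one to its sender."""
    received = []
    for _ in range(count):
        data, addr = sock.recvfrom(65535)
        received.append(data)
        sock.sendto(ACK, addr)
    return received
```

The ACK here only paces the sender: while the receiver keeps up, each send is released as soon as the previous datagram has been read, and when an ACK goes missing the sender loses at most `ack_timeout` before moving on.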

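EDIT - essentially, UDP is "fire and forget". There is no feedback mechanism built into the protocol as there is with TCP. The only way to tune your transmit rate is for the far end to tell you that it is not receiving the whole stream. See also RFC 2309.

Re: packet sequences - reordering does not happen because of the physical layer; typically it happens because IP networks are "packet switched" rather than "circuit switched".

This means that each packet may take a different route through the network, and since those routes may have different latencies, packets can arrive out of order.

In practice, very few packets are lost to physical-layer errors these days. Packets are lost because they are sent into a limited-throughput pipe at a rate higher than that pipe can accommodate. Buffering can help by smoothing out the packet flow rate, but if the buffer fills up you are back to square one.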

answered 2009-05-20T13:13:27.517
3

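To avoid the problem with the OS buffers, you need to implement a rate control system. It can be closed loop (the receiver sends back ACKs and information about its buffers) or open loop (the sender slows itself down, which means you have to be conservative).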
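As an illustration of the open-loop variant, the sender can space its datagrams at a fixed rate instead of sleeping for an arbitrary time between them. This is a minimal Python sketch; `paced_send` and its parameters are hypothetical names, and the right `packets_per_second` has to be found by experiment and chosen conservatively.

```python
import time


def paced_send(sock, dest, chunks, packets_per_second,
               sleep=time.sleep, clock=time.monotonic):
    """Open-loop pacing: send datagrams evenly spaced at a fixed rate.

    `sleep` and `clock` are injectable only so the pacing logic can be
    exercised without real delays.
    """
    interval = 1.0 / packets_per_second
    next_send = clock()
    for chunk in chunks:
        delay = next_send - clock()
        if delay > 0:
            sleep(delay)  # wait until this datagram's slot
        sock.sendto(chunk, dest)
        next_send += interval  # schedule against the ideal timeline, not "now"
```

Scheduling against the ideal timeline means a late wake-up is followed by a shorter wait, so the average rate stays close to the target even when individual sleeps overshoot.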

There are semi-standard protocols on top of UDP that implement both. RBUDP (Reliable Blast UDP) comes to mind, and there are others.

answered 2009-05-20T13:55:42.520
2

If you're using UDP, the only way to detect packet loss, as far as I know, is going to involve some sort of feedback. If you're on a network with fairly consistent throughput, you could do a training period where you send bursts of data and wait for the receiver to respond and tell you how many packets from the burst it received (i.e. make the receiver count and, after a timeout, respond with the number it got). Then you just step up the amount of data per burst until you hit the limit and drop back down a little just to be sure.

This would avoid acks after the initial evaluation period, but will only work if the load on the network / receiving process does not change.
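The training period described above might look something like this in Python. It is only a sketch: `calibrate_burst` is a hypothetical name, and `probe` stands for whatever routine you write to send a burst of `n` packets and return the count the receiver reports back. The doubling step and the 0.8 safety factor are arbitrary assumptions.

```python
def calibrate_burst(probe, start=8, limit=4096, backoff=0.8):
    """Step the burst size up until the receiver reports loss, then back off.

    probe(n) must send a burst of n packets and return how many of them
    the receiver says it got.
    """
    size = start
    best = 0  # largest burst that arrived complete
    while size <= limit:
        if probe(size) < size:
            break  # receiver missed some packets: we hit the limit
        best = size
        size *= 2
    # Drop back a little below the largest loss-free burst, just to be sure.
    return max(1, int(best * backoff))
```

The result is only valid while the load on the network and on the receiving process stays roughly what it was during calibration.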

I've written UDP clients in Python before and the only time I've found any significant packet loss was when the input buffer on the receiving process was too small. As a result, when the system was under heavy load, you'd get packet loss because the buffer would silently overfill.
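For reference, asking for a larger receive buffer in Python looks roughly like this. It is a sketch: `make_udp_receiver` is a hypothetical helper and the 4 MiB request is an arbitrary assumption. The OS may clamp or adjust the requested size, so the sketch reads the value back instead of trusting it.

```python
import socket


def make_udp_receiver(host="127.0.0.1", port=0, rcvbuf_bytes=4 * 1024 * 1024):
    """Create a bound UDP socket and request a larger kernel receive buffer.

    Returns the socket and the buffer size the OS reports after the request.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    sock.bind((host, port))
    granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, granted
```

A bigger buffer only absorbs bursts; if the receiver is slower than the sender on average, it will still fill up eventually.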

answered 2009-05-20T13:46:36.967
1

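If you pass the WSA_FLAG_OVERLAPPED flag to WSASocket(), you can call WSARecvFrom() multiple times to queue up multiple receive I/O requests. That way there is already another buffer available to receive the next packet, even before your completion routine queues another I/O request.

That doesn't necessarily mean you won't drop packets. If your program doesn't supply enough buffers quickly enough, or takes too long to process and re-queue them, it won't be able to keep up, and that is when some kind of rate limiting may help.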
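The same principle, keeping reception decoupled from processing, can be shown without the Winsock API. The sketch below is a Python analogue and not overlapped I/O: a dedicated thread does nothing but pull datagrams off the socket and hand them to a queue, so slow processing does not leave the socket unread. `start_drain_thread` is a hypothetical name.

```python
import queue
import socket
import threading


def start_drain_thread(sock, out_queue, stop_event, max_datagram=65535):
    """Start a thread that only receives datagrams and queues them."""
    def drain():
        while not stop_event.is_set():
            try:
                data, addr = sock.recvfrom(max_datagram)
            except socket.timeout:
                continue  # wake up periodically to check stop_event
            except OSError:
                break  # socket was closed
            out_queue.put((data, addr))

    sock.settimeout(0.1)
    thread = threading.Thread(target=drain, daemon=True)
    thread.start()
    return thread
```

The processing code then reads from `out_queue` at its own pace. As with multiple posted receives, this only moves the limit: if processing is slower than the arrival rate on average, the queue grows without bound unless the sender is rate limited.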

answered 2009-05-21T05:27:39.523
0

You have to be doing something wrong. The only way you should be losing packets is 1) An unreliable network 2) You are sending data too fast to be handled by your receiving program. 3) You are sending messages that are bigger than the UDP max message size 4) Each device in your network has a max message size (MTU), so you might be exceeding a limit there.

In case #1, since you are sending on the same machine, the network is not even involved so it should be 100% reliable. You didn't say you had 2 network cards so I don't think this is an issue.

In case #2, you usually have to send a heck of a lot of data before you start dropping data. From your description, that does not sound like the case.

In case #3, make sure all your messages fall below this limit.

In case #4, I'm pretty certain if you meet the UDP max message size then you should be ok, but there very well could be some older hardware or custom device with a small MTU that your data is going through. If that is the case then those packets will be silently dropped.
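To keep every datagram under the size limits and to make the reordering explicit and checkable, each piece can carry a small header. This is a minimal Python sketch: the header layout, the names `split_message` and `reassemble`, and the 1400-byte payload size are assumptions. 1400 bytes is chosen because it fits inside a typical 1500-byte Ethernet MTU once the IP and UDP headers are added.

```python
import struct

HEADER = struct.Struct("!IHH")  # message id, chunk index, chunk count
MAX_PAYLOAD = 1400              # assumed safe payload size per datagram


def split_message(msg_id, payload, max_payload=MAX_PAYLOAD):
    """Split a message into datagrams, each prefixed with a sequence header."""
    chunks = [payload[i:i + max_payload]
              for i in range(0, len(payload), max_payload)] or [b""]
    total = len(chunks)
    return [HEADER.pack(msg_id, i, total) + c for i, c in enumerate(chunks)]


def reassemble(datagrams):
    """Rebuild one message from its datagrams, in any arrival order.

    Returns the payload, or None if any chunk is missing. All datagrams
    are assumed to belong to the same message.
    """
    parts = {}
    total = None
    for d in datagrams:
        _msg_id, index, count = HEADER.unpack_from(d)
        parts[index] = d[HEADER.size:]  # duplicates simply overwrite
        total = count
    if total is None or len(parts) != total:
        return None
    return b"".join(parts[i] for i in range(total))
```

With an explicit index and count in every datagram, the receiver can tell a lost chunk apart from a reordering bug, which is hard to do when the ordering is implicit.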

I have used UDP in many applications and it has proven very reliable. Are you using MFC for receiving the messages? If you are, then you need to read the documentation very carefully, as it clearly states some issues that you need to be aware of, but most people just gloss over them. I've had to fix quite a few of those oversights when people couldn't figure out why their messaging wasn't working.

EDIT: You say that your packets are implicitly reordered. I might begin by verifying that your implicit reordering is really working correctly. That seems like the most likely candidate for your problem.

EDIT#2: Have you tried using a network monitor? Microsoft has (or at least used to have) a free program called Network Monitor that will probably help.

answered 2009-05-20T13:43:58.703
0

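I suspect that your machine's IP layer cannot transmit the packets as fast as you are sending them.

That is probably because the protocol permits packets to be dropped, and the other goal, transmitting packets as quickly as possible, could not be achieved otherwise.

Other traffic or CPU-hungry processes on your machine might explain the varying results. Did you watch with top (Unix) or Process Explorer (NT) during your tests?

answered 2009-05-20T13:19:43.603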
-1

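It looks like the OS buffering cannot keep up because context switches are too infrequent, i.e. the low-level send path needs more frequent context switches. Check whether there is a way to tune the low-level send buffer size.

answered 2014-02-11T04:27:49.327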