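I want to capture all incoming HTTP packets on my machine. To do that I'm using SharpPcap, which is a WinPcap wrapper.
SharpPcap works very well, but it captures TCP packets, which is too low-level for what I need. Does anyone know how I can easily get full HTTP requests/responses from all these TCP packets?
Thanks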
SharpPcap can already capture packets the same way Wireshark does (just in code rather than a GUI). You can either parse them directly or dump them to disk in the common .pcap file format.
The steps to parse a capture are:
If you're reading .pcap dump files, the process is almost the same, except that you call an offline capture reader, don't need to pick an interface, and don't need to set promiscuous mode. All of the standard filters that Wireshark, tcpdump, and most other pcap frameworks use are supported in SharpPcap. For a reference to these, check the tcpdump man page.
Currently there is no support for parsing HTTP directly, but parsing TCP packets is really easy.
When you receive the raw (unparsed) packet, do this:
TCPPacket packet = TCPPacket.GetEncapsulated(rawPacket);
The Packet.Net parser (a separate component that is included with SharpPcap) can pull out the TCP portion directly even if the communication is encapsulated by VPN, PPPoE, or PPP.
Once you have the TCPPacket parsed, just grab packet.PayloadBytes to get the payload as a byte array. It should contain the HTTP header in raw bytes, which can be converted to text (I'm not really sure whether HTTP headers use UTF-8 or ASCII encoding at that level). There should be plenty of freely available tools/libraries to parse HTTP headers.
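To illustrate the conversion step, here is a minimal sketch that turns a payload byte array into a start line and a set of header fields. It assumes the whole header block arrived in a single payload and decodes it as ASCII, which covers the header syntax itself. `HttpHeaderSketch` and `TryParse` are hypothetical names for this example, not part of SharpPcap or Packet.Net.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of SharpPcap or Packet.Net.
public static class HttpHeaderSketch
{
    // Returns false if the payload does not contain a complete header block.
    public static bool TryParse(byte[] payload, out string startLine,
                                out Dictionary<string, string> headers)
    {
        startLine = null;
        headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        // Assumption: ASCII decoding is good enough for the header block.
        string text = Encoding.ASCII.GetString(payload);

        // The header block ends at the first empty line (CRLF CRLF).
        int end = text.IndexOf("\r\n\r\n", StringComparison.Ordinal);
        if (end < 0) return false;

        string[] lines = text.Substring(0, end)
                             .Split(new[] { "\r\n" }, StringSplitOptions.None);
        startLine = lines[0]; // request line or status line

        for (int i = 1; i < lines.Length; i++)
        {
            int colon = lines[i].IndexOf(':');
            if (colon <= 0) continue; // skip lines that are not "Name: value"
            string name = lines[i].Substring(0, colon).Trim();
            headers[name] = lines[i].Substring(colon + 1).Trim();
        }
        return true;
    }
}
```

You would call it with the byte array taken from the parsed TCP packet. A payload that only holds part of the header makes it return false, which is the case the reassembly discussion further down deals with.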
To extract the HTTP packet from TCP:
You need to collect the TCP packets of the connection as they come in, and if the data is split across several segments (for example, because it exceeds the typical Ethernet MTU of about 1500 bytes), you need to reassemble the parts in memory. To discover which parts go in what order, you need to carefully track the sequence/acknowledgement numbers.
This is a non-trivial thing to accomplish with SharpPcap because you're working with a much lower part of the stack and re-assembling the connection manually.
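As a rough idea of the ordering step only, the sketch below buffers segments by sequence number and appends them once they become contiguous. It is deliberately simplified: it handles one direction of one connection, ignores sequence-number wraparound and partially overlapping segments, and `TcpStreamSketch` is a hypothetical class, not something SharpPcap provides.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for illustration; not part of SharpPcap or Packet.Net.
// Handles one direction of one TCP connection.
public sealed class TcpStreamSketch
{
    // Segments that arrived ahead of a gap, keyed by their sequence number.
    private readonly Dictionary<uint, byte[]> pending = new Dictionary<uint, byte[]>();
    private readonly MemoryStream assembled = new MemoryStream();
    private uint nextSeq;

    // initialSeq: sequence number of the first payload byte
    // (the SYN's sequence number plus one).
    public TcpStreamSketch(uint initialSeq)
    {
        nextSeq = initialSeq;
    }

    public void Add(uint seq, byte[] payload)
    {
        // Ignore empty segments (pure ACKs) and data that was already
        // assembled (retransmissions). Wraparound is not handled here.
        if (payload.Length == 0 || seq < nextSeq) return;
        pending[seq] = payload;

        // Append every buffered segment that continues the stream without a gap.
        byte[] data;
        while (pending.TryGetValue(nextSeq, out data))
        {
            assembled.Write(data, 0, data.Length);
            pending.Remove(nextSeq);
            nextSeq += (uint)data.Length;
        }
    }

    // The contiguous bytes received so far.
    public byte[] Assembled
    {
        get { return assembled.ToArray(); }
    }
}
```

You would feed each packet's sequence number and payload into `Add` and check `Assembled` for a complete HTTP message. A real implementation also has to track both directions, connection teardown, and lost segments, which is where most of the effort goes.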
Wireshark has an interesting article on how to accomplish this in C.
As of right now, SharpPcap doesn't support TCP payload parsing.
If you're looking for easy-to-follow examples of how to use SharpPcap, download the source tree and look at the example projects included. There is also a tutorial for SharpPcap on CodeProject.
If you have more questions and/or you want to make any feature requests to the project, feel free to post on the SourceForge project. It is far from dead and continues to be under active development.
Note: Chris Morgan is the project lead and I'm one of the developers for SharpPcap/Packet.Net.
Update: The tutorial project on CodeProject is now up to date and matches the current API.
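Decoding a TCP stream into HTTP request/response pairs is not trivial. Tools like Wireshark put considerable effort into it.
I wrote a Wireshark wrapper for Ruby (which won't help you), but before writing it I tried using tshark (the command-line version of Wireshark). That didn't solve my problem, but it may work for you. Here's how:
You capture the packets and write them to a pcap file (SharpPcap probably has a way to do this). At some point, close the capture file and start another one, then run tshark on the old file with a filter for HTTP traffic and a flag indicating that you want output in PDML format. You'll find that this is an XML format, easily parsed with the System.Xml tools, which contains the value of every HTTP field in a variety of formats. You can write C# code to spawn tshark and pipe its StdOut stream into an XML reader, so that you get the packets out of tshark as they appear. I don't recommend using a DOM parser, as the PDML output for a large capture file can get crazy very quickly.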
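A minimal sketch of that pipeline might look like the following. It assumes tshark is on the PATH and is a release where `-Y` is the display-filter flag (older releases used `-R`), and that HTTP fields appear as `field` elements with `name` and `show` attributes. `PdmlSketch`, `ReadHttpFields`, and `ReadCapture` are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Xml;

// Hypothetical helper for illustration.
public static class PdmlSketch
{
    // Streams PDML with a forward-only XmlReader (no DOM) and yields
    // (field name, display value) pairs for every field named "http.*".
    public static IEnumerable<KeyValuePair<string, string>> ReadHttpFields(TextReader pdml)
    {
        using (XmlReader reader = XmlReader.Create(pdml))
        {
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element || reader.Name != "field")
                    continue;
                string name = reader.GetAttribute("name");
                if (name != null && name.StartsWith("http.", StringComparison.Ordinal))
                    yield return new KeyValuePair<string, string>(
                        name, reader.GetAttribute("show"));
            }
        }
    }

    // Runs tshark on a finished capture file and parses its standard output
    // as it is produced. Assumes tshark is on the PATH.
    public static IEnumerable<KeyValuePair<string, string>> ReadCapture(string pcapPath)
    {
        var info = new ProcessStartInfo(
            "tshark", "-r \"" + pcapPath + "\" -Y http -T pdml")
        {
            RedirectStandardOutput = true,
            UseShellExecute = false
        };
        using (Process tshark = Process.Start(info))
        {
            foreach (var field in ReadHttpFields(tshark.StandardOutput))
                yield return field;
            tshark.WaitForExit();
        }
    }
}
```

Keeping the XML handling separate from the process handling means the parser can be tried on a saved PDML file before tshark is wired in.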
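Unless your requirements are complex (as mine were), this may be all you need.
I think you're close to the solution: if you have the TCP packets from the HTTP traffic, you just need to extract the TCP payload to rebuild the HTTP request/response. See this SO entry for a possible way to do it.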