c - How to hijack all local HTTP requests and extract the URL using C?

Question

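Which direction should I go in (libraries, documentation)?

Update

Can someone illustrate how to use WinPcap to do this?

Update 2

How can I verify whether a packet is an HTTP packet?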

score 15 · Accepted Answer

If by "hijack" you mean sniffing the packets, then this is what you should do with WinPcap:

Find the device you want to use - see the WinPcap tutorial.

Open the device using pcap_open:

```
// Open the device
char errorBuffer[PCAP_ERRBUF_SIZE];
pcap_t *pcapDescriptor = pcap_open(source,                // name of the device
                                   snapshotLength,        // portion of the packet to capture
                                                          // 65536 guarantees that the whole packet will be captured on all the link layers
                                   attributes,            // 0 for no flags, 1 for promiscuous
                                   readTimeout,           // read timeout
                                   NULL,                  // authentication on the remote machine
                                   errorBuffer);          // error buffer
```

Use a function that reads packets from the descriptor, such as pcap_loop:
```
int result = pcap_loop(pcapDescriptor, count, functionPointer, NULL);
```
This loops until an error occurs or the loop is broken with a special call (pcap_breakloop). It calls functionPointer for each packet.

In the function that functionPointer points to, implement the code that parses the packets. It should have the pcap_handler signature:

```
typedef void (*pcap_handler)(u_char *, const struct pcap_pkthdr *,
         const u_char *);
```

Now all that is left is to parse the packets: the packet buffer is the const u_char* argument, and its length is in the caplen field of the pcap_pkthdr structure.

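Assuming you have an HTTP GET over TCP over IPv4 over Ethernet packet, you can:
- Skip the 14 bytes of the Ethernet header.
- Skip the 20 bytes of the IPv4 header (assuming there are no IPv4 options; if you suspect options are possible, read the IHL field, the low 4 bits of the first byte of the IPv4 header, and multiply it by 4 to get the number of bytes the IPv4 header takes).
- Skip the 20 bytes of the TCP header (assuming there are no TCP options; if you suspect options are possible, read the data offset field, bits 96-99 of the TCP header, which are the high 4 bits of its byte 12, and multiply it by 4 to get the number of bytes the TCP header takes).
- The rest of the packet should be the HTTP text. The text between the first and second space should be the URI. If it is too long you might need to do some TCP reassembly, but most URIs are small enough to fit in one packet.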
  
Update: in code, this would look something like this (I wrote it without testing):
```
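int tcp_len, url_length, remaining;
u_char *url, *end_url, *final_url, *tcp_payload;

... /* code in http://www.winpcap.org/docs/docs_40_2/html/group__wpcap__tut6.html */

/* retrieve the position of the tcp header */
ip_len = (ih->ver_ihl & 0xf) * 4;

/* retrieve the position of the tcp payload */
tcp_len = (((u_char*)ih)[ip_len + 12] >> 4) * 4;
tcp_payload = (u_char*)ih + ip_len + tcp_len;

/* bytes captured after the headers; header and pkt_data are assumed to be
   the handler parameters used in the tutorial code */
remaining = (int)header->caplen - (int)(tcp_payload - pkt_data);

/* only handle payloads that start with "GET " */
if (remaining < 4 || memcmp(tcp_payload, "GET ", 4) != 0)
    return;

/* start of url - skip "GET " */
url = tcp_payload + 4;

/* length of url - look for a space; the buffer is not null terminated,
   so bound the search with memchr instead of using strchr */
end_url = (u_char*)memchr(url, ' ', remaining - 4);
if (end_url == NULL)
    return;
url_length = (int)(end_url - url);

/* copy the url to a null terminated c string */
final_url = (u_char*)malloc(url_length + 1);
if (final_url == NULL)
    return;
memcpy(final_url, url, url_length);
final_url[url_length] = '\0';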
```
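
For reference, the header-skipping steps listed above can also be written as a standalone function that works on the raw frame and checks every length before reading. This is only a sketch under stated assumptions: `extract_get_uri` is a hypothetical helper name (not part of WinPcap), the frame is assumed to be plain Ethernet II without a VLAN tag, and IP fragmentation and TCP reassembly are ignored.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a WinPcap function): copy the request URI of an
 * HTTP GET carried in an Ethernet/IPv4/TCP frame into `out`.
 * Returns the URI length, or -1 if the frame is not such a request,
 * is truncated, or the URI does not fit in `out`.
 */
int extract_get_uri(const unsigned char *frame, size_t caplen,
                    char *out, size_t out_size)
{
    const size_t eth_len = 14;              /* Ethernet II header, no VLAN tag */
    size_t ip_len, tcp_len, offset, uri_len;
    const unsigned char *ip, *tcp, *payload, *end;

    /* Need the Ethernet header plus a minimal IPv4 header. */
    if (caplen < eth_len + 20)
        return -1;
    /* EtherType 0x0800 means IPv4. */
    if (frame[12] != 0x08 || frame[13] != 0x00)
        return -1;

    ip = frame + eth_len;
    if ((ip[0] >> 4) != 4)                  /* IP version field */
        return -1;
    ip_len = (size_t)(ip[0] & 0x0f) * 4;    /* IHL, counted in 32-bit words */
    if (ip_len < 20 || ip[9] != 6)          /* protocol 6 is TCP */
        return -1;
    if (caplen < eth_len + ip_len + 20)
        return -1;

    tcp = ip + ip_len;
    tcp_len = (size_t)(tcp[12] >> 4) * 4;   /* data offset, in 32-bit words */
    if (tcp_len < 20)
        return -1;
    offset = eth_len + ip_len + tcp_len;
    if (caplen < offset + 4)
        return -1;

    payload = frame + offset;
    if (memcmp(payload, "GET ", 4) != 0)
        return -1;

    /* The capture buffer is not NUL-terminated, so search with a bound. */
    end = memchr(payload + 4, ' ', caplen - offset - 4);
    if (end == NULL)
        return -1;                          /* URI may continue in a later segment */
    uri_len = (size_t)(end - (payload + 4));
    if (uri_len + 1 > out_size)
        return -1;

    memcpy(out, payload + 4, uri_len);
    out[uri_len] = '\0';
    return (int)uri_len;
}
```

Inside a pcap_handler, this would be called with the packet buffer and the caplen field of the pcap_pkthdr, plus a caller-owned output buffer such as `char uri[2048]`.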

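You can also filter only HTTP traffic by creating and setting a BPF filter. See the WinPcap tutorial. You should probably use the filter "tcp and dst port 80", which gives you only the requests your computer sends to the server.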
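
A port filter selects packets by port only, so it does not by itself prove that a payload is HTTP. One rough way to check, as asked in Update 2 of the question, is to test whether the TCP payload starts with a request method followed by a space. The sketch below is a heuristic, not a full HTTP parser; `looks_like_http_request` is a hypothetical helper name and the method list is an assumption that covers only common HTTP/1.x methods.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if the TCP payload starts with a common
 * HTTP/1.x request method followed by a space, 0 otherwise.
 */
int looks_like_http_request(const unsigned char *payload, size_t len)
{
    static const char *const methods[] = {
        "GET ", "POST ", "HEAD ", "PUT ", "DELETE ", "OPTIONS ", "PATCH "
    };
    size_t i;

    for (i = 0; i < sizeof methods / sizeof methods[0]; i++) {
        size_t n = strlen(methods[i]);
        /* Compare only when enough bytes were captured for this method. */
        if (len >= n && memcmp(payload, methods[i], n) == 0)
            return 1;
    }
    return 0;
}
```

Segments that carry the middle of a long request, or a response from the server, will not match, so this only identifies the first segment of a request.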

If you don't mind using C#, you can try Pcap.Net, which does all of this for you more easily, including parsing the Ethernet, IPv4 and TCP parts of the packet.

score 2 · Accepted Answer

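It may sound like overkill, but the web proxy/cache server Squid does exactly that. A few years ago my company used it, and I had to tweak the code locally to show some special warnings when certain URLs were accessed, so I know it can do what you want. You just need to find the code you want and pull it out for your project. I used the 2.X version and I see they are up to 3.X now, but I suspect that aspect of the code hasn't changed much internally.

You didn't say whether Windows is a "requirement" or a "preference", but according to the site http://www.squid-cache.org/ they can do both.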

score 1 · Accepted Answer

Try http://www.winpcap.org/

Answered 2010-04-24T04:39:55.543

score 0 · Accepted Answer

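You may want to look at the source code of tcpdump to see how it works. tcpdump is a Linux command-line utility that monitors and prints network activity. You need root access to the machine to use it, though.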
