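I'm trying to build a pipeline that uses awk to extract certain fields and the ASCII data (source IP, destination IP, and payload) from every packet in a stream captured by tcpdump, but I'm stuck. I think the problem is that the payload is arbitrary, so it's hard to find a fixed structure that awk can use to split the stream into records. Here is my current command: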

sudo tcpdump -i en1 -A -q -l | awk '{ print "fields are", $3, $5, $8 }'

Here is the output for a single packet that I want to filter:

    12:45:23.890302 IP 10.0.1.3.52695 > weblnb.fogcreek.com.http: tcp 739
E....M@.@...
T.........P-.....&.....
2U......GET /default.asp?pg=pgRss&ixDiscussGroup=5 HTTP/1.1
Host: discuss.joelonsoftware.com
User-Agent: Vienna/2.6.0.2601
Accept: */*
Accept-Encoding: gzip
Accept-Language: en-us
Cookie: __utma=261409944.1875583.1351297139.1362842383.1362868129.78; __utmz=261409944.1358134504.43.4.utmcsr=joelonsoftware.com|utmccn=(referral)|utmcmd=referral|utmcct=/; fb_SessionId=qc48cvnjvacl3jeo76l8qv69emn119; DBID=LTOJIXRXTFAPXDGFBKCAYLVCILYFCA; fbToken=lqdf3avvfodabtfvd5c4drt18107B8; sUniqueID=20121026230417-66.117.217.10-slb5btkgb5; __utma=131697940.47826445.1351869116.1360335377.1361680499.5; __utmz=131697940.1361680499.5.2.utmccn=(referral)|utmcsr=statcounter.com|utmcct=/p8568424/exit_link_activity/|utmcmd=referral
Connection: keep-alive

The desired output of this filter is:

10.0.1.3.52695  weblnb.fogcreek.com.http: { E....M@.@...
    T.........P-.....&.....
    2U......GET /default.asp?pg=pgRss&ixDiscussGroup=5 HTTP/1.1
    Host: discuss.joelonsoftware.com
    User-Agent: Vienna/2.6.0.2601
    Accept: */*
    Accept-Encoding: gzip
    Accept-Language: en-us
    Cookie: __utma=261409944.1875583.1351297139.1362842383.1362868129.78; __utmz=261409944.1358134504.43.4.utmcsr=joelonsoftware.com|utmccn=(referral)|utmcmd=referral|utmcct=/; fb_SessionId=qc48cvnjvacl3jeo76l8qv69emn119; DBID=LTOJIXRXTFAPXDGFBKCAYLVCILYFCA; fbToken=lqdf3avvfodabtfvd5c4drt18107B8; sUniqueID=20121026230417-66.117.217.10-slb5btkgb5; __utma=131697940.47826445.1351869116.1360335377.1361680499.5; __utmz=131697940.1361680499.5.2.utmccn=(referral)|utmcsr=statcounter.com|utmcct=/p8568424/exit_link_activity/|utmcmd=referral
    Connection: keep-alive}

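Note: the level of abstraction here is not limited to the single concrete example above. The general structure of the filtered output should be: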

$sourceip $targetip {$raw_packet_data/payload,_could_be_http_stream_or_just_plain_gibberish}

The end of the payload field should be delimited by the start of the next packet (cf. $sourceip).

The awk filter should also process every packet in the tcpdump output stream this way, not just a single one.

Any suggestions on how to achieve this?

1 Answer

The following maps your sample input to the desired output. Does it work for the whole stream?

$ awk '/tcp [0-9]+/{printf "%s %s { ",$3,$5;getline;print $0;next}$1=="Connection:"{$2=$2"}"}{printf "\t%s\n",$0}' file
10.0.1.3.52695 weblnb.fogcreek.com.http: { E....M@.@...
    T.........P-.....&.....
    2U......GET /default.asp?pg=pgRss&ixDiscussGroup=5 HTTP/1.1
    Host: discuss.joelonsoftware.com
    User-Agent: Vienna/2.6.0.2601
    Accept: */*
    Accept-Encoding: gzip
    Accept-Language: en-us
    Cookie: __utma=261409944.1875583.1351297139.1362842383.1362868129.78; __utmz=261409944.1358134504.43.4.utmcsr=joelonsoftware.com|utmccn=(referral)|utmcmd=referral|utmcct=/; fb_SessionId=qc48cvnjvacl3jeo76l8qv69emn119; DBID=LTOJIXRXTFAPXDGFBKCAYLVCILYFCA; fbToken=lqdf3avvfodabtfvd5c4drt18107B8; sUniqueID=20121026230417-66.117.217.10-slb5btkgb5; __utma=131697940.47826445.1351869116.1360335377.1361680499.5; __utmz=131697940.1361680499.5.2.utmccn=(referral)|utmcsr=statcounter.com|utmcct=/p8568424/exit_link_activity/|utmcmd=referral
    Connection: keep-alive}
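
The one-liner above closes the brace on the `Connection:` header, which only works for this particular HTTP request. A more general sketch, which is not the command above but a variant of it, treats each tcpdump header line as the record delimiter, as the question asks. It assumes that every packet header starts at column 0 with an `HH:MM:SS.ffffff` timestamp followed by ` IP `, as in the sample; the function name `format_packets` and the variable names are hypothetical. A payload line that happens to look like a header would be misread as a new packet.

```shell
# Reformat "tcpdump -A -q -l" output into: src dst { payload }
# Assumption: a line starting with a timestamp followed by " IP " begins a new packet.
format_packets() {
  awk '
    # Print the pending line and close the brace of the current packet.
    function emit() { if (in_pkt) print buf "}" }

    /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9]+ IP / {
      emit()                         # the previous packet ends here
      buf = $3 " " $5 " { "          # $3 = source, $5 = destination
      in_pkt = 1
      fresh = 1
      next
    }

    in_pkt {
      if (fresh) {                   # first payload line joins the header line
        buf = buf $0
        fresh = 0
      } else {                       # later lines are indented; hold the newest
        print buf                    # one back so "}" can be appended to it
        buf = "    " $0
      }
    }

    END { emit() }                   # close the last packet at end of input
  '
}
```

It would be used as `sudo tcpdump -i en1 -A -q -l | format_packets`. Because a packet is only known to be complete once the next header arrives, the last line of each packet is printed with a delay of one packet on a live stream.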
answered 2013-03-10T18:47:17.273