I need to split a query string into its individual variables (there can be any number of them) for debugging purposes:

The input comes from tshark, and the goal is to debug Google Analytics events in real time. The tshark output looks like this:

82.387501       hampus -> domain.net 1261 GET /__utm.gif?utmwv=5.3.7&utms=22&utmn=1234&utmhn=domain.com&utmt=event&utme=5(x*y*z%2Fstart%2Fklipp%2F166_SS%20example)(10)&utmcs=UTF-8~ HTTP/1.1 

What I would like is a more readable version:

utmhn:  domain.com
utmt:   event
utme:   5(x*y*z/start/klipp/166_SS/example)(10)
utmcs:  UTF-8

Or even better:

utmhn:  domain.com
utmt:   event
utme:   5(
          x
          y
          z/start/klipp/166_SS/example
         )(10)
utmcs:  UTF-8

But I can't get my head around sed (or awk) for this purpose...

6 Answers

File (samp1.txt)

82.387501       hampus -> domain.net 1261 GET /__utm.gif?utmwv=5.3.7&utms=22&utmn=1234&utmhn=domain.com&utmt=event&utme=5(x*y*z%2Fstart%2Fklipp%2F166_SS%20example)(10)&utmcs=UTF-8~ HTTP/1.1 

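Command (GNU sed, since it relies on \n and \t in the replacement)

 sed 's/.*utmhn=/utmhn:    /
     s/&utmt=/\nutmt:     /
     s/&utme=/\nutme:     /
     s/&utmcs=/\nutmcs:    /
     # decode the two percent-escapes that occur in the sample
     s:%2F:/:g
     s:%20: :g
     # break the first "(...)" group of utme onto indented lines
     s:(:(\n\t    :
     s:\*:\n\t    :g
     s:):\n\t  ):
     # drop the trailing "~ HTTP/1.1"
     s/~.*$//' samp1.txt

Output

utmhn:    domain.com
utmt:     event
utme:     5(
            x
            y
            z/start/klipp/166_SS example
          )(10)
utmcs:    UTF-8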

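I'm not sure what to make of the %20 in your sample data: it decodes to a space, but your expected output shows a '/' in that position. Did you type some of it by hand?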
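To make that mismatch concrete, here is a small Python check (not part of the sed approach above, just a quick way to see what a standard percent-decoder does). It assumes Python 3 and uses only the standard library; the variable names are made up for illustration.

```python
from urllib.parse import unquote

# utme value copied from the sample tshark line in the question
raw = "5(x*y*z%2Fstart%2Fklipp%2F166_SS%20example)(10)"

# %2F decodes to "/" and %20 decodes to a space
decoded = unquote(raw)
print(decoded)
```

So a faithful decode gives "166_SS example", not "166_SS/example" as in the question's expected output.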

answered 2012-11-01T21:30:58.340

Here's one way using GNU awk. Run it like this:

awk -f script.awk file.txt

Contents of script.awk:

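BEGIN {
    # split on whitespace, "=", "&" and "~" so each name is followed by its value
    FS="[ \t=&~]+"
    OFS="\t"
}

{
    for (i=1; i<=NF; i++) {
        if ($i ~ /^utmhn$|^utmt$|^utme$|^utmcs$/) {

             # spread the first "(...)" group of utme over indented lines
             if ($i == "utme") {
                 sub(/\(/,"(\n\t  ", $(i+1))
                 gsub(/\*/,"\n\t  ", $(i+1))
                 sub(/\)/,"\n\t )", $(i+1))
             }

             print $i":", $(i+1)
        }
    }
}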

Results:

utmhn:  domain.com
utmt:   event
utme:   5(
          x
          y
          z%2Fstart%2Fklipp%2F166_SS%20example
         )(10)
utmcs:  UTF-8

Alternatively, here's the one-liner:

awk 'BEGIN { FS="[ \t=&~]+"; OFS="\t" } { for (i=1; i<=NF; i++) { if ($i ~ /^utmhn$|^utmt$|^utme$|^utmcs$/) { if ($i == "utme") { sub(/\(/,"(\n\t  ", $(i+1)); gsub(/\*/,"\n\t  ", $(i+1)); sub(/\)/,"\n\t )", $(i+1)) } print $i":", $(i+1) } } }' file.txt
answered 2012-11-01T21:49:34.660

Another approach, using Perl:

#!/usr/bin/perl -l
use strict; use warnings;

while (<>) {
    my @arr;
    # grab the query string of the GET request; skip lines that have none
    my ($qs) = m/.*?GET.*?\?(\S+)\s/;
    next unless defined $qs;
    my @pairs = split(/[&~]/, $qs);
    foreach my $pair (@pairs) {
        my ($name, $value) = split(/=/, $pair, 2);
        if ($name eq 'utme') {
            # %2F and %20 both become "/", as in the requested output
            $value =~ s!(%2F|%20)!/!g;
            $value =~ s!\*!\n\t\t!g;
            $value =~ s!\(!(\n\t\t!;
            $value =~ s/\)\(/\n\t)(/;
        }
        # let's URI unescape stuff
        $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
        if ($name eq 'utmhn') {
            print "$name: $value";
        }
        else {
            push @arr, "$name: $value";
        }
    }

    print join "\n", @arr;
    print "\n";
}

Output

utmhn: domain.com
utmwv: 5.3.7
utms: 22
utmn: 1234
utmt: event
utme: 5(
                x
                y
                z/start/klipp/166_SS/example
        )(10)
utmcs: UTF-8

Usage

tshark ... | ./script.pl

Advantages

  • I make sure utmhn: domain.com is displayed on the first line
  • I run a URI unescape on the values
  • It is not limited to "utmhn", "utmt", "utme", "utmcs"
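
For comparison, the "unescape everything, don't hard-code the names" idea can also be sketched in Python with the standard library's query-string parser. This is only an illustration, not a port of the Perl script: format_line and SAMPLE are names invented here, the regex assumes the request looks like the sample line, and %20 is decoded to a real space rather than "/". It also does not move utmhn to the first line.

```python
import re
from urllib.parse import parse_qsl

# sample tshark line from the question
SAMPLE = ("82.387501       hampus -> domain.net 1261 GET /__utm.gif?"
          "utmwv=5.3.7&utms=22&utmn=1234&utmhn=domain.com&utmt=event"
          "&utme=5(x*y*z%2Fstart%2Fklipp%2F166_SS%20example)(10)"
          "&utmcs=UTF-8~ HTTP/1.1 ")

def format_line(line):
    """Return 'name:<TAB>value' strings for every query parameter in one line."""
    match = re.search(r"GET \S*?\?(\S+)", line)
    if match is None:
        return []  # not a GET request with a query string
    query = match.group(1).rstrip("~")  # drop the trailing "~" seen in the sample
    out = []
    # parse_qsl splits on "&" and percent-decodes names and values
    for name, value in parse_qsl(query, keep_blank_values=True):
        if name == "utme":
            # put the members of the first "(...)" group on indented lines
            value = value.replace("(", "(\n\t", 1)
            value = value.replace("*", "\n\t")
            value = value.replace(")", "\n)", 1)
        out.append(f"{name}:\t{value}")
    return out

print("\n".join(format_line(SAMPLE)))
```

To use it on live traffic you would loop over sys.stdin and call format_line on each line, in the same way the Perl script reads from the tshark pipe.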
answered 2012-11-02T02:10:37.277

Assuming your data is in a file named "file":

awk -F "&" '{ for (i=2; i<=NF; i++) { sub(/=/, ":\t", $i); sub(/~.*$/, "", $i); gsub(/%2F/, "/", $i); gsub(/%20/, " ", $i); print $i } }' file

This produces the output:

utms:   22
utmn:   1234
utmhn:  domain.com
utmt:   event
utme:   5(x*y*z/start/klipp/166_SS example)(10)
utmcs:  UTF-8

It's a bit dirty, but it works.

answered 2012-11-01T21:49:29.657

$ cat tst.awk
BEGIN { FS="[&=~]"; OFS=":\t" }
{
   for (i=1;i<=NF;i++) {
      map[$i]=$(i+1)
   }

   sub(/\(/,"&\n\t  ", map["utme"])
   gsub(/\*/,"\n\t  ", map["utme"])
   gsub(/%2./,"/",     map["utme"])
   sub(/\)/,"\n\t&",   map["utme"])

   print "utmhn", map["utmhn"]
   print "utmt",  map["utmt"]
   print "utme",  map["utme"]
   print "utmcs", map["utmcs"]
}
$
$ awk -f tst.awk file
utmhn:  domain.com
utmt:   event
utme:   5(
          x
          y
          z/start/klipp/166_SS/example
        )(10)
utmcs:  UTF-8
answered 2012-11-02T06:22:16.587

This might work for you (GNU sed):

sed 's/.*\(utmhn[^~]*\).*/\1/;s/&/\n/g;s/=/:\t/g;s/(/&\n\t/;s/\*/\n\t/g;s/%2F/\//g;s/%20/ /g;s/)/\n\t&/' file
answered 2012-11-02T09:42:38.820