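I currently have the following lines stored in a file named google.txt. I want to split these lines and store the resulting strings in arrays.

For example, for the first record:

@qf_file      = q33AgCEv006441
@date         = Tue Apr  3 16:12
@junk_message = User unknown
@rf_number    = ngandotra@nkn.in

Each record ends with its last email address, which goes into @rf_number.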
   q33AgCEv006441     1038 Tue Apr  3 16:12 <test10-list-bounces@lsmgr.nic.in>
                     (User unknown)
                     <ngandotra@nkn.in>
    q33BDrP9007220    50153 Tue Apr  3 16:43 <karuvoolam-list-bounces@lsmgr.nic.in>
                     (Deferred: 451 4.2.1 mailbox temporarily disabled: paond.tndt)
                      <paond.tndta@nic.in>
    q33BDrPB007220    50153 Tue Apr  3 16:43 <karuvoolam-list-bounces@lsmgr.nic.in>
                     (User unknown)
                     paocorp.tndta@nic.in>
                                             <dtocbe@tn.nic.in>
                                             <dtodgl@nic.in>
    q33BDrPA007220    50153 Tue Apr  3 16:43 <karuvoolam-list-bounces@lsmgr.nic.in>
                     (User unknown)
                     <dtokar@nic.in>
                     <dtocbe@nic.in>
    q2VDWKkY010407  2221878 Sat Mar 31 19:37 <dhc-list-bounces@lsmgr.nic.in>
                     (host map: lookup (now-india.net.in): deferred)
                     <arjunpan@now-india.net.in>
    q2VDWKkR010407  2221878 Sat Mar 31 19:31 <dhc-list-bounces@lsmgr.nic.in>
                     (host map: lookup (aaplawoffices.in): deferred)
                      <amit.bhagat@aaplawoffices.in>
    q2U8qZM7026999   360205 Fri Mar 30 14:38 <dhc-list-bounces@lsmgr.nic.in>
                     (host map: lookup (now-india.net.in): deferred)
                      <arjunpan@now-india.net.in>
                       <amit.bhagat@aaplawoffices.in>
    q2TEWWE4013920  2175270 Thu Mar 29 20:30 <dhc-list-bounces@lsmgr.nic.in>
                     (host map: lookup (now-india.net.in): deferred)
                               <arjunpan@now-india.net.in>
                               <amit.bhagat@aaplawoffices.in>

1 Answer

Untested Perl script:

Let's call this script parser.pl

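use strict;
use warnings;

my $file = shift or die "Usage: $0 FILE\n";
open(my $in, '<', $file) or die "Cannot open file: $file for reading ($!)\n";
my (@qf_file, @date, @junk_message, @rf_number);
while (my $line = <$in>) {
    # Queue ID: first word on the line, when followed by a numeric size.
    push @qf_file, $line =~ /^\s*(\w+)\s+\d+\s/;
    # Date: weekday, month, day of month, HH:MM.
    push @date, $line =~ /((?:Sat|Sun|Mon|Tue|Wed|Thu|Fri)\s+\w+\s+\d+\s+\d+:\d+)/;
    # Message: the text inside the outermost parentheses.
    push @junk_message, $line =~ /\((.+)\)\s*(?:<|$)/;
    # Recipient: the last <address> on the line.
    push @rf_number, $line =~ /<([^>]+)>\s*$/;
}
close($in);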

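This assumes that the last email address on each line is that line's "rf_number". Note that email addresses can be tricky to print, because they contain an @ character, and in a double-quoted string Perl will happily interpolate a nonexistent array for you :-)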
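To illustrate the note about the @ character, here is a minimal sketch of how quoting changes what gets printed. The variable names are made up for the example, and the `no strict 'vars'` block is there only to reproduce the silent interpolation; under `use strict` the unescaped form would instead fail to compile.

```perl
use strict;
use warnings;

# Single quotes never interpolate, so the address survives as written.
my $single = 'ngandotra@nkn.in';

# In double quotes, escape the @ so Perl does not look for an array @nkn.
my $escaped = "ngandotra\@nkn.in";

# Without strict 'vars', the unescaped @nkn interpolates as an empty array.
my $mangled = do {
    no strict 'vars';
    no warnings;
    "ngandotra@nkn.in";
};

print "$single\n";     # ngandotra@nkn.in
print "$escaped\n";    # ngandotra@nkn.in
print "$mangled\n";    # ngandotra.in
```

When printing the collected addresses, keeping them in variables (as the script does) avoids the problem entirely; it only bites when an address is typed literally inside double quotes.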

To invoke it from the command line:

perl parser.pl google.txt

See it working here.
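The script above matches one line at a time, while the sample in the question spreads each record over several lines. The following is a rough sketch for that layout, not part of the original answer. It assumes every record starts with a header line (queue ID, size, date, sender), followed by one parenthesised status line and one or more recipient lines in angle brackets. The helper name `parse_queue` is hypothetical, and the hash keys simply mirror the array names from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: groups continuation lines with the header line
# that precedes them and returns one hash reference per record.
sub parse_queue {
    my ($fh) = @_;
    my @records;
    while (my $line = <$fh>) {
        if ($line =~ /^\s*(\w+)\s+\d+\s+((?:Sat|Sun|Mon|Tue|Wed|Thu|Fri)\s+\w+\s+\d+\s+\d+:\d+)/) {
            # Header line: queue ID, size, date, sender. Starts a new record.
            push @records, { qf_file => $1, date => $2, junk_message => '', rf_number => [] };
        }
        elsif (@records and $line =~ /^\s*\((.+)\)\s*$/) {
            # Parenthesised status line belonging to the current record.
            $records[-1]{junk_message} = $1;
        }
        elsif (@records and $line =~ /^\s*<([^>]+)>\s*$/) {
            # Recipient line; a record may list several.
            push @{ $records[-1]{rf_number} }, $1;
        }
    }
    return @records;
}

# Two records copied from the question, read through an in-memory filehandle.
my $sample = <<'END';
   q33AgCEv006441     1038 Tue Apr  3 16:12 <test10-list-bounces@lsmgr.nic.in>
                     (User unknown)
                     <ngandotra@nkn.in>
    q33BDrPA007220    50153 Tue Apr  3 16:43 <karuvoolam-list-bounces@lsmgr.nic.in>
                     (User unknown)
                     <dtokar@nic.in>
                     <dtocbe@nic.in>
END

open(my $fh, '<', \$sample) or die "Cannot open in-memory sample ($!)\n";
my @records = parse_queue($fh);
close($fh);

for my $r (@records) {
    print join(' | ', $r->{qf_file}, $r->{date}, $r->{junk_message},
               "@{ $r->{rf_number} }"), "\n";
}
```

Keeping one hash per record, rather than four parallel arrays, means a record with several recipients stays aligned with its queue ID. Lines that do not fit any of the three patterns are skipped silently, which may or may not be what you want for malformed input.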

Answered 2012-05-31T17:13:30.777