perl - Perl: read 7 files and search for a word (the file names keep changing)

Question

Every day I receive a log file, for example:

/home/ado/log/log.20130605

The log file contains item ids and how many times each id was sold. I build a daily and a weekly ranking from it.

So I have a log reader like this:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use POSIX 'strftime';

    my $current_date = strftime "%Y%m%d", localtime;
    my $filename     = "/home/ado/log/log.$current_date";

    my %output;   # item id => number of matching lines (required under strict)
    open my $file, "<", $filename or die("$!: $filename");
    while (<$file>) {
        if (/item_id:(\d+)\s*,\s*start/) {
            $output{$1}++;
        }
    }
    close $file;
    for my $item (keys %output) {
        print "$item -> $output{$item}\n";
    }

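I save the result in a database.

I run it every day from cron. So far I have everything I need for the daily ranking.

But what about the weekly one?

That would mean writing a new script that reads 7 files at once: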

    /home/ado/log/log.20130603
    /home/ado/log/log.20130604
    /home/ado/log/log.20130605
    /home/ado/log/log.20130606
    /home/ado/log/log.20130607
    /home/ado/log/log.20130608
    /home/ado/log/log.20130609

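and searches them with the same regular expression. I would then run it weekly from cron.

How can I modify the script to read 7 files instead of 1, given that the file names keep changing? - adriancdperu, edited 4 minutes ago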

score 1 · Accepted Answer

I added a loop around the file handling and, before it, collect the log files, keeping only the seven most recent. The counts live in a single hash declared outside the loop, so the output is a total for the week rather than a separate list per file:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Newest first: the YYYYMMDD suffix sorts chronologically as a string.
    my @filenames = reverse sort glob("/home/ado/log/log.*");
    if (@filenames > 7) { $#filenames = 6; }   # keep the seven most recent

    my %output;   # shared by all files so the counts accumulate
    for my $filename (@filenames) {
        open my $file, "<", $filename or die("$!: $filename");
        while (<$file>) {
            if (/item_id:(\d+)\s*,\s*start/) {
                $output{$1}++;
            }
        }
        close $file;
    }

    for my $item (keys %output) {
        print "$item -> $output{$item}\n";
    }
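If the log directory can hold files with gaps or stray names, an alternative to globbing is to compute the seven expected names from the date. The following is only a sketch: `week_of_logs` is a hypothetical helper name, and the `/home/ado/log/log.YYYYMMDD` layout is taken from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime mktime);

# Hypothetical helper: names of the $days daily logs ending on the day
# that contains the epoch time $end, oldest first.
sub week_of_logs {
    my ($dir, $end, $days) = @_;
    my @t    = localtime $end;
    # Anchor at local noon so a one-hour DST shift cannot change the date.
    my $noon = mktime(0, 0, 12, @t[3, 4, 5]);
    my @names;
    for my $back (reverse 0 .. $days - 1) {
        my $stamp = strftime('%Y%m%d', localtime($noon - $back * 86_400));
        push @names, "$dir/log.$stamp";
    }
    return @names;
}

# Print the seven names for the week ending today.
print "$_\n" for week_of_logs('/home/ado/log', time, 7);
```

The resulting list can replace `@filenames` in the loop; filtering it with `grep { -e } ...` first would skip days for which no log arrived.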

score 1 · Accepted Answer

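I suggest using Time::Piece to find all the relevant file names and putting them into @ARGV, as if they had been entered as command-line arguments. You can then read them all with <>.

Like this: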

    use strict;
    use warnings;

    use Time::Piece;
    use Time::Seconds 'ONE_DAY';

    my $today = localtime;

    # Keep only the logs whose date suffix is less than seven days old.
    @ARGV = grep {
      /\.(\d{8})$/ and
          $today - Time::Piece->strptime($1, '%Y%m%d') < ONE_DAY * 7;
    } glob '/home/ado/log/log.*';

    my %output;   # item id => count across all selected files
    while (<>) {
      ++$output{$1} if /item_id:(\d+)[\s,]*start/;
    }

    printf "%s -> %s\n", $_, $output{$_} for sort keys %output;
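Since the goal is a ranking, it may be more useful to order the output by count than by item id. A minimal sketch, assuming a hash shaped like `%output` above; the sample counts and the helper name `ranked` are made up for illustration:

```perl
use strict;
use warnings;

# Made-up sample counts standing in for %output.
my %output = (101 => 3, 205 => 7, 333 => 3);

# Hypothetical helper: item ids ordered by count, highest first;
# ties are broken by ascending item id.
sub ranked {
    my ($count) = @_;
    return sort { $count->{$b} <=> $count->{$a} || $a <=> $b } keys %$count;
}

printf "%s -> %s\n", $_, $output{$_} for ranked(\%output);
# prints:
# 205 -> 7
# 101 -> 3
# 333 -> 3
```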

score 0 · Accepted Answer

Using threads can help too!

    #!/usr/bin/perl

    use strict;
    use warnings;
    use threads;

    my ($fh1, $fh2, $fh3, $fh4, $fh5, $fh6, $fh7);
    my $thr1 = threads->new(\&sub1, "file1", $fh1);
    my $thr2 = threads->new(\&sub1, "file2", $fh2);
    my $thr3 = threads->new(\&sub1, "file3", $fh3);
    my $thr4 = threads->new(\&sub1, "file4", $fh4);
    my $thr5 = threads->new(\&sub1, "file5", $fh5);
    my $thr6 = threads->new(\&sub1, "file6", $fh6);
    my $thr7 = threads->new(\&sub1, "file7", $fh7);

    $thr1->join();
    $thr2->join();
    $thr3->join();
    $thr4->join();
    $thr5->join();
    $thr6->join();
    $thr7->join();

    sub sub1 {
        my ($file, $fh) = @_;

        my %output;
        open $fh, "<", $file or die("$!: $file");
        while (<$fh>) {
            if (/item_id:(\d+)\s*,\s*start/) {
                $output{$1}++;
            }
        }
        close $fh;
        for my $item (keys %output) {
            print "$item->$output{$item}\n";
        }
    }

score 0 · Accepted Answer

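Write a program that takes a set of input files as arguments and writes to standard output.

Call that program with the 7 daily input files as arguments and redirect its standard output to your weekly summary.

    summarize_files file1 file2 file3 file4 file5 file6 file7 > weekly.summary

You can use the same program with a single daily input file and redirect its standard output to your daily summary.

    summarize_files file1 > daily.summary

You could also have the program generate the input file names itself from two offsets, in days, relative to today's date:

    summarize_files -7 -1 > weekly.$(date +%Y%m%d)
    summarize_files -1 -1 > daily.$(date +%Y%m%d)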
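One way such a `summarize_files` program might look is sketched below. This is not the answerer's implementation: the helper names `resolve_files` and `count_items`, the rule that exactly two integer arguments are read as day offsets, and the log directory (taken from the question) are all assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime mktime);

my $LOG_DIR = '/home/ado/log';   # assumed location, from the question

# Hypothetical helper: exactly two integers (e.g. "-7 -1") are read as day
# offsets from the day containing $now; anything else is a list of file names.
sub resolve_files {
    my ($now, @args) = @_;
    my $offsets = @args == 2 && !grep { !/^-?\d+\z/ } @args;
    return @args unless $offsets;

    my ($from, $to) = map { $_ + 0 } @args;
    my @t    = localtime $now;
    my $noon = mktime(0, 0, 12, @t[3, 4, 5]);   # noon avoids DST date slips
    return map {
        "$LOG_DIR/log." . strftime('%Y%m%d', localtime($noon + $_ * 86_400))
    } $from .. $to;
}

# Hypothetical helper: count matching lines per item id across all files.
sub count_items {
    my (@files) = @_;
    my %count;
    for my $filename (@files) {
        open my $fh, '<', $filename or die "$!: $filename";
        while (<$fh>) {
            $count{$1}++ if /item_id:(\d+)\s*,\s*start/;
        }
        close $fh;
    }
    return \%count;
}

# Only do work when arguments were given; print highest counts first.
if (@ARGV) {
    my $count = count_items(resolve_files(time, @ARGV));
    print "$_ -> $count->{$_}\n"
        for sort { $count->{$b} <=> $count->{$a} || $a <=> $b } keys %$count;
}
```

Because the same script serves both the daily and the weekly run, the two cron entries differ only in their arguments and output file.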
