perl - Perl script: parsing and writing multiple text files

Question

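Suppose I have a directory full of text files (raw text). What I need is a Perl script that parses the text files in that directory one by one (top to bottom) and saves their contents in a single new file that I specify. In other words, I just want to build a corpus out of many documents. Note: the documents must be separated by some kind of tag, for example one indicating the order in which they were parsed.

So far I have managed to follow a few examples, and I know how to read, write, and parse text files. But I have not yet been able to combine that into one script that handles many text files. Could you offer some help? Thanks.

Edit: sample code for writing to a file.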

#!/usr/local/bin/perl
 open (MYFILE, '>>data.txt');
 print MYFILE "text\n";
 close (MYFILE);

Sample code for reading a file.

#!/usr/local/bin/perl
 open (MYFILE, 'data.txt');
 while (<MYFILE>) {
    chomp;
    print "$_\n";
 }
 close (MYFILE);

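I also found the foreach loop, which can be used for this kind of task, but I still don't know how to combine these pieces to get the result described above.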

score 0 · Accepted Answer

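The key points of this suggestion are:

the "magic" diamond operator <> (a.k.a. readline), which reads from each file in *ARGV,
the eof function, which tells whether the next readline on the current filehandle will return any data,
the $ARGV variable, which contains the name of the currently opened file.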
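To see the three pieces working together in isolation, here is a minimal sketch. The helper name numbered_lines is made up for illustration and is not part of the suggestion itself. It labels every line with its file name and its line number within that file, using the close ARGV if eof idiom so that $. restarts at 1 for each file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return every line of the given files, prefixed
# with "filename:line-number:".
sub numbered_lines {
    local @ARGV = @_;              # the list of files that <> will walk through
    my @out;
    while (my $line = <>) {        # <> opens each file in @ARGV in turn
        push @out, "$ARGV:$.:$line";   # $ARGV is the file currently being read
        close ARGV if eof;         # at the end of each file, reset $. to 0
    }
    return @out;
}

# Only run when file names were given, so <> never falls back to STDIN.
my @files = @ARGV;
print numbered_lines(@files) if @files;
```

Without the close ARGV line, $. would keep counting across all files instead of restarting for each one.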

With that introduction, here we go!

#!/usr/bin/perl

use strict; # Always!
use warnings; # Always!

my $header = 1; # Flag to tell us to print the header
while (<>) { # read a line from a file
    if ($header) {
        # This is the first line, print the name of the file
        print "========= $ARGV ========\n";
        # reset the flag to a false value
        $header = undef;
    }
    # Print out what we just read in
    print;
}
continue { # This happens before the next iteration of the loop
    # Check if we finished the previous file
    $header = 1 if eof;
}

To use it, just run: perl concat.pl *.txt > compiled.TXT
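If you would rather pass the directory and the output file to the script itself, and number the documents in the order they are processed, something along these lines might work. This is only a sketch: the sub name build_corpus, the <doc id="..."> tag format, and the choice to pick up only files ending in .txt (sorted by name) are all assumptions of mine, not something the question specifies.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy every *.txt file in $dir into $out_path,
# wrapping each one in a numbered tag. Returns the number of documents.
sub build_corpus {
    my ($dir, $out_path) = @_;

    # Collect the plain files ending in .txt, sorted by name so the
    # processing order is predictable.
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    my @names = sort grep { /\.txt\z/ && -f "$dir/$_" } readdir($dh);
    closedir($dh);

    open(my $out, '>', $out_path) or die "Cannot write $out_path: $!";

    my $n = 0;
    for my $name (@names) {
        open(my $in, '<', "$dir/$name") or die "Cannot read $dir/$name: $!";
        $n++;
        # Assumed tag format: sequence number plus the original file name.
        print {$out} "<doc id=\"$n\" name=\"$name\">\n";
        my $last = "\n";
        while (my $line = <$in>) {
            print {$out} $line;
            $last = $line;
        }
        # Make sure the closing tag starts on its own line.
        print {$out} "\n" unless $last =~ /\n\z/;
        print {$out} "</doc>\n";
        close($in);
    }

    close($out) or die "Cannot close $out_path: $!";
    return $n;
}

# Run only when invoked as: perl corpus.pl <directory> <output file>
if (!caller && @ARGV == 2) {
    my $count = build_corpus(@ARGV);
    print "Wrote $count documents\n";
}
```

Saved as, say, corpus.pl (a hypothetical name), it could be run as: perl corpus.pl docs corpus.out. It is safest to write the output outside the input directory or with an extension other than .txt, so that a second run does not pick up the previous output as an input document.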
