perl - Most frequently used strings in a file

Question

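I found a post here where someone managed to read a file, work out its most frequently used words, and print how many times each word occurs. In that post the input came from a command-line argument, but I want to run the same script and then type the file name as input while the script is running. I can't find what I'm doing wrong.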

print "Type the name of the file: ";
chomp(my $file = <>);

open (FILE, "$file") or die;

while (<FILE>){
    $seen{$_}++ for split /\W+/;
}

my $count = 0;
for (sort {
    $seen{$b} <=> $seen{$a}
              ||
       lc($a) cmp lc($b)
              ||
          $a  cmp  $b
} keys %seen)
{
    next unless /\w/;
    printf "%-20s %5d\n", $seen{$_}, $_;
    last if ++$count > 100;
}
close (FILE);

My current output is:

15                       0
15                       0
10                       0
10                       0
10                       0
5                        1
5                        0
5                        0
5                        0
5                        0

The output I want is:

<word>             <number of occurrences>
<word>             <number of occurrences>
<word>             <number of occurrences>
<word>             <number of occurrences>
<word>             <number of occurrences>
<word>             <number of occurrences>

score 2 · Accepted Answer

The line

printf "%-20s %5d\n", $seen{$_}, $_;

does the opposite of what you intend. $_ is a string (the word), and $seen{$_} is the number of times $_ occurs in the text, so you want to say either

printf "%-20s %5d\n", $_, $seen{$_};

or

printf "%5d %-20s\n", $seen{$_}, $_;
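
As a minimal sketch of the point above, the following shows each argument landing in the format slot that matches its type. The sample counts in %seen and the helper name format_line are made up for illustration and are not part of the original script. With the arguments swapped, %-20s receives the count and %5d receives the word, and a non-numeric word evaluates to 0 in numeric context, which would explain the columns of zeros in the question.

```perl
use strict;
use warnings;

# Hypothetical sample counts, standing in for the %seen hash built from a file.
my %seen = (the => 15, perl => 10, file => 5);

# Format one report line: %-20s takes the word, %5d takes the count.
sub format_line {
    my ($word, $count) = @_;
    return sprintf "%-20s %5d\n", $word, $count;
}

# Highest count first, ties broken alphabetically.
for my $word (sort { $seen{$b} <=> $seen{$a} || $a cmp $b } keys %seen) {
    print format_line($word, $seen{$word});
}
```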

score 0 · Accepted Answer

In the second line, you want to put the name of the file to open into $file, not $seen. So:

chomp(my $file = <>);

chomp removes the trailing newline (left there by pressing Enter).

score 0 · Accepted Answer

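Two things:

You are reading the file name typed by the user into the variable $seen instead of $file.
You need to chomp the input you receive to get rid of the trailing newline: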
```
my $file= <>;
chomp($file);
```
or, in shorthand form:
```
chomp(my $file = <>);
```
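
To see why the chomp matters, here is a minimal sketch that reads a file name from a handle and then tries to open it. The helper read_filename, the in-memory handle standing in for typed input, and the name words.txt are all assumptions made for illustration; in the real script you would pass \*STDIN instead. The lexical filehandle and three-argument open with $! in the message are a stylistic choice, not something the answers above require.

```perl
use strict;
use warnings;

# Hypothetical helper: read one line from a handle and strip its newline.
sub read_filename {
    my ($in) = @_;
    my $name = <$in>;
    return unless defined $name;   # end of input, nothing was typed
    chomp $name;                   # "words.txt\n" becomes "words.txt"
    return $name;
}

# Simulate the user typing "words.txt" and pressing Enter.
my $typed = "words.txt\n";
open my $in, '<', \$typed or die "Cannot open in-memory handle: $!";
my $file = read_filename($in);
close $in;

print "File name is '$file'\n";

# Without chomp the name would still end in "\n", and open() would look
# for a file whose name literally contains a newline.
if (open my $fh, '<', $file) {
    print "Opened $file\n";
    close $fh;
} else {
    print "Cannot open '$file': $!\n";
}
```

Reporting $! in the failure message says why the open failed, which the bare die in the question does not.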
