perl - Parsing a CSV file and hashing

Question

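I am trying to parse a CSV file to read in all the zip codes. I am trying to create a hash where each key is a zip code and the value is the number of times it appears in the file. Then I want to print out the contents as Zip Code - Number. Here is the Perl script I have so far.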

use strict;
use warnings;

my %hash = qw (
     zipcode count
);

my $file = $ARGV[0] or die "Need CSV file on command line \n";

open(my $data, '<', $file) or die "Could not open '$file $!\n";
while (my $line = <$data>) {
   chomp $line;
   my @fields = split "," , $line;
   if (exists($hash{$fields[2]})) {
        $hash{$fields[1]}++;
   }else {
        $hash{$fields[1]} = 1;
   }
}

my $key;
my $value;
while (($key, $value) = each(%hash)) {
  print "$key - $value\n";
}

exit;

score 5 · Accepted Answer

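You don't say which column your zip code is in, but you are using the third field to check for an existing hash element and then the second field to increment it.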

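There is no need to check whether a hash element already exists: Perl will happily create a hash element that does not exist and increment it to 1 the first time you access it.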
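As a minimal sketch of that behavior (the zip codes below are made-up sample data, not taken from your file), counting works without any exists() check:

```perl
use strict;
use warnings;

# Hypothetical sample data; these zip codes are invented for illustration.
my @zips = qw(10001 90210 10001 60601 10001);

my %counts;
# No exists() check: the first ++ on a missing key creates the element
# (undef counts as 0 for ++, and no warning is issued) and sets it to 1.
++$counts{$_} for @zips;

# Sorted only so the output order is predictable.
print "$_ - $counts{$_}\n" for sort keys %counts;
```

Here 10001 ends up with a count of 3 and the other two zip codes with a count of 1 each.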

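There is also no need to explicitly open any files passed as command-line parameters: Perl will open and read them if you use the <> operator without a file handle.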
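A small self-contained sketch of this, assuming a throwaway file with invented rows: it writes a temporary CSV, puts its path into @ARGV as if it had been given on the command line, and lets <> do the opening and reading.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small temporary CSV so the sketch runs on its own.
# The rows are invented sample data.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "Alice,10001\nBob,90210\nCarol,10001\n";
close $fh or die "Could not close '$path': $!\n";

# Pretend the file was named on the command line.
@ARGV = ($path);

my @lines;
# <> opens each file listed in @ARGV in turn and reads it line by line.
while (my $line = <>) {
    chomp $line;
    push @lines, $line;
}

print scalar(@lines), " lines read\n";
```

In a real script you would drop the temporary-file setup and simply run the program with the CSV file name as its argument.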

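This modification of your own program should work. It assumes the zip code is in the second column of the CSV. If it is anywhere else, just change ++$counts{$fields[1]} appropriately.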

use strict;
use warnings;

@ARGV or die "Need CSV file on command line \n";

my %counts;

while (my $line = <>) {
   chomp $line;
   my @fields = split /,/, $line;
   ++$counts{$fields[1]};
}

while (my ($key, $value) = each %counts) {
  print "$key - $value\n";
}
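Note that each returns the pairs in no particular order. If you want predictable output, one option is to sort the keys first. The sketch below uses a hypothetical hand-filled %counts in place of the hash built above, and sorts by count (highest first), breaking ties by zip code:

```perl
use strict;
use warnings;

# Hypothetical counts standing in for the %counts hash built above.
my %counts = ('90210' => 1, '10001' => 3, '60601' => 2);

# Sort by count, highest first; break ties by zip code as a string.
my @by_count = sort { $counts{$b} <=> $counts{$a} or $a cmp $b } keys %counts;

print "$_ - $counts{$_}\n" for @by_count;
```

To list by zip code instead, sort keys %counts is enough.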

score 2 · Accepted Answer

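Sorry if this is off-topic, but if you're on a system with the standard Unix text-processing tools, you can count the occurrences of each value in field #2 with this command, without writing any code.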

cut -d, -f2 filename.csv | sort | uniq -c

This produces one line per distinct value, with the count listed first, followed by the zip code.
