2

I have a CSV file in this format:

"Keyword"   "Competition"   "Global Monthly Searches"   "Local Monthly Searches (United States)"    "Approximate CPC (Search) - INR"

"kasperaky support" -0  -0  -0  -0

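The first line holds the column headings.

I have tried most of the options in Text::CSV, but I can't extract the fields.

Here I'm using sep_char => ' '.

The closest I have come is getting the first word of the first column (only "kasperaky").

I'm creating the object like this (while trying various settings):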

my $csv = Text::CSV->new({
    binary             => 1,
    sep_char           => ' ',
    allow_loose_quotes => 0,
    quote_space        => 0,
    quote_char         => '"',
    allow_whitespace   => 0,
    eol                => "\015\012",
}) or die "Cannot use CSV: " . Text::CSV->error_diag();
4 Answers

5

Your CSV is tab-separated. Use the following (the code is tested and works against your example file):

use strictures;
use autodie qw(:all);       # automatic error checking open/close
use charnames qw(:full);    # \N named characters
use Text::CSV qw();
my $csv = Text::CSV->new({
    auto_diag   => 2,       # automatic error checking CSV methods
    binary      => 1,
    eol         => "\N{CR}\N{LF}",
    sep_char    => "\N{TAB}",
}) or die 'Cannot use CSV: ' . Text::CSV->error_diag;

open my $fh, '<:encoding(ASCII)', 'computer crash.csv';
while (my $row = $csv->getline($fh)) {
    ...
}
close $fh;
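
If it's unclear whether the file really uses tabs (pasting into a web page often turns them into spaces), one quick check is to print a line with the whitespace made visible. This is only a minimal sketch in core Perl; the helper name show_whitespace and the sample line are my own assumptions, not taken from your file:

```perl
use strict;
use warnings;

# Hypothetical helper: make tabs, CRs and LFs visible so the real
# separator and line ending can be seen.
sub show_whitespace {
    my ($line) = @_;
    my %visible = ("\t" => '<TAB>', "\r" => '<CR>', "\n" => '<LF>');
    $line =~ s/([\t\r\n])/$visible{$1}/g;
    return $line;
}

# Assumed sample: a tab-separated row with a CRLF line ending.
my $sample = qq{"kasperaky support"\t-0\t-0\t-0\t-0\r\n};
my $shown  = show_whitespace($sample);
print "$shown\n";
```

Run it on the first line read from the real file. If <TAB> shows up between the fields, sep_char => "\t" is the setting to use.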
answered 2012-06-05T16:43:52.583
4

Calling that a CSV file is a bit of a stretch! Your separator isn't a space, it's a sequence of one or more spaces, and Text::CSV doesn't handle that. (Unfortunately, allow_whitespace doesn't work when your separator is a space.) You could use something like:

use List::MoreUtils qw( apply );
my @fields = apply { s/\\(.)/$1/sg } $line =~ /"((?:[^"\\]|\\.)*)"/sg;
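
Note that the pattern above only captures the quoted fields, so the bare -0 values in the data row would be skipped. Here is a rough sketch of a variant that also accepts unquoted tokens, in plain Perl with no extra modules. The sub name split_fields is made up for illustration, and it assumes backslash is the escape character, as in the snippet above:

```perl
use strict;
use warnings;

# Hypothetical helper: split a line into fields separated by runs of
# whitespace, where a field is either "double quoted" or a bare token.
sub split_fields {
    my ($line) = @_;
    my @fields;
    while ($line =~ /\G\s*(?:"((?:[^"\\]|\\.)*)"|(\S+))/gc) {
        my ($quoted, $bare) = ($1, $2);
        if (defined $quoted) {
            $quoted =~ s/\\(.)/$1/sg;    # drop the escaping backslashes
            push @fields, $quoted;
        }
        else {
            push @fields, $bare;
        }
    }
    return @fields;
}

my @fields = split_fields('"kasperaky support" -0  -0  -0  -0');
print join('|', @fields), "\n";
```

For the data row from the question this gives five fields, with "kasperaky support" kept together as the first one.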

Now, if those are tabs, that's a different story, and you could use sep_char => "\t".

answered 2012-06-05T16:46:01.647
1

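I always recommend using a parser, and Text::CSV is usually great, but it can be painful when you aren't dealing with real CSV. In that case you can try the core module Text::ParseWords.

Here is my example.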

#!/usr/bin/env perl

use strict;
use warnings;

use Text::ParseWords qw/parse_line/;

my @data;
while( my $line = <DATA> ) {
  chomp $line;
  my @words = parse_line( qr/\s+/, 0, $line );
  next unless @words;
  push @data, \@words;
}

use Data::Dumper;
print Dumper \@data;

__DATA__

"Keyword"   "Competition"   "Global Monthly Searches"   "Local Monthly Searches (United States)"    "Approximate CPC (Search) - INR"

"kasperaky support" -0  -0  -0  -0

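This implementation builds a two-dimensional array of data, skipping the empty lines. Of course, once you have parsed the tokens, you can build whatever data structure you want.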

$VAR1 = [
          [
            'Keyword',
            'Competition',
            'Global Monthly Searches',
            'Local Monthly Searches (United States)',
            'Approximate CPC (Search) - INR'
          ],
          [
            'kasperaky support',
            '-0',
            '-0',
            '-0',
            '-0'
          ]
        ];
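
As one example of building a different structure, here is a sketch that uses the header row as hash keys, giving one hash per data row. It relies on the same parse_line call; the variable names and the in-memory sample lines are my own choices, so adjust them to however you read the file:

```perl
use strict;
use warnings;
use Text::ParseWords qw/parse_line/;

# Sample lines copied from the question, held in memory for the demo.
my @lines = (
    '"Keyword"   "Competition"   "Global Monthly Searches"   "Local Monthly Searches (United States)"    "Approximate CPC (Search) - INR"',
    '',
    '"kasperaky support" -0  -0  -0  -0',
);

my (@header, @records);
for my $line (@lines) {
    my @words = parse_line(qr/\s+/, 0, $line);
    next unless @words;          # skip blank lines
    if (!@header) {              # first non-blank line is the header
        @header = @words;
        next;
    }
    my %record;
    @record{@header} = @words;   # hash slice: column name => value
    push @records, \%record;
}

printf "%s has competition %s\n",
    $records[0]{'Keyword'}, $records[0]{'Competition'};
```

Each element of @records can then be addressed by column name instead of by position.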
answered 2012-06-09T16:18:08.823
0

This worked for me on a file whose fields are separated by one or more spaces. This is a case where Text::CSV can't do the job...

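# $filename is assumed to be set earlier in the script
open(my $data, '<:encoding(UTF-8)', $filename)
    or die "Cannot open $filename: $!";

while (my $line = <$data>) {
    chomp $line;
    # split ' ' splits on runs of whitespace and ignores leading whitespace
    my @fields = split(' ', $line);
    print "$line : $fields[0] --- $fields[1] ----- $fields[2]\n";
}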
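
One caveat: split(' ', $line) knows nothing about quotes, so it only suits files where no field contains a space. A small sketch of what it does to the data row from the question (the sample string is typed in by hand here):

```perl
use strict;
use warnings;

my $line   = qq{"kasperaky support" -0  -0  -0  -0\n};
my @fields = split ' ', $line;

# The quoted keyword is cut in two, so there are 6 fields rather than 5.
printf "%d fields, first two: %s | %s\n", scalar @fields, @fields[0, 1];
```

If your fields can contain spaces inside quotes, a quote-aware splitter such as the core Text::ParseWords module is the safer choice.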
answered 2016-04-20T16:08:13.187