What is going on here? I wrote a simple program that reads lines and prints output to a file, but it raises some errors...

Here is the code; the explanation is in the comments:

use warnings;
use List::MoreUtils qw(indexes);

my @array_words = ();
my @array_split = ();
my @array_of_zeros = (0);
my $index = 0;

open my $info, 'models/busquedas.csv';
open my $model, '>>models/model.txt';

#First while is to count the words and store it into an array
while( my $line = <$info>)  {
    @array_split = regex($line);
    for (my $i=0; $i < scalar(@array_split); $i++) {
            # Get the index if the word is repeated
        $index = indexes { $_ eq $array_split[$i] } $array_words[$i];
            # if the word is not repeated then save it to the array by 
            # checking the index
        if ($index != -1){ push(@array_words, $array_split[$i]); }
    }
}

print $model @array_words;

sub regex{
    # get only basic info like: 'texto judicial madrid' instead of the full url
    if ($_[0] =~ m/textolibre=/ and 
        $. < 3521239 && 
        $_[0] =~ m/textolibre=(.*?)&translated/) {
        return split(/\+/, $_[0]);
    }
}

The errors I don't understand are:

Use of uninitialized value $index in numeric ne (!=) at classifier.pl line 21, <$info> line 12216.
Use of uninitialized value $index in numeric ne (!=) at classifier.pl line 21, <$info> line 12216.
Use of uninitialized value $index in numeric ne (!=) at classifier.pl line 21, <$info> line 12216.
Use of uninitialized value $index in numeric ne (!=) at classifier.pl line 21, <$info> line 12217.
Use of uninitialized value $index in numeric ne (!=) at classifier.pl line 21, <$info> line 12217.
Use of uninitialized value $index in numeric ne (!=) at classifier.pl line 21, <$info> line 12217.
Use of uninitialized value $index in numeric ne (!=) at classifier.pl line 21, <$info> line 12217.
Use of uninitialized value $index in numeric ne (!=) at classifier.pl line 21, <$info> line 12217.
Use of uninitialized value $index in numeric ne (!=) at classifier.pl line 21, <$info> line 12218.
Use of uninitialized value $index in numeric ne (!=) at classifier.pl line 21, <$info> line 12218.

Why is $index uninitialized? I declared it and initialized it to 0! How can I fix this?

2 Answers

You have initialized the variable with zero, but then you change its value with

$index = indexes { $_ eq $array_split[$i] } $array_words[$i];

The function probably returns undef here (because $array_words[$i] is not eq to $array_split[$i]). Otherwise it would return one, as there is only one element in the list.

BTW, initializing a variable outside a loop is bad practice if you do not need its value outside the loop. You can declare my $index on the same line where you populate it with indexes.

answered 2013-08-22T10:45:50.963

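As has already been observed, the indexes subroutine doesn't work like that. It returns the list of indices for which the block evaluates to true. Using it in scalar context like this is wrong.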
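
To make the list-context point concrete, here is a minimal sketch. It uses a core-only stand-in for indexes (a grep over the index range) so that it runs without List::MoreUtils; the helper name matching_indexes and the sample words are assumptions for illustration, not part of the original program.

```perl
use strict;
use warnings;

# Core-only stand-in for List::MoreUtils::indexes (hypothetical helper):
# returns every index whose element equals the target.
sub matching_indexes {
    my ($target, @list) = @_;
    return grep { $list[$_] eq $target } 0 .. $#list;
}

my @words = qw(texto judicial madrid texto);

# List context: assign the result to an array, then test whether it is empty.
my @found   = matching_indexes('texto',   @words);
my @missing = matching_indexes('sevilla', @words);

print "texto found at: @found\n";
print "sevilla is new\n" unless @missing;
```

Testing the array for emptiness avoids comparing against -1, a value this kind of function never returns.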

If you were to use a library for this, you would want any, also from List::MoreUtils. The code would look like this:

while( my $line = <$info>)  {
    @array_split = regex($line);
    for my $word (@array_split) {
      push @array_words, $word unless any { $_ eq $word } @array_words;
    }
}

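But I think you want something simpler. From my understanding of your code, a Perl hash will do what you need.

I have refactored your program as shown below. I hope it helps.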

Essentially, each "word" in the line is pushed onto @array_words if it is not already in the hash.
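
The hash-based check can be tried in isolation with a short sketch like the following. The helper name unique_in_order and the sample words are assumptions for illustration only.

```perl
use strict;
use warnings;

# Keep the first occurrence of each word, preserving input order.
# $seen{$word}++ is false (0) the first time a word appears and true afterwards.
sub unique_in_order {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

my @words = unique_in_order(qw(texto judicial madrid texto judicial));
print "@words\n";   # prints "texto judicial madrid"
```

A hash lookup takes roughly constant time, whereas scanning @array_words for every word gets slower as the array grows, which matters for an input of millions of lines.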

You also appear to have a bug in your regex subroutine. The statement

return split(/\+/, $_[0]);

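splits the whole line and returns the result. I think it should split only the query portion of the URL that you have just extracted, like this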

return split /\+/, $1;
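
The difference between the two splits is easiest to see on a sample line. The URL below is hypothetical, since the real contents of busquedas.csv are not shown in the question, and query_words is an illustrative name.

```perl
use strict;
use warnings;

# Return only the search terms from a line, or an empty list if there are none.
sub query_words {
    my ($line) = @_;
    return unless $line =~ /textolibre=(.*?)&translated/;
    return split /\+/, $1;
}

# Hypothetical input line, assumed for illustration.
my $line = 'http://example.com/search?textolibre=texto+judicial+madrid&translated=1';

my @whole_line = split /\+/, $line;      # first and last pieces still carry URL text
my @query_only = query_words($line);     # just the search terms

print "whole line: $_\n" for @whole_line;
print "query only: $_\n" for @query_only;
```

Splitting the whole line leaves 'http://example.com/search?textolibre=texto' and 'madrid&translated=1' in the result, while splitting the capture yields only texto, judicial and madrid.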

In general, you should check whether an open call succeeded. Adding the autodie pragma does this for you implicitly.

use strict;
use warnings;
use autodie;

open my $info,  '<',  'models/busquedas.csv';
open my $model, '>>', 'models/model.txt';

my %unique_words;
my @array_words;

#First while is to count the words and store it into an array
while( my $line = <$info>)  {
  for my $word (regex($line)) {
    push @array_words, $word unless $unique_words{$word}++;
  }
}

print $model "$_\n" for @array_words;

sub regex {

  my ($line) = @_;

  # get only basic info like: 'texto judicial madrid' instead of the full url
  return unless $line =~ /textolibre=/ and $. < 3521239;
  if ( $line =~ /textolibre=(.*?)&translated/ ) {
    return split /\+/, $1;
  }
}
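
For comparison, this is roughly what checking open by hand looks like without autodie. It is a sketch only: the helper name open_or_explain and the path are assumptions chosen to trigger a failure.

```perl
use strict;
use warnings;

# Try to open a file for reading. Returns (filehandle, undef) on success
# or (undef, error message) on failure; $! holds the system's reason.
sub open_or_explain {
    my ($path) = @_;
    open my $fh, '<', $path
        or return (undef, "Cannot open '$path': $!");
    return ($fh, undef);
}

# Hypothetical path, assumed not to exist.
my ($fh, $error) = open_or_explain('models/does-not-exist.csv');

if (defined $error) {
    warn "$error\n";
}
else {
    print "opened successfully\n";
    close $fh;
}
```

With autodie in effect, a failed open throws an exception carrying a similar message, so the explicit check is unnecessary.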
answered 2013-08-22T11:24:38.220