
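I'm fairly new to Perl and I've run into some difficulty with this project. The goal is to compare two CSV files. One contains $name, $model, $version and the other contains $name2, $disk, $storage. The result file should contain the rows that match, with the information combined like this: $name, $model, $version, $disk, $storage.

I've managed to do this, but my problem is that the program breaks when one of the elements is missing. When it reaches a line in the file that is missing an element, it stops at that line. How can I fix this? Any suggestions or methods for making it skip that line and carry on?

Here is my code: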

open( TESTING, '>testing.csv' ); # Names will be printed to this during testing. Only names ending in .net should appear
open( MISSING, '>Missing.csv' ); # Lines with missing name fields will appear here.

#open (FILE,'C:\Users\hp-laptop\Desktop\file.txt');
#my (@array) =<FILE>;
my @hostname;    #stores names

#close FILE;
#***** TESTING TO SEE IF ANY OF THE LISTED ITEMS BEGIN WITH A COMMA AND DO NOT HAVE A NAME.
#***** THESE OBJECTS ARE PLACED INTO THE MISSING ARRAY AND THEN PRINTED OUT IN A SEPARATE
#***** FILE.
#open (FILE,'C:\Users\hp-laptop\Desktop\file.txt');
#test
if ( open( FILE, "file.txt" ) ) {

}
else {
  die " Cannot open file 1!\n:$!";

}

$count = 0;
$x     = 0;
while (<FILE>) {

  ( $name, $model, $version ) = split(",");    #parsing

  #print $name;
  chomp( $name, $model, $version );

  if ( ( $name =~ /^\s*$/ )
      && ( $model   =~ /^\s*$/ )
      && ( $version =~ /^\s*$/ ) )    #if all of the fields  are blank ( just a blank space)
  {

    #do nothing at all
  }
  elsif ( $name =~ /^\s*$/ ) {   #if name is a blank
    $name =~ s/^\s*/missing/g;
    print MISSING "$name,$model,$version\n";

    #$hostname[$count]=$name;
    #$count++;
  }
  elsif ( $model =~ /^\s*$/ ) {   #if model is blank
    $model =~ s/^\s*/missing/g;
    print MISSING"$name,$model,$version\n";
  }
  elsif ( $version =~ /^\s*$/ ) {   #if version is blank
    $version =~ s/^\s*/missing/g;
    print MISSING "$name,$model,$version\n";
  }

  # Searches for .net to appear in field "$name" if match, it places it into hostname array.
  if ( $name =~ /.net/ ) {

    $hostname[$count] = $name;
    $count++;
  }

#searches for a comma in the name field, puts that into an array and prints the line into the missing file.
#probably won't have to use this, as I've found a better method to test all of the fields ( $name,$model,$version)
#and put those into the missing file. Hopefully it works.
#foreach $line (@array)
#{
#if($line =~ /^\,+/)
#{
#$line =~s/^\,*/missing,/g;
#$missing[$x]=$line;
#$x++;
#}
#}

}
close FILE;

for my $hostname (@hostname) {
  print TESTING $hostname . "\n";
}

#for my $missing(@missing)
#{
# print MISSING $missing;
#}
if ( open( FILE2, "file2.txt" ) ) {    #Run this if the open succeeds

  #open outfile and print starting header
  open( RESULT, '>resultfile.csv' );
  print RESULT ("name,Model,version,Disk, storage\n");
}
else {
  die " Cannot open file 2!\n:$!";
}
$count = 0;
while ( $hostname[$count] ne "" ) {
  while (<FILE>) {
    ( $name, $model, $version ) = split(",");    #parsing

    #print $name,"\n";

    if ( $name eq $hostname[$count] )    # I think this is the problem area.
    {
      print $name, "\n", $hostname[$count], "\n";

      #print RESULT"$name,$model,$version,";
      #open (FILE2,'C:\Users\hp-laptop\Desktop\file2.txt');
      #test
      if ( open( FILE2, "file2.txt" ) ) {

      }
      else {
        die " Cannot open file 2!\n:$!";

      }

      while (<FILE2>) {
        chomp;
        ( $name2, $Dcount, $vname ) = split(",");    #parsing

        if ( $name eq $name2 ) {
          chomp($version);
          print RESULT"$name,$model,$version,$Dcount,$vname\n";

        }

      }

    }

    $count++;
  }

  #open (FILE,'C:\Users\hp-laptop\Desktop\file.txt');
  #test
  if ( open( FILE, "file.txt" ) ) {

  }
  else {
    die " Cannot open file 1!\n:$!";

  }

}

close FILE;
close RESULT;
close FILE2;

2 Answers


I think you want next, which lets you finish the current iteration immediately and start the next one:

while (<FILE>) {
  chomp;    # drop the newline so a missing last field is not read as "\n"
  ( $name, $model, $version ) = split(",");
  next unless( $name && $model && $version );
  ...;
}

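The condition you use depends on which values you will accept. In my example, I assume that all of the values need to be true. If they only need to be something other than the empty string, perhaps you can check the length instead: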

while (<FILE>) {
  chomp;    # drop the newline so a missing last field is not read as "\n"
  ( $name, $model, $version ) = split(",");
  next unless( length($name) && length($model) && length($version) );
  ...;
}
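
The difference between the two tests matters for a field such as 0, which Perl treats as false even though it is not empty. A minimal sketch of that difference, where has_value is a hypothetical helper name that does not appear in the question:

```perl
use strict;
use warnings;

# Hypothetical helper: true when a field is defined and is not the empty string.
sub has_value {
    my ($field) = @_;
    return ( defined($field) && length($field) ) ? 1 : 0;
}

# Compare a plain truth test with the length-based test for a few sample fields.
for my $field ( 'host.net', '0', '' ) {
    my $truthy = $field ? 'true' : 'false';
    my $filled = has_value($field) ? 'true' : 'false';
    printf "%-10s truth test: %-5s length test: %s\n", "'$field'", $truthy, $filled;
}
```

If a version or disk count of 0 is a legitimate value in your data, the length-based test is probably the one you want.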

If you know how to validate each field, you might have subroutines for those:

while (<FILE>) {
  chomp;    # drop the newline so a missing last field is not read as "\n"
  ( $name, $model, $version ) = split(",");
  next unless( length($name) && is_valid_model($model) && length($version) );
  ...;
}

sub is_valid_model { ... }

Now you just need to decide how to integrate that into what you are already doing.
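
As one possible way of fitting next into the existing loop, here is a rough sketch that records incomplete lines and then moves on. The parse_line helper and the sample lines are made up for illustration, and the arrays stand in for the MISSING and TESTING files:

```perl
use strict;
use warnings;

# Hypothetical helper: split one line into fields and report whether it has
# exactly three fields, none of them empty. The -1 limit tells split to keep
# trailing empty fields instead of dropping them.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    my @fields   = split /,/, $line, -1;
    my $empty    = grep { !length } @fields;
    my $complete = ( @fields == 3 && $empty == 0 ) ? 1 : 0;
    return ( $complete, @fields );
}

# Sample data made up for illustration.
my @lines = ( "a.net,m1,1.0\n", ",m2,2.0\n", "c.net,m3,\n", "d.net,m4,4.0\n" );

my ( @good, @missing );
for my $line (@lines) {
    my ( $complete, @fields ) = parse_line($line);
    unless ($complete) {
        push @missing, join ',', @fields;    # this would go to Missing.csv
        next;                                # carry on with the next line
    }
    push @good, $fields[0];
}

print "kept: @good\n";
print "skipped: ", scalar @missing, "\n";
```

The point is only that the incomplete line is dealt with and then skipped, so the loop never stops early.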

Answered 2012-06-17T20:07:34.340

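You should start by adding use strict and use warnings at the top of your program, and by declaring every variable with my at its first point of use. This will reveal many simple mistakes that would otherwise be hard to find.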

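You should also use the three-argument form of open with lexical filehandles. The Perl idiom for checking for errors when opening a file is to add or die to the open call. An if statement with an empty block for the success path wastes space and is hard to read. An open call should look like this: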

open my $fh, '>', 'myfile' or die "Unable to open file: $!";

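Finally, it is much safer to use a Perl module when handling CSV files, as there are many pitfalls in using a simple split /,/. The Text::CSV module has already done all the work for you and is available on CPAN.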
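
To make two of those pitfalls concrete, here is a small sketch using only split on made-up sample lines. It does not use Text::CSV, so it runs without installing anything:

```perl
use strict;
use warnings;

# Sample lines made up for illustration.
my $quoted   = '"Smith, John",x1,2.0';    # one quoted field contains a comma
my $trailing = 'host.net,,';              # the last two fields are empty

# A plain split cuts the quoted field in two.
my @a = split /,/, $quoted;
print scalar(@a), " fields from the quoted line\n";

# Without a negative limit, split silently drops trailing empty fields.
my @b = split /,/, $trailing;
my @c = split /,/, $trailing, -1;
print scalar(@b), " field(s) by default, ", scalar(@c), " with a limit of -1\n";
```

A CSV parser treats the quoted field as a single value and keeps the empty fields, which is why it is the safer choice here.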

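Your problem is that, after reading to the end of the first file, you don't rewind or reopen it before reading from the same handle again in the second, nested loop. That means no more data is read from that file, and the program behaves as if it were empty.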
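
The effect can be seen in a small sketch. An in-memory filehandle with made-up contents stands in for file.txt, and seek is used here as one way of rewinding:

```perl
use strict;
use warnings;

# An in-memory filehandle stands in for file.txt; the contents are made up.
my $contents = "a.net,m1,1.0\nb.net,m2,2.0\n";
open my $fh, '<', \$contents or die "Cannot open in-memory file: $!";

my @first  = <$fh>;    # reads both lines and leaves the handle at end of file
my @second = <$fh>;    # nothing is left to read

seek $fh, 0, 0 or die "Cannot rewind: $!";    # move back to the start
my @third = <$fh>;     # both lines are available again

printf "first: %d, second: %d, after seek: %d\n",
    scalar @first, scalar @second, scalar @third;
```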

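Reading the same file hundreds of times just to pair up the corresponding records is a poor strategy. If the files are a reasonable size, you should build a data structure in memory to hold the information. A Perl hash is ideal, because it lets you look up the data that corresponds to a given name immediately.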
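
In outline, the idea looks something like this. The records are made-up samples held in arrays rather than read from files, and %by_name is just an illustrative name:

```perl
use strict;
use warnings;

# Made-up sample records standing in for the two files.
my @file1 = ( [ 'a.net', 'm1', '1.0' ], [ 'b.net', 'm2', '2.0' ] );
my @file2 = ( [ 'b.net', '4', '500GB' ], [ 'z.net', '2', '1TB' ] );

# Index the first file by name in a single pass.
my %by_name;
$by_name{ $_->[0] } = $_ for @file1;

# Each record of the second file now needs only one hash lookup.
my @merged;
for my $row (@file2) {
    my ( $name, @rest ) = @$row;
    my $match = $by_name{$name} or next;    # no partner in the first file
    push @merged, join ',', @$match, @rest;
}

print "$_\n" for @merged;
```

Each file is read once, however many records it contains.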

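I have written a revised version of your code to demonstrate these points. It would be awkward for me to test it as I have no sample data, but please let us know if you still have problems.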

use strict;
use warnings;

use Text::CSV;

# Use one object for parsing and one for writing. Setting eol on the writer
# makes print end every record with a newline.
my $csv = Text::CSV->new({ binary => 1 })
    or die 'Cannot create CSV parser: ' . Text::CSV->error_diag;
my $out = Text::CSV->new({ binary => 1, eol => "\n" })
    or die 'Cannot create CSV writer: ' . Text::CSV->error_diag;

my %data;

# Read the name, model and version from the first file. Write any records
# that don't have the full three fields to the "MISSING" file
#
open my $f1, '<', 'file.txt' or die qq(Cannot open file 1: $!);

open my $missing, '>', 'Missing.csv'
    or die qq(Unable to open "MISSING" file for output: $!);
    # Lines with missing fields will appear here.

while ( my $line = $csv->getline($f1) ) {

  my $name = $line->[0];

  # Count the non-empty fields. The parentheses around grep are needed so
  # that the comparison applies to the count and not to the array.
  if ( ( grep { defined($_) && length($_) } @$line ) < 3 ) {
    $out->print($missing, $line);
  }
  else {
    $data{$name} = $line if $name =~ /\.net$/i;
  }
}

close $missing;

# Put a list of .net names found into the testing file
#
open my $testing, '>', 'testing.csv'
    or die qq(Unable to open "TESTING" file for output: $!);
    # Names will be printed to this during testing. Only ".net" ending names should appear

print $testing "$_\n" for sort keys %data;

close $testing;

# Read the name, disk and storage from the second file and check that the line
# contains all three fields. Remove the name field from the start and append
# to the data record with the matching name if it exists.
#
open my $f2, '<', 'file2.txt' or die qq(Cannot open file 2: $!);

while ( my $line = $csv->getline($f2) ) {

  next unless ( grep { defined($_) && length($_) } @$line ) >= 3;

  my $name = shift @$line;
  next unless $name =~ /\.net$/i;

  my $record = $data{$name};
  push @$record, @$line if $record;
}

# Print the completed hash. Send each record to the result output if it
# has the required five fields
#
open my $result, '>', 'resultfile.csv' or die qq(Cannot open results file: $!);

# print expects an array reference, not a list
$out->print($result, [ qw( name Model version Disk storage ) ]);

for my $name (sort keys %data) {

  my $line = $data{$name};

  if ( ( grep { defined($_) && length($_) } @$line ) >= 5 ) {
    $out->print($result, $line);
  }
}

close $result or die qq(Cannot close results file: $!);
Answered 2012-06-17T20:13:31.583