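I have 2 CSV files, file1.csv and file2.csv. I have to take each row of column 3 in file1 and iterate over column 3 of file2 to find a match. When a match occurs, only the complete matching rows from file2.csv (columns 1, 2 and 3) should be written to a third CSV file. So far my code only picks out column 3 from both CSV files. How can I match column 3 of the two files and display the matching rows? Please help.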

File1:

Comp_Name,Date,Files
Component1,2013/04/01,/Com/src/folder1/folder2/newfile.txt;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile24;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile25;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile26;
Component1,2013/04/25,/Com/src2;

File2:

Comp_name,Date,Files
Component1,2013/04/07,/Com/src/folder1/folder2/newfile.txt;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile24;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile25;
Component2,2013/04/23,/Com/src/folder1/folder2/newfile.txt;
Component3,2013/04/27,/Com/src/folder1/folder2/testfile24;
Component1,2013/04/25,/Com/src2;

Output format:

Comp_Name,Date,Files
Component1,2013/04/07,/Com/src/folder1/folder2/newfile.txt;
Component2,2013/04/23,/Com/src/folder1/folder2/newfile.txt;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile24;
Component3,2013/04/27,/Com/src/folder1/folder2/testfile24;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile25;
Component1,2013/04/25,/Com/src2;

Code:

use strict;
use warnings;

my $file1 = "C:\\pick\\file1.csv";
my $file2 = "C:\\pick\\file2.csv";
my $file3 = "C:\\pick\\file3.csv";

my $type;
my $type1;
my @fields;
my @fields2;

open(my $fh, '<:encoding(UTF-8)', $file1) or die "Could not open file '$file1' $!"; # Throw error if file doesn't open
while (my $row = <$fh>) # reading each row till end of file
{
    chomp $row;
    @fields = split ",", $row;
    $type = $fields[2];
    print "\n$type";
}

open(my $fh2, '<:encoding(UTF-8)', $file2) or die "Could not open file '$file2' $!"; # Throw error if file doesn't open
while (my $row2 = <$fh2>) # reading each row till end of file
{
    chomp $row2;
    @fields2 = split ",", $row2;
    $type1 = $fields2[2];
    print "\n$type1";
    foreach ($type)
    {
        if ($type eq $type1)
        {
            print $row2;
        }
    }
}

3 Answers

This is a job for a hash (my %hash1).

So rather than opening the files over and over, you can read the contents into a hash:

@fields = split ",",$row;
$type = $fields[2];
$hash1{$type} = $row;

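I see you also have duplicates, so a hash entry would be overwritten whenever a key repeats.

So you can store an array of values in the hash instead: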

$hash1{$type} = [] unless $hash1{$type};
push @{$hash1{$type}}, $row;

Your next question will be how to iterate over an array stored in a hash.
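
As a rough illustration of that last step, here is a minimal sketch. It makes a few assumptions that are not in the question: the rows are held in the hypothetical in-memory lists @file1_rows and @file2_rows rather than read from disk, the hash is called %rows_by_path, and it is file2 (the file whose rows should be shown) that gets indexed by its third column.

```perl
use strict;
use warnings;

# Hypothetical in-memory rows standing in for the lines of file1.csv and file2.csv.
my @file1_rows = (
    'Component1,2013/04/01,/Com/src/folder1/folder2/newfile.txt;',
    'Component1,2013/04/24,/Com/src/folder1/folder2/testfile24;',
);
my @file2_rows = (
    'Component1,2013/04/07,/Com/src/folder1/folder2/newfile.txt;',
    'Component2,2013/04/23,/Com/src/folder1/folder2/newfile.txt;',
    'Component1,2013/04/25,/Com/src2;',
);

# Group the file2 rows by their third column; each hash value is an array reference.
my %rows_by_path;
for my $row (@file2_rows) {
    my $path = ( split /,/, $row )[2];
    push @{ $rows_by_path{$path} }, $row;    # the array is autovivified on first use
}

# For each third-column value in file1, dereference the stored array and walk it.
my ( @matches, %done );
for my $row (@file1_rows) {
    my $path = ( split /,/, $row )[2];
    next if $done{$path}++;                  # handle each path only once
    next unless exists $rows_by_path{$path};
    for my $match ( @{ $rows_by_path{$path} } ) {
        push @matches, $match;
    }
}

print "$_\n" for @matches;
```

The key piece is @{ $rows_by_path{$path} }, which turns the stored array reference back into a list that an ordinary for loop can walk.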

answered 2013-06-14T08:06:12.917

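Here is an example using my Tie::Array::CSV module. It uses some clever Perl tricks to represent each CSV file as a Perl array of arrayrefs. I use it to build an index of the first file, then loop over the second file, and finally write the output to a third file.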

#!/usr/bin/env perl

use strict;
use warnings;

use Tie::Array::CSV;

tie my @file1,  'Tie::Array::CSV', 'file1'  or die 'Cannot tie file1';
tie my @file2,  'Tie::Array::CSV', 'file2'  or die 'Cannot tie file2';
tie my @output, 'Tie::Array::CSV', 'output' or die 'Cannot tie output';

# set up a match table from the last column of file1
my %match = map { ( $_->[-1] => 1 ) } @file1[1..$#file1];

#header
push @output, $file2[0];

# iterate over file2
for my $row ( @file2[1..$#file2] ) {
  next unless $match{$row->[-1]}; # check for match
  push @output, $row; # print to output if match
}

I get different output from yours, but I can't work out why your output doesn't contain testfile25 and src2.

answered 2013-06-14T16:28:51.343

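This isn't an overly complicated thing to do. Personally I would use a module here, either Text::CSV_XS or, as already mentioned, Tie::Array::CSV.

If you have trouble using a module, I suppose this would be an alternative. You can modify it to suit your needs. I used the data you provided and got the result you wanted.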

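use strict;
use warnings;

open my $fh1, '<', 'file1.csv' or die "failed open: $!";
open my $fh2, '<', 'file2.csv' or die "failed open: $!";
open my $out, '>', 'file3.csv' or die "failed open: $!";

# Index the third column of file1, skipping its header line.
scalar <$fh1>;
my %seen;
while ( my $line = <$fh1> ) {
    $line =~ s/\s+\z//;
    my $path = ( split /,/, $line )[2];
    $seen{$path} = 1 if defined $path;
}
close $fh1;

# Pass the file2 header through unchanged.
my $header = <$fh2>;
$header =~ s/\s+\z//;

# Keep the file2 rows whose third column occurs in file1,
# ordered by file path, then component name, then date.
my @result =
      map  { join ',', @{$_}[0 .. 2] }
      sort { $a->[2] cmp $b->[2] || $a->[0] cmp $b->[0] || $a->[1] cmp $b->[1] }
      grep { defined $_->[2] && $seen{ $_->[2] } }
      map  { s/\s+\z//; [split /,/] } <$fh2>;
close $fh2;

print $out "$_\n" for $header, @result;

close $out;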

__OUTPUT__
Comp_name,Date,Files
Component1,2013/04/07,/Com/src/folder1/folder2/newfile.txt;
Component2,2013/04/23,/Com/src/folder1/folder2/newfile.txt;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile24;
Component3,2013/04/27,/Com/src/folder1/folder2/testfile24;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile25;
Component1,2013/04/25,/Com/src2;
answered 2013-06-15T07:39:51.460