
For example, I have a tab-delimited file:

ID   NAME      FAMILYTAG     EFFECT
001  John      Black         Positive
002  Kate      Rhodes,Mich   Positive
003  Aaron     Sunders       Negative
004  Shirley   Rhodes        Negative
005  Dexter    Sunders,Hark  Positive

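I want to read in this file (the real one is much larger) together with a name, for example Kate. The script should identify that person's family tags, here noting that they include Rhodes, and then output the other family member, Shirley. Is there a way to do this? The output file would look like this: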

Kate  Rhodes 
Shirley Rhodes

4 Answers

1

Given your input, here is one way to get the desired output:

use warnings;
use strict;

my %names;
while (<DATA>) {
    next if /^ID/;
    my ($first, $last) = (split)[1 .. 2];
    $last =~ s/,.*//;    # keep only the first family tag ("Rhodes,Mich" -> "Rhodes")
    push @{ $names{$last} }, $first;
}
print "$_ Rhodes\n" for @{ $names{Rhodes} };

__DATA__
ID     NAME   FAMILYTAG   EFFECT
001  John      Black               Positive
002  Kate      Rhodes, Mich           Positive
003  Aaron   Sunders          Negative
004  Shirley  Rhodes          Negative
005  Dexter    Sunders        Positive

Copied from my answer on PerlMonks.

answered 2012-04-27T16:07:03.570
1

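It isn't clear to me what multiple names in the FAMILYTAG column signify, but I put this together on the assumption that they are alternative surnames.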

use strict;
use warnings;

my %names;
my %families;

while (<DATA>) {
  next unless /^\d/;
  my ($id, $name, $familytag, $effect) = split /\t/;
  for my $tag (split /,/, $familytag) {
    push @{ $names{$name} }, $tag;
    push @{ $families{$tag} }, $name;
  }
}

while () {

  print "\nName: ";
  chomp (my $name = <>);
  last unless $name =~ /\S/;
  print "\n";

  if (my $tags = $names{$name}) {
    for my $tag (@$tags) {
      my $names = $families{$tag};
      next unless @$names > 1;
      printf "%s %s\n", $_, $tag for @$names;
    }
  }
  else {
    warn qq(No name "$name" found);
  }
}


__DATA__
ID  NAME    FAMILYTAG   EFFECT
001 John    Black   Positive
002 Kate    Rhodes,Mich Positive
003 Aaron   Sunders Negative
004 Shirley Rhodes  Negative
005 Dexter  Sunders,Hark    Positive

Output

E:\Perl\source>ff.pl

Name: Kate

Kate Rhodes
Shirley Rhodes

Name: Aaron

Aaron Sunders
Dexter Sunders

Name: Mike

No name "Mike" found at E:\Perl\source\ff.pl line 31, <> line 3.

Name: Dexter

Aaron Sunders
Dexter Sunders
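
The question asks for a file-driven lookup rather than an interactive prompt. The following is a minimal sketch of the same two-hash idea packaged as subs. The names `build_index` and `relatives` are hypothetical, and the sample rows are fed from an in-memory string, assuming real tab characters between the columns.

```perl
use strict;
use warnings;

# Build two indexes from tab-separated lines:
#   name -> list of family tags
#   tag  -> list of names carrying that tag
sub build_index {
    my ($fh) = @_;
    my (%tags_of, %names_of);
    while (my $line = <$fh>) {
        chomp $line;
        next unless $line =~ /^\d/;    # skip the header and blank lines
        my (undef, $name, $familytag) = split /\t/, $line;
        for my $tag (split /,/, $familytag) {
            push @{ $tags_of{$name} },  $tag;
            push @{ $names_of{$tag} }, $name;
        }
    }
    return (\%tags_of, \%names_of);
}

# Return "Name Tag" lines for every tag of $name shared by more than one person.
sub relatives {
    my ($name, $tags_of, $names_of) = @_;
    my @lines;
    for my $tag (@{ $tags_of->{$name} || [] }) {
        my $members = $names_of->{$tag};
        next unless @$members > 1;
        push @lines, map { "$_ $tag" } @$members;
    }
    return @lines;
}

# In-memory copy of the sample rows; a real script would open the input file instead.
my $sample = join '', map { "$_\n" } (
    "ID\tNAME\tFAMILYTAG\tEFFECT",
    "001\tJohn\tBlack\tPositive",
    "002\tKate\tRhodes,Mich\tPositive",
    "003\tAaron\tSunders\tNegative",
    "004\tShirley\tRhodes\tNegative",
    "005\tDexter\tSunders,Hark\tPositive",
);
open my $in, '<', \$sample or die "Cannot open in-memory sample: $!";
my ($tags_of, $names_of) = build_index($in);
close $in;

print "$_\n" for relatives('Kate', $tags_of, $names_of);
```

With the sample rows this prints "Kate Rhodes" and "Shirley Rhodes". To produce an output file, the same lines could be printed to a handle opened for writing instead of STDOUT.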
answered 2012-04-27T22:25:56.550
0
#!/usr/bin/perl

use strict;
use warnings;
my %db;

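# '1.pl.tst' is the input file name used in this answer
open(my $fh, '<', '1.pl.tst') or die "Cannot open 1.pl.tst: $!";

my $find="Kate";
while(<$fh>)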
{
    chomp;
    if (/^(\d+)[\t\ ]+(\w+)[\t\ ]+([^\t\ ]+)[\t\ ]+(\w+)$/)
    {
        $db{$1}{'name'}=$2;
        $db{$1}{'family'}=[split(',',$3)];
        $db{$1}{'effect'}=$4;
    }
}

my @family=@{ name2family($find) || [] };   # empty list if the name is not found
foreach (@family)
{
    family2name($_);
}

sub name2family
{
    my $name=shift;
    foreach (keys %db)
    {
        if ($db{$_}{'name'} eq $name)
        {
            return $db{$_}{'family'};
        }
    }
}

sub family2name
{
    my $family=shift;
    foreach my $k (keys %db)
    {
        foreach (@{$db{$k}{'family'}})
        {
            if ($_ eq $family)
            {
                print $db{$k}{'name'}."\t\t".$_."\n";
            }
        }
    }
}
answered 2012-04-27T16:33:23.873
0

Text::CSV can be told to use a different separator; "\t" in this case.

use Text::CSV;

my $tsv = Text::CSV->new ( { sep_char => "\t" } );

Then use the $tsv object the same way the $csv object is used in that module's examples.
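
As a rough illustration of that approach, here is a minimal sketch that reads the sample rows with `getline` and groups names by family tag. It assumes Text::CSV is installed from CPAN. The `%names_of` hash and the in-memory sample are illustrative and not part of the module's examples.

```perl
use strict;
use warnings;
use Text::CSV;

# sep_char switches the field separator to a tab.
my $tsv = Text::CSV->new({ sep_char => "\t", binary => 1 })
    or die "Cannot use Text::CSV: " . Text::CSV->error_diag;

# In-memory copy of the sample rows; a real script would open the input file instead.
my $sample = join '', map { "$_\n" } (
    "ID\tNAME\tFAMILYTAG\tEFFECT",
    "001\tJohn\tBlack\tPositive",
    "002\tKate\tRhodes,Mich\tPositive",
    "003\tAaron\tSunders\tNegative",
    "004\tShirley\tRhodes\tNegative",
    "005\tDexter\tSunders,Hark\tPositive",
);
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";

my %names_of;          # family tag -> list of names
$tsv->getline($fh);    # discard the header row
while (my $row = $tsv->getline($fh)) {
    my (undef, $name, $familytag) = @$row;
    push @{ $names_of{$_} }, $name for split /,/, $familytag;
}
close $fh;

print "$_ Rhodes\n" for @{ $names_of{Rhodes} || [] };
```

The comma-separated FAMILYTAG field still has to be split by hand, since the module only handles the tab-separated columns.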

answered 2012-04-30T18:13:11.700