
Consider the following data:

12 45 64  
12 45 76  
12 37 39 87
12 67 90  
12 39 60  

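In this example there are only ten distinct numbers. If I have a large amount of data, how can I compute this in Perl?

There is a link from 12 to 45 and from 45 to 64, but no link from 12 to 64.

There is no route from 45 to 12, so the neighborhood of 12 has size 4 (45, 37, 67, 39), and the neighborhood of 39 has size 2 (87, 60).

How can I compute the neighborhood of every value in this data?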

Edit

Another requirement is that we ignore any value that points to itself. For example, suppose we have this file:

1 4 3
1 2 2
2 6 7

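In this example, the neighborhood of 1 must be (4, 2).

The neighborhood of 4 must be 3.

The neighborhood of 2 must be 6 (and not 2).

In other words, we must drop both self-matches and duplicates.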

4 Answers

use strict;
use warnings;

my $data = <<'END_DATA';
12 45 64 
12 45 76 
12 37 39 87 
12 67 90 
12 39 60
END_DATA

my @lines = split/\n+/, $data;

# Map each number to the set of numbers that directly follow it in a line
my %neighborhoods = (); 
for my $line ( @lines ) { 
    my @nums = split m/\s+/,$line;
    for my $i ( 0 .. $#nums - 1 ) { 
        $neighborhoods{$nums[$i]}{$nums[$i+1]} = 1; 
    }
} 

foreach my $num ( sort keys %neighborhoods ) { 
   print "num [$num] neighborhood (" . 
         ( join "-", keys %{$neighborhoods{$num}} ) . 
         ") count [" . ( scalar keys %{$neighborhoods{$num}} ) . 
         "]\n"; 
}

Output:

num [12] neighborhood (67-39-37-45) count [4]
num [37] neighborhood (39) count [1]
num [39] neighborhood (60-87) count [2]
num [45] neighborhood (64-76) count [2]
num [67] neighborhood (90) count [1]
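
The loop above does not filter self-links, so a line such as `1 2 2` would record 2 as its own neighbor. Below is a minimal sketch of how the same hash-of-hashes idea might honor the "ignore values that point to themselves" rule. The subroutine name `neighborhoods` and its return shape (a hash ref of numerically sorted lists) are assumptions made for illustration, and the input is assumed to be purely numeric.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the answer above): takes a list of
# whitespace-separated lines and returns a hash ref mapping each number to a
# numerically sorted list of the distinct numbers that directly follow it.
sub neighborhoods {
    my @lines = @_;
    my %next;
    for my $line (@lines) {
        my @nums = split ' ', $line;    # ' ' also trims leading/trailing blanks
        for my $i ( 0 .. $#nums - 1 ) {
            my ( $from, $to ) = @nums[ $i, $i + 1 ];
            next if $from == $to;       # skip a value that points to itself
            $next{$from}{$to} = 1;      # hash keys collapse duplicates
        }
    }
    my %result;
    for my $from ( keys %next ) {
        $result{$from} = [ sort { $a <=> $b } keys %{ $next{$from} } ];
    }
    return \%result;
}

my $n = neighborhoods( '1 4 3', '1 2 2', '2 6 7' );
for my $num ( sort { $a <=> $b } keys %$n ) {
    printf "%d: %d neighbor(s) (%s)\n",
      $num, scalar @{ $n->{$num} }, join ', ', @{ $n->{$num} };
}
```

Sorting the neighbor lists makes the printed order independent of Perl's hash ordering, which otherwise varies between runs.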
answered 2012-12-15T22:01:02.120

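If I understand correctly, you want to count the number of nodes in the neighborhood of each node in a graph. I think this does what you want.

I have revised my code now that you have explained that an edge from a node to itself should be ignored.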

use v5.10;
use strict;
use warnings;

my %routes;

while (<DATA>) {
  my @nodes = /\d+/g;
  $routes{$_} //= {} for @nodes;
  while (@nodes >= 2) {
    my ($from, $to) = @nodes;
    $routes{$from}{$to}++ unless $from == $to;
    shift @nodes;
  }
}

for my $key (sort { $a <=> $b } keys %routes) {
  my $val = $routes{$key};
  printf "%d - neighbourhood size %d",
      $key,
      scalar keys %$val;
  printf " (%s)", join ', ', keys %$val if %$val;
  print "\n";
}

__DATA__
12 45 64  
12 45 76  
12 37 39 87  
12 67 90  
12 39 60
1 4 3
1 2 2
2 6 7

Output

1 - neighbourhood size 2 (4, 2)
2 - neighbourhood size 1 (6)
3 - neighbourhood size 0
4 - neighbourhood size 1 (3)
6 - neighbourhood size 1 (7)
7 - neighbourhood size 0
12 - neighbourhood size 4 (67, 39, 37, 45)
37 - neighbourhood size 1 (39)
39 - neighbourhood size 2 (60, 87)
45 - neighbourhood size 2 (64, 76)
60 - neighbourhood size 0
64 - neighbourhood size 0
67 - neighbourhood size 1 (90)
76 - neighbourhood size 0
87 - neighbourhood size 0
90 - neighbourhood size 0
answered 2012-12-15T22:08:54.377

Using the Graph::Directed module:

use strict;
use warnings;
use Graph::Directed qw( );

my $graph = Graph::Directed->new();
while (<>) {
   my @points = split;
   $graph->add_edge(@points[$_-1, $_])
      for 1..$#points;
}

for my $vertex ($graph->vertices()) {
   my @successors = grep $_ != $vertex, $graph->successors($vertex);
   print("$vertex has ".@successors." successors: @successors\n");
}

Input:

1 4 3
1 2 2
2 6 7

Output:

6 has 1 successors: 7
4 has 1 successors: 3
2 has 1 successors: 6
1 has 2 successors: 4 2
3 has 0 successors:
7 has 0 successors:
answered 2012-12-15T22:21:06.067

Here's another option:

use strict;
use warnings;

my ( %hash, %seen );

while (<DATA>) {
    my @nums = split;

    for my $i ( 0 .. $#nums - 1 ) {
        push @{ $hash{ $nums[$i] } }, $nums[ $i + 1 ]
          if !$seen{ $nums[$i] }{ $nums[ $i + 1 ] }++;
    }
}

for my $num ( sort { $a <=> $b } keys %hash ) {
    print "$num has " . @{ $hash{$num} } . " neighbor(s): @{$hash{$num}}\n";
}

__DATA__
12 45 64  
12 45 76  
12 37 39 87
12 67 90  
12 39 60

Output:

12 has 4 neighbor(s): 45 37 67 39
37 has 1 neighbor(s): 39
39 has 2 neighbor(s): 87 60
45 has 2 neighbor(s): 64 76
67 has 1 neighbor(s): 90
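
The `%seen` check above removes duplicates while keeping neighbors in first-seen order, but it has no test for a value that points to itself. Below is a hedged sketch of how that extra rule might be added to the same loop. The subroutine name `ordered_neighbors` and the extra sample line `1 4 5` are assumptions introduced only for illustration, and numeric input is assumed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the answer above): same push/%seen
# technique, but a self-link never reaches the push, and neighbors stay in
# the order in which they were first seen.
sub ordered_neighbors {
    my @lines = @_;
    my ( %hash, %seen );
    for my $line (@lines) {
        my @nums = split ' ', $line;
        for my $i ( 0 .. $#nums - 1 ) {
            my ( $from, $to ) = @nums[ $i, $i + 1 ];
            next if $from == $to;            # ignore a pair such as "2 2"
            push @{ $hash{$from} }, $to
              unless $seen{$from}{$to}++;    # keep the first occurrence only
        }
    }
    return \%hash;
}

my $h = ordered_neighbors( '1 4 3', '1 2 2', '2 6 7', '1 4 5' );
print "$_ has " . @{ $h->{$_} } . " neighbor(s): @{ $h->{$_} }\n"
  for sort { $a <=> $b } keys %$h;
```

Because the lists are built with `push`, the output order follows the input rather than Perl's hash ordering.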
answered 2012-12-15T23:36:21.950