
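I'm new to Perl and I would like to compute directory distances.

Below is an example of the directory distance calculation.

Suppose I have this list of file paths: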

 abc/a.h                
 abc/clipboard/b.cc             
 abc/gfx/d.cc               
 abc/gfx/e.cc               
 abc/gfx/gl/f.cc                
 abc/gfx/gl/h.cc                
 abc/gfx/gl/tr/aq/i.cc

The resulting table would be:

       file1              |  file2                    |  Dir. distance
--------------------------+---------------------------+-----------------
     abc/a.h              |   abc/clipboard/b.cc      |  1
     abc/a.h              |   abc/gfx/d.cc            |  1
     abc/a.h              |   abc/gfx/e.cc            |  1
     abc/a.h              |   abc/gfx/gl/f.cc         |  2
     abc/a.h              |   abc/gfx/gl/h.cc         |  2
     abc/a.h              |   abc/gfx/gl/tr/aq/i.cc   |  4
     abc/clipboard/b.cc   |   abc/gfx/d.cc            |  2
     abc/clipboard/b.cc   |   abc/gfx/e.cc            |  2
     abc/clipboard/b.cc   |   abc/gfx/gl/f.cc         |  3
     abc/clipboard/b.cc   |   abc/gfx/gl/h.cc         |  3
     abc/clipboard/b.cc   |   abc/gfx/gl/tr/aq/i.cc   |  5
     abc/gfx/d.cc         |   abc/gfx/e.cc            |  0
     abc/gfx/d.cc         |   abc/gfx/gl/f.cc         |  1
     abc/gfx/d.cc         |   abc/gfx/gl/h.cc         |  1
     abc/gfx/d.cc         |   abc/gfx/gl/tr/aq/i.cc   |  3
     abc/gfx/e.cc         |   abc/gfx/gl/f.cc         |  1
     abc/gfx/e.cc         |   abc/gfx/gl/h.cc         |  1
     abc/gfx/e.cc         |   abc/gfx/gl/tr/aq/i.cc   |  3
     abc/gfx/gl/f.cc      |   abc/gfx/gl/h.cc         |  0
     abc/gfx/gl/f.cc      |   abc/gfx/gl/tr/aq/i.cc   |  2
     abc/gfx/gl/h.cc      |   abc/gfx/gl/tr/aq/i.cc   |  2

I believe this is doable in Perl, but I'm not sure which coding approach to use.

Does anyone know a way to do this?

Regards,


3 Answers


This version also prints the relative paths between the two files:

use strict;
use warnings;

my @files = <>;

for my $source (@files) {
    for my $destination (@files) {
        chomp ($source, $destination);

        next if $source eq $destination
            or $source gt $destination;

        my @sourceParts = split "/", $source;
        my @destinationParts = split "/", $destination;

        # ignore the actual filename
        my $sourceName = pop @sourceParts;
        my $destinationName = pop @destinationParts;

        # discard the matching parts of the paths
        while(@sourceParts and @destinationParts) {
            last if $sourceParts[0] ne $destinationParts[0];

            shift @sourceParts;
            shift @destinationParts;
        }

        # count the non-matching directories
        my $distance = @sourceParts + @destinationParts;

        # relative paths
        my $rel2source = "../" x @destinationParts . join "/", @sourceParts, $sourceName;
        my $rel2destination = "../" x @sourceParts . join "/", @destinationParts, $destinationName;

        printf("%30s | %30s | %30s | %30s | %3d\n",
            $source,
            $destination,
            $rel2destination,
            $rel2source,
            $distance,
            );
    }
}

exit;

Result:

               abc/a.h |             abc/clipboard/b.cc |                 clipboard/b.cc |                         ../a.h |   1
               abc/a.h |                   abc/gfx/d.cc |                       gfx/d.cc |                         ../a.h |   1
               abc/a.h |                   abc/gfx/e.cc |                       gfx/e.cc |                         ../a.h |   1
               abc/a.h |                abc/gfx/gl/f.cc |                    gfx/gl/f.cc |                      ../../a.h |   2
               abc/a.h |                abc/gfx/gl/h.cc |                    gfx/gl/h.cc |                      ../../a.h |   2
               abc/a.h |          abc/gfx/gl/tr/aq/i.cc |              gfx/gl/tr/aq/i.cc |                ../../../../a.h |   4
    abc/clipboard/b.cc |                   abc/gfx/d.cc |                    ../gfx/d.cc |              ../clipboard/b.cc |   2
    abc/clipboard/b.cc |                   abc/gfx/e.cc |                    ../gfx/e.cc |              ../clipboard/b.cc |   2
    abc/clipboard/b.cc |                abc/gfx/gl/f.cc |                 ../gfx/gl/f.cc |           ../../clipboard/b.cc |   3
    abc/clipboard/b.cc |                abc/gfx/gl/h.cc |                 ../gfx/gl/h.cc |           ../../clipboard/b.cc |   3
    abc/clipboard/b.cc |          abc/gfx/gl/tr/aq/i.cc |           ../gfx/gl/tr/aq/i.cc |     ../../../../clipboard/b.cc |   5
          abc/gfx/d.cc |                   abc/gfx/e.cc |                           e.cc |                           d.cc |   0
          abc/gfx/d.cc |                abc/gfx/gl/f.cc |                        gl/f.cc |                        ../d.cc |   1
          abc/gfx/d.cc |                abc/gfx/gl/h.cc |                        gl/h.cc |                        ../d.cc |   1
          abc/gfx/d.cc |          abc/gfx/gl/tr/aq/i.cc |                  gl/tr/aq/i.cc |                  ../../../d.cc |   3
          abc/gfx/e.cc |                abc/gfx/gl/f.cc |                        gl/f.cc |                        ../e.cc |   1
          abc/gfx/e.cc |                abc/gfx/gl/h.cc |                        gl/h.cc |                        ../e.cc |   1
          abc/gfx/e.cc |          abc/gfx/gl/tr/aq/i.cc |                  gl/tr/aq/i.cc |                  ../../../e.cc |   3
       abc/gfx/gl/f.cc |                abc/gfx/gl/h.cc |                           h.cc |                           f.cc |   0
       abc/gfx/gl/f.cc |          abc/gfx/gl/tr/aq/i.cc |                     tr/aq/i.cc |                     ../../f.cc |   2
       abc/gfx/gl/h.cc |          abc/gfx/gl/tr/aq/i.cc |                     tr/aq/i.cc |                     ../../h.cc |   2
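The relative paths in the table can be cross-checked with the core File::Spec module. This is only a sketch under a few assumptions: Unix-style paths, every path containing at least one directory, and a helper name (abs2rel_distance) that is made up for illustration. It asks File::Spec for the relative path from the source file's directory to the destination file, then counts every component except the file name as one directory step.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Spec;

# Hypothetical helper: derive the distance from the relative path that
# File::Spec computes from the source's directory to the destination file.
sub abs2rel_distance {
    my ($source, $destination) = @_;

    # e.g. abc/gfx/d.cc relative to abc/clipboard is ../gfx/d.cc
    my $rel   = File::Spec->abs2rel($destination, dirname($source));
    my @parts = File::Spec->splitdir($rel);

    # every component except the trailing file name is one directory step
    return @parts - 1;
}

printf "%d\n", abs2rel_distance('abc/clipboard/b.cc', 'abc/gfx/gl/tr/aq/i.cc');
printf "%d\n", abs2rel_distance('abc/gfx/d.cc',       'abc/gfx/e.cc');
```

Each ".." in the relative path is a step up from the source directory and each remaining directory name is a step down, which is the same sum the script above computes by hand.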
answered 2013-09-25T19:13:55.687

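Suggested algorithm

Here I treat the file paths as paths in a tree data structure.

  • Let set A consist of the nodes in the first path
  • Let set B consist of the nodes in the second path
  • Remove the leaf node (the file name) of each path from its set
  • Take the symmetric set difference of A and B. In short, D = A diff B
  • The number of elements in D is the distance between the two paths in the tree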
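The steps above can be sketched with plain hashes and no extra modules. This is a minimal illustration, not a replacement for the script that follows; the helper name set_distance is an assumption made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: directory distance as the size of the symmetric
# difference between the two sets of directory names.
sub set_distance {
    my ($path1, $path2) = @_;

    my @a = split m{/}, $path1;
    my @b = split m{/}, $path2;
    pop @a;    # drop the leaf node (file name) of path 1
    pop @b;    # drop the leaf node (file name) of path 2

    my %in_a = map { $_ => 1 } @a;
    my %in_b = map { $_ => 1 } @b;

    # D = names that occur in exactly one of the two sets
    my @only_a = grep { !exists $in_b{$_} } keys %in_a;
    my @only_b = grep { !exists $in_a{$_} } keys %in_b;

    return @only_a + @only_b;
}

print set_distance('abc/clipboard/b.cc', 'abc/gfx/gl/tr/aq/i.cc'), "\n";   # 5
print set_distance('abc/gfx/d.cc',       'abc/gfx/e.cc'),          "\n";   # 0
```

Because the directory names are compared as sets, their position in the path is ignored. The approach therefore assumes that no directory name occurs at different levels: for 'a/b/f' and 'b/a/g' both sets are {a, b}, so the result is 0 even though the two files do not share a directory.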

Perl script

#!/usr/bin/perl
use warnings;
use strict;
use Array::Utils qw(:all);

my @dirs = (
                'abc/a.h',
                'abc/clipboard/b.cc',
                'abc/gfx/d.cc',
                'abc/gfx/e.cc',
                'abc/gfx/gl/f.cc',
                'abc/gfx/gl/h.cc',
                'abc/gfx/gl/tr/aq/i.cc'
            );

for (my $i=0; $i<@dirs-1; $i++) {
    my $d1 = $dirs[$i];
    for (my $j=$i+1; $j<@dirs; $j++) {
        my $d2 = $dirs[$j];
        # Set A of nodes in path 1 after discarding leaf node
        my @d1 = split '/', $d1; pop @d1;
        # Set B of nodes in path 2 after discarding leaf node
        my @d2 = split '/', $d2; pop @d2;
        # Set difference D = A diff B
        my @diff = array_diff(@d1, @d2);
        # No of elements in set D
        my $diff = @diff;
        # Print result in desired format
        print "$d1\t| $d2\t| $diff\n";
    }
}

Output

abc/a.h | abc/clipboard/b.cc    | 1
abc/a.h | abc/gfx/d.cc  | 1
abc/a.h | abc/gfx/e.cc  | 1
abc/a.h | abc/gfx/gl/f.cc   | 2
abc/a.h | abc/gfx/gl/h.cc   | 2
abc/a.h | abc/gfx/gl/tr/aq/i.cc | 4
abc/clipboard/b.cc  | abc/gfx/d.cc  | 2
abc/clipboard/b.cc  | abc/gfx/e.cc  | 2
abc/clipboard/b.cc  | abc/gfx/gl/f.cc   | 3
abc/clipboard/b.cc  | abc/gfx/gl/h.cc   | 3
abc/clipboard/b.cc  | abc/gfx/gl/tr/aq/i.cc | 5
abc/gfx/d.cc    | abc/gfx/e.cc  | 0
abc/gfx/d.cc    | abc/gfx/gl/f.cc   | 1
abc/gfx/d.cc    | abc/gfx/gl/h.cc   | 1
abc/gfx/d.cc    | abc/gfx/gl/tr/aq/i.cc | 3
abc/gfx/e.cc    | abc/gfx/gl/f.cc   | 1
abc/gfx/e.cc    | abc/gfx/gl/h.cc   | 1
abc/gfx/e.cc    | abc/gfx/gl/tr/aq/i.cc | 3
abc/gfx/gl/f.cc | abc/gfx/gl/h.cc   | 0
abc/gfx/gl/f.cc | abc/gfx/gl/tr/aq/i.cc | 2
abc/gfx/gl/h.cc | abc/gfx/gl/tr/aq/i.cc | 2
answered 2013-09-25T18:59:28.910

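The sub distance below should get you started. Call it for each of the nC2 pairs of paths, for example generated with Math::Combinatorics, and you will get what you want.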

#!/usr/bin/perl 

use strict;
use warnings;

distance('abc/clipboard/b.cc','abc/gfx/gl/tr/aq/i.cc');#5

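sub distance
{
    my ($path1,$path2)=@_;
    my @levels1=split(/\//,$path1);
    my @levels2=split(/\//,$path2);
    # drop the file names; only directories count towards the distance
    pop @levels1;
    pop @levels2;
    # length of the common leading directory prefix (stop at the first mismatch)
    my $depth=0;
    while($depth<@levels1 && $depth<@levels2 && $levels1[$depth] eq $levels2[$depth])
    {
        $depth++;
    }
    # directories left over on both sides after the common prefix
    printf("%s %s %d\n",$path1,$path2,scalar(@levels1)+scalar(@levels2)-(2*$depth));
}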
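The nC2 pairs mentioned above can also be enumerated without any module by pairing each file with every later file in the list. The following is a sketch under stated assumptions: dir_distance is a hypothetical variant of the sub that returns the number instead of printing it, and the three-element file list is only sample input.

```perl
use strict;
use warnings;

# Hypothetical variant of the distance sub that returns the value:
# the directories left over on both sides after the common prefix.
sub dir_distance {
    my ($path1, $path2) = @_;
    my @dirs1 = split m{/}, $path1;
    my @dirs2 = split m{/}, $path2;
    pop @dirs1;    # ignore the file names
    pop @dirs2;

    my $common = 0;
    while ($common < @dirs1 && $common < @dirs2
           && $dirs1[$common] eq $dirs2[$common]) {
        $common++;
    }
    return @dirs1 + @dirs2 - 2 * $common;
}

# sample input only
my @files = qw(abc/a.h abc/clipboard/b.cc abc/gfx/d.cc);

# all nC2 unordered pairs: each file with every later file
for my $i (0 .. $#files - 1) {
    for my $j ($i + 1 .. $#files) {
        printf "%-20s | %-20s | %d\n",
            $files[$i], $files[$j], dir_distance($files[$i], $files[$j]);
    }
}
```

For n files the nested loops visit n(n-1)/2 pairs, the same pairs a combinations generator would produce for k = 2.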
answered 2013-09-25T18:58:38.900