Perl script to get values within a range from a matrix file

Matrix.txt (contents, with tabs (\t) and newlines (\n) shown explicitly)

\t100050\t100070\t100100\t100200\t100300\n
100050\t1\t0.0890344\t0.361651\t0.266263\t0.368639\n
100070\t0.0890344\t1\t0.0873663\t0.0267854\t0.148069\n
100100\t0.361651\t0.0873663\t1\t0.0423538\t0.269991\n
100200\t0.266263\t0.0267854\t0.0423538\t1\t0.215814\n
100300\t0.368639\t0.148069\t0.269991\t0.215814\t1

The matrix file looks like:
--------100050 100070 100100 100200 100300
100050 1 0.0890344 0.361651 0.266263 0.368639
100070 0.0890344 1 0.0873663 0.0267854 0.148069
100100 0.361651 0.0873663 1 0.0423538 0.269991
100200 0.266263 0.0267854 0.0423538 1 0.215814
100300 0.368639 0.148069 0.269991 0.215814 1

I only need the values in the range 0.3 to 1, printed together with their two header labels (a value below 0.3 should not be printed).

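The matrix is symmetric: for all indices $i and $j, $m[$i][$j] == $m[$j][$i], so the two values are the same.

If $i $j $v has already been printed, there is no need to print $j $i $v. For example, once 100050 100100 0.361651 has been printed, there is no need to print 100100 100050 0.361651.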
output.txt

Label1  label2  value
100050  100050  1
100050  100100  0.361651
100050  100300  0.368639
100070  100070  1
100100  100100  1
100200  100200  1
100300  100300  1
2 Answers

A lengthy one-liner:

 perl -anE 'if(!@h){@h=@F;next} $l{$F[0]}{$h[$_]} = $F[$_] for 1..$#h }{shift@h; $_->[2]<0.3 or say "@$_" for map {$t=$_; map [$t,$_,$l{$t}{$_}], @h}@h' file

or a more explicit version:

# opening the file
open my $fh, "<", "file" or die $!;

my @header;  
my %matrix;

while (my $line = <$fh>) {
    my ($label, @F) = split /\s+/, $line;  # split the line into fields

    if (!@header) {
        @header = @F;
        next;
    }

    # assign the fields through a hash slice
    @{ $matrix{$label} }{@header} = @F;
}
close $fh;

my @arr = map {
  my $label = $_; 
  map [ $label, $_, $matrix{$label}{$_} ], @header;
} @header;

for my $el (@arr) {
    print "@$el\n" if $el->[2] >= 0.3;
}
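
Note that the explicit version above prints every (label1, label2) combination at or above the threshold, so an off-diagonal pair such as 100050/100100 appears twice. Since the question states the matrix is symmetric, one way to print each pair only once is to walk just the diagonal and the upper triangle. The following is a minimal sketch under that assumption, not part of the original answer; `pairs_in_range` is a hypothetical helper name, and the sketch assumes a square matrix whose header row may or may not start with a corner placeholder such as `--------`.

```perl
use strict;
use warnings;

# Hypothetical helper (an illustration, not from the answer above):
# returns [label1, label2, value] triples with value >= $min,
# visiting each unordered pair of labels only once.
sub pairs_in_range {
    my ($lines, $min) = @_;

    # Split every non-blank line on whitespace; leading whitespace is ignored
    my @rows   = map { [ split ' ', $_ ] } grep { /\S/ } @$lines;
    my @header = @{ shift @rows };

    # Assumption: the matrix is square, so an extra header field
    # can only be a corner placeholder such as "--------"
    shift @header if @header > @rows;

    my @out;
    for my $i (0 .. $#rows) {
        my ($label, @v) = @{ $rows[$i] };
        # Start at the diagonal so the mirrored pair ($j, $i) is never revisited
        for my $j ($i .. $#v) {
            push @out, [ $label, $header[$j], $v[$j] ] if $v[$j] >= $min;
        }
    }
    return @out;
}

# Sample data copied from the question
my @lines = (
    "\t100050\t100070\t100100\t100200\t100300\n",
    "100050\t1\t0.0890344\t0.361651\t0.266263\t0.368639\n",
    "100070\t0.0890344\t1\t0.0873663\t0.0267854\t0.148069\n",
    "100100\t0.361651\t0.0873663\t1\t0.0423538\t0.269991\n",
    "100200\t0.266263\t0.0267854\t0.0423538\t1\t0.215814\n",
    "100300\t0.368639\t0.148069\t0.269991\t0.215814\t1",
);

print join("\t", @$_), "\n" for pairs_in_range(\@lines, 0.3);
```

To run it against a file instead of the inline sample, the lines could be read with `open` as in the explicit version and passed in as an array reference. Values are kept as the strings read from the input, so they are printed with their original precision.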
answered 2013-06-19T19:55:54.333
use strict;
use warnings;

my ($dummy, @headers) = split(/\s+/, <DATA>);
my %seen;
while (<DATA>) {
    my ($head, @v) = split; 
    for (my $i = 0; $i < @v; $i++) {
        printf "%10s %10s %8.2f\n", 
            $head, $headers[$i], $v[$i] if $v[$i] >= 0.3 and not $seen{
              join(":", sort ($head, $headers[$i]))
            }++;
    }   
}

__DATA__
--------    100050  100070  100100  100200  100300
100050  1   0.0890344   0.361651    0.266263    0.368639
100070  0.0890344   1   0.0873663   0.0267854   0.148069
100100  0.361651    0.0873663   1   0.0423538   0.269991
100200  0.266263    0.0267854   0.0423538   1   0.215814
100300  0.368639    0.148069    0.269991    0.215814    1
answered 2013-06-19T20:53:34.597