I have a tab-delimited file like this (DIVERGE in my script):
contig04730 contigK02622 0.3515
contig04733 contigK02622 0.3636
contig14757 contigK03055 0.4
I have a second tab-delimited file like this (DATA):
contig04730 F GO:0000228 nuclear GO:0000783 telomere_cap
contig04730 F GO:0005528 reproduction GO:0001113 eggs
contig14757 P GO:0123456 immune GO:0003456 cells
contig14757 P GO:0000782 nuclear GO:0001891 DNA_binding
contig14757 C GO:0000001 immune GO:00066669 more_cells
I am trying to append the second and third columns of the first file to the matching lines of the second file, so that I end up with (OUT):
contig04730 F GO:0000228 nuclear GO:0000783 telomere_cap contigK02622 0.3515
contig04730 F GO:0005528 reproduction GO:0001113 eggs contigK02622 0.3515
contig14757 P GO:0123456 immune GO:0003456 cells contigK03055 0.4
contig14757 P GO:0000782 nuclear GO:0001891 DNA_binding contigK03055 0.4
contig14757 C GO:0000001 immune GO:00066669 more_cells contigK03055 0.4
This is the Perl script I am trying to use (adapted from ones I found here; I am very new to Perl):
#!/usr/bin/env/perl
use strict;
use warnings;
#open the ortholog contig list
open (DIVERGE, "$ARGV[0]") or die "Error opening the input file with contig pairs";
#hash to store contig IDs
my ($espr, $liya, $divergence) = split("\t", $_);
#read through the ortho contig list and read into memory
while(<DIVERGE>){
chomp $_; #get rid of ending whitepace
($espr, $liya, $divergence)->{$_} = 1;
}
close(DIVERGE);
#open output file
open(OUT, ">$ARGV[2]") or die "Error opening the output file";
#open data file
open(DATA, "$ARGV[1]") or die "Error opening the sequence pairs file\n";
while(<DATA>){
chomp $_;
my ($contigs, $FPC, $GOslim, $slimdesc, $GOterm, $GOdesc) = split("\t", $_);
if (defined $espr->{$contigs}) {
print OUT "$_", "\t$liya\t$divergence", "\n";
}
}
close(DATA);
close(OUT);
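However, I get a "Useless use of private variable" error at line 15 and a "Use of uninitialized value $_ in split" error at line 10. I have only a very basic grasp of Perl terminology and variables, so I would appreciate it if someone could point out where I am going wrong and how to fix it.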
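For reference, the join I am after can be sketched roughly as below. The two messages seem to come from calling split on $_ before any line has been read, and from treating a list of scalars as if it were a hash reference. This is only a minimal sketch under some assumptions: each contig appears once in the first file, both files are tab-delimited, and lines of the second file with no match are skipped. The subroutine names load_pairs and annotate are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Build a lookup from the first column of the pairs file
# to its second and third columns (ortholog contig, divergence).
sub load_pairs {
    my ($fh) = @_;
    my %pairs;
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;    # skip blank lines
        my ($espr, $liya, $divergence) = split /\t/, $line;
        $pairs{$espr} = [$liya, $divergence];
    }
    return \%pairs;
}

# Copy each annotation line to the output with the matching
# ortholog contig and divergence appended; unmatched lines are skipped.
sub annotate {
    my ($pairs, $in_fh, $out_fh) = @_;
    while (my $line = <$in_fh>) {
        chomp $line;
        my ($contig) = split /\t/, $line;    # only the first column is needed
        next unless defined $contig && exists $pairs->{$contig};
        print {$out_fh} join("\t", $line, @{ $pairs->{$contig} }), "\n";
    }
    return;
}

# Run as a script only when three file names are given:
#   pairs file, data file, output file
if (@ARGV == 3) {
    my ($pairs_file, $data_file, $out_file) = @ARGV;

    open my $pairs_fh, '<', $pairs_file or die "Cannot open $pairs_file: $!";
    my $pairs = load_pairs($pairs_fh);
    close $pairs_fh;

    open my $data_fh, '<', $data_file or die "Cannot open $data_file: $!";
    open my $out_fh,  '>', $out_file  or die "Cannot open $out_file: $!";
    annotate($pairs, $data_fh, $out_fh);
    close $data_fh;
    close $out_fh or die "Cannot close $out_file: $!";
}
```

The hash is keyed on the contig ID and stores the other two columns as an array reference, so each line of the second file needs only one lookup. The three-argument open with lexical filehandles replaces the bareword handles in my script.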