以下文件是双端 fastq 文件的两个伙伴,我想根据它们的长度分隔每个 fastq。
mate1.fq
:
@SRR127.1
TGGTTATGATGTTTGTGTAGGAATAGAAATTTTGATTAAGATATTAGTGAAATTTGAATGTAGTTTATTTGGAAGTTATGGAGAGTTTATATTGTATTTATGTTTATTGTTGTAGATTTATATTTATGTGTATATATTAGTTTTTTTGTGT
+
ABAAAF4FFFFFGGGGGGFFGGFGHGFGHHHHHGGCFFGHHHHH5FDBED55DGGFEGFHHHGBHDDHHHFF3AB3FFG5CBGBEF5BD5DGFEGHFAGAFEDGHGFHHGHGEFFGFGGHFEGHHFHGBEBGHHHHGHBHHFHHGGFGHH2
@SRR127.2
TATGGTAAGAAAATTGAAAATTATAAAAAATGAAAAATGTTTATTTGATGATTTGAAAAATGATGAAATTATTGAAAAATGTGAAAAATGAGAAATGTATATTGTAGGATTTGGAATATGGTGAGATAAATGAAAATTATAGTAAATG
+
AABAA5@D4@5CFFCA55FFGGHDGFHFFCC45DGFA2FA5DD55AAAA55DDBDEDDBGGFF5BA5DDABF5D5B5FF1ADFB5EDGHFG5@BFBD55D5FFB@@5@GBGEFBGHHGB@DBBFHFBDG3B43FFH@FGFHH?FHHHH
mate2.fq
:
@SRR127.1
ACCTATAAAAAAACCATATCAATAACTATAAAATCTTTATAAAATCCCACCCAATTAAAAAAAAATAAATTAATACATATAAAACCTTAAACACATAAAACATAATCACATACTATATAAACAATTACTATCACTACTAAACACCTAATA
+
>AA?AF13B@D@1EFCGGGFFG3EBGHHHBB2FGHHGHGFDGHHDFEGFHGGGHG1FFF1GGCGGGBGHHHHHFHHHHFHEGGFHF0BD1FGHHAGEGHFHHHFGGFHHGHHHFHHGGFHBGHFED1FBGFGFHDGHGHFGG1GB0GFHH
@SRR127.2
CTATTTCTCATTTTTTTATAATTTTCAATTCTCTTACCATATTCCACATCCTACACTAAACATTTCTAAATTTTCCACCTTTTTCTATTTTTCTCACCATATTTCATATCCTAAAAAACATATTCCTCATTTACTATAATTTTCAATTATC
+
11>>AFFDFF3@FFF?EFFGFBGHFDFA33D2FF2GGHFE12DD221AF1F1E1BG1GGBFBGGEGHDAABGAGDFABGG1BBDF12A2@2BG@2@DEFFF2B2@2222BB2211FGEE/11@22B2>1B22F2>GBGBD22BGD2>2B22
我编写了以下代码来执行此操作,但仅对第二个文件mate2.fq
(
#!/usr/bin/perl
use strict;
use warnings;
my @fh;
my $file_name = $ARGV[0];
my $infile = $ARGV[1];
#convert every 4-line fastq to 1-line
open(FH, "cat '$infile' | awk '{printf \"%s%s\",\$0,(NR%4?FS:RS)}' | ");
while (<FH>) {
chomp;
my @line = split(/\s+/, $_);
my $len = length($line[1]);
if ($len >= 100) {
#print $len,"\n",$_,"\n";
push @fh, $len;
if (not defined $fh[$len]) {
open $fh[$len], '>', "$file_name\_$len";
}
print { $fh[$len] } (join("\n", @line), "\n");
}
}
错误:
Can't use string ("151") as a symbol ref while "strict refs" in use at
如何处理这些文件?