You can use bcftools (https://github.com/samtools/bcftools) to do this:
bcftools consensus <file.vcf> \
--fasta-ref <file> \
--iupac-codes \
--output <file> \
--sample <name>
To install bcftools:
git clone --branch=develop https://github.com/samtools/bcftools.git
git clone --branch=develop https://github.com/samtools/htslib.git
cd htslib && make && cd ..
cd bcftools && make && cd ..
sudo cp bcftools/bcftools /usr/local/bin/
You can also combine bcftools consensus with samtools faidx (http://www.htslib.org/) to extract specific intervals from a fasta file. For more information, see the bcftools consensus usage:
About: Create consensus sequence by applying VCF variants to a reference
fasta file.
Usage: bcftools consensus [OPTIONS] <file.vcf>
Options:
-f, --fasta-ref <file> reference sequence in fasta format
-H, --haplotype <1|2> apply variants for the given haplotype
-i, --iupac-codes output variants in the form of IUPAC ambiguity codes
-m, --mask <file> replace regions with N
-o, --output <file> write output to a file [standard output]
-c, --chain <file> write a chain file for liftover
-s, --sample <name> apply variants of the given sample
Examples:
# Get the consensus for one region. The fasta header lines are then expected
# in the form ">chr:from-to".
samtools faidx ref.fa 8:11870-11890 | bcftools consensus in.vcf.gz > out.fa
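
The region example above can be wrapped in a small shell helper. This is only a sketch: the function names make_region and region_consensus are hypothetical, and it assumes that samtools and bcftools are on your PATH and that the VCF is bgzip-compressed and indexed (for example with `bcftools index in.vcf.gz`).

```shell
#!/bin/sh
# Build a region string in the "chr:from-to" form that samtools faidx accepts.
# (make_region is a hypothetical helper name, not part of samtools/bcftools.)
make_region() {
    printf '%s:%s-%s\n' "$1" "$2" "$3"
}

# Extract one region from the reference and apply the VCF variants to it.
# Arguments: <ref.fa> <in.vcf.gz> <chr> <from> <to>
# Assumes the VCF is bgzip-compressed and indexed.
region_consensus() {
    region=$(make_region "$3" "$4" "$5")
    samtools faidx "$1" "$region" | bcftools consensus "$2"
}

# Example call (commented out; needs samtools, bcftools and real input files):
# region_consensus ref.fa in.vcf.gz 8 11870 11890 > out.fa
```

Keeping the region string in its own function makes it easy to loop over several intervals and write one fasta file per interval.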