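I have received SNP genotypes in dosage format, imputed using the ENIGMA protocol. I would like to analyze these data with plink --dosage [...] --fam [...] (which I believe is the correct syntax).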

For each chromosome, I received a tar file containing the following files:

% tar -tf chromosome.21.tar
chunk1-ready4mach.21.imputed.dose.gz
chunk1-ready4mach.21.imputed.erate.gz
chunk1-ready4mach.21.imputed.hapDose.gz
chunk1-ready4mach.21.imputed.haps.gz
chunk1-ready4mach.21.imputed.info.draft
chunk1-ready4mach.21.imputed.info.gz
chunk1-ready4mach.21.imputed.prob.gz
chunk1-ready4mach.21.imputed.rec.gz

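None of these files seems to conform to the dosage file specification described on the plink website. (In particular, not the .dose.gz file, as I would have guessed.)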

Does anyone have experience with this? Do I need to modify any of these files in any way?


% plink --dosage $dose --fam $fam
PLINK v1.90b3.38 64-bit (7 Jun 2016)       https://www.cog-genomics.org/plink2
(C) 2005-2016 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to plink.log.
Options in effect:
  --dosage /home/moebius/tmp/chromosome.21/chunk1-ready4mach.21.imputed.dose.gz
  --fam hammer.fam

32054 MB RAM detected; reserving 16027 MB for main workspace.
842 people (324 males, 518 females) loaded from .fam.
842 phenotype values loaded from .fam.
Using 1 thread.
842 people pass filters and QC.
Phenotype data is quantitative.
--dosage: Reading from
/home/moebius/tmp/chromosome.21/chunk1-ready4mach.21.imputed.dose.gz.
Error: Column 1 of
/home/moebius/tmp/chromosome.21/chunk1-ready4mach.21.imputed.dose.gz's header
isn't 'SNP'.

1 Answer

We can use the program dose2plink to convert the ENIGMA dataset, which is in MACH format, into PLINK dosage format.

Example:

./dose2plink.pl -dose chunk1.21.imputed.dose.gz -info chunk1.21.imputed.info.gz -out chunk1.21

which will produce chunk1.21.pfam and chunk1.21.pdat.gz.
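
To illustrate what such a conversion involves, here is a minimal Python sketch. It is not the dose2plink implementation. It assumes the usual MACH layout: the .dose file has one row per person (`FID->IID`, a label such as `ML_DOSE`, then one dosage per SNP), and the .info file has a header row followed by one row per SNP whose first three fields are the SNP name and the two alleles. The function name `mach_to_plink_dosage` and the sample data are hypothetical. The output has one row per SNP with a header starting with `SNP`, which is what the error message above complains about.

```python
def mach_to_plink_dosage(dose_lines, info_lines):
    """Transpose MACH-style dosages (one row per person) into
    PLINK-style dosage rows (one row per SNP, single dosage per person)."""
    # .info: skip the header row; the first three fields of each remaining
    # row are assumed to be SNP name, allele 1, allele 2.
    snps = [line.split()[:3] for line in list(info_lines)[1:] if line.strip()]

    ids = []      # (FID, IID) per person, in file order
    columns = []  # dosage strings per person, one entry per SNP
    for line in dose_lines:
        fields = line.split()
        if not fields:
            continue
        # First field is assumed to be "FID->IID"; the second is a label.
        fid, _, iid = fields[0].partition("->")
        ids.append((fid, iid or fid))
        dosages = fields[2:]
        if len(dosages) != len(snps):
            raise ValueError(
                f"{fields[0]}: {len(dosages)} dosages but {len(snps)} SNPs in .info"
            )
        columns.append(dosages)

    # Header: SNP A1 A2 followed by an FID IID pair for every person.
    header = ["SNP", "A1", "A2"] + [part for pair in ids for part in pair]
    rows = [" ".join(header)]
    for j, (snp, a1, a2) in enumerate(snps):
        rows.append(" ".join([snp, a1, a2] + [col[j] for col in columns]))
    return rows


# Made-up example data, two SNPs and two people.
info = [
    "SNP Al1 Al2 Freq1 MAF Quality Rsq",
    "rs1 A G 0.9 0.1 0.99 0.98",
    "rs2 C T 0.7 0.3 0.95 0.90",
]
dose = [
    "F1->I1 ML_DOSE 1.999 0.012",
    "F2->I2 ML_DOSE 1.000 1.250",
]
for row in mach_to_plink_dosage(dose, info):
    print(row)
```

With real files, the lines could be read with `gzip.open(path, "rt")`. Because this layout carries a single 0..2 dosage per person, the `format=1` modifier of `--dosage` would presumably be needed when loading it, and it is worth checking that the allele counted by the MACH dosage matches the one written as A1.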

Answered 2016-11-02T16:36:20.647