I have a tab-delimited file like this:
refseq gene symb locus_id chr strand start end cds_start cds_end status chrm
ENST00000456328.2 ENST00000456328.2 DDX11L1 00000456328 chr1 1 11868 14409 14409 14409 Reviewed 1
ENST00000515242.2 ENST00000515242.2 DDX11L1 00000515242 chr1 1 11871 14412 14412 14412 Reviewed 1
ENST00000518655.2 ENST00000518655.2 DDX11L1 00000518655 chr1 1 11873 14409 14409 14409 Reviewed 1
ENST00000450305.2 ENST00000450305.2 DDX11L1 00000450305 chr1 1 12009 13670 13670 13670 Reviewed 1
ENST00000438504.2 ENST00000438504.2 WASH7P 00000438504 chr1 0 14362 29370 29370 29370 Reviewed 1
I tried this:
fid = fopen('gencode.v19.pseudogene_gistic.txt');
headers = textscan(fid,'%s%s%s%s%s%s%s%s%s%s%s%s',1,'delimiter','\t')
data = textscan(fid,'%s%s%s%d%s%d%d%d%d%d%s%d','delimiter','\t')
fclose(fid);
cdata = struct('refseq',data{1}, 'gene',data{2}, 'symb',data{3}, 'locus_id',data{4}, 'chr',data{5}, 'strand',data{6}, 'start',transpose(data{7}), 'end',data{8}, 'cds_start',data{9}, 'cds_end',data{10}, 'status',data{11}, 'chrn',data{12});
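However, the struct this returns contains ridiculous cells, and all the numeric fields behave differently. Note: I want a 1x17149 struct, not a 17149x1 struct.
Can anyone help? Thanks.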
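To make the target shape concrete, here is a minimal sketch of one possible approach. It is not a confirmed fix for the full file. It relies on struct() creating one array element per cell when a field value is a cell array, whereas a plain numeric array is copied whole into every element. The variables symb and start below are hypothetical stand-ins for two of the textscan output columns (an N-by-1 cell array of strings and an N-by-1 int32 column), with values taken from the sample rows above.

```matlab
% Hypothetical stand-ins for two textscan output columns
% (only two of the twelve fields are shown).
symb  = {'DDX11L1'; 'DDX11L1'; 'WASH7P'};   % N-by-1 cell array of strings
start = int32([11868; 11871; 14362]);       % N-by-1 numeric column

% Split the numeric column into one cell per row with num2cell, then
% transpose every column to 1-by-N so struct() returns a 1-by-N array.
cdata = struct('symb', symb.', 'start', num2cell(start).');

size(cdata)       % 1 3
cdata(3).symb     % WASH7P
cdata(3).start    % 14362
```

Under the same assumption, the other ten fields would be handled the same way: text columns transposed as they are, numeric columns wrapped in num2cell first.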