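I need help merging two dictionaries by using the keys of one to look up the values of the other. If the lookup succeeds, the first dictionary should append its own values to the other one (updating it, but without overwriting values that are already there).
Code (sorry, this is my first custom script):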
import re

otuid2clusteridlist = dict()
finallist = otuid2clusteridlist  # same dict object as above, not a copy
clusterid2denoiseidlist = dict()

# first block, also = finallist we append all other blocks into.
for line in open('cluster_97.ucm', 'r'):
    lineArray = re.split(r'\s+', line)
    otuid = lineArray[0]
    clusterid = lineArray[3]
    if otuid in otuid2clusteridlist:
        otuid2clusteridlist[otuid].append(clusterid)
    else:
        otuid2clusteridlist[otuid] = list()
        otuid2clusteridlist[otuid].append(clusterid)

# second block, higher tier needs to expand previous blocks hash
for line in open('denoise.ucm_test', 'r'):
    lineArray = re.split(r'\s+', line)
    clusterid = lineArray[4]
    denoiseid = lineArray[3]
    if clusterid in clusterid2denoiseidlist:
        clusterid2denoiseidlist[clusterid].append(denoiseid)
    else:
        clusterid2denoiseidlist[clusterid] = list()
        clusterid2denoiseidlist[clusterid].append(denoiseid)

# print/return function for testing (will convert to write out later)
for key in finallist:
    print("OTU: %s has %d sequence(s) which = %s" % (key, len(finallist[key]), finallist[key]))
Block one returns:
OTU: 3 has 3 sequence(s) which = ['5PLAS.R2.h_35336', 'GG13_52054', 'GG13_798']
OTU: 5 has 1 sequence(s) which = ['DEX1.h_14175']
OTU: 4 has 1 sequence(s) which = ['PLAS.h_34150']
OTU: 7 has 1 sequence(s) which = ['DEX12.13.h_545']
OTU: 6 has 1 sequence(s) which = ['GG13_45705']
Block two returns:
OTU: GG13_45705 has 4 sequence(s) which = ['GG13_45705', 'GG13_6312', 'GG13_32148', 'GG13_35246']
So the goal is to add the output of block two into block one. I would like it to be added like this:
...
OTU: 6 has 4 sequence(s) which = ['GG13_45705', 'GG13_6312', 'GG13_32148', 'GG13_35246']
I tried dic.update, but it just added block two's items to block one as new entries, because block two's keys don't exist as keys in block one.
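To make the update behaviour concrete, here is a minimal sketch using two small dictionaries copied from the outputs above (block one trimmed to OTUs 5 and 6). It assumes Python 3, and the name `merged` is only for illustration.

```python
# Dictionaries copied from the outputs in the post (block one trimmed to two OTUs).
otuid2clusteridlist = {'5': ['DEX1.h_14175'], '6': ['GG13_45705']}
clusterid2denoiseidlist = {
    'GG13_45705': ['GG13_45705', 'GG13_6312', 'GG13_32148', 'GG13_35246'],
}

merged = dict(otuid2clusteridlist)  # shallow copy so the original dict is left alone
merged.update(clusterid2denoiseidlist)

# update() matches on keys only. 'GG13_45705' is a value in block one, not a key,
# so it arrives as a brand-new entry and OTU '6' stays as it was.
print(sorted(merged))  # ['5', '6', 'GG13_45705']
print(merged['6'])     # ['GG13_45705']
```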
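I think my problem is more complicated than that: I need the second block to look up its keys among the values of block one, and then append its own values to the matching list.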
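One possible way to do that lookup is sketched below, assuming Python 3. The function name `expand_otus` and its parameter names are hypothetical, and the sample dictionaries are copied from the outputs above. It walks each OTU's cluster IDs, looks each one up as a key in the second dictionary, and appends only the IDs that are not already in the list.

```python
def expand_otus(otu2clusters, cluster2denoised):
    """Return a new dict in which every cluster ID is expanded with its denoised IDs."""
    expanded = {}
    for otuid, clusterids in otu2clusters.items():
        members = list(clusterids)  # copy, so the input lists are not modified
        for clusterid in clusterids:
            # .get() returns an empty list when the cluster ID is not a key in block two
            for denoiseid in cluster2denoised.get(clusterid, []):
                if denoiseid not in members:  # do not duplicate existing values
                    members.append(denoiseid)
        expanded[otuid] = members
    return expanded


otuid2clusteridlist = {'5': ['DEX1.h_14175'], '6': ['GG13_45705']}
clusterid2denoiseidlist = {
    'GG13_45705': ['GG13_45705', 'GG13_6312', 'GG13_32148', 'GG13_35246'],
}

finallist = expand_otus(otuid2clusteridlist, clusterid2denoiseidlist)
print(finallist['6'])  # ['GG13_45705', 'GG13_6312', 'GG13_32148', 'GG13_35246']
```

The `not in` check scans the list each time, which is fine for small files; for very large clusters, keeping a set of seen IDs alongside the list would likely be faster.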
I have been experimenting with for loops and .append (similar to the code already written), but I lack the overall Python knowledge to solve this.
Thoughts?
Addendum,
some subsets of the data:
cluster_97.ucm (the file for block one):
5 376 * DEX1.h_14175 DEX1.h_14175
6 294 * GG13_45705 GG13_45705
0 447 98.7 DEX22.h_37221 DEX29.h_4583
1 367 98.9 DEX14.15.h_35477 DEX27.h_779
1 443 98.4 DEX27.h_3794 DEX27.h_779
0 478 97.9 DEX22.h_7519 DEX29.h_4583
denoise.ucm_test (the file for block two):
11 294 * GG13_45705 GG13_45705
11 278 99.6 GG13_6312 GG13_45705
11 285 99.6 GG13_32148 GG13_45705
11 275 99.6 GG13_35246 GG13_45705
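I chose these subsets because the second line of file one is the entry that file two is going to update.
In case anyone wants to give it a try.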
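For anyone trying this without the files, the subsets above can be loaded from strings instead. This is only a sketch, assuming Python 3 and whitespace-separated columns; `group_columns`, `CLUSTER_LINES` and `DENOISE_LINES` are made-up names. It rebuilds both dictionaries with `collections.defaultdict` and then lists which OTUs have a cluster ID that appears as a key in the second dictionary.

```python
from collections import defaultdict

# Sample rows copied from the post; real input would come from the two files.
CLUSTER_LINES = """\
5 376 * DEX1.h_14175 DEX1.h_14175
6 294 * GG13_45705 GG13_45705
0 447 98.7 DEX22.h_37221 DEX29.h_4583
1 367 98.9 DEX14.15.h_35477 DEX27.h_779
1 443 98.4 DEX27.h_3794 DEX27.h_779
0 478 97.9 DEX22.h_7519 DEX29.h_4583
"""
DENOISE_LINES = """\
11 294 * GG13_45705 GG13_45705
11 278 99.6 GG13_6312 GG13_45705
11 285 99.6 GG13_32148 GG13_45705
11 275 99.6 GG13_35246 GG13_45705
"""


def group_columns(text, key_col, value_col):
    """Group the value column by the key column, one whitespace-separated row per line."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) > max(key_col, value_col):  # skip blank or short lines
            grouped[fields[key_col]].append(fields[value_col])
    return grouped


# Same column choices as the original script: block one uses columns 0 and 3,
# block two uses columns 4 and 3.
otuid2clusteridlist = group_columns(CLUSTER_LINES, 0, 3)
clusterid2denoiseidlist = group_columns(DENOISE_LINES, 4, 3)

# OTUs with at least one cluster ID that is a key in the second dictionary.
affected = [
    otuid
    for otuid, clusterids in otuid2clusteridlist.items()
    if any(clusterid in clusterid2denoiseidlist for clusterid in clusterids)
]
print(affected)  # ['6']
```

With these subsets only OTU 6 matches, which agrees with the second line of file one being the entry that file two updates.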