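I've been at this for a few days now, tried all sorts of different approaches, and looked through at least 50 different Stack Overflow / Python library / Python newsgroup questions, and none of them helped much. (Though I wouldn't be surprised if the answer is out there and I missed it.)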

Anyway!

I have a list of lists, each containing a single string, like this:

[['CAA46951&Homeobox domain&192:248&F&#CDC1C5&NULL&PFAM&Y&433&'],
 ['CAA46951&Homeodomain-like&165:252&S&#CD5B45&NULL&SCOP&Y&433&'],
 ['5330400P12&WD domain, G-beta repeat&131:168&F&#FF8C69&NULL&PFAM&Y&296&'],
 ['5330400P12&WD domain, G-beta repeat&173:210&F&#FF8C69&NULL&PFAM&Y&296&'],
 ['5330400P12&WD40-repeat&1:296&S&#00FF7F&NULL&SCOP&Y&296&'],
 ['AAH62206&Cell division protein&38:311&S&#00CED1&NULL&PFAM&Y&425&'],
 ['AAH62206&P-loop containing nucleoside triphosphate hydrolases&36:279&S&#00FFFF&NULL&SCOP&Y&425&']]

I want to split each string into separate strings within its own list, so that [['a&b&c'],['a2&b2&c2']] becomes [['a','b','c'],['a2','b2','c2']].

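I've tried everything from for loops with enumerate to recursive functions, but I can't get it to work. I know this is a really silly question, but please help.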

(It's worth noting that the list comes in as a .txt file and is converted into a list of lists of strings. Originally it was:

CAA46951&Homeobox domain&192:248&F&#CDC1C5&NULL&PFAM&Y&433& CAA46951&Homeodomain-like&165:252&S&#CD5B45&NULL&SCOP&Y&433&)

2 Answers

LofL=[['CAA46951&Homeobox domain&192:248&F&#CDC1C5&NULL&PFAM&Y&433&'], 
      ['CAA46951&Homeodomain-like&165:252&S&#CD5B45&NULL&SCOP&Y&433&'], 
      ['5330400P12&WD domain, G-beta repeat&131:168&F&#FF8C69&NULL&PFAM&Y&296&'], 
      ['5330400P12&WD domain, G-beta repeat&173:210&F&#FF8C69&NULL&PFAM&Y&296&'], 
      ['5330400P12&WD40-repeat&1:296&S&#00FF7F&NULL&SCOP&Y&296&'], 
      ['AAH62206&Cell division protein&38:311&S&#00CED1&NULL&PFAM&Y&425&'], 
      ['AAH62206&P-loop containing nucleoside triphosphate hydrolases&36:279&S&#00FFFF&NULL&SCOP&Y&425&']]

newL=[]
for L in LofL:
    newSubL=[]                      # fields collected for this sublist
    for e in L:
        for s in e.split('&'):
            if s:                   # skip the empty string left by the trailing '&'
                newSubL.append(s)
    newL.append(newSubL)

Output:

[['CAA46951', 'Homeobox domain', '192:248', 'F', '#CDC1C5', 'NULL', 'PFAM', 'Y', '433'], ['CAA46951', 'Homeodomain-like', '165:252', 'S', '#CD5B45', 'NULL', 'SCOP', 'Y', '433'], ['5330400P12', 'WD domain, G-beta repeat', '131:168', 'F', '#FF8C69', 'NULL', 'PFAM', 'Y', '296'], ['5330400P12', 'WD domain, G-beta repeat', '173:210', 'F', '#FF8C69', 'NULL', 'PFAM', 'Y', '296'], ['5330400P12', 'WD40-repeat', '1:296', 'S', '#00FF7F', 'NULL', 'SCOP', 'Y', '296'], ['AAH62206', 'Cell division protein', '38:311', 'S', '#00CED1', 'NULL', 'PFAM', 'Y', '425'], ['AAH62206', 'P-loop containing nucleoside triphosphate hydrolases', '36:279', 'S', '#00FFFF', 'NULL', 'SCOP', 'Y', '425']]

If you want to condense this further, you can do:

# list() is needed on Python 3, where filter() returns a lazy iterator
newL=[list(filter(len, e.split('&'))) for l in LofL for e in l]
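One thing to keep in mind: filtering on length drops every empty field, not only the one left behind by the trailing '&'. If a record could contain an empty field in the middle, the remaining columns would shift left. Here is a minimal sketch of an alternative that strips only the trailing separator. The `rows` data and the `split_fields` helper are made up for illustration and are not part of the question's data.

```python
# Hypothetical sample rows; the second one has an empty field in the middle.
rows = [['CAA46951&Homeobox domain&192:248&'],
        ['AAH62206&&38:311&']]

def split_fields(record, sep='&'):
    """Split one record on sep, ignoring only trailing separators."""
    return record.rstrip(sep).split(sep)

# One list of fields per record; interior empty fields keep their position.
new_rows = [split_fields(e) for row in rows for e in row]
print(new_rows)
```

Whether an empty field should be kept or dropped depends on what the columns mean, so treat this as one option rather than the right answer.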
answered 2012-05-22T06:18:16.017
>>> oldList = [['a&b&c'], ['d&e&f']]
>>> newList = [item[0].split('&') for item in oldList]
>>> newList
[['a', 'b', 'c'], ['d', 'e', 'f']]
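A caveat when applying this to the data in the question: those strings end with '&', so a plain split leaves an empty string as the last element. A small sketch of the difference, using one record copied from the question (the names `old_list`, `naive` and `cleaned` are just illustrative):

```python
# One record from the question; note the trailing '&'.
old_list = [['CAA46951&Homeobox domain&192:248&F&#CDC1C5&NULL&PFAM&Y&433&']]

# Plain split: the trailing '&' produces a final empty string.
naive = [item[0].split('&') for item in old_list]

# Same split, keeping only non-empty fields.
cleaned = [[field for field in item[0].split('&') if field]
           for item in old_list]

print(len(naive[0]), len(cleaned[0]))
print(cleaned[0])
```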
answered 2012-05-22T05:57:05.190