
I have a list of folder names as a one-dimensional array, i.e.:

folderList=['A1_001', 'A1_002', 'A1_003', 'A1_004', 
            'A2_001', 'A2_002', 'A2_003', 'A2_004',
            'A3_001', 'A3_002', 'A3_003', 'A3_004']

I would like to group the list by the first two characters, i.e. "A1", "A2" and "A3". I think this should be done with groupby, but my code does not work:

sectionName=[] #to get the first two characters of each element into a new list

for file in folderList:
    sectionName.append(file.split('_')[0])

for key, group in groupby(folderList,sectionName): 
    print key
    for record in group:
        print record

I get an error:

for key, group in groupby(folderList,sectionName):
TypeError: 'list' object is not callable

What I would like to get is a result like this:

A1
['A1_001', 'A1_002', 'A1_003', 'A1_004']

A2
['A2_001', 'A2_002', 'A2_003', 'A2_004']

A3
['A3_001', 'A3_002', 'A3_003', 'A3_004']

I think groupby needs its second argument to be a key function, but so far I have not managed to turn sectionName into a key function. Thanks in advance for any help.

4 Answers

In [40]: folderList=['A1_001', 'A1_002', 'A1_003', 'A1_004','A2_001', 'A2_002', 'A2_003', 'A2_004','A3_001', 'A3_002', 'A3_003', 'A3_004','B1_001','B1_002','B1_003','B2_001','B2_002','B2_003']

In [41]: for k, v in groupby(folderList, lambda x:x[:2]):
    ...:     print k, [x for x in v]
    ...:     
A1 ['A1_001', 'A1_002', 'A1_003', 'A1_004']
A2 ['A2_001', 'A2_002', 'A2_003', 'A2_004']
A3 ['A3_001', 'A3_002', 'A3_003', 'A3_004']
B1 ['B1_001', 'B1_002', 'B1_003']
B2 ['B2_001', 'B2_002', 'B2_003']

Or, more simply:

In [42]: result={}

In [43]: for v in folderList:
    ...:     result.setdefault(v[:2],[]).append(v)
    ...:     

In [44]: result
Out[44]: 
{'A1': ['A1_001', 'A1_002', 'A1_003', 'A1_004'],
 'A2': ['A2_001', 'A2_002', 'A2_003', 'A2_004'],
 'A3': ['A3_001', 'A3_002', 'A3_003', 'A3_004'],
 'B1': ['B1_001', 'B1_002', 'B1_003'],
 'B2': ['B2_001', 'B2_002', 'B2_003']}
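
For readers on Python 3, here is a minimal sketch of the groupby call with the import spelled out. The names folder_list and section_name are illustrative and not from the question, and the list is shortened. The point it tries to show is that the second argument has to be a callable that groupby applies to each element; passing a precomputed list of keys, as in the question, is what leads to "'list' object is not callable".

```python
from itertools import groupby

# Shortened, already-sorted sample data (illustrative only).
folder_list = ['A1_001', 'A1_002', 'A2_001', 'A2_002', 'A3_001']


def section_name(folder):
    """Key function: the part of the name before the first underscore."""
    return folder.split('_', 1)[0]


# groupby calls section_name on every element to decide which group it
# belongs to. Each group is an iterator, so list() materializes it.
groups = [(key, list(members))
          for key, members in groupby(folder_list, section_name)]

for key, members in groups:
    print(key)
    print(members)
```

A lambda such as `lambda x: x[:2]` works equally well as the key when the prefix always has a fixed length.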
answered 2013-03-23T11:27:41.257

For example:

import itertools

grouped = {prefix: list(folders)
           for prefix, folders in itertools.groupby(folderList, lambda x: x[:2])}

An alternative that does not require folderList to be sorted:

from collections import defaultdict
grouped = defaultdict(list)
for folder in folderList:
    grouped[folder[:2]].append(folder)
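
As a rough Python 3 sketch of how the defaultdict approach could produce the layout asked for in the question: the input below is deliberately unsorted, and the names folder_list and report are assumptions made for illustration. Sorting is applied only to the keys at display time.

```python
from collections import defaultdict

# Deliberately unsorted sample data (illustrative only).
folder_list = ['A2_001', 'A1_001', 'A3_001', 'A1_002', 'A2_002']

grouped = defaultdict(list)
for folder in folder_list:
    # A missing key is created with an empty list on first access.
    grouped[folder[:2]].append(folder)

# Sort the keys for display; the grouping itself did not need sorted input.
report = [(prefix, grouped[prefix]) for prefix in sorted(grouped)]

for prefix, folders in report:
    print(prefix)
    print(folders)
    print()
```

Within each group the folders keep the order in which they appeared in the input.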
answered 2013-03-23T11:28:08.880

A simple loop and a defaultdict will do:

from collections import defaultdict

folderList=['A1_001', 'A1_002', 'A1_003', 'A1_004', 
            'A2_001', 'A2_002', 'A2_003', 'A2_004',
            'A3_001', 'A3_002', 'A3_003', 'A3_004']

sections = defaultdict(list)
for folder in folderList:
    sections[folder[:2]].append(folder)
print sections.values()

Prints:

[['A1_001', 'A1_002', 'A1_003', 'A1_004'], ['A3_001', 'A3_002', 'A3_003', 'A3_004'], ['A2_001', 'A2_002', 'A2_003', 'A2_004']]

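The downside of groupby is that it only groups adjacent items, so the input must already be sorted by the key, and that it yields iterators rather than lists. In your case it sounds like you want lists, so you would need an extra step to convert each group with list(). The loop above is a simple way to get what you want.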
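
To make the sorting requirement concrete, here is a small Python 3 sketch. The names prefix, unsorted_names, fragmented and complete are made up for this illustration. It runs groupby on the same data twice, once as given and once after sorting by the same key.

```python
from itertools import groupby


def prefix(name):
    """Key function: the first two characters of the folder name."""
    return name[:2]


# Illustrative data in which equal prefixes are not adjacent.
unsorted_names = ['A1_001', 'A2_001', 'A1_002', 'A2_002']

# groupby only merges adjacent items with equal keys, so on this input
# each prefix starts a new group every time it reappears.
fragmented = [(key, list(group))
              for key, group in groupby(unsorted_names, prefix)]

# Sorting by the same key first makes equal keys adjacent.
complete = [(key, list(group))
            for key, group in groupby(sorted(unsorted_names, key=prefix), prefix)]

print(fragmented)
print(complete)
```

On the unsorted input every prefix appears as two separate one-element groups, while the sorted input gives one group per prefix.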

answered 2013-03-23T11:40:39.213
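from itertools import groupby

folderList.sort()  # groupby only groups adjacent items, so sort first

def sectionName(sec):
    # everything before the first underscore, e.g. 'A1' for 'A1_001'
    return sec.split('_', 1)[0]

for key, lst in groupby(folderList, sectionName):
    print key
    for record in lst:
        print record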
answered 2013-03-23T11:45:13.207