python - How do I sort a one-dimensional list according to part of each element's name?

Question

I have a list of folder names as a one-dimensional array, i.e.:

folderList=['A1_001', 'A1_002', 'A1_003', 'A1_004', 
            'A2_001', 'A2_002', 'A2_003', 'A2_004',
            'A3_001', 'A3_002', 'A3_003', 'A3_004']

and I want to group the list by the first two characters, such as "A1", "A2" and "A3". I think this should be done with groupby, but my code does not work:

sectionName=[] #to get the first two characters of each element into a new list

for file in folderList:
    sectionName.append(file.split('_')[0])

for key, group in groupby(folderList,sectionName): 
    print key
    for record in group:
        print record

I get an error:

for key, group in groupby(folderList,sectionName):
TypeError: 'list' object is not callable

What I would like to get is a result like this:

A1
['A1_001', 'A1_002', 'A1_003', 'A1_004']

A2
['A2_001', 'A2_002', 'A2_003', 'A2_004']

A3
['A3_001', 'A3_002', 'A3_003', 'A3_004']

I think groupby needs its second argument to be a key function, but so far I have not managed to turn sectionName into a key function. Thanks in advance for any help.

score 0 · Accepted Answer

In [40]: folderList=['A1_001', 'A1_002', 'A1_003', 'A1_004','A2_001', 'A2_002', 'A2_003', 'A2_004','A3_001', 'A3_002', 'A3_003', 'A3_004','B1_001','B1_002','B1_003','B2_001','B2_002','B2_003']

In [41]: for k, v in groupby(folderList, lambda x:x[:2]):
    ...:     print k, [x for x in v]
    ...:     
A1 ['A1_001', 'A1_002', 'A1_003', 'A1_004']
A2 ['A2_001', 'A2_002', 'A2_003', 'A2_004']
A3 ['A3_001', 'A3_002', 'A3_003', 'A3_004']
B1 ['B1_001', 'B1_002', 'B1_003']
B2 ['B2_001', 'B2_002', 'B2_003']

Or, more simply:

In [42]: result={}

In [43]: for v in folderList:
    ...:     result.setdefault(v[:2],[]).append(v)
    ...:     

In [44]: result
Out[44]: 
{'A1': ['A1_001', 'A1_002', 'A1_003', 'A1_004'],
 'A2': ['A2_001', 'A2_002', 'A2_003', 'A2_004'],
 'A3': ['A3_001', 'A3_002', 'A3_003', 'A3_004'],
 'B1': ['B1_001', 'B1_002', 'B1_003'],
 'B2': ['B2_001', 'B2_002', 'B2_003']}
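
The TypeError in the question comes from how groupby uses its second argument: it calls that argument on every element to compute a grouping key, so a list of precomputed names cannot stand in for it. The sketch below reproduces the failure and then passes a callable instead. The shortened folder_list and the names section_names, error and groups are made up for illustration, and the code is written for Python 3.

```python
from itertools import groupby

# Shortened, already sorted sample input (illustrative only).
folder_list = ['A1_001', 'A1_002', 'A2_001', 'A2_002']

# A precomputed list of prefixes, as in the question.
section_names = [name.split('_')[0] for name in folder_list]

try:
    # groupby calls its second argument on each element as soon as it is
    # iterated; a list cannot be called, hence the TypeError.
    list(groupby(folder_list, section_names))
    error = None
except TypeError as exc:
    error = str(exc)

# Passing a function that maps one element to its prefix works.
groups = [
    (key, list(members))
    for key, members in groupby(folder_list, key=lambda name: name.split('_')[0])
]

print(error)
for key, members in groups:
    print(key)
    print(members)
```

Each group is wrapped in list() because groupby yields a lazy iterator per key, and that iterator is no longer usable once the loop moves on to the next key.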

score 0 · Accepted Answer

For example:

import itertools
grouped = {prefix: list(folders) for 
    prefix, folders in itertools.groupby(folderList, lambda x: x[:2])}

An alternative that does not require folderList to be sorted:

from collections import defaultdict
grouped = defaultdict(list)
for folder in folderList:
    grouped[folder[:2]].append(folder)
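
To see why sorting matters for the groupby version but not for the defaultdict version, here is a small sketch on a deliberately unsorted list. The input and the names folders, prefix, runs, lossy and grouped are hypothetical, and the code targets Python 3.

```python
from collections import defaultdict
from itertools import groupby

# Hypothetical unsorted input: the two 'A1' folders are not adjacent.
folders = ['A1_001', 'A2_001', 'A1_002']

def prefix(name):
    """Return the first two characters, used as the grouping key."""
    return name[:2]

# groupby starts a new group every time the key changes, so 'A1' shows up twice.
runs = [(key, list(group)) for key, group in groupby(folders, key=prefix)]

# Building a dict from those runs keeps only the last run for each key.
lossy = {key: members for key, members in runs}

# The defaultdict loop collects every element regardless of input order.
grouped = defaultdict(list)
for name in folders:
    grouped[prefix(name)].append(name)

print(runs)
print(lossy)
print(dict(grouped))
```

On this input the dict built from groupby silently drops 'A1_001', while the defaultdict keeps both 'A1' entries.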

score 0 · Accepted Answer

A simple loop and a defaultdict will do:

from collections import defaultdict

folderList=['A1_001', 'A1_002', 'A1_003', 'A1_004', 
            'A2_001', 'A2_002', 'A2_003', 'A2_004',
            'A3_001', 'A3_002', 'A3_003', 'A3_004']

sections = defaultdict(list)  # a missing key starts out as an empty list
for folder in folderList:
    sections[folder[:2]].append(folder)
print sections.values()

Prints:

[['A1_001', 'A1_002', 'A1_003', 'A1_004'], ['A3_001', 'A3_002', 'A3_003', 'A3_004'], ['A2_001', 'A2_002', 'A2_003', 'A2_004']]

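The downside of groupby is that it only groups adjacent elements, so the input must already be sorted by the grouping key, and that it yields iterators rather than lists. In your case it sounds like you want lists, so you would need an extra step to convert each group with list(). The loop above is a simple way to get what you want.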
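
If groupby is still preferred, both drawbacks can be handled in one place: sort by the same key first, then turn each group into a list. This is only a sketch for Python 3; the helper group_by_prefix, its width parameter and the sample input are assumptions made for illustration.

```python
from itertools import groupby

def group_by_prefix(names, width=2):
    """Group names by their first `width` characters (hypothetical helper)."""
    def key(name):
        return name[:width]
    # Sort by the same key so equal prefixes become adjacent, then
    # materialize each lazy group iterator as a list.
    return {k: list(g) for k, g in groupby(sorted(names, key=key), key=key)}

result = group_by_prefix(['A2_001', 'A1_002', 'A1_001', 'A2_002'])
print(result)
```

Because sorted is stable and compares only the prefix here, elements within a group keep their original relative order.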

score 0 · Accepted Answer

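from itertools import groupby

folderList.sort()  # groupby only groups adjacent items, so sort first
def sectionName(sec):
    # 'A1_001' -> 'A1'
    return sec.split('_', 1)[0]
for key, lst in groupby(folderList, sectionName):
    print(key)
    for record in lst:
        print(record)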
