0

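What is a good way to chunk a list of the following form: ["record_a:", "x"*N, "record_b:", "y"*M, ...], i.e. a list where the start of each record is marked by a string ending in ":", and each record includes all elements up to the next record? So the following list: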

["record_a:", "a", "b", "record_b:", "1", "2", "3", "4"]

would be split into:

[["record_a", "a", "b"], ["record_b", "1", "2", "3", "4"]]

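The list contains an arbitrary number of records, and each record contains an arbitrary number of list items (up to the start of the next record, or the end of the list). How can this be done efficiently?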

4

4 Answers

4
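lst = ["record_a:", "a", "b", "record_b:", "1", "2", "3", "4"]
out = []
for x in lst:
    if x.endswith(':'):
        # A header starts a new record; drop the trailing ':' as in the question
        out.append([x[:-1]])
    else:
        # Any other item belongs to the most recent record
        out[-1].append(x)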
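
The loop above assumes the list begins with a record header; if it does not, `out[-1]` fails with an IndexError on the empty list. A minimal sketch of a function variant that makes that case explicit might look like this. The name `chunk_list` and the choice to raise ValueError are assumptions for illustration, not part of the answer:

```python
def chunk_list(items):
    """Split items into records; a record starts at a string ending in ':'."""
    chunks = []
    for item in items:
        if item.endswith(":"):
            # Header: open a new record, without the trailing colon.
            chunks.append([item[:-1]])
        elif chunks:
            # Ordinary item: attach it to the record opened most recently.
            chunks[-1].append(item)
        else:
            # Assumption: an item before any header is treated as an error.
            raise ValueError("item %r appears before any record header" % item)
    return chunks


result = chunk_list(["record_a:", "a", "b", "record_b:", "1", "2", "3", "4"])
print(result)
```

Using `endswith` rather than `x[-1]` also keeps an empty string in the list from raising an IndexError.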
answered 2013-01-29T22:45:53.123
4

Use a generator:

def chunkRecords(records):
    record = []
    for r in records:
        if r[-1] == ':':
            if record:
                yield record
            record = [r[:-1]]
        else:
            record.append(r)
    if record:
        yield record 

Then loop over it:

for record in chunkRecords(records):
    print(record)  # record is a list

Or turn it into a list again:

records = list(chunkRecords(records))

The latter results in:

>>> records = ["record_a:", "a", "b", "record_b:", "1", "2", "3", "4"]
>>> records = list(chunkRecords(records))
>>> records
[['record_a', 'a', 'b'], ['record_b', '1', '2', '3', '4']]
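
Because the generator yields each record as soon as the next header appears, it also works on inputs that are not lists, including ones too large to hold in memory. A small sketch of that property, assuming a restated version of the function (`chunk_records`) and a made-up endless input (`endless_stream`), both hypothetical names:

```python
from itertools import islice


def chunk_records(items):
    """Yield one list per record; a record starts at an item ending in ':'."""
    record = []
    for item in items:
        if item.endswith(":"):
            if record:
                yield record
            record = [item[:-1]]
        else:
            record.append(item)
    if record:
        yield record


def endless_stream():
    """Hypothetical input: an infinite run of one-item records."""
    n = 0
    while True:
        yield "record_%d:" % n
        yield str(n)
        n += 1


# Only as much of the stream is consumed as the first two records need.
first_two = list(islice(chunk_records(endless_stream()), 2))
print(first_two)
```

Calling `list()` on the generator directly, as in the session above, gives up this laziness, which is fine for small inputs.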
answered 2013-01-29T22:46:44.673
1
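from itertools import groupby, chain

l = ["record_a:", "a", "b", "record_b:", "1", "2", "3", "4"]

# zip replaces Python 2's itertools.izip, which does not exist in Python 3.
# Passing the same generator twice pairs each header group with its item group.
[list(chain([x[0][0].strip(':')], x[1])) for x in zip(*[(list(g)
            for _,g in groupby(l,lambda x: x.endswith(':')))]*2)]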

Output:

[['record_a', 'a', 'b'], ['record_b', '1', '2', '3', '4']]
answered 2013-01-29T23:32:59.157
1

OK, here's my crazy after-work itertools solution:

>>> from itertools import groupby
>>> d = ["record_a:", "a", "b", "record_b:", "1", "2", "3", "4"]
>>> groups = (list(g) for _, g in groupby(d, lambda x: x.endswith(":")))
>>> git = iter(groups)
>>> # zip(git, git) pairs consecutive groups; calling next() inside a generator
>>> # expression raises RuntimeError on exhaustion in Python 3.7+
>>> paired = zip(git, git)
>>> combined = [ [a[0][:-1]] + b for a,b in paired]
>>> 
>>> combined
[['record_a', 'a', 'b'], ['record_b', '1', '2', '3', '4']]

(More as an example of what can be done than as a piece of code I would necessarily use.)
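
One caveat with pairing `groupby` output this way: consecutive headers fall into the same group, so a record with no items gets merged into its neighbour. A short sketch comparing the pairing approach with a plain loop on such input; the function names `chunk_groupby` and `chunk_loop` are hypothetical, chosen only for this illustration:

```python
from itertools import groupby


def chunk_groupby(items):
    """Pair each run of headers with the run of items that follows it."""
    groups = (list(g) for _, g in groupby(items, lambda s: s.endswith(":")))
    # zip over the same generator twice consumes the groups two at a time.
    return [[head[0][:-1]] + body for head, body in zip(groups, groups)]


def chunk_loop(items):
    """Plain loop: every header opens its own record, even an empty one."""
    out = []
    for s in items:
        if s.endswith(":"):
            out.append([s[:-1]])
        else:
            out[-1].append(s)
    return out


data = ["record_a:", "record_b:", "1", "2"]  # record_a has no items
print(chunk_groupby(data))  # [['record_a', '1', '2']]
print(chunk_loop(data))     # [['record_a'], ['record_b', '1', '2']]
```

If empty records can occur in the input, the plain loop or the generator is the safer choice.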

answered 2013-01-29T23:20:58.303