
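I'm struggling to slice a list into pieces at certain indices. I can do it one piece at a time, but I haven't found an expression that lets me avoid writing out each slice by hand.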

import re

#   Creating list to split

list = ['Leading', 'text', 'of', 'no', 'interest', '1.', 'Here', 'begins', 'section', '1', '2.', 'This', 'is', 'section', '2', '3.', 'Now', 'we', 'have', 'section', '3']


#   Identifying where sections begin and end

section_ids = [i for i, item in enumerate(list) if re.search('[0-9]+\.(?![0-9])', item)]


#   Simple creation of a new list for each section, piece by piece

section1 = list[section_ids[0]:section_ids[1]]
section2 = list[section_ids[1]:section_ids[2]]
section3 = list[section_ids[2]:]


#   Iterative creation of a new list for each claim - DOES NOT WORK

for i in range(len(section_ids)):
     if i < max(range(len(section_ids))):
          section[i] = list[section_ids[i] : list[section_ids[i + 1]]
     else:
          section[i] = list[section_ids[i] : ]
     print section[i]

#   This is what I'd like to get

#   ['1.', 'Here', 'begins', 'section', '1']
#   ['2.', 'This', 'is', 'section', '2']
#   ['3.', 'Now', 'we', 'have', 'section', '3']

3 Answers

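# Python 2 only: map(None, a, b) pads the shorter input with None, so the last
# pair is (start, None) and my_list[start:None] runs to the end of the list.
for i, j in map(None, section_ids, section_ids[1:]):
    print my_list[i:j]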

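If section_ids is large, the itertools version will be more efficient, since it pairs the indices lazily instead of building intermediate lists: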

from itertools import izip_longest, islice
for i,j in izip_longest(section_ids, islice(section_ids, 1, None)):
    print my_list[i:j]
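
Both snippets above are Python 2. As a rough sketch of the same pairing idea on Python 3 (an assumption about the reader's interpreter, not part of the original answer), `izip_longest` is spelled `zip_longest` and `print` is a function. Here `my_list` stands for the question's list:

```python
import re
from itertools import zip_longest

my_list = ['Leading', 'text', 'of', 'no', 'interest',
           '1.', 'Here', 'begins', 'section', '1',
           '2.', 'This', 'is', 'section', '2',
           '3.', 'Now', 'we', 'have', 'section', '3']

# Indices of the items that look like a section marker ("1.", "2.", ...).
section_ids = [i for i, item in enumerate(my_list)
               if re.search(r'[0-9]+\.(?![0-9])', item)]

# Pair each start index with the next one. The last start is paired with
# None, and my_list[i:None] slices through to the end of the list.
sections = [my_list[i:j] for i, j in zip_longest(section_ids, section_ids[1:])]

for s in sections:
    print(s)
```

The loop prints one list per section, in the shape the question asks for.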
answered 2012-06-27T03:37:09.663

I was able to produce the desired output with the following code:

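section = []
# Append len(list) as a sentinel so the last section also has an end index.
for i, v in enumerate(section_ids + [len(list)]):
    if i == 0:
        continue  # no previous start index to pair with yet
    section.append(list[section_ids[i-1]:v])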
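
The same end-sentinel idea can probably be written without the `i == 0` special case by zipping the boundary list against itself shifted by one. This is only a sketch; `split_at` is a hypothetical helper name, not something from the question:

```python
def split_at(items, starts):
    """Return the slices of items that begin at each index in starts."""
    # Add len(items) as the final boundary so the last slice has an end.
    bounds = list(starts) + [len(items)]
    # zip stops at the shorter input, so each start is paired with the
    # boundary that follows it and the sentinel is only ever used as an end.
    return [items[a:b] for a, b in zip(bounds, bounds[1:])]


letters = ['a', 'b', 'c', 'd', 'e', 'f']
print(split_at(letters, [1, 3]))
```

Anything before the first start index is dropped, which matches how the leading text is discarded in the question.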
answered 2012-06-27T03:39:52.087

Are you trying to achieve something like this:

>>> section = [] # list to hold sublists ....
>>> for index, location in enumerate(section_ids):
...     if location != section_ids[-1]: # assume its not the last one
...         section.append(list[location:section_ids[index + 1]])
...     else:
...         section.append(list[location:])
...     print section[-1]
...
['1.', 'Here', 'begins', 'section', '1']
['2.', 'This', 'is', 'section', '2']
['3.', 'Now', 'we', 'have', 'section', '3']
>>> 

Or:

>>> import re
>>> from pprint import pprint
>>> values = ['Leading', 'text', 'of', 'no', 'interest', '1.', 'Here', 'begins', 'section', '1', '2.', 'This', 'is', 'section', '2', '3.', 'Now', 'we', 'have', 'section', '3']
>>> section_ids = [i for i, item in enumerate(values) if re.search('[0-9]+\.(?![0-9])', item)] + [len(values)]
>>> section = [values[location:section_ids[index + 1]] for index, location in enumerate(section_ids) if location != section_ids[-1]]
>>> pprint(section)
[['1.', 'Here', 'begins', 'section', '1'],
 ['2.', 'This', 'is', 'section', '2'],
 ['3.', 'Now', 'we', 'have', 'section', '3']]
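
If the indices are not needed for anything else, a single pass that starts a new sublist at every marker might be simpler. This is a different technique from the index slicing above, shown only as a sketch; `MARKER` and `split_sections` are hypothetical names:

```python
import re

# Same pattern as in the question: digits followed by a dot, with no digit after it.
MARKER = re.compile(r'[0-9]+\.(?![0-9])')


def split_sections(words):
    """Group words into sublists, opening a new one at each section marker."""
    sections = []
    for word in words:
        if MARKER.search(word):
            sections.append([word])      # a marker opens a new section
        elif sections:
            sections[-1].append(word)    # otherwise extend the current section
        # words seen before the first marker are ignored
    return sections


values = ['Leading', 'text', 'of', 'no', 'interest',
          '1.', 'Here', 'begins', 'section', '1',
          '2.', 'This', 'is', 'section', '2',
          '3.', 'Now', 'we', 'have', 'section', '3']
for s in split_sections(values):
    print(s)
```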
answered 2012-06-27T03:43:24.467