Suppose you have the following list:

example_list=['This', 'is', 'QQQQQ', 'an', 'QQQQQ', 'example', 'list', 'QQQQQ', '.']

I want it to be regrouped as:

example_list=['This is', 'an', 'example list', '.']

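Note how QQQQQ is used as a placeholder. Basically, I want everything between two QQQQQ entries to become a single list element. How do I do that?

I have seen other posts about the join() function, but my problem is putting a space between the words when a group contains more than one word.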

4 Answers

Use a simple iteration.

Example:

example_list=['This', 'is', 'QQQQQ', 'an', 'QQQQQ', 'example', 'list', 'QQQQQ', '.']

res = [[]]
for i in example_list:
    if i == "QQQQQ":
        res.append([])
    else:
        res[-1].append(i)
print([" ".join(i) for i in res])

Output:

['This is', 'an', 'example list', '.']
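
If the input can start or end with the placeholder, or contain two placeholders in a row, the loop above emits empty strings for those empty groups. A minimal sketch of the same idea wrapped in a function that drops them, assuming that is the behaviour you want; the name `merge_between` and its parameters are hypothetical, not part of the original answer:

```python
def merge_between(items, placeholder="QQQQQ", sep=" "):
    """Join runs of items between placeholders into single strings."""
    groups = [[]]
    for item in items:
        if item == placeholder:
            groups.append([])  # a placeholder starts a new group
        else:
            groups[-1].append(item)
    # Skip empty groups left by leading, trailing or adjacent placeholders.
    return [sep.join(group) for group in groups if group]


example_list = ['This', 'is', 'QQQQQ', 'an', 'QQQQQ', 'example', 'list', 'QQQQQ', '.']
print(merge_between(example_list))
# ['This is', 'an', 'example list', '.']
```

Remove the `if group` filter if empty groups should be kept as `''`.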
answered 2018-08-01T13:17:45.810

Simple solution: join the list with spaces, then split on the placeholder with a space added on each side.

Example:

example_list = ['This', 'is', 'QQQQQ', 'an', 'QQQQQ', 'example', 'list', 'QQQQQ', '.']

print(' '.join(example_list).split(' QQQQQ '))

Result:

['This is', 'an', 'example list', '.']

Or, more generally, for any string placeholder:

place_holder = 'QQQQQ'
split_arg = ' {} '.format(place_holder)
example_list = ' '.join(example_list).split(split_arg)

Edit after a comment by tobias_k:

The comment was: "Of course, this only works if the placeholder actually is a string, and if that string does not appear in any of the other words. I.e. it would not work if the placeholder was, e.g., None, 'Q', or '' – tobias_k"

That is true, so I made a more general version that should work for any placeholder value:

import random
import string

example_list = ['This', 'is', None, 'an', None, 'example', 'list', None, '.']
place_holder = None
# create a random string of length 10 (random.choices requires Python 3.6+)
random_place_holder = ''.join(random.choices(string.ascii_uppercase + string.digits, k=10))
# Replace all old place holders with our new random string placeholder
example_list = [x if x != place_holder else random_place_holder for x in example_list ]
split_arg = ' {} '.format(random_place_holder)
example_list = ' '.join(example_list).split(split_arg)
print(example_list)

To be honest, you might be better off with one of the other solutions if you have an inconvenient placeholder such as those mentioned by tobias_k.
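
Besides the placeholder's value, its position matters too: splitting on the space-padded placeholder only matches placeholders that have a word on both sides. A small sketch of two inputs where the result differs from what the question asks for; the helper name `join_split` is hypothetical and just wraps the one-liner above:

```python
def join_split(items, placeholder):
    """Join with spaces, then split on the space-padded placeholder."""
    return ' '.join(items).split(' {} '.format(placeholder))


# Placeholder at the start: there is no space before it, so it is not split off.
print(join_split(['QQQQQ', 'This', 'is', 'QQQQQ', 'fine'], 'QQQQQ'))
# ['QQQQQ This is', 'fine']

# Two adjacent placeholders: the shared space is consumed by the first match.
print(join_split(['a', 'QQQQQ', 'QQQQQ', 'b'], 'QQQQQ'))
# ['a', 'QQQQQ b']
```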

I decided to time it, using:

example_list = ['This', 'is', None, 'an', None, 'example', 'list', None, '.'] * 10000
place_holder = None

I used a longer list so that creating the random string is not a significant part of the measured time; timing is not very meaningful on small lists anyway.

This solution: 11.6 ms ± 153 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Rakesh's loop solution: 25.8 ms ± 819 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

RoadRunner's groupby: 34.4 ms ± 1.21 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
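
The figures above look like IPython `%timeit` output. To repeat the comparison as a plain script, something like the following sketch could be used. The function names are hypothetical wrappers around the three approaches in this thread, and the numbers will differ by machine and Python version:

```python
import random
import string
import timeit
from itertools import groupby

PLACE_HOLDER = None
DATA = ['This', 'is', None, 'an', None, 'example', 'list', None, '.'] * 10000


def join_split(items, place_holder):
    # Swap the placeholder for a random 10-character string, then join and split.
    token = ''.join(random.choices(string.ascii_uppercase + string.digits, k=10))
    swapped = [token if x == place_holder else x for x in items]
    return ' '.join(swapped).split(' {} '.format(token))


def loop(items, place_holder):
    # Start a new group at every placeholder, then join each group.
    res = [[]]
    for x in items:
        if x == place_holder:
            res.append([])
        else:
            res[-1].append(x)
    return [' '.join(group) for group in res]


def grouped(items, place_holder):
    # Keep only the runs of non-placeholder items.
    return [' '.join(g)
            for is_sep, g in groupby(items, lambda x: x == place_holder)
            if not is_sep]


for func in (join_split, loop, grouped):
    seconds = timeit.timeit(lambda: func(DATA, PLACE_HOLDER), number=10)
    print('{}: {:.1f} ms per call'.format(func.__name__, seconds / 10 * 1000))
```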

answered 2018-08-01T13:24:41.687

You can use itertools.groupby():

>>> from itertools import groupby
>>> example_list=['This', 'is', 'QQQQQ', 'an', 'QQQQQ', 'example', 'list', 'QQQQQ', '.']
>>> [' '.join(g) for k, g in groupby(example_list, lambda x: x == 'QQQQQ') if not k]
['This is', 'an', 'example list', '.']

You could even do the comparison with .__eq__, as @tobias_k suggested in the comments:

>>> [' '.join(g) for k, g in groupby(example_list, key='QQQQQ'.__eq__) if not k]
['This is', 'an', 'example list', '.']
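
Because groupby() collapses consecutive placeholders into a single group, this approach also copes with leading, trailing and repeated placeholders. A minimal sketch as a reusable function, assuming an arbitrary placeholder such as None; the name `merge_groups` is hypothetical. It uses a lambda rather than `.__eq__`, since `str.__eq__` can return NotImplemented instead of False when an element is not a string:

```python
from itertools import groupby


def merge_groups(items, placeholder, sep=' '):
    """Join each run of non-placeholder items into one string."""
    return [sep.join(run)
            for is_placeholder, run in groupby(items, key=lambda x: x == placeholder)
            if not is_placeholder]


print(merge_groups(['This', 'is', None, 'an', None, None, 'example', 'list', None], None))
# ['This is', 'an', 'example list']
```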
answered 2018-08-01T13:19:58.343

Try join() together with strip() to remove the surrounding whitespace:

answer = [s.strip() for s in ' '.join(map(str, example_list)).split('QQQQQ')]
print (answer)

Output:

['This is', 'an', 'example list', '.']
answered 2018-08-01T13:21:51.043