python - Extracting multiple strings between asterisks in Python

Question

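I've looked around but couldn't find what I'm looking for...

Basically, I have a string containing a lot of asterisks:

Example: red blue green * hello* pink orange 4pgp42g4jg42 * world* violet black

What I want to do is split the string so that I can extract "hello" and "world", and eventually print them as a list using a for statement. The string I'm working with is much longer, and it doesn't necessarily contain a fixed number of slices that I want to pull out.

Can anyone help me with this?

Thanks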

score 7 · Accepted Answer

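I'd expect that:

re.findall(r'\*([^*]+)\*',string)

would do the trick. Basically, this regular expression looks for a '*' (\*), then matches one or more characters that are not '*' (([^*]+)), then matches another '*'. Because the pattern contains a single capture group, re.findall returns only the text inside the asterisks.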
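A minimal sketch of that pattern applied to the example string from the question. The variable names `text` and `words` are illustrative only, and the `.strip()` call is an added assumption to drop the stray spaces around each match:

```python
import re

text = "red blue green * hello* pink orange 4pgp42g4jg42 * world* violet black"

# Capture every run of non-asterisk characters enclosed by a pair of asterisks,
# then trim the surrounding whitespace (an assumption about the desired output).
words = [match.strip() for match in re.findall(r'\*([^*]+)\*', text)]

# Print the extracted pieces one per line, as the question asks.
for word in words:
    print(word)
```

Each match consumes both of its asterisks, so the text between the closing asterisk of one pair and the opening asterisk of the next is not reported.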

score 3 · Accepted Answer

As an alternative to the excellent re suggestion:

Use split to separate the parts that are "between asterisks" from the parts that are not:

>>> msg = "red blue green * hello* pink orange 4pgp42g4jg42 * world* violet black"
>>> msg.split("*")
['red blue green ', ' hello', ' pink orange 4pgp42g4jg42 ', ' world', ' violet black']

Then use list slicing to take every other part, starting from the second.

>>> msg.split("*")[1::2]
[' hello', ' world']
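The every-other-part slice assumes the asterisks come in pairs. A hedged sketch of a small helper (the name `between_asterisks` is hypothetical) that also copes with an unpaired trailing asterisk by never taking the final piece, which always follows the last asterisk and so cannot be enclosed:

```python
def between_asterisks(text):
    """Return the stripped pieces of text enclosed by pairs of asterisks."""
    parts = text.split("*")
    # Odd-indexed pieces sit between an opening and a closing asterisk.
    # Stopping before the last piece skips text after an unpaired asterisk.
    return [part.strip() for part in parts[1:-1:2]]


msg = "red blue green * hello* pink orange 4pgp42g4jg42 * world* violet black"
print(between_asterisks(msg))
```

Stripping the whitespace is an assumption about the desired output; drop the `.strip()` call to keep the pieces exactly as they appear.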

score 2 · Accepted Answer

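Have you tried the re module? It uses a syntax called regular expressions that lets you do very complex matching (see the documentation). In your case, you could try something like this: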

import re

# Store your string
my_str = 'red blue green * hello* pink orange 4pgp42g4jg42 * world* violet black'

# Find matches
match = re.findall(r'\*([^\*]*)\*', my_str)

# Print everything
print(match)

# Iterate
for item in match:
    print(item)

score 1 · Accepted Answer

You can use .split('*') and then take every other element.

For example:

my_string = 'this is a *test* of my code that *I* have written'
split_string = my_string.split('*')
words_between = [split_string[i] for i in range(1, len(split_string), 2)]

score 1 · Accepted Answer

A regular expression seems like overkill here. I would just use:

my_str = 'red blue green * hello* pink orange 4pgp42g4jg42 * world* violet black'
broken_up = my_str.split('*')

If you don't want the ends, do

broken_up[1:-1]

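Edit: I think I just realized what you're actually looking for. Technically, " pink orange 4pgp42g4jg42 " is also between asterisks, which is a problem. I think this will work.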

my_str = 'red blue green * hello* pink orange 4pgp42g4jg42 * world* violet black'
broken_up = my_str.split('*')
broken_up = [broken_up[i] for i in range(1, len(broken_up), 2)]

If you want to get rid of the spaces, just use .strip(), like so

broken_up = [broken_up[i].strip() for i in range(1, len(broken_up), 2)]

score 1 · Accepted Answer

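Try this:

from re import findall

sstring = "red blue green * hello* pink orange 4pgp42g4jg42 * world*"

# Non-greedy match from one asterisk to the next; the asterisks are part of each match
result = findall(r'\*.*?\*', sstring)
print(result)

This will give you: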

['* hello*', '* world*']
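Since the matches above still include the asterisks, a small variation may be closer to what the question asks for. This is a sketch, not part of the original answer: wrapping the non-greedy part in a capture group makes `findall` return only the enclosed text. The name `inner` is illustrative.

```python
from re import findall

sstring = "red blue green * hello* pink orange 4pgp42g4jg42 * world*"

# With exactly one capture group, findall returns the group's text
# instead of the whole match, so the asterisks are left out.
inner = findall(r'\*(.*?)\*', sstring)
print(inner)
```

The leading spaces are kept because they are part of the enclosed text; calling `.strip()` on each item would remove them.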

score 0 · Accepted Answer

I would do this by using re.split to break the string into a list of strings:

import re

my_string = "red blue green * hello* pink orange 4pgp42g4jg42 * world* violet black"

all_split_up = re.split(r'\*', my_string)

Then running:

for item in all_split_up:
    print(item)

produces:

red blue green 
 hello
 pink orange 4pgp42g4jg42 
 world
 violet black

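By using re.split instead of re.findall, you don't have to worry about specifying a capture group in the regular expression pattern. Note that the result contains every piece of the string, not only the pieces between asterisks. I think this is the simplest regex answer, even though I was a bit late to the "answer" button.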
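Because re.split returns the pieces outside the asterisks as well, one possible follow-up (a sketch under the assumption that the asterisks are paired) is to let the pattern swallow the whitespace around each asterisk and then keep every other piece. The name `enclosed` is illustrative.

```python
import re

my_string = "red blue green * hello* pink orange 4pgp42g4jg42 * world* violet black"

# Split on each asterisk together with any whitespace around it,
# so the resulting pieces need no separate strip() call.
all_split_up = re.split(r'\s*\*\s*', my_string)

# Odd-indexed pieces are the ones enclosed by a pair of asterisks.
enclosed = all_split_up[1::2]

for item in enclosed:
    print(item)
```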
