python - Splitting a string with a separators variable in Python

Question

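I'm trying to write a function that splits a string using given separators. I've seen answers to similar questions that use regular expressions to ignore all special characters, but I want to be able to pass in the separators as a variable.

So far, I have: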

def split_string(source, separators): 
    source_list = source
    for separator in separators:
        if separator in source_list:
                source_list.replace(separator, ' ') 
    return source_list.split()

But it doesn't remove the separators.

score 5 · Accepted Answer

A regular expression solution seems pretty easy (to me):

import re
def split_string(source,separators):
    return re.split('[{0}]'.format(re.escape(separators)),source)

Example:

>>> import re
>>> def split_string(source,separators):
...     return re.split('[{0}]'.format(re.escape(separators)),source)
... 
>>> split_string("the;foo: went to the store",':;')
['the', 'foo', ' went to the store']

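The reason to use a regular expression here is that it still works when you don't want ' ' to be one of the separators: the string is split only on the characters you pass in, not on whitespace.

An alternative (which I think I prefer), and which also allows multi-character separators, is: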
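To make that point concrete, here is a small sketch comparing the replace-then-split approach from the question with the character-class regex. The helper names `split_by_replace` and `split_by_regex` are mine, chosen only for illustration.

```python
import re

def split_by_replace(source, separators):
    # Replace each separator with a space, then split on whitespace.
    # Side effect: spaces already present in the text also become split points.
    for separator in separators:
        source = source.replace(separator, ' ')
    return source.split()

def split_by_regex(source, separators):
    # Build a character class such as [:;] and split only on those characters.
    return re.split('[{0}]'.format(re.escape(separators)), source)

text = "the;foo: went to the store"
print(split_by_replace(text, ':;'))  # ['the', 'foo', 'went', 'to', 'the', 'store']
print(split_by_regex(text, ':;'))    # ['the', 'foo', ' went to the store']
```

With the replace-based version, the spaces inside " went to the store" are split as well, even though a space was never passed as a separator.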

def split_string(source,separators):
    return re.split('|'.join(re.escape(x) for x in separators),source)

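In this case, multi-character separators are passed in as some kind of non-string iterable (e.g. a tuple or list), but single-character separators can still be passed in as a single string.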

>>> def split_string(source,separators):
...     return re.split('|'.join(re.escape(x) for x in separators),source)
... 
>>> split_string("the;foo: went to the store",':;')
['the', 'foo', ' went to the store']
>>> split_string("the;foo: went to the store",['foo','st'])
['the;', ': went to the ', 'ore']
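The string-versus-list distinction matters because iterating over a str yields one character at a time. A short sketch with a made-up input (the string "a--b-c" is my own example, not from the answer) shows what happens if a multi-character separator is passed as a plain string by mistake:

```python
import re

def split_string(source, separators):
    # Join the escaped separators with '|' so any one of them matches.
    return re.split('|'.join(re.escape(x) for x in separators), source)

# '--' as a plain string is iterated character by character,
# so the pattern splits on every single '-'.
print(split_string("a--b-c", '--'))    # ['a', '', 'b', 'c']

# Wrapped in a list, '--' is treated as one two-character separator.
print(split_string("a--b-c", ['--']))  # ['a', 'b-c']
```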

Or, finally, if you also want to treat consecutive runs of separators as a single split point:

def split_string(source,separators):
    return re.split('(?:'+'|'.join(re.escape(x) for x in separators)+')+',source)

This gives:

>>> split_string("Before the rain ... there was lightning and thunder.", " .")
['Before', 'the', 'rain', 'there', 'was', 'lightning', 'and', 'thunder', '']
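Note the trailing '' in that result: the input ends with a separator, so re.split reports an empty piece after it. If that is unwanted, one possible variant filters the empty pieces out. The name `split_string_nonempty` is hypothetical, and this is only a sketch of one way to do it.

```python
import re

def split_string_nonempty(source, separators):
    # Same pattern as above: one or more consecutive separators.
    pattern = '(?:' + '|'.join(re.escape(x) for x in separators) + ')+'
    # Drop the empty strings produced by leading or trailing separators.
    return [piece for piece in re.split(pattern, source) if piece]

result = split_string_nonempty(
    "Before the rain ... there was lightning and thunder.", " .")
print(result)
# ['Before', 'the', 'rain', 'there', 'was', 'lightning', 'and', 'thunder']
```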

score 2 · Accepted Answer

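The problem is that source_list.replace(separator, ' ') does not modify source_list in place; it only returns a modified string. Since you never do anything with that returned value, it is lost.

You can do this instead: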

source_list = source_list.replace(separator, ' ')

Then source_list will hold the modified version. I made this one change to your function, and it worked fine when I tested it.
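This behavior follows from Python strings being immutable. A minimal illustration, with variable names of my own choosing:

```python
s = "a;b"

# str.replace returns a new string; the original is left untouched.
result = s.replace(';', ' ')

print(s)       # a;b
print(result)  # a b
```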

score 2 · Accepted Answer

You forgot to assign the result of source_list.replace(separator, ' ') back to source_list.

Take a look at this modified snippet:

def split_string(source, separators): 
    source_list = source
    for separator in separators:
        if separator in source_list:
            source_list = source_list.replace(separator, ' ')
    return source_list.split()

score 0 · Accepted Answer

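You should use split to solve this. It doesn't need a regular expression, and you can still make it do what you need.

In your example code, you never assign the result of replace back to a variable.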
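The answer doesn't include code, so the following is only one possible reading of "use split without a regex": apply str.split once per separator to every piece collected so far. The function name `split_string_no_regex` is hypothetical, and the sketch assumes every separator is a non-empty string (str.split raises ValueError on an empty one).

```python
def split_string_no_regex(source, separators):
    # Start with the whole string as a single piece.
    pieces = [source]
    for separator in separators:
        # Split every current piece on this separator and flatten the result.
        pieces = [part for piece in pieces for part in piece.split(separator)]
    return pieces

print(split_string_no_regex("the;foo: went to the store", ':;'))
# ['the', 'foo', ' went to the store']
```

Unlike the replace-then-split version, this leaves existing spaces alone unless a space is passed as a separator.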
