3

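Sometimes I need to delete or replace a substring of a long string. To do that, I want to specify a start pattern and an end pattern that mark where the substring begins and ends: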

long_string = "lorem ipsum..white chevy..blah,blah...lot of text..beer bottle....and so to the end"
removed_substr_start = "white chevy"
removed_substr_end = "beer bott"

# This is pseudocode for the kind of method I am looking for:
STRresult = long_string.replace( [from]removed_substr_start [to]removed_substr_end, "")
4

5 Answers

6

I guess you want something like this, without regex:

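def replace_between(text, begin, end, alternative=''):
    # Text between the first `begin` and the next `end` (markers excluded)
    middle = text.split(begin, 1)[1].split(end, 1)[0]
    # Note: str.replace substitutes every occurrence of `middle` in `text`
    return text.replace(middle, alternative)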

Not tested, and you should guard the first line against an exception (an IndexError is raised when begin is not found; a missing end silently treats the rest of the text as the middle), but the idea is here :)
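
One possible way to add that guard is sketched below. It assumes the markers themselves should be kept and that the text should come back unchanged when either marker is missing. The name `replace_between_safe` is hypothetical and not part of the answer above. It uses `str.partition` rather than `split` plus `replace`, so only the first span is touched.

```python
def replace_between_safe(text, begin, end, alternative=""):
    """Replace the text between the first `begin` and the next `end`.

    Hypothetical helper (an assumption, not the answer's code). The markers
    are kept, and `text` is returned unchanged if either marker is missing.
    """
    # partition() returns (before, separator, after); the separator is ""
    # when it was not found, so no exception is raised.
    head, found_begin, rest = text.partition(begin)
    if not found_begin:
        return text
    _middle, found_end, tail = rest.partition(end)
    if not found_end:
        return text
    return head + begin + alternative + end + tail


long_string = "lorem ipsum..white chevy..blah,blah...lot of text..beer bottle....and so to the end"
print(replace_between_safe(long_string, "white chevy", "beer bott"))
# prints: lorem ipsum..white chevybeer bottle....and so to the end
```

Rebuilding the string from its parts means a middle that also appears elsewhere in the text is not replaced a second time.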

answered 2013-07-29T09:33:32.447
5

You can use regex:

>>> import re
>>> strs = "lorem ipsum..white chevy..blah,blah...lot of text..beer bottle....and so to the end"
>>> sub_start = "white chevy"
>>> sub_end = "beer bott"
>>> re.sub(r'{}.*?{}'.format(re.escape(sub_start),re.escape(sub_end)),'',strs)
'lorem ipsum..le....and so to the end'

If you only want to remove the substring between "white chevy" and "beer bott" but keep the markers themselves:

>>> re.sub(r'({})(.*?)({})'.format(re.escape(sub_start),
...                                re.escape(sub_end)), r'\1\3', strs)
'lorem ipsum..white chevybeer bottle....and so to the end'
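
One caveat: by default `.` does not match a newline, so the patterns above will not span several lines. A minimal sketch of a wrapper that passes `re.DOTALL` follows. The function name `remove_between_regex` and its `keep_markers` parameter are made up for illustration.

```python
import re


def remove_between_regex(text, start, end, keep_markers=False):
    """Hypothetical wrapper around the re.sub calls shown above."""
    pattern = re.compile(
        "({})(.*?)({})".format(re.escape(start), re.escape(end)),
        re.DOTALL,  # let "." match newlines as well
    )
    if keep_markers:
        # Keep groups 1 and 3 (the markers) and drop group 2 (the middle).
        return pattern.sub(lambda m: m.group(1) + m.group(3), text)
    return pattern.sub("", text)


multi_line = "keep this\nwhite chevy..blah\nmore blah..beer bottle\nkeep this too"
print(repr(remove_between_regex(multi_line, "white chevy", "beer bott")))
# prints: 'keep this\nle\nkeep this too'
```

Like `re.sub` in the answer, this replaces every non-overlapping match. Pass `count=1` to `pattern.sub` if only the first span should go.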
answered 2013-07-29T09:33:09.440
2

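Get the start index with find() and the end index with rfind() (or with a second find() that begins searching at the start index, as below), then remove the inner part like this: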

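# str.find returns the lowest index of the match, or -1 if it is absent
lindex = long_string.find(removed_substr_start)
# Search for the end marker only from the start marker onwards
rindex = long_string.find(removed_substr_end, lindex)
# Drops everything from the start marker up to (not including) the end marker
result = long_string[:lindex] + long_string[rindex:]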

See: http://docs.python.org/2/library/string.html#string.find
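
Whether find() or rfind() is used for the end index matters when the end marker occurs more than once. The sketch below shows the difference and also handles the -1 that both return for a missing marker. `cut_between` and `last_end` are hypothetical names. Unlike the snippet above, this version starts looking for the end marker just after the start marker.

```python
def cut_between(text, start, end, last_end=False):
    """Remove text from `start` up to (not including) `end`.

    Hypothetical helper: the start marker is removed, the end marker is kept,
    and `text` is returned unchanged if either marker is missing.
    """
    lindex = text.find(start)
    if lindex == -1:
        return text
    search_from = lindex + len(start)
    if last_end:
        rindex = text.rfind(end, search_from)  # last occurrence of `end`
    else:
        rindex = text.find(end, search_from)   # first occurrence of `end`
    if rindex == -1:
        return text
    return text[:lindex] + text[rindex:]


sample = "aa START bb END cc END dd"
print(cut_between(sample, "START", "END"))                 # aa END cc END dd
print(cut_between(sample, "START", "END", last_end=True))  # aa END dd
```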

answered 2013-07-29T09:35:58.133
1
import re
regexp = "white chevy.*?beer bott"
long_string = "lorem ipsum..white chevy..blah,blah...lot of text..beer bottle....and so to the end"
re.sub(regexp, "", long_string)

gives:

'lorem ipsum..le....and so to the end'
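
A hard-coded pattern like this works because neither marker contains regex metacharacters. If the markers come from variables, it is probably safer to pass them through `re.escape` first. The sample text and markers below are invented purely to illustrate the difference.

```python
import re

# Made-up sample whose markers contain the metacharacters "$", ".", "(" and ")".
text = "price: $5.00 (old) ... was $7.50 (new) end"
start, end = "$5.00", "(new)"

# Unescaped: "$" is read as an end-of-string anchor, so nothing matches
# and the text comes back unchanged.
unescaped = start + ".*?" + end
print(re.sub(unescaped, "", text) == text)  # True

# Escaped: the markers are matched literally.
escaped = re.escape(start) + ".*?" + re.escape(end)
print(repr(re.sub(escaped, "", text)))  # 'price:  end'
```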
answered 2013-07-29T09:33:24.253
1

After trying many approaches, I found this to be the best solution without regular expressions:

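def getString(text, _from, _to):
    # Index just past the first occurrence of `_from`
    end_from = text.find(_from) + len(_from)
    # Slice up to the next `_to`; find() returns -1 if a marker is missing
    return text[end_from:text.find(_to, end_from)]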
answered 2014-08-20T08:19:10.843