I have a string:
"This is a small \t\t world"
Assume the string has 2 tabs between the words "small" and "world". How can I trim one of the tabs so that I get:
"This is a small \t world"
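The words "small" and "world" each appear only once in the sentence. Basically, given two specific words, I want to trim the extra tabs between them.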
Using re:
import re
s = "This is a small \t\t world"
s = re.sub(r'(.*\bsmall *)\t+( *world\b.*)', r'\1\t\2', s)
print(repr(s))
Output:
>>>
'This is a small \t world'
This preserves any spaces before and after the tabs.
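As a quick check of that behaviour, here is a minimal sketch that runs the same pattern over a few inputs with different spacing around the tabs. The sample strings and the name `PATTERN` are made up for illustration.

```python
import re

# Same pattern as above: group 1 keeps everything up to and including the
# spaces after "small", group 2 keeps the spaces before "world" and the rest.
PATTERN = r'(.*\bsmall *)\t+( *world\b.*)'

samples = [
    "This is a small \t\t world",      # one space on each side of the tabs
    "This is a small\t\t\tworld",      # no spaces at all
    "This is a small   \t\t world",    # three spaces before the tabs
]

for sample in samples:
    # repr() makes the remaining tab visible as \t
    print(repr(re.sub(PATTERN, r'\1\t\2', sample)))
# 'This is a small \t world'
# 'This is a small\tworld'
# 'This is a small   \t world'
```

In each case only the run of tabs shrinks to one tab; the spaces on either side are left as they were.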
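def remove_tab(st, word1, word2):
    # Locate word1, then search for word2 only after it,
    # so that both indices refer to positions in the full string.
    index1 = st.find(word1)
    index2 = st.find(word2, index1)
    # Collapse each pair of tabs between the two words into one tab.
    replacement = st[index1:index2].replace('\t\t', '\t')
    return st[:index1] + replacement + st[index2:]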
Using regex:
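In [114]: def func(st, *words):
   .....:     rep = " \t ".join(words)
   .....:     # raw string, so that \b means a word boundary rather than a backspace
   .....:     reg = r"\b%s\s?\t{1,}\s?%s\b" % (words[0], words[1])
   .....:     return re.sub(reg, rep, st)
   .....: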
In [118]: strs='This is \t\t\t a small\t\t\tworld, very small world?'
In [119]: func(strs,"small","world")
Out[119]: 'This is \t\t\t a small \t world, very small world?'
In [120]: func(strs,"is","a")
Out[120]: 'This is \t a small\t\t\tworld, very small world?'
You can use Python's re module to work with regular expressions:
import re
s = "This is \t\t a small \t\t world"
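# Python's re only supports fixed-width look-behind, so the text before the
# tabs is captured in a group and put back instead of using (?<=small +).
s1 = re.sub(r'(small +)\t+(?= +world)', r'\1\t', s)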
This finds one or more consecutive \t between "small " and " world" and replaces the whole sequence of \t's with a single \t.
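The same idea can be wrapped in a helper that takes the two words as arguments. This is only a sketch: the name `collapse_tabs_between` is hypothetical, and it assumes that only spaces and tabs separate the two words.

```python
import re

def collapse_tabs_between(text, word1, word2):
    """Replace each run of tabs between word1 and word2 with a single tab."""
    # re.escape keeps any regex metacharacters in the words literal.
    pattern = re.compile(
        r'(\b%s\b *)\t+( *\b%s\b)' % (re.escape(word1), re.escape(word2))
    )
    # Group 1 and group 2 are put back unchanged; only the tab run shrinks.
    return pattern.sub(r'\1\t\2', text)

s = "This is \t\t a small \t\t world"
print(repr(collapse_tabs_between(s, "small", "world")))
# 'This is \t\t a small \t world'
```

Tabs elsewhere in the string, such as those between "is" and "a", are not touched, because the pattern only matches a tab run that sits directly between the two given words.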