python - Extracting and replacing substrings between placeholders in a string

Question

I have an input text,

input = 'I like {sushi} and {tempura}.'

and I want to get from it a list of the placeholder contents and a src string with the placeholders replaced.

lst = ['sushi', 'tempura']
src = 'I like * and *.'

I can use any markers in the input/output strings in place of {} and *, such as [] or the like.

score 8 · Accepted Answer

import re
input = 'I like {sushi} and {tempura}.'
regex = re.compile(r'\{([^\}]*)\}')
lst = regex.findall(input)        # ['sushi', 'tempura']
mod_str = regex.sub('*', input)   # 'I like * and *.'
print(lst)
print(mod_str)

You could also use string formatting for the substitution:

mod_str = input.format(**dict((x,'*') for x in lst))
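As a rough comparison of the two substitution routes above, the sketch below wraps each in a small helper. The helper names (mask_with_format, mask_with_regex) are made up for illustration and are not part of the answer. It also shows one case where the routes seem to diverge: str.format treats a dot in a field name as attribute access, so a placeholder such as {food.name} raises KeyError, while the regex substitution treats it as plain text.

```python
import re

# Same pattern as in the answer: a literal '{', anything but '}', a literal '}'.
PLACEHOLDER = re.compile(r'\{([^}]*)\}')

def mask_with_format(text):
    # Hypothetical helper: substitute via str.format, as in the one-liner above.
    names = PLACEHOLDER.findall(text)
    return text.format(**{name: '*' for name in names})

def mask_with_regex(text):
    # Hypothetical helper: substitute via regex.sub.
    return PLACEHOLDER.sub('*', text)

simple = 'I like {sushi} and {tempura}.'
print(mask_with_format(simple))
print(mask_with_regex(simple))

# str.format parses 'food.name' as "argument food, attribute name",
# so the keyword lookup for 'food' fails.
dotted = 'I like {food.name}.'
print(mask_with_regex(dotted))
try:
    mask_with_format(dotted)
except KeyError as exc:
    print('str.format raised KeyError:', exc)
```

For simple identifier-like placeholders both helpers should agree; the regex route is the safer choice when the placeholder contents are arbitrary text.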

Regular expression breakdown (note that I used a raw string, r'...'):

\{ -- look for a literal '{'
[^\}] -- match anything that is not a literal '}'
* -- match as many of those as possible
\} -- match a literal '}'

The parentheses are added so that re.findall returns only the grouped part of each match.

As DSM pointed out, another common idiom for finding text between markers is:

r"\{(.*?)\}"

which means:

\{ -- match a literal '{'
(.*?) -- match anything, but not greedily (don't consume anything the regex could use to match the next part)
\} -- match a literal '}'
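To make the greedy versus non-greedy distinction concrete, here is a small sketch that runs three variants of the pattern on the question's input. The variable names are only for illustration. It also shows one difference between the two working idioms: by default `.` does not match a newline, whereas the negated class `[^}]` does.

```python
import re

text = 'I like {sushi} and {tempura}.'

# Greedy: .* runs to the LAST '}', swallowing everything in between.
greedy = re.findall(r'\{(.*)\}', text)
# Non-greedy: .*? stops at the first '}' it can.
lazy = re.findall(r'\{(.*?)\}', text)
# Negated class: can never cross a '}', so it stops there too.
negated = re.findall(r'\{([^}]*)\}', text)

print(greedy)   # ['sushi} and {tempura']
print(lazy)     # ['sushi', 'tempura']
print(negated)  # ['sushi', 'tempura']

# A placeholder that spans a line break: '.' skips it unless re.DOTALL is set.
multiline = 'I like {raw\nfish}.'
lazy_multiline = re.findall(r'\{(.*?)\}', multiline)
negated_multiline = re.findall(r'\{([^}]*)\}', multiline)
print(lazy_multiline)     # []
print(negated_multiline)  # ['raw\nfish']
```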

score 4 · Accepted Answer

Because I can't stop myself from looking for non-regex ways to do things, here's an approach using standard string formatting:

>>> import string
>>> s = 'I like {sushi} and {tempura}.'
>>> parsed = string.Formatter().parse(s)
>>> fields = [p[1] for p in parsed if p[1]]
>>> src = s.format(**{f: '*' for f in fields})
>>> fields
['sushi', 'tempura']
>>> src
'I like * and *.'
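For reference, string.Formatter().parse yields tuples of (literal_text, field_name, format_spec, conversion), so the field list and the masked string can also be built in a single pass without calling format at all. The sketch below is one possible way to do that; the function name split_placeholders and its mark parameter are hypothetical. Unlike the session above, it keeps empty field names (from a bare {}) rather than filtering them out.

```python
from string import Formatter

def split_placeholders(text, mark='*'):
    # Hypothetical helper: collect field names and rebuild the text with
    # each replacement field swapped for `mark`.
    fields = []
    pieces = []
    for literal, field, _spec, _conversion in Formatter().parse(text):
        pieces.append(literal)
        # field is None for trailing literal text with no placeholder after it.
        if field is not None:
            fields.append(field)
            pieces.append(mark)
    return fields, ''.join(pieces)

fields, src = split_placeholders('I like {sushi} and {tempura}.')
print(fields)  # ['sushi', 'tempura']
print(src)     # I like * and *.
```

Because nothing is looked up by name here, this variant should not fail on field names that str.format would interpret as attribute or index access.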

score 0 · Accepted Answer

An easy-to-understand way to match the text between {}:

import re

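input = 'I like {sushi} and {tempura}'
# The parentheses form a capture group, so findall returns only the text inside the braces.
lst = re.findall('{([a-zA-Z]*)}', input)
src = re.sub('{[a-zA-Z]*}', '*', input)

print(lst)
# ['sushi', 'tempura']

print(src)
# I like * and *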

If you want to match anything between the {}, rather than only letters, you need to use '{[^}]*}' as mgilson's answer shows, or {(.*?)} from DSM.
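Since the question allows any markers in place of {} and *, the same idea can be sketched with configurable delimiters. The function name extract_and_mask and its parameters are assumptions made for this illustration. re.escape is used so that markers such as [ and ] are treated as literal characters, and the replacement is passed as a function so that a mask containing a backslash is not interpreted as a regex escape.

```python
import re

def extract_and_mask(text, open_mark='{', close_mark='}', mask='*'):
    # Hypothetical helper: escape the markers, capture lazily between them.
    pattern = re.compile(re.escape(open_mark) + r'(.*?)' + re.escape(close_mark))
    found = pattern.findall(text)
    masked = pattern.sub(lambda match: mask, text)
    return found, masked

print(extract_and_mask('I like {sushi} and {tempura}.'))
# (['sushi', 'tempura'], 'I like * and *.')

print(extract_and_mask('I like [sushi] and [tempura].', '[', ']', mask='?'))
# (['sushi', 'tempura'], 'I like ? and ?.')
```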
