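I'm just expanding on BrenBarn's accepted answer. I enjoy solving a good problem over lunch. Here is my full implementation for your question.
Given the string 2 cups [9 oz] [10 g] flour: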
    import re

    text = '2 cups [9 oz] [10 g] flour'

    units = {'oz': 'uk imperial',
             'cups': 'us',
             'g': 'metric'}

    # strip out brackets & trim white space
    text = text.replace('[', '').replace(']', '').strip()

    # put an opening quote before each number, e.g. 9 becomes "9
    text = re.sub(r'(\d+)', r'"\1', text)

    # expand units like `cups` to `cups" -> us~`
    for unit in units:
        text = text.replace(unit, unit + '" -> ' + units[unit] + "~")

    # match the last word in the string and label it as the ingredient name
    text = re.sub(r'(\w+$)', r'"\1" -> ingredient name', text)

    print("raw text: \n" + text + "\n")
    print("Array:")
    print(text.split('~ '))

This prints the intermediate string and then the array of strings:

    raw text:
    "2 cups" -> us~ "9 oz" -> uk imperial~ "10 g" -> metric~ "flour" -> ingredient name

    Array: [
    '"2 cups" -> us',
    '"9 oz" -> uk imperial',
    '"10 g" -> metric',
    '"flour" -> ingredient name'
    ]
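
One caveat with the plain replace call on each unit: it matches the unit text anywhere in the string, so an ingredient such as sugar would be split at its "g". A possible variation, offered only as a rough sketch and not as part of the accepted answer, is to match a number followed by a known unit in a single regex with a word boundary. The function name parse_ingredient and the UNITS table below are my own illustrative names, and the sketch assumes quantities are written as whole numbers.

```python
import re

# Unit names mapped to a measurement system (illustrative labels only).
UNITS = {'oz': 'uk imperial', 'cups': 'us', 'g': 'metric'}


def parse_ingredient(text):
    """Split a line such as '2 cups [9 oz] [10 g] flour' into labelled parts."""
    # Try longer unit names first so a short unit never shadows a longer one.
    names = sorted(UNITS, key=len, reverse=True)
    # A whole number, optional whitespace, then a known unit ending at a word boundary.
    pattern = re.compile(r'(\d+)\s*(' + '|'.join(map(re.escape, names)) + r')\b')

    parts = ['"%s %s" -> %s' % (m.group(1), m.group(2), UNITS[m.group(2)])
             for m in pattern.finditer(text)]

    # Whatever remains after removing quantities and brackets is treated as the name.
    leftover = re.sub(r'[\[\]]', ' ', pattern.sub(' ', text))
    name = ' '.join(leftover.split())
    if name:
        parts.append('"%s" -> ingredient name' % name)
    return parts


print(parse_ingredient('2 cups [9 oz] [10 g] sugar'))
```

For the sugar example this should give the same four-element shape as the output above, with "sugar" kept intact as the ingredient name. It also returns the list directly, so no ~ separator is needed.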