Viewed 26401 times
3 answers
18
There are several issues with your code. First, use a unicode pattern: re.compile(ur'<unicode string>'). It is also good practice to add the re.UNICODE flag (though it may not be strictly needed here). Second, you will still not get a match, because \d+ matches only a run of digits and does not handle decimals; use \d+\.?\d+ instead (digits, an optional dot, then more digits). Note that this pattern requires at least two digits, so it will not match a single-digit amount such as 5; \d+(?:\.\d+)? covers that case as well. Example code:
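# coding: utf-8
# Python 2 example: the ur'' prefix and the print statement are Python 2 only.
import re

text = u"PROCESS:类型:关爱积分[NOTIFY] 交易号:2012022900000109 订单号:W12022910079166 交易金额:0.01元 交易状态:true 2012-2-29 10:13:08"
# Capture the amount between the label and the currency unit.
pattern = re.compile(ur'交易金额:(\d+\.?\d+)元', re.UNICODE)
print pattern.search(text).group(1)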
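For readers on Python 3, here is a minimal sketch of the same idea. This is an assumption, since the answer above targets Python 2: in Python 3, str literals are already Unicode, the ur'' prefix is not valid syntax, and Unicode matching is the default for str patterns. The helper name extract_amount and the stricter pattern \d+(?:\.\d+)? are illustrative choices, not part of the original answer.

```python
import re

# Sample log line taken from the answer above.
text = ("PROCESS:类型:关爱积分[NOTIFY] 交易号:2012022900000109 "
        "订单号:W12022910079166 交易金额:0.01元 交易状态:true 2012-2-29 10:13:08")

# Illustrative pattern: digits, optionally followed by a dot and more digits,
# so that both "0.01" and a single-digit amount such as "5" are accepted.
AMOUNT_RE = re.compile(r'交易金额:(\d+(?:\.\d+)?)元')


def extract_amount(line):
    """Return the amount as a string, or None if the line has no amount."""
    m = AMOUNT_RE.search(line)
    return m.group(1) if m else None


print(extract_amount(text))
```

Returning None instead of calling .group(1) directly avoids an AttributeError on lines that contain no amount.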
answered 2012-05-11T06:45:59.493
0
If your input is UTF-8 encoded, you can match the byte strings directly and pass flags=re.LOCALE:
# coding: utf-8
import re

pattern = re.compile(r'交易金额:(\d+\.?\d+)元', flags=re.LOCALE)
for line in open('xx.txt'):
    # search() scans the whole line; match() only matches at the start of the line.
    match = pattern.search(line)
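For more details, see the documentation for re.LOCALE. There is no need to decode the UTF-8 bytes to unicode. Note that re.LOCALE only changes locale-dependent classes such as \w and \b; the literal characters and \d in this pattern match the same way without it.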
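The same no-decoding idea can be sketched for Python 3 by compiling a bytes pattern and matching raw UTF-8 lines. This is an assumption, since the answer above appears to target Python 2. The function name amounts_in_byte_lines is hypothetical, and the sketch leaves out re.LOCALE because the literal bytes and \d match without it.

```python
import re

# Encode the pattern to UTF-8 so it can be matched against raw bytes.
AMOUNT_RE = re.compile(r'交易金额:(\d+\.?\d+)元'.encode('utf-8'))


def amounts_in_byte_lines(lines):
    """Return the amounts found in an iterable of UTF-8 encoded byte lines."""
    found = []
    for line in lines:
        # search() scans the whole line rather than only its start.
        m = AMOUNT_RE.search(line)
        if m:
            # The captured group holds only ASCII digits and an optional dot.
            found.append(m.group(1).decode('ascii'))
    return found


sample = [
    "订单号:W12022910079166 交易金额:0.01元 交易状态:true".encode('utf-8'),
    b"a line without an amount",
]
print(amounts_in_byte_lines(sample))
```

To run this against a file, open it in binary mode, for example open('xx.txt', 'rb'), and pass the file object as lines.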
answered 2016-10-31T10:22:06.523