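How can I get string constants from source code held in a string?
For example, this is the source code I want to process:
var v = "this is string constant + some numbers and \" is also included "
I cannot get everything inside the quotation marks by using this regular expression: "(.*?)".
I don't want to capture var, v, = or anything else, only the characters of the string.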
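You need to match an opening quote, then any number of escaped characters or ordinary characters (anything except a quote or a backslash), then a closing quote: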
"(?:\\.|[^"\\])*"
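As a rough illustration, here is how that pattern could be used from Python. This is only a sketch: the helper name find_string_literals and the sample text are made up for the example, and the pattern covers double-quoted strings only.

```python
import re

# Opening quote, then any run of escaped pairs (\x) or plain characters
# (anything except a quote or a backslash), then the closing quote.
STRING_LITERAL = re.compile(r'"(?:\\.|[^"\\])*"')


def find_string_literals(source):
    """Return every double-quoted literal in source, quotes included."""
    # The pattern has no capturing group, so findall returns whole matches.
    return STRING_LITERAL.findall(source)


# Raw string, so the backslash really is part of the sample text.
sample = r'var v = "this is string constant + some numbers and \" is also included "'
literals = find_string_literals(sample)
print(literals)
```

To get the contents without the surrounding quotes, slice each match with [1:-1] or wrap the middle of the pattern in a capturing group.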
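Use a lookbehind to make sure the closing " is not preceded by a \. (This misreads a string whose last character is an escaped backslash, such as "a\\".)
import re
# Backslashes are doubled so they survive Python's own string escaping;
# a bare \" inside a '...' literal would be stored as just ".
data = 'var v = "this is string constant + some numbers and \\" is also included "\r\nvar v = "and another \\"line\\" "'
# Lazy .*? stops at the first unescaped quote, even with several strings on one line.
matches = re.findall(r'= "(.*?(?<!\\))"', data)
print(matches)
Output:
['this is string constant + some numbers and \\" is also included ', 'and another \\"line\\" ']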
To get everything inside the quotes, you can try the following with re.findall():
"\".+?\""
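For what it's worth, a quick check of how this lazy pattern behaves, assuming the text really contains a backslash before the inner quote (the sample strings here are invented for the test). It works for plain strings but stops early at an escaped quote:

```python
import re

# The pattern above, written as a raw string: quote, one or more
# characters (as few as possible), quote.
LAZY = re.compile(r'".+?"')

plain = 'var a = "hello" + "world"'
# Raw string, so the backslash is kept in the text being searched.
escaped = r'var v = "some numbers and \" is also included "'

plain_matches = LAZY.findall(plain)
escaped_matches = LAZY.findall(escaped)
print(plain_matches)    # both literals are found
print(escaped_matches)  # the match ends at the escaped quote
```

So this one is fine when the strings never contain \", but for the sample in the question a pattern that understands escapes is the safer choice.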