6

How can I get the content between the quotes in the following two texts?

text_1 = r""" "Some text on \"two\" lines with a backslash escaped\\" \
     + "Another text on \"three\" lines" """

text_2 = r""" "Some text on \"two\" lines with a backslash escaped\\" + "Another text on \"three\" lines" """

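The difficulty for me is that an escaped quote must be ignored, yet the backslash itself may also be escaped.

I would like to get the following groups.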

[
    r'Some text on \"two\" lines with a backslash escaped\\',
    r'Another text on \"three\" lines'
]

4 Answers

23
"(?:\\.|[^"\\])*"

This matches a quoted string, including any escaped characters that appear inside it.

Explanation:

"       # Match a quote.
(?:     # Either match...
 \\.    # an escaped character
|       # or
 [^"\\] # any character except quote or backslash.
)*      # Repeat any number of times.
"       # Match another quote.
answered 2013-04-21T11:43:24.143
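As a minimal sketch of how the pattern from the answer above could be applied to the question's two strings: the only change is a capturing group around the body, so that `re.findall` returns the content without the surrounding quotes. The names `QUOTED` and `quoted_parts` are made up for illustration.

```python
import re

text_1 = r""" "Some text on \"two\" lines with a backslash escaped\\" \
     + "Another text on \"three\" lines" """

text_2 = r""" "Some text on \"two\" lines with a backslash escaped\\" + "Another text on \"three\" lines" """

# Same pattern as in the answer, plus a capturing group around the body
# so that findall() returns the text between the quotes.
QUOTED = re.compile(r'"((?:\\.|[^"\\])*)"')


def quoted_parts(text):
    """Return the bodies of all double-quoted strings found in text."""
    return QUOTED.findall(text)


for part in quoted_parts(text_1):
    print(part)
# Some text on \"two\" lines with a backslash escaped\\
# Another text on \"three\" lines
```

Because `.` does not match a newline by default, a backslash immediately followed by a line break inside a quoted string would need `re.DOTALL`; that case does not occur in the two samples.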
1

Match everything except double quotes:

import re
# Not a raw string: \" is an ordinary quote here, so the text contains no backslashes.
text = "Some text on \"two\" lines" + "Another text on \"three\" lines"
print(re.findall(r'"([^"]*)"', text))

Output

['two', 'three']
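For comparison, here is a rough check of the same pattern against the question's raw `text_2`, where the backslashes are literally present in the string. It is only a sketch, and the variable name `parts` is illustrative. The pattern stops at every quote, escaped or not, so the text is cut into four fragments rather than the two strings the question asks for.

```python
import re

text_2 = r""" "Some text on \"two\" lines with a backslash escaped\\" + "Another text on \"three\" lines" """

# [^"]* cannot tell an escaped quote from a real one, so every quote ends a match.
parts = re.findall(r'"([^"]*)"', text_2)

for part in parts:
    print(repr(part))
# 'Some text on \\'
# ' lines with a backslash escaped\\\\'
# 'Another text on \\'
# ' lines'
```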
answered 2013-04-21T11:06:52.920
1
>>> import re
>>> text = "Some text on\n\"two\"lines" + "Another texton\n\"three\"\nlines"
>>> re.findall(r'"(.*)"', text)
['two', 'three']
answered 2013-04-21T11:03:16.697
0
>>> import re
>>> text_1 = r""" "Some text on \"two\" lines with a backslash escaped\\" \
     + "Another text on \"three\" lines" """
>>> text_2 = r""" "Some text on \"two\" lines with a backslash escaped\\" + "Another text on \"three\" lines" """
>>> re.findall(r'\\"([^"]+)\\"', text_2)
['two', 'three']
>>> re.findall(r'\\"([^"]+)\\"', text_1)
['two', 'three']

Maybe you want this:

re.findall(r'\\"((?:(?<!\\)[^"])+)\\"', text)
answered 2013-04-21T11:01:05.580