python - Regular expression returns nothing. Why?

Question

This returns nothing?

# Enter your code for "Image Extractor" here.
import re
with open('site.html') as html:
    content = html.read()
    content = str(content)
    print(re.findall(r'<ima?ge?\s+[^>]*?src=["|\']([^["|\']]+)', content))

I think it has something to do with how I'm escaping backslashes in the expression...

score 2 · Accepted Answer

[^["|\']]

I'm not sure what you wanted this to do. You can't nest character classes, and | does not mean alternation inside a character class; it is just a literal pipe there. The way you have it now, this section matches any single character that isn't one of the following:

[
"
|
'

followed by a literal ]. If you wanted this to be a single character class that matches anything but a single or double quote, you wanted

[^"\']
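To make the difference concrete, here is a minimal sketch comparing the original pattern with a corrected one. The `sample_html` string is a made-up stand-in for the contents of `site.html`, and the variable names are only for illustration. Note that the opening `["|\']` in the question has the same stray `|`, so the sketch drops it there as well.

```python
import re

# Hypothetical markup standing in for site.html (not from the question).
sample_html = """
<img src="logo.png" alt="Logo">
<image class="hero" src='photos/hero.jpg'>
<p>No image here</p>
"""

# Pattern from the question: [^["|\']]+ is read as the class [^["|\']
# (one character that is not [, ", | or ') followed by one or more literal ].
broken = re.compile(r'<ima?ge?\s+[^>]*?src=["|\']([^["|\']]+)')

# Corrected pattern: a single negated class that excludes both quote types.
fixed = re.compile(r'<ima?ge?\s+[^>]*?src=["\']([^"\']+)')

broken_matches = broken.findall(sample_html)
fixed_matches = fixed.findall(sample_html)

# The broken pattern only matches when the value is one character plus ]s.
odd_match = broken.findall('<img src="a]]">')

print(broken_matches)
print(fixed_matches)
print(odd_match)
```

The backslashes are not the problem: inside a raw string, `\'` reaches the regex engine as an escaped quote, which matches a literal `'`. For anything beyond simple markup, an HTML parser is usually a more robust choice than a regular expression.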
