
Why does this return nothing?

# Enter your code for "Image Extractor" here.
import re
with open('site.html') as html:
    content = html.read()
    content = str(content)
    print(re.findall(r'<ima?ge?\s+[^>]*?src=["|\']([^["|\']]+)', content))

I think it has something to do with how I'm escaping the backslashes in the expression...


1 Answer

[^["|\']]

I'm not sure what you wanted this to do. You can't nest character classes or use | for alternation in a character class. The way you have it now, this section matches any character that isn't one of the following:

[
"
|
'

followed by one or more literal ] characters (the trailing + applies to that ], not to the character class). If you wanted a single character class that matches anything but a single or double quote, you wanted

[^"\']
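To see the difference, here is a minimal sketch that runs both patterns against an inline sample string. The sample markup is made up for illustration and stands in for the contents of site.html; I also dropped the | from the quote class before the capture group, on the assumption that you only meant to match a quote character there.

```python
import re

# Hypothetical sample markup standing in for the contents of site.html.
content = '<img class="a" src="one.png"><image src=\'two.jpg\' alt="x">'

# Original pattern: [^["|\'] matches ONE character that is not [ " | or ',
# and the following ]+ then demands one or more literal ] characters.
broken = r'<ima?ge?\s+[^>]*?src=["|\']([^["|\']]+)'

# Corrected pattern: a single negated class, repeated by +.
fixed = r'<ima?ge?\s+[^>]*?src=["\']([^"\']+)'

broken_matches = re.findall(broken, content)
fixed_matches = re.findall(fixed, content)

# The original pattern only matches when the src value is one character
# followed by literal ] characters, which real URLs never are.
odd_matches = re.findall(broken, '<img src="a]]">')

print(broken_matches)  # []
print(fixed_matches)   # ['one.png', 'two.jpg']
print(odd_matches)     # ['a]]']
```

So the backslashes were not the problem; the empty list comes from the stray ] that every src value would have to contain.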
answered 2013-08-18T01:31:46.250