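Description
This expression will:
- find the meta tag that has the attribute property="og:image"
- avoid some difficult edge cases, such as an attribute value that itself contains text resembling a content attribute
- capture the value of the content attribute
- allow the attributes to appear in any order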
<meta(?=\s|>)(?=(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?\sproperty=(?:'og:image'|"og:image"|og:image))(?=(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?\scontent=('[^']*'|"[^"]*"|[^'"][^\s>]*))(?:[^'">=]*|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*>
Example
In this live example, note the difficult edge cases in the sample text of the first two meta tags: http://www.rubular.com/r/YY70uaGPLE
Sample text
<meta info=' content="DontFindMe" ' content="http://domain.com/path/path/file1.jpg" random_attr="blah blah"
property="og:image"/>
<meta content="http://domain.com/path/path/file2.jpg" random_attr="blah blah"
property="og:image"/>
<meta random_attr="blah blah" property='og:image' content="foo'" />
Matches
[0][0] = <meta info=' content="DontFindMe" ' content="http://domain.com/path/path/file1.jpg" random_attr="blah blah"
property="og:image"/>
[0][1] = "http://domain.com/path/path/file1.jpg"
[1][0] = <meta content="http://domain.com/path/path/file2.jpg" random_attr="blah blah"
property="og:image"/>
[1][1] = "http://domain.com/path/path/file2.jpg"
[2][0] = <meta random_attr="blah blah" property='og:image' content="foo'" />
[2][1] = "foo'"
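
The matches above can be reproduced outside Rubular. What follows is a minimal Python sketch, not part of the original answer: the names OG_IMAGE_RE, find_og_images and SAMPLE are hypothetical, the pattern is the one from the description with the single-quoted alternative written as 'og:image', and the helper strips the surrounding quotes that capture group 1 keeps. It assumes a Python 3 version whose re module accepts the nested quantifiers in the final group.

```python
import re

# The expression from the description (single-quoted alternative closed as
# 'og:image'). Capture group 1 holds the raw content value, quotes included.
OG_IMAGE_RE = re.compile(
    r"""<meta(?=\s|>)(?=(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?\sproperty=(?:'og:image'|"og:image"|og:image))(?=(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?\scontent=('[^']*'|"[^"]*"|[^'"][^\s>]*))(?:[^'">=]*|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*>"""
)


def find_og_images(html):
    """Return the content value of every og:image meta tag found in html."""
    values = []
    for match in OG_IMAGE_RE.finditer(html):
        raw = match.group(1)
        # Drop one matching pair of outer quotes, if the value was quoted.
        if len(raw) >= 2 and raw[0] in "'\"" and raw[-1] == raw[0]:
            raw = raw[1:-1]
        values.append(raw)
    return values


# The sample text from this answer, including the line breaks inside tags.
SAMPLE = '''<meta info=' content="DontFindMe" ' content="http://domain.com/path/path/file1.jpg" random_attr="blah blah"
property="og:image"/>
<meta content="http://domain.com/path/path/file2.jpg" random_attr="blah blah"
property="og:image"/>
<meta random_attr="blah blah" property='og:image' content="foo'" />
'''

if __name__ == "__main__":
    for value in find_og_images(SAMPLE):
        print(value)
```

The decoy content="DontFindMe" is skipped because each lookahead can only step over an = sign together with its complete quoted or unquoted value, so text inside the info value is never inspected as an attribute. For anything beyond a quick extraction, an HTML parser is usually the safer choice.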