php - Getting og:image with file_get_contents and preg_match

Question

I am using file_get_contents to get the og:image from any URL.

$fooURL = file_get_contents($URLVF['url']);

Then I filter on property=og:image to pull the image out of the page. The code below works for most websites:

preg_match("/content='(.*?)' property='og:image'/", $fooURL, $fooImage);

But sites like www.howcast.com write their og:image markup differently, as shown below:

<meta content='http://attachments-mothership-production.s3.amazonaws.com/images/main-avatar.jpeg' property='og:image'>

So, to get the image link from the markup above, I need the preg_match to look like this:

preg_match('/property="og:image" content="(.*?)"/', $fooURL, $fooImage);

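But of course, if I use the code above now, the only site that works is howcast, and every other site returns nothing.

Any idea how to make the code work regardless of how the meta tag is written, or any alternative way to reliably get the image link?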

score 2 · Accepted Answer

An example using DOMDocument and XPath, as suggested by @str:

$html = <<<LOD
<html><head>
<meta content='http://attachments-mothership-production.s3.amazonaws.com/images/main-avatar.jpeg' property='og:image'>
</head><body></body></html>
LOD;

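$doc = new DOMDocument();
// Real-world HTML is rarely valid; @ silences the parser warnings.
@$doc->loadHTML($html);
// To load a page directly instead: @$doc->loadHTMLFile($URLVF['url']);
$xpath = new DOMXPath($doc);
// Select the content attribute of every og:image meta tag inside <head>.
$metaContentAttributeNodes = $xpath->query("/html/head/meta[@property='og:image']/@content");
foreach ($metaContentAttributeNodes as $metaContentAttributeNode) {
    echo $metaContentAttributeNode->nodeValue . "<br/>";
}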
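
Because the parser works on attributes rather than raw text, the same query should handle both meta tag layouts from the question (content first with single quotes, or property first with double quotes). A minimal sketch of that idea, wrapped in a reusable function: the name extractOgImage, the example.com URLs, and the relaxed "//meta" query are assumptions for illustration and not part of the original answer, and libxml_use_internal_errors() is swapped in for the @ operator.

```php
<?php
// Hypothetical helper (not from the original answer): returns the first
// og:image URL found in an HTML string, or null when there is none.
function extractOgImage(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // "//meta" matches the tag anywhere in the document, so the match does
    // not depend on attribute order, quote style, or an explicit <head>.
    $nodes = $xpath->query("//meta[@property='og:image']/@content");

    return $nodes->length > 0 ? $nodes->item(0)->nodeValue : null;
}

// Made-up sample markup covering both layouts from the question.
$single = "<html><head><meta content='http://example.com/a.jpeg' property='og:image'></head></html>";
$double = '<html><head><meta property="og:image" content="http://example.com/b.jpeg"></head></html>';

echo extractOgImage($single), "\n";
echo extractOgImage($double), "\n";
```

To use it on a live page, the string returned by file_get_contents($URLVF['url']) could be passed in directly, after checking that the call did not return false.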
