I want to check whether a website contains schema.org markup. Here is what I'm doing:

$domain = 'http://agents.allstate.com/william-leahy-mount-prospect-il.html';
$client = new Zend_Http_Client();
$client->setUri($domain);
$response = $client->request();
$html = $response->getBody();
$dom = new Zend_Dom_Query($html);
$resultSchema = $dom->query('body');

foreach ($resultSchema as $r) {
    $data = $r->hasAttribute('itemprop');
    if ($data) {
        echo 'Yes';
    } else {
        echo 'No';
    }
}

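I can't figure out how to detect this. Is this the right approach? schema.org markup can appear on any HTML element. How can I query all elements and find the ones that carry schema.org markup?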

1 Answer

After a lot of searching and reading, I finally found an answer. In case anyone else is still looking, this is how I did it.

$separator = '|';
$dbData = '';
$domain = 'http://agents.allstate.com/william-leahy-mount-prospect-il.html';
$client = new Zend_Http_Client();
$client->setUri($domain);
$response = $client->request();
$html = $response->getBody();
$dom = new Zend_Dom_Query($html);

// Find the LocalBusiness item, then look for its name inside that subtree.
$result = $dom->queryXpath('//*[@itemtype="http://schema.org/LocalBusiness"]');
if ($result->count()) {
    foreach ($result as $r) {
        if ($r->hasChildNodes()) {
            // Serialize the subtree and re-parse it so the query is scoped to this item.
            $lbHtml = $r->C14N();

            $dom2 = new Zend_Dom_Query($lbHtml);
            $lbname = $dom2->queryXpath('//*[@itemprop="name"]');
            if ($lbname->count()) {
                // If several nodes match, the last one wins.
                foreach ($lbname as $nameNode) {
                    $name = $nameNode->nodeValue;
                }
            }
        }
    }
}

// Append an empty value when the property was not found.
$dbData .= 'name:' . (isset($name) ? $name : '') . $separator;

// The address is the text content of the PostalAddress item.
$result = $dom->queryXpath('//*[@itemtype="http://schema.org/PostalAddress"]');
if ($result->count()) {
    foreach ($result as $r) {
        $address = $r->nodeValue;
    }
}

$dbData .= 'address:' . (isset($address) ? $address : '') . $separator;

// The telephone can sit on any element, so match on itemprop alone.
$result = $dom->queryXpath('//*[@itemprop="telephone"]');
if ($result->count()) {
    foreach ($result as $r) {
        $telephone = $r->nodeValue;
    }
}

$dbData .= 'telephone:' . (isset($telephone) ? $telephone : '') . $separator;

// Drop the trailing separator.
$dbData = trim($dbData, $separator);

$dbData will then hold a single string with the schema.org properties extracted above (name, address and telephone). Hope this helps!
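For the more general question of whether a page carries any schema.org microdata at all, whatever element it sits on, here is a minimal sketch. It uses PHP's built-in DOMDocument and DOMXPath rather than Zend so that it runs on its own; the helper name findSchemaOrgTypes and the sample markup are made up for illustration, and it only looks at itemtype attributes, so anything expressed another way would not be detected.

```php
<?php
// Hypothetical helper (not part of Zend): returns the distinct schema.org
// itemtype URLs found in $html, in document order.
function findSchemaOrgTypes(string $html): array
{
    if ($html === '') {
        return []; // nothing to parse
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parse warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match any element, whatever its tag, whose itemtype mentions schema.org.
    $nodes = $xpath->query('//*[contains(@itemtype, "schema.org/")]');

    $types = [];
    foreach ($nodes as $node) {
        $types[] = $node->getAttribute('itemtype');
    }
    return array_values(array_unique($types));
}

// Made-up sample markup, for illustration only.
$sample = '<html><body><div itemscope itemtype="http://schema.org/LocalBusiness">'
        . '<span itemprop="name">Example Agency</span>'
        . '<p itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">1 Main St</p>'
        . '</div></body></html>';

$types = findSchemaOrgTypes($sample);
echo count($types) > 0 ? 'Yes' : 'No';
```

With the body returned by Zend_Http_Client, the same call would be findSchemaOrgTypes($html). Matching with contains() instead of an exact itemtype value is a deliberate choice here, so that both http and https forms of the schema.org URL are picked up.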

answered 2013-07-31T05:56:17.147