2

OK, say I have an HTML file like this:

<div class="vad buttonDiv" onclick="location.href='http://example.htm?some/link&id=1357900324528'">
<div class="vad buttonDiv" onclick="other('example')">
<div class="vad buttonDiv" onclick="location.href='http://example.htm?some/link&id=7458758375733'">
<div class="vad buttonDiv" onclick="other('example1')">
<div class="vad buttonDiv" onclick="location.href='http://example.htm?some/link&id=3474537737392'">
<div class="vad buttonDiv" onclick="other('example2')">

What I want is to display only the http://example.htm?some/link&id=************** URLs from an external HTML page. I tried the code below:

$dom = new DOMDocument();
@$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
$onclicks = $xpath->evaluate("/html/body//div");

for ($i = 0; $i < $onclicks->length; $i++) {
    $onclick = $onclicks->item($i);
    $display = $onclick->getAttribute("onclick");
    echo $display."<br>";
}

It outputs this:

location.href='http://example.htm?some/link&id=1357900324528'
other('example')

location.href='http://example.htm?some/link&id=7458758375733'
other('example1')

location.href='http://example.htm?some/link&id=3474537737392'
other('example2')

Any ideas on how to get just the links I'm after, rather than every onclick value? Any answer would be greatly appreciated.

4 Answers

2

Rather than complex DOM parsing, which will eventually fail on a site's malformed HTML, I would just use preg_match_all.

It is most likely faster and simpler:

if ( preg_match_all( '/onclick="(location\\.href=([^"]+))"/i', $html, $matches ) )
{
    print_r( $matches );
}
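
The pattern above returns the whole `location.href='...'` expression, quotes included. As a minimal sketch (the tighter pattern and the `$links` variable are my own assumptions, not part of the answer), the capture group can be narrowed so it holds only the URL between the single quotes:

```php
<?php
// Sample markup copied from the question.
$html = <<<HTML
<div class="vad buttonDiv" onclick="location.href='http://example.htm?some/link&id=1357900324528'">
<div class="vad buttonDiv" onclick="other('example')">
<div class="vad buttonDiv" onclick="location.href='http://example.htm?some/link&id=7458758375733'">
HTML;

// Capture only what sits between the single quotes after location.href=
$pattern = '/onclick="location\.href=\'([^\'"]+)\'"/i';

$links = array();
if (preg_match_all($pattern, $html, $matches)) {
    $links = $matches[1]; // group 1 holds the bare URLs
}

foreach ($links as $link) {
    echo $link, "\n";
}
```

This assumes every handler uses double quotes around the attribute and single quotes around the URL, as in the question; markup that quotes differently would need a different pattern.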

Answered 2013-01-11T11:45:15.693
2
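$url = "http://example.com";
$dom = new DOMDocument();
// loadHTMLFile() fetches and parses the page; loadHTML() would treat the URL string itself as markup
@$dom->loadHTMLFile($url);
$xpath = new DOMXPath($dom);

$PATH = $xpath->evaluate('/html/body//div[@class="vad buttonDiv"]');
for ($i = 0; $i < $PATH->length; $i++) {
    $lmao = $PATH->item($i);

    $answer = $lmao->getAttribute('onclick');
    // Skip handlers such as other('example') that carry no link
    if (strpos($answer, "location.href='") !== 0) {
        continue;
    }
    $searchArray = array("location.href='", "'");
    $replaceArray = array("", "");
    $link = str_replace($searchArray, $replaceArray, $answer);
    echo $link."<br>";
}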

This displays just the links.

Answered 2013-02-16T00:30:55.863
2

You were so close...

After a few minutes of reading up on XPath on Wikipedia, I came up with this working XPath:

$html=<<<TEXT
<html>
<body>
<div>
<div class="vad buttonDiv" onclick="location.href='http://example.htm?some/link&id=1357900324528'"></div>
<div class="vad buttonDiv" onclick="other('example')"></div>
<div class="vad buttonDiv" onclick="location.href='http://example.htm?some/link&id=7458758375733'"></div>
<div class="vad buttonDiv" onclick="other('example1')"></div>
<div class="vad buttonDiv" onclick="location.href='http://example.htm?some/link&id=3474537737392'"></div>
<div class="vad buttonDiv" onclick="other('example2')"></div>
</div>
</body>
</html>
TEXT;
$dom=new DOMDocument();
@$dom->loadHTML($html);
$xpath=new DOMXPath($dom);
$divs=$xpath->evaluate("/html/body//div[starts-with(@onclick,'location')]");
// Iterate the node list directly; range(0, -1) would misbehave when nothing matches
foreach($divs as $div)
{
    var_dump($div->getAttribute("onclick"));
}

The code above outputs:

string(61) "location.href='http://example.htm?some/link&id=1357900324528'"
string(61) "location.href='http://example.htm?some/link&id=7458758375733'"
string(61) "location.href='http://example.htm?some/link&id=3474537737392'"
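
The XPath filter still returns the full `location.href='...'` expression. A rough sketch of one way to trim it down to the bare URL follows; `extractHref` is a hypothetical helper introduced here for illustration, and the sample markup is shortened and uses `&amp;` so the parser sees a well-formed attribute:

```php
<?php
// Hypothetical helper (not part of the answer above): returns the bare URL
// from an onclick value such as location.href='http://...', or null otherwise.
function extractHref($onclick)
{
    $prefix = "location.href='";
    if (strpos($onclick, $prefix) !== 0) {
        return null;
    }
    // Drop the prefix, then any trailing single quote or semicolon.
    return rtrim(substr($onclick, strlen($prefix)), "';");
}

// Shortened sample; &amp; is the escaped form of & inside an HTML attribute.
$html = <<<TEXT
<html><body>
<div class="vad buttonDiv" onclick="location.href='http://example.htm?some/link&amp;id=1357900324528'"></div>
<div class="vad buttonDiv" onclick="other('example')"></div>
</body></html>
TEXT;

$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$links = array();
foreach ($xpath->query("//div[starts-with(@onclick,'location')]") as $div) {
    $link = extractHref($div->getAttribute('onclick'));
    if ($link !== null) {
        $links[] = $link;
    }
}
print_r($links);
```

Keeping the string handling in a small function means the XPath expression only has to select candidate nodes, while the PHP side decides what counts as a usable link.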
Answered 2013-01-11T12:06:09.537
1

A simple solution:

for ($i = 0; $i < $onclicks->length; $i++) {
    $onclick = $onclicks->item($i);
    $display = $onclick->getAttribute("onclick");
    if(substr($display, 0, 8) == 'location'){
        $display = str_replace(array("location.href='", "'"), '', $display);
        echo $display."<br>";
    }

}
Answered 2013-01-11T11:44:28.280