How would I retrieve the last occurrence of a.page_arrows?
<div class="page-nav">
<a class="paginationNumberStyle page_arrows" data-url="/Building-Materials-Concrete-Cement-Masonry/h_d1/N-5yc1vZ25ecodZarlk/h_d2/Navigation?catalogId=10053&Nu=P_PARENT_ID&langId=-1&Nao=384&storeId=10051">
<img alt="" src="/static/images/layout/triangle-green-left.gif"></a>
<span>6</span>
<a class="paginationNumberStyle" data-url="/Building-Materials-Concrete-Cement-Masonry/h_d1/N-5yc1vZ25ecodZarlk/h_d2/Navigation?catalogId=10053&Nu=P_PARENT_ID&langId=-1&Nao=576&storeId=10051">7</a>
<a class="paginationNumberStyle" data-url="/Building-Materials-Concrete-Cement-Masonry/h_d1/N-5yc1vZ25ecodZarlk/h_d2/Navigation?catalogId=10053&Nu=P_PARENT_ID&langId=-1&Nao=672&storeId=10051">8</a>
<a class="paginationNumberStyle page_arrows" data-url="/Building-Materials-Concrete-Cement-Masonry/h_d1/N-5yc1vZ25ecodZarlk/h_d2/Navigation?catalogId=10053&Nu=P_PARENT_ID&langId=-1&Nao=576&storeId=10051">
<img alt="" src="/static/images/layout/triangle-green-right.gif"></a>
</div>
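I am trying to collect the links on a page, then move to the next page and collect its links, and so on until there are no more pages. Here is my code: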
getLinks('http://www.homedepot.com/Building-Materials-Concrete-Cement-Masonry/h_d1/N-5yc1vZ25ecodZarlk/h_d2/Navigation?catalogId=10053&Nu=P_PARENT_ID&langId=-1&storeId=10051&currentPLP=true&omni=c_Concrete,%20Cement%20&%20Masonry&searchNav=true');
function getLinks($URL) {
    $html = file_get_contents($URL);
    $dom = new simple_html_dom();
    $dom->load($html);

    // Print every product link on the current page.
    foreach ($dom->find('a[class=item_description]') as $href) {
        $url = $href->href;
        echo $url.'<br>';
    }

    // Index 0 selects the FIRST matching pagination link, not the last arrow.
    if ($nextPage = $dom->find("a[class=paginationNumberStyle]", 0)) {
        $nextPageURL = 'http://www.homedepot.com'.$nextPage->getAttribute('data-url');
        // Free the parsed document before recursing into the next page.
        $dom->clear();
        unset($dom);
        getLinks($nextPageURL);
    } else {
        echo "\nEND";
        $dom->clear();
        unset($dom);
    }
}
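For reference, here is a minimal sketch of the "last a.page_arrows" lookup on its own. It is not the code from the post: it uses PHP's bundled DOMDocument and DOMXPath instead of simple_html_dom so that it runs without extra includes, and the function name lastPageArrowUrl and the shortened sample markup are hypothetical.

```php
<?php
// Sketch: return the data-url of the last <a> that carries the page_arrows
// class, or null when the page has no such link. Uses the bundled DOM
// extension rather than simple_html_dom (an assumption made for portability).
function lastPageArrowUrl(string $html): ?string
{
    if (trim($html) === '') {
        return null; // nothing to parse
    }

    $doc = new DOMDocument();
    // Real-world markup is rarely valid, so collect libxml warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match "page_arrows" as a whole class token, then keep only the last hit.
    $query = '(//a[contains(concat(" ", normalize-space(@class), " "), " page_arrows ")])[last()]';
    $nodes = $xpath->query($query);

    if ($nodes === false || $nodes->length === 0) {
        return null; // no arrow link on this page
    }

    return $nodes->item(0)->getAttribute('data-url');
}

// Shortened, hypothetical version of the pagination markup shown above.
$sample = '<div class="page-nav">'
    . '<a class="paginationNumberStyle page_arrows" data-url="/prev?Nao=384"><img alt="" src="left.gif"></a>'
    . '<span>6</span>'
    . '<a class="paginationNumberStyle" data-url="/page?Nao=576">7</a>'
    . '<a class="paginationNumberStyle page_arrows" data-url="/next?Nao=576"><img alt="" src="right.gif"></a>'
    . '</div>';

echo lastPageArrowUrl($sample), "\n"; // prints /next?Nao=576
```

If staying with simple_html_dom, its manual describes a negative index for find(), so $dom->find('a.page_arrows', -1) should return the same element. Note that on the final page the last arrow is probably the left one, so a stop condition (for example, checking that the img src contains triangle-green-right) may be needed to avoid looping backwards.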