
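I'm not trying to scrape Google. This is just a one-time way to get about 300 URLs that is a bit faster than doing it by hand.

I can't seem to create a DOMDocument. It always ends up as an empty object.

search_list.txt contains my list of search terms. Right now I'm testing it with just one term, "legos".

The script downloads the search results page correctly. I viewed it in a web browser and it looks fine.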

search_list.txt

legos

get_results.php

<?php
$search_list = 'search_list.txt'; // file containing search terms
$results = 'results.txt';

$handle = fopen($search_list,'r');

while($line = fgets($handle)) {
        $fp = fopen($results,'w');
        $ch = curl_init('http://www.google.com/'
        . 'search?q=' . urlencode($line));
        curl_setopt($ch,CURLOPT_FILE,$fp);
        curl_setopt($ch,CURLOPT_HEADER,0);
        curl_exec($ch);
        curl_close($ch);
        fclose($fp);
        unset($ch,$fp);
}
fclose($handle);


$dom = DOMDocument::loadHTML(file_get_contents($results));
echo print_r($dom,true); // EMPTY
$search_div = $dom->getElementById('search');

if(is_null($search_div)) { // ALWAYS NULL
        echo 'Search_div is null';
} else {
        echo print_r($search_div,true);
}

?>

1 Answer


I made a few changes.

Instead of fopen and fgets, I use file().

Instead of curl, I use simple_html_dom::load_file.

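<?php
require 'simple_html_dom.php'; // PHP Simple HTML DOM Parser; the path is an assumption, adjust it to your copy

$search_list = 'search_list.txt'; // file containing search terms
$result_list = 'results.txt'; // file that receives the results

// Read the terms into an array, dropping trailing newlines and empty lines
$searching_list = file($search_list, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$html = new simple_html_dom();
foreach ($searching_list as $key => $searching_word) {
    $html->load_file('http://www.google.com/'.'search?q='.urlencode($searching_word));
    $search_div = $html->find("div[id='search']");
    echo $search_div[0]; // See content of the search div.
    file_put_contents($result_list, $search_div[0], FILE_APPEND); // append so earlier terms are not overwritten
}

?>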

You can view the result with echo $search_div[0];

It shows you the entire content of the search div.

I searched for 'asd' =) ...

In my result, it starts like this:

<div id="search"><div id="ires"><ol><li class="g"><h3 class="r"><a href="/url?q=http://en.wikipedia.org/wiki/Atrial_septal_defect&amp;sa=U&amp;ei=5qhMUv36ILKX0AXxuYGYCQ&amp;ved=0CBgQFjAA&amp;usg=AFQjCNFo67q2pfiPWK5SDMKFTeu-QSfcxw"><b>Atrial septal defect</b> - Wikipedia, the free encyclopedia</a></h3><div class="s"><div class="kv" style="margin-bottom:2px"><cite>en.wikipedia.org/wiki/<b>Atrial_septal_defect</b></cite><span class="flc"> - <a href="/url?q=http://webcache.googleusercontent.com/search%3Fq%3Dcache:Ocu9slAHjr4J:http://en.wikipedia.org/wiki/Atrial_septal_defect%252Basd%26hl%3Den%26ct%3Dclnk&amp;sa=U&amp;ei=5qhMUv36ILKX0AXxuYGYCQ&amp;ved=0CBkQIDAA&amp;usg=AFQjCNEY245u_ERgmZd7-2vIk5RAIRbOeg">Cached</a> - <a href="/search?ie=UTF-8&amp;q=related:en.wikipedia.org/wiki/Atrial_septal_defect+asd&amp;tbo=1&amp;sa=X&amp;ei=5qhMUv36ILKX0AXxuYGYCQ&amp;ved=0CBoQHzAA">Similar</a></span></div><span class="st"><b>Atrial septal defect</b> (<b>ASD</b>)

and ends with:

</span><br></div></li><li class="g"><h3 class="r"><a href="/url?q=http://achievementschooldistrict.org/&amp;sa=U&amp;ei=5qhMUv36ILKX0AXxuYGYCQ&amp;ved=0CEQQFjAJ&amp;usg=AFQjCNHqINq_rlt8mbk2WmlATfpx-fyP8w"><b>Achievement School District</b></a></h3><div class="s"><div class="kv" style="margin-bottom:2px"><cite>achievementschooldistrict.org/</cite><span class="flc"> - <a href="/url?q=http://webcache.googleusercontent.com/search%3Fq%3Dcache:s8DoGxDbr4oJ:http://achievementschooldistrict.org/%252Basd%26hl%3Den%26ct%3Dclnk&amp;sa=U&amp;ei=5qhMUv36ILKX0AXxuYGYCQ&amp;ved=0CEUQIDAJ&amp;usg=AFQjCNEPhVqK33c7ruuXT7cwVe3-8JdUVA">Cached</a></span></div><span class="st"><b>Achievement School District</b> &middot; The <b>ASD</b> &middot; Driving Results &middot; Campuses &middot; Join Our <br>  Team &middot; Enroll A Student &middot; <b>ASD</b> News &middot; Contact Us&nbsp;<b>...</b></span><br></div></li></ol></div></div>
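
Since the goal is a list of URLs, the links inside the search div still need to be unwrapped: each result points at /url?q=<target>&sa=... rather than at the target itself. The following is a minimal sketch of that step, not part of the original answer. It uses PHP's built-in DOMDocument and DOMXPath instead of simple_html_dom; the function name extract_result_urls and the shortened $sample_html are hypothetical, and the h3 class "r" selector is taken from the output above, so it only holds for markup of that shape.

```php
<?php
// Shortened, hypothetical stand-in for the search div shown above.
$sample_html = '<div id="search"><ol>'
    . '<li class="g"><h3 class="r"><a href="/url?q=http://en.wikipedia.org/wiki/Atrial_septal_defect&amp;sa=U">Atrial septal defect</a></h3></li>'
    . '<li class="g"><h3 class="r"><a href="/url?q=http://achievementschooldistrict.org/&amp;sa=U">Achievement School District</a></h3></li>'
    . '</ol></div>';

// Hypothetical helper: returns the target URLs wrapped in "/url?q=..." result links.
function extract_result_urls(string $html): array
{
    libxml_use_internal_errors(true); // collect markup warnings instead of printing them
    $dom = new DOMDocument();
    $dom->loadHTML($html);            // loadHTML() is called on an instance here
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $urls = [];
    // Result titles in the output above are <h3 class="r"><a href="/url?q=...">.
    foreach ($xpath->query('//h3[@class="r"]/a[@href]') as $anchor) {
        $href = $anchor->getAttribute('href'); // entities such as &amp; are already decoded
        if (strpos($href, '/url?') !== 0) {
            continue; // not a wrapped result link
        }
        parse_str(substr($href, strlen('/url?')), $params); // also URL-decodes the values
        if (isset($params['q'])) {
            $urls[] = $params['q'];
        }
    }
    return $urls;
}

$urls = extract_result_urls($sample_html);
print_r($urls);
```

In the loop from the answer, the same helper could be fed the string form of $search_div[0], and the returned URLs appended to results.txt one per line.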

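Update

This part is based on Buttle Butk's comment.

If you only need the first result of each Google search, you can use this code to build a link that leads to it.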

<?php
$search_list = 'search_list.txt'; // file containing search terms
$result_list = 'results.txt'; // file that receives the links
$order_language = "en";
// Read the terms into an array, dropping trailing newlines and empty lines
$searching_list = file($search_list, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

foreach ($searching_list as $key => $searching_word) {
    $link = 'https://www.google.com.tr/search?hl='.$order_language.'&q='.urlencode($searching_word).'&btnI=1';
    echo $link;
    file_put_contents($result_list, $link.PHP_EOL, FILE_APPEND); // one link per line
}
?>

I searched for 'asd' again =) ...

Result:

https://www.google.com.tr/search?hl=en&q=asd&btnI=1

When I copy and paste this link into Chrome, it redirects to the first result of my 'asd' search:

http://www.asd-europe.org/
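
The redirect described above happens in the browser, so a script that wants the final address has to follow it itself. The following is a rough sketch of one way to do that with curl, under the assumption that Google still answers a btnI=1 link with an HTTP redirect for non-browser clients, which may not hold. Both function names are hypothetical; only build_first_result_link is exercised here, because resolve_final_url needs network access.

```php
<?php
// Hypothetical helper: builds the same kind of link as the update above.
function build_first_result_link(string $term, string $lang = 'en'): string
{
    $params = ['hl' => $lang, 'q' => $term, 'btnI' => 1];
    // http_build_query() URL-encodes the values; '&' is passed explicitly as the separator.
    return 'https://www.google.com.tr/search?' . http_build_query($params, '', '&');
}

// Hypothetical helper: follows HTTP redirects and returns the last URL reached,
// or null if the request failed.
function resolve_final_url(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow Location headers
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);         // stop after a few hops
    $body = curl_exec($ch);
    $final = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    return $body === false ? null : $final;
}

$link = build_first_result_link('asd');
echo $link, PHP_EOL; // prints https://www.google.com.tr/search?hl=en&q=asd&btnI=1
```

Calling resolve_final_url($link) for each search term and appending the return value to results.txt would store the destination rather than the Google link, provided the redirect assumption holds.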

I'd be glad if this helps. Have a nice day.

answered 2013-10-02T23:24:57.347