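I finally got my scraper working (sort of), but now I'd like to know how to move on to the next page automatically and scrape the same information from there. I'm using cURL to fetch the whole page (otherwise I get a 500 error). Here is my code: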
<?php
// create curl resource
$ch = curl_init();
// set url
curl_setopt($ch, CURLOPT_URL, "http://example.com/results.asp?&j=t&page_no=1");
//return the transfer as a string
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// $html contains the response body as a string
$html = curl_exec($ch);
// close curl resource to free up system resources
curl_close($ch);
// print $html . "\n";
require 'simple_html_dom.php';
$dom = new simple_html_dom();
$dom->load($html);
foreach($dom->find("div[@id='schoolsearch'] tr") as $data){
    $tds = $data->find("td");
    if(count($tds)==3){
        $record = array(
            'school' => $tds[1]->plaintext,
            'city' => $tds[2]->plaintext
        );
        print json_encode($record) . "\n";
        file_put_contents('schools.csv', json_encode($record) . "\n", FILE_APPEND);
    }
}
?>
It isn't perfect, but it works for now! Does anyone know how I can go to the next page?
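One possible approach, sketched below, is to wrap the fetch and parse steps in functions and loop over the `page_no` query parameter. This is only a sketch under assumptions: that the site paginates purely through `page_no`, and that a page past the last one returns no matching rows (neither is confirmed for the real site). The helper names `build_page_url`, `fetch_page`, `parse_schools`, and `scrape_all_pages`, and the `$max_pages` safety cap, are hypothetical.

```php
<?php
// Hypothetical helper: build the URL for a given page number.
// Assumes pagination is driven only by the page_no query parameter.
function build_page_url($page_no) {
    return "http://example.com/results.asp?&j=t&page_no=" . (int) $page_no;
}

// Fetch one page with cURL; returns the HTML string, or false on failure.
function fetch_page($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $html = curl_exec($ch);
    curl_close($ch);
    return $html;
}

// Extract school/city records from one page, using the same selector
// as the original script.
function parse_schools($html) {
    require_once 'simple_html_dom.php';
    $dom = new simple_html_dom();
    $dom->load($html);
    $records = array();
    foreach ($dom->find("div[@id='schoolsearch'] tr") as $row) {
        $tds = $row->find("td");
        if (count($tds) == 3) {
            $records[] = array(
                'school' => $tds[1]->plaintext,
                'city'   => $tds[2]->plaintext
            );
        }
    }
    $dom->clear(); // release parser memory before loading the next page
    return $records;
}

// Walk pages 1..$max_pages and stop at the first page that fails to load
// or yields no records (assumed to mean "past the last page").
// $fetch and $parse are callables so the loop can be tried without a network.
function scrape_all_pages($fetch, $parse, $max_pages = 1000) {
    $all = array();
    for ($page_no = 1; $page_no <= $max_pages; $page_no++) {
        $html = call_user_func($fetch, build_page_url($page_no));
        if ($html === false || $html === '') {
            break;
        }
        $records = call_user_func($parse, $html);
        if (count($records) === 0) {
            break;
        }
        $all = array_merge($all, $records);
    }
    return $all;
}
```

Calling `scrape_all_pages('fetch_page', 'parse_schools')` would then return every record in one array, which can be written out the same way as before. If the site instead exposes a "next" link, following that link's `href` may be more reliable than counting pages.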