
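This is my first attempt at using cURL in PHP. I want to scrape the results from this page: http://www.lldj.com/pastresult.php. The site has published weekly lotto results since 2002 and has a simple submission form (by date).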

Submit button: name = Button, value = Submit
Select dropdown: name = Draw, options 1 to 1097 (the draw number)

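I could check the results by hand, but I figured a simple script would make it easier, and I am also interested in testing how to submit data and retrieve the result with PHP/cURL.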

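I have already used PHP DOM for scraping and I am comfortable with its syntax. I would like to know whether I should use cURL together with DOM, or whether this can be done with cURL alone.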

What I have so far:

include 'dom.php';
$post_data['draw'] = '1097';
$post_data['button'] = 'Submit';

//traverse array and prepare data for posting (key1=value1)
foreach ($post_data as $key => $value) {
    $post_items[] = $key . '=' . $value;
}

//create the final string to be posted using implode()
$post_string = implode('&', $post_items);

//create cURL connection
$curl_connection = curl_init('http://www.lldj.com/pastresult.php');

//set options
curl_setopt($curl_connection, CURLOPT_CONNECTTIMEOUT, 30);
//the user-agent string was cut off in the original post; 'Mozilla/5.0' is a placeholder
curl_setopt($curl_connection, CURLOPT_USERAGENT, 'Mozilla/5.0');
curl_setopt($curl_connection, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl_connection, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl_connection, CURLOPT_FOLLOWLOCATION, 1);
//set data to be posted
curl_setopt($curl_connection, CURLOPT_POSTFIELDS, $post_string);

//perform our request
$result = curl_exec($curl_connection);

//show information regarding the request
print_r(curl_getinfo($curl_connection));
echo curl_errno($curl_connection) . '-' . curl_error($curl_connection);

After submitting the data, the scraping part:

$t = $curl_connection->find('table', 0); // ?? usually refers to the file_get_contents variable
$data = $t->find('tr');

foreach ($data as $n) {
    $tds = $n->find('td');

    $dataRows = array();

    $dataRows['num'] = $tds[0]->find('img', 0)->href;

    var_dump($dataRows);
}

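Can someone point out whether this is correct? Also, how would you set it up to auto-increment the submitted value and repeat the process (e.g. submit draw = 1, then draw = 2, and so on)? Thanks.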


2 Answers

<?php
include 'dom.php'; // assumed to be Simple HTML DOM, as in the question

// the question lists draws 1 to 1097; lower the bound to match if needed
for ($i = 1; $i < 5000; $i++) {

    $post_data['draw'] = $i; // will change every time like 1,2,3,4
    $post_data['button'] = 'Submit';

    //traverse array and prepare data for posting (key1=value1)
    $post_items = array(); // reset so pairs from earlier passes do not pile up
    foreach ($post_data as $key => $value) {
        $post_items[] = $key . '=' . $value;
    }

    //create the final string to be posted using implode()
    $post_string = implode('&', $post_items);

    //create cURL connection
    $curl_connection = curl_init('http://www.lldj.com/pastresult.php');

    //set options
    curl_setopt($curl_connection, CURLOPT_CONNECTTIMEOUT, 30);
    //the user-agent string was cut off in the original post; 'Mozilla/5.0' is a placeholder
    curl_setopt($curl_connection, CURLOPT_USERAGENT, 'Mozilla/5.0');
    curl_setopt($curl_connection, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl_connection, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($curl_connection, CURLOPT_FOLLOWLOCATION, 1);
    //set data to be posted
    curl_setopt($curl_connection, CURLOPT_POSTFIELDS, $post_string);

    //perform our request
    $result = curl_exec($curl_connection);

    //show information regarding the request
    print_r(curl_getinfo($curl_connection));
    echo curl_errno($curl_connection) . '-' . curl_error($curl_connection);

    if ($result === false) {
        continue; // request failed, nothing to parse
    }

    // start your scraping here
    // parse the returned HTML string, not the cURL handle
    $html = str_get_html($result);
    $t = $html ? $html->find('table', 0) : null;
    if (!$t) {
        continue; // no table in this response
    }
    $data = $t->find('tr');

    foreach ($data as $n) {
        $tds = $n->find('td');

        $dataRows = array();

        $dataRows['num'] = $tds[0]->find('img', 0)->href;

        var_dump($dataRows);
    }

} // for loop ends here
?>

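This is only a skeleton that uses cURL and changes the draw ID on each pass; you can set it up your own way.

Also make sure to clear the variables once you have fetched the data.

Use something like: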

...
curl_close($curl_connection);
unset($post_string);
...
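
One way to tidy the skeleton above is to build the POST body with http_build_query() and wrap each request in a function. This is only a sketch under assumptions: build_post_body() and fetch_draw() are made-up names for illustration, and the field names draw and button are taken from the question's code, not checked against the live form.

```php
<?php
// Hypothetical helper: builds the URL-encoded POST body for one draw number.
function build_post_body(int $draw): string
{
    // http_build_query() joins and URL-encodes the key=value pairs
    return http_build_query(array(
        'draw'   => $draw,
        'button' => 'Submit',
    ));
}

// Hypothetical helper: posts one draw number and returns the HTML, or null on failure.
function fetch_draw(string $url, int $draw): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => build_post_body($draw),
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => 30,
    ));
    $html = curl_exec($ch);
    curl_close($ch); // release the handle after every request

    return $html === false ? null : $html;
}
```

With these in place, the loop body could shrink to a single call such as fetch_draw('http://www.lldj.com/pastresult.php', $i), and nothing is carried over from one pass to the next.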
answered 2013-06-04T09:25:42.560

Loading the page

The preferred way to fetch remote content is file_get_contents(). Usage:

$html = file_get_contents('http://www.lldj.com/pastresult.php');

That's it.
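
Note that a bare file_get_contents() call sends a GET request, so it would presumably return the form page and not the result for a given draw. A minimal sketch of sending the form fields as a POST through a stream context follows. make_post_context() is a hypothetical helper name, and the field names come from the question's code.

```php
<?php
// Hypothetical helper: builds a stream context that makes file_get_contents() send a POST.
function make_post_context(array $fields)
{
    return stream_context_create(array(
        'http' => array(
            'method'  => 'POST',
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
            'content' => http_build_query($fields),
            'timeout' => 30,
        ),
    ));
}

$context = make_post_context(array('draw' => 1097, 'button' => 'Submit'));

// Uncomment to send the request (requires allow_url_fopen to be enabled):
// $html = file_get_contents('http://www.lldj.com/pastresult.php', false, $context);
```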


Getting content from the page

To get content from the page, you would usually use DOMDocument and DOMXPath:

$doc = new DOMDocument();
@$doc->loadHTML($html);
$selector = new DOMXpath($doc);

// xpath query
$result = $selector->query('YOUR QUERY');
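
As an illustration of what such a query might look like, the sketch below runs against a made-up HTML fragment, since the real markup of the results page is not shown here. It assumes each result row holds an img element in its first cell and collects the src attributes.

```php
<?php
// Made-up sample markup; the real page structure may differ.
$html = '<table>'
      . '<tr><td><img src="balls/05.gif"></td><td>first</td></tr>'
      . '<tr><td><img src="balls/12.gif"></td><td>second</td></tr>'
      . '</table>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// src attribute of every <img> inside the first cell of each row of the first table
$sources = array();
foreach ($xpath->query('(//table)[1]//tr/td[1]//img/@src') as $attr) {
    $sources[] = $attr->nodeValue;
}

print_r($sources);
```

Using libxml_use_internal_errors() is an alternative to the @ operator for keeping warnings about malformed HTML out of the output.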
answered 2013-06-04T09:25:03.873