
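I have a crawler that crawls sites beginning with www.bbc.co.uk/news. It grabs every link that starts with http://www.bbc.co.uk/news, finds its description, link, and title, and inserts them into a database.

For some reason, it does not seem to insert anything.

Any ideas?

PS: There is no output at all; the page is completely blank.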

    foreach ($links as $link) {
        $output = array(
            "title"       => Titles($link), //dont know what Titles is, variable or string?
            "description" => getMetas($link),
            "keywords"    => getKeywords($link),
            "link"        => $link
        );
        if (empty($output["description"])) {
            $output["description"] = getWord($link);
        }

        if (substr($ouput, 0, 26) == "http://www.bbc.co.uk/news/") {

            $data = '"' . implode('" , "', $output) . '"';
            $success = mysql_query( "INSERT INTO news_story (`title`, `description` , `keywords`, `link`)
            VALUES (" . $data . ")") or zerror_reporting();
            if ($sucess) {
                echo "YEAH!";
            }

            if (!$sucess) {
                echo "NO!!";
            }
            print_r($data);
        }
    }

4 Answers


The problem is here:

    if (substr($ouput, 0, 26) == "http://www.bbc.co.uk/news/") {

        $data = '"' . implode('" , "', $output) . '"';
        $success = mysql_query( "INSERT INTO news_story (`title`, `description` , `keywords`, `link`)
        VALUES (" . $data . ")") or zerror_reporting();
        if ($sucess) {
            echo "YEAH!";
        }

Where is your $ouput variable defined? I think you meant to write $output, but that will not work either, because $output is an array, not a string.
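
A minimal sketch of a check that operates on the link string rather than on the whole array. The helper name `isBbcNewsLink` and the sample URLs are made up for illustration; only the prefix and the `link` key come from the question.

```php
<?php
// Hypothetical helper: true when $link starts with the BBC News prefix.
function isBbcNewsLink($link)
{
    $prefix = "http://www.bbc.co.uk/news/";
    // strncmp() compares only the first strlen($prefix) bytes,
    // so the length 26 does not have to be hard-coded.
    return is_string($link) && strncmp($link, $prefix, strlen($prefix)) === 0;
}

// Made-up sample data shaped like the $output array in the question.
$output = array(
    "title" => "Example title",
    "link"  => "http://www.bbc.co.uk/news/uk-00000000",
);

var_dump(isBbcNewsLink($output["link"]));            // bool(true)
var_dump(isBbcNewsLink("http://example.com/news/")); // bool(false)
var_dump(isBbcNewsLink($output));                    // bool(false): an array is rejected
```

Passing `$output["link"]` instead of `$output` is the actual fix; the `is_string()` guard just makes the mistake visible as a plain `false` instead of a warning.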

Answered 2012-12-18T12:33:25.283

Sanitize your values before inserting them into the database.
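
One common way to do this is to let the database driver bind the values through a prepared statement instead of escaping them by hand. The sketch below swaps in PDO with an in-memory SQLite database purely so that it runs without a MySQL server. That substitution, the sample row, and the assumption that the pdo_sqlite extension is installed are not part of the question's code.

```php
<?php
// Assumption: pdo_sqlite is installed. SQLite in memory stands in for MySQL.
$pdo = new PDO("sqlite::memory:");
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec(
    "CREATE TABLE news_story (title TEXT, description TEXT, keywords TEXT, link TEXT)"
);

// Made-up sample row with quotes that would break a hand-built SQL string.
$output = array(
    "title"       => 'Minister says "no" to O\'Brien plan',
    "description" => "Sample description",
    "keywords"    => "politics, uk",
    "link"        => "http://www.bbc.co.uk/news/uk-00000000",
);

$stmt = $pdo->prepare(
    "INSERT INTO news_story (title, description, keywords, link)
     VALUES (:title, :description, :keywords, :link)"
);
foreach ($output as $column => $value) {
    // Each value is bound as data and never parsed as SQL.
    $stmt->bindValue(":" . $column, $value);
}
$stmt->execute();

// Read the title back to confirm it was stored unchanged.
$stored = $pdo->query("SELECT title FROM news_story")->fetchColumn();
var_dump($stored === $output["title"]); // bool(true)
```

With MySQL the same prepare/bindValue/execute calls apply; only the DSN passed to the PDO constructor changes.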

Answered 2012-12-18T12:31:10.110

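@Mrinmoy's solution is correct, but there seem to be more problems in the code, because your code never gets as far as that point.

First, turn on error display:

    ini_set('error_reporting',E_ALL);
    ini_set('display_errors','on');
    foreach ($links as $link) {

PHP tells you a lot, if you are willing to listen. I personally use E_ALL|E_STRICT, but that is a bit too much for today. :) Then sanitize your data, otherwise you will rarely manage to insert a record. Your data will contain a lot of sentences, and sentences tend to contain quotes and apostrophes that end the SQL string early: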

    $output = array(
        "title"       => mysql_real_escape_string(Titles($link)), //dont know what Titles is, variable or string?
        "description" => mysql_real_escape_string(getMetas($link)),
        "keywords"    => mysql_real_escape_string(getKeywords($link)),
        "link"        => mysql_real_escape_string($link)
    );
    if (empty($output["description"])) {
        $output["description"] = mysql_real_escape_string(getWord($link));
    }

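Then fix the variable typo and use the link index of the output array:

    if (substr($output['link'], 0, 26) == "http://www.bbc.co.uk/news/") {

Finally, if you still do not get any data, you will at least know much more and be able to fix it yourself. Use print_r($output); echo $data; just before the call to mysql_query. Another way to track progress is to pepper the code with echo __LINE__ . "\n"; to see where it dies. Also verify that your code really defines a function named zerror_reporting, or replace it with die(mysql_error());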
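
The inspection steps can be sketched as follows. The sample row is made up, and the query is only built and printed, never sent, so the snippet runs without a database. The string building is copied from the question to show what an unescaped title does to the statement.

```php
<?php
// Made-up sample row; the title contains double quotes on purpose.
$output = array(
    "title"       => 'Minister says "no" to plan',
    "description" => "Sample description",
    "keywords"    => "politics, uk",
    "link"        => "http://www.bbc.co.uk/news/uk-00000000",
);

// Same string building as in the question.
$data = '"' . implode('" , "', $output) . '"';
$sql  = "INSERT INTO news_story (`title`, `description`, `keywords`, `link`) "
      . "VALUES (" . $data . ")";

print_r($output);      // inspect the raw values
echo $sql, "\n";       // inspect the exact statement that would be sent
echo __LINE__, "\n";   // progress marker: execution reached this line

// Four values need 8 wrapping quotes; the 2 extra ones from the title
// split the string literal and make the statement invalid.
echo substr_count($data, '"'), "\n"; // 10
```

Seeing the printed statement usually makes the broken quoting obvious before the database ever reports it.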

Answered 2012-12-19T12:15:46.987

A blank white page usually means a PHP fatal error, which produces a 500 Internal Server Error response. A likely cause here is the undefined function zerror_reporting(), which PHP tries to call whenever mysql_query() returns false:

mysql_query(...) or zerror_reporting();

Change that to something like

mysql_query(...) or trigger_error(mysql_error());

The trigger_error() call will add the mysql error to your error log.
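
A small sketch of how the `or trigger_error(...)` pattern behaves. `fakeQuery()` is a hypothetical stand-in for a failing mysql_query() call, and the error message is invented, so the snippet runs without a database. The custom handler is only there to capture the warning for display; by default PHP would write it to the error log.

```php
<?php
// Hypothetical stand-in for a mysql_query() call that fails.
function fakeQuery($sql)
{
    return false;
}

// Capture user-level warnings in an array instead of logging them.
$errors = array();
set_error_handler(function ($errno, $errstr) use (&$errors) {
    $errors[] = $errstr;
    return true; // tell PHP the error has been handled
});

// "or" binds more loosely than "=", so $success receives the query result
// and trigger_error() runs only when that result is falsy.
$success = fakeQuery("INSERT INTO news_story (`title`) VALUES ('x')")
    or trigger_error("query failed: sample message", E_USER_WARNING);

restore_error_handler();

var_dump($success); // bool(false)
print_r($errors);   // one captured warning
```

Unlike a call to an undefined function, a warning raised this way does not abort the script, so the rest of the page can still render.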

The second problem is that you are calling substr() on an array. Do it on the link element instead:

    if (substr($output['link'], 0, 26) == "http://www.bbc.co.uk/news/") {

Answered 2012-12-18T12:26:11.997