php - How to get content from another web page and store it in a database

Question

Hello friends,

I need to get the race results from this page:

"http://www.drf.com/race-results/BHP/USA/2012-06-23/D"

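and store them in my database. I need all the records for Race 1, Race 2, Race 3, and so on.

Please advise. I am using the code below, but it shows me the whole page, and I only need specific information.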

    <?php
    $ch = curl_init();

    // Set the URL to fetch
    curl_setopt($ch, CURLOPT_URL, 'http://www.drf.com/race-results/BHP/USA/2012-06-24/D');

    // Do not return the header information
    curl_setopt($ch, CURLOPT_HEADER, 0);

    // If SSL verification is needed. Delete if not needed
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, TRUE);

    // Give me the data back as a string... Don't echo it.
    //curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

    // Warp 9, Engage!
    $content = curl_exec($ch);

    // Close CURL connection & free the used memory.
    curl_close($ch);
    ?>

score 2 · Accepted Answer

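I recommend the Goutte library. It lets you scrape and parse remote sites through a well-documented API. You can even follow links and submit forms.

Example usage from the documentation: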

use Goutte\Client;

$client = new Client();

Make a request with the request() method:

$crawler = $client->request('GET', 'http://www.symfony-project.org/');

The method returns a Crawler object (Symfony\Component\DomCrawler\Crawler).

Click on a link:

$link = $crawler->selectLink('Plugins')->link();
$crawler = $client->click($link);

Extract data with a CSS selector and output the text:

$nodes = $crawler->filter('.error_list');
if ($nodes->count())
{
  die(sprintf("Authentification error: %s\n", $nodes->text()));
}

printf("Nb tasks: %d\n", $crawler->filter('#nb_tasks')->text());

score 0 · Accepted Answer

You should look into a PHP DOM parser.

Parse the HTML page to get the data you need and save it to your database.

Good luck.
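As a rough illustration of that parse-then-save flow, here is a minimal sketch using PHP's built-in DOMDocument, DOMXPath and PDO. The sample markup, the `race` class, the `data-race` attribute, the `race_results` table and both function names are assumptions made up for this example; the real page structure has to be inspected in a browser first, and the in-memory SQLite database stands in for whatever database you actually use.

```php
<?php
// Hypothetical markup standing in for the downloaded page (not the real site's HTML).
$html = <<<HTML
<div id="results">
  <table class="race" data-race="1">
    <tr><td class="pos">1</td><td class="horse">Fast Lane</td></tr>
    <tr><td class="pos">2</td><td class="horse">Slow Poke</td></tr>
  </table>
  <table class="race" data-race="2">
    <tr><td class="pos">1</td><td class="horse">Night Owl</td></tr>
  </table>
</div>
HTML;

// Turn every row of every race table into an associative array.
function parseRaceResults(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//table[@class="race"]') as $table) {
        $race = (int) $table->getAttribute('data-race');
        foreach ($xpath->query('.//tr', $table) as $tr) {
            $rows[] = [
                'race' => $race,
                'finish_position' => (int) trim($xpath->evaluate('string(td[@class="pos"])', $tr)),
                'horse' => trim($xpath->evaluate('string(td[@class="horse"])', $tr)),
            ];
        }
    }
    return $rows;
}

// Insert the parsed rows with a prepared statement (table name is an assumption).
function saveRaceResults(PDO $pdo, array $rows): void
{
    $pdo->exec('CREATE TABLE IF NOT EXISTS race_results (race INTEGER, finish_position INTEGER, horse TEXT)');
    $stmt = $pdo->prepare(
        'INSERT INTO race_results (race, finish_position, horse) VALUES (:race, :finish_position, :horse)'
    );
    foreach ($rows as $row) {
        $stmt->execute($row);
    }
}

$rows = parseRaceResults($html);

// In-memory SQLite keeps the sketch self-contained; swap the DSN for your own database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
saveRaceResults($pdo, $rows);
```

In a real script `$html` would be the string returned by curl_exec() with CURLOPT_RETURNTRANSFER enabled, and the XPath expressions would need to match the actual page.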

score 0 · Accepted Answer

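cURL returns the HTML source of the page, which is expected.

Go to the actual site and identify the div that displays the results. Then use a PHP DOM parser to extract the data from that specific section, or even plain string extraction (simple but inefficient; not recommended).

Strip the HTML tags from that section and save the data you need.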
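The steps above could be sketched roughly as follows, using only PHP's built-in DOM extension. The page source, the `results` id and the `extractSectionText` helper are hypothetical and exist only for illustration; the real id or class of the results div has to be looked up on the site itself.

```php
<?php
// Hypothetical page source; in practice this is the string returned by curl_exec().
$page = <<<HTML
<html><body>
<div id="header">Menu</div>
<div id="results">
  <h2>Race 1</h2>
  <p>Winner: <b>Fast Lane</b></p>
</div>
<div id="footer">Contact</div>
</body></html>
HTML;

// Return the tag-free text of the div with the given id, or null if it is missing.
// Assumes $id contains no quote characters, since it is placed inside the XPath string.
function extractSectionText(string $html, string $id): ?string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings about sloppy markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $node = $xpath->query('//div[@id="' . $id . '"]')->item(0);
    if ($node === null) {
        return null;
    }

    // Serialize only the matched section, strip its tags, then tidy up whitespace.
    $sectionHtml = $doc->saveHTML($node);
    $text = html_entity_decode(strip_tags($sectionHtml));
    return trim(preg_replace('/\s+/', ' ', $text));
}

$text = extractSectionText($page, 'results');
echo $text, "\n"; // prints "Race 1 Winner: Fast Lane"
```

Selecting the node first and stripping tags afterwards keeps the header and footer out of the result, which is harder to guarantee with substring searches on the raw HTML.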

score 0 · Accepted Answer

Use the PHP Simple HTML DOM parser (http://simplehtmldom.sourceforge.net/) to extract content from the HTML.

answered 2012-06-29T12:49:32.770
