I am using PHP Simple HTML DOM Parser to get product details such as title and price. Here is the code:

<?php 

// Include the library
include('simple_html_dom.php');

// Retrieve the DOM from a given URL
$html = file_get_html('http://www.flipkart.com/mobiles/micromax');

// Find all A tags that have a class of "title"
foreach ($html->find('a.title') as $e) {
    echo 'Title: ' . $e->outertext . '<br>';
    //$html = file_get_html('http://www.flipkart.com/mobiles/micromax/'.$e->outertext);
}

// Find all SPAN tags that have a class of "final-price"
foreach ($html->find('span.final-price') as $e) {
    echo 'Price:' . $e->outertext . '<br>';
}


?>

Result:

Title: Micromax X101 (White) 
Title: Micromax X291 (White) 
Title: Micromax X101 (Yellow) 
Title: Micromax X234+ (Wine Red) 
Title: Micromax Ninja 3 A57 (Black) 
Title: Micromax Ninja 4.0 A87 (Black) 
Title: Micromax Bling Q55 (Pearl White) 
Title: Micromax X222 (Cocoa Brown) 
Title: Micromax X263 (Champagne & Coffee) 
Title: Micromax X650 (Silver White) 
Title: Micromax A73 (Black) 
Title: Micromax X1i XTRA (Black) 
Title: Micromax Superfone Lite A75 (Charcoal Black) 
Title: Micromax X271 (Black & Blue) 
Title: Micromax X50 (Black) 
Title: Micromax Q56 (Baby Pink) 
Title: Micromax X104 (Black) 
Title: Micromax Q22 (Black Green) 
Title: Micromax Aisha A52 (Yellow) 
Title: Micromax A78 (Coffee) 
Price:Rs. 999
Price:Rs. 1910
Price:Rs. 999
Price:Rs. 1190
Price:Rs. 4999
Price:Rs. 6049
Price:Rs. 3130
Price:Rs. 2040
Price:Rs. 1735
Price:Rs. 3350
Price:Rs. 6199
Price:Rs. 1525
Price:Rs. 6299
Price:Rs. 1590
Price:Rs. 4850
Price:Rs. 3999
Price:Rs. 1099
Price:Rs. 1880
Price:Rs. 4699
Price:Rs. 6970

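This works fine, but if you open the page http://www.flipkart.com/mobiles/micromax in a browser, you will see that further products are loaded via AJAX.

So my script only fetches the initially loaded products. I want to get all of them. The page says "Showing 1-20 of 78". How can I get the details of all 78 products?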

1 Answer

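You can read their product count and divide it by 20, since they show 20 products at a time and their AJAX script fetches them in batches of 20. You can then request the same URLs that script uses, which lets you skip simple_html_dom and decode the JSON strings instead: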

http://www.flipkart.com/mobiles/micromax?response-type=json&inf-start=0
http://www.flipkart.com/mobiles/micromax?response-type=json&inf-start=20

and so on.
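
A minimal sketch of that arithmetic in PHP, assuming the page size stays at 20 and the `response-type` and `inf-start` parameters behave as in the two URLs shown. The function names `parseTotal`, `buildPageUrls` and `fetchPage` are made up for this example, and the layout of the returned JSON is not described here, so inspect the decoded array before relying on any key:

```php
<?php

// Extract the total from a text such as "Showing 1-20 of 78".
// Assumes the count follows the word "of"; returns null if it is not found.
function parseTotal(string $text): ?int
{
    if (preg_match('/of\s+(\d+)/i', $text, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Build one URL per batch: inf-start = 0, 20, 40, ... below $total.
function buildPageUrls(string $baseUrl, int $total, int $pageSize = 20): array
{
    $urls  = [];
    $pages = (int) ceil($total / $pageSize);
    for ($page = 0; $page < $pages; $page++) {
        $urls[] = $baseUrl . '?response-type=json&inf-start=' . ($page * $pageSize);
    }
    return $urls;
}

// Fetch one batch and decode it into an associative array.
// Returns null if the request fails or the body is not a JSON object/array.
function fetchPage(string $url): ?array
{
    $json = @file_get_contents($url);
    if ($json === false) {
        return null;
    }
    $data = json_decode($json, true);
    return is_array($data) ? $data : null;
}
```

For 78 products, `buildPageUrls('http://www.flipkart.com/mobiles/micromax', 78)` yields four URLs with `inf-start` set to 0, 20, 40 and 60; pass each one to `fetchPage` and look at the result with `var_dump` to find where the product data sits.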

You just need to check which script is called while the page scrolls. In Google Chrome you can open Developer Tools with F12 and watch the Network tab.

answered 2012-10-01T07:39:06.077