Why not use the Simple HTML DOM parser?
Example:
require_once 'simple_html_dom.php';
$url = "http://rads.stackoverflow.com/amzn/click/B009T9QCWI";
$html = file_get_html( $url );
// file_get_html() returns false when the page cannot be loaded
if ( !$html ) {
    die( "Could not load $url" );
}
// all results stored in this array
$result = array();
// page title
$result[ 'title' ] = $html->find( 'title', 0 )->plaintext;
// get all meta tags, which have an attribute "name"
foreach( $html->find( 'meta[name]' ) as $meta ) {
    $result[ 'meta' ][] = array(
        'name' => $meta->name,
        'content' => $meta->content
    );
}
// get all images
foreach( $html->find( 'img' ) as $image ) {
    $result[ 'image' ][] = $image->src;
}
print_r( $result );
Output:
Array
(
[title] => Amazon.com: Samsung Galaxy S III, Black 16GB (Verizon Wireless): Cell Phones & Accessories
[meta] => Array
(
[0] => Array
(
[name] => description
[content] => Shop cell phones and accessories at Amazon.com. You'll find great prices on cases, headsets, and the latest smartphones from carriers like Verizon, AT&T, and Sprint
)
[1] => Array
(
[name] => title
[content] => Amazon.com: Samsung Galaxy S III, Black 16GB (Verizon Wireless): Cell Phones & Accessories
)
[2] => Array
(
[name] => keywords
[content] => Samsung Galaxy S III, Black 16GB (Verizon Wireless),Samsung,Galaxy S III
)
)
[image] => Array
(
[0] => http://g-ecx.images-amazon.com/images/G/01/gno/beacon/BeaconSprite-US-01._V397411194_.png
[1] => http://g-ecx.images-amazon.com/images/G/01/x-locale/common/transparent-pixel._V386942464_.gif
[2] => http://g-ecx.images-amazon.com/images/G/01/x-locale/common/transparent-pixel._V386942464_.gif
[3] => http://ecx.images-amazon.com/images/I/41%2Bh%2BUmrcRL._SY300_.jpg
[4] => http://ecx.images-amazon.com/images/I/41%2Bh%2BUmrcRL._SL500_AA280_.jpg
[5] => http://g-ecx.images-amazon.com/images/G/01/icons/icon-offsite-sl-7069-t4._V171196157_.png
[6] => http://g-ecx.images-amazon.com/images/G/01/icons/icon-offsite-sl-7069-t4._V171196157_.png
[7] => http://ecx.images-amazon.com/images/I/41FBSaIC4AL._SL500_SS100_.jpg
[8] => http://ecx.images-amazon.com/images/I/41HGvd6-jwL._SL500_SS100_.jpg
[9] => http://ecx.images-amazon.com/images/I/51jiU%2BiYWUL._SL500_SS100_.jpg
[10] => http://ecx.images-amazon.com/images/I/317JogSYmkL._SL500_SS100_.jpg
[11] => http://ecx.images-amazon.com/images/I/41d6B11BDuL._SL500_SS100_.jpg
[12] => http://ecx.images-amazon.com/images/I/41a94BWHXbL._SL500_SS100_.jpg
[13] => http://g-ecx.images-amazon.com/images/G/01/wireless/detail-page/B009T9QCWI.main_SM.jpg
[14] => http://g-ecx.images-amazon.com/images/G/01/wireless/detail-page/B009T9QCWI.pt01_SM.jpg
[15] => http://g-ecx.images-amazon.com/images/G/01/wireless/detail-page/wireless-box-logo-verizon-box.jpg
[16] => http://g-ecx.images-amazon.com/images/G/01/th/aplus/a-plus_bottom-217._V180545591_.gif
[17] => http://g-ecx.images-amazon.com/images/G/01/wireless/detail-page/B009T9QCWI.pt02_SM.jpg
[18] => http://g-ecx.images-amazon.com/images/G/01/wireless/detail-page/amazon_app_suite_1_sma.jpg
[19] => http://g-ecx.images-amazon.com/images/G/01/wireless/detail-page/amazon_app_suite_5_sm.jpg
[20] => http://ecx.images-amazon.com/images/I/41HGvd6-jwL._SL75_SS50_.jpg
[21] => http://ecx.images-amazon.com/images/I/41FBSaIC4AL._SL75_SS50_.jpg
[22] => http://ecx.images-amazon.com/images/I/51jiU%2BiYWUL._SL75_SS50_.jpg
[23] => http://ecx.images-amazon.com/images/I/41a94BWHXbL._SL75_SS50_.jpg
[24] => http://g-ecx.images-amazon.com/images/G/01/x-locale/communities/reputation/suggestionbox._V192249929_.gif
[25] => http://g-ecx.images-amazon.com/images/G/01/icons/orange-arrow._V192570247_.gif
[26] => http://g-ecx.images-amazon.com/images/G/01/icons/orange-arrow._V192570247_.gif
[27] => http://g-ecx.images-amazon.com/images/G/01/icons/orange-arrow._V192570247_.gif
[28] => http://g-ecx.images-amazon.com/images/G/01/gno/images/general/navAmazonLogoFooter._V169459313_.gif
[29] => /gp/uedata/unsticky/182-7026578-6696341//ntpoffrw?noscript&id=158FKQCX6TYATFBQQW0V
)
)
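You can pass your URLs in a loop and do the same for each of them. For simplicity, I have left out the checks you were doing on the images and meta tags. Hope it helps.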
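As a rough sketch of the loop mentioned above, the extraction could be wrapped in a function and called once per URL. The helper names `extract_page_info` and `scrape_urls` are made up for illustration and are not part of Simple HTML DOM; only `file_get_html()`, `find()`, `plaintext` and `clear()` come from the library. The sketch assumes `simple_html_dom.php` sits next to the script.

```php
<?php
require_once 'simple_html_dom.php';

// Hypothetical helper: collects the title, named meta tags and image
// sources from an already-parsed Simple HTML DOM document.
function extract_page_info( $html ) {
    $result = array( 'title' => null, 'meta' => array(), 'image' => array() );

    // find() with an index returns null when nothing matches
    $title = $html->find( 'title', 0 );
    if ( $title ) {
        $result[ 'title' ] = trim( $title->plaintext );
    }

    // meta tags that have a "name" attribute
    foreach ( $html->find( 'meta[name]' ) as $meta ) {
        $result[ 'meta' ][] = array(
            'name'    => $meta->name,
            'content' => $meta->content
        );
    }

    // all image sources
    foreach ( $html->find( 'img' ) as $image ) {
        $result[ 'image' ][] = $image->src;
    }

    return $result;
}

// Hypothetical helper: fetches every URL and keys the results by URL.
// A URL that cannot be loaded gets null instead of a result array.
function scrape_urls( array $urls ) {
    $results = array();
    foreach ( $urls as $url ) {
        $html = file_get_html( $url );
        if ( !$html ) {
            $results[ $url ] = null;
            continue;
        }
        $results[ $url ] = extract_page_info( $html );
        $html->clear(); // release the parsed DOM before the next page
        unset( $html );
    }
    return $results;
}
```

Calling `print_r( scrape_urls( array( $url1, $url2 ) ) );` with your own list of URLs would then print one block like the output above per page. Any checks on the images or meta tags would go inside `extract_page_info`.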