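I've spent some time trying to solve this, and this is what I have so far. Basically, I'm trying to extract images from an RSS feed. I'm using MagpieRSS to parse the feed, as shown below. This snippet lives inside a class: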
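/*
 * collect the src url of every <img> tag found in $str
 */
function getImagesUrl($str) {
    $a = array();
    $pos = 0;
    // search for "<img" rather than "img", so that closing </img> tags and
    // plain text containing "img" are not mistaken for image tags
    while (($pos = stripos($str, "<img", $pos)) !== FALSE) {
        $topos = strpos($str, ">", $pos);
        if ($topos === FALSE) {
            break; // unterminated tag: stop instead of looping forever
        }
        $imagetag = substr($str, $pos, ($topos - $pos));
        $url = $this->getImageUrl($imagetag);
        if (!empty($url)) {
            array_push($a, $url); // keep only tags that yielded a url
        }
        $pos = $topos;
    }
    return $a;
}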
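/*
 * get the full url inside the src attribute of an <img> tag
 */
function getImageUrl($image) {
    $p = strpos($image, 'src="');
    if ($p === FALSE) {
        return ''; // no double-quoted src attribute in this tag
    }
    $p += 5; // skip past src="
    // look for the closing quote itself; requiring '" ' (quote plus space)
    // fails when src is the last attribute in the tag
    $tp = strpos($image, '"', $p);
    if ($tp === FALSE) {
        return '';
    }
    return substr($image, $p, ($tp - $p));
}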
Using the functions above, I call them like this. So far, this prints the data I paste further down:
@$rss = fetch_rss($rsso->url);
if (@$rss)
{
    $items = $rss->items;
    foreach ($items as $item)
    {
        if (isset($item['title']) && isset($item['description']))
        {
            $hash = md5($this->es($item['title']).$this->es($item['description']));
            $content = $item['content'];
            foreach ($content as $c) {
                // get the images in the content
                $arr = $this->getImagesUrl($c);
                print_r($arr);
            }
        }
    }
}
Here is a sample of the output:
Array
(
    [0] => http://api.tweetmeme.com/imagebutton.gif?url=http://mashable.com/2010/09/25/trailmeme/
    [1] => http://cdn.mashable.com/wp-content/plugins/wp-digg-this/i/gbuzz-feed.png
    [2] => http://mashable.com/wp-content/plugins/wp-digg-this/i/fb.jpg
    [3] => http://mashable.com/wp-content/plugins/wp-digg-this/i/diggme.png
    [4] => http://ec.mashable.com/wp-content/uploads/2009/01/bizspark2.gif
    [5] => http://cdn.mashable.com/wp-content/uploads/2010/09/web.png
    [6] => http://mashable.com/wp-content/uploads/2010/09/Screen-shot-2010-09-24-at-10.51.26-PM.png
    [7] => http://cdn.mashable.com/wp-content/uploads/2009/02/bizspark.jpg
    [8] => http://feedads.g.doubleclick.net/~at/lxx00QTjYBaYojpnpnTa6MXUmh4/0/di
    [9] =>
    [10] => http://feedads.g.doubleclick.net/~at/lxx00QTjYBaYojpnpnTa6MXUmh4/1/di
    [11] =>
    [12] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:D7DqB2pKExk
    [13] =>
    [14] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:V_sGLiPBpWU
    [15] =>
    [16] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:F7zBnMyn0Lo
    [17] =>
    [18] => http://feeds.feedburner.com/~ff/Mashable?d=qj6IDK7rITs
    [19] =>
    [20] => http://feeds.feedburner.com/~ff/Mashable?d=_e0tkf89iUM
    [21] =>
    [22] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:gIN9vFwOqvQ
    [23] =>
    [24] => http://feeds.feedburner.com/~ff/Mashable?d=yIl2AUoC8zA
    [25] =>
    [26] => http://feeds.feedburner.com/~ff/Mashable?d=P0ZAIrC63Ok
    [27] =>
    [28] => http://feeds.feedburner.com/~ff/Mashable?d=I9og5sOYxJI
    [29] =>
    [30] => http://feeds.feedburner.com/~ff/Mashable?d=CC-BsrAYo0A
    [31] =>
    [32] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:_cyp7NeR2Rw
    [33] =>
    [34] => http://feeds.feedburner.com/~r/Mashable/~4/0N_mvMwPHYk
)
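Is there a way to filter out just the proper image URLs? For example, I want to drop any URL that doesn't end in an image extension such as jpg, png, or gif. Secondly, I want to discard URLs containing words like bizspark, digg, facebook, tweet, twitter, and so on. Has anyone found a simpler way to do this? Please help.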
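For what it's worth, the two filtering rules described above could be sketched roughly as follows. This is only a minimal illustration, not anything MagpieRSS provides: `filterImageUrls()` and its parameters are hypothetical names, and the sample URLs are copied from the output above. It relies only on PHP's built-in `parse_url()`, `pathinfo()`, `in_array()`, and `stripos()`.

```php
<?php
// Hypothetical helper (not part of MagpieRSS): keep only URLs whose path ends
// in an allowed image extension and that contain none of the blocked words.
function filterImageUrls(array $urls, array $extensions, array $blockedWords) {
    $kept = array();
    foreach ($urls as $url) {
        // Look at the path only, so query strings such as ?url=... are ignored.
        $path = parse_url($url, PHP_URL_PATH);
        if (!is_string($path)) {
            continue; // malformed entry or URL without a path
        }
        $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        if (!in_array($ext, $extensions, true)) {
            continue; // no image extension (empty entries, feed ads, ...)
        }
        foreach ($blockedWords as $word) {
            if (stripos($url, $word) !== false) {
                continue 2; // blocked word found: skip this URL entirely
            }
        }
        $kept[] = $url;
    }
    return $kept;
}

// A few entries taken from the sample output, plus one empty entry.
$urls = array(
    'http://api.tweetmeme.com/imagebutton.gif?url=http://mashable.com/2010/09/25/trailmeme/',
    'http://mashable.com/wp-content/plugins/wp-digg-this/i/fb.jpg',
    'http://cdn.mashable.com/wp-content/uploads/2010/09/web.png',
    'http://feedads.g.doubleclick.net/~at/lxx00QTjYBaYojpnpnTa6MXUmh4/0/di',
    '',
);
$filtered = filterImageUrls(
    $urls,
    array('jpg', 'jpeg', 'png', 'gif'),
    array('bizspark', 'digg', 'facebook', 'tweet', 'twitter')
);
print_r($filtered);
```

With these sample inputs, only the `web.png` URL should remain: the tweetmeme button is dropped by the word `tweet`, `fb.jpg` by `digg` (it sits under `wp-digg-this`), and the feedads and empty entries have no image extension. Matching on substrings is deliberately blunt, so a word list like this can also discard legitimate images whose URL happens to contain one of the words.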