
I have a tricky preg_replace_callback problem, and I am admittedly not great at PCRE expressions.

I am trying to extract all img src values from a string of HTML, save them to an array, and additionally replace each img src path with a local path (not a remote path). For example, I might have, surrounded by a lot of other HTML:

img src='http://www.mysite.com/folder/subfolder/images/myimage.png'

And I would want to extract myimage.png to an array, and additionally change the src to:

src='images/myimage.png'

Can that be done?

Thanks


2 Answers


Do you need to use regular expressions? Handling HTML with the DOM functions is usually easier:

<?php

$domd = new DOMDocument();
libxml_use_internal_errors(true);
$domd->loadHTML(file_get_contents("http://stackoverflow.com"));
libxml_use_internal_errors(false);

$items = $domd->getElementsByTagName("img");
$data = array();

foreach($items as $item) {
  $data[] = array(
    "src" => $item->getAttribute("src"),
    "alt" => $item->getAttribute("alt"),
    "title" => $item->getAttribute("title"),
  );
}

print_r($data);
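
The loop above only collects the attributes. If the src values also need to be rewritten to a local path, as the question asks, the same DOM objects can probably be reused. A rough sketch, assuming a hypothetical $html input string and the images/ target folder from the question; the variable names $files and $localHtml are made up for illustration:

```php
<?php
// Hypothetical input; in practice this would be the HTML string being processed.
$html = "<p>blah<img src='http://www.mysite.com/folder/subfolder/images/myimage.png'>blah</p>";

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings about sloppy HTML quiet
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors(false);

$files = array();
foreach ($doc->getElementsByTagName('img') as $img) {
    // Take only the path part of the URL so a query string does not end up in the file name.
    $path = parse_url($img->getAttribute('src'), PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        continue;   // no usable src on this tag
    }
    $file = basename($path);
    $files[] = $file;                               // e.g. myimage.png
    $img->setAttribute('src', 'images/' . $file);   // point the tag at the local copy
}

$localHtml = $doc->saveHTML();
print_r($files);
```

Note that saveHTML() serializes the whole document, so a fragment comes back wrapped in a doctype and html/body tags, and attribute values are written with double quotes.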
answered 2011-03-29T14:24:49.003

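Do you need regex for this? Not necessarily. Is regex the most readable solution? Probably not, at least unless you are fluent in regex. Is regex more efficient when scanning large amounts of data? Absolutely: a regex is compiled and cached the first time it appears. Does regex win the "fewest lines of code" trophy?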

$string = <<<EOS
<html>
<body>
blahblah<br>
<img src='http://www.mysite.com/folder/subfolder/images/myimage.png'>blah<br>
blah<img src='http://www.mysite.com/folder/subfolder/images/another.png' />blah<br>
</body>
</html>
EOS;

preg_match_all("%<img .*?src=['\"](.*?)['\"]%s", $string, $matches);
$images = array_map(function ($element) { return preg_replace("%^.*/(.*)$%", 'images/$1', $element); }, $matches[1]);

print_r($images);

Two lines of code, which is hard to undercut in PHP. It results in the following $images array:

Array
(
  [0] => images/myimage.png
  [1] => images/another.png
)

Please note that this will not work with PHP versions prior to 5.3 unless you replace the anonymous function with a proper named function.
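
The two lines above build the array but leave the HTML itself untouched, while the question also asks for the src attributes to be rewritten. A minimal sketch of doing both in one pass with preg_replace_callback, assuming every image should end up under images/ as in the question; $files and $localHtml are hypothetical names, and the pattern is only meant for simple, well-formed img tags:

```php
<?php
$string = "blah<img src='http://www.mysite.com/folder/subfolder/images/myimage.png'>blah<br>"
        . "blah<img src='http://www.mysite.com/folder/subfolder/images/another.png' />blah<br>";

$files = array();

// Group 1: everything from "<img" up to and including "src="
// Group 2: the opening quote (single or double), matched again by \2
// Group 3: the original src value
$localHtml = preg_replace_callback(
    '%(<img\b[^>]*?\ssrc=)([\'"])(.*?)\2%is',
    function ($m) use (&$files) {
        $file = basename($m[3]);     // e.g. myimage.png
        $files[] = $file;            // collect the file name
        // Rebuild the attribute with the local path, keeping the original quote style.
        return $m[1] . $m[2] . 'images/' . $file . $m[2];
    },
    $string
);

print_r($files);
echo $localHtml;
```

The closure with use (&$files) carries the same PHP 5.3 requirement; on older versions a named callback that appends to a global or static array would be needed instead. Unlike the DOM approach, basename() here keeps any query string that follows the file name.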

answered 2011-03-29T14:40:41.573