I am trying to strip the HTML for images out of a block of content whenever the image path is within a specific directory.

Take for example this block of content:

Donec iaculis <img src="http://www.domain.tld/smilies/butterfly.gif" alt="butterfly.gif" /> arcu pretium elementum et posuere felis. <img alt="mrgreen.gif" src="http://www.domain.tld/smilies/mrgreen.gif" /> Duis sit amet erat vitae tellus eleifend varius. <img src="http://www.domain.tld/avatars/somedude.jpg" /> Pellentesque ac ligula

What I am after is:

Donec iaculis arcu pretium elementum et posuere felis. Duis sit amet erat vitae tellus eleifend varius. <img src="http://www.domain.tld/avatars/somedude.jpg" /> Pellentesque ac ligula

In this example I need to remove the two images whose paths contain /smilies/ and keep the one img that lives under the /avatars/ path.

Note that the alt attribute appears in a different position in each of the two img tags that should match.

2 Answers

An example using DOM:

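$doc = new DOMDocument();
@$doc->loadHTML($yourHTML); // @ silences warnings from loosely formed markup
foreach ($doc->getElementsByTagName('img') as $imgNode) {
    // strpos() takes the haystack first, then the needle; compare against
    // false explicitly, because a match at offset 0 is falsy.
    if (strpos($imgNode->getAttribute('src'), '/smilies/') !== false) {
        $imgNode->parentNode->removeChild($imgNode);
    }
}
$yourHTML = $doc->saveHTML();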
answered 2013-08-10T01:57:45.853
<?php
$html = 'Donec iaculis magna eget <img src="http://www.domain.tld/smilies/butterfly.gif" alt="butterfly.gif" /> arcu pretium elementum et posuere felis. Vivamus eget sodales lorem, id dictum lorem. Nunc vitae facilisis nibh. Integer dignissim, diam non molestie luctus, libero lacus auctor eros, vel hendrerit lorem risus vel elit. Pellentesque ac magna nec lectus tristique blandit. <img src="http://www.domain.tld/smilies/mrgreen.gif" /> Duis sit amet erat vitae tellus eleifend varius. <img src="http://www.domain.tld/avatars/somedude.jpg" /> Pellentesque ac ligula eget lacus dapibus fermentum. Interdum et malesuada fames ac ante ipsum primis in faucibus. Morbi gravida tempor leo eget lacinia. Curabitur interdum diam in congue consequat.';

$baseurl = 'http://www.domain.tld';
$folder = '/smilies/';

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false; // configure the document before loading the markup
$dom->loadHTML($html);
$images = $dom->getElementsByTagName('img');
$removeList = array();
foreach ($images as $domElement) {
    $src = $domElement->getAttribute('src');
    if (strpos($src, $baseurl . $folder) !== false) {
        $removeList[] = $domElement;
    }
}

foreach ($removeList as $toRemove) {
    $toRemove->parentNode->removeChild($toRemove);
}

$html = $dom->saveHTML();

echo $html;

Be aware that you need two separate foreach loops: the DOMNodeList you are iterating is live, so removing a DOMNode from it mid-iteration disturbs the loop and can skip elements. I think this is also the problem with Casimir's answer.
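
For what it's worth, the same idea can be written with a single loop by copying the live list into a plain array before removing anything. This is only a sketch: the function name removeImagesInFolder and its parameters are made up for illustration and are not part of either answer.

```php
<?php
// Hypothetical helper (name and signature are illustrative assumptions).
// Removes every <img> whose src contains $folder and returns the new HTML.
function removeImagesInFolder(string $html, string $folder): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // getElementsByTagName() returns a live DOMNodeList, so take a
    // snapshot as a plain array before removing any nodes.
    $images = iterator_to_array($doc->getElementsByTagName('img'));

    foreach ($images as $img) {
        // Explicit !== false: strpos() returns 0 for a match at the start.
        if (strpos($img->getAttribute('src'), $folder) !== false) {
            $img->parentNode->removeChild($img);
        }
    }

    return $doc->saveHTML();
}
```

Calling removeImagesInFolder($html, '/smilies/') on the sample content should drop both smilies and keep the avatar. Because loadHTML() is called without flags here, saveHTML() returns a whole document (doctype plus html and body wrappers) rather than just the fragment, so you may still need to trim that off.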

answered 2013-08-10T02:00:16.407