php - XML parsing in PHP

Question

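I am parsing an XML feed, but one of its tags contains both an image and text. I want to put the image and the text into separate columns of a table in my layout, but I don't know how to do that. Please help. My PHP file is: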

<?php
$RSS_Content = array();

function RSS_Tags($item, $type)
{
    $y = array();
    $tnl = $item->getElementsByTagName("title");
    $tnl = $tnl->item(0);
    $title = $tnl->firstChild->textContent;

    $tnl = $item->getElementsByTagName("link");
    $tnl = $tnl->item(0);
    $link = $tnl->firstChild->textContent;
    $tnl = $item->getElementsByTagName("description");
    $tnl = $tnl->item(0);
    $img = $tnl->firstChild->textContent;

    $y["title"]  = $title;
    $y["link"] = $link;
    $y["description"] = $img;
    $y["type"] = $type;

    return $y;
}

function RSS_Channel($channel)
{
    global $RSS_Content;

    $items = $channel->getElementsByTagName("item");

    // Processing channel

    $y = RSS_Tags($channel, 0);     // get description of channel, type 0
    array_push($RSS_Content, $y);

    // Processing articles

    foreach($items as $item)
    {
        $y = RSS_Tags($item, 1);    // get description of article, type 1
        array_push($RSS_Content, $y);
    }
}

function RSS_Retrieve($url)
{
    global $RSS_Content;

    $doc  = new DOMDocument();
    $doc->load($url);

    $channels = $doc->getElementsByTagName("channel");

    $RSS_Content = array();

    foreach($channels as $channel)
    {
        RSS_Channel($channel);
    }

}

function RSS_RetrieveLinks($url)
{
    global $RSS_Content;

    $doc  = new DOMDocument();
    $doc->load($url);

    $channels = $doc->getElementsByTagName("channel");

    $RSS_Content = array();

    foreach($channels as $channel)
    {
        $items = $channel->getElementsByTagName("item");
        foreach($items as $item)
        {
            $y = RSS_Tags($item, 1);
            array_push($RSS_Content, $y);
        }
    }

}

function RSS_Links($url, $size = 15)
{
    global $RSS_Content;

    $page = "<ul>";

    RSS_RetrieveLinks($url);
    if($size > 0)
    $recents = array_slice($RSS_Content, 0, $size + 1);

    foreach($recents as $article)
    {
        $type = $article["type"];
        if($type == 0) continue;
        $title = $article["title"];
        $link = $article["link"];
        $img = $article["description"];
        $page .= "<a href=\"#\">$title</a>\n";
    }

    $page .="</ul>\n";

    return $page;

}

function RSS_Display($url, $click, $size = 8, $site = 0, $withdate = 0)
{
    global $RSS_Content;

    $opened = false;
    $page = "";
    $site = (intval($site) == 0) ? 1 : 0;

    RSS_Retrieve($url);
    if($size > 0)
    $recents = array_slice($RSS_Content, $site, $size + 1 - $site);

    foreach($recents as $article)
    {
        $type = $article["type"];
        if($type == 0)
        {
            if($opened == true)
            {
                $page .="</ul>\n";
                $opened = false;
            }
            $page .="<b>";
        }
        else
        {
            if($opened == false)
            {
                $page .= "<table width='369' border='0'>
            <tr>";
                $opened = true;
            }
        }
        $title = $article["title"];
        $link = $article["link"];
        $img = $article["description"];
        $page .= "<td width='125' align='center' valign='middle'>
              <div align='center'>$img</div></td>                    
        <td width='228' align='left' valign='middle'><div align='left'><a 
                  href=\"$click\" target='_top'>$title</a></div></td>";
        if($withdate)
        {
            $date = $article["date"];
            $page .=' <span class="rssdate">'.$date.'</span>';
        }
        if($type==0)
        {
            $page .="<br />";
        }
    }

    if($opened == true)
    {
        $page .="</tr>
                </table>";
    }
    return $page."\n";

}
?>

score 0 · Accepted Answer

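To separate the image from the description, you need to parse the HTML stored inside the description element as XML again. Luckily, the content of that element is valid XML, so you can do this directly with SimpleXML. The following code example loads the URL, reduces each item's *description* to its text only, and extracts the src attribute of the image to store it as an image element, giving items like this: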

<item>
    <title>Fake encounter: BJP backs Kataria, says CBI targeting Modi</title>
    <link>http://ibnlive.in.com/news/fake-encounter-bjp-backs-kataria-says-cbi-targeting-modi/391802-37-64.html</link>
    <description>The BJP lashed out at the CBI and questioned its 'shoddy investigation' into the Sohrabuddin fake encounter case.</description>
    <pubDate>Wed, 15 May 2013 13:48:56 +0530</pubDate>
    <guid>http://ibnlive.in.com/news/fake-encounter-bjp-backs-kataria-says-cbi-targeting-modi/391802-37-64.html</guid>
    <image>http://static.ibnlive.in.com/ibnlive/pix/sitepix/05_2013/bjplive_kataria3.jpg</image>
</item>

The code example is:

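$url  = 'http://ibnlive.in.com/ibnrss/top.xml';
$feed = simplexml_load_file($url);

$items = $feed->xpath('(//channel/item)');

foreach ($items as $item) {
    // Wrap the description's HTML in <r> and parse it as XML. The XPath union
    // selects <r> itself (its text) and the src attribute of the image.
    list($description, $image) =
        simplexml_load_string("<r>$item->description</r>")
            ->xpath('(/r|/r//@src)');
    $item->description = (string)$description;  // text only
    $item->image       = (string)$image;        // new <image> child with the URL
}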
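To try the same idea without fetching the live feed, here is a minimal sketch that runs against an inline sample. The feed fragment, its URLs and the shape of the description (an escaped img tag followed by text) are assumptions made for illustration, not data taken from the real feed. It is also a slightly more defensive variant than the code above: it skips items whose description is not well-formed XML and tolerates a missing src attribute.

```php
<?php
// Hypothetical feed fragment (an assumption): the description carries
// escaped HTML, an <img/> tag followed by the summary text.
$xml = <<<XML
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Example headline</title>
      <link>http://example.com/story.html</link>
      <description>&lt;img src="http://example.com/pic.jpg" /&gt;Example summary text.</description>
    </item>
  </channel>
</rss>
XML;

libxml_use_internal_errors(true); // collect parse errors instead of emitting warnings
$feed = simplexml_load_string($xml);

foreach ($feed->xpath('//channel/item') as $item) {
    // Re-parse the description's HTML as XML inside a wrapper element <r>.
    $fragment = simplexml_load_string("<r>$item->description</r>");
    if ($fragment === false) {
        continue; // not well-formed XML: leave this item untouched
    }
    $src = $fragment->xpath('//@src');
    $item->image       = $src ? (string)$src[0] : '';  // first src attribute, if any
    $item->description = trim((string)$fragment);      // direct text of <r> only
}

echo $feed->channel->item->image, "\n";
echo $feed->channel->item->description, "\n";
```

After the loop, each item carries the image URL and the plain text as two separate children, which is what a two-column layout needs.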

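You can then import the SimpleXML element into a DOMElement with dom_import_simplexml(). Honestly, though, I would just move that little bit of HTML generation into a foreach over the SimpleXML items as well: you can use LimitIterator for paging just as you could with DOMDocument, and the data you need is readily at hand in SimpleXML. It is simply easier to pass the XML elements along as SimpleXMLElement objects than to parse them into an array first and then process the array. That extra step is pointless.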
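As a rough sketch of that suggestion, the following builds the table directly from the SimpleXML items and uses LimitIterator for the paging. The function name renderItems, the sample feed and the table markup are hypothetical, chosen only for illustration. The sketch assumes each item already has separate image and description children, as produced by the code example above.

```php
<?php
// Hypothetical helper (an assumption): render up to $size items as table rows,
// image in the first column, linked title and text in the second.
function renderItems(SimpleXMLElement $feed, int $size = 8): string
{
    // xpath() returns a plain array, so wrap it in an iterator for LimitIterator.
    $items = new ArrayIterator($feed->xpath('//channel/item'));
    $page  = new LimitIterator($items, 0, $size);

    $html = "<table>\n";
    foreach ($page as $item) {
        $html .= sprintf(
            "<tr><td><img src=\"%s\" alt=\"\"></td><td><a href=\"%s\">%s</a><br>%s</td></tr>\n",
            htmlspecialchars((string)$item->image),
            htmlspecialchars((string)$item->link),
            htmlspecialchars((string)$item->title),
            htmlspecialchars((string)$item->description)
        );
    }
    return $html . "</table>\n";
}

// Made-up sample feed with three items that are already split.
$feed = simplexml_load_string(
    '<rss><channel>'
    . '<item><title>A</title><link>http://example.com/a</link>'
    . '<description>First</description><image>http://example.com/a.jpg</image></item>'
    . '<item><title>B</title><link>http://example.com/b</link>'
    . '<description>Second</description><image>http://example.com/b.jpg</image></item>'
    . '<item><title>C</title><link>http://example.com/c</link>'
    . '<description>Third</description><image>http://example.com/c.jpg</image></item>'
    . '</channel></rss>'
);

$html = renderItems($feed, 2); // only the first two items are rendered
echo $html;
```

Changing the second argument of LimitIterator from 0 to another offset would give the next page of items, with no intermediate array of titles and links to maintain.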
