php - Multiple nodes with the same name in XML

Question

I have the XML file below:

 <item> 
  <title>Troggs singer Reg Presley dies at 71</title>  
  <description>Reg Presley, the lead singer of British rock band The Troggs, whose hits in the 1960s included Wild Thing, has died aged 71.</description>  
  <link>http://www.bbc.co.uk/news/uk-21332048#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
  <guid isPermaLink="false">http://www.bbc.co.uk/news/uk-21332048</guid>  
  <pubDate>Tue, 05 Feb 2013 01:13:07 GMT</pubDate>  
  <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701366_65701359.jpg"/>  
  <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701387_65701359.jpg"/> 
</item>  
<item> 
  <title>Horsemeat found at Newry cold store</title>  
  <description>Horse DNA has been found in frozen meat in a cold store in Northern Ireland, as Irish police investigate a third case of contamination.</description>  
  <link>http://www.bbc.co.uk/news/world-europe-21331208#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
  <guid isPermaLink="false">http://www.bbc.co.uk/news/world-europe-21331208</guid>  
  <pubDate>Mon, 04 Feb 2013 23:47:38 GMT</pubDate>  
  <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65700000/jpg/_65700000_002950295-1.jpg"/>  
  <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65700000/jpg/_65700001_002950295-1.jpg"/> 
</item>  
<item> 
  <title>US 'will sue' Standard &amp; Poor's</title>  
  <description>Standard &amp; Poor's says it is to be sued by the US government over the credit ratings agency's assessment of mortgage bonds before the financial crisis.</description>  
  <link>http://www.bbc.co.uk/news/21331018#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
  <guid isPermaLink="false">http://www.bbc.co.uk/news/21331018</guid>  
  <pubDate>Mon, 04 Feb 2013 22:45:52 GMT</pubDate>  
  <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701717_mediaitem65699884.jpg"/>  
  <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701718_mediaitem65699884.jpg"/> 
   </item>

Now, when I pass "item" as the input node to retrieve the data, it shows only the last item node instead of all the item nodes.

My code is:

    $dom->load($url);
    $link = $dom->getElementsByTagName($tag_name);
    $value = array();

    for ($i = 0; $i < $link->length; $i++) {
        $childnode['name'] = $link->item($i)->nodeName;
        $childnode['value'] = $link->item($i)->nodeValue;
        $value[$childnode['name']] = $childnode['value'];
    }

Here, $url is the URL of my XML page and $tag_name is the name of the node, which in this case is "item".

The output I get is:

  US 'will sue' Standard &amp; Poor's.Standard &amp; Poor's says it is to be sued by the US government over the credit ratings agency's assessment of mortgage bonds before the financial crisis.http://www.bbc.co.uk/news/21331018#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa.http://www.bbc.co.uk/news/world-europe-21331208.Mon, 04 Feb 2013 22:45:52 GMT

This is the data of the last tag. I want the data of all the item tags, and I want it in this format:

title :-  US 'will sue' Standard &amp; Poor's
description :- Standard &amp; Poor's says it is to be sued by the US government over 
the credit ratings agency's assessment of mortgage bonds before the financial crisis

I would also like the names of the child nodes (if there are any) in my output. Please help.

score 2 · Accepted Answer

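(Don't forget the root node.) It looks like nodeValue simply concatenates all the text nodes under the element (roughly equivalent to xsl:value-of select="."). I have never done much with PHP's DOMDocument and its related classes, but one thing you can do is canonicalize each DOMNode with the C14N() method and then parse the resulting string. It isn't pretty, but it gets the result you want and is easy to extend: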

    $tag_name = 'item';
    $link = $dom->getElementsByTagName($tag_name);
    for ($i = 0; $i < $link->length; $i++) {
        // Serialize the <item> subtree to a canonical XML string.
        // This relies on each child element sitting on its own line in the source.
        $treeAsString = $link->item($i)->C14N();
        $curBranchParts = explode("\n", $treeAsString);
        $curBranchPartsSize = count($curBranchParts);
        // Skip the first and last lines (the <item> open and close tags)
        for ($j = 1; $j < ($curBranchPartsSize - 1); $j++) {
            $curItem = $curBranchParts[$j];
            $curItemParts = explode('<', $curItem);
            if (!isset($curItemParts[1])) continue; // line without a tag
            $tagWithContent = $curItemParts[1];
            $tagWithContentParts = explode('>', $tagWithContent);
            $tag = $tagWithContentParts[0];
            $content = $tagWithContentParts[1];

            if (trim($content) != '') echo $tag . ' :- ' . $content . '<br />';
            else echo $tag . '<br />';
        }
    }
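The same "name :- value" listing can also be produced by walking the child nodes directly instead of parsing the C14N() string. The following is only a minimal sketch: the inline sample feed, its `<rss>`/`<channel>` wrapper, the namespace URI bound to the `media` prefix, and the `$lines` variable are all assumptions made for illustration.

```php
<?php
// Hypothetical sample data: a cut-down feed with a made-up <rss>/<channel> wrapper.
$xml = <<<XML
<rss xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Troggs singer Reg Presley dies at 71</title>
      <pubDate>Tue, 05 Feb 2013 01:13:07 GMT</pubDate>
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701366_65701359.jpg"/>
    </item>
    <item>
      <title>Horsemeat found at Newry cold store</title>
      <pubDate>Mon, 04 Feb 2013 23:47:38 GMT</pubDate>
    </item>
  </channel>
</rss>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$lines = array();
foreach ($dom->getElementsByTagName('item') as $item) {
    foreach ($item->childNodes as $child) {
        // Skip the whitespace-only text nodes between the elements.
        if ($child->nodeType !== XML_ELEMENT_NODE) {
            continue;
        }
        $text = trim($child->nodeValue);
        // Elements without text (such as media:thumbnail) print their name only.
        $lines[] = ($text !== '') ? $child->nodeName . ' :- ' . $text : $child->nodeName;
    }
}

echo implode("\n", $lines), "\n";
```

Because this works on nodes rather than on lines of text, it does not depend on how the source XML happens to be indented.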

score 2 · Accepted Answer

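You appear to be looping only over the "item" nodes and, as others have mentioned, overwriting the previous value on every iteration.

If you debug the $value array with print_r($value) inside the loop: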

$dom->load($url);
$link = $dom->getElementsByTagName($tag_name);
$value = array();

for ($i = 0; $i < $link->length; $i++) {
    $childnode['name'] = $link->item($i)->nodeName;
    $childnode['value'] = $link->item($i)->nodeValue;
    $value[$childnode['name']] = $childnode['value'];

    echo 'iteration: ' . $i . '<br />';
    echo '<pre>'; print_r($value); echo '</pre>';
}

you will probably see something like this:

// iteration: 0
Array
(
    [item] => Troggs singer Reg Presley dies at 71 ......
)

// iteration: 1
Array
(
    [item] => Horsemeat found at Newry cold store .........
)

// iteration: 2
Array
(
    [item] => US 'will sue' Standard & Poor's .........
)
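The overwrite can be reproduced without any XML at all. This is a small sketch with made-up data (`$names`, `$texts`, `$overwritten`, and `$collected` are hypothetical names): writing to the same key three times leaves one entry, while appending with `[]` keeps all three.

```php
<?php
// Hypothetical stand-ins for the nodeName / nodeValue of three <item> nodes.
$names = array('item', 'item', 'item');
$texts = array('first', 'second', 'third');

$overwritten = array();
$collected = array();

foreach ($names as $i => $name) {
    // Same key on every pass, so each assignment replaces the previous one.
    $overwritten[$name] = $texts[$i];
    // Appending creates a new numeric index on every pass.
    $collected[] = array('name' => $name, 'value' => $texts[$i]);
}

print_r($overwritten);
print_r($collected);
```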

What you should do is:

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->load($url);
$items = $dom->getElementsByTagName($tag_name);
$values = array();

foreach ($items as $item) {
    $itemProperties = array();

    // Loop through the 'sub' items 
    foreach ($item->childNodes as $child) {
        // Note: using 'localName' to remove the namespace
        if (isset($itemProperties[(string) $child->localName])) {
            // Quickfix to support multiple 'thumbnails' per item (although they have no content)
            $itemProperties[$child->localName] = (array) $itemProperties[$child->localName];
            $itemProperties[$child->localName][] = $child->nodeValue;
        } else {
            $itemProperties[$child->localName] = $child->nodeValue;
        }
    }

    // Append the item to the 'values' array
    $values[] = $itemProperties;

}


// Output the result
echo '<pre>'; print_r($values); echo '</pre>';

Which outputs:

Array
(
    [0] => Array
        (
            [title] => Troggs singer Reg Presley dies at 71
            [description] => Reg Presley, the lead singer of British rock band The Troggs, whose hits in the 1960s included Wild Thing, has died aged 71.
            [link] => http://www.bbc.co.uk/news/uk-21332048#sa-ns_mchannel=rss&ns_source=PublicRSS20-sa
            [guid] => http://www.bbc.co.uk/news/uk-21332048
            [pubDate] => Tue, 05 Feb 2013 01:13:07 GMT
            [thumbnail] => Array
                (
                    [0] => 
                    [1] => 
                )

        )

    [1] => Array
        (
            [title] => Horsemeat found at Newry cold store
            [description] => Horse DNA has been found in frozen meat in a cold store in Northern Ireland, as Irish police investigate a third case of contamination.
            [link] => http://www.bbc.co.uk/news/world-europe-21331208#sa-ns_mchannel=rss&ns_source=PublicRSS20-sa
            [guid] => http://www.bbc.co.uk/news/world-europe-21331208
            [pubDate] => Mon, 04 Feb 2013 23:47:38 GMT
            [thumbnail] => Array
                (
                    [0] => 
                    [1] => 
                )

        )

    [2] => Array
        (
            [title] => US 'will sue' Standard & Poor's
            [description] => Standard & Poor's says it is to be sued by the US government over the credit ratings agency's assessment of mortgage bonds before the financial crisis.
            [link] => http://www.bbc.co.uk/news/21331018#sa-ns_mchannel=rss&ns_source=PublicRSS20-sa
            [guid] => http://www.bbc.co.uk/news/21331018
            [pubDate] => Mon, 04 Feb 2013 22:45:52 GMT
            [thumbnail] => Array
                (
                    [0] => 
                    [1] => 
                )

        )

)
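The thumbnail entries are empty because each media:thumbnail element carries its data in attributes rather than in text, so nodeValue has nothing to return. A possible way to read them is getAttribute(). The sketch below is an illustration under assumptions: the inline sample, its made-up `<rss>` root, and the namespace URI bound to the `media` prefix are not taken from the question, so check the xmlns:media declaration in the actual feed.

```php
<?php
// Hypothetical sample: one item inside a made-up root that declares the media prefix.
$xml = <<<XML
<rss xmlns:media="http://search.yahoo.com/mrss/">
  <item>
    <title>Troggs singer Reg Presley dies at 71</title>
    <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701366_65701359.jpg"/>
    <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701387_65701359.jpg"/>
  </item>
</rss>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$thumbnails = array();
// Look the elements up by namespace URI and local name rather than by prefix.
foreach ($dom->getElementsByTagNameNS('http://search.yahoo.com/mrss/', 'thumbnail') as $thumb) {
    $thumbnails[] = array(
        'width'  => (int) $thumb->getAttribute('width'),
        'height' => (int) $thumb->getAttribute('height'),
        'url'    => $thumb->getAttribute('url'),
    );
}

print_r($thumbnails);
```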

score 1 · Accepted Answer

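Your problem is that your source XML needs a root node (you can call it whatever you like). To be well-formed, XML always needs a root node: every well-formed XML document has exactly one element with no parent and no sibling elements. Once you have a root node, your XML will load into your object.

For example: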

<root>
    <item> 
      <title>Troggs singer Reg Presley dies at 71</title>  
      <description>Reg Presley, the lead singer of British rock band The Troggs, whose hits in the 1960s included Wild Thing, has died aged 71.</description>  
      <link>http://www.bbc.co.uk/news/uk-21332048#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
      <guid isPermaLink="false">http://www.bbc.co.uk/news/uk-21332048</guid>  
      <pubDate>Tue, 05 Feb 2013 01:13:07 GMT</pubDate>  
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701366_65701359.jpg"/>  
      <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701387_65701359.jpg"/> 
    </item>  
    <item> 
      <title>Horsemeat found at Newry cold store</title>  
      <description>Horse DNA has been found in frozen meat in a cold store in Northern Ireland, as Irish police investigate a third case of contamination.</description>  
      <link>http://www.bbc.co.uk/news/world-europe-21331208#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
      <guid isPermaLink="false">http://www.bbc.co.uk/news/world-europe-21331208</guid>  
      <pubDate>Mon, 04 Feb 2013 23:47:38 GMT</pubDate>  
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65700000/jpg/_65700000_002950295-1.jpg"/>  
      <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65700000/jpg/_65700001_002950295-1.jpg"/> 
    </item>  
    <item> 
      <title>US 'will sue' Standard &amp; Poor's</title>  
      <description>Standard &amp; Poor's says it is to be sued by the US government over the credit ratings agency's assessment of mortgage bonds before the financial crisis.</description>  
      <link>http://www.bbc.co.uk/news/21331018#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
      <guid isPermaLink="false">http://www.bbc.co.uk/news/21331018</guid>  
      <pubDate>Mon, 04 Feb 2013 22:45:52 GMT</pubDate>  
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701717_mediaitem65699884.jpg"/>  
      <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701718_mediaitem65699884.jpg"/> 
    </item>
</root>
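The effect of the missing root can be checked with a short experiment. This is a sketch using a made-up two-item fragment (`$fragment` and the other variable names are hypothetical): loadXML() reports failure for the bare fragment and succeeds once the same content is wrapped in a single root element.

```php
<?php
// Collect parse errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical fragment with two top-level elements and no root.
$fragment = '<item><title>One</title></item><item><title>Two</title></item>';

// Not well-formed as a document, so loading fails.
$withoutRoot = new DOMDocument();
$loadedWithoutRoot = $withoutRoot->loadXML($fragment);
libxml_clear_errors();

// The same content inside a single root element loads.
$withRoot = new DOMDocument();
$loadedWithRoot = $withRoot->loadXML('<root>' . $fragment . '</root>');
$itemCount = $withRoot->getElementsByTagName('item')->length;

var_dump($loadedWithoutRoot, $loadedWithRoot, $itemCount);
```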

score 0 · Accepted Answer

I think the problem is in this code:

    for ($i = 0; $i < $link->length; $i++) {
        $childnode['name'] = $link->item($i)->nodeName;
        $childnode['value'] = $link->item($i)->nodeValue;
        $value[$childnode['name']] = $childnode['value'];
    }

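On every pass of the for loop, $childnode['name'] and $childnode['value'] are assigned new values, so after the last pass (when $i is $link->length - 1) only the final values remain in the $childnode array. To solve the problem, use a multidimensional array, like this: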

for ($i = 0; $i < $link->length; $i++) {
    // Index every entry by $i so earlier items are not overwritten
    $childnode['name'][$i] = $link->item($i)->nodeName;
    $childnode['value'][$i] = $link->item($i)->nodeValue;
    $value[$childnode['name'][$i]][$i] = $childnode['value'][$i];
}

To test it: print_r($childnode);
