I am trying to parse an RSS feed from a link. Here is my code:

            $content = file_get_contents($this->feed);
            $rss = new SimpleXMLElement($content);
            $rss_split = array();
           /* foreach ($rss->channel->item as $item) {
                $title = (string) $item->title; // Title
                $link = (string) $item->link; // URL link
                $description = (string) $item->description; // Description
                $rss_split[] = '<div><a href="' . $link . '" target="_blank" title="" >' . $title . ' </a><hr></div>';
            } */

The full XML is downloaded from here: http://devilsworkshop.org/feed/


    <title>Windows 8 Appstore resembles a ghost town</title>
    <pubDate>Tue, 18 Sep 2012 05:30:22 +0000</pubDate>
    <category><![CDATA[Windows 8]]></category>

    <guid isPermaLink="false">http://devilsworkshop.org/?p=62284</guid>
    <description><![CDATA[<p>Microsoft is all set to release Windows 8 for public in the coming weeks. Apparently, the biggest change in Windows 8 seems to be the Metro UI (I know it&#8217;s no more called Metro, but let&#8217;s keep it like that [...]</p><p>--
            This Post <a href="http://devilsworkshop.org/windows-appstore-resembles-ghost-town/">Windows 8 Appstore resembles a ghost town</a> is Published on <a href="http://devilsworkshop.org">Devils Workshop</a> .
    <content:encoded><![CDATA[<p>Microsoft is all set to release Windows 8 for public in the coming weeks. Apparently, the biggest change in Windows 8 seems to be the Metro UI (I know it&#8217;s no more called Metro, but let&#8217;s keep it like that for simplicity) and apps.</p>
        <h2>Apps are less advanced</h2>
        <p>Metro is great on tablets, but on desktop, it looks like an OS with dumbed down apps. Take Skitch for example, it is an app for taking and editing screenshots and was previously a Mac-only app but recently came to Windows 8. Just compare these two apps and you&#8217;ll know what I meant.</p>
        <p>Here&#8217;s how Skitch looks in Windows 8:</p>
        <p><a href="http://devilsworkshop.org/files/2012/09/SkitchinWindows8.png"><img style=' display: block; margin-right: auto; margin-left: auto;'  class="aligncenter size-full wp-image-62302" title="SkitchinWindows8" src="http://devilsworkshop.org/files/2012/09/SkitchinWindows8.png" alt="" width="740" height="570" /></a></p>
        <p>And now, this is the Mac version of Skitch:</p>
        <p><a href="http://devilsworkshop.org/files/2012/09/SkitchinMac.png"><img style=' display: block; margin-right: auto; margin-left: auto;'  class="aligncenter size-full wp-image-62301" title="SkitchinMac" src="http://devilsworkshop.org/files/2012/09/SkitchinMac.png" alt="" width="671" height="575" /></a></p>
        <p>Another example can be Newsmix, an app which will let you read stuff that matters to you &#8211; in a Magazine layout. Apparently, this app is a fail for someone like me who subscribe to 50+ blogs.</p>
        <p><a href="http://devilsworkshop.org/files/2012/09/NewsmixinWindows8.png"><img style=' display: block; margin-right: auto; margin-left: auto;'  class="aligncenter size-large wp-image-62305" title="NewsMix in Windows 8" src="http://devilsworkshop.org/files/2012/09/NewsmixinWindows8-1024x640.png" alt="news-mix-windows-8" width="620" height="387" /></a><br />
            Sure, it will be great on a Windows slate, but not really on a PC/laptop.</p>
When I print $content, it shows the images inside the content:encoded tag. But printing $rss does not show that tag at all, and the description tag also shows up as SimpleXMLElement Object().



3 Answers


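First, print_r() is not a good way to predict how SimpleXML objects will behave, because they are not "normal" PHP objects. You could try my simplexml_dump() function, which lists the content, children, and attributes of a particular node or list of nodes.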
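To illustrate why print_r() can mislead here, the following is a minimal sketch. The one-item feed and the variable names are made up for the example and are not taken from the real feed. It reads a CDATA-only description by casting the node to a string.

```php
<?php
// Hypothetical one-item feed used only for illustration.
$xml = <<<XML
<rss version="2.0">
  <channel>
    <item>
      <title>Example post</title>
      <description><![CDATA[<p>Short summary</p>]]></description>
    </item>
  </channel>
</rss>
XML;

$rss = new SimpleXMLElement($xml);
$item = $rss->channel->item[0];

// print_r() may show an empty SimpleXMLElement Object for a CDATA-only node...
print_r($item->description);

// ...but casting to string returns the text content, CDATA included.
$description = (string) $item->description;
echo $description, "\n";
```

The cast is the same technique the question's loop already uses for title and link, so the description can be read the same way even though print_r() shows an empty object.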

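Second, the content:encoded element is in the namespace with the prefix content, so you need to tell SimpleXML to access nodes in that namespace rather than the default one, using the ->children() method. For example: echo $item->children('content', true)->encoded;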
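A minimal sketch of that namespace lookup follows. The sample feed is made up, and the namespace URI is the one commonly declared for the RSS content module, assumed here rather than copied from the real feed. It shows the lookup by prefix and, as an alternative, by namespace URI.

```php
<?php
// Hypothetical feed; the namespace URI is the one commonly declared for the
// RSS "content" module, assumed here rather than copied from the real feed.
$xml = <<<XML
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>Example post</title>
      <content:encoded><![CDATA[<p>Full body</p>]]></content:encoded>
    </item>
  </channel>
</rss>
XML;

$rss = new SimpleXMLElement($xml);

foreach ($rss->channel->item as $item) {
    // Select children by prefix (second argument true means "this is a prefix").
    $byPrefix = (string) $item->children('content', true)->encoded;

    // Select the same children by namespace URI, which does not depend on
    // whichever prefix the feed happens to use.
    $byUri = (string) $item->children('http://purl.org/rss/1.0/modules/content/')->encoded;

    echo $byPrefix, "\n";
}
```

Looking up by URI is slightly more robust if a feed ever declares the same namespace under a different prefix.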

answered 2012-09-18T12:21:53.420

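Of course printing $rss does not show the data. It shows exactly what it should, because it really is a SimpleXMLElement object.

However, from what I can see, your XML document cannot be parsed because it is not valid UTF-8. I copied it into my client and, combing through it, found a bunch of \xA0 and \x92 characters. You can replace them like this: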




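// Bytes that are not valid UTF-8 on their own (they look like Windows-1252 characters)
$char_arr = array('/\xa0/','/\x92/','/\x96/');
// &#160; is used for the non-breaking space because &nbsp; is not defined in XML
$rep_arr = array('&#160;','\'','-');
$content = preg_replace($char_arr, $rep_arr, $content);

Make sure to put this code before you create your SimpleXML object:

$content = file_get_contents($this->feed);
$char_arr = array('/\xa0/','/\x92/','/\x96/');
$rep_arr = array('&#160;','\'','-');
$content = preg_replace($char_arr, $rep_arr, $content);
$rss = new SimpleXMLElement($content);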
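If the stray bytes really are Windows-1252 characters (an assumption; the feed's actual encoding should be checked first), an alternative to replacing them one by one is to validate the string and convert it as a whole. The sketch below relies on the mbstring extension. The helper name toUtf8 and the sample string are hypothetical.

```php
<?php
// Hypothetical helper; assumes that input which is not valid UTF-8
// is actually encoded as Windows-1252.
function toUtf8(string $raw): string
{
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');
}

// "\x92" is a right single quotation mark in Windows-1252 and is not
// valid UTF-8 on its own.
$raw = "<rss><channel><item><title>It\x92s here</title></item></channel></rss>";

$clean = toUtf8($raw);
$rss = new SimpleXMLElement($clean);
$title = (string) $rss->channel->item->title;
echo $title, "\n";
```

Checking validity first avoids damaging a feed that is already correct UTF-8, where a byte such as \xA0 can legitimately appear as part of a multi-byte character.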


answered 2012-09-18T07:28:11.987

Thanks to IMSoP's answer, I went straight to http://php.net/simplexml, where I found and used the xmlObjToArr($obj) function by xaviered_at gmail_dot_com to solve the same problem.

For those still looking for a simple way to get the content between the content:encoded tags, here is a short and straightforward script:


echo "<pre>";

$url = "http://devilsworkshop.org/feed/";
$rss = simplexml_load_file($url);

$items = $rss->channel->item;

foreach ($items as $item) {

    $title = $item->title;
    $image = $item->image;
    $link = $item->link;
    $published_on = $item->pubDate;
    $description = $item->description;

    // Convert the <content:encoded> element from a SimpleXMLElement object into an array
    $content = xmlObjToArr($item->children('content', true)->encoded);

    echo "

    title: $title
    image: $image
    link: $link
    published on: $published_on
    description: $description
    ";

    // Dump the converted <content:encoded> array
    print_r($content);
}



function xmlObjToArr($obj) {
        $namespace = $obj->getDocNamespaces(true);
        $namespace[NULL] = NULL;

        $children = array();
        $attributes = array();
        $name = strtolower((string)$obj->getName());

        $text = trim((string)$obj);
        if( strlen($text) <= 0 ) {
            $text = NULL;
        }

        // get info for all namespaces
        if(is_object($obj)) {
            foreach( $namespace as $ns=>$nsUrl ) {
                // attributes
                $objAttributes = $obj->attributes($ns, true);
                foreach( $objAttributes as $attributeName => $attributeValue ) {
                    $attribName = strtolower(trim((string)$attributeName));
                    $attribVal = trim((string)$attributeValue);
                    if (!empty($ns)) {
                        $attribName = $ns . ':' . $attribName;
                    }
                    $attributes[$attribName] = $attribVal;
                }

                // children
                $objChildren = $obj->children($ns, true);
                foreach( $objChildren as $childName=>$child ) {
                    $childName = strtolower((string)$childName);
                    if( !empty($ns) ) {
                        $childName = $ns.':'.$childName;
                    }
                    $children[$childName][] = xmlObjToArr($child);
                }
            }
        }

        return array(
            'name' => $name,
            'text' => $text,
            'attributes' => $attributes,
            'children' => $children
        );
}

answered 2013-04-23T14:16:41.257