$url  = "http://example.com/get-xml.php"; // contains broken XML
$file = file_get_contents($url);
$xml  = simplexml_load_string($file);

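The message I get when calling simplexml_load_string:

"Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 216: parser error : Specification mandate value for attribute mods"

Warning: simplexml_load_string() [function.simplexml-load-string]:

In short, there is an XML tag with a space in it, and it breaks everything.

So, using PHP, I am importing XML from a third party, and one bad XML tag breaks the whole import. Is there a better way to read this not-quite-XML by looking at each specific XML tag? Or can I at least ignore the broken tags?

Ideally I would also like a file_get_contents-style way to display the XML tags. Any suggestions for a noob? I cannot change the third-party XML, because I get it from a remote service over which I have no influence.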


1 Answer


PHP 5.1+ can parse XML documents that are not well-formed and adds the missing pieces, e.g. missing closing tags.

This can be very useful if you have to parse XML documents over which you have no influence.

To use this feature, set the recover property of the DOMDocument to true before loading the XML document. Loading the document will then always return something more or less useful:

<?php
$xml = new DOMDocument();
$xml->recover = true; // keep parsing after well-formedness errors
$xml->loadXML('<root><tag>hello world</root>'); // <tag> is never closed
print $xml->saveXML();
?>

This prints the recovered document. A bunch of warnings are emitted as well, but the result still shows up.
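Since the question uses SimpleXML, the recovery approach can be wrapped so that the warnings are collected rather than printed and the result comes back as a SimpleXMLElement. The following is only a minimal sketch: load_broken_xml is a hypothetical helper name, not something from the original post, and the type declarations assume PHP 7.1 or later. How much of a badly damaged document survives recovery depends on libxml, so the result should be checked before use.

```php
<?php
// Hypothetical helper (not from the original post): parse possibly broken
// XML in recovery mode and return a SimpleXMLElement, or null on failure.
// Parser complaints are collected into $errors instead of being printed.
function load_broken_xml(string $raw, array &$errors = []): ?SimpleXMLElement
{
    if ($raw === '') {
        return null; // nothing to parse
    }

    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->recover = true; // keep going after well-formedness errors
    $loaded = $dom->loadXML($raw);

    // Copy out what the parser complained about, then reset libxml state.
    $errors = [];
    foreach (libxml_get_errors() as $error) {
        $errors[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded || $dom->documentElement === null) {
        return null; // recovery produced no usable tree
    }
    return simplexml_import_dom($dom);
}

// Usage with the sample document from this answer.
$problems = [];
$xml = load_broken_xml('<root><tag>hello world</root>', $problems);
if ($xml !== null) {
    echo $xml->tag, "\n";
}
echo count($problems), " parser message(s) collected\n";
```

Keeping the collected messages around makes it possible to log which lines of the third-party feed were damaged while the import still goes ahead.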

code demo here: phpFiddle

Update, to fetch the XML as it is:

If you can use curl, this should achieve your goal. Try it and let me know.

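<?php
// Fetch a URL with cURL and return the raw response body, or FALSE on failure.
function curl_get_file_contents($URL)
{
    $c = curl_init();
    curl_setopt($c, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
    curl_setopt($c, CURLOPT_URL, $URL);
    $contents = curl_exec($c);
    curl_close($c);

    // An empty body is treated the same as a failed transfer.
    return $contents ? $contents : FALSE;
}
?>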
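Once the raw body is available as a string, it can help to look at the markup around the line named in the parser error (line 216 in the question). The sketch below assumes a hypothetical helper, xml_context, which is not part of PHP or of the original answer. It only slices the string and does no parsing.

```php
<?php
// Hypothetical helper: return the lines surrounding a given 1-based line
// number, so the markup the parser complained about can be inspected.
function xml_context(string $raw, int $line, int $radius = 2): string
{
    $lines = preg_split('/\R/', $raw); // split on any kind of line break
    $start = max(1, $line - $radius);
    $end   = min(count($lines), $line + $radius);

    $out = [];
    for ($n = $start; $n <= $end; $n++) {
        $marker = ($n === $line) ? '>>' : '  '; // flag the reported line
        $out[]  = sprintf('%s %4d | %s', $marker, $n, $lines[$n - 1]);
    }
    return implode("\n", $out);
}

// Usage: $raw stands in for the string returned by the fetch function.
$raw = "<root>\n  <item mods>a</item>\n</root>";
// Escape the markup so it shows up as text when viewed in a browser.
echo htmlspecialchars(xml_context($raw, 2, 1)), "\n";
```

With the real feed, the call would look like xml_context($contents, 216), where $contents is whatever the fetch returned.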
answered 2013-05-01T05:19:33.177