8

I am trying to load and parse a Google Weather API response (a Chinese-language response).

Here is the API call:

// This code fails with the following error
$xml = simplexml_load_file('http://www.google.com/ig/api?weather=11791&hl=zh-CN');

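( ! ) Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 1: parser error : Input is not proper UTF-8, indicate encoding ! Bytes: 0xB6 0xE0 0xD4 0xC6 in C:\htdocs\weather.php on line 11

Why does loading this response fail?

How do I encode or decode the response so that SimpleXML loads it properly?

Edit: Here is the code and its output.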

<?php
$googleData = file_get_contents('http://www.google.com/ig/api?weather=11102&hl=zh-CN');
$xml = simplexml_load_string($googleData);

( ! ) Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 1: parser error : Input is not proper UTF-8, indicate encoding ! Bytes: 0xB6 0xE0 0xD4 0xC6 in C:\htdocs\test4.php on line 3
Call Stack
# Time Memory Function Location
1 0.0020 314264 {main}( ) ..\test4.php:0
2 0.1535 317520 simplexml_load_string ( string(1364) ) ..\test4.php:3

( ! ) Warning: simplexml_load_string() [function.simplexml-load-string]: t_system data="SI"/>

( ! ) Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in C:\htdocs\test4.php on line 3
Call Stack
# Time Memory Function Location
1 0.0020 314264 {main}( ) ..\test4.php:0
2 0.1535 317520 simplexml_load_string ( string(1364) ) ..\test4.php:3


5 Answers

21

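The problem here is that SimpleXML doesn't look at the HTTP headers to determine the character encoding used in the document. It simply assumes the document is UTF-8, even though Google's server does advertise it as: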

Content-Type: text/xml; charset=GB2312

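You can write a function that inspects that header using the super-secret magic variable $http_response_header and converts the response accordingly. Something like this: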

function sxe($url)
{   
    $xml = file_get_contents($url);
    foreach ($http_response_header as $header)
    {   
        if (preg_match('#^Content-Type: text/xml; charset=(.*)#i', $header, $m))
        {   
            switch (strtolower($m[1]))
            {   
                case 'utf-8':
                    // do nothing
                    break;

                case 'iso-8859-1':
                    $xml = utf8_encode($xml);
                    break;

                default:
                    $xml = iconv($m[1], 'utf-8', $xml);
            }
            break;
        }
    }

    return simplexml_load_string($xml);
}
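
A variation on the function above, split so that the charset handling can be tried without a network call. This is only a sketch: the helper names charset_from_headers() and body_to_utf8() are made up for illustration, the header list is hand-written, and a missing charset is assumed to mean the body is already UTF-8.

```php
<?php
// Hypothetical helper (not part of the answer above): find the charset
// declared in a list of raw response header lines, or null if none is given.
function charset_from_headers(array $headers): ?string
{
    foreach ($headers as $header) {
        if (preg_match('#^Content-Type:.*?;\s*charset="?([^\s";]+)#i', $header, $m)) {
            return strtolower($m[1]);
        }
    }
    return null;
}

// Hypothetical helper: convert a response body to UTF-8 based on its headers.
// Assumption: no declared charset means the body is already UTF-8.
function body_to_utf8(string $body, array $headers): string
{
    $charset = charset_from_headers($headers);
    if ($charset === null || $charset === 'utf-8') {
        return $body;
    }
    $converted = iconv($charset, 'UTF-8', $body);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $charset to UTF-8");
    }
    return $converted;
}

// Hand-written stand-ins for $http_response_header and the response body.
// The attribute value is the four bytes from the error message in the question.
$headers = ['HTTP/1.1 200 OK', 'Content-Type: text/xml; charset=GB2312'];
$body = "<r data=\"\xB6\xE0\xD4\xC6\"/>";

$xml = simplexml_load_string(body_to_utf8($body, $headers));
echo $xml['data'], "\n";
```

In real use, the header list would be $http_response_header right after file_get_contents(), as in the function above.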
answered 2010-05-25T03:28:17.647
5

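Update: I can reproduce the problem. Also, when I output the raw XML feed, Firefox automatically sniffs the character set as "Chinese Simplified". Either the Google feed is serving incorrect data (Chinese Simplified characters instead of UTF-8 ones), or it serves different data when it is not fetched by a browser. The Content-Type header in Firefox clearly says utf-8.

Converting the incoming feed from Chinese Simplified (GB18030, which is what Firefox gave me) to UTF-8 works: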

 $incoming = file_get_contents('http://www.google.com/ig/api?weather=11791&hl=zh-CN');
 $xml = iconv("GB18030", "utf-8", $incoming);
 $xml = simplexml_load_string($xml);

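It neither explains nor fixes the underlying problem, though. I don't have time to dig deeper into this right now; maybe somebody else does. To me, it looks like Google is in fact serving incorrect data (which would surprise me. I didn't know they made mistakes like us mortals. :P)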
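
As a quick check of the conversion, the four bytes from the error message (0xB6 0xE0 0xD4 0xC6) can be run through iconv on their own. This is a minimal sketch, assuming the iconv extension is available; the variable names are mine. In GB2312/GB18030 those bytes are the two characters 多云 ("cloudy"), which is why the parser rejects them as UTF-8.

```php
<?php
// The four bytes reported by the parser error in the question.
$bytes = "\xB6\xE0\xD4\xC6";

// preg_match() with the /u modifier returns false when the subject is not
// valid UTF-8, so it doubles as a simple validity check.
$isUtf8Before = preg_match('//u', $bytes) === 1;

// Same conversion as in the snippet above, applied to just these bytes.
$decoded = iconv('GB18030', 'UTF-8', $bytes);
$isUtf8After = preg_match('//u', $decoded) === 1;

var_dump($isUtf8Before); // bool(false)
var_dump($isUtf8After);  // bool(true)
echo $decoded, "\n";     // 多云
```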

answered 2010-05-24T18:37:54.673
2

Just came across this. The following seems to work (I found the function itself on the web and only updated it a bit):

header('Content-Type: text/html; charset=utf-8'); 


function getWeather() {
    $requestAddress = "http://www.google.com/ig/api?weather=11791&hl=zh-CN";
    // Downloads weather data based on location.
    $xml_str = file_get_contents($requestAddress, 0);
    $xml_str = preg_replace("/(<\/?)(\w+):([^>]*>)/", "$1$2$3", $xml_str);

    $xml_str = iconv("GB18030", "utf-8", $xml_str);

    // Parses XML
    $xml = new SimpleXMLElement($xml_str, TRUE);
    // Loops XML
    $count = 0;
    echo '<div id="weather">';

    foreach ($xml->weather as $item) {
        foreach ($item->forecast_conditions as $new) {
            echo "<div class=\"weatherIcon\">\n";
            echo "<img src='http://www.google.com/" . $new->icon['data'] . "'   alt='" . $new->condition['data'] . "'/><br>\n";
            echo "<b>" . $new->day_of_week['data'] . "</b><br>";
            echo "Low: " . $new->low['data'] . " &nbsp;High: " . $new->high['data'] . "<br>";
            echo "\n</div>\n";
        }
    }

    echo '</div>';
}


getWeather();
answered 2010-11-09T23:55:14.833
2

Here is a script I wrote in PHP to parse the Google Weather API.

 <?php

function sxe($url)
{
    $xml = file_get_contents($url);
    foreach ($http_response_header as $header)
    {
        if (preg_match('#^Content-Type: text/xml; charset=(.*)#i', $header, $m))
        {
            switch (strtolower($m[1]))
            {
                case 'utf-8':
                    // do nothing
                    break;

                case 'iso-8859-1':
                    $xml = utf8_encode($xml);
                    break;

                default:
                    $xml = iconv($m[1], 'utf-8', $xml);
            }
            break;
        }
    }
    return simplexml_load_string($xml);
}


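// Load through sxe() so the response is converted to UTF-8 before parsing;
// the language parameter is "hl", not "h1".
$xml = sxe('http://www.google.com/ig/api?weather=46360&hl=en-us');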
$information = $xml->xpath("/xml_api_reply/weather/forecast_information");
$current = $xml->xpath("/xml_api_reply/weather/current_conditions");
$forecast = $xml->xpath("/xml_api_reply/weather/forecast_conditions");


print "<br><br><center><div style=\"border: 1px solid; background-color: #dddddd; background-image: url('http://mc-pdfd-live.dyndns.org/images/clouds.bmp'); width: 450\">";


print "<br><h3>";
print $information[0]->city['data'] . "&nbsp;" . $information[0]->unit_system['data'] . "&nbsp;" .     $information[0]->postal_code['data'];
print "</h3>";
print "<div style=\"border: 1px solid; width: 320px\">";
print "<table cellpadding=\"5px\"><tr><td><h4>";
print "Now";
print "<br><br>";
print "<img src=http://www.google.com" . $current[0]->icon['data'] . ">&nbsp;";
print "</h4></td><td><h4>";
print "<br><br>";
print "&nbsp;" . $current[0]->condition['data'] . "&nbsp;";
print "&nbsp;" . $current[0]->temp_f['data'] . "&nbsp;°F";
print "<br>";
print "&nbsp;" . $current[0]->wind_condition['data'];
print "<br>";
print "&nbsp;" . $current[0]->humidity['data'];
print "</h4></td></tr></table></div>";




print "<table cellpadding=\"5px\"><tr><td>";


print "<table cellpadding=\"5px\"><tr><td><h4>";
print "Today";
print "<br><br>";
print "<img src=http://www.google.com" . $forecast[0]->icon['data'] . ">&nbsp;";
print "</h4></td><td><h4>";
print "<br><br>";
print  $forecast[0]->condition['data'];
print "<br>";
print  "High&nbsp;" . $forecast[0]->high['data'] . "&nbsp;°F";
print "<br>";
print  "Low&nbsp;" . $forecast[0]->low['data'] . "&nbsp;°F";
print "</h4></td></tr></table>";

print "<table cellpadding=\"5px\"><tr><td><h4>";
print  $forecast[2]->day_of_week['data'];
print "<br><br>";
print "<img src=http://www.google.com" . $forecast[2]->icon['data'] . ">&nbsp;";
print "</h4></td><td><h4>";
print "<br><br>";
print  "&nbsp;" . $forecast[2]->condition['data'];
print "<br>";
print  "&nbsp;High&nbsp;" . $forecast[2]->high['data'] . "&nbsp;°F";
print "<br>";
print  "&nbsp;Low&nbsp;" . $forecast[2]->low['data'] . "&nbsp;°F";
print "</h4></td></tr></table>";


print "</td><td>";


print "<table cellpadding=\"5px\"><tr><td><h4>";
print  $forecast[1]->day_of_week['data'];
print "<br><br>";
print "<img src=http://www.google.com" . $forecast[1]->icon['data'] . ">&nbsp;";
print "</h4></td><td><h4>";
print "<br><br>";
print  "&nbsp;" . $forecast[1]->condition['data'];
print "<br>";
print  "&nbsp;High&nbsp;" . $forecast[1]->high['data'] . "&nbsp;°F";
print "<br>";
print  "&nbsp;Low&nbsp;" . $forecast[1]->low['data'] . "&nbsp;°F";
print "</h4></td></tr></table>";

print "<table cellpadding=\"5px\"><tr><td><h4>";
print  $forecast[3]->day_of_week['data'];
print "<br><br>";
print "<img src=http://www.google.com" . $forecast[3]->icon['data'] . ">&nbsp;";
print "</h4></td><td><h4>";
print "<br><br>";
print  "&nbsp;" . $forecast[3]->condition['data'];
print "<br>";
print  "&nbsp;High&nbsp;" . $forecast[3]->high['data'] . "&nbsp;°F";
print "<br>";
print  "&nbsp;Low&nbsp;" . $forecast[3]->low['data'] . "&nbsp;°F";
print "</h4></td></tr></table>";


print "</td></tr></table>";


print "</div></center>";


?>
answered 2011-05-24T15:57:02.877
1

Try adding oe=utf-8 to the URL query parameters. In that case the response will be entirely UTF-8 encoded. It helped me.

http://www.google.com/ig/api?weather=?????&degree=??????&oe=utf-8&hl=es
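
For illustration, the query string can be assembled with http_build_query() so the oe parameter is not mistyped. A minimal sketch: the helper name weather_url() is hypothetical, and the location and language values are the ones from the question.

```php
<?php
// Hypothetical helper: build the request URL with oe=utf-8 included, so the
// response is requested in UTF-8 as suggested above.
function weather_url(string $location, string $lang): string
{
    $query = http_build_query(
        ['weather' => $location, 'hl' => $lang, 'oe' => 'utf-8'],
        '',
        '&'
    );
    return 'http://www.google.com/ig/api?' . $query;
}

echo weather_url('11791', 'zh-CN'), "\n";
// prints http://www.google.com/ig/api?weather=11791&hl=zh-CN&oe=utf-8
```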
answered 2012-02-10T14:24:35.473