
I know versions of this question have been asked before, but I have a specific problem with this one.

I'm trying to extract some text from an RSS feed. The text is embedded in CDATA but sits outside of any tags. Here is the RSS file:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="/rss/ndbcrss.xsl"?>
<rss version="2.0" xmlns:georss="http://www.georss.org/georss" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>NDBC - Station 46042 - MONTEREY - 27NM WNW of Monterey, CA Observations</title>
    <description><![CDATA[This feed shows recent marine weather observations from Station 46042.]]></description>
    <link>http://www.ndbc.noaa.gov/</link>
    <pubDate>Wed, 07 Aug 2013 21:06:45 UT</pubDate>
    <lastBuildDate>Wed, 07 Aug 2013 21:06:45 UT</lastBuildDate>
    <ttl>30</ttl>
    <language>en-us</language>
    <managingEditor>webmaster.ndbc@noaa.gov</managingEditor>
    <webMaster>webmaster.ndbc@noaa.gov</webMaster>
    <image>
      <url>http://weather.gov/images/xml_logo.gif</url>
      <title>NOAA - National Weather Service</title>
      <link>http://www.ndbc.noaa.gov/</link>
    </image>
    <item>
      <pubDate>Wed, 07 Aug 2013 21:06:45 UT</pubDate>
      <title>Station 46042 - MONTEREY - 27NM WNW of Monterey, CA</title>
      <description><![CDATA[
        <strong>August 7, 2013 1:50 pm PDT</strong><br />
        <strong>Location:</strong> 36.785N 122.469W<br />
        <strong>Wind Direction:</strong> SW (220&#176;)<br />
        <strong>Wind Speed:</strong> 1.9 knots<br />
        <strong>Wind Gust:</strong> 1.9 knots<br />
        <strong>Significant Wave Height:</strong> 2.3 ft<br />
        <strong>Dominant Wave Period:</strong> 14 sec<br />
        <strong>Average Period:</strong> 6.9 sec<br />
        <strong>Mean Wave Direction:</strong> SSE (160&#176;) <br />
        <strong>Atmospheric Pressure:</strong> 30.11 in (1019.5 mb)<br />
        <strong>Pressure Tendency:</strong> -0.01 in (-0.3 mb)<br />
        <strong>Air Temperature:</strong> 60.8&#176;F (16.0&#176;C)<br />
        <strong>Water Temperature:</strong> 59.9&#176;F (15.5&#176;C)<br />
      ]]></description>
      <link>http://www.ndbc.noaa.gov/station_page.php?station=46042</link>
      <guid>http://www.ndbc.noaa.gov/station_page.php?station=46042&amp;ts=1375908600</guid>
      <georss:point>36.785 -122.469</georss:point>
    </item>
  </channel>
</rss>

I'm trying to get "2.3 ft", "14 sec", and "SSE (160°)" from the following lines:

<strong>Significant Wave Height:</strong> 2.3 ft<br />
<strong>Dominant Wave Period:</strong> 14 sec<br />
<strong>Mean Wave Direction:</strong> SSE (160&#176;) <br />

I can strip the CDATA and then access the strong[x] elements, but I can't figure out how to get the text shown above, which sits outside the tags.

EDIT

Thanks, Carl! Using explode/regex works great. Another tool added to my small (but growing) bag.

Here is the working code I'm using to store the three items:

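<?php
$url = "http://www.ndbc.noaa.gov/data/latest_obs/46042.rss";
$xml = simplexml_load_file($url);

// Casting to string yields the CDATA content of <description>
$data = (string) $xml->channel->item->description;

foreach (explode("\n", $data) as $key => $line) {
    // $matches[1] is the <strong>label</strong>, $matches[2] is the text after it
    preg_match('/(\<strong>.+?\<\/strong>)(.*)?<br/', $line, $matches);
    if ( ! empty($matches)) {
        $dataDescr[$key] = $matches[1];
        $dataVal[$key] = $matches[2];
    }
}
// Keys are line positions within the description, so they shift if the feed adds or drops a field
$sigWavHt = $dataVal[5];
$domWavPer = $dataVal[6];
$meanWavDir = $dataVal[8];

echo "$sigWavHt, $domWavPer, $meanWavDir"; // to test results
?>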

1 Answer


If you're sure the data is consistent with your example, you can use a regular expression to extract it.

For example:

$data = "<strong>Significant Wave Height:</strong> 2.3 ft<br />
<strong>Dominant Wave Period:</strong> 14 sec<br />
<strong>Mean Wave Direction:</strong> SSE (160&#176;) <br />";

foreach (explode("\n", $data) as $line) {
    preg_match('/(\<strong>.+?\<\/strong>)(.*)?<br/', $line, $matches);
    if ( ! empty($matches)) {
        // The part with the <strong> tags is now in $matches[1], and
        // the part after is in $matches[2]
        echo "Key: {$matches[1]}\tValue: {$matches[2]}\n"; 
    }
}
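
The captured key still contains the <strong> tags and the trailing colon, and the value keeps its surrounding whitespace and the `&#176;` entity. If you want clean strings, something along these lines might help. This is only a sketch: `cleanPair` is a made-up helper name, and the list-style destructuring assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not part of the loop above): tidy one key/value pair
// as captured in $matches[1] and $matches[2].
function cleanPair(string $rawKey, string $rawValue): array
{
    // Drop the <strong> tags, surrounding whitespace, and the trailing colon.
    $key = rtrim(trim(strip_tags($rawKey)), ':');
    // Turn entities such as &#176; into real characters, then trim.
    $value = trim(html_entity_decode($rawValue, ENT_QUOTES | ENT_HTML5, 'UTF-8'));
    return [$key, $value];
}

[$key, $value] = cleanPair('<strong>Mean Wave Direction:</strong>', ' SSE (160&#176;) ');
echo "$key => $value\n";
```

Inside the loop you would call it as `cleanPair($matches[1], $matches[2])`.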

Looking at the full feed you posted above, keep in mind that the first line, the date, has no "data" part after the <strong> content...
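
One way to deal with that, and to avoid depending on line positions, is to key the values by their label and skip any line whose data part is empty. A rough sketch, assuming the markup stays in the `<strong>Label:</strong> value<br />` shape shown in the question; `parseObservations` is a hypothetical name, and this swaps the per-line `preg_match` for a single `preg_match_all`:

```php
<?php
// Hypothetical helper: map each label to its value, skipping lines
// (like the date line) that have nothing after the closing </strong> tag.
function parseObservations(string $html): array
{
    $result = [];
    // Group 1: text inside <strong>; group 2: text between </strong> and <br
    preg_match_all('~<strong>([^<]+)</strong>([^<]*)<br~', $html, $rows, PREG_SET_ORDER);
    foreach ($rows as $row) {
        $label = rtrim(trim($row[1]), ':');
        $value = trim(html_entity_decode($row[2], ENT_QUOTES | ENT_HTML5, 'UTF-8'));
        if ($value === '') {
            continue; // e.g. the date line, which has no data part
        }
        $result[$label] = $value;
    }
    return $result;
}

$sample = "<strong>August 7, 2013 1:50 pm PDT</strong><br />\n"
        . "<strong>Significant Wave Height:</strong> 2.3 ft<br />\n"
        . "<strong>Dominant Wave Period:</strong> 14 sec<br />\n"
        . "<strong>Mean Wave Direction:</strong> SSE (160&#176;) <br />";

$obs = parseObservations($sample);
echo $obs['Significant Wave Height'], ', ', $obs['Dominant Wave Period'], ', ', $obs['Mean Wave Direction'], "\n";
```

With the real feed you would pass in `(string) $xml->channel->item->description` instead of `$sample`. Looking values up by label means an added or missing field in the feed does not silently shift everything by one.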

Answered 2013-08-07T22:54:44.173