This query:

SELECT * 
FROM html 
WHERE url='http://wwww.example.com' 
AND xpath='//tr[@height="20"]'

returns XML:

<results>
    <tr height="20">
        <td height="20" width="425">
            <p>Institution 0</p>
        </td>
        <td width="134">
            <p>Minneapolis</p>
        </td>
        <td width="64">
            <p>MN</p>
        </td>
    </tr>
    ...
</results>

Questions:

  • Is there a way to use XPath to create individual columns?
  • Is there a way to create column aliases?

Example (invalid syntax):

SELECT td[position()=1]/p/. AS name, td[position()=2]/p/. AS city, td[position()=3]/p/. AS region
FROM   ...

Goal:

<results>
    <tr height="20">
      <name>Institution 0</name>
      <city>Minneapolis</city>
      <region>MN</region>
    </tr>
    ...
</results>

1 Answer


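Not with XPath alone, at least not in the way you are attempting. However, YQL can apply an XSL transformation to an XML/HTML document, and the stylesheet can rename each cell to the element you want. Here is an example: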

XSLT

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:template match="/">
        <rows>
          <xsl:apply-templates select="descendant::tr" />
        </rows>
    </xsl:template>
    <xsl:template match="//tr">
        <row>
            <name>
                <xsl:value-of select="td[1]/p" />
            </name>
            <city>
                <xsl:value-of select="td[2]/p" />
            </city>
            <region>
                <xsl:value-of select="td[3]/p" />
            </region>
        </row>
    </xsl:template>
</xsl:stylesheet>

HTML

<html>
    <body>
        <table>
            <tr height="20">
                <td height="20" width="425">
                    <p>Institution 0</p>
                </td>
                <td width="134">
                    <p>Minneapolis</p>
                </td>
                <td width="64">
                    <p>MN</p>
                </td>
            </tr>
            <tr height="20">
                <td height="20" width="425">
                    <p>Institution 1111</p>
                </td>
                <td width="134">
                    <p>Minneapolis 1111</p>
                </td>
                <td width="64">
                    <p>MN 11111</p>
                </td>
            </tr>
        </table>
    </body>
</html>

YQL query

select * from xslt where stylesheet="url/to.xsl" and url="url/to.html"

YQL result

<results>
    <rows>
        <row>
            <name>Institution 0</name>
            <city>Minneapolis</city>
            <region>MN</region>
        </row>
        <row>
            <name>Institution 1111</name>
            <city>Minneapolis 1111</city>
            <region>MN 11111</region>
        </row>
    </rows>
</results>
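
If you want to check the mapping without going through YQL, the same positional selection can be sketched locally. This is only an approximation under a few assumptions: it uses Python's standard xml.etree.ElementTree rather than an XSLT processor, it assumes the page is well-formed enough for an XML parser (real-world HTML often is not), and the COLUMNS list and rows_from_html helper are names made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Sample input mirroring the HTML above (well-formed, so an XML parser can read it).
HTML = """<html>
  <body>
    <table>
      <tr height="20">
        <td height="20" width="425"><p>Institution 0</p></td>
        <td width="134"><p>Minneapolis</p></td>
        <td width="64"><p>MN</p></td>
      </tr>
      <tr height="20">
        <td height="20" width="425"><p>Institution 1111</p></td>
        <td width="134"><p>Minneapolis 1111</p></td>
        <td width="64"><p>MN 11111</p></td>
      </tr>
    </table>
  </body>
</html>"""

# Hypothetical column names, each paired with the 1-based <td> position it is read from.
COLUMNS = [("name", 1), ("city", 2), ("region", 3)]


def rows_from_html(markup):
    """Build <rows><row><name/>...</row></rows> from every <tr> in the markup."""
    source = ET.fromstring(markup)
    rows = ET.Element("rows")
    for tr in source.iter("tr"):
        row = ET.SubElement(rows, "row")
        for column, position in COLUMNS:
            cell = ET.SubElement(row, column)
            # Same positional step as the stylesheet: td[n]/p
            cell.text = (tr.findtext(f"td[{position}]/p") or "").strip()
    return rows


if __name__ == "__main__":
    print(ET.tostring(rows_from_html(HTML), encoding="unicode"))
```

The column list plays the role of the aliases: changing a name or a position there changes the output element, much like editing the corresponding template in the stylesheet.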

» See the example running in the YQL console.

answered 2013-07-06T21:17:09.473