
Short question: how do I handle a raw & in an XML input file?

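Added: I'm not even selecting the field that contains the ampersand. The parser complains about the & merely being present in the file.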

Long explanation: I'm processing XML that is generated from a URL response.

<NOTE>I%20hope%20this%20won%27t%20require%20a%20signature%3f%20%20
There%20should%20be%20painters%20%26%20stone%20guys%20at%20the
%20house%20on%20Wednesday%2c%20but%20depending%20on%20what%20time%20
it%20is%20delivered%2c%20I%20can%27t%20guarantee%21%20%20
Also%2c%20just%20want%20to%20make%20sure%20the%20billing%20address
%20is%20different%20from%20shipping%20address%3f
</NOTE>

This URL-decodes to:

<NOTE>I hope this won't require a signature?  
There should be painters & stone guys at the 
house on Wednesday, but depending on what time it is delivered, I can't guarantee!  
Also, just want to make sure the billing address is different from shipping address?  
</NOTE>

Problem: because of the "&" in "painters & stone guys", xsltproc chokes on the decoded string with the following error:

xmlParseEntityRef: no name
<NOTE>I hope this won't require a signature?  There should be painters &

It looks like xsltproc expects a closing </NOTE>

I tried disable-output-escaping="yes" in various ways and in different places, including on xsl:output and xsl:value-of.

I also tried xsltproc --decode-uri but couldn't figure that one out. There is no documentation.

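Note: I wonder whether it would be worth keeping the input in URL-encoded form and using a DOCTYPE, something like the following (I don't know how to do that). The output ultimately ends up in a browser.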

<!DOCTYPE xsl:stylesheet  [
    <!ENTITY nbsp   "&#160;">
    <!ENTITY copy   "&#169;">
    <!ENTITY reg    "&#174;">
]>

1 Answer


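If the document contains an unescaped ampersand, the XML is not well-formed, so the parser rejects it before any XSLT is applied. If you put the string inside <![CDATA[]]>, it should work.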

<NOTE><![CDATA[I hope this won't require a signature?  
  There should be painters & stone guys at the 
  house on Wednesday, but depending on what time it is delivered, I can't guarantee!  
  Also, just want to make sure the billing address is different from shipping address?]]>  
</NOTE>

Or, of course, use &amp; instead of &.
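For comparison with the CDATA version, here is a sketch of the same element with the ampersand escaped instead. It reuses the sample text from the question; only the & has changed.

```xml
<!-- The raw & is written as the predefined entity &amp;, so the parser accepts it -->
<NOTE>I hope this won't require a signature?
There should be painters &amp; stone guys at the
house on Wednesday, but depending on what time it is delivered, I can't guarantee!
Also, just want to make sure the billing address is different from shipping address?
</NOTE>
```

Either form has to be applied before the document reaches xsltproc, since the parse error occurs before the stylesheet runs.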

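EDIT: If the XSLT processor supports disable-output-escaping (and xsltproc does), you can also leave the input URL-encoded and convert the URL escapes to numeric character references: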

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="NOTE">
    <xsl:copy>
      <xsl:call-template name="decodeURL"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template name="decodeURL">
    <xsl:param name="URL" select="string()"/>
    <xsl:choose>
      <xsl:when test="contains($URL,'%')">
        <xsl:variable name="remainingURL" select="substring-after($URL,'%')"/>
        <xsl:value-of disable-output-escaping="yes" select="concat(
          substring-before($URL,'%'),
          '&amp;#x',
          substring($remainingURL,1,2),
          ';')"/>
        <xsl:call-template name="decodeURL">
          <xsl:with-param name="URL" select="substring($remainingURL,3)"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$URL"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>

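Of course, you don't have to use this transformation as a preprocessing step; you can reuse decodeURL in the stylesheet that transforms the source XML containing the URL-encoded strings into HTML or whatever else you need.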
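A minimal sketch of that reuse, assuming the stylesheet above has been saved as decode-url.xsl next to this one (the file name and the HTML skeleton are illustrative assumptions, not something from the question):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <!-- Hypothetical file name: the decodeURL stylesheet shown above.
       xsl:import must come first; templates in this file take precedence
       over the imported ones, including the imported NOTE template. -->
  <xsl:import href="decode-url.xsl"/>

  <xsl:output method="html"/>

  <!-- Illustrative HTML wrapper around every NOTE in the input -->
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="//NOTE"/>
      </body>
    </html>
  </xsl:template>

  <!-- decodeURL defaults its URL parameter to string(), i.e. the text of
       the current NOTE, and writes %XX escapes as &#xXX; references -->
  <xsl:template match="NOTE">
    <p>
      <xsl:call-template name="decodeURL"/>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

Because each %XX escape becomes its own character reference, this only decodes single-byte escapes correctly; a character that was URL-encoded as several UTF-8 bytes would come out as several separate characters.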

Answered 2012-12-04T06:28:20.783