So I'm busy creating an XSLT file to process various XML documents into a new node layout.

There is one thing I can't figure out. Here is an example of the XML I'm working with:

   This is a paragraph on the page.
   This is another paragraph.
   Here is yet another paragraph on this page.

As you can see, the paragraphs are split using empty tags as separators. In the resulting XML I want this:

    This is a paragraph on the page.
    This is another paragraph.
   Here is yet another paragraph on this page.

How can I achieve this using XSLT (version 1.0 only)?


3 Answers


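The following answer is not as elegant as @stwissel's, but it correctly marks up any subtrees within a paragraph. It does get a bit nasty, admittedly. :-)

The problem with this task is that it requires special treatment of the content between an end tag and the subsequent matching start tag (e.g. </tag><tag>). XSLT, however, is optimized for handling content between a start tag and its matching end tag (e.g. <tag></tag>). By the way: there is a way to "cheat" a little. See my other answer to this question.

Assume you have an input XML like this: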

    This is a paragraph on the page.
    After Bold
    This is another paragraph.
    Here is yet another paragraph on this page.
        Bold and emphasized.
    After bold and emphasized.
    Another page.

It can be processed using this XSLT 1.0 transformation:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />

  <xsl:template match="page">
    <xsl:copy>
      <!-- handle the first paragraph up to the first newParagraph -->
      <P>
        <xsl:apply-templates select="node()[not(preceding-sibling::newParagraph)]" />
      </P>

      <!-- now handle all remaining paragraphs of the page -->
      <xsl:for-each select="newParagraph">
        <xsl:variable name="pCount" select="position()"/>
        <P>
          <xsl:apply-templates select="following-sibling::node()[count(preceding-sibling::newParagraph) &lt;= $pCount]" />
        </P>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

  <!-- this default rule recursively copies all substructures within a paragraph at tag level -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- this default rule makes sure that texts between the tags are printed -->
  <xsl:template match="text()">
    <xsl:copy-of select="."/>
  </xsl:template>

  <!-- the separator itself produces no output -->
  <xsl:template match="newParagraph"/>

</xsl:stylesheet>

This yields the following result:

    This is a paragraph on the page.
    After Bold
    This is another paragraph.
    Here is yet another paragraph on this page.
        Bold and emphasized.
    After bold and emphasized.
    Another page.
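
As a variation on the stylesheet above, here is a minimal sketch (not part of the original answer) that groups the children of each page with xsl:key instead of counting preceding separators for every node. It assumes the same element names as above (page, newParagraph, P); the key name bySeparator and the variable pageId are made up for illustration. Each non-separator child is indexed by the id of its page plus the id of the nearest preceding newParagraph, which is the empty string for children that come before the first separator.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- hypothetical key: index every non-separator child of a page by
       "page id | id of nearest preceding newParagraph";
       generate-id() of an empty node-set is the empty string -->
  <xsl:key name="bySeparator"
           match="page/node()[not(self::newParagraph)]"
           use="concat(generate-id(..), '|', generate-id(preceding-sibling::newParagraph[1]))"/>

  <xsl:template match="page">
    <xsl:variable name="pageId" select="generate-id()"/>
    <xsl:copy>
      <!-- first paragraph: children with no preceding separator -->
      <P>
        <xsl:apply-templates select="key('bySeparator', concat($pageId, '|'))"/>
      </P>
      <!-- one further paragraph per separator -->
      <xsl:for-each select="newParagraph">
        <P>
          <xsl:apply-templates
              select="key('bySeparator', concat($pageId, '|', generate-id()))"/>
        </P>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

  <!-- identity rule: copies subtrees inside a paragraph unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Including the page id in the key value keeps the first paragraphs of different pages apart, since all of them would otherwise share the empty key. Like the stylesheet above, this sketch emits an empty P when a page starts with a separator.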
answered 2014-04-23T22:26:40.250


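<xsl:template match="pages">
    <xsl:apply-templates />
</xsl:template>

<!-- wrap every text node directly inside a page in a p element -->
<xsl:template match="page/text()">
    <p><xsl:value-of select="."/></p>
</xsl:template>

<!-- the separator itself produces no output -->
<xsl:template match="NewParagraph" />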
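
The templates above wrap every text node of a page, including its leading and trailing whitespace. A possible refinement, sketched here under the assumptions that paragraphs contain plain text only (no inline elements) and that the separator is spelled NewParagraph as in this answer, trims each paragraph and skips text nodes that are pure whitespace. Copying pages and page to the output is also an assumption of this sketch.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- copy the container elements and process their children -->
  <xsl:template match="pages | page">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- wrap each non-blank text node; normalize-space() trims the ends
       and also collapses internal runs of whitespace -->
  <xsl:template match="page/text()[normalize-space()]">
    <p><xsl:value-of select="normalize-space(.)"/></p>
  </xsl:template>

  <!-- drop text nodes that contain only whitespace -->
  <xsl:template match="page/text()[not(normalize-space())]"/>

  <!-- the separator itself produces no output -->
  <xsl:template match="NewParagraph"/>
</xsl:stylesheet>
```

Because element names in XML are case-sensitive, the last template only matters if the input really uses NewParagraph; an unmatched empty element produces no output under the built-in rules either way.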


answered 2014-04-23T07:16:33.347

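If you are willing to "cheat" a little, you can manually insert XML tags into the result document that are not part of the node tree but plain text. A downstream processor that re-parses the output will not notice the difference, however.

Given the input from my other answer, the following XSLT 1.0 transformation will do the trick (preserving subtrees within paragraphs):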

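<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />

  <xsl:template match="page">
    <xsl:copy>
      <!-- one enclosing paragraph per page; newParagraph splits it below -->
      <P>
        <xsl:apply-templates select="node()"/>
      </P>
    </xsl:copy>
  </xsl:template>

  <!-- this default rule recursively copies all substructures within a paragraph at tag level -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- this default rule makes sure that texts between the tags are printed -->
  <xsl:template match="text()">
    <xsl:copy-of select="."/>
  </xsl:template>

  <xsl:template match="newParagraph">
    <!-- This inserts a matching closing and opening tag -->
    <xsl:value-of select="'&lt;/P&gt;&lt;P&gt;'" disable-output-escaping="yes" />
  </xsl:template>

</xsl:stylesheet>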

answered 2014-04-24T06:28:51.733