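Update: I think I have now answered most of this question myself, except for the <pgBreak> handling. You can see my update and my current XSLT under EDIT at the end of this post.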

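I asked a similar question yesterday and got a great answer. However, I later realized that it did not cover all my bases, so today I am asking a more detailed question.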

XML input

<?xml version="1.0" encoding="UTF-8"?>    
<root>
<pgBreak pgId="i"/>
    <p xml:id="a-01">
        <highlight rend="italic">Bacon ipsum dolor sit amet</highlight> bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip 
        tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
        bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
        bacon filet mignon pork chop tail.
        <note.ref id="0001"><super>1</super></note.ref>
        <note id="0001">
            <p>
                You may need to consult a <highlight rend="italic">latin</highlight> butcher. Good Luck.
            </p>
        </note>   
        Pork loin <pgBreak pgId="01"/> ribeye bacon pastrami drumstick sirloin, shoulder pig jowl. Salami brisket rump ham, tail
        hamburger strip steak pig ham hock short ribs jerky shank beef spare ribs. Capicola short ribs swine   
        beef meatball jowl pork belly. Doner leberkas short ribs, flank chuck pancetta bresaola bacon ham 
        hock pork hamburger fatback.
    </p>
    <p xml:id="a-02">
        Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip 
        tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
        bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
        bacon filet mignon pork chop tail.
    </p>
    <p xml:id="a-03">
        Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip 
        tongue. 
            <quote>
                <p> 1.
                    Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
                    bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
                    bacon filet mignon pork chop tail.
                </p>
                <p> 2.
                    Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
                    bone. Sirloin <pgBreak pgId="02"/>turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
                    bacon filet mignon pork chop tail.
                </p>
                <p> 3.
                    Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
                    bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
                    bacon filet mignon pork chop tail.
                </p>
            </quote>
    </p>
</root>

HTML output

  <!DOCTYPE HTML>
<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
      <title>Test</title>
   </head>
   <body>
      <div id="pg-i">
        Page i
      </div>
      <p data-chunkid="a-01"> 
         <span class="highlight-italic">Bacon ipsum dolor sit amet</span>bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip 
         tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin
         pastrami t-
         bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef
         hamburger 
         bacon filet mignon pork chop tail.
         <span class="noteRef" id="0001"><sup>1</sup></span></p>
      <div id="note-0001" data-chunkid="a-01">
         <p>
            You may need to consult a <span class="highlight-italic">latin</span> butcher. Good Luck.

         </p>
      </div>
      <p data-chunkid="a-01">   
         Pork loin
      </p>
      <div id="pg-01">
          Page 01
       </div>
        <p data-chunkId="a-01">
         ribeye bacon pastrami drumstick sirloin, shoulder pig jowl. Salami brisket
         rump ham, tail
         hamburger strip steak pig ham hock short ribs jerky shank beef spare ribs. Capicola
         short ribs swine   
         beef meatball jowl pork belly. Doner leberkas short ribs, flank chuck pancetta bresaola
         bacon ham 
         hock pork hamburger fatback.
       </p>
      <p data-chunkid="a-02"><span class="highlight-italic">Bacon ipsum dolor sit</span> amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip 
         tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin
         pastrami t-
         bone. Sirloin turducken short ribs <span class="highlight-bold">t-bone</span> andouille strip steak pork loin corned beef hamburger 
         bacon filet mignon pork chop tail.

      </p>

      <p data-chunkid="a-03">
         Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs
         doner tri-tip 
         tongue. 

      </p>
      <blockquote data-chunkid="a-03">
        <p> 1.
            Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
            bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
            bacon filet mignon pork chop tail.
        </p>
         <p>2.
               Tri-tip ground round <span class="highlight-italic">short ribs</span> capicola meatloaf shank drumstick short loin pastrami t-
               bone. Sirloin 
          </p>
       </blockquote>
       <div id="pg-02">
         Page: 02
       </div>
       <blockquote data-chunkid="a-03"> 
         <p>
               turducken short ribs t-bone andouille strip steak pork loin corned beef
               hamburger bacon filet mignon pork chop tail.

         </p>
        <p> 3.
            Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
            bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
            bacon filet mignon pork chop tail.
        </p>

      </blockquote>
      <p data-chunkid="a-03">
         Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs
         doner tri-tip 
         tongue. 

      </p>
   </body>
</html>

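I want to convert the XML to HTML5 while keeping each chunk (xml:id) together. I want to avoid divitis (overuse of div), so wrapping each p in a div is not an option, but I am also trying to avoid invalid HTML. For example, it would be easy to take the parent p (xml:id=a-01) and wrap it around its descendants; however, a block-level <div> or another <p> inside a <p> is invalid, and the browser would treat everything after the nested block ends as orphaned text.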

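I have tried various modifications of the XSLT from yesterday's question. However, I find myself in somewhat unfamiliar territory. I would also benefit from a concise explanation of the solution, so that I can start to understand XSLT better, since it looks like I will be spending a lot more time with it over the next few months. I should probably pick up Michael Kay's book or something.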

EDIT: the current version of the XSLT I am using

Note: I have not attempted the page breaks yet. Also, I cannot get the <meta> tag to close; oXygen 14 keeps complaining about it.

<xsl:template match="/">
    <html>
        <body>
            <xsl:apply-templates/>
        </body>
    </html>
</xsl:template>

<xsl:template match="p[not((parent::note,.//p, .//div))]">
    <p data-chunkID="{@xml:id}">
        <xsl:apply-templates/>
    </p>
</xsl:template>

<xsl:template match="p[.//p, .//div]">
    <xsl:for-each-group select="node()" group-adjacent="boolean((self::text(), self::note.ref,self::highlight))">
        <xsl:choose>
            <xsl:when test="current-grouping-key()">
                <p data-chunkID="{../@xml:id}">
                    <xsl:apply-templates select="current-group()"/>
                </p>
            </xsl:when>
            <xsl:when test="self::p">
                <p>
                    <xsl:apply-templates/>
                </p>
            </xsl:when>
            <xsl:otherwise>
                <xsl:apply-templates select="current-group()"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:for-each-group>
</xsl:template>

<xsl:template match="note.ref">
    <span class="noteRef" id="{@id}">
        <xsl:apply-templates/>
    </span>
</xsl:template>

<xsl:template match="super">
    <sup>
        <xsl:apply-templates/>
    </sup>
</xsl:template>

<xsl:template match="note">
    <div id="note-{@id}" data-chunkID="{../@xml:id}">
        <p>
        <xsl:apply-templates/>
        </p>
    </div>
</xsl:template>


<xsl:template match="quote">
    <blockquote data-chunkID="{../@xml:id}">
        <p>
        <xsl:apply-templates/>
        </p>
    </blockquote>
</xsl:template>



<xsl:template match="highlight">
    <xsl:variable name="class" select="concat(name(.),'-',string(@rend))"/>
    <xsl:choose>
        <xsl:when test="@rend[.= 'italic']">
            <span class="{$class}">
                <xsl:apply-templates/>
            </span>
        </xsl:when>
        <xsl:when test="@rend[.= 'bold']">
            <span class="{$class}">
                <xsl:apply-templates/>
            </span>
        </xsl:when>
        <xsl:otherwise>
            <span class="{$class}">
                <xsl:apply-templates/>
            </span>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

1 Answer

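It looks like your input and your output are a bit inconsistent. (Is this the expected output, or the output you are getting now?) Chunks a-02 and a-03 have no <highlight> elements in the input, but the output has <span class="highlight..."> elements. Also, chunk a-03 has duplicated text after the blockquote.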

I believe I have worked out a viable solution that does everything in your example. Could you give this a try?

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <html>
      <head>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
        <title>Test</title>
      </head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="p | div">
    <xsl:variable name="breaks" select="note | pgBreak | quote" />
    <xsl:variable name="firstNonBreak" select="node()[count(. | $breaks) != count($breaks)][1]" />
    <xsl:variable name="nonBreaksAfterBreak"
                  select="$breaks/following-sibling::node()[1][count(. | $breaks) != count($breaks)]" />

    <xsl:apply-templates select="$breaks | $firstNonBreak | $nonBreaksAfterBreak" mode="sectChild" />
  </xsl:template>

  <!-- Template to output the chunk id attribute of a particular hierarchy -->
  <xsl:template name="ChunkId">
    <xsl:variable name="id" select="ancestor::*[../self::root]/@xml:id" />
    <xsl:if test="$id">
      <xsl:attribute name="data-chunkid">
        <xsl:value-of select="$id"/>
      </xsl:attribute>
    </xsl:if>
  </xsl:template>

  <!-- Splitting types - notes, page breaks, quotes -->
  <xsl:template match="pgBreak" mode="sectChild">
    <div id="pg-{@pgId}">
      <xsl:value-of select="concat('Page ', @pgId)"/>
    </div>
  </xsl:template>

  <xsl:template match="quote | note" mode="sectChild">
    <xsl:apply-templates />
  </xsl:template>

  <!-- Receives the first node of each block of content outside of the splitting types
       and passes processing onto itself and siblings within its block-->
  <xsl:template match="text() | highlight | note.ref | super" mode="sectChild">

    <xsl:variable name="content">
      <xsl:apply-templates select="." mode="buildContent" />
    </xsl:variable>

    <xsl:if test="normalize-space($content)">
      <xsl:call-template name="Nest">
        <xsl:with-param name="hierarchy" select="ancestor::*[not(self::root)]" />
        <xsl:with-param name="content" select="$content" />
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- Recursive template to output nodes from the top level down to content -->
  <xsl:template name="Nest">
    <xsl:param name="topLevel" select="true()"/>
    <xsl:param name="hierarchy" />
    <xsl:param name="content" />

    <xsl:variable name="top" select="$hierarchy[1]" />
    <xsl:variable name="remainder" select="$hierarchy[position() > 1]" />

    <!-- If there's a quote or note yet to come, don't output tags until we get there -->
    <xsl:variable name="skipTags" select="boolean($remainder[self::quote or self::note])" />
    <!-- Recursive output is captured in a variable, to be output later in this template -->
    <xsl:variable name="inside">
      <xsl:if test="$hierarchy">
        <xsl:call-template name="Nest">
          <xsl:with-param name="topLevel" select="$topLevel and $skipTags" />
          <xsl:with-param name="hierarchy" select="$remainder" />
          <xsl:with-param name="content" select="$content" />
        </xsl:call-template>
      </xsl:if>
    </xsl:variable>

    <xsl:choose>
      <xsl:when test="not($hierarchy)">
        <xsl:copy-of select="$content" />
      </xsl:when>
      <xsl:when test="$top/self::quote">
        <blockquote>
          <xsl:call-template name="ChunkId" />
          <xsl:copy-of select="$inside"/>
        </blockquote>
      </xsl:when>
      <xsl:when test="$top/self::note">
        <div id="note-{$top/@id}">
          <xsl:call-template name="ChunkId" />
          <xsl:copy-of select="$inside"/>
        </div>
      </xsl:when>
      <xsl:when test="not($skipTags)">
        <xsl:element name="{name($top)}">
          <xsl:if test="$topLevel">
            <xsl:call-template name="ChunkId" />
          </xsl:if>
          <xsl:copy-of select="$inside"/>
        </xsl:element>
      </xsl:when>
      <xsl:otherwise>
        <xsl:copy-of select="$inside"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="node()" mode="buildContent">
    <xsl:if test="not(self::note or self::quote or self::pgBreak)">
      <!-- output this node -->
      <xsl:apply-templates select="self::node()[normalize-space(.)]" mode="contentOutput" />
      <!-- pass processing onto next sibling -->
      <xsl:apply-templates select="following-sibling::node()[1]" mode="buildContent" />
    </xsl:if>
  </xsl:template>

  <!-- Bottom level content - text, note refs, superscript, highlight-->
  <xsl:template match="text()" mode="contentOutput">
    <xsl:copy-of select="."/>
  </xsl:template>

  <xsl:template match="note.ref" mode="contentOutput">
    <span class="noteRef" id="{@id}">
      <xsl:apply-templates mode="contentOutput"/>
    </span>
  </xsl:template>

  <xsl:template match="super" mode="contentOutput">
    <sup>
      <xsl:apply-templates mode="contentOutput"/>
    </sup>
  </xsl:template>

  <xsl:template match="highlight" mode="contentOutput">
    <xsl:variable name="class" select="concat(name(.),'-',string(@rend))"/>
    <span class="{$class}">
      <xsl:apply-templates mode="contentOutput"/>
    </span>
  </xsl:template>
</xsl:stylesheet>
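
For readers new to XSLT 1.0, the `p | div` template in this stylesheet relies on a common idiom for testing whether a node belongs to a node-set: `count(. | $breaks) = count($breaks)` is true only when adding the current node to `$breaks` does not enlarge the set, that is, when the node is already a member. The `!=` form used in the stylesheet selects the non-members, and `$breaks/following-sibling::node()[1][...]` picks the node right after each break when that node is not itself a break. Here is a minimal standalone sketch of just the membership test; the `result`, `break`, and `content` output elements are hypothetical names chosen for illustration and do not appear in the stylesheet.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Wrap everything in one (hypothetical) root so the result is well-formed -->
  <xsl:template match="/root">
    <result>
      <xsl:apply-templates select="p"/>
    </result>
  </xsl:template>

  <xsl:template match="p">
    <!-- The node-set we test membership against -->
    <xsl:variable name="breaks" select="note | pgBreak | quote"/>
    <xsl:for-each select="node()">
      <xsl:choose>
        <!-- The union does not grow, so this node is one of $breaks -->
        <xsl:when test="count(. | $breaks) = count($breaks)">
          <break name="{name()}"/>
        </xsl:when>
        <!-- Any other child that is not whitespace-only counts as content -->
        <xsl:when test="normalize-space(.)">
          <content>
            <xsl:value-of select="substring(normalize-space(.), 1, 20)"/>
          </content>
        </xsl:when>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against the sample input, this should label each child of a top-level `p` as either a break or a piece of content, which is the same partition the `p | div` template uses to decide where a paragraph has to be split.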

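I believe the unclosed meta tag is a result of using method="html". You may need to use method="xml" to get a closed meta tag. With method="html", this stylesheet produces the following output from your sample input: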

<html>
  <head>
    <META http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Test</title>
  </head>
  <body>
  <p data-chunkid="a-01"><span class="highlight-italic">Bacon ipsum dolor sit amet</span> bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip
    tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
    bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger
    bacon filet mignon pork chop tail.
    <span class="noteRef" id="0001">
      <sup>1</sup>
    </span></p>
      <div id="note-0001" data-chunkid="a-01">
      <p>
        You may need to consult a <span class="highlight-italic">latin</span> butcher. Good Luck.
      </p>
    </div>
    <p data-chunkid="a-01">
    Pork loin </p>
    <div id="pg-01">Page 01</div>
    <p data-chunkid="a-01"> ribeye bacon pastrami drumstick sirloin, shoulder pig jowl. Salami brisket rump ham, tail
    hamburger strip steak pig ham hock short ribs jerky shank beef spare ribs. Capicola short ribs swine
    beef meatball jowl pork belly. Doner leberkas short ribs, flank chuck pancetta bresaola bacon ham
    hock pork hamburger fatback.
  </p>
  <p data-chunkid="a-02">
    Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip
    tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
    bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger
    bacon filet mignon pork chop tail.
  </p>
  <p data-chunkid="a-03">
    Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip
    tongue.
    </p>
      <blockquote data-chunkid="a-03">
      <p>
        Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
        bone. Sirloin </p>
    </blockquote>
    <div id="pg-02">Page 02</div>
    <blockquote data-chunkid="a-03">
      <p>turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger
        bacon filet mignon pork chop tail.
      </p>
    </blockquote>

</body>
</html>

By changing the method to "xml" and adding the meta element to the transform by hand, you get the same result, but with the following <head>:

  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Test</title>
  </head>
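
If your processor supports XSLT 2.0 (the `xsl:for-each-group` in the question suggests it does), the `xhtml` output method is another option that may be worth testing for closed void elements. The following is only a sketch, untested against oXygen 14: it assumes the literal result elements are placed in the XHTML namespace, and it leaves the Content-Type `meta` to the serializer, which should add one to `head` by default.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns="http://www.w3.org/1999/xhtml">
  <!-- Assumption: an XSLT 2.0 processor; "xhtml" is not defined in XSLT 1.0 -->
  <xsl:output method="xhtml" indent="yes" encoding="UTF-8"/>

  <xsl:template match="/">
    <html>
      <head>
        <!-- No literal meta here: the xhtml serializer is expected to insert it -->
        <title>Test</title>
      </head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The default namespace declared on `xsl:stylesheet` only affects the elements being output; match patterns such as `p` or `quote` still refer to the un-namespaced input elements.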
Answered 2013-01-18T11:06:39.300