1

将大量文本分成任意数量的相等部分。

我可能已经厌倦了他们愚蠢的问题,但我还有一个问题。

我有一大段文字

<p>
    Sed ut perspiciatis, unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, 
    totam rem aperiam eaque ipsa, quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt, 
    explicabo. Nemo enim ipsam voluptatem, quia voluptas sit,
</p>
<p> 
    aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos, qui ratione voluptatem sequi 
    nesciunt, neque porro quisquam est, qui dolorem ipsum, quia dolor sit, amet, consectetur, adipisci velit, 
    sed quia non numquam eius modi tempora incidunt, ut labore et dolore magnam aliquam quaerat voluptatem. 
    Ut enim ad minima veniam, quis nostrum exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid 
    ex ea commodi consequatur? 
</p>
<p>
    Quis autem vel eum iure reprehenderit, qui in ea voluptate velit esse, quam nihil molestiae consequatur, 
    vel illum, qui dolorem eum fugiat, quo voluptas nulla pariatur? At vero eos et accusamus et iusto odio 
    dignissimos ducimus, qui blanditiis praesentium voluptatum deleniti atque corrupti, quos dolores et quas 
    molestias excepturi sint, obcaecati cupiditate non provident, similique sunt in culpa, qui officia deserunt 
    mollitia animi, id est laborum et dolorum fuga. Et harum quidem rerum facilis est et expedita distinctio. 
    Nam libero tempore, cum 
</p>
<p>
    soluta nobis est eligendi optio, cumque nihil impedit, quo minus id, quod maxime placeat, facere possimus, 
    omnis voluptas assumenda est, omnis dolor repellendus. Temporibus autem quibusdam et aut officiis debitis 
    aut rerum necessitatibus saepe eveniet, ut et voluptates repudiandae sint et molestiae non recusandae. 
    Itaque earum rerum hic tenetur a sapiente delectus, ut aut reiciendis voluptatibus maiores alias consequatur 
    aut perferendis doloribus asperiores repellat.
</p>
<p>
    Sed ut perspiciatis, unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem 
    aperiam eaque ipsa, quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt, explicabo. 
    Nemo enim ipsam voluptatem, quia voluptas sit,
</p>
<p>
    Sed ut perspiciatis, unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, 
    totam rem aperiam eaque ipsa, quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt, 
    explicabo. Nemo enim ipsam voluptatem, quia voluptas sit,
</p>
<p> 
    aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos, qui ratione voluptatem sequi 
    nesciunt, neque porro quisquam est, qui dolorem ipsum, quia dolor sit, amet, consectetur, adipisci velit, 
    sed quia non numquam eius modi tempora incidunt, ut labore et dolore magnam aliquam quaerat voluptatem. 
    Ut enim ad minima veniam, quis nostrum exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid 
    ex ea commodi consequatur?
</p>
<p>
    Quis autem vel eum iure reprehenderit, qui in ea voluptate velit esse, quam nihil molestiae consequatur, 
    vel illum, qui dolorem eum fugiat, quo voluptas nulla pariatur? At vero eos et accusamus et iusto odio 
    dignissimos ducimus, qui blanditiis praesentium voluptatum deleniti atque corrupti, quos dolores et 
    quas molestias excepturi sint, obcaecati cupiditate non provident, similique sunt in culpa, qui officia 
    deserunt mollitia animi, id est laborum et dolorum fuga. Et harum quidem rerum facilis est et expedita 
    distinctio. Nam libero tempore, cum 
</p>
<p>
    soluta nobis est eligendi optio, cumque nihil impedit, quo minus id, quod maxime placeat, facere possimus, 
    omnis voluptas assumenda est, omnis dolor repellendus. Temporibus autem quibusdam et aut officiis debitis aut 
    rerum necessitatibus saepe eveniet, ut et voluptates repudiandae sint et molestiae non recusandae. 
    Itaque earum rerum hic tenetur a sapiente delectus, ut aut reiciendis voluptatibus maiores alias consequatur 
    aut perferendis doloribus asperiores repellat.
</p>
<p>
    Quis autem vel eum iure reprehenderit, qui in ea voluptate velit esse, quam nihil molestiae consequatur, 
    vel illum, qui dolorem eum fugiat, quo voluptas nulla pariatur? At vero eos et accusamus et iusto odio 
    dignissimos ducimus, qui blanditiis praesentium voluptatum deleniti atque corrupti, quos dolores et quas 
    molestias excepturi sint, obcaecati cupiditate non provident, similique sunt in culpa, qui officia deserunt 
    mollitia animi, id est laborum et dolorum fuga. Et harum quidem rerum facilis est et expedita distinctio. 
    Nam libero tempore, cum 
</p>
<p>
    Sed ut perspiciatis, unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, 
    totam rem aperiam eaque ipsa, quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt, 
    explicabo. Nemo enim ipsam voluptatem, quia voluptas sit,
</p>
<p>
    soluta nobis est eligendi optio, cumque nihil impedit, quo minus id, quod maxime placeat, facere 
    possimus, omnis voluptas assumenda est, omnis dolor repellendus. Temporibus autem quibusdam et aut 
    officiis debitis aut rerum necessitatibus saepe eveniet, ut et voluptates repudiandae sint et molestiae 
    non recusandae. Itaque earum rerum hic tenetur a sapiente delectus, ut aut reiciendis voluptatibus maiores 
    alias consequatur aut perferendis doloribus asperiores repellat.
</p>
<p>
    Sed ut perspiciatis, unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam 
    rem aperiam eaque ipsa, quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt, 
    explicabo. Nemo enim ipsam voluptatem, quia voluptas sit,
</p>

在出口处,我需要将文本分成“n”个相等的部分,以便在这些部分中的文本数量大致相同。然后我将这些部分排列成列并且需要这些列看起来大约相同的高度。另一个条件:你可以打破的标签(我的意思是如果标签“p”包含很多文本,它可以分为两部分,引入另一列)。我认为这是一项艰巨的任务,我将不胜感激。

我知道 XSLT 不是排版工具。但是是否可以将文本分成多个部分,每个部分的字符数相同?

4

1 回答 1

2

使用 XSLT 进行文本对齐是可能的(我多年前就做过)。这是一个小示例,您可以将其用作执行完整任务的基础:

这种转变

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:f="http://fxsl.sf.net/"
xmlns:ext="http://exslt.org/common"
xmlns:str-split2lines-func="f:str-split2lines-func"
exclude-result-prefixes="xsl ext str-split2lines-func"
>


   <xsl:import href="dvc-str-foldl.xsl"/>

   <!-- to be applied on text.xml -->

   <str-split2lines-func:str-split2lines-func/>

   <xsl:output indent="yes" omit-xml-declaration="yes"/>

    <xsl:template match="/">
      <xsl:call-template name="str-split-to-lines">
        <xsl:with-param name="pStr" select="/*"/>
        <xsl:with-param name="pLineLength" select="64"/>
        <xsl:with-param name="pDelimiters" select="' &#9;&#10;&#13;'"/>
      </xsl:call-template>
    </xsl:template>

    <xsl:template name="str-split-to-lines">
      <xsl:param name="pStr"/>
      <xsl:param name="pLineLength" select="60"/>
      <xsl:param name="pDelimiters" select="' &#9;&#10;&#13;'"/>

      <xsl:variable name="vsplit2linesFun"
                    select="document('')/*/str-split2lines-func:*[1]"/>

      <xsl:variable name="vrtfParams">
       <delimiters><xsl:value-of select="$pDelimiters"/></delimiters>
       <lineLength><xsl:copy-of select="$pLineLength"/></lineLength>
      </xsl:variable>

      <xsl:variable name="vResult">
          <xsl:call-template name="dvc-str-foldl">
            <xsl:with-param name="pFunc" select="$vsplit2linesFun"/>
            <xsl:with-param name="pStr" select="$pStr"/>
            <xsl:with-param name="pA0" select="ext:node-set($vrtfParams)"/>
          </xsl:call-template>
      </xsl:variable>

      <xsl:for-each select="ext:node-set($vResult)/line">
        <xsl:for-each select="word">
          <xsl:value-of select="concat(., ' ')"/>
        </xsl:for-each>
        <xsl:value-of select="'&#xA;'"/>
      </xsl:for-each>
    </xsl:template>

    <xsl:template match="str-split2lines-func:*" mode="f:FXSL">
      <xsl:param name="arg1" select="/.."/>
      <xsl:param name="arg2"/>

      <xsl:copy-of select="$arg1/*[position() &lt; 3]"/>
      <xsl:copy-of select="$arg1/line[position() != last()]"/>

      <xsl:choose>
        <xsl:when test="contains($arg1/*[1], $arg2)">
          <xsl:if test="string($arg1/word)">
             <xsl:call-template name="fillLine">
               <xsl:with-param name="pLine" select="$arg1/line[last()]"/>
               <xsl:with-param name="pWord" select="$arg1/word"/>
               <xsl:with-param name="pLineLength" select="$arg1/*[2]"/>
             </xsl:call-template>
          </xsl:if>
        </xsl:when>
        <xsl:otherwise>
          <xsl:copy-of select="$arg1/line[last()]"/>
          <word><xsl:value-of select="concat($arg1/word, $arg2)"/></word>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:template>

      <!-- Test if the new word fits into the last line -->
    <xsl:template name="fillLine">
      <xsl:param name="pLine" select="/.."/>
      <xsl:param name="pWord" select="/.."/>
      <xsl:param name="pLineLength" />

      <xsl:variable name="vnWordsInLine" select="count($pLine/word)"/>
      <xsl:variable name="vLineLength" 
       select="string-length($pLine) + $vnWordsInLine"/>
      <xsl:choose>
        <xsl:when test="not($vLineLength + string-length($pWord) 
                           > 
                            $pLineLength)">
          <line>
            <xsl:copy-of select="$pLine/*"/>
            <xsl:copy-of select="$pWord"/>
          </line>
        </xsl:when>
        <xsl:otherwise>
          <xsl:copy-of select="$pLine"/>
          <line>
            <xsl:copy-of select="$pWord"/>
          </line>
          <word/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:template>

</xsl:stylesheet>

在此 XML 文档上执行时(这是当时使用的原始文档——此 cnn 文本可以追溯到 2003 年或 2002 年):

<text>
Dec. 13 — As always for a presidential inaugural, security and surveillance were
extremely tight in Washington, DC, last January. But as George W. Bush prepared to
take the oath of office, security planners installed an extra layer of protection: a
prototype software system to detect a biological attack. The U.S. Department of
Defense, together with regional health and emergency-planning agencies, distributed
a special patient-query sheet to military clinics, civilian hospitals and even aid
stations along the parade route and at the inaugural balls. Software quickly
analyzed complaints of seven key symptoms — from rashes to sore throats — for
patterns that might indicate the early stages of a bio-attack. There was a brief
scare: the system noticed a surge in flulike symptoms at military clinics.
Thankfully, tests confirmed it was just that — the flu.
</text>

正确地将每一行对齐到不超过 64 的所需宽度结果是:

Dec. 13 — As always for a presidential inaugural, security and 
surveillance were extremely tight in Washington, DC, last 
January. But as George W. Bush prepared to take the oath of 
office, security planners installed an extra layer of 
protection: a prototype software system to detect a biological 
attack. The U.S. Department of Defense, together with regional 
health and emergency-planning agencies, distributed a special 
patient-query sheet to military clinics, civilian hospitals and 
even aid stations along the parade route and at the inaugural 
balls. Software quickly analyzed complaints of seven key 
symptoms — from rashes to sore throats — for patterns that might 
indicate the early stages of a bio-attack. There was a brief 
scare: the system noticed a surge in flulike symptoms at 
military clinics. Thankfully, tests confirmed it was just that — 
the flu. 

剩下的任务留给读者作为练习:)

于 2010-03-19T13:06:28.533 回答