xslt - XSLT - get sorted information extract from cdata

Question

I'm new to XSLT and have a problem that I can't solve on my own. Here is an example of my XML file:

<node>
  <failure><![CDATA[
    some useless information.
    CRS urn:ogc:def:crs:EPSG::25830 not defined.
    CRS urn:ogc:def:crs:EPSG::25833 not possible.
    CRS urn:ogc:def:crs:EPSG::25830 not defined. 
    some useless information.]]>
  </failure>
</node>

The main problem is that the information sits in a CDATA block, with many different kinds of information mixed together. I have found a way to get it out, but only as a single string value, with no way to tell the different kinds apart.

I need a way to extract elements that fit the pattern: "CRS [-unknown-] [id] not [result]"

What I want is something like this:

<failure>
    <CRS>
      <id> urn:ogc:def:crs:EPSG::25830 </id>
      <result> not defined </result>
    </CRS>
    <CRS>
      <id> urn:ogc:def:crs:EPSG::25833 </id>
      <result> not possible </result>
    </CRS>
    <CRS>
      <id> urn:ogc:def:crs:EPSG::25830 </id>
      <result> not defined </result>
    </CRS>
</failure>

Can somebody help me, or has anyone had experience with similar problems?

score 0 · Accepted Answer

XSLT 2.0's xsl:analyze-string is designed for exactly this task, so if at all possible I'd recommend upgrading to a 2.0 processor such as Saxon:

<xsl:template match="node">
  <failure>
    <xsl:analyze-string select="failure"
         regex="^\s*CRS\s*(\S+)\s*(not\s*.*)$" flags="m">
      <xsl:matching-substring>
        <CRS>
          <id><xsl:value-of select="regex-group(1)" /></id>
          <result><xsl:value-of select="normalize-space(regex-group(2))" /></result>
        </CRS>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </failure>
</xsl:template>
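
Note that the template above keeps the trailing full stop in the result ("not defined."), whereas the desired output in the question has none. A minimal sketch of a complete stylesheet that drops it, assuming the result text itself never contains a full stop, could look like this. The adjusted regex and the xsl:output settings are illustrative choices, not part of the original answer:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" />

  <xsl:template match="node">
    <failure>
      <!-- Assumption: the result never contains a '.' of its own, so the
           second group captures from "not" up to the first full stop or
           the end of the line, whichever comes first -->
      <xsl:analyze-string select="failure"
           regex="^\s*CRS\s+(\S+)\s+(not\s+[^.\r\n]*)" flags="m">
        <xsl:matching-substring>
          <CRS>
            <id><xsl:value-of select="regex-group(1)" /></id>
            <result><xsl:value-of select="normalize-space(regex-group(2))" /></result>
          </CRS>
        </xsl:matching-substring>
        <!-- no xsl:non-matching-substring, so all other text is discarded -->
      </xsl:analyze-string>
    </failure>
  </xsl:template>

</xsl:stylesheet>
```

If a result may legitimately contain a full stop, the original regex is the safer choice and the trailing character would have to be trimmed separately.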

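By contrast, the string manipulation tools in XSLT 1.0 are extremely limited, and since XSLT is a functional language without updatable variables, you have to write a fairly complex set of recursive call-template logic to split the text into separate lines, and then pull the relevant bits out of each line using substring-before and substring-after in turn.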

<xsl:template name="each-line">
  <xsl:param name="val" />
  <!-- pull out everything before the first newline and normalize (trim leading
       and trailing whitespace and squash internal whitespace to a single space
       character); the concat() ensures that a final line without a trailing
       newline is still picked up -->
  <xsl:variable name="firstLine"
    select="normalize-space(substring-before(concat($val, '&#10;'), '&#10;'))" />
  <!-- pull out everything after the first newline -->
  <xsl:variable name="rest" select="substring-after($val, '&#10;')" />
  <xsl:if test="$firstLine">
    <xsl:call-template name="process-line">
      <xsl:with-param name="line" select="$firstLine" />
    </xsl:call-template>
  </xsl:if>
  <!-- if there are still some non-empty lines left then process them recursively -->
  <xsl:if test="normalize-space($rest)">
    <xsl:call-template name="each-line">
      <xsl:with-param name="val" select="$rest" />
    </xsl:call-template>
  </xsl:if>
</xsl:template>

<xsl:template name="process-line">
  <xsl:param name="line" />
  <xsl:if test="starts-with($line, 'CRS ') and contains($line, ' not ')">
    <!-- here $line will be something like
         "CRS urn:ogc:def:crs:EPSG::25830 not defined." -->
    <CRS>
      <!-- ID is everything between the first and second spaces -->
      <id><xsl:value-of select="substring-before(substring-after($line, ' '), ' ')" /></id>
      <!-- result is everything after the second space -->
      <result><xsl:value-of select="substring-after(substring-after($line, ' '), ' ')" /></result>
    </CRS>
  </xsl:if>
</xsl:template>

You can call this logic using a construct like this:

<xsl:template match="node">
  <failure>
    <xsl:call-template name="each-line">
      <xsl:with-param name="val" select="failure" />
    </xsl:call-template>
  </failure>
</xsl:template>
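
The XSLT 1.0 version also leaves the trailing full stop in the result ("not defined."). If it should be removed, as in the desired output, one option is a small helper template along these lines. This is only a sketch: the name strip-trailing-dot and its text parameter are hypothetical and not part of the answer above.

```xml
<!-- Hypothetical helper: outputs $text without a single trailing '.' -->
<xsl:template name="strip-trailing-dot">
  <xsl:param name="text" />
  <xsl:choose>
    <!-- substring($text, string-length($text)) is the last character -->
    <xsl:when test="substring($text, string-length($text)) = '.'">
      <xsl:value-of select="substring($text, 1, string-length($text) - 1)" />
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$text" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
```

Inside process-line, the content of the result element would then become a call-template to strip-trailing-dot, passing the same substring-after expression as the text parameter.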
