xml - XSLT: how to look up values using a comma-separated string?

Question

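I have figured out how to do a string lookup and how to parse a comma-separated string, each on its own. But I would like to know whether there is an efficient way to do both at once. Here is my source XML: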

<?xml version="1.0" ?>
     <MATRIX>
     <DATA_RECORD>
      <COMPONENT1>1, 2</COMPONENT1> 
      <COMPONENT2>6, 7, 8, 9</COMPONENT2>  
     </DATA_RECORD>
    </MATRIX>

I would like to produce the following XML by parsing the comma-separated string and looking up each token:

<?xml version="1.0" encoding="UTF-8"?>
     <MATRIX>
     <DATA_RECORD>
      <COMPONENT1>A, B</COMPONENT1> 
      <COMPONENT2>F, G, H, I</COMPONENT2>  
     </DATA_RECORD>
    </MATRIX>

Here is my lookup XML (COMPONENT_LOOKUPLIST.xml):

<?xml version="1.0" ?>
    <MAIN>
      <DATA_RECORD>
        <COMPONENT_ID>1</COMPONENT_ID>
        <COMPONENT_NAME>A</COMPONENT_NAME>    
      </DATA_RECORD>
      <DATA_RECORD>
        <COMPONENT_ID>2</COMPONENT_ID>
        <COMPONENT_NAME>B</COMPONENT_NAME>    
      </DATA_RECORD>
        <DATA_RECORD>
        <COMPONENT_ID>3</COMPONENT_ID>
        <COMPONENT_NAME>C</COMPONENT_NAME>    
      </DATA_RECORD>
      <DATA_RECORD>
        <COMPONENT_ID>4</COMPONENT_ID>
        <COMPONENT_NAME>D</COMPONENT_NAME>    
      </DATA_RECORD>
        <DATA_RECORD>
        <COMPONENT_ID>5</COMPONENT_ID>
        <COMPONENT_NAME>E</COMPONENT_NAME>    
      </DATA_RECORD>
      <DATA_RECORD>
        <COMPONENT_ID>6</COMPONENT_ID>
        <COMPONENT_NAME>F</COMPONENT_NAME>    
      </DATA_RECORD>
        <DATA_RECORD>
        <COMPONENT_ID>7</COMPONENT_ID>
        <COMPONENT_NAME>G</COMPONENT_NAME>    
      </DATA_RECORD>
      <DATA_RECORD>
        <COMPONENT_ID>8</COMPONENT_ID>
        <COMPONENT_NAME>H</COMPONENT_NAME>    
      </DATA_RECORD>
      <DATA_RECORD>
        <COMPONENT_ID>9</COMPONENT_ID>
        <COMPONENT_NAME>I</COMPONENT_NAME>    
      </DATA_RECORD>
    </MAIN>

I am a beginner in XSLT. Could an XSLT expert share some ideas or provide sample code? I got the tokenizing code from Jeni's website:

 <xsl:template name="tokenize">
          <xsl:param name="string" />
          <xsl:param name="delimiter" select="','" />
          <xsl:choose>
            <xsl:when test="$delimiter and contains($string, $delimiter)">
              <token>
                <xsl:value-of select="substring-before($string, $delimiter)" />
              </token>

              <xsl:call-template name="tokenize">
                <xsl:with-param name="string" 
                                select="substring-after($string, $delimiter)" />
                <xsl:with-param name="delimiter" select="$delimiter" />
              </xsl:call-template>
            </xsl:when>

            <xsl:otherwise>
              <token><xsl:value-of select="$string" /></token>

            </xsl:otherwise>
          </xsl:choose>
        </xsl:template>

    <xsl:call-template name="tokenize">    
            <xsl:with-param name="string" select="/MATRIX/DATA_RECORD/COMPONENT1"></xsl:with-param>
          </xsl:call-template>

            <xsl:call-template name="tokenize">    
            <xsl:with-param name="string" select="/MATRIX/DATA_RECORD/COMPONENT2"></xsl:with-param>
          </xsl:call-template>

and wrote a lookup:

<xsl:variable name="lookup" select="document('COMPONENT_LOOKUPLIST.xml')/MAIN/DATA_RECORD"/>
                <xsl:for-each select="//DATA_RECORD">

 <token>
  <xsl:for-each select="*">      
   <xsl:value-of select="$lookup[COMPONENT_ID = current()]/COMPONENT_NAME"/>        
          </xsl:for-each>   
          </token>
        </xsl:for-each>

But combining the two seems challenging.

Thank you.

score 0 · Accepted Answer

If you are stuck with XSLT 1.0, one solution is to amend the tokenize template so that it does the lookup itself rather than emitting token elements. (If you kept the token elements, you would need a two-pass transform to turn them back into strings based on your lookup.)

So, instead of doing this in the tokenize template

<token>
   <xsl:value-of select="substring-before($string, $delimiter)" />
</token>

do this instead:

<xsl:value-of select="$lookup[COMPONENT_ID=normalize-space(substring-before($string, $delimiter))]/COMPONENT_NAME"/>
 <xsl:value-of select="$delimiter"/>

Here is the full XSLT

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output omit-xml-declaration="yes" method="html"/>

   <xsl:variable name="lookup" select="document('C:\COMPONENT_LOOKUPLIST.xml')/MAIN/DATA_RECORD"/>

   <xsl:template match="DATA_RECORD/*">
      <xsl:copy>
         <xsl:call-template name="tokenize">
            <xsl:with-param name="string" select="."/>
         </xsl:call-template>
      </xsl:copy>
   </xsl:template>

   <xsl:template match="@*|node()">
      <xsl:copy>
         <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
   </xsl:template>

   <xsl:template name="tokenize">
      <xsl:param name="string"/>
      <xsl:param name="delimiter" select="','"/>
      <xsl:choose>
         <xsl:when test="$delimiter and contains($string, $delimiter)">
            <xsl:value-of select="$lookup[COMPONENT_ID=normalize-space(substring-before($string, $delimiter))]/COMPONENT_NAME"/>
            <xsl:value-of select="$delimiter"/>
            <xsl:call-template name="tokenize">
               <xsl:with-param name="string" select="substring-after($string, $delimiter)"/>
               <xsl:with-param name="delimiter" select="$delimiter"/>
            </xsl:call-template>
         </xsl:when>
         <xsl:otherwise>
            <xsl:value-of select="$lookup[COMPONENT_ID=normalize-space($string)]/COMPONENT_NAME"/>
         </xsl:otherwise>
      </xsl:choose>
   </xsl:template>
</xsl:stylesheet>

When applied to your XML, the following is output

<MATRIX>
   <DATA_RECORD>
      <COMPONENT1>A,B</COMPONENT1>
      <COMPONENT2>F,G,H,I</COMPONENT2>
   </DATA_RECORD>
</MATRIX>

Note, this does remove the spaces too, but hopefully that isn't an issue...

EDIT: If you do want to keep the spaces, you could do something like this:

<xsl:variable name="current" select="substring-before($string, $delimiter)" />
<xsl:value-of select="substring-before($current, normalize-space($current))" /> 
<xsl:value-of select="$lookup[COMPONENT_ID=normalize-space($current)]/COMPONENT_NAME"/>
<xsl:value-of select="substring-after($current, normalize-space($current))" />
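The whitespace-preserving fragment covers only the branch where a delimiter is still present; the last token needs the same treatment. One way to avoid repeating the logic is to move it into a helper template. The following is only a sketch under a few assumptions: the helper name `lookup-token` and its `token` parameter are made up for illustration, and the global `$lookup` variable is assumed to be declared as in the full stylesheet above. It would replace the `tokenize` template there.

```xml
<!-- Splits $string on $delimiter and hands each raw token to lookup-token. -->
<xsl:template name="tokenize">
   <xsl:param name="string"/>
   <xsl:param name="delimiter" select="','"/>
   <xsl:choose>
      <xsl:when test="$delimiter and contains($string, $delimiter)">
         <xsl:call-template name="lookup-token">
            <xsl:with-param name="token" select="substring-before($string, $delimiter)"/>
         </xsl:call-template>
         <xsl:value-of select="$delimiter"/>
         <xsl:call-template name="tokenize">
            <xsl:with-param name="string" select="substring-after($string, $delimiter)"/>
            <xsl:with-param name="delimiter" select="$delimiter"/>
         </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
         <!-- Last (or only) token: no delimiter left -->
         <xsl:call-template name="lookup-token">
            <xsl:with-param name="token" select="$string"/>
         </xsl:call-template>
      </xsl:otherwise>
   </xsl:choose>
</xsl:template>

<!-- Hypothetical helper: writes the token's leading whitespace, the looked-up
     name, then its trailing whitespace. Assumes the ID itself contains no
     internal whitespace, otherwise normalize-space() would alter it. -->
<xsl:template name="lookup-token">
   <xsl:param name="token"/>
   <xsl:variable name="id" select="normalize-space($token)"/>
   <xsl:value-of select="substring-before($token, $id)"/>
   <xsl:value-of select="$lookup[COMPONENT_ID = $id]/COMPONENT_NAME"/>
   <xsl:value-of select="substring-after($token, $id)"/>
</xsl:template>
```

With the sample input, the tokens of `1, 2` are `1` and ` 2`, so the space after the comma is written back in front of `B` and the element content should come out as `A, B`.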

score 0 · Accepted Answer

Have you considered using an XSLT 2.0 processor such as Saxon 9, AltovaXML, or XmlPrime? In that case you can simply do, for example:

<xsl:key name="by-id" match="DATA_RECORD" use="COMPONENT_ID"/>

<xsl:param name="lk-doc-url" select="'COMPONENT_LOOKUPLIST.xml'"/>
<xsl:variable name="lk-doc" select="document($lk-doc-url)"/>

<xsl:template match="*[starts-with(local-name(), 'COMPONENT')]">
  <xsl:copy>
    <xsl:value-of select="for $id in tokenize(., ', ') return key('by-id', $id, $lk-doc)/COMPONENT_NAME"
       separator=", "/>
  </xsl:copy>
</xsl:template>

[Edit] Here is a complete, tested example:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0">



<xsl:key name="by-id" match="DATA_RECORD" use="COMPONENT_ID"/>

<xsl:param name="lk-doc-url" select="'test2013013103.xml'"/>
<xsl:variable name="lk-doc" select="document($lk-doc-url)"/>

<xsl:template match="@* | node()">
  <xsl:copy>
    <xsl:apply-templates select="@* , node()"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="*[starts-with(local-name(), 'COMPONENT')]">
  <xsl:copy>
    <xsl:value-of select="for $id in tokenize(., ', ') return key('by-id', $id, $lk-doc)/COMPONENT_NAME"
       separator=", "/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>

When I apply it with Saxon 9.4 to the input

<?xml version="1.0" ?>
     <MATRIX>
     <DATA_RECORD>
      <COMPONENT1>1, 2</COMPONENT1> 
      <COMPONENT2>6, 7, 8, 9</COMPONENT2>  
     </DATA_RECORD>
    </MATRIX>

with the lookup file test2013013103.xml being

<?xml version="1.0" ?>
    <MAIN>
      <DATA_RECORD>
        <COMPONENT_ID>1</COMPONENT_ID>
        <COMPONENT_NAME>A</COMPONENT_NAME>    
      </DATA_RECORD>
      <DATA_RECORD>
        <COMPONENT_ID>2</COMPONENT_ID>
        <COMPONENT_NAME>B</COMPONENT_NAME>    
      </DATA_RECORD>
        <DATA_RECORD>
        <COMPONENT_ID>3</COMPONENT_ID>
        <COMPONENT_NAME>C</COMPONENT_NAME>    
      </DATA_RECORD>
      <DATA_RECORD>
        <COMPONENT_ID>4</COMPONENT_ID>
        <COMPONENT_NAME>D</COMPONENT_NAME>    
      </DATA_RECORD>
        <DATA_RECORD>
        <COMPONENT_ID>5</COMPONENT_ID>
        <COMPONENT_NAME>E</COMPONENT_NAME>    
      </DATA_RECORD>
      <DATA_RECORD>
        <COMPONENT_ID>6</COMPONENT_ID>
        <COMPONENT_NAME>F</COMPONENT_NAME>    
      </DATA_RECORD>
        <DATA_RECORD>
        <COMPONENT_ID>7</COMPONENT_ID>
        <COMPONENT_NAME>G</COMPONENT_NAME>    
      </DATA_RECORD>
      <DATA_RECORD>
        <COMPONENT_ID>8</COMPONENT_ID>
        <COMPONENT_NAME>H</COMPONENT_NAME>    
      </DATA_RECORD>
      <DATA_RECORD>
        <COMPONENT_ID>9</COMPONENT_ID>
        <COMPONENT_NAME>I</COMPONENT_NAME>    
      </DATA_RECORD>
    </MAIN>

the output is

<?xml version="1.0" encoding="UTF-8"?><MATRIX>
     <DATA_RECORD>
      <COMPONENT1>A, B</COMPONENT1>
      <COMPONENT2>F, G, H, I</COMPONENT2>
     </DATA_RECORD>
    </MATRIX>

So my suggestion works; I am not sure what is different in your case, where you do not get any content.
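One possible cause of empty output is input that does not use exactly a comma followed by a single space, because `tokenize(., ', ')` then yields tokens that match no key value. As a sketch only, the template below is a more tolerant variant: it splits on a comma with any surrounding whitespace and, as an assumption about the desired behaviour, falls back to the raw ID when the lookup finds nothing. It relies on the `by-id` key and the `$lk-doc` variable declared in the stylesheet above.

```xml
<xsl:template match="*[starts-with(local-name(), 'COMPONENT')]">
  <xsl:copy>
    <!-- normalize-space() trims the ends; the regex absorbs any whitespace
         around each comma. The [1] picks the looked-up name when there is
         one, otherwise the original ID (assumed fallback, not in the
         original answer). -->
    <xsl:value-of
       select="for $id in tokenize(normalize-space(.), '\s*,\s*')
               return (key('by-id', $id, $lk-doc)/COMPONENT_NAME, $id)[1]"
       separator=", "/>
  </xsl:copy>
</xsl:template>
```

The output separator is always `, `, so irregular spacing in the input is normalized rather than preserved.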
