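I'm trying to get the following to work: I want to merge translated nodes, but because the node sets sometimes differ slightly, I can't do it blindly and some cases need manual review. At the same time, I like to keep my life simple, so I want to automate as much as possible. Here is an example: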

<root>
<chapter>
<string class="l1"><local xml:lang="en">Some English here</local></string>
<string class="p"><local xml:lang="en">Some other English here</local></string>
<string class="p"><local xml:lang="en">and some English here</local></string>
<string class="p"><local xml:lang="en">Some English here</local></string>
</chapter>
<chapter>
<string class="l1"><local xml:lang="fr">Some English translated to French here</local></string>
<string class="p"><local xml:lang="fr">Some other English translated to French here</local></string>
<string class="p"><local xml:lang="fr">and some English translated to French here</local></string>
<string class="p"><local xml:lang="fr">Some English translated to French here</local></string>
</chapter>
<chapter>
<string class="l1"><local xml:lang="de">Some English translated to German here</local></string>
<string class="p"><local xml:lang="de">Some other English translated to German here</local></string>
<string class="another_class"><local xml:lang="de">and some English translated to German here</local></string>
<string class="p"><local xml:lang="de">Some English translated to German here</local></string>
</chapter>
<chapter>
<string class="l1"><local xml:lang="nl">Some English translated to Dutch here</local></string>
<string class="p"><local xml:lang="nl">Some other English translated to Dutch here</local></string>
<string class="p"><local xml:lang="nl">and some English translated to Dutch here<br/>Some English translated to Dutch here</local></string>
</chapter>
</root>

The real files can contain 30 languages and hundreds of nodes, so the example above is heavily simplified.

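What I want to achieve with this example is to merge English and French, because both have the same number of elements and all their attributes are the same too. German should be left as it is because not all attributes match, and Dutch should be left as it is because the number of elements doesn't match.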

So the output should look like this:

<root>
<!-- French has the same amount of elements, and a full sequential match of attributes, so we can merge -->
<chapter>
<string class="l1">
    <local xml:lang="en">Some English here</local>
    <local xml:lang="fr">Some English translated to French here</local>
</string>
<string class="p">
    <local xml:lang="en">Some other English here</local>
    <local xml:lang="fr">Some other English translated to French here</local>
</string>
<string class="p">
    <local xml:lang="en">and some English here</local>
    <local xml:lang="fr">and some English translated to French here</local>
</string>
<string class="p">
    <local xml:lang="en">Some English here</local>
    <local xml:lang="fr">Some English translated to French here</local>
</string>
</chapter>
<!-- German has same amount of elements, but different tag sequence, so we leave it for manual review -->
<chapter>
<string class="l1"><local xml:lang="de">Some English translated to German here</local></string>
<string class="p"><local xml:lang="de">Some other English translated to German here</local></string>
<string class="another_class"><local xml:lang="de">and some English translated to German here</local></string>
<string class="p"><local xml:lang="de">Some English translated to German here</local></string>
</chapter>
<!-- Dutch has the same tag sequence but fewer elements, so we leave it for manual review -->
<chapter>
<string class="l1"><local xml:lang="nl">Some English translated to Dutch here</local></string>
<string class="p"><local xml:lang="nl">Some other English translated to Dutch here</local></string>
<string class="p"><local xml:lang="nl">and some English translated to Dutch here<br/>Some English translated to Dutch here</local></string>
</chapter>
</root>

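English is always the master reference, so I can already exclude node sets of a different size by comparing against the English node count. I just don't know how to check that all the attribute values are equal as well.

Any suggestions? (Using XSLT 2.0)

Thanks!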


2 Answers


Here is a sample XSLT 2.0 stylesheet:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:variable 
  name="master" 
  select="root/chapter[string/local/@xml:lang = 'en']"/>


<xsl:variable 
  name="matches" 
  select="root/chapter[not(string/local/@xml:lang = 'en')]
    [count(string) eq count($master/string)
     and 
      (every $i in (1 to count($master/string))
       satisfies $master/string[$i]/@class eq string[$i]/@class)]"/>

<xsl:template match="@* | node()">
  <xsl:copy>
    <xsl:apply-templates select="@* , node()"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="chapter[. intersect $master]">
  <xsl:copy>
    <xsl:apply-templates select="string"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="string[local/@xml:lang = 'en']">
  <xsl:variable name="pos" select="position()"/>
  <xsl:copy>
    <xsl:apply-templates select="@* | local | $matches/string[$pos]/local"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="chapter[. intersect $matches]"/>

</xsl:stylesheet>

When I apply it with Saxon 9.4 to the input you posted, I get this result:

<root>
   <chapter>
      <string class="l1">
         <local xml:lang="en">Some English here</local>
         <local xml:lang="fr">Some English translated to French here</local>
      </string>
      <string class="p">
         <local xml:lang="en">Some other English here</local>
         <local xml:lang="fr">Some other English translated to French here</local>
      </string>
      <string class="p">
         <local xml:lang="en">and some English here</local>
         <local xml:lang="fr">and some English translated to French here</local>
      </string>
      <string class="p">
         <local xml:lang="en">Some English here</local>
         <local xml:lang="fr">Some English translated to French here</local>
      </string>
   </chapter>
   <chapter>
      <string class="l1">
         <local xml:lang="de">Some English translated to German here</local>
      </string>
      <string class="p">
         <local xml:lang="de">Some other English translated to German here</local>
      </string>
      <string class="another_class">
         <local xml:lang="de">and some English translated to German here</local>
      </string>
      <string class="p">
         <local xml:lang="de">Some English translated to German here</local>
      </string>
   </chapter>
   <chapter>
      <string class="l1">
         <local xml:lang="nl">Some English translated to Dutch here</local>
      </string>
      <string class="p">
         <local xml:lang="nl">Some other English translated to Dutch here</local>
      </string>
      <string class="p">
         <local xml:lang="nl">and some English translated to Dutch here<br/>Some English translated to Dutch here</local>
      </string>
   </chapter>
</root>
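
As a hedged variation on the stylesheet above, the count check and the position-by-position `every ... satisfies` test can be folded into a single `deep-equal()` call over the two sequences of class values. This is only a sketch: the `master-classes` variable is a name introduced here for illustration, and it assumes a single English chapter, as in the posted input. Only the variable declarations differ from the stylesheet above.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- Assumption: exactly one chapter holds the English master strings -->
<xsl:variable
  name="master"
  select="root/chapter[string/local/@xml:lang = 'en']"/>

<!-- Hypothetical helper variable: the master's class values in document order;
     a string without @class contributes '' so positions stay aligned -->
<xsl:variable
  name="master-classes"
  select="for $s in $master/string return string($s/@class)"/>

<!-- deep-equal() is false when the sequences differ in length or in any item,
     so it covers both the count check and the per-position class check -->
<xsl:variable
  name="matches"
  select="root/chapter[not(string/local/@xml:lang = 'en')]
    [deep-equal(for $s in string return string($s/@class), $master-classes)]"/>

<!-- Identity template: copy everything not handled below -->
<xsl:template match="@* | node()">
  <xsl:copy>
    <xsl:apply-templates select="@* , node()"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="chapter[. intersect $master]">
  <xsl:copy>
    <xsl:apply-templates select="string"/>
  </xsl:copy>
</xsl:template>

<!-- Pull the local elements at the same position from every matching chapter -->
<xsl:template match="string[local/@xml:lang = 'en']">
  <xsl:variable name="pos" select="position()"/>
  <xsl:copy>
    <xsl:apply-templates select="@* | local | $matches/string[$pos]/local"/>
  </xsl:copy>
</xsl:template>

<!-- Merged chapters are dropped from their original place -->
<xsl:template match="chapter[. intersect $matches]"/>

</xsl:stylesheet>
```

One behavioural difference to be aware of: a `string` without a `class` attribute contributes an empty string here, so two such strings in the same position count as equal, whereas the `eq` comparison above treats that position as a mismatch.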
answered 2012-06-21T10:12:07.620

This transformation:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:variable name="vENSignature" select="string-join(/*/*[1]/*/@class, '+')"/>
 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="/*">
  <root>
   <xsl:for-each-group select="chapter"
    group-adjacent="string-join(*/@class, '+') eq $vENSignature">
     <xsl:choose>
       <xsl:when test="current-grouping-key() eq true()">
             <chapter>
              <xsl:apply-templates select="*"/>
            </chapter>
        </xsl:when>
        <xsl:otherwise>
          <xsl:sequence select="current-group()"/>
        </xsl:otherwise>
    </xsl:choose>
   </xsl:for-each-group>
  </root>
 </xsl:template>

 <xsl:template match="chapter/*" >
  <xsl:variable name="vPos" select="position()"/>
  <xsl:copy>
    <xsl:sequence select="@*, current-group()/*[position() = $vPos]/*"/>
  </xsl:copy>
 </xsl:template>
</xsl:stylesheet>

when applied to the provided XML document:

<root>
    <chapter>
        <string class="l1">
            <local xml:lang="en">Some English here</local>
        </string>
        <string class="p">
            <local xml:lang="en">Some other English here</local>
        </string>
        <string class="p">
            <local xml:lang="en">and some English here</local>
        </string>
        <string class="p">
            <local xml:lang="en">Some English here</local>
        </string>
    </chapter>
    <chapter>
        <string class="l1">
            <local xml:lang="fr">Some English translated to French here</local>
        </string>
        <string class="p">
            <local xml:lang="fr">Some other English translated to French here</local>
        </string>
        <string class="p">
            <local xml:lang="fr">and some English translated to French here</local>
        </string>
        <string class="p">
            <local xml:lang="fr">Some English translated to French here</local>
        </string>
    </chapter>
    <chapter>
        <string class="l1">
            <local xml:lang="de">Some English translated to German here</local>
        </string>
        <string class="p">
            <local xml:lang="de">Some other English translated to German here</local>
        </string>
        <string class="another_class">
            <local xml:lang="de">and some English translated to German here</local>
        </string>
        <string class="p">
            <local xml:lang="de">Some English translated to German here</local>
        </string>
    </chapter>
    <chapter>
        <string class="l1">
            <local xml:lang="nl">Some English translated to Dutch here</local>
        </string>
        <string class="p">
            <local xml:lang="nl">Some other English translated to Dutch here</local>
        </string>
        <string class="p">
            <local xml:lang="nl">and some English translated to Dutch here
                <br/>Some English translated to Dutch here
            </local>
        </string>
    </chapter>
</root>

produces the wanted, correct result:

<root>
   <chapter>
      <string class="l1">
         <local xml:lang="en">Some English here</local>
         <local xml:lang="fr">Some English translated to French here</local>
      </string>
      <string class="p">
         <local xml:lang="en">Some other English here</local>
         <local xml:lang="fr">Some other English translated to French here</local>
      </string>
      <string class="p">
         <local xml:lang="en">and some English here</local>
         <local xml:lang="fr">and some English translated to French here</local>
      </string>
      <string class="p">
         <local xml:lang="en">Some English here</local>
         <local xml:lang="fr">Some English translated to French here</local>
      </string>
   </chapter>
   <chapter>
            <string class="l1">
                  <local xml:lang="de">Some English translated to German here</local>
            </string>
            <string class="p">
                  <local xml:lang="de">Some other English translated to German here</local>
            </string>
            <string class="another_class">
                  <local xml:lang="de">and some English translated to German here</local>
            </string>
            <string class="p">
                  <local xml:lang="de">Some English translated to German here</local>
            </string>
      </chapter>
   <chapter>
            <string class="l1">
                  <local xml:lang="nl">Some English translated to Dutch here</local>
            </string>
            <string class="p">
                  <local xml:lang="nl">Some other English translated to Dutch here</local>
            </string>
            <string class="p">
                  <local xml:lang="nl">and some English translated to Dutch here
                <br/>Some English translated to Dutch here
            </local>
            </string>
      </chapter>
</root>

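Explanation:

  1. We define and use the "signature" of a chapter: the sequence of the class attribute values of its children.

  2. We group all chapter elements by whether their signature equals the "English signature".

  3. We merge the chapter elements in the group whose signature equals the "English signature".

  4. We copy the chapter elements in the other groups unchanged.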
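
The signature idea from the steps above can also be used on its own to list which chapters need manual review before any merging is done. The following is a minimal sketch, not part of the transformation above: the `my:signature` function and the `urn:example:my` namespace are hypothetical names, and it assumes, as the transformation does, that the first chapter is the English one.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:my"
    exclude-result-prefixes="xs my">

  <xsl:output method="text"/>

  <!-- Hypothetical helper: the class values of a chapter's children joined with '+' -->
  <xsl:function name="my:signature" as="xs:string">
    <xsl:param name="chapter" as="element(chapter)"/>
    <xsl:sequence select="string-join($chapter/*/@class, '+')"/>
  </xsl:function>

  <xsl:template match="/">
    <!-- Assumption: the first chapter is the English master -->
    <xsl:variable name="en" select="my:signature(/*/chapter[1])"/>
    <!-- One report line per translated chapter: language code, then the verdict -->
    <xsl:for-each select="/*/chapter[position() gt 1]">
      <xsl:value-of
        select="(*/*/@xml:lang)[1], ': ',
                if (my:signature(.) eq $en) then 'merge' else 'review'"
        separator=""/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For the XML document above this should print `fr: merge`, `de: review` and `nl: review`, one per line.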

answered 2012-06-21T12:12:19.657