I have an XML file that lists 92 tab-delimited text files:
<?xml version="1.0" encoding="UTF-8"?>
<dumpSet>
<dump filename="file_one.txt"/>
<dump filename="file_two.txt"/>
<dump filename="file_three.txt"/>
...
</dumpSet>
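The first row of each file contains the field names for the rows that follow. This is just an example: the names and number of fields vary from file to file, and most files have about 50 field names.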
Title Translated Title Watch Video Interviewee Interviewer
Interview with Barack Obama Obama, Barack Walters, Barbara
Interview with Sarah Palin Palin, Sarah Couric, Katie Smith, John
...
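Oxygen XML Editor has an Import feature that converts text files to XML but, as far as I know, it cannot be run as a batch process over multiple files. So far the batch part has not been a problem: I am using the XSLT 2.0 unparsed-text() function to pull in the content of each file in the list. However, I am struggling to group the XML output correctly. Example of the desired output: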
<collection>
<record>
<title>Interview with Barack Obama</title>
<translatedtitle></translatedtitle>
<watchvideo></watchvideo>
<interviewee>Obama, Barack</interviewee>
<interviewer>Walters, Barbara</interviewer>
<videographer>Smith, John</videographer>
</record>
<record>
<title>Interview with Sarah Palin</title>
<translatedtitle></translatedtitle>
<watchvideo></watchvideo>
<interviewee>Palin, Sarah</interviewee>
<interviewer>Couric, Katie</interviewer>
<videographer>Smith, John</videographer>
</record>
...
</collection>
Instead, this is the output I am getting now:
<collection>
<record>
<title>title</title>
<value>Interview with Barack Obama</value>
<value>Interview with Sarah Palin</value>
<translatedtitle>translatedtitle</translatedtitle>
<value/>
<value/>
<watchvideo>watchvideo</watchvideo>
<value/>
<value/>
<interviewee>interviewee</interviewee>
<value>Obama, Barack</value>
<value>Palin, Sarah</value>
<interviewer>interviewer</interviewer>
<value>Walters, Barbara</value>
<value>Couric, Katie</value>
<videographer>videographer</videographer>
<value>Smith, John</value>
<value>Smith, John </value>
<value/>
<value/>
</record>
</collection>
In other words, I cannot get the output grouped by record. This is the code I am currently using, based on an example from Doug Tidwell's XSLT book:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="#all" version="2.0">
<xsl:param name="i" select="1"/>
<xsl:param name="increment" select="1"/>
<xsl:param name="operator" select="'&lt;='"/>
<xsl:param name="testVal" select="100"/>
<xsl:template match="/">
<collections>
<collection>
<xsl:for-each select="dumpSet/dump">
<!-- Pull in external tab-delimited files -->
<xsl:for-each select="unparsed-text(concat('../2013-04-26/',@filename),'UTF-8')">
<record>
<!-- Call recursive template to loop through elements. -->
<xsl:call-template name="for-loop">
<xsl:with-param name="i" select="$i"/>
<xsl:with-param name="increment" select="$increment"/>
<xsl:with-param name="operator" select="$operator"/>
<xsl:with-param name="testVal" select="$testVal"/>
</xsl:call-template>
</record>
</xsl:for-each>
</xsl:for-each>
</collection>
</collections>
</xsl:template>
<xsl:template name="for-loop">
<xsl:param name="i"/>
<xsl:param name="increment"/>
<xsl:param name="operator"/>
<xsl:param name="testVal"/>
<xsl:variable name="testPassed">
<xsl:choose>
<xsl:when test="$operator = '&lt;='">
<xsl:if test="$i &lt;= $testVal">
<xsl:text>true</xsl:text>
</xsl:if>
</xsl:when>
</xsl:choose>
</xsl:variable>
<xsl:if test="$testPassed = 'true'">
<!-- Separate the header from the tab-delimited file. -->
<xsl:for-each select="tokenize(.,'\r|\n')[1]">
<!-- Spit out the field names. -->
<xsl:for-each select="tokenize(.,'\t')[$i]">
<xsl:element name="{replace(lower-case(translate(.,'-.','')),' ','')}">
<xsl:value-of select="replace(lower-case(translate(.,'-.','')),' ','')"/>
</xsl:element>
</xsl:for-each>
</xsl:for-each>
<!-- For the following rows, loop through the field values. -->
<xsl:for-each select="tokenize(.,'\r|\n')[position()>1]">
<xsl:for-each select="tokenize(.,'\t')[$i]">
<value>
<xsl:value-of select="."/>
</value>
</xsl:for-each>
</xsl:for-each>
<!-- Call the template to increment. -->
<xsl:call-template name="for-loop">
<xsl:with-param name="i" select="$i + $increment"/>
<xsl:with-param name="increment" select="$increment"/>
<xsl:with-param name="operator" select="$operator"/>
<xsl:with-param name="testVal" select="$testVal"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
How should I change this so that the output is grouped by record?
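For reference, here is a minimal sketch of the row-first iteration I think I need. It is untested against the real dumps, so treat it as a sketch under assumptions. The current code walks column by column (one recursive call per field index), which is why all values for a field end up together. The sketch instead loops over data rows on the outside and over header names on the inside. The variable names `lines`, `names`, `values`, and `pos` are illustrative only. It assumes every normalized header is a valid XML element name and that blank lines can be dropped.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="#all" version="2.0">
  <xsl:template match="/">
    <collection>
      <xsl:for-each select="dumpSet/dump">
        <!-- Split the external file into lines and drop lines that are blank or whitespace-only. -->
        <xsl:variable name="lines" as="xs:string*"
            select="tokenize(unparsed-text(concat('../2013-04-26/', @filename), 'UTF-8'), '\r?\n')[normalize-space()]"/>
        <!-- Build element names from the header row, using the same normalization as the original code.
             Assumption: each result is a valid element name (non-empty, does not start with a digit). -->
        <xsl:variable name="names" as="xs:string*"
            select="for $h in tokenize($lines[1], '\t')
                    return replace(lower-case(translate($h, '-.', '')), ' ', '')"/>
        <!-- Outer loop: one record per data row. -->
        <xsl:for-each select="$lines[position() gt 1]">
          <xsl:variable name="values" as="xs:string*" select="tokenize(., '\t')"/>
          <record>
            <!-- Inner loop: one element per header name, paired with the value at the same position. -->
            <xsl:for-each select="$names">
              <xsl:variable name="pos" select="position()"/>
              <xsl:element name="{.}">
                <!-- A row with fewer fields than the header yields an empty element here. -->
                <xsl:value-of select="$values[$pos]"/>
              </xsl:element>
            </xsl:for-each>
          </record>
        </xsl:for-each>
      </xsl:for-each>
    </collection>
  </xsl:template>
</xsl:stylesheet>
```

This drops the recursive for-loop template and its `i`, `increment`, `operator`, and `testVal` parameters, since iterating over the tokenized header already covers every column regardless of how many fields a file has.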