I need to transpose an XML file such as:
<?xml version="1.0" encoding="UTF-8"?>
<products>
<product id="10" name="Widget 1" price="15.99" category_code="T" category="Toys" manufacturer_code="SC1" manufacturer="Some Company 1" />
<product id="10" name="Widget 1" price="15.99" category_code="E" category="Electronics" manufacturer_code="SC1" manufacturer="Some Company 1" />
<product id="10" name="Widget 1" price="15.99" category_code="T" category="Toys" manufacturer_code="SC2" manufacturer="Some Company 2" />
<product id="10" name="Widget 1" price="15.99" category_code="E" category="Electronics" manufacturer_code="SC2" manufacturer="Some Company 2" />
<product id="10" name="Widget 1" price="15.99" category_code="T" category="Toys" manufacturer_code="SC3" manufacturer="Some Company 3" />
<product id="10" name="Widget 1" price="15.99" category_code="E" category="Electronics" manufacturer_code="SC3" manufacturer="Some Company 3" />
<product id="11" name="Widget 2" price="21.99" category_code="V" category="Video Games" manufacturer_code="SC4" manufacturer="Some Company 4" />
<product id="12" name="Widget 3" price="10.99" category_code="T" category="Toys" manufacturer_code="SC1" manufacturer="Some Company 2" />
</products>
into a comma-separated text file, or a well-formed HTML table, that contains only one row per product, like this:
id,name,price,category_code_1,category_1,category_code_2,category_2,manufacturer_code_1,manufacturer_1,manufacturer_code_2,manufacturer_2,manufacturer_code_3,manufacturer_3
10, Widget 1, 15.99, T, Toys, E, Electronics, SC1, Some Company 1, SC2, Some Company 2, SC3, Some Company 3
11, Widget 2, 21.99, V, Video Games,,, SC4, Some Company 4,,,,
12, Widget 3, 10.99, T, Toys,,, SC1, Some Company 2,,,,
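As you may have noticed, the XML data can be thought of as the result of joining three tables: product, product_category, and product_manufacturer. Each product can belong to multiple categories and have multiple manufacturers. Of course, the real data I am dealing with is more complex and comes from a completely different domain, but this sample describes the problem accurately.
My knowledge of XSLT is very limited. With the help of SO and other resources on the Internet, I have put together a stylesheet that gives me part of what I need: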
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:key name="key_product_group" match="product" use="@id"/>
<xsl:key name="key_category_group" match="product" use="concat(
@id,
@category_code,
@category)"/>
<xsl:key name="key_manufacturer_group" match="product" use="concat(
@id,
@manufacturer_code,
@manufacturer)"/>
<xsl:variable name="var_max_category_group" >
<xsl:for-each select="//product[generate-id(.) = generate-id(key('key_product_group',@id) )]">
<xsl:sort select="count(key('key_product_group',@id )[generate-id() = generate-id(key('key_category_group',
concat(@id,
@category_code,
@category)))])" data-type="number" order="descending"/>
<xsl:if test="position() = 1">
<xsl:value-of select="count(key('key_product_group',@id )[generate-id() = generate-id(key('key_category_group',
concat(@id,
@category_code,
@category)))])"/>
</xsl:if>
</xsl:for-each>
</xsl:variable>
<xsl:variable name="var_max_manufacturer_group">
<xsl:for-each select="//product[generate-id(.) = generate-id(key('key_product_group',@id) )]">
<xsl:sort select="count(key('key_product_group',@id )[generate-id() = generate-id(key('key_manufacturer_group',
concat(@id,
@manufacturer_code,
@manufacturer)))])" data-type="number" order="descending"/>
<xsl:if test="position() = 1">
<xsl:value-of select="count(key('key_product_group',@id )[generate-id() = generate-id(key('key_manufacturer_group',
concat(@id,
@manufacturer_code,
@manufacturer)))])"/>
</xsl:if>
</xsl:for-each>
</xsl:variable>
<xsl:template match="/">
<xsl:text>id,</xsl:text>
<xsl:text>name,</xsl:text>
<xsl:text>price,</xsl:text>
<xsl:call-template name="loop_pcat">
<xsl:with-param name="count" select="$var_max_category_group"/>
</xsl:call-template>
<xsl:call-template name="loop_pmf">
<xsl:with-param name="count" select="$var_max_manufacturer_group"/>
</xsl:call-template>
<br></br>
<xsl:variable name="var_group"
select="//product[generate-id(.) = generate-id(key('key_product_group',@id)[1])]"/>
<xsl:for-each select="$var_group">
<xsl:sort order="ascending" select="@id"/>
<xsl:value-of select="@id"/>,
<xsl:value-of select="@name"/>,
<xsl:value-of select="@price"/>,
<xsl:for-each select="key('key_product_group',@id )[generate-id() = generate-id(key('key_category_group',
concat(@id,
@category_code,
@category)))]">
<xsl:value-of select="@category_code"/>,
<xsl:value-of select="@category"/>,
</xsl:for-each>
<xsl:for-each select="key('key_product_group',@id )[generate-id() = generate-id(key('key_manufacturer_group',
concat(@id,
@manufacturer_code,
@manufacturer)))]">
<xsl:value-of select="@manufacturer_code"/>,
<xsl:value-of select="@manufacturer"/>,
</xsl:for-each>
<br></br>
</xsl:for-each>
</xsl:template>
<xsl:template name="loop_pcat">
<xsl:param name="count" select="1"/>
<xsl:param name="limit" select="$count+1"/>
<xsl:if test="$count > 0">
<xsl:value-of select="concat('category_code_',$limit - $count)"/>
<xsl:text>,</xsl:text>
<xsl:value-of select="concat('category_',$limit - $count)"/>
<xsl:text>,</xsl:text>
<xsl:call-template name="loop_pcat">
<xsl:with-param name="count" select="$count - 1"/>
<xsl:with-param name="limit" select="$limit"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
<xsl:template name="loop_pmf">
<xsl:param name="count" select="1"/>
<xsl:param name="limit" select="$count+1"/>
<xsl:if test="$count > 0">
<xsl:value-of select="concat('manufacturer_code_',$limit - $count)"/>
<xsl:text>,</xsl:text>
<xsl:value-of select="concat('manufacturer_',$limit - $count)"/>
<xsl:text>,</xsl:text>
<xsl:call-template name="loop_pmf">
<xsl:with-param name="count" select="$count - 1"/>
<xsl:with-param name="limit" select="$limit"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
The stylesheet above produces the following result:
id,name,price,category_code_1,category_1,category_code_2,category_2,manufacturer_code_1,manufacturer_1,manufacturer_code_2,manufacturer_2,manufacturer_code_3,manufacturer_3,
10, Widget 1, 15.99, T, Toys, E, Electronics, SC1, Some Company 1, SC2, Some Company 2, SC3, Some Company 3,
11, Widget 2, 21.99, V, Video Games, SC4, Some Company 4,
12, Widget 3, 10.99, T, Toys, SC1, Some Company 2,
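The output has at least one major problem: not every column is present in every row. For example, rows 2 and 3 are missing category_code_2 and category_2, as well as manufacturer_code and manufacturer 2 and 3. I am sure the stylesheet has other problems as well, and I do not know how it will perform on a relatively large XML file, but for now I would really appreciate your help in making the stylesheet produce the desired output format.
Thanks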
MRSA
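One way the missing-column problem described above might be approached is to pad each short row with empty fields. The following is only a minimal XSLT 1.0 sketch under stated assumptions, not a tested solution for the real data. It assumes the two column maxima are supplied as the stylesheet parameters `max_cat` and `max_mf` (hypothetical names, with defaults that fit the sample input), groups categories and manufacturers by their codes alone, writes plain text rather than HTML, and omits the header row. The key names `by_id`, `by_cat`, `by_mf` and the helper template `pad_pairs` are likewise invented for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Assumption: the column maxima are passed in; the defaults fit the sample data.
       In the stylesheet above they would be $var_max_category_group
       and $var_max_manufacturer_group. -->
  <xsl:param name="max_cat" select="2"/>
  <xsl:param name="max_mf" select="3"/>

  <xsl:key name="by_id" match="product" use="@id"/>
  <!-- The '|' separator keeps the concatenated key parts from running together. -->
  <xsl:key name="by_cat" match="product" use="concat(@id, '|', @category_code)"/>
  <xsl:key name="by_mf" match="product" use="concat(@id, '|', @manufacturer_code)"/>

  <xsl:template match="/">
    <!-- One output row per distinct product id (Muenchian grouping). -->
    <xsl:for-each select="products/product[generate-id() = generate-id(key('by_id', @id)[1])]">
      <!-- Distinct categories and manufacturers of the current product. -->
      <xsl:variable name="cats"
        select="key('by_id', @id)[generate-id() = generate-id(key('by_cat', concat(@id, '|', @category_code))[1])]"/>
      <xsl:variable name="mfs"
        select="key('by_id', @id)[generate-id() = generate-id(key('by_mf', concat(@id, '|', @manufacturer_code))[1])]"/>

      <xsl:value-of select="concat(@id, ',', @name, ',', @price)"/>

      <xsl:for-each select="$cats">
        <xsl:value-of select="concat(',', @category_code, ',', @category)"/>
      </xsl:for-each>
      <!-- Fill the unused category columns with empty fields. -->
      <xsl:call-template name="pad_pairs">
        <xsl:with-param name="n" select="$max_cat - count($cats)"/>
      </xsl:call-template>

      <xsl:for-each select="$mfs">
        <xsl:value-of select="concat(',', @manufacturer_code, ',', @manufacturer)"/>
      </xsl:for-each>
      <!-- Fill the unused manufacturer columns with empty fields. -->
      <xsl:call-template name="pad_pairs">
        <xsl:with-param name="n" select="$max_mf - count($mfs)"/>
      </xsl:call-template>

      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- Hypothetical helper: writes $n empty code/name pairs, two commas per pair. -->
  <xsl:template name="pad_pairs">
    <xsl:param name="n" select="0"/>
    <xsl:if test="$n &gt; 0">
      <xsl:text>,,</xsl:text>
      <xsl:call-template name="pad_pairs">
        <xsl:with-param name="n" select="$n - 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Because every value after the id is written with a leading comma, no row ends with a stray separator, and each row should carry the same 13 fields as the desired header. The padding is kept in a separate recursive template because XSLT 1.0 has no loop counter; the same pattern is already used by `loop_pcat` and `loop_pmf` for the header columns.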