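I was given a huge XML file containing a list of TV broadcasts, and I have to split it into small files that each contain all the broadcasts for a single day. I managed to do that, but there are two problems: the XML header and one node appear multiple times.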

The structure of the XML is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<broadcasts>
    <broadcast>
    <id>4637445812</id>
    <week>39</week>
    <date>2009-09-22</date>
    <time>21:45:00:00</time>
        ... (some more)
    </broadcast>
    ... (long list of broadcast nodes)
</broadcasts>

My XSL looks like this:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:redirect="http://xml.apache.org/xalan/redirect"
        extension-element-prefixes="redirect"
        version="1.0">
    <!-- mark the CDATA escaped tags -->
    <xsl:output method="xml" cdata-section-elements="title text"
        indent="yes" omit-xml-declaration="no" />

    <xsl:template match="broadcasts">
        <xsl:apply-templates />
    </xsl:template>

    <xsl:template match="broadcast">
    <!-- Build filename PRG_YYYYMMDD.xml -->
    <xsl:variable name="filename" select="concat(substring(date,1,4),substring(date,6,2))"/>
    <xsl:variable name="filename" select="concat($filename,substring(date,9,2))" />
    <xsl:variable name="filename" select="concat($filename,'.xml')" />
    <redirect:write select="concat('PRG_',$filename)" append="true">    

        <schedule>  
        <broadcast program="TEST">
            <!-- format timestamp in specific way -->
            <xsl:variable name="tmstmp" select="concat(substring(date,9,2),'/')"/>
            <xsl:variable name="tmstmp" select="concat($tmstmp,substring(date,6,2))"/>
            <xsl:variable name="tmstmp" select="concat($tmstmp,'/')"/>
            <xsl:variable name="tmstmp" select="concat($tmstmp,substring(date,1,4))"/>
            <xsl:variable name="tmstmp" select="concat($tmstmp,' ')"/>
            <xsl:variable name="tmstmp" select="concat($tmstmp,substring(time,1,5))"/>

            <timestamp><xsl:value-of select="$tmstmp"/></timestamp>
            <xsl:copy-of select="title"/>
            <text><xsl:value-of select="subtitle"/></text>

            <xsl:variable name="newVps" select="concat(substring(vps,1,2),substring(vps,4,2))"/>
            <xsl:variable name="newVps" select="concat($newVps,substring(vps,7,2))"/>
            <xsl:variable name="newVps" select="concat($newVps,substring(vps,10,2))"/>
            <vps><xsl:value-of select="$newVps"/></vps>                    
            <nextday>false</nextday>               
        </broadcast>      
        </schedule>
    </redirect:write>
    </xsl:template> 
</xsl:stylesheet>

My output XML looks like this:

PRG_20090512.xml:

<?xml version="1.0" encoding="UTF-8"?>
  <schedule>
    <broadcast program="TEST">
      <timestamp>01/03/2010 06:00</timestamp>
      <title><![CDATA[TELEKOLLEG  Geschichte ]]></title>
      <text><![CDATA[Giganten in Fernost]]></text>
      <vps>06000000</vps>
      <nextday>false</nextday>
    </broadcast>
  </schedule>
<?xml version="1.0" encoding="UTF-8"?> <!-- don't want this -->
  <schedule>  <!-- don't want this -->
    <broadcast program="TEST">
      <timestamp>01/03/2010 06:30</timestamp>
      <title><![CDATA[Die chemische Bindung]]></title>
      <text/>
      <vps>06300000</vps>
      <nextday>false</nextday>
    </broadcast>
  </schedule>
<?xml version="1.0" encoding="UTF-8"?>
...and so on

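I could set omit-xml-declaration="yes" in the output declaration, but then I don't get any XML header at all. I tried to check whether the tag is already in the output, but I failed to select nodes in the output...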

Here is what I tried:

<xsl:choose>
  <xsl:when test="count(schedule) = 0"> <!-- schedule needed -->   
    <schedule>
      <broadcast>
    ...
  <xsl:otherwise> <!-- no schedule needed -->
    <broadcast>
    ...

Thanks for your help, as I don't know how to handle this. ;( 雪人

2 Answers

Write each file once, containing all the broadcasts for that date.

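This turns it into a problem of grouping the input elements by date. Since Xalan is an XSLT 1.0 processor, you can do this using keys (the technique known as Muenchian grouping).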

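We define a key that groups broadcasts by date. We then select every broadcast that is the first broadcast for its date, and use the key() function to select all the broadcasts with that same date.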

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:redirect="http://xml.apache.org/xalan/redirect"
                extension-element-prefixes="redirect"
                version="1.0">

    <!-- mark the CDATA escaped tags --> 
    <xsl:output method="xml" cdata-section-elements="title text" indent="yes" omit-xml-declaration="no" />

    <xsl:key name="date" match="broadcast" use="date" />

    <xsl:template match="broadcasts">
        <xsl:apply-templates select="broadcast[generate-id(.)=generate-id(key('date',date)[1])]"/>
    </xsl:template>

    <xsl:template match="broadcast">
        <!-- Build filename PRG_YYYYMMDD.xml -->
        <!-- a single binding: XSLT 1.0 does not allow one variable to shadow another inside the same template -->
        <xsl:variable name="filename" select="concat(substring(date,1,4),substring(date,6,2),substring(date,9,2),'.xml')"/>

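        <!-- each file is written once per run, so append is not needed; with append="true" a rerun would add to files left over from earlier runs -->
        <redirect:write select="concat('PRG_',$filename)">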

            <schedule>
                <xsl:apply-templates select="key('date',date)" mode="broadcast" />
            </schedule>

        </redirect:write>

    </xsl:template>

    <xsl:template match="broadcast" mode="broadcast">
        <broadcast program="TEST">
            <!-- format timestamp in specific way -->
            <!-- DD/MM/YYYY HH:MM, built in one concat() call instead of rebinding the variable -->
            <xsl:variable name="tmstmp" select="concat(substring(date,9,2),'/',substring(date,6,2),'/',substring(date,1,4),' ',substring(time,1,5))"/>

            <timestamp><xsl:value-of select="$tmstmp"/></timestamp>
            <xsl:copy-of select="title"/>
            <text><xsl:value-of select="subtitle"/></text>

            <!-- keep the four two-character fields of vps and drop the separators between them -->
            <xsl:variable name="newVps" select="concat(substring(vps,1,2),substring(vps,4,2),substring(vps,7,2),substring(vps,10,2))"/>
            <vps><xsl:value-of select="$newVps"/></vps>                                     
            <nextday>false</nextday>                             
        </broadcast>
    </xsl:template>

</xsl:stylesheet>
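
If an XSLT 2.0 processor is an option (an assumption on my part; Xalan itself only supports 1.0, so this would mean switching to something like Saxon), the same grouping can be sketched with the standard xsl:for-each-group and xsl:result-document instructions instead of keys and the redirect extension. This is only a sketch under that assumption, reusing the element names from the question:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

    <!-- xsl:result-document uses this unnamed output definition by default -->
    <xsl:output method="xml" cdata-section-elements="title text" indent="yes"/>

    <xsl:template match="/broadcasts">
        <!-- one group, and therefore one output file, per distinct date -->
        <xsl:for-each-group select="broadcast" group-by="date">
            <!-- PRG_YYYYMMDD.xml: the grouping key is the date, with the dashes removed -->
            <xsl:result-document href="PRG_{translate(current-grouping-key(), '-', '')}.xml">
                <schedule>
                    <xsl:apply-templates select="current-group()" mode="broadcast"/>
                </schedule>
            </xsl:result-document>
        </xsl:for-each-group>
    </xsl:template>

    <xsl:template match="broadcast" mode="broadcast">
        <broadcast program="TEST">
            <!-- DD/MM/YYYY HH:MM -->
            <timestamp>
                <xsl:value-of select="concat(substring(date,9,2),'/',substring(date,6,2),'/',substring(date,1,4),' ',substring(time,1,5))"/>
            </timestamp>
            <xsl:copy-of select="title"/>
            <text><xsl:value-of select="subtitle"/></text>
            <!-- four two-character fields of vps, separators dropped -->
            <vps>
                <xsl:value-of select="concat(substring(vps,1,2),substring(vps,4,2),substring(vps,7,2),substring(vps,10,2))"/>
            </vps>
            <nextday>false</nextday>
        </broadcast>
    </xsl:template>

</xsl:stylesheet>
```

Because every file is opened and closed exactly once here, each one gets a single XML declaration and a single schedule element, which is the same effect the key-based version aims for.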
answered 2010-03-11T11:14:41.237

Wrap your schedule elements in a single parent element and see whether that makes the problem go away.

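I'm not familiar with this particular problem, but I'd guess it is caused by your attempt to generate an XML document with multiple top-level elements. Every XML document must have exactly one top-level element (a silly requirement if you ask me, since it makes XML unsuitable for log files, for example, but that's how it is).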
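
For illustration, a file with one declaration and one top-level element would look roughly like this. The element names and values are taken from the sample output in the question; treating schedule as the single root is my reading of the "don't want this" comments there, not something I have tested:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- exactly one top-level element; every broadcast of the day goes inside it -->
<schedule>
  <broadcast program="TEST">
    <timestamp>01/03/2010 06:00</timestamp>
    <title><![CDATA[TELEKOLLEG  Geschichte ]]></title>
    <text><![CDATA[Giganten in Fernost]]></text>
    <vps>06000000</vps>
    <nextday>false</nextday>
  </broadcast>
  <broadcast program="TEST">
    <timestamp>01/03/2010 06:30</timestamp>
    <title><![CDATA[Die chemische Bindung]]></title>
    <text/>
    <vps>06300000</vps>
    <nextday>false</nextday>
  </broadcast>
</schedule>
```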

于 2010-03-11T10:46:34.083 回答