I am working with Dublin Core encoded metadata files and want to convert them to CSV. I am trying to get the output below:

identifier1|||identifier2|||identifier3|||identifier4,datestamp1|||datestamp2|||2010-04-27T01:10:31Z,setspec1,title1|||title2,subject1|||subject2,baseURL|||xxxxx|||xxxxx

Note that repeatable elements are separated by three pipe characters (|||), while elements are separated by commas (,).

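I have managed to come up with the stylesheet below, but I am struggling with the following problems:

(1) How do I define a generic template that lets me separate the nodes with commas?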

<xsl:template match="GENERIC MATCH">
  <xsl:apply-templates select="current()" />
  <xsl:if test="position() = last()">,</xsl:if>
</xsl:template>

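Taking the input file below as an example, I basically want GENERIC MATCH to let me dynamically process the level 2 nodes (header, metadata and about) and separate the results with commas.

(2) How do I determine whether an element is the last child node, so that I can conditionally add a comma after it?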

<xsl:output method="text" omit-xml-declaration="yes"/>

<xsl:template match="/">
  <xsl:apply-templates select="record" />
</xsl:template>
<xsl:template match="record">
  <xsl:apply-templates select="//metadata/oai_dc:dc/dc:title|//metadata/oai_dc:dc/dc:subject" />
  <xsl:if test="not(metadata/oai_dc:dc/node()/position()=last())">#####</xsl:if>
</xsl:template>

<xsl:template match="dc:title">
  <xsl:value-of select="." />
  <xsl:if test="not(position() = last())">||</xsl:if>
</xsl:template>

<xsl:template match="dc:subject">
  <xsl:value-of select="." />
  <xsl:if test="not(position() = last())">||</xsl:if>
</xsl:template>

Input file

<?xml version="1.0"?>
<record>
  <header>
    <identifier>identifier1</identifier>
    <datestamp>datastamp1</datestamp>
    <setSpec>setspec1</setSpec>
  </header>
  <metadata>
    <oai_dc:dc>
      <dc:title>title1</dc:title>
      <dc:title>title2</dc:title>
      <dc:creator>creator1</dc:creator>
      <dc:subject>subject1</dc:subject>
      <dc:subject>subject2</dc:subject>
    </oai_dc:dc>
  </metadata>
  <about>
    <provenance>
      <originDescription altered="false" harvestDate="2011-08-11T03:47:51Z">
        <baseURL>baseURL1</baseURL>
        <identifier>identifier3</identifier>
        <datestamp>datestamp2</datestamp>
        <metadataNamespace>xxxxx</metadataNamespace>
        <originDescription altered="false" harvestDate="2010-10-10T06:15:53Z">
          <baseURL>xxxxx</baseURL>
          <identifier>identifier4</identifier>
          <datestamp>2010-04-27T01:10:31Z</datestamp>
          <metadataNamespace>xxxxx</metadataNamespace>
        </originDescription>
      </originDescription>
    </provenance>
  </about>
</record>

I am using XSLT 1.0 with xsltproc.

1 Answer

How about this:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" indent="yes" omit-xml-declaration="yes"/>
  <!-- A key on all leaf nodes -->
  <!-- *[not(*)] matches any element that is a leaf node
       i.e. it has no child elements. Here, the elements' names are being
       used as the key value. -->
  <xsl:key name="kNodeType" match="*[not(*)]" use="local-name()"/>

  <xsl:template match="/">
    <!-- Use Muenchian grouping to apply the "group" template to the first of
         each leaf node with a distinct name. -->
    <xsl:apply-templates
      select="//*[not(*)][generate-id() = 
                          generate-id(key('kNodeType', local-name())[1])]"
      mode="group" />
  </xsl:template>

  <!-- This template will be used to match only the first item in each group,
       due to the grouping expression used in the previous template. -->
  <xsl:template match="*" mode="group">
    <!-- Skip the comma for the first group, output it for all others -->
    <xsl:if test="position() > 1">,</xsl:if>
    <!-- Apply the "item" template to all items in the same group as this element
         (i.e. those with the same name) -->
    <xsl:apply-templates select="key('kNodeType', local-name())" mode="item" />
  </xsl:template>

  <xsl:template match="*" mode="item">
    <!-- Skip the delimiter for the first item in each group;
         output it for all others -->
    <xsl:if test="position() > 1">|||</xsl:if>
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>

When run on your sample input, this produces:

identifier1|||identifier3|||identifier4,datastamp1|||datestamp2|||2010-04-27T01:10:31Z,setspec1,title1|||title2,creator1,subject1|||subject2,baseURL1|||xxxxx,xxxxx|||xxxxx
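
On question (2): inside a template, position() and last() refer to the node list selected by the xsl:apply-templates that invoked the template, not to the element's place among its siblings. If you really need to know whether an element is the last child of its parent, one option is to test the following-sibling axis instead. A minimal sketch, assuming you only want the children of header from your sample input (the header/* match pattern is just for illustration, not something your real data requires):

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Illustration only: process just the element children of record/header -->
  <xsl:template match="/">
    <xsl:apply-templates select="record/header/*"/>
  </xsl:template>

  <xsl:template match="header/*">
    <xsl:value-of select="."/>
    <!-- following-sibling::* is empty for the last element child of a parent,
         so the comma is emitted after every element except the last one -->
    <xsl:if test="following-sibling::*">,</xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

For the header of your sample input, I would expect this to give identifier1,datastamp1,setspec1. The sibling test does not depend on which nodes were selected, whereas position() = last() only works when the selected node list happens to be exactly the set of siblings.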

Answered 2013-01-30T14:55:50.107