The Input I Have:
I'm working with a SharePoint list that produces RSS feeds in the following form:
<?xml version="1.0"?>
<rss>
<channel>
<!-- Irrelevant Fields -->
<item>
<title type="text">Title</title>
<description type="html">
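&lt;div&gt;&lt;b&gt;Field1:&lt;/b&gt; Value 1&lt;/div&gt;
&lt;div&gt;&lt;b&gt;Field2:&lt;/b&gt; Value 2&lt;/div&gt;
&lt;div&gt;&lt;b&gt;Field3:&lt;/b&gt; Value 3&lt;/div&gt;
&lt;div&gt;&lt;b&gt;Field4:&lt;/b&gt; Value 4&lt;/div&gt;
&lt;div&gt;&lt;b&gt;Field5:&lt;/b&gt; Value 5&lt;/div&gt;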
</description>
</item>
<item>
<title type="text">Title</title>
<description type="html">
&lt;div&gt;&lt;b&gt;Field1:&lt;/b&gt; Value 1&lt;/div&gt;
&lt;div&gt;&lt;b&gt;Field3:&lt;/b&gt; Value 3&lt;/div&gt;
&lt;div&gt;&lt;b&gt;Field4:&lt;/b&gt; Value 4&lt;/div&gt;
&lt;div&gt;&lt;b&gt;Field5:&lt;/b&gt; Value 5&lt;/div&gt;
</description>
</item>
<item>
<title type="text">Title</title>
<description type="html">
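&lt;div&gt;&lt;b&gt;Field1:&lt;/b&gt; Value 1&lt;/div&gt;
&lt;div&gt;&lt;b&gt;Field2:&lt;/b&gt; Value 2&lt;/div&gt;
&lt;div&gt;&lt;b&gt;Field3:&lt;/b&gt; Value 3&lt;/div&gt;
&lt;div&gt;&lt;b&gt;Field4:&lt;/b&gt; Value 4&lt;/div&gt;
&lt;div&gt;&lt;b&gt;Field5:&lt;/b&gt; Value 5&lt;/div&gt;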
</description>
</item>
<!-- More <item> elements -->
</channel>
</rss>
Note that the <description> element seems to define a set of elements. Furthermore, note that not all <description> elements contain markup for "Field2".
What I Need:
I need XML of the following form:
<?xml version="1.0"?>
<Events>
<Event>
<Category>Title</Category>
<Field1>Value 1</Field1>
<Field2>Value 2</Field2>
<Field3>Value 3</Field3>
<Field4>Value 4</Field4>
<Field5>Value 5</Field5>
</Event>
<Event>
<Category>Title</Category>
<Field1>Value 1</Field1>
<Field2/>
<Field3>Value 3</Field3>
<Field4>Value 4</Field4>
<Field5>Value 5</Field5>
</Event>
<Event>
<Category>Title</Category>
<Field1>Value 1</Field1>
<Field2>Value 2</Field2>
<Field3>Value 3</Field3>
<Field4>Value 4</Field4>
<Field5>Value 5</Field5>
</Event>
</Events>
The Rules (updated):
- This needs to be an XSLT 1.0 solution.
- xxx:node-set is the only extension function available to me; no others may be used, including extension functions written in other languages such as C# or JavaScript.
- If any field's information is missing, a blank element should be output. Note the empty <Field2> child within the second <Event> element in my desired output.
- We cannot assume that the field names themselves will follow any particular pattern; they may as well be <PeanutButter>, <Jelly>, etc.
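Because the first rule leaves the prefix of xxx:node-set open, a stylesheet can probe for whichever variant the processor actually provides. The following is a minimal sketch, assuming the function in question is either EXSLT's exsl:node-set or Microsoft's msxsl:node-set; the vFields variable and the <Fields> wrapper are made up purely for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    exclude-result-prefixes="exsl msxsl"
    version="1.0">
  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical result tree fragment that needs converting to a node-set -->
  <xsl:variable name="vFields">
    <f>Field1</f>
    <f>Field2</f>
  </xsl:variable>

  <xsl:template match="/">
    <Fields>
      <xsl:choose>
        <!-- Only the branch whose test succeeds is evaluated, so the
             unavailable function is never actually called -->
        <xsl:when test="function-available('exsl:node-set')">
          <xsl:copy-of select="exsl:node-set($vFields)/*"/>
        </xsl:when>
        <xsl:when test="function-available('msxsl:node-set')">
          <xsl:copy-of select="msxsl:node-set($vFields)/*"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:message terminate="yes">No node-set() extension function is available.</xsl:message>
        </xsl:otherwise>
      </xsl:choose>
    </Fields>
  </xsl:template>
</xsl:stylesheet>
```

The stylesheets in this post simply use the exsl prefix; the probe is only worth the extra lines if the same stylesheet has to run on more than one processor.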
What I Have So Far:
<?xml version="1.0"?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
exclude-result-prefixes="exsl"
version="1.0">
<xsl:output method="xml" omit-xml-declaration="no" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/*">
<Events>
<xsl:apply-templates select="*/item"/>
</Events>
</xsl:template>
<xsl:template match="item[contains(description, 'Field2')]">
<Event>
<xsl:variable name="vElements">
<xsl:call-template name="tokenize">
<xsl:with-param name="text" select="description"/>
<xsl:with-param name="delimiter" select="'&#xA;'"/>
</xsl:call-template>
</xsl:variable>
<Category>
<xsl:value-of select="title"/>
</Category>
<xsl:apply-templates
select="exsl:node-set($vElements)/*[normalize-space()]" mode="token"/>
</Event>
</xsl:template>
<!-- NOTE HOW THIS TEMPLATE IS NEARLY IDENTICAL TO THE LAST ONE,
MINUS THE BLANK <Field2>; THAT'S NOT VERY ELEGANT. -->
<xsl:template match="item[not(contains(description, 'Field2'))]">
<Event>
<xsl:variable name="vElements">
<xsl:call-template name="tokenize">
<xsl:with-param name="text" select="description"/>
<xsl:with-param name="delimiter" select="'&#xA;'"/>
</xsl:call-template>
</xsl:variable>
<Category>
<xsl:value-of select="title"/>
</Category>
<xsl:apply-templates
select="exsl:node-set($vElements)/*[normalize-space()]" mode="token"/>
<Field2/>
</Event>
</xsl:template>
<xsl:template match="*" mode="token">
<xsl:element
name="{substring-after(
substring-before(normalize-space(), ':'),
'&lt;div&gt;&lt;b&gt;')}">
<xsl:value-of
select="substring-before(
substring-after(., ':&lt;/b&gt; '),
'&lt;/div&gt;')"/>
</xsl:element>
</xsl:template>
<xsl:template name="tokenize">
<xsl:param name="text"/>
<xsl:param name="delimiter" select="'&#xA;'"/>
<xsl:choose>
<xsl:when test="contains($text,$delimiter)">
<xsl:element name="token">
<xsl:value-of select="substring-before($text,$delimiter)"/>
</xsl:element>
<xsl:call-template name="tokenize">
<xsl:with-param
name="text"
select="substring-after($text,$delimiter)"/>
<xsl:with-param
name="delimiter"
select="$delimiter"/>
</xsl:call-template>
</xsl:when>
<xsl:when test="$text">
<xsl:element name="token">
<xsl:value-of select="$text"/>
</xsl:element>
</xsl:when>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
...which produces:
<?xml version="1.0"?>
<Events>
<Event>
<Category>Title</Category>
<Field1>Value 1</Field1>
<Field2>Value 2</Field2>
<Field3>Value 3</Field3>
<Field4>Value 4</Field4>
<Field5>Value 5</Field5>
</Event>
<Event>
<Category>Title</Category>
<Field1>Value 1</Field1>
<Field3>Value 3</Field3>
<Field4>Value 4</Field4>
<Field5>Value 5</Field5>
<Field2/>
</Event>
<Event>
<Category>Title</Category>
<Field1>Value 1</Field1>
<Field2>Value 2</Field2>
<Field3>Value 3</Field3>
<Field4>Value 4</Field4>
<Field5>Value 5</Field5>
</Event>
</Events>
There are two primary issues with my solution:
- It feels clunky; there's repetitive code and it seems a tad unwieldy. I'm thinking that some optimization could occur?
- Notice that it outputs empty <Field2> elements in the incorrect order and places them at the bottom. This is somewhat easily remedied, I suppose, but all of my solutions seem silly and are therefore not included. :)
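One way both issues could be addressed while keeping the tokenizing approach is to use a single item template that walks a list of expected field names and looks each one up among the tokens. This is only a sketch: the vExpected list is an assumption (it hard-codes names that the rules say cannot be predicted), and it presumes the description holds escaped markup as text with one field per line.

```xml
<?xml version="1.0"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl"
    version="1.0">
  <xsl:output method="xml" indent="yes"/>

  <!-- Assumption: the expected field names, listed in output order -->
  <xsl:variable name="vExpected">
    <field>Field1</field>
    <field>Field2</field>
    <field>Field3</field>
    <field>Field4</field>
    <field>Field5</field>
  </xsl:variable>

  <xsl:template match="/">
    <Events>
      <xsl:apply-templates select="rss/channel/item"/>
    </Events>
  </xsl:template>

  <!-- A single template handles items with and without missing fields -->
  <xsl:template match="item">
    <!-- Split the description into one <token> per line -->
    <xsl:variable name="vTokens">
      <xsl:call-template name="tokenize">
        <xsl:with-param name="text" select="description"/>
      </xsl:call-template>
    </xsl:variable>
    <xsl:variable name="vLines" select="exsl:node-set($vTokens)/token"/>
    <Event>
      <Category><xsl:value-of select="title"/></Category>
      <xsl:for-each select="exsl:node-set($vExpected)/field">
        <xsl:variable name="vName" select="string(.)"/>
        <!-- First line whose label equals the expected name;
             an empty node-set if the field is absent -->
        <xsl:variable name="vLine"
            select="$vLines[substring-after(
                      substring-before(normalize-space(.), ':'),
                      '&lt;div&gt;&lt;b&gt;') = $vName][1]"/>
        <!-- An absent field yields an empty string, hence an empty element -->
        <xsl:element name="{$vName}">
          <xsl:value-of
              select="substring-before(
                        substring-after($vLine, ':&lt;/b&gt; '),
                        '&lt;/div&gt;')"/>
        </xsl:element>
      </xsl:for-each>
    </Event>
  </xsl:template>

  <!-- Same recursive tokenizer as above, defaulting to a newline delimiter -->
  <xsl:template name="tokenize">
    <xsl:param name="text"/>
    <xsl:param name="delimiter" select="'&#xA;'"/>
    <xsl:choose>
      <xsl:when test="contains($text, $delimiter)">
        <token><xsl:value-of select="substring-before($text, $delimiter)"/></token>
        <xsl:call-template name="tokenize">
          <xsl:with-param name="text" select="substring-after($text, $delimiter)"/>
          <xsl:with-param name="delimiter" select="$delimiter"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:when test="$text">
        <token><xsl:value-of select="$text"/></token>
      </xsl:when>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Because the output order comes from vExpected rather than from the feed, a missing field becomes an empty element in its proper position instead of being appended at the end.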
Ready, Set, Go!
I would appreciate your help with a more elegant solution (or, at the least, a solution that fixes issue #2 above). Thanks!
Conclusion
Based on observations made by @Borodin in his own solution, I decided to go with the following:
<?xml version="1.0"?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
exclude-result-prefixes="exsl"
version="1.0">
<xsl:output method="xml" omit-xml-declaration="no" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="vFieldNames">
<name oldName="Field1" newName="fieldA" />
<name oldName="Field2" newName="fieldB" />
<name oldName="Field3" newName="fieldC" />
<name oldName="Field4" newName="fieldD" />
<name oldName="Field5" newName="fieldE" />
</xsl:variable>
<xsl:template match="/">
<events>
<xsl:apply-templates select="*/*/item" />
</events>
</xsl:template>
<xsl:template match="item">
<event>
<category>
<xsl:value-of select="title" />
</category>
<xsl:apply-templates select="exsl:node-set($vFieldNames)/*">
<xsl:with-param
name="pDescriptionText"
select="current()/description" />
</xsl:apply-templates>
</event>
</xsl:template>
<xsl:template match="name">
<xsl:param name="pDescriptionText" />
<xsl:variable
name="vRough"
select="substring-before(
substring-after($pDescriptionText, @oldName),
'div')"/>
<xsl:variable
name="vValue"
select="substring-before(
substring-after($vRough, '>'),
'&lt;')"/>
<xsl:element name="{@newName}">
<xsl:value-of select="normalize-space($vValue)" />
</xsl:element>
</xsl:template>
</xsl:stylesheet>
This solution adds one extra layer: it allows me to change the field names nicely (via the oldName and newName attributes on each <name> element).
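One caveat with searching for the bare oldName: substring-after() picks the first occurrence of that string anywhere in the description, so a name that is a prefix of another (say, a hypothetical Field1 alongside Field10) or that also appears inside a value could pull in the wrong text. A possible tightening is to search for the whole label instead. The sketch below assumes every label is rendered exactly as <b>Name:</b> (escaped, as text) and that each value runs up to the closing </div>; the vLabel variable is introduced here for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl"
    version="1.0">
  <xsl:output method="xml" indent="yes"/>

  <xsl:variable name="vFieldNames">
    <name oldName="Field1" newName="fieldA" />
    <name oldName="Field2" newName="fieldB" />
  </xsl:variable>

  <xsl:template match="/">
    <events>
      <xsl:apply-templates select="*/*/item" />
    </events>
  </xsl:template>

  <xsl:template match="item">
    <event>
      <category>
        <xsl:value-of select="title" />
      </category>
      <xsl:apply-templates select="exsl:node-set($vFieldNames)/*">
        <xsl:with-param name="pDescriptionText" select="string(description)" />
      </xsl:apply-templates>
    </event>
  </xsl:template>

  <xsl:template match="name">
    <xsl:param name="pDescriptionText" />
    <!-- Build the full label, e.g. "<b>Field1:</b>", so that "Field1"
         cannot match inside "Field10" or inside a value -->
    <xsl:variable
        name="vLabel"
        select="concat('&lt;b&gt;', @oldName, ':&lt;/b&gt;')" />
    <!-- Text between the label and the closing </div>;
         an empty string when the label is absent -->
    <xsl:variable
        name="vValue"
        select="substring-before(
                  substring-after($pDescriptionText, $vLabel),
                  '&lt;/div&gt;')" />
    <xsl:element name="{@newName}">
      <xsl:value-of select="normalize-space($vValue)" />
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

A missing field still produces an empty element, since substring-after() returns an empty string when the label is not found.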
Thanks to all who answered!