我尝试使用从奇迹数据库 wikia 中提取的数据填充本体(您可以提取包含 wiki 的所有信息的 xml)。我的问题是这个 xml 太重了,不能用它做任何事情(超过 500Mo)。我尝试使用 xslt 将其转换为非常简单的 rdf 文件,但由于 xml 文件的大小,这几乎是不可能的。
xml 文档由以下页面组成:
<page>
<title>Aeroika (Earth-616)</title>
<ns>0</ns>
<id>1035</id>
<sha1>11t0be5viqp0vsj8zwglfu3wea8fou4</sha1>
<revision>
<id>1786343</id>
<timestamp>2011-10-04T17:49:37Z</timestamp>
<contributor>
<username>HamsterMan</username>
<id>2082346</id>
</contributor>
<minor/>
<text xml:space="preserve" bytes="1652">{{Marvel Database:Character Template
| Image = Aeroika (Earth-616).jpg
| RealName = Aeroika
| CurrentAlias = Aeroika
| Aliases =
| Identity =
| Affiliation = [[Defenders (Earth-616)|Defenders]]
| Relatives =
| Universe = Earth-616
| BaseOfOperations = [[Tunnelworld]]
| Gender = Male
| Height =
| Weight =
| Eyes =
| Hair = Gold
| UnusualSkinColour = Gold
| UnusualFeatures = Wings growing out of his head.
}}
[[Category:Flight]]</text>
</revision>
</page>
例如,在这种情况下,我做了一个 xslt,它在 rdf 中提取重要数据。
<xsl:template match="/">
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:si="http://www.w3schools.com/rdf/">
<xsl:for-each select="page">
<xsl:choose>
<xsl:when test="contains(revision/text, 'Character Template')">
<rdf:Description rdf:about="{title}">
<Image><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Image'),'|'),'=')" /></Image>
<RealName><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'RealName'),'|'),'=')" /></RealName>
<CurrentAlias><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'CurrentAlias'),'|'),'=')" /></CurrentAlias>
<Aliases><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Aliases'),'|'),'=')" /></Aliases>
<Identity><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Identity'),'|'),'=')" /></Identity>
<Affiliation><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Affiliation'),'|'),'=')" /></Affiliation>
<Relatives><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Relatives'),'|'),'=')" /></Relatives>
<Universe><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Universe'),'|'),'=')" /></Universe>
<BaseOfOperations><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'BaseOfOperations'),'|'),'=')" /></BaseOfOperations>
<Gender><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Gender'),'|'),'=')" /></Gender>
<Height><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Height'),'|'),'=')" /></Height>
<Weight><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Weight'),'|'),'=')" /></Weight>
<Eyes><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Eyes'),'|'),'=')" /></Eyes>
<Hair><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Hair'),'|'),'=')" /></Hair>
<UnusualSkinColour><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'UnusualSkinColour'),'|'),'=')" /></UnusualSkinColour>
<UnusualFeatures><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'UnusualFeatures'),'|'),'=')" /></UnusualFeatures>
<Citizenship><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Citizenship'),'|'),'=')" /></Citizenship>
<MaritalStatus><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'MaritalStatus'),'|'),'=')" /></MaritalStatus>
<Occupation><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Occupation'),'|'),'=')" /></Occupation>
<Education><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Education'),'|'),'=')" /></Education>
<Origin><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Origin'),'|'),'=')" /></Origin>
<PlaceOfBirth><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'PlaceOfBirth'),'|'),'=')" /></PlaceOfBirth>
<Creators><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Creators'),'|'),'=')" /></Creators>
<First><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'First'),'|'),'=')" /></First>
<HistoryText><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'HistoryText'),'|'),'=')" /></HistoryText>
<Powers><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Powers'),'|'),'=')" /></Powers>
<Abilities><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Abilities'),'|'),'=')" /></Abilities>
<Strength><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Strength'),'|'),'=')" /></Strength>
<Weaknesses><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Weaknesses'),'|'),'=')" /></Weaknesses>
<Equipement><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Equipement'),'|'),'=')" /></Equipement>
<Transportation><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Transportation'),'|'),'=')" /></Transportation>
<Weapons><xsl:value-of select="substring-after(substring-before(substring-after(revision/text, 'Weapons'),'|'),'=')" /></Weapons>
</rdf:Description>
</xsl:when>
<xsl:otherwise>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
</rdf:RDF>
</xsl:template>
</xsl:stylesheet>
你知道我该怎么做吗?谢谢