Is there a way to load an alignment file into Python? Suppose I have a file like this:

<?xml version='1.0' encoding='utf-8' standalone='no'?>
<rdf:RDF xmlns='http://knowledgeweb.semanticweb.org/heterogeneity/alignment#'
xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
xmlns:xsd='http://www.w3.org/2001/XMLSchema#'
xmlns:align='http://knowledgeweb.semanticweb.org/heterogeneity/alignment#'>
<Alignment>
<map>
      <Cell>
          <entity1 rdf:resource="http://linkeddata.uriburner.com/about/id/entity//www.last.fm/music/Catie+Curtis"></entity1>
          <entity2 rdf:resource="http://discogs.dataincubator.org/artist/catie-curtis"></entity2>
        <relation>=</relation>
        <measure rdf:datatype="http://www.w3.org/2001/XMLSchema#float">1.0</measure>
      </Cell>
    </map>
<map>
      <Cell>
          <entity1 rdf:resource="http://linkeddata.uriburner.com/about/id/entity//www.last.fm/music/Bigelf"></entity1>
          <entity2 rdf:resource="http://discogs.dataincubator.org/artist/bigelf"></entity2>
        <relation>=</relation>
        <measure rdf:datatype="http://www.w3.org/2001/XMLSchema#float">0.8</measure>
      </Cell>
    </map>
<map>
      <Cell>
          <entity1 rdf:resource="http://linkeddata.uriburner.com/about/id/entity//www.last.fm/music/%C3%81kos"></entity1>
          <entity2 rdf:resource="http://discogs.dataincubator.org/artist/%C3%81kos"></entity2>
        <relation>=</relation>
        <measure rdf:datatype="http://www.w3.org/2001/XMLSchema#float">0.9</measure>
      </Cell>
    </map>
</Alignment>
</rdf:RDF>

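I want to keep the confidence value along with the triple: Subject: http://linkeddata.uriburner.com/about/id/entity//www.last.fm/music/Catie+Curtis Predicate: owl:sameAs Object: http://discogs.dataincubator.org/artist/catie-curtis Confidence: 1.0

I tried to do this with RDFLib but had no success. Any suggestions would help, thanks!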

1 Answer

Try the Redland library: http://librdf.org/docs/python.html

import RDF
parser = RDF.Parser(name="rdfxml")
model = RDF.Model()
parser.parse_into_model(model, "file:./align.rdf", None)

Then query the model variable. For example, to retrieve every alignment cell together with its measure, the query looks like this:

for statement in RDF.Query("SELECT ?a ?m WHERE {?a a <http://knowledgeweb.semanticweb.org/heterogeneity/alignment#Cell> ; <http://knowledgeweb.semanticweb.org/heterogeneity/alignment#measure> ?m. }",query_language="sparql").execute(model):
    print("cell: %s measure:%s" % (statement['a'], statement['m']))

The result is an iterator of dictionary objects (variable name mapped to result), and it prints as follows:

cell: (r1301329275r1126r2) measure:1.0^^<http://www.w3.org/2001/XMLSchema#float>
cell: (r1301329275r1126r3) measure:0.8^^<http://www.w3.org/2001/XMLSchema#float>
cell: (r1301329275r1126r4) measure:0.9^^<http://www.w3.org/2001/XMLSchema#float>
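
The measures above print as typed literals rather than plain numbers. If only the printed string form is available, a small helper along these lines could turn it into a Python float. This is a minimal sketch: the name `measure_to_float` is hypothetical, it only handles the `value^^<datatype>` form shown in the output, and the Redland node objects may well offer a more direct accessor (check the API documentation).

```python
import re

# Matches the printed form "value^^<datatype-uri>" seen in the output above.
TYPED_LITERAL = re.compile(r"^(?P<value>.*)\^\^<(?P<datatype>[^>]+)>$")
XSD_FLOAT = "http://www.w3.org/2001/XMLSchema#float"


def measure_to_float(text):
    """Convert a printed xsd:float literal such as '1.0^^<...#float>' to float."""
    match = TYPED_LITERAL.match(text)
    if match is None:
        # No datatype suffix: assume the text is already a plain number.
        return float(text)
    if match.group("datatype") != XSD_FLOAT:
        raise ValueError("unexpected datatype: %s" % match.group("datatype"))
    return float(match.group("value"))


print(measure_to_float("0.8^^<http://www.w3.org/2001/XMLSchema#float>"))
```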

The Python API for retrieving the contents of a node is documented here: http://librdf.org/docs/python.html. For an overview of the SPARQL query language, see: http://www.w3.org/TR/rdf-sparql-query/
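
If installing an RDF library is not an option, the file in the question is regular enough that it can also be read as plain XML. The sketch below swaps in Python's standard `xml.etree.ElementTree` instead of Redland. The function name `read_alignment` and the `SAMPLE` document are hypothetical, and it assumes every Cell has relation `=`, so that it can be mapped to owl:sameAs as the question asks.

```python
import xml.etree.ElementTree as ET

# Namespaces taken from the alignment file in the question.
ALIGN = "http://knowledgeweb.semanticweb.org/heterogeneity/alignment#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def read_alignment(xml_text):
    """Return (subject, predicate, object, confidence) tuples, one per Cell."""
    root = ET.fromstring(xml_text)
    resource = "{%s}resource" % RDF_NS
    rows = []
    for cell in root.iter("{%s}Cell" % ALIGN):
        entity1 = cell.find("{%s}entity1" % ALIGN).get(resource)
        entity2 = cell.find("{%s}entity2" % ALIGN).get(resource)
        measure = float(cell.find("{%s}measure" % ALIGN).text)
        # Assumption: the relation is "=", so the pair is emitted as owl:sameAs.
        rows.append((entity1, OWL_SAME_AS, entity2, measure))
    return rows


# Hypothetical one-cell sample in the same shape as the question's file.
SAMPLE = """<rdf:RDF xmlns='http://knowledgeweb.semanticweb.org/heterogeneity/alignment#'
  xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<Alignment>
<map>
  <Cell>
    <entity1 rdf:resource="http://linkeddata.uriburner.com/about/id/entity//www.last.fm/music/Bigelf"></entity1>
    <entity2 rdf:resource="http://discogs.dataincubator.org/artist/bigelf"></entity2>
    <relation>=</relation>
    <measure rdf:datatype="http://www.w3.org/2001/XMLSchema#float">0.8</measure>
  </Cell>
</map>
</Alignment>
</rdf:RDF>"""

for subject, predicate, obj, confidence in read_alignment(SAMPLE):
    print(subject, predicate, obj, confidence)
```

To read a real file, pass its contents to `read_alignment`, for example `read_alignment(open("align.rdf", "rb").read())`. This approach depends on the exact XML layout shown in the question, so an RDF parser remains the more robust choice if the serialization can vary.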

Answered 2011-03-25T18:24:34.583