I am using SPARQL to extract a node from an RDF file. The node in the RDF file looks like this:

 <dc:description>Birds are a class of vertebrates. They are bipedal, warm-blooded, have a covering of feathers, and their front limbs are modified into wings. Some birds, such as penguins and ostriches, have lost the power of flight. All birds lay eggs. Because birds are warm-blooded, their eggs have to be incubated to keep the embryos inside warm, or they will perish.
    <br />
    <br />
    <a href="/nature/19700707">All you need to know about British birds.</a>
</dc:description>

I am using Python's RDFLib to fetch this node. It is returned as:

rdflib.term.BNode('Nfc3f01b2567a4b3ea23dbd01394929df')

How can I extract the text of dc:description from rdflib.term.BNode('Nfc3f01b2567a4b3ea23dbd01394929df')?

Here is what I tried, based on an answer:

from rdflib import *
import rdfextras
import json

#load the ontology
rdfextras.registerplugins()
g=Graph()

g.parse("http://www.bbc.co.uk/nature/life/Bird.rdf")


#define the prefixes
PREFIX = ''' PREFIX dc:<http://purl.org/dc/terms/>
             .......
             PREFIX po:<http://purl.org/ontology/po/>
             PREFIX owl:<http://www.w3.org/2002/07/owl#>
         '''

def exe(query):
    query = PREFIX + query
    return g.query(query)

def getEntity(entity_type, entity):
    # getting the description
    entity_url = "<http://www.bbc.co.uk/nature/life/" + entity.capitalize() + "#" + entity_type.lower() + ">"
    query = ''' SELECT ?description
                WHERE { ''' + entity_url + ''' dc:description ?description . }'''
    result_set = exe(query)
    dc = Namespace("http://purl.org/dc/terms/")
    for row in result_set:
        description = row[0]
        print description.value(dc.description)

getEntity("class","bird")

I get the following error:

Traceback (most recent call last):
  File "test_bird1.py", line 40, in <module>
    getEntity("class","bird")
  File "test_bird1.py", line 38, in getEntity
    print description.value(dc.description)
AttributeError: 'BNode' object has no attribute 'value'

2 Answers

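BNodes (and URIRefs too) are resources, so the documentation for the resource module is probably the most useful reference for you. Based on that documentation, something like the following should do what you need, where x is the blank node and g is the graph: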

>>> from rdflib import Namespace
>>> from rdflib.resource import Resource
>>> DC = Namespace("http://purl.org/dc/terms/")
>>> r = Resource(g, x)
>>> r.value(DC.description)

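As noted in the answer to your other question (SPARQL not returning correct results), the <br /> elements are not actually legal where they appear (you might need to use another serialization, e.g. N-Triples, N3, or Turtle), so it is hard to predict how different libraries will handle the malformed input. You may want to let the content producer know that they are publishing malformed data.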

answered 2014-01-03T17:50:16.930

from rdflib import Graph, BNode

g = Graph()
g.parse("http://www.bbc.co.uk/nature/life/Bird.rdf")

# Replace the placeholder string with the blank node's identifier
for obj in g.objects(subject=BNode("add the BNode id here")):
    print(obj)
answered 2017-03-05T13:18:03.113