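The idea is that I take a line of WordNet text, assign the different parts of the line to different variables, and feed those variables into an RDFlib graph as triples.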

Here is a sample line from the text file:

13797906 23 n 04 flood 0 inundation 0 deluge 0 torrent 0 005 @ 13796604 n 0000 + 00603894 a 0401 + 00753137 v 0302 + 01527311 v 0203 + 02361703 v 0101 | an overwhelming number or amount; "a flood of requests"; "a torrent of abuse"

Here is my code.

from rdflib import URIRef, Graph
from StringIO import StringIO

G = Graph()
F = open("new_2.txt", "r")
for line in F:

    L = line.split()
    L2 = line.strip().split('|')
    synset_offset = L[0]
    lex_filenum = L[1]
    ss_type = L[2]
    gloss = L2[1]                                   
    before_at, after_at = line.split('@', 1)
    N = int(L[3])
    K = int(before_at.split()[-1])                                     
    word = L[4:4 + 2 * N:2]                         
    iw = iter(word)
    S = after_at.split()[0:0 +4 * K:4]              
    ip = iter(S)
    SS = after_at.split()[1:1 + 4 * K:4]            
    iss = iter(SS)
    ST = after_at.split()[2:2 + 4 * K:4]          
    ist = iter(ST)

    line1 = '''http://www.example.org/lexicon#'''+synset_offset+''' http://www.w3.org/1999/02/22-rdf-syntax-ns#lex_filenum '''+lex_filenum+''''''
    line2 = '''http://www.example.org/lexicon#'''+synset_offset+''' http://www.monnetproject.eu/lemon#ss_type '''+ss_type+''''''             
    line3 = ''''''
    #line4 = '''http://www.example.org/lexicon#'''+synset_offset+''' http://www.monnetproject.eu/lemon#gloss '''gloss'''                         
    for item in word: 
        line3 += '''http://www.example.org/lexicon#'''+synset_offset+''' http://www.monnetproject.eu/lemon#lexical_entry '''+iw.next()+'''\n'''   
    line5 = ''''''
    for item in S:
        line5 += '''http://www.example.org/lexicon#'''+synset_offset+''' http://www.monnetproject.eu/lemon#has_ptr '''+ip.next()+'''\n'''            
    line6 = ''''''
    for item in SS:
        line6 += '''http://www.example.org/lexicon#'''+ip.next()+''' http://www.monnetproject.eu/lemon#pos '''+iss.next()+'''\n'''                      
    line7 = ''''''
    for item in ST:
        line7 += '''http://www.example.org/lexicon#'''+ip.next()+''' http://www.monnetproject.eu/lemon#source_target '''+ist.next()+'''\n'''     

    contents = '''\
    '''+line1+'''
    '''+line2+''' 
    '''+line3+'''
    '''+line5+'''
    '''+line6+'''  
    '''+line7+''''''#'''+line4+'''

    tabfile = StringIO(contents)
    for line in tabfile:
        triple = line.split()
        triple = (URIRef(t) for t in triple)
        G.add(triple)

print G.serialize(format='nt')

This all works perfectly up to line5. (line4 is commented out for a different reason; I don't need it yet.)

This is the error I get when I include line5, line6 and line7:

 G.add(triple)
  File "/usr/lib/python2.7/site-packages/rdflib-4.1_dev-py2.7.egg/rdflib/graph.py", line 352, in add
    def add(self, (s, p, o)):
ValueError: need more than 0 values to unpack

I don't understand what difference between line3 and line5 would cause the error; line3 works perfectly!


1 Answer


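It seems that S = after_at.split()[0:0 +4 * K:4] evaluates to an empty value, which would mean S is an empty list. Also, although you iterate over every item with for item in S, item is never used inside that loop!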
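One way to test this guess is to run the same slicing on the sample line from the question and inspect the result. The sketch below does that, then builds the pointer triples from the loop variables themselves instead of from separate iterators. In the question, the single iterator ip is shared by the line5, line6 and line7 loops, so it has nothing left after the first of them. The helpers parse_pointers and pointer_triples are hypothetical names introduced here for illustration. The sample gloss is shortened, and the choice of the pointer target as the subject of the pos and source_target triples is an assumption about the intent of line6 and line7.

```python
# Sketch only: parse_pointers and pointer_triples are hypothetical helpers,
# not part of rdflib or of the question's code.
BASE = 'http://www.example.org/lexicon#'
LEMON = 'http://www.monnetproject.eu/lemon#'


def parse_pointers(line):
    """Return (targets, pos_tags, source_targets) using the question's slicing."""
    before_at, after_at = line.split('@', 1)
    k = int(before_at.split()[-1])      # pointer count, e.g. '005' -> 5
    fields = after_at.split()
    return fields[0:4 * k:4], fields[1:4 * k:4], fields[2:4 * k:4]


def pointer_triples(synset_offset, targets, pos_tags, source_targets):
    """Build (s, p, o) string tuples from the loop variables directly."""
    triples = []
    for target, pos, st in zip(targets, pos_tags, source_targets):
        triples.append((BASE + synset_offset, LEMON + 'has_ptr', target))
        # Assumption: the pointer target is the subject of these two triples.
        triples.append((BASE + target, LEMON + 'pos', pos))
        triples.append((BASE + target, LEMON + 'source_target', st))
    return triples


# Sample line from the question, with the gloss shortened.
sample = ('13797906 23 n 04 flood 0 inundation 0 deluge 0 torrent 0 005 '
          '@ 13796604 n 0000 + 00603894 a 0401 + 00753137 v 0302 '
          '+ 01527311 v 0203 + 02361703 v 0101 | an overwhelming number or amount')

targets, pos_tags, source_targets = parse_pointers(sample)
print(targets)          # shows whether S would be empty for this line

triples = pointer_triples(sample.split()[0], targets, pos_tags, source_targets)
print(len(triples))

# A blank line splits into an empty list, which cannot be unpacked into (s, p, o).
print(''.split())
```

Each tuple could then be wrapped with URIRef and passed to G.add, as in the question. Building tuples directly also avoids the round trip through a text buffer, where any blank line in contents would split into zero tokens and produce the same "need more than 0 values to unpack" error.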

answered 2013-06-18T11:53:23.537