I'm having trouble querying DBpedia through Jena. An exception is thrown while iterating over the ResultSet with nextSolution. Here is the code:

ResultSet results = throwQuery(query);
ArrayList<Movies> movs = new ArrayList<Movies>();
while (results.hasNext()) {
    try {
        QuerySolution q = results.nextSolution();

        Movies m = new Movies();
        m.setUrl(q.get("film_url").toString());
        RDFNode node = q.get("film_label");
        // Set a default title
        String title = "";
        if (node != null) {
            // We delete the "@en" part that indicates that the label is in
            // english
            title = node.toString();
            int ind = title.indexOf("@en");
            title = title.substring(0, ind);
        }
        m.setTitle(title);

        node = q.get("image_url");
        // Set a default image
        String image = "http://4.bp.blogspot.com/_rY0CJheAaRM/SuYJcVOqKbI/AAAAAAAAA2Y/abClDm72TuY/s320/NoCoverAvailable.png";
        if (node != null) {
            // For some reason the image link retrieved from dbpedia is
            // broken. Here we fix it
            image = node.toString();
            int ind = image.indexOf("common");
            image = image.substring(0, ind) + "en" + image.substring(ind + 7);
        }
        m.setImageurl(image);

        movs.add(m);
    } catch (Exception e) {
        System.err.println("Error catched: " + e.getMessage());
    }
}

return movs;

where throwQuery is:

private final static String SERVICE = "http://dbpedia.org/sparql";

private static ResultSet throwQuery(String q) {
    Query qFactory = QueryFactory.create(q);
    QueryExecution qe = QueryExecutionFactory.sparqlService(SERVICE, qFactory);
    ResultSet results = null;
    try {
        results = qe.execSelect();
    } catch (QueryExceptionHTTP e) {
        System.out.println(e.getMessage());
        System.out.println(SERVICE + " is DOWN");
    } finally {
        qe.close();
        return results;
    }
}

and the test query is:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?film_label ?image_url ?film_url
WHERE {
?film_url rdf:type <http://dbpedia.org/ontology/Film> .
OPTIONAL{ 
    ?film_url rdfs:label ?film_label 
    FILTER (LANG(?film_label) = 'en')
}
OPTIONAL{   
    ?film_url foaf:depiction ?image_url 
}
FILTER regex(str(?film_url), "hola","i") 
}
ORDER BY ?film_url

When the program starts iterating, everything goes fine until it reaches the value Nicholas Nickleby (2002 film), and then I get this exception:

com.hp.hpl.jena.sparql.resultset.ResultSetException: XMLStreamException: Unexpected EOF in start tag
at [row,col {unknown-source}]: [67,116]
at com.hp.hpl.jena.sparql.resultset.XMLInputStAX$ResultSetStAX.staxError(XMLInputStAX.java:539)
at com.hp.hpl.jena.sparql.resultset.XMLInputStAX$ResultSetStAX.hasNext(XMLInputStAX.java:236)
at client.DBPediaConnector.getMovie(DBPediaConnector.java:67)
at customServices.MoviesService.searchInsertMovie(MoviesService.java:48)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.glassfish.ejb.security.application.EJBSecurityManager.runMethod(EJBSecurityManager.java:1052)
at org.glassfish.ejb.security.application.EJBSecurityManager.invoke(EJBSecurityManager.java:1124)
at com.sun.ejb.containers.BaseContainer.invokeBeanMethod(BaseContainer.java:5388)
at com.sun.ejb.EjbInvocation.invokeBeanMethod(EjbInvocation.java:619)
at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:800)
at com.sun.ejb.EjbInvocation.proceed(EjbInvocation.java:571)
at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.doAround(SystemInterceptorProxy.java:162)
at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.aroundInvoke(SystemInterceptorProxy.java:144)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at com.sun.ejb.containers.interceptors.AroundInvokeInterceptor.intercept(InterceptorManager.java:861)
at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:800)
at com.sun.ejb.containers.interceptors.InterceptorManager.intercept(InterceptorManager.java:370)
at com.sun.ejb.containers.BaseContainer.__intercept(BaseContainer.java:5360)
at com.sun.ejb.containers.BaseContainer.intercept(BaseContainer.java:5348)
at com.sun.ejb.containers.EJBLocalObjectInvocationHandler.invoke(EJBLocalObjectInvocationHandler.java:214)
... 47 more
Caused by: com.ctc.wstx.exc.WstxEOFException: Unexpected EOF in start tag
at [row,col {unknown-source}]: [67,116]
at com.ctc.wstx.sr.StreamScanner.throwUnexpectedEOF(StreamScanner.java:677)
at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:1034)
at com.ctc.wstx.sr.StreamScanner.getNextChar(StreamScanner.java:785)
at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2790)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1065)
at com.hp.hpl.jena.sparql.resultset.XMLInputStAX$ResultSetStAX.getOneSolution(XMLInputStAX.java:435)
at com.hp.hpl.jena.sparql.resultset.XMLInputStAX$ResultSetStAX.hasNext(XMLInputStAX.java:232)
... 71 more

To me this looks like an internal Jena error, but I'm not sure. Am I doing something wrong? How can I fix this?

1 Answer

Please provide a complete, minimal example. This one is quite long.

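DBpedia is returning broken XML for the results, probably because the query takes a long time to execute and a timeout is triggered. This looks like a moderately slow query.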

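If your Jena version is recent enough, try adding a timeout=60000 parameter to the service URL, as in "http://dbpedia.org/sparql?timeout=60000". That may still not be long enough: DBpedia has a hard internal limit that cannot be overridden.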
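When appending the parameter by hand, the separator matters: the first query parameter is introduced with "?" and later ones with "&". A minimal sketch of that rule follows. ServiceUrl and withParam are hypothetical names used only for illustration, not part of Jena, and the value is assumed to need no URL encoding.

```java
final class ServiceUrl {
    // Hypothetical helper: appends name=value to a service URL,
    // using '?' for the first parameter and '&' for any later one.
    // Assumes name and value contain no characters that need encoding.
    static String withParam(String serviceUrl, String name, String value) {
        String separator = serviceUrl.contains("?") ? "&" : "?";
        return serviceUrl + separator + name + "=" + value;
    }

    public static void main(String[] args) {
        // prints http://dbpedia.org/sparql?timeout=60000
        System.out.println(withParam("http://dbpedia.org/sparql", "timeout", "60000"));
    }
}
```

The resulting string would then be used as the SERVICE constant passed to QueryExecutionFactory.sparqlService in the question's code.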

Running the query at a different time of day may also help.

It could also simply be that bad XML is being sent back. Run the query in the DBpedia UI and fetch the XML results to check.
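If you save the XML from the DBpedia UI, one way to check it is to stream it through the JDK's own StAX parser and see whether it reaches the end of the document. This is only a sketch under assumptions: XmlCheck and firstError are hypothetical names, and the inline strings stand in for the saved response.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class XmlCheck {
    // Returns null if the stream parses to the end as well-formed XML,
    // otherwise the parser's error message (which includes the position).
    static String firstError(InputStream in) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            while (reader.hasNext()) {
                reader.next(); // pull events until the document ends or breaks
            }
            reader.close();
            return null;
        } catch (XMLStreamException e) {
            return e.getMessage();
        }
    }

    // Convenience overload for in-memory text.
    static String firstError(String xml) {
        return firstError(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        // A complete document, and one cut off partway through (as a timeout might do).
        System.out.println(firstError("<sparql><results><result/></results></sparql>"));
        System.out.println(firstError("<sparql><results><result>"));
    }
}
```

For a saved file, pass a java.io.FileInputStream to firstError instead of a string. A non-null result near the end of the file would be consistent with a truncated response.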

Answered 2013-04-01T09:52:28.640