JenaARQ 执行的优化之一是:“将过滤器放置在定义其依赖变量的位置附近”。
这会导致以下查询计划:
(filter (exprlist (|| (|| (isIRI ?Y) (isBlank ?Y)) (!= (datatype ?Y) <http://example.com/onto/rdf#structure>)) (|| (|| (isIRI ?Z) (isBlank ?Z)) (!= (datatype ?Z) <http://example.com/onto/rdf#structure>)))
(bgp
(triple ?X <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Student>)
(triple ?Y <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Faculty>)
(triple ?Z <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Course>)
(triple ?X <http://swat.cse.lehigh.edu/onto/univ-bench.owl#advisor> ?Y)
(triple ?Y <http://swat.cse.lehigh.edu/onto/univ-bench.owl#teacherOf> ?Z)
(triple ?X <http://swat.cse.lehigh.edu/onto/univ-bench.owl#takesCourse> ?Z)
)))
转化为以下内容:
(sequence
(filter (|| (|| (isIRI ?Z) (isBlank ?Z)) (!= (datatype ?Z) <http://example.com/onto/rdf#structure>))
(sequence
(filter (|| (|| (isIRI ?Y) (isBlank ?Y)) (!= (datatype ?Y) <http://example.com/onto/rdf#structure>))
(bgp
(triple ?X <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Student>)
(triple ?Y <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Faculty>)
))
(bgp (triple ?Z <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Course>))))
(bgp
(triple ?X <http://swat.cse.lehigh.edu/onto/univ-bench.owl#advisor> ?Y)
(triple ?Y <http://swat.cse.lehigh.edu/onto/univ-bench.owl#teacherOf> ?Z)
(triple ?X <http://swat.cse.lehigh.edu/onto/univ-bench.owl#takesCourse> ?Z)
)))
事实证明,虽然原始查询计划以毫秒为单位运行,但“优化”查询计划需要大约 7 个小时才能完成。
JenaARQ 是否考虑任何用于优化查询计划中的过滤器位置的统计信息?
我正在使用耶拿 3.12.0。