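A colleague of mine needs to obtain two sets of RDF triples involving associations between terms. The terms come from a list, and the associations come from a set of triples that use those terms.
The first set is all triples in which any item from the term list is the subject or the object of the triple.
The second set is all triples in which any two terms are one or two predicates away from each other, where the predicates are not necessarily bidirectional. So, for s1 and s2 in the term list, the two triples s1 → s3 and s2 → s3 would be valid.
I think I already have an answer, but I wanted to post the question as a contribution to the SPARQL knowledge base, and to check myself.
Given data like this: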
@prefix : <urn:ex:> .
:a :p :b .
:a :p :e .
:b :p :c .
:b :p :d .
:c :p :a .
:c :p :f .
:d :p :a .
:d :p :d .
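If we take (:b :c) as the set of terms of interest, the following query finds all the triples you are interested in. Note that the condition for the first set, i.e., triples ?s ?p ?o where either ?s or ?o is in the term list, also picks up part of the second set, namely the part where two terms are directly connected, i.e., where both ?s and ?o are in the term list.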
prefix : <urn:ex:>
select distinct ?s ?p ?between ?q ?o where {
# term list appearing twice in order to
# get all pairs of items
values ?x { :b :c }
values ?y { :b :c }
# This handles the first set (where either the subject or
# object is from the term list). Note that the first set
# includes part of the second set; when two terms from
# the list are separated by just one predicate, then it's
# a case where either the subject or object are from the
# term list (since *both* are).
{ ?s ?p ?x bind(?x as ?o)} UNION { ?x ?p ?o bind(?x as ?s)}
UNION
# The rest of the second set is when two terms from the
# list are connected by a path of length two. This is
# a straightforward pattern to write.
{ ?x ?p ?between . ?between ?q ?y .
bind(?x as ?s)
bind(?y as ?o) }
}
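In the results, the single triples are the rows where only s, p, and o are bound. These cover your first set as well as the "distance = 1" part of the second set. The rest of the second set also binds between and q. In terms of the example in your question, between is s3.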
$ arq --data data.n3 --query query.sparql
-------------------------------
| s | p | between | q | o |
===============================
| :a | :p | | | :b |
| :b | :p | | | :d |
| :b | :p | | | :c |
| :c | :p | | | :f |
| :c | :p | | | :a |
| :c | :p | :a | :p | :b |
-------------------------------
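As a sanity check on the table above, the same two sets can be computed by brute force in plain Python. This is only a sketch: the tuple representation of triples and the names TRIPLES, TERMS, first_set, and two_hop_paths are my own assumptions for illustration, not part of any RDF library.

```python
# Sample data from above, written as (subject, predicate, object) tuples.
TRIPLES = {
    ("a", "p", "b"), ("a", "p", "e"),
    ("b", "p", "c"), ("b", "p", "d"),
    ("c", "p", "a"), ("c", "p", "f"),
    ("d", "p", "a"), ("d", "p", "d"),
}
TERMS = {"b", "c"}


def first_set(triples, terms):
    """Triples whose subject or object is in the term list."""
    return {(s, p, o) for (s, p, o) in triples if s in terms or o in terms}


def two_hop_paths(triples, terms):
    """Directed paths x -> between -> y with both endpoints in the term list."""
    return {
        (x, p, between, q, y)
        for (x, p, between) in triples if x in terms
        for (mid, q, y) in triples if mid == between and y in terms
    }


print(sorted(first_set(TRIPLES, TERMS)))
print(sorted(two_hop_paths(TRIPLES, TERMS)))
```

The first function yields the five rows where only s, p, and o are bound, and the second yields the single row where between and q are bound as well.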
Given the example in the comments, I think this query can be shortened considerably to the following:
prefix : <urn:ex:>
select distinct ?x ?p ?between ?q ?y where {
values ?x { :b :c }
values ?y { :b :c }
{ ?x ?p ?between } UNION { ?between ?p ?x }
{ ?between ?q ?y } UNION { ?y ?q ?between }
}
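Once we have bound ?x ?p ?between or ?between ?p ?x, we are just saying that there is an edge (in either direction) between ?x and ?between. ?y and ?q extend this path, so we have:
?x --?p-- ?between --?q-- ?y
where the actual direction of --?p-- and --?q-- may be left or right. This covers all the cases we need. It is probably not hard to see why paths of length two match this pattern, but the case of triples where only the subject or the object is a special term deserves elaboration. Given a triple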
<term> <prop> <non-term>
we can get the path, with ?x and ?y both bound to <term> and the same triple matched twice:
<term> --<prop>-- <non-term> --<prop>-- <term>
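The same holds when <term> is the object and <non-term> is the subject. It also covers the case where both the subject and the object are terms. With the data above, the results are: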
$ arq --data data.n3 --query paths.sparql
-------------------------------
| x | p | between | q | y |
===============================
| :b | :p | :d | :p | :b |
| :b | :p | :c | :p | :b |
| :b | :p | :a | :p | :b |
| :c | :p | :a | :p | :b |
| :b | :p | :a | :p | :c |
| :c | :p | :f | :p | :c |
| :c | :p | :a | :p | :c |
| :c | :p | :b | :p | :c |
-------------------------------
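To see why the shortened query produces these eight rows, here is a minimal Python sketch of the same "edge in either direction, twice" idea. The helper names neighbors and two_step_rows are hypothetical, and triples are again modeled as plain tuples rather than through an RDF library.

```python
# Sample data from above, written as (subject, predicate, object) tuples.
TRIPLES = {
    ("a", "p", "b"), ("a", "p", "e"),
    ("b", "p", "c"), ("b", "p", "d"),
    ("c", "p", "a"), ("c", "p", "f"),
    ("d", "p", "a"), ("d", "p", "d"),
}
TERMS = {"b", "c"}


def neighbors(triples, node):
    """(predicate, other_end) pairs for every triple touching node,
    regardless of which way the triple points."""
    found = set()
    for s, p, o in triples:
        if s == node:
            found.add((p, o))
        if o == node:
            found.add((p, s))
    return found


def two_step_rows(triples, terms):
    """Rows (x, p, between, q, y): one undirected step from x to between,
    then one undirected step from between to y, with x and y both terms."""
    return {
        (x, p, between, q, y)
        for x in terms
        for (p, between) in neighbors(triples, x)
        for (q, y) in neighbors(triples, between)
        if y in terms
    }


for row in sorted(two_step_rows(TRIPLES, TERMS)):
    print(row)
```

Rows such as (b, p, d, p, b) come from a single triple walked out and back, which is how the first set ends up included.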
If we add some information about which direction ?p and ?q point, we can reconstruct the paths:
prefix : <urn:ex:>
select distinct ?x ?p ?pdir ?between ?q ?qdir ?y where {
values ?x { :b :c }
values ?y { :b :c }
{ ?x ?p ?between bind("right" as ?pdir)} UNION { ?between ?p ?x bind("left" as ?pdir)}
{ ?between ?q ?y bind("right" as ?qdir)} UNION { ?y ?q ?between bind("left" as ?qdir)}
}
This gives the output:
$ arq --data data.n3 --query paths.sparql
---------------------------------------------------
| x | p | pdir | between | q | qdir | y |
===================================================
| :b | :p | "right" | :d | :p | "left" | :b | # :b -> :d
| :b | :p | "right" | :c | :p | "left" | :b | # :b -> :c
| :b | :p | "left" | :a | :p | "right" | :b | # :a -> :b
| :c | :p | "right" | :a | :p | "right" | :b | # :c -> :a -> :b
| :b | :p | "left" | :a | :p | "left" | :c | # :c -> :a -> :b
| :c | :p | "right" | :f | :p | "left" | :c | # :c -> :f
| :c | :p | "right" | :a | :p | "left" | :c | # :c -> :a
| :c | :p | "left" | :b | :p | "right" | :c | # :b -> :c
---------------------------------------------------
The path c -> a -> b is duplicated, but that could presumably be filtered out.
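One possible way to do that filtering after the fact is to rebuild the stored triples behind each row and keep only one row per distinct set of triples. The sketch below does this in Python on the rows copied from the table; the names ROWS, edge, row_triples, and unique_paths are assumptions made for this illustration.

```python
# Rows copied from the table above: (x, p, pdir, between, q, qdir, y).
ROWS = [
    ("b", "p", "right", "d", "p", "left", "b"),
    ("b", "p", "right", "c", "p", "left", "b"),
    ("b", "p", "left", "a", "p", "right", "b"),
    ("c", "p", "right", "a", "p", "right", "b"),
    ("b", "p", "left", "a", "p", "left", "c"),
    ("c", "p", "right", "f", "p", "left", "c"),
    ("c", "p", "right", "a", "p", "left", "c"),
    ("c", "p", "left", "b", "p", "right", "c"),
]


def edge(left, pred, direction, right):
    """Rebuild the stored triple for one hop.
    "right" means the arrow points left -> right, "left" means right -> left."""
    return (left, pred, right) if direction == "right" else (right, pred, left)


def row_triples(row):
    """The set of stored triples that a result row was built from."""
    x, p, pdir, between, q, qdir, y = row
    return frozenset({edge(x, p, pdir, between), edge(between, q, qdir, y)})


def unique_paths(rows):
    """Keep the first row for each distinct set of underlying triples."""
    seen, kept = set(), []
    for row in rows:
        key = row_triples(row)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


for row in unique_paths(ROWS):
    print(row, sorted(row_triples(row)))
```

Note that this key also merges the two rows that both describe the single triple :b :p :c, so six rows remain rather than seven.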
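If you are actually looking for a set of triples here rather than specific paths, you can use a CONSTRUCT query, which returns a graph (since a set of triples is a graph):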
prefix : <urn:ex:>
construct {
?s1 ?p ?o1 .
?s2 ?q ?o2 .
}
where {
values ?x { :b :c }
values ?y { :b :c }
{ ?x ?p ?between .
bind(?x as ?s1)
bind(?between as ?o1) }
UNION
{ ?between ?p ?x .
bind(?between as ?s1)
bind(?x as ?o1)}
{ ?between ?q ?y .
bind(?between as ?s2)
bind(?y as ?o2) }
UNION
{ ?y ?q ?between .
bind(?y as ?s2)
bind(?between as ?o2)}
}
$ arq --data data.n3 --query paths-construct.sparql
@prefix : <urn:ex:> .
<urn:ex:b>
<urn:ex:p> <urn:ex:c> ;
<urn:ex:p> <urn:ex:d> .
<urn:ex:c>
<urn:ex:p> <urn:ex:f> ;
<urn:ex:p> <urn:ex:a> .
<urn:ex:a>
<urn:ex:p> <urn:ex:b> .
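The graph above is simply the union of every triple that takes part in some match. A rough Python equivalent, assuming the same tuple model as the data and a hypothetical helper name construct_graph, looks like this:

```python
# Sample data from above, written as (subject, predicate, object) tuples.
TRIPLES = {
    ("a", "p", "b"), ("a", "p", "e"),
    ("b", "p", "c"), ("b", "p", "d"),
    ("c", "p", "a"), ("c", "p", "f"),
    ("d", "p", "a"), ("d", "p", "d"),
}
TERMS = {"b", "c"}


def construct_graph(triples, terms):
    """Union of all triples used by some x -- between -- y match,
    where each step may follow a triple in either direction."""
    def touching(node):
        return {t for t in triples if node in (t[0], t[2])}

    graph = set()
    for x in terms:
        for first in touching(x):
            # The end of the first triple that is not x is the middle node.
            between = first[2] if first[0] == x else first[0]
            for second in touching(between):
                y = second[2] if second[0] == between else second[0]
                if y in terms:
                    graph |= {first, second}
    return graph


for triple in sorted(construct_graph(TRIPLES, TERMS)):
    print(triple)
```

Because the result is a set, the duplicated paths seen in the SELECT output collapse on their own.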
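You can use UNION in your queries. In either case, you have a set of patterns you are looking for, and you want to collect information from the UNION of those patterns.
For the first set, to get all triples that contain a list item as the subject or the object: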
SELECT ?s ?p ?o # result triples
WHERE
{
# get a term bound to ?term
GRAPH <urn:termsList/>
{ ?term a <urn:types/word> } # or however the terms are stored
# match ?term against the basic patterns
GRAPH <urn:associations/>
{
{
?term ?p ?o . # basic pattern #1
BIND(?term AS ?s) # so that ?term shows up in the results
}
UNION # take ?term as either subject or object
{
?s ?p ?term . # basic pattern #2
BIND(?term AS ?o)
}
}
}
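First, get bindings for all the terms (?term a ...).
Then match them against the basic patterns:
?term ?p ?o
and
?s ?p ?term.
After each pattern match, use BIND to place ?term in the appropriate position in the results. For example, the first pattern only binds ?p and ?o, so the corresponding ?s has to be bound as well; otherwise that column would just show up blank.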
For the second set, we first take two words from the list. We want a many-to-many matching:
?term1 a … .
?term2 a … .
The basic patterns:
?term1 ?p1 ?term2

?term1 ?p1 ?term .
?term2 ?p2 ?term .

?term1 ?p1 ?term .
?term ?p2 ?term2 .

?term ?p1 ?term1 .
?term ?p2 ?term2 .
Add a filter to each of the last three to ensure that ?term1 and ?term2 are not the same:
FILTER(!SAMETERM(?term1, ?term2))
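(We could put these filters outside all the unions, but filtering variables locally, before they are used any further, is typically more efficient.)
Finally, UNION the results together: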
SELECT ?s1 ?p1 ?o1 ?s2 ?p2 ?o2
WHERE
{
GRAPH <urn:termsList/>
{
?term1 a <urn:types/word> . # outer loop variable
?term2 a <urn:types/word> . # inner loop variable
}
GRAPH <urn:associations/>
{
{
# Only need to check one direction; either end gets
# matched into ?term1 at some point
?term1 ?p1 ?term2 .
BIND (?term1 AS ?s1) .
BIND (?term2 AS ?o1) . # Note we leave ?s2, ?p2, ?o2 unbound here
}
UNION
{
?term1 ?p1 ?term .
?term2 ?p2 ?term .
FILTER(!SAMETERM(?term1, ?term2))
BIND(?term1 AS ?s1) .
BIND(?term AS ?o1) .
BIND(?term2 AS ?s2) .
BIND(?term AS ?o2)
}
UNION
{
?term1 ?p1 ?term .
?term ?p2 ?term2 .
FILTER(!SAMETERM(?term1, ?term2))
BIND(?term1 AS ?s1) .
BIND(?term AS ?o1) .
BIND(?term AS ?s2) .
BIND(?term2 AS ?o2)
}
UNION
{
?term ?p1 ?term1 .
?term ?p2 ?term2 .
FILTER(!SAMETERM(?term1, ?term2))
BIND(?term AS ?s1) .
BIND(?term1 AS ?o1) .
BIND(?term AS ?s2) .
BIND(?term2 AS ?o2)
}
}
}
We will test the queries against the following text. For the word list:
# For God so loved the world, that he gave his only begotten Son, that
# whosoever believeth in him should not perish, but have everlasting life.
# John 3:16
@prefix : <urn:terms/> .
@prefix t: <urn:types/> .
:For a t:word .
:God a t:word .
:so a t:word .
:loved a t:word .
:the a t:word .
:world a t:word .
:that a t:word .
:he a t:word .
:gave a t:word .
:his a t:word .
:only a t:word .
:begotten a t:word .
:Son a t:word .
:that a t:word .
:whosoever a t:word .
:believeth a t:word .
:in a t:word .
:him a t:word .
:should a t:word .
:not a t:word .
:perish a t:word .
:but a t:word .
:have a t:word .
:everlasting a t:word .
:life a t:word .
And a list of associations:
# For the wages of sin is death; but the gift of God is eternal life through
# Jesus Christ our Lord.
# Romans 6:23
@prefix : <urn:terms/> .
@prefix g: <urn:grammar/> .
:For g:clauseAt :wages ;
g:nextClauseHeadAt :but .
:the g:describes :wages .
:wages g:predicate :is .
:of g:describes :wages ;
g:nominative :sin .
:is g:object :death .
:but g:clauseAt :gift .
:the g:describes :gift .
:gift g:predicate :is .
:of g:describes :gift ;
g:nominative :God .
:is g:object :life .
:eternal g:describes :life .
:through g:describes :is ;
g:nominative :Jesus .
:Christ g:describes :Jesus .
:our g:describes :Lord .
:Lord g:describes :Jesus .
Query 1:
----------------------------------------------------------------------------
| s | p | o |
============================================================================
| <urn:terms/For> | <urn:grammar/nextClauseHeadAt> | <urn:terms/but> |
| <urn:terms/For> | <urn:grammar/clauseAt> | <urn:terms/wages> |
| <urn:terms/of> | <urn:grammar/nominative> | <urn:terms/God> |
| <urn:terms/is> | <urn:grammar/object> | <urn:terms/life> |
| <urn:terms/eternal> | <urn:grammar/describes> | <urn:terms/life> |
| <urn:terms/but> | <urn:grammar/clauseAt> | <urn:terms/gift> |
| <urn:terms/For> | <urn:grammar/nextClauseHeadAt> | <urn:terms/but> |
| <urn:terms/the> | <urn:grammar/describes> | <urn:terms/gift> |
| <urn:terms/the> | <urn:grammar/describes> | <urn:terms/wages> |
----------------------------------------------------------------------------
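The For / nextClauseHeadAt / but row shows up twice because both ends of that triple are in the word list, so it matches once per pattern. The following Python sketch replays query 1 by brute force to show this; prefixes are dropped for brevity, and WORDS, ASSOCIATIONS, and query_one are names assumed for this illustration.

```python
# Word list (John 3:16) and associations (Romans 6:23) from the data above.
WORDS = set(
    "For God so loved the world that he gave his only begotten Son that "
    "whosoever believeth in him should not perish but have everlasting life".split()
)
ASSOCIATIONS = [
    ("For", "clauseAt", "wages"), ("For", "nextClauseHeadAt", "but"),
    ("the", "describes", "wages"), ("wages", "predicate", "is"),
    ("of", "describes", "wages"), ("of", "nominative", "sin"),
    ("is", "object", "death"), ("but", "clauseAt", "gift"),
    ("the", "describes", "gift"), ("gift", "predicate", "is"),
    ("of", "describes", "gift"), ("of", "nominative", "God"),
    ("is", "object", "life"), ("eternal", "describes", "life"),
    ("through", "describes", "is"), ("through", "nominative", "Jesus"),
    ("Christ", "describes", "Jesus"), ("our", "describes", "Lord"),
    ("Lord", "describes", "Jesus"),
]


def query_one(words, triples):
    """One row per (term, matching triple) pair, duplicates kept,
    mirroring a SELECT without DISTINCT."""
    rows = []
    for term in sorted(words):
        rows += [(s, p, o) for (s, p, o) in triples if s == term]  # pattern #1
        rows += [(s, p, o) for (s, p, o) in triples if o == term]  # pattern #2
    return rows


rows = query_one(WORDS, ASSOCIATIONS)
print(len(rows), "rows,", len(set(rows)), "distinct")
```

Adding DISTINCT to the SELECT would be the SPARQL-side way to drop the repeated row.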
Query 2:
----------------------------------------------------------------------------------------------------------------------------------------
| s1 | p1 | o1 | s2 | p2 | o2 |
========================================================================================================================================
| <urn:terms/For> | <urn:grammar/nextClauseHeadAt> | <urn:terms/but> | | | |
| <urn:terms/For> | <urn:grammar/clauseAt> | <urn:terms/wages> | <urn:terms/the> | <urn:grammar/describes> | <urn:terms/wages> |
| <urn:terms/but> | <urn:grammar/clauseAt> | <urn:terms/gift> | <urn:terms/the> | <urn:grammar/describes> | <urn:terms/gift> |
| <urn:terms/the> | <urn:grammar/describes> | <urn:terms/wages> | <urn:terms/For> | <urn:grammar/clauseAt> | <urn:terms/wages> |
| <urn:terms/the> | <urn:grammar/describes> | <urn:terms/gift> | <urn:terms/but> | <urn:grammar/clauseAt> | <urn:terms/gift> |
----------------------------------------------------------------------------------------------------------------------------------------
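Note that there is some redundancy here. This is due to the double-loop nature of how we bind values to ?term1 and ?term2, so that ?term1 becomes ?term2 and vice versa. If this is unacceptable, you can simply change line 1 to just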
SELECT DISTINCT ?s1 ?p1 ?o1
Of course, this makes the BINDings for ?s2 and ?o2 unnecessary, since they are only bound for the SELECT.
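To make the redundancy concrete, the sketch below replays the four patterns of query 2 in Python and then treats each row as an unordered set of triples, which is one alternative way of removing the mirrored rows. Prefixes are dropped, and the names WORDS, ASSOCIATIONS, query_two, and distinct_results are hypothetical.

```python
# Word list (John 3:16) and associations (Romans 6:23) from the data above.
WORDS = set(
    "For God so loved the world that he gave his only begotten Son that "
    "whosoever believeth in him should not perish but have everlasting life".split()
)
ASSOCIATIONS = [
    ("For", "clauseAt", "wages"), ("For", "nextClauseHeadAt", "but"),
    ("the", "describes", "wages"), ("wages", "predicate", "is"),
    ("of", "describes", "wages"), ("of", "nominative", "sin"),
    ("is", "object", "death"), ("but", "clauseAt", "gift"),
    ("the", "describes", "gift"), ("gift", "predicate", "is"),
    ("of", "describes", "gift"), ("of", "nominative", "God"),
    ("is", "object", "life"), ("eternal", "describes", "life"),
    ("through", "describes", "is"), ("through", "nominative", "Jesus"),
    ("Christ", "describes", "Jesus"), ("our", "describes", "Lord"),
    ("Lord", "describes", "Jesus"),
]


def query_two(words, triples):
    """Rows (first_triple, second_triple); second_triple is None for direct links."""
    rows = []
    for t1 in words:          # outer loop variable, like ?term1
        for t2 in words:      # inner loop variable, like ?term2
            # Direct link ?term1 ?p1 ?term2 (no filter on this pattern).
            rows += [(t, None) for t in triples if t[0] == t1 and t[2] == t2]
            if t1 == t2:
                continue      # plays the role of FILTER(!SAMETERM(...))
            for a in triples:
                for b in triples:
                    shared_object = a[0] == t1 and b[0] == t2 and a[2] == b[2]
                    chain = a[0] == t1 and a[2] == b[0] and b[2] == t2
                    shared_subject = a[2] == t1 and b[2] == t2 and a[0] == b[0]
                    # One row per pattern that matches this pair of triples.
                    rows += [(a, b)] * (shared_object + chain + shared_subject)
    return rows


def distinct_results(rows):
    """Collapse mirrored rows by ignoring the order of the two triples."""
    return {frozenset(t for t in row if t is not None) for row in rows}


rows = query_two(WORDS, ASSOCIATIONS)
print(len(rows), "rows,", len(distinct_results(rows)), "distinct results")
```

On this data the five rows reduce to three results: the direct For / but link, the pair of triples meeting at wages, and the pair meeting at gift.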
"For if we have been united with [Christ] in a death like his, we shall certainly be united with him in a resurrection like his" (Romans 6:5 ESV).