rdf - Why do some RDF files not contain rdf:Description?

Question

I am using Jena to write an RDF file that describes online posts. The SIOC ontology/namespace I am using defines, for example, the following:

Class: sioc:Post
Property: sioc:has_creator

In Jena, how can I get sioc:Post written to the file as

<sioc:Post rdf:about="http://example.com/vb/1035092">

instead of

<rdf:Description rdf:about="http://example.com/vb/1035092">

What is the best practice?

score 2 · Accepted Answer

Both of the answers so far make good points:

You should not pay much attention to the particular serialization of your RDF graph, because there are lots of different serializations, and you should be accessing them using an API that exposes the graph, not the serialization. (See, for instance, Don't query RDF (or OWL) with XPath in one of my previous answers, for some comments about depending on a particular XML serialization.)
The difference that you're seeing is that the simplest RDF/XML serialization uses lots of rdf:Description elements, each containing rdf:type elements that indicate the types of the described resource. However, the RDF/XML serialization format defines many abbreviations that can make the serialization of a graph much shorter, more readable, and, in some cases, more like a traditional XML document. Others have mentioned that using the type as the element name is just one such abbreviation, but I think it's worth examining the spec on this point. This particular abbreviation is defined in 2.13 Typed Nodes:

It is common for RDF graphs to have rdf:type predicates from subject nodes. These are conventionally called typed nodes in the graph, or typed node elements in the RDF/XML. RDF/XML allows this triple to be expressed more concisely by replacing the rdf:Description node element name with the namespaced-element corresponding to the RDF URI reference of the value of the type relationship. There may, of course, be multiple rdf:type predicates but only one can be used in this way, the others must remain as property elements or property attributes.

The typed node elements are commonly used in RDF/XML with the built-in classes in the RDF vocabulary: rdf:Seq, rdf:Bag, rdf:Alt, rdf:Statement, rdf:Property and rdf:List.

For example, the RDF/XML in Example 14 could be written as shown in Example 15.

Example 14: Complete example with rdf:type (example14.rdf output example14.nt)
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:ex="http://example.org/stuff/1.0/">
  <rdf:Description rdf:about="http://example.org/thing">
    <rdf:type rdf:resource="http://example.org/stuff/1.0/Document"/>
    <dc:title>A marvelous thing</dc:title>
  </rdf:Description>
</rdf:RDF>
Example 15: Complete example using a typed node element to replace an rdf:type (example15.rdf output example15.nt)
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:ex="http://example.org/stuff/1.0/">
  <ex:Document rdf:about="http://example.org/thing">
    <dc:title>A marvelous thing</dc:title>
  </ex:Document>
</rdf:RDF>
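
To make the equivalence concrete, here is a minimal Python sketch (standard library only) that pulls triples out of both forms and compares them. It is a toy reader, not a real RDF/XML parser: the helper names `triples` and `_uri` are hypothetical, it only handles top-level node elements with rdf:about, and it does not distinguish literals from URI references.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Example 14 from the spec: rdf:Description plus an explicit rdf:type.
PLAIN = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:ex="http://example.org/stuff/1.0/">
  <rdf:Description rdf:about="http://example.org/thing">
    <rdf:type rdf:resource="http://example.org/stuff/1.0/Document"/>
    <dc:title>A marvelous thing</dc:title>
  </rdf:Description>
</rdf:RDF>"""

# Example 15 from the spec: the same data as a typed node element.
TYPED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:ex="http://example.org/stuff/1.0/">
  <ex:Document rdf:about="http://example.org/thing">
    <dc:title>A marvelous thing</dc:title>
  </ex:Document>
</rdf:RDF>"""


def _uri(tag):
    # ElementTree spells qualified names as "{namespace}local".
    namespace, local = tag[1:].split("}", 1)
    return namespace + local


def triples(rdfxml):
    """Toy extractor: returns a set of (subject, predicate, object) strings."""
    found = set()
    for node in ET.fromstring(rdfxml):
        subject = node.get(f"{{{RDF}}}about")
        # Any node element name other than rdf:Description implies an rdf:type triple.
        if node.tag != f"{{{RDF}}}Description":
            found.add((subject, RDF + "type", _uri(node.tag)))
        for prop in node:
            resource = prop.get(f"{{{RDF}}}resource")
            obj = resource if resource is not None else (prop.text or "")
            found.add((subject, _uri(prop.tag), obj))
    return found


same = triples(PLAIN) == triples(TYPED)
print(same)
```

Both documents describe one graph with two triples, which is why a consumer that goes through an RDF API never notices which form was written.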

If you're using Jena, you can get extensive control over the way that your RDF/XML output is formatted. These options are documented in the Advanced RDF/XML Output section of the documentation. For your case, however, simply choosing RDF/XML-ABBREV rather than RDF/XML as the output format will do what you want. For instance, look at the results using the Jena command line rdfcat tool. Here's our data (in Turtle):

# The actual namespace doesn't matter for this example.
@prefix sioc: <http://example.org/> . 

<http://example.com/vb/1035092>
  a sioc:Post ;
  sioc:has_creator "someone" .

Let's convert this to simple RDF/XML:

$ rdfcat -out RDF/XML data.n3
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:sioc="http://example.org/" > 
  <rdf:Description rdf:about="http://example.com/vb/1035092">
    <rdf:type rdf:resource="http://example.org/Post"/>
    <sioc:has_creator>someone</sioc:has_creator>
  </rdf:Description>
</rdf:RDF>

Now let's convert it to RDF/XML-ABBREV:

$ rdfcat -out RDF/XML-ABBREV data.n3
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:sioc="http://example.org/">
  <sioc:Post rdf:about="http://example.com/vb/1035092">
    <sioc:has_creator>someone</sioc:has_creator>
  </sioc:Post>
</rdf:RDF>

In the first case you see an rdf:Description element with rdf:type and sioc:has_creator subelements, but in the second case you see a sioc:Post element with only a sioc:has_creator subelement.
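
As a rough illustration of this one abbreviation (and only this one), the Python sketch below rewrites an rdf:Description that has an rdf:type child into a typed node element. This is not how Jena's writer is implemented; `abbreviate_types` is a hypothetical helper, it assumes the type URI splits cleanly at its last `/` or `#` into a namespace and a valid XML local name, and it uses only the first rdf:type it finds, as the spec allows.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The plain RDF/XML output shown above.
PLAIN = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:sioc="http://example.org/">
  <rdf:Description rdf:about="http://example.com/vb/1035092">
    <rdf:type rdf:resource="http://example.org/Post"/>
    <sioc:has_creator>someone</sioc:has_creator>
  </rdf:Description>
</rdf:RDF>"""


def abbreviate_types(rdfxml, prefixes):
    """Turn rdf:Description + rdf:type into a typed node element (toy version)."""
    # Register prefixes so the serializer reuses them instead of ns0, ns1, ...
    for prefix, namespace in prefixes.items():
        ET.register_namespace(prefix, namespace)
    root = ET.fromstring(rdfxml)
    for node in root:
        if node.tag != f"{{{RDF}}}Description":
            continue
        type_el = node.find(f"{{{RDF}}}type")
        if type_el is None:
            continue
        uri = type_el.get(f"{{{RDF}}}resource")
        if uri is None:
            continue
        # Assumption: namespace ends at the last '/' or '#', the rest is the local name.
        cut = max(uri.rfind("#"), uri.rfind("/")) + 1
        node.tag = "{" + uri[:cut] + "}" + uri[cut:]
        # The type is now carried by the element name, so drop the explicit triple.
        node.remove(type_el)
    return ET.tostring(root, encoding="unicode")


result = abbreviate_types(PLAIN, {"rdf": RDF, "sioc": "http://example.org/"})
print(result)
```

In practice you would let the serializer make this choice rather than post-process XML; the sketch only shows that the two outputs differ in spelling, not in content.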

As to best practice, I don't know that it really matters. The RDF/XML-ABBREV output will typically be a bit shorter, so it incurs less network overhead in transmission, takes less storage on disk, and is easier to read. The simpler RDF/XML is faster to write, though. On most graphs this won't make a big difference, but generating RDF/XML-ABBREV can be pretty expensive, as a recent thread on the Jena mailing list discusses.

score 1 · Accepted Answer

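You really should not get hung up on what the machine-readable output of your data looks like. Jena generates valid RDF, and any other RDF parser/framework will be able to read it and let you do things with it.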

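The style you want is not valid in that form; in your example it would need to be rdf:ID, which means that the thing identified by the URI is a sioc:Post. In the latter case, it is basically just a container for statements about that URI; you'll see a separate rdf:type triple asserting that the individual is a sioc:Post.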

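But seriously, to reiterate, it does not matter what the RDF output looks like. If you want it to look a certain way because you are going to edit it by hand, don't. Get a tool like Protege or TopBraid and use that.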

score 1 · Accepted Answer

Jena has two RDF/XML writers. Use RDF/XML-ABBREV (for example, as the language argument to Model.write) to get a more readable format.

As Michael rightly says, though, don't obsess over it. Parsers don't care.
