I have an RDF/XML document in this format:
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:ags="http://purl.org/agmes/1.1/" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:dct="http://purl.org/dc/terms/">
<bibo:Article rdf:about="http://xxxxx/NO8500391">
<dct:identifier>NO8500391</dct:identifier>
...
</bibo:Article>
<bibo:Article rdf:about="http://xxxxx/NO8500523">
...
</bibo:Article>
<bibo:Article rdf:about="http://xxxxx/NO8500496">
...
</bibo:Article>
</rdf:RDF>
As you can see, a single RDF/XML file contains many bibo:Article elements, possibly thousands. What I want is to extract each article and convert it to RDF/JSON using Apache Jena (I already know how to write a model), so that I have a separate document for each article and can later import them all into an index such as CouchDB or Elasticsearch to perform searches.
How can I extract each article from the Jena model?
The dirty way I was thinking of is to process the file as plain XML and extract each bibo:Article element.
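For reference, here is a minimal sketch of that XML-level approach using only the JDK's DOM and transformer classes, not Jena. The class name `ArticleSplitter` and the method `split` are hypothetical names chosen for illustration. It assumes every bibo:Article is a direct child of rdf:RDF and that each element is self-contained. That second assumption is not guaranteed by RDF/XML in general, since statements about one resource may be spread over several elements or refer to shared blank nodes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: splits one RDF/XML document into one document per bibo:Article.
class ArticleSplitter {
    static final String BIBO_NS = "http://purl.org/ontology/bibo/";

    static List<String> split(String rdfXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to match on namespace URI + local name
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document source = builder.parse(new InputSource(new StringReader(rdfXml)));
            Element root = source.getDocumentElement();
            Transformer serializer = TransformerFactory.newInstance().newTransformer();

            List<String> documents = new ArrayList<>();
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Assumption: articles are direct children of rdf:RDF, as in the sample.
                if (child.getNodeType() != Node.ELEMENT_NODE
                        || !BIBO_NS.equals(child.getNamespaceURI())
                        || !"Article".equals(child.getLocalName())) {
                    continue;
                }
                Document target = builder.newDocument();
                // A shallow import copies rdf:RDF with its attributes (including the
                // xmlns:* declarations) but none of its children.
                Element newRoot = (Element) target.importNode(root, false);
                target.appendChild(newRoot);
                newRoot.appendChild(target.importNode(child, true));

                StringWriter out = new StringWriter();
                serializer.transform(new DOMSource(target), new StreamResult(out));
                documents.add(out.toString());
            }
            return documents;
        } catch (Exception e) {
            // Wrap parser and transformer failures so callers need no checked exceptions.
            throw new IllegalArgumentException("Could not split RDF/XML input", e);
        }
    }
}
```

Each string returned by `ArticleSplitter.split(xml)` is a standalone RDF/XML document with the original namespace declarations, so it could then be read into its own Jena model and written out as RDF/JSON. For files with thousands of articles, a streaming parser would use less memory than building the whole DOM.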