0

I need a dataset of related items. For example, a flower has related subtypes (roses, violets, etc.), and each of those subtypes has its own subtypes. This could be a graph of related items of the kind used in semantic search engines.

Is there anywhere that offers such a dataset (preferably with images)?

4

2 Answers

2

WordNet would be a good start. You can get it from here for free.

ConceptNet is another great ontology. Its quality is lower, but it covers a much larger number of concepts. Here's the ConceptNet page for flower

The third source I'd recommend checking out is Wikipedia's cross-article links.
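As a rough illustration of the "subtypes of subtypes" structure asked about, here is a minimal sketch that walks WordNet's hyponym ("is a kind of") links. It assumes the third-party NLTK package is installed and that the WordNet corpus has been fetched with `nltk.download("wordnet")`; the helper names `hyponym_tree`, `print_tree`, and `flower_tree` are made up for this example.

```python
def hyponym_tree(synset, max_depth=2):
    """Return a nested (name, [children]) tuple for a synset-like object.

    Works with anything exposing .name() and .hyponyms(), which is the
    interface NLTK's WordNet synsets provide.
    """
    if max_depth == 0:
        return (synset.name(), [])
    children = [hyponym_tree(h, max_depth - 1) for h in synset.hyponyms()]
    return (synset.name(), children)


def print_tree(tree, indent=0):
    """Print the nested tuple as an indented outline."""
    name, children = tree
    print("  " * indent + name)
    for child in children:
        print_tree(child, indent + 1)


def flower_tree(max_depth=2):
    """Build the subtype tree rooted at WordNet's 'flower.n.01' synset."""
    # Imported lazily so the helpers above work without NLTK installed.
    from nltk.corpus import wordnet as wn

    return hyponym_tree(wn.synset("flower.n.01"), max_depth)
```

With NLTK and the corpus available, `print_tree(flower_tree())` should print flower subtypes two levels deep. WordNet itself carries no images, so this only covers the graph part of the question.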

Answered 2012-08-06T20:32:08.170
1

Expanding on the Wikipedia suggestion from Sagie above: DBpedia is a project that extracts structured data from Wikipedia into datasets. They state that their datasets cover 3.77 million 'things' and 400 million facts. Localised information is also available in different languages:

The full DBpedia data set features labels and abstracts for 10.3 million unique things in up to 111 different languages; 8.0 million links to images and 24.4 million HTML links to external web pages; 27.2 million data links into external RDF data sets, 55.8 million links to Wikipedia categories, and 8.2 million YAGO categories. The dataset consists of 1.89 billion pieces of information (RDF triples) out of which 400 million were extracted from the English edition of Wikipedia, 1.46 billion were extracted from other language editions, and about 27 million are data links to external RDF data sets.

Their dataset is queryable via SPARQL. An example they give returns the 20 most populous cities with an urban population of over 2 million:

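# Prefixes declared explicitly so the query also runs outside DBpedia's own endpoint
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?subject ?population WHERE {
  ?subject rdf:type <http://dbpedia.org/ontology/City> .
  ?subject <http://dbpedia.org/ontology/populationUrban> ?population .
  # Keep only cities with an urban population above 2 million
  FILTER (xsd:integer(?population) > 2000000)
}
ORDER BY DESC(xsd:integer(?population))
LIMIT 20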
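To run a query like the one above from a script, something along these lines might work. It is only a sketch using the Python standard library: the endpoint URL `https://dbpedia.org/sparql` and the `format` request parameter are assumptions about DBpedia's public service, and the function names are invented for this example. The response parsing follows the standard SPARQL JSON results layout (`results` → `bindings`).

```python
import json
import urllib.parse
import urllib.request

# Assumed public endpoint; adjust if DBpedia is hosted elsewhere.
ENDPOINT = "https://dbpedia.org/sparql"

QUERY = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?subject ?population WHERE {
  ?subject rdf:type <http://dbpedia.org/ontology/City> .
  ?subject <http://dbpedia.org/ontology/populationUrban> ?population .
  FILTER (xsd:integer(?population) > 2000000)
}
ORDER BY DESC(xsd:integer(?population))
LIMIT 20
"""


def build_url(query, endpoint=ENDPOINT):
    """Encode the query as a GET request URL asking for JSON results."""
    params = urllib.parse.urlencode(
        {"query": query, "format": "application/sparql-results+json"}
    )
    return f"{endpoint}?{params}"


def parse_bindings(payload):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in payload["results"]["bindings"]
    ]


def run_query(query):
    """Send the query over the network and return the parsed rows."""
    with urllib.request.urlopen(build_url(query), timeout=30) as resp:
        return parse_bindings(json.load(resp))
```

Calling `run_query(QUERY)` needs network access and should return one dict per city, with `subject` holding the resource URI and `population` the value as a string.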
Answered 2012-08-12T12:08:34.587