Linked Hypernyms Dataset

v2016-04

The Linked Hypernyms Dataset (LHD) associates entity articles from the English, German and Dutch Wikipedia with a DBpedia resource or a DBpedia ontology concept as their type. The types are hypernyms mined from the articles' free text using hand-crafted lexico-syntactic patterns.
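To give a rough idea of what a lexico-syntactic hypernym pattern does, the following Python sketch applies a single "X is a Y" rule to the first sentence of an article. It is a toy illustration only: the regular expression, the stop-word list and the function name `extract_hypernym` are assumptions made for this example and are not the patterns used to build the dataset.

```python
import re

# Hypothetical, simplified pattern: "<subject> is/was a|an|the <modifiers> <head noun>".
# The noun phrase ends at a common function word, at punctuation, or at end of input.
_PATTERN = re.compile(
    r"\b(?:is|was|are|were)\s+(?:a|an|the)\s+([A-Za-z\- ]+?)"
    r"(?=\s+(?:of|in|from|that|which|who|and|by|for)\b|[,.;]|$)",
    re.IGNORECASE,
)


def extract_hypernym(sentence):
    """Return the last word of the matched noun phrase, or None if nothing matches."""
    match = _PATTERN.search(sentence)
    if match is None:
        return None
    phrase = match.group(1).strip()
    # Treat the final word of the phrase as the head noun (a crude heuristic).
    return phrase.split()[-1].lower() if phrase else None
```

For the sentence "Albert Einstein was a German-born theoretical physicist who developed the theory of relativity.", this sketch returns the string `physicist`. A real extraction pipeline would typically rely on part-of-speech tagging rather than a fixed stop-word list.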

The datasets were generated using DBpedia 2016 and Wikipedia snapshots from March 2016.


A new dockerized version of the LHD framework is available on GitHub.

All partitions of the dataset, as described in the dataset description section, can be downloaded from here.


The downloads are provided as gzipped N-Triples files. The numbers correspond to instance counts (in thousands).
Download highlights

| Dataset | Description | Dutch | English | German |
|---|---|---|---|---|
| Core Dataset | Most accurate - result of pattern matching | nt | nt | nt |
| Inference Dataset | Types are in the DBpedia ontology namespace - merge of Core, STI | nt | nt | nt |
| Extension Dataset | Types are in the DBpedia resource namespace - highest type specificity | nt | nt | nt |
| Raw "Plain Text" Dataset | All hypernyms are string literals (the original extracted word) | nt | nt | nt |
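Since the downloads are gzipped N-Triples files, they can be read as a stream without unpacking them first. The sketch below uses only the Python standard library. The function name `iter_triples` and the file name in the usage note are hypothetical, and the regular expression covers only plain IRI and literal triples, not the full N-Triples grammar (blank nodes are skipped, for example).

```python
import gzip
import re

# Simplified N-Triples line: <subject> <predicate> followed by either an
# <object IRI> or a "literal" with an optional @lang tag or ^^<datatype>.
_TRIPLE = re.compile(
    r'^\s*<([^>]*)>\s+<([^>]*)>\s+'
    r'(?:<([^>]*)>|"((?:[^"\\]|\\.)*)"(?:@[A-Za-z0-9-]+|\^\^<[^>]*>)?)'
    r'\s*\.\s*$'
)


def iter_triples(path):
    """Yield (subject, predicate, object) string tuples from a gzipped N-Triples file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            stripped = line.strip()
            if not stripped or stripped.startswith("#"):
                continue  # skip blank lines and comments
            match = _TRIPLE.match(line)
            if match is None:
                continue  # skip lines this simplified parser does not cover
            subject, predicate, iri, literal = match.groups()
            obj = iri if iri is not None else literal
            yield subject, predicate, obj
```

Assuming a downloaded file saved under the hypothetical name `lhd_core_en.nt.gz`, `for s, p, o in iter_triples("lhd_core_en.nt.gz"): ...` would iterate over entity and type pairs one line at a time. For production use, a full RDF parser is the safer choice, since it handles escapes and the complete grammar.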

Publications