Linked Hypernyms Dataset

The Linked Hypernyms Dataset (LHD) assigns each entity article in the English, German and Dutch Wikipedia a DBpedia resource or a DBpedia ontology concept as its type. The types are hypernyms mined from the articles' free text using hand-crafted lexico-syntactic patterns.
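
To make the idea of a lexico-syntactic pattern concrete, here is a minimal sketch in Python. It is a toy illustration only and not the pattern set used by LHD: the regular expressions, the boundary word list and the function name `extract_hypernym` are all assumptions made for this example. It looks for a copula followed by an article ("is a", "was the", ...) and takes the last word of the noun phrase that follows as the hypernym.

```python
import re

# Toy pattern (an assumption, not the LHD grammar): copula + article,
# then the text up to the next punctuation mark.
COPULA_RE = re.compile(
    r"\b(?:is|was|are|were)\s+(?:an|a|the)\s+([^,.;()]+)", re.IGNORECASE
)

# Words that usually end the head noun phrase in this toy setting.
BOUNDARY_RE = re.compile(
    r"\s+(?:of|in|from|and|or|that|which|who|by|for|on|at|with)\s+",
    re.IGNORECASE,
)


def extract_hypernym(sentence):
    """Return a candidate hypernym (lowercased) or None if nothing matches."""
    match = COPULA_RE.search(sentence)
    if match is None:
        return None
    # Keep only the part of the phrase before the first boundary word.
    phrase = BOUNDARY_RE.split(match.group(1).strip(), maxsplit=1)[0]
    words = phrase.split()
    # Treat the last word of the remaining phrase as the head noun.
    return words[-1].lower() if words else None


print(extract_hypernym("Prague is the capital of the Czech Republic."))  # capital
```

A real extraction pipeline would rely on part-of-speech tagging and language-specific grammars rather than a single regular expression; the sketch only shows the general shape of pattern-based hypernym discovery.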

The datasets were generated from DBpedia 2016 and from Wikipedia snapshots taken in March 2016.

20/6/2017. A Dockerized version of the LHD framework won the DBpedia TextExt Challenge. Check out the presentation we gave at LDK 2017.

All partitions of the dataset, as described in the dataset description section, can be downloaded from here.

The downloads are provided as gzipped N-Triples files. The numbers give the instance counts (in thousands).

Download highlights

Dataset	Description	Dutch	English	German
Core Dataset	Most accurate - result of pattern matching	nt	nt	nt
Inference Dataset	Types are in the DBpedia ontology namespace - merge of Core, STI	nt	nt	nt
Extension Dataset	Types are in the DBpedia resource namespace - highest type specificity	nt	nt	nt
Raw "Plain Text" Dataset	All hypernyms are string literals (the original extracted word)	nt	nt	nt
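
Since the downloads are gzipped N-Triples files, they can be read with the Python standard library alone. The following is a minimal sketch under stated assumptions: the file name `lhd_core_en.nt.gz` is hypothetical, the sample triples are made up for illustration, and the simplified line parser only handles triples whose subject and predicate are IRIs (it is not a full N-Triples parser).

```python
import gzip
import re
from collections import Counter

# Simplified N-Triples line: <subject> <predicate> object .
TRIPLE_RE = re.compile(r"^<([^>]*)>\s+<([^>]*)>\s+(.+?)\s*\.$")


def parse_line(line):
    """Return (subject, predicate, object) or None for blank/comment/odd lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = TRIPLE_RE.match(line)
    if match is None:
        return None
    subj, pred, obj = match.groups()
    # Strip angle brackets from IRI objects; literals are kept verbatim.
    if obj.startswith("<") and obj.endswith(">"):
        obj = obj[1:-1]
    return subj, pred, obj


def iter_triples(path):
    """Stream triples from a gzipped N-Triples file without loading it fully."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            triple = parse_line(line)
            if triple is not None:
                yield triple


def count_types(triples):
    """Count how often each object (the assigned type) occurs."""
    return Counter(obj for _, _, obj in triples)


# Made-up sample lines, used here instead of a real download.
sample = [
    "<http://example.org/resource/A> "
    "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
    "<http://example.org/ontology/City> .",
    "# comment lines and blank lines are skipped",
    "",
]
parsed = [t for t in map(parse_line, sample) if t is not None]
print(count_types(parsed).most_common(1))
```

With a downloaded file, the same counting would be `count_types(iter_triples("lhd_core_en.nt.gz"))`, where the file name is again a placeholder. For the Raw "Plain Text" Dataset the objects are string literals, which this sketch returns unchanged, quotes included.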

Publications

T. Kliegr, O. Zamazal. LHD 2.0: A Text Mining Approach to Typing Entities In Knowledge Graphs. Web Semantics, 2016 (preprint)
T. Kliegr. Linked Hypernyms: Enriching DBpedia with Targeted Hypernym Discovery. Web Semantics, 2015