The Linked Hypernym dataset (LHD) assigns types to entity articles in the English, German and Dutch Wikipedia. Each type is either a DBpedia resource or a DBpedia ontology concept. The types are hypernyms mined from the articles' free text using hand-crafted lexico-syntactic patterns.
The datasets were generated for DBpedia 2016 from Wikipedia snapshots taken in March 2016.
20/6/2017: The Dockerized version of the LHD framework won the DBpedia TextExt Challenge. Check out the presentation we gave at LDK 2017.
All partitions of the dataset, as described in the dataset description section, can be downloaded from here.
Download highlights

Dataset | Description | Dutch | English | German
---|---|---|---|---
Core Dataset | Most accurate; result of pattern matching | nt | nt | nt
Inference Dataset | Types are in the DBpedia ontology namespace; merge of Core, STI | nt | nt | nt
Extension Dataset | Types are in the DBpedia resource namespace; highest type specificity | nt | nt | nt
Raw "Plain Text" Dataset | All hypernyms are string literals (the original extracted word) | nt | nt | nt
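The `nt` downloads above can be inspected with a few lines of Python. The following is a minimal sketch, assuming that `nt` denotes the N-Triples format and that each line carries one type statement of the form `<entity> <predicate> <type> .`, with a string literal in place of the type IRI in the raw dataset. The sample lines, the entity names in them, and the helper names `parse_type_assertions` and `count_types` are illustrative assumptions and are not taken from the actual files. The regular expression covers only simple statements, so a full N-Triples parser would be the safer choice for production use.

```python
import re
from collections import Counter

# Matches one simple N-Triples statement: <subject> <predicate> <object> .
# The object is either an IRI or a literal with an optional language tag
# or datatype. Blank nodes are deliberately not handled in this sketch.
TRIPLE_RE = re.compile(
    r'^<([^>]*)>\s+<([^>]*)>\s+'
    r'(?:<([^>]*)>|"((?:[^"\\]|\\.)*)"(?:@[\w-]+|\^\^<[^>]*>)?)'
    r'\s*\.$'
)


def parse_type_assertions(lines):
    """Yield (entity, predicate, type_value, is_literal) for each parsable line."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE_RE.match(line)
        if match is None:
            continue  # skip anything this simplified pattern cannot read
        subject, predicate, object_iri, object_literal = match.groups()
        if object_iri is not None:
            yield subject, predicate, object_iri, False
        else:
            yield subject, predicate, object_literal, True


def count_types(lines):
    """Count how often each type value (IRI or literal) is assigned."""
    return Counter(type_value for _, _, type_value, _ in parse_type_assertions(lines))


# Made-up sample lines for illustration only; they are not dataset content.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SAMPLE = [
    f"<http://dbpedia.org/resource/Example_A> <{RDF_TYPE}> <http://dbpedia.org/ontology/Person> .",
    f"<http://dbpedia.org/resource/Example_B> <{RDF_TYPE}> <http://dbpedia.org/ontology/Person> .",
    f'<http://dbpedia.org/resource/Example_C> <{RDF_TYPE}> "writer"@en .',
]

if __name__ == "__main__":
    for type_value, count in count_types(SAMPLE).most_common():
        print(count, type_value)
```

To run the same count over a downloaded file, pass an open file handle instead of `SAMPLE`, for example `count_types(open(path, encoding="utf-8"))`, where `path` is wherever the file was saved locally.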
Publications
- T. Kliegr, O. Zamazal. LHD 2.0: A Text Mining Approach to Typing Entities In Knowledge Graphs. Web Semantics, 2016 (preprint)
- T. Kliegr. Linked Hypernyms: Enriching DBpedia with Targeted Hypernym Discovery. Web Semantics, 2015