The Linked Hypernyms Dataset (LHD) associates entity articles in the English, German and Dutch Wikipedias with a DBpedia resource or a DBpedia ontology concept as their type. The types are hypernyms mined from the articles' free text using hand-crafted lexico-syntactic patterns.
The dataset contains 4.8 million entity-type assignments.
The dataset was generated with DBpedia 3.9 and Wikipedia snapshots in May 2014.
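To give a feel for what a lexico-syntactic pattern for hypernym discovery can look like, here is a toy sketch in Python. The regular expression, the stop-word list and the function name `extract_hypernym` are assumptions made for illustration only; they are not the hand-crafted patterns used to build this dataset, and they handle only simple English "X is a Y" sentences.

```python
import re

# Toy pattern (illustrative assumption, not the dataset's own grammar):
# "... is/was a/an/the <noun phrase>", where the noun phrase ends at the first
# preposition, relative pronoun, conjunction, or punctuation mark.
PATTERN = re.compile(
    r"\b(?:is|was)\s+(?:a|an|the)\s+(?P<np>[^.,;]+?)"
    r"(?=\s+(?:of|in|from|for|by|at|on|who|which|that|and)\b|[.,;]|$)",
    re.IGNORECASE,
)


def extract_hypernym(first_sentence):
    """Return the head noun of the first 'is a ...' phrase, or None."""
    match = PATTERN.search(first_sentence)
    if match is None:
        return None
    words = match.group("np").split()
    # Heuristic for English: the last word of the noun phrase is its head.
    return words[-1].lower() if words else None


print(extract_hypernym("Václav Havel was a Czech writer, philosopher and statesman."))
# prints "writer"
print(extract_hypernym("Berlin is the capital of Germany."))
# prints "capital"
```

The extracted word corresponds to the kind of plain-text hypernym that would then have to be linked to a DBpedia resource or ontology type; that linking step is not shown here.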
The latest version of the Linked Hypernyms Dataset - late May 2014!
All partitions of the dataset, as described in the dataset description section, can be downloaded from here.
Download highlights

Dataset | Dutch | English | German |
---|---|---|---|
LHD 1.0: hypernyms are either DBpedia resources or DBpedia ontology types. | nt 834k | nt 3,013k | nt 893k |
LHD 2.0: all hypernyms are DBpedia ontology types. | nt 785k | nt 2,960k | nt 795k |
Complete "Plain Text" Hypernym Dataset: all hypernyms are string literals (the original extracted word). | nt 1,304k | nt 3,385k | nt 1,033k |
Other datasets: all auxiliary files created during the extraction process. | zip | zip | zip |
Processed resources (with a short abstract) | 1,404k | 4,004k | 1,367k |

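The `nt` downloads are N-Triples files, with one statement per line. The following minimal sketch shows one way to read such a file and count how often each type is assigned. The sample lines, the use of `rdf:type` as the predicate, and the helper names `parse_triples` and `count_types` are assumptions for illustration; check the downloaded files for the predicates they actually use. The simple regular expression covers IRI subjects and predicates with IRI or literal objects, skips anything else (for example blank nodes), and does not decode escape sequences inside literals.

```python
import re
from collections import Counter

# One N-Triples statement: IRI subject, IRI predicate, and an object that is
# either an IRI or a literal with an optional language tag or datatype.
TRIPLE = re.compile(
    r'^\s*<(?P<s>[^>]*)>\s+<(?P<p>[^>]*)>\s+'
    r'(?:<(?P<iri>[^>]*)>'
    r'|"(?P<lit>(?:[^"\\]|\\.)*)"(?:@[A-Za-z0-9-]+|\^\^<[^>]*>)?)'
    r'\s*\.\s*$'
)


def parse_triples(lines):
    """Yield (subject, predicate, object) tuples from N-Triples lines."""
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE.match(line)
        if match is None:
            continue  # skip statements this simple pattern does not cover
        iri = match.group("iri")
        obj = iri if iri is not None else match.group("lit")
        yield match.group("s"), match.group("p"), obj


def count_types(lines):
    """Count how many entities are assigned to each object (type)."""
    return Counter(obj for _, _, obj in parse_triples(lines))


# Hypothetical sample lines, not taken from the dataset.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SAMPLE = [
    f"<http://dbpedia.org/resource/Berlin> <{RDF_TYPE}> <http://dbpedia.org/ontology/City> .",
    f"<http://dbpedia.org/resource/Hamburg> <{RDF_TYPE}> <http://dbpedia.org/ontology/City> .",
    f'<http://dbpedia.org/resource/Example> <{RDF_TYPE}> "writer"@en .',
]

for type_name, count in count_types(SAMPLE).most_common():
    print(count, type_name)
```

Because `count_types` accepts any iterable of lines, an open file handle can be passed to it directly, so a large download does not have to be loaded into memory at once.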
Publications
- T. Kliegr, V. Zeman, M. Dojchinovski. Linked Hypernyms Dataset - Generation Framework and Use Cases. In Linguistic Linked Data (LDL'14) Challenge collocated with LREC 2014, Reykjavik, Iceland, May, 2014.
- T. Kliegr, O. Zamazal. Towards Linked Hypernyms Dataset 2.0: complementing DBpedia with hypernym discovery. In 9th International Language Resources and Evaluation Conference (LREC'14), Reykjavik, Iceland, May, 2014.