This page presents experimental results that demonstrate both the coverage and the quality of the datasets. The creation of the Linked Hypernyms Dataset consists of two steps: hypernym discovery, which extracts a plain-text hypernym from a Wikipedia article, and hypernym disambiguation, which resolves the plain-text hypernym to a DBpedia resource or a DBpedia Ontology class. Below, we present results for the individual phases of this process, as well as an evaluation of the resulting Linked Hypernyms Dataset.
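For intuition, here is a toy sketch of the discovery step. It is only an illustration: the actual extractor is built from hand-crafted lexico-syntactic patterns over linguistically processed text, not a single regex, and the example sentence is invented.

```python
import re

# Toy sketch: grab the head noun after "is a/an/the" in the article's
# first sentence, skipping premodifiers such as nationality adjectives.
FIRST_SENTENCE_PATTERN = re.compile(
    r"\b(?:is|was)\s+(?:a|an|the)\s+(?:[a-z]+\s+)*?([a-z]+)\s+(?:who|that|which|born|from|in|of)\b"
)

def discover_hypernym(first_sentence: str) -> str | None:
    match = FIRST_SENTENCE_PATTERN.search(first_sentence.lower())
    return match.group(1) if match else None

print(discover_hypernym(
    "Karel Čapek was a Czech writer who introduced the word robot."
))  # -> "writer"
```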
Overall accuracy (available only for selected partitions)
Accuracy estimates are provided for selected partitions (subdatasets) of the Linked Hypernyms Dataset. From each subdataset, 1,000 articles were randomly selected, and a human annotator evaluated whether the type assigned to the entity is correct. Wilson confidence intervals are also reported.
Datasets with accuracy estimate

| Dataset | Accuracy |
|---|---|
| Type is a DBpedia Ontology class, not redundant w.r.t. DBpedia | 0.82 ± 2% |
| - subset with YAGO Exact Match | 0.994 [0.99; 1.00] |
| - subset with YAGO No Type | 0.91 ± 2% |
| - subset with no match with YAGO | 0.90 ± 2% |
| Type is a DBpedia resource, not redundant w.r.t. DBpedia | 0.83 ± 2% |
| - subset with YAGO Exact Match | 0.86 ± 2% |
| - subset with YAGO No Type | 0.82 ± 2% |
| - subset with no match with YAGO | 0.77 ± 2.5% |
| Type is a DBpedia Ontology class, DBpedia assigns no type | 0.94 ± 2% |
| - subset with YAGO Exact Match | 0.95 ± 2% |
| - subset with YAGO No Type | 0.88 ± 2% |
| - subset with no match with YAGO | 0.93 ± 2% |
The accuracy readings in the table above reflect the combined error of hypernym discovery and of disambiguation of the hypernym to a DBpedia URI. The contribution of each of these two error sources is evaluated separately below.
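The intervals in the table are Wilson score intervals for a binomial proportion. A minimal sketch of the computation, assuming a 95% confidence level (z ≈ 1.96; the exact confidence level used is not stated on this page):

```python
from math import sqrt

def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 ~ 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return center - half, center + half

# Example: 820 correct types out of 1,000 evaluated articles
lo, hi = wilson_interval(820, 1000)
print(f"accuracy 0.82, 95% Wilson interval [{lo:.3f}, {hi:.3f}]")
# -> approximately [0.795, 0.843], i.e. roughly the ± 2% reported above
```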
Resources
- Evaluation guidelines
- Raw evaluation results (JWS article)
- Raw evaluation results for type inference (LREC article)
  - refer to the evaluation guidelines for an explanation of the individual fields
  - Office Open XML format
  - sheet names correspond to the names of the files in which the corresponding Linked Hypernyms subsets are distributed
  - the first sheet contains aggregate statistics
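The spreadsheet can also be inspected programmatically. A minimal sketch using the third-party openpyxl library; the file name below is a placeholder for whichever raw-results spreadsheet you downloaded:

```python
from openpyxl import load_workbook  # pip install openpyxl

# "raw-results.xlsx" is a placeholder name, not the actual distributed file.
wb = load_workbook("raw-results.xlsx", read_only=True)

# Sheet names mirror the file names of the Linked Hypernyms subsets;
# the first sheet holds the aggregate statistics.
print(wb.sheetnames)
for row in wb[wb.sheetnames[0]].iter_rows(values_only=True):
    print(row)
```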
Accuracy of hypernym discovery
The quality of hypernym discovery was evaluated on three manually tagged corpora (English, German, Dutch), each containing 500 randomly selected articles.

Evaluation results for hypernym discovery

| language | docs | docs with ground truth | match | only A | only B | partially correct | precision | recall | F1 |
|---|---|---|---|---|---|---|---|---|---|
| English | 500 | 500 | 411 | 55 | 24 | 0 | 0.94 | 0.88 | 0.91 |
| German | 500 | 492 | 410 | 45 | 26 | 2 | 0.94 | 0.90 | 0.92 |
| German-person | 224 | 222 | 205 | 10 | 6 | 1 | 0.98 | 0.95 | 0.97 |
| Dutch | 500 | 495 | 428 | 45 | 37 | 3 | 0.91 | 0.90 | 0.91 |
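One plausible reading of the count columns (an assumption; the page does not define them): "match" is the number of documents where the system's hypernym agrees with the ground truth, "only A" where only the ground-truth annotation contains a hypernym, and "only B" where only the system produced one. Under that reading, with partially correct documents counted as errors, most of the reported precision/recall figures are reproduced to two decimal places:

```python
def prf(match: int, only_a: int, only_b: int, partial: int):
    """Assumed reconstruction of the table's metrics: 'only A' = hypernym
    present only in the ground truth, 'only B' = hypernym produced only by
    the system, partially correct documents counted as errors on both sides."""
    precision = match / (match + only_b + partial)
    recall = match / (match + only_a + partial)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# English row: 411 matches, 55 only-A, 24 only-B, 0 partially correct
print(prf(411, 55, 24, 0))  # -> approximately (0.94, 0.88, 0.91)
```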
Resources
- For the English corpus, the first appearance of a hypernym in each of the documents was independently annotated by three annotators with the help of Google Translate.
- For German and Dutch, all documents were annotated by two annotators; when there was no consensus, an annotation by a third annotator was provided.
Hypernym Disambiguation
A hypernym discovered from an article is in plain text. This evaluation assesses the accuracy of matching the hypernym to a DBpedia resource; it was performed only for English. Three annotators took part in the experiment, marking each entity-hypernym pair as one of the following:
- precise - the linked hypernym exclusively covers the entity represented by the hypernym
- imprecise - the linked hypernym is closely topically related
- disambiguation page - the linked hypernym is a disambiguation page
- incorrect
- article does not exist - no linked hypernym found
| precise | imprecise | disambiguation page | incorrect | article does not exist |
|---|---|---|---|---|
| 0.6710 | 0.0765 | 0.1953 | 0.0558 | 0.0014 |
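To make the input and output of this step concrete, below is a naive illustrative sketch that maps a plain-text hypernym to a candidate DBpedia resource IRI and checks its existence via the public DBpedia SPARQL endpoint. This is not the algorithm used to build LHD; the mapping heuristic is an assumption of the sketch.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

def hypernym_to_dbpedia(hypernym: str) -> str | None:
    """Naive sketch: build the canonical DBpedia resource IRI from the
    capitalized surface form and keep it only if the resource exists.
    The real LHD disambiguation is more involved than this heuristic."""
    label = hypernym.strip().capitalize().replace(" ", "_")
    iri = "http://dbpedia.org/resource/" + quote(label)
    query = f"ASK {{ <{iri}> ?p ?o }}"  # does any triple mention the IRI?
    url = ("https://dbpedia.org/sparql"
           "?format=application/sparql-results+json&query=" + quote(query))
    with urlopen(url) as resp:
        return iri if json.load(resp)["boolean"] else None

print(hypernym_to_dbpedia("writer"))  # -> http://dbpedia.org/resource/Writer
```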
Annotation guidelines for disambiguation.
The raw data for all evaluators are available in the sheet "en.disambiguation.eval" of the JWS spreadsheet. Please use OpenOffice or LibreOffice to open this file.
Scripts
To automate the experiments, several scripts for the GATE Groovy PR are provided:
- Experiment 1: listPersonDocs.grv - lists documents describing persons
- Experiment 2: input-setAggreement.grv - lists documents with inter-annotator agreement