This page presents experimental results that demonstrate both the coverage and the quality of the datasets. The creation of the Linked Hypernyms Dataset consists of two steps: hypernym discovery, which extracts a plain-text hypernym from a Wikipedia article, and hypernym disambiguation, which resolves the plain-text hypernym to a DBpedia resource or a DBpedia Ontology class. Below, we present results for the individual phases of this process, as well as an evaluation of the resulting Linked Hypernyms Dataset.
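For intuition, here is a toy sketch of the discovery step. It is only an illustration: the actual extractor is built from hand-crafted lexico-syntactic patterns over linguistically processed text, not a single regex, and the example sentence is invented.

```python
import re

# Toy sketch: grab the head noun after "is a/an/the" in the article's
# first sentence, skipping premodifiers such as nationality adjectives.
FIRST_SENTENCE_PATTERN = re.compile(
    r"\b(?:is|was)\s+(?:a|an|the)\s+(?:[a-z]+\s+)*?([a-z]+)\s+(?:who|that|which|born|from|in|of)\b"
)

def discover_hypernym(first_sentence: str) -> str | None:
    match = FIRST_SENTENCE_PATTERN.search(first_sentence.lower())
    return match.group(1) if match else None

print(discover_hypernym(
    "Karel Čapek was a Czech writer who introduced the word robot."
))  # -> "writer"
```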
Overall accuracy (available only for selected partitions)
Accuracy estimates are provided for selected partitions (subdatasets) of the Linked Hypernyms Dataset. From each subdataset, 1,000 articles were randomly selected, and a human annotator evaluated whether the type assigned to the entity is correct. Wilson confidence intervals are also reported.
Datasets with accuracy estimate

| Dataset | Accuracy |
|---|---|
| Type is a DBpedia Ontology class, not redundant w.r.t. DBpedia | 0.82 ± 2% |
| - subset with YAGO Exact Match | 0.994 [0.99; 1.00] |
| - subset with YAGO No Type | 0.91 ± 2% |
| - subset with no match with YAGO | 0.90 ± 2% |
| Type is a DBpedia resource, not redundant w.r.t. DBpedia | 0.83 ± 2% |
| - subset with YAGO Exact Match | 0.86 ± 2% |
| - subset with YAGO No Type | 0.82 ± 2% |
| - subset with no match with YAGO | 0.77 ± 2.5% |
| Type is a DBpedia Ontology class, DBpedia assigns no type | 0.94 ± 2% |
| - subset with YAGO Exact Match | 0.95 ± 2% |
| - subset with YAGO No Type | 0.88 ± 2% |
| - subset with no match with YAGO | 0.93 ± 2% |
The accuracy readings in the table above reflect the combined error of hypernym discovery and of disambiguation of the hypernym to a DBpedia URI. The contribution of each of these two error sources is evaluated separately below.
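The intervals in the table are Wilson score intervals for a binomial proportion. A minimal sketch of the computation, assuming a 95% confidence level (z ≈ 1.96; the exact confidence level used is not stated on this page):

```python
from math import sqrt

def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 ~ 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return center - half, center + half

# Example: 820 correct types out of 1,000 evaluated articles
lo, hi = wilson_interval(820, 1000)
print(f"accuracy 0.82, 95% Wilson interval [{lo:.3f}, {hi:.3f}]")
# -> approximately [0.795, 0.843], i.e. roughly the ± 2% reported above
```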
Resources
- Evaluation guidelines
- Raw evaluation results (JWS article)
- Raw evaluation results for type inference (LREC article)
  - refer to the evaluation guidelines for an explanation of the individual fields
  - Office Open XML format
  - sheet names correspond to the names of the files in which the corresponding Linked Hypernyms subsets are distributed
  - the first sheet contains aggregate statistics
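The spreadsheet can also be inspected programmatically. A minimal sketch using the third-party openpyxl library; the file name below is a placeholder for whichever raw-results spreadsheet you downloaded:

```python
from openpyxl import load_workbook  # pip install openpyxl

# "raw-results.xlsx" is a placeholder name, not the actual distributed file.
wb = load_workbook("raw-results.xlsx", read_only=True)

# Sheet names mirror the file names of the Linked Hypernyms subsets;
# the first sheet holds the aggregate statistics.
print(wb.sheetnames)
for row in wb[wb.sheetnames[0]].iter_rows(values_only=True):
    print(row)
```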
Accuracy of hypernym discovery
The quality of hypernym discovery was evaluated on three manually tagged corpora (English, German, Dutch), each containing 500 randomly selected articles.

Evaluation results for hypernym discovery

| language | docs | docs with ground truth | match | only A | only B | partially correct | precision | recall | F1 |
|---|---|---|---|---|---|---|---|---|---|
| English | 500 | 500 | 411 | 55 | 24 | 0 | 0.94 | 0.88 | 0.91 |
| German | 500 | 492 | 410 | 45 | 26 | 2 | 0.94 | 0.90 | 0.92 |
| German-person | 224 | 222 | 205 | 10 | 6 | 1 | 0.98 | 0.95 | 0.97 |
| Dutch | 500 | 495 | 428 | 45 | 37 | 3 | 0.91 | 0.90 | 0.91 |
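One plausible reading of the count columns (an assumption; the page does not define them): "match" is the number of documents where the system's hypernym agrees with the ground truth, "only A" where only the ground-truth annotation contains a hypernym, and "only B" where only the system produced one. Under that reading, with partially correct documents counted as errors, most of the reported precision/recall figures are reproduced to two decimal places:

```python
def prf(match: int, only_a: int, only_b: int, partial: int):
    """Assumed reconstruction of the table's metrics: 'only A' = hypernym
    present only in the ground truth, 'only B' = hypernym produced only by
    the system, partially correct documents counted as errors on both sides."""
    precision = match / (match + only_b + partial)
    recall = match / (match + only_a + partial)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# English row: 411 matches, 55 only-A, 24 only-B, 0 partially correct
print(prf(411, 55, 24, 0))  # -> approximately (0.94, 0.88, 0.91)
```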
Resources
- For the English corpus, the first appearance of a hypernym in each of the documents was independently annotated by three annotators with the help of Google Translate.
- For German and Dutch, all documents were annotated by two annotators; when there was no consensus, an annotation by a third annotator was provided.
Hypernym Disambiguation
A hypernym discovered from an article is in plain text. This evaluation assesses the accuracy of matching the hypernym to a DBpedia resource; it was performed only for English. Three annotators took part in the experiment, marking each entity-hypernym pair as one of the following:
- precise - the linked hypernym exclusively covers the entity represented by the hypernym
- imprecise - the linked hypernym is closely topically related
- disambiguation page - the linked hypernym is a disambiguation page
- incorrect
- article does not exist - no linked hypernym found
| precise | imprecise | disambiguation page | incorrect | article does not exist |
|---|---|---|---|---|
| 0.6710 | 0.0765 | 0.1953 | 0.0558 | 0.0014 |
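To make the input and output of this step concrete, below is a naive illustrative sketch that maps a plain-text hypernym to a candidate DBpedia resource IRI and checks its existence via the public DBpedia SPARQL endpoint. This is not the algorithm used to build LHD; the mapping heuristic is an assumption of the sketch.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

def hypernym_to_dbpedia(hypernym: str) -> str | None:
    """Naive sketch: build the canonical DBpedia resource IRI from the
    capitalized surface form and keep it only if the resource exists.
    The real LHD disambiguation is more involved than this heuristic."""
    label = hypernym.strip().capitalize().replace(" ", "_")
    iri = "http://dbpedia.org/resource/" + quote(label)
    query = f"ASK {{ <{iri}> ?p ?o }}"  # does any triple mention the IRI?
    url = ("https://dbpedia.org/sparql"
           "?format=application/sparql-results+json&query=" + quote(query))
    with urlopen(url) as resp:
        return iri if json.load(resp)["boolean"] else None

print(hypernym_to_dbpedia("writer"))  # -> http://dbpedia.org/resource/Writer
```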
Annotation guidelines for disambiguation.
The raw data for all evaluators are available in the sheet "en.disambiguation.eval" of the JWS spreadsheet. Please use OpenOffice or LibreOffice to open this file.
Scripts
To automate the experiments, several scripts for the GATE Groovy PR are provided:
- Experiment 1: listPersonDocs.grv - lists documents describing persons
- Experiment 2: input-setAggreement.grv - lists documents with inter-annotator agreement