This is the old version 3.8!
Go to the NEW VERSION!

Linked Hypernyms Dataset

v3.8

This page provides downloads of the Linked Hypernyms datasets. The downloads are provided as gzipped N-Triples files. For most files, a "stat" file with the frequencies of types is also provided. For selected partitions of the dataset, we also provide accuracy estimates.

The complete Linked Hypernyms dataset is downloadable as "Ontology Hypernym Types". The type is provided as a DBpedia Ontology class (preferred) or as a DBpedia resource (note that some of these resources are resolvable to the DBpedia Ontology).
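As a rough illustration of working with the downloads, the following minimal sketch reads N-Triples lines (optionally from a gzipped file) and counts how often each type occurs, which is the kind of information a "stat" file summarizes. The sample statements, the use of rdf:type as the predicate, and all function names are assumptions made for this example; the exact layout of the "stat" files is not described on this page.

```python
import gzip
import re
from collections import Counter

# Matches an N-Triples line whose subject, predicate and object are all IRIs.
# Literals and blank nodes are deliberately ignored in this sketch.
TRIPLE_RE = re.compile(r"^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.$")


def iter_triples(lines):
    """Yield (subject, predicate, object) tuples, skipping comments and blank lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = TRIPLE_RE.match(line)
        if match:
            yield match.groups()


def type_frequencies(lines):
    """Count how often each object (the assigned type) occurs."""
    return Counter(obj for _subj, _pred, obj in iter_triples(lines))


def type_frequencies_from_gz(path):
    """Same as type_frequencies, but reads a gzipped N-Triples file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return type_frequencies(handle)


# Hypothetical sample statements, made up for illustration only.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SAMPLE = [
    "# a comment line",
    f"<http://dbpedia.org/resource/A> <{RDF_TYPE}> <http://dbpedia.org/ontology/Writer> .",
    f"<http://dbpedia.org/resource/B> <{RDF_TYPE}> <http://dbpedia.org/ontology/Writer> .",
    f"<http://dbpedia.org/resource/C> <{RDF_TYPE}> <http://dbpedia.org/resource/Novelist> .",
]
frequencies = type_frequencies(SAMPLE)
```

For a downloaded file, `type_frequencies_from_gz` can be called with the local path of the gzipped file; a full RDF parser would be the more robust choice for anything beyond IRI-only triples.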
Main dataset partitions
Dataset | Dutch | English | German
Ontology Hypernym Types | nt stat | nt stat | nt stat
Ontology Hypernym Subclasses (superclass DBpedia resource) | nt | nt | nt
Ontology Hypernym Subclasses (superclass DBpedia ontology concept) | nt

Resolving linked hypernyms to DBpedia Ontology classes

The types in the complete Linked Hypernyms Dataset are a mix of types in the DBpedia resource and DBpedia Ontology namespaces. Some of the types in the DBpedia resource namespace are resolvable to a DBpedia Ontology class using the mappings published below. In these mapping files, the DBpedia resource is modelled as a subclass of a DBpedia Ontology class.

Some partitions of the dataset contain exclusively resolvable types. In these partitions, most types are directly DBpedia Ontology classes and about 10% are DBpedia resources, which can be mapped to a DBpedia Ontology class via the subclass relation using the above listed files.
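The resolution step described above can be sketched as follows. This is a minimal illustration, assuming that the mapping files express the subclass relation with rdfs:subClassOf and that the DBpedia Ontology namespace is http://dbpedia.org/ontology/. The sample mapping (Novelist to Writer) and the function names are hypothetical and not taken from the dataset.

```python
import re

RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"  # assumed predicate
DBO_NS = "http://dbpedia.org/ontology/"

# IRI-only N-Triples line: <subject> <predicate> <object> .
TRIPLE_RE = re.compile(r"^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.$")


def load_subclass_map(lines):
    """Build {DBpedia resource IRI: DBpedia Ontology class IRI} from mapping lines."""
    mapping = {}
    for line in lines:
        match = TRIPLE_RE.match(line.strip())
        if match and match.group(2) == RDFS_SUBCLASS_OF:
            mapping[match.group(1)] = match.group(3)
    return mapping


def resolve_type(type_iri, subclass_map):
    """Return a DBpedia Ontology class for the type, or None if it is not resolvable."""
    if type_iri.startswith(DBO_NS):
        return type_iri  # already a DBpedia Ontology class
    return subclass_map.get(type_iri)


# Hypothetical mapping record, made up for illustration only.
MAPPING_LINES = [
    "# comment lines are skipped because they do not match the triple pattern",
    f"<http://dbpedia.org/resource/Novelist> <{RDFS_SUBCLASS_OF}> <{DBO_NS}Writer> .",
]
subclass_map = load_subclass_map(MAPPING_LINES)
resolved = resolve_type("http://dbpedia.org/resource/Novelist", subclass_map)
unresolved = resolve_type("http://dbpedia.org/resource/Unmapped_type", subclass_map)
```

Types that are neither in the ontology namespace nor in the mapping come back as `None`, which corresponds to the unresolvable DBpedia resource types.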

Additions in version LHD 2.0 (draft)

Version 2.0 of the Linked Hypernyms Dataset increases the number of entities with a type mappable to the DBpedia Ontology to nearly 100%. This is accomplished by mapping the original types (DBpedia resources) to DBpedia Ontology concepts.

Example

The LHD 1.0 dataset contains the following statement: The object of this statement is a DBpedia resource (Publisher is not in the DBpedia Ontology). This statement is thus not well connected to the Linked Data Cloud. LHD 2.0 adds the following statement: Using this statement, it is trivial to infer that

The mapping was performed by a statistical, co-occurrence-based algorithm. Out of multiple mapping candidates, the one with the highest support is selected. The mapping algorithm used a heuristically determined constant of 0.2 as the tradeoff between the specificity of the type and its support. Details TBA.
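Since the details of the algorithm are not yet published, the following is only a minimal sketch of the support-based selection step. It assumes that "co-occurrence" means a DBpedia resource type and a DBpedia Ontology class being assigned to the same entity; the 0.2 specificity/support tradeoff is not modelled here. All names and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def count_cooccurrences(entity_to_resource_type, entity_to_ontology_classes):
    """Count, per resource type, how often each ontology class is assigned to the same entity."""
    support = defaultdict(Counter)
    for entity, resource_type in entity_to_resource_type.items():
        for ontology_class in entity_to_ontology_classes.get(entity, ()):
            support[resource_type][ontology_class] += 1
    return support


def best_mappings(support):
    """For each resource type, keep the candidate class with the highest support."""
    return {
        resource_type: counts.most_common(1)[0]  # (ontology class, support)
        for resource_type, counts in support.items()
    }


# Hypothetical data, made up for illustration only.
entity_to_resource_type = {"A": "Novelist", "B": "Novelist", "C": "Novelist"}
entity_to_ontology_classes = {"A": ["Writer", "Person"], "B": ["Writer"], "C": ["Artist"]}

support = count_cooccurrences(entity_to_resource_type, entity_to_ontology_classes)
mappings = best_mappings(support)
```

In this toy data, Writer co-occurs with Novelist twice and every other candidate once, so Writer is selected with support 2.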

This table provides download links for the mappings between DBpedia resources and DBpedia Ontology classes (LHD 2.0 draft).
Raw data and intermediary results
Dataset | Dutch | English | German
Mapping from DBpedia resources to DBpedia Ontology classes via the subclass relation; each record is preceded by its support in the comment | nt | nt | nt
Entities mapped via LHD 2.0 mappings to DBpedia Ontology; entity type assignment is confirmed in DBpedia | nt stat | nt stat | nt stat
Entities mapped via LHD 2.0 mappings to DBpedia Ontology; entity is in DBpedia, but the type is not confirmed | nt stat | nt stat | nt stat
Entities mapped via LHD 2.0 mappings to DBpedia Ontology; entity is not in DBpedia | nt stat | nt stat | nt stat
All entities with DBpedia Ontology classes replacing DBpedia resources; the complete LHD dataset with some type assignments overridden by LHD 2.0 mappings | nt stat | nt stat | nt stat

Supplementary files

This table lists the datasets from earlier processing stages.
Raw data and intermediary results
Dataset | Dutch | English | German
Ontology Hypernym Types without DBpedia Ontology mappings; complete Linked Hypernyms dataset, all hypernyms are DBpedia resources | nt stat | nt stat | nt stat
Ontology Hypernym Types before the deletion rules were applied; the mapping rules were already applied | nt stat | nt stat | nt stat
Ontology Hypernym Types, raw disambiguation result; neither mapping rules nor deletion rules were applied | nt stat | nt stat | nt stat
Plain Text Hypernym Types; complete Hypernyms dataset, all hypernyms are textual strings | nt | nt | nt

The Ontology Hypernym Types file is partitioned into several subsets according to whether

The partitions are listed below.

Partitions of the Ontology Hypernym Types dataset; the accuracy estimates are reported with Wilson confidence intervals
Dataset | Dutch | English | German
Type is DBpedia resource, lower confidence (type is a hypernym) | nt stat | nt stat | nt stat
Type is resolvable to a DBpedia ontology class, types redundant w.r.t. DBpedia | nt stat | nt stat | nt stat
Type is resolvable to a DBpedia ontology class, types redundant w.r.t. English DBpedia (for localized datasets only) | nt stat | - | nt stat
Type is DBpedia resource, types redundant w.r.t. DBpedia | nt stat | nt stat | nt stat
Type is DBpedia resource, types probably redundant w.r.t. English DBpedia (for localized datasets only) | nt stat | - | nt stat
Type is resolvable to a DBpedia ontology class, not redundant w.r.t. DBpedia; this dataset is further partitioned according to novelty w.r.t. YAGO 2s | nt stat | nt stat, 0.82 +- 2% | nt stat
Type is DBpedia resource, not redundant w.r.t. DBpedia; this dataset is further partitioned according to novelty w.r.t. YAGO 2s | nt stat | nt stat, 0.83 +- 2% | nt stat
Type is resolvable to a DBpedia ontology class, DBpedia assigns no type; this dataset is further partitioned according to novelty w.r.t. YAGO 2s | nt stat | nt stat, 0.94 +- 2% | nt stat
Type is DBpedia resource, lower confidence (type is a DBpedia instance) | nt stat | nt stat | nt stat
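For readers who want to see how a Wilson confidence interval for an accuracy estimate is obtained, here is a minimal sketch of the standard Wilson score interval. The sample size, the number of correct statements, and the 95% confidence level (z = 1.96) used in the example are assumptions for illustration; this page does not state the evaluation sample sizes.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95% confidence)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denominator = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denominator
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denominator
    # Clamp to [0, 1] to guard against floating-point overshoot at the extremes.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical evaluation: 50 of 100 sampled statements judged correct.
lower, upper = wilson_interval(50, 100)
```

Unlike the simple normal approximation, the Wilson interval stays within [0, 1] and behaves sensibly for proportions close to 0 or 1, which matters for accuracies above 0.9.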

Description of the individual subsets

Below, we provide details on the individual partitions of the Ontology Hypernym Types file. Unless explicitly stated otherwise, the redundancy check was performed against the DBpedia version in the same language as the dataset (e.g. English dataset, English DBpedia). The partitions that contain types non-redundant w.r.t. DBpedia are further partitioned according to their redundancy w.r.t. the YAGO ontology.

Redundant or low confidence subsets

Type is DBpedia resource, lower confidence (type is a hypernym)

Statements whose subject is used as an object (i.e. as a hypernym) in another extracted statement, while at the same time the object is considered an instance in DBpedia.

Type is resolvable to a DBpedia ontology class, types redundant w.r.t. DBpedia

Statements with the object mapped to the DBpedia Ontology; the statements are redundant w.r.t. existing statements in DBpedia.

Type is resolvable to a DBpedia ontology class, types redundant w.r.t. English DBpedia (for localized datasets only)

For entity-linked hypernyms extracted from non-English DBpedia, the redundancy check is also performed against English DBpedia. This dataset contains statements that were not found redundant with respect to the localized DBpedia, but were found redundant w.r.t. English DBpedia. The object of the statements (type) is mapped to the DBpedia Ontology.
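The two-stage redundancy check for localized datasets can be sketched as below. This is a minimal illustration that assumes subjects have already been aligned between the localized and the English DBpedia, so that a statement can be looked up as a (subject, type) pair in both; the function name, labels, and sample pairs are hypothetical.

```python
def classify_redundancy(statement, localized_dbpedia, english_dbpedia):
    """Classify a (subject, type) pair against two sets of known (subject, type) pairs."""
    if statement in localized_dbpedia:
        return "redundant w.r.t. localized DBpedia"
    if statement in english_dbpedia:
        return "redundant w.r.t. English DBpedia"
    return "not redundant"


# Hypothetical type assignments, made up for illustration only.
localized_dbpedia = {("Entity_1", "Writer")}
english_dbpedia = {("Entity_1", "Writer"), ("Entity_2", "Company")}

labels = [
    classify_redundancy(("Entity_1", "Writer"), localized_dbpedia, english_dbpedia),
    classify_redundancy(("Entity_2", "Company"), localized_dbpedia, english_dbpedia),
    classify_redundancy(("Entity_3", "City"), localized_dbpedia, english_dbpedia),
]
```

Only statements of the second kind, missed by the localized check but found in English DBpedia, would belong to the subset described above.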

Type is DBpedia resource, types redundant w.r.t. DBpedia

Statements where the name of the object does not match any of the DBpedia-owl types, but exactly matches the name of another type assigned in DBpedia (such as one from the schema.org ontology). These statements are thus (probably) redundant.

Type is DBpedia resource, types probably redundant w.r.t. English DBpedia (for localized datasets only)

Statements where the name of the object does not match any of the DBpedia-owl types, but exactly matches the name of another type assigned in DBpedia (such as one from the schema.org ontology). These statements are thus (probably) redundant.
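The name-based check described for these two subsets can be sketched as follows. This is a minimal illustration, assuming that the "name" of a type is the last segment of its IRI and that DBpedia-owl types live in the http://dbpedia.org/ontology/ namespace; the helper names and the sample IRIs are hypothetical.

```python
DBO_NS = "http://dbpedia.org/ontology/"


def local_name(iri):
    """Return the last path or fragment segment of an IRI."""
    return iri.rstrip("/").rsplit("/", 1)[-1].rsplit("#", 1)[-1]


def is_probably_redundant(hypernym_iri, assigned_type_iris):
    """True if the hypernym's name exactly equals the name of an assigned non-DBpedia-owl type."""
    name = local_name(hypernym_iri)
    return any(
        local_name(type_iri) == name
        for type_iri in assigned_type_iris
        if not type_iri.startswith(DBO_NS)
    )


# Hypothetical type assignments, made up for illustration only.
assigned = ["http://dbpedia.org/ontology/Agent", "http://schema.org/Organization"]
match = is_probably_redundant("http://dbpedia.org/resource/Organization", assigned)
no_match = is_probably_redundant("http://dbpedia.org/resource/Publisher", assigned)
```

The comparison is an exact, case-sensitive string match on the names, following the wording "exactly matches" above.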

DBpedia enrichment subsets

The three subsets listed below contain statements novel w.r.t. DBpedia (instance file, v 3.8).

Type is resolvable to a DBpedia ontology class, not redundant w.r.t. DBpedia

Statements with the type resolvable to the DBpedia Ontology; the statements are not redundant w.r.t. existing statements in DBpedia.

Dataset partitions with statements resolvable to DBpedia ontology classes
YAGO overlap
Dataset | Dutch | English | German
YAGO Exact Match | nt stat | nt stat, 0.994 [0.99;1] | nt stat
YAGO Approx Match | nt stat | nt stat | nt stat
YAGO No Match | nt stat | nt stat, 0.9 +- 2% | nt stat
YAGO No Type | nt stat | nt stat, 0.91 +- 2% | nt stat

Type is DBpedia resource, not redundant w.r.t. DBpedia

Statements with the type not resolvable (within the Linked Hypernyms dataset) to the DBpedia Ontology. No further check of the statement uniqueness w.r.t. existing statements in DBpedia was done.

Dataset partitions with statements not resolvable to DBpedia ontology classes
YAGO overlap
Dataset | Dutch | English | German
YAGO Exact Match | nt stat | nt stat, 0.86 +- 2% | nt stat
YAGO Approx Match | nt stat | nt stat | nt stat
YAGO No Match | nt stat | nt stat, 0.77 +- 2% | nt stat
YAGO No Type | nt stat | nt stat, 0.82 +- 2% | nt stat

Type is resolvable to a DBpedia ontology class, DBpedia assigns no type

Statements with the type resolvable to the DBpedia Ontology; the statements are not redundant, since the subjects do not have any type in DBpedia.

Dataset partitions with new statements resolvable to DBpedia ontology
YAGO overlap
Dataset | Dutch | English | German
YAGO Exact Match | nt stat | nt stat, 0.95 +- 2% | nt stat
YAGO Approx Match | nt stat | nt stat | nt stat
YAGO No Match | nt stat | nt stat, 0.93 +- 2% | nt stat
YAGO No Type | nt stat | nt stat, 0.88 +- 2% | nt stat

Type is DBpedia resource, lower confidence (type is a DBpedia instance)

Lists instances which have a DBpedia instance as a type.

YAGO Overlap

Subsets of the Linked Hypernyms Dataset that contain entity-type assignments novel w.r.t. DBpedia are further partitioned into four subsets according to their overlap with YAGO 2s.

Redundant subsets

YAGO Exact Match

A perfect match between the linked hypernym and a YAGO2s type name was found.

YAGO Approx Match

A YAGO2s type containing the linked hypernym as a substring was found.

YAGO Enrichment subsets

YAGO No Type

Entity has no type assigned in YAGO2s.

YAGO No Match

None of the above applies: entity has at least one type in YAGO, but none of the types matches the Linked Hypernym. The name of the overlapping class in YAGO is listed on the preceding line in a comment.
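The four YAGO overlap categories can be summarized in a small decision function. This is a minimal sketch, assuming that the hypernym and the YAGO2s type names have already been normalized to comparable strings and that an exact match takes precedence over an approximate one; the function name and sample names are hypothetical.

```python
def yago_overlap(hypernym, yago_type_names):
    """Assign one of the four YAGO overlap categories to an entity's linked hypernym."""
    if not yago_type_names:
        return "YAGO No Type"  # entity has no type in YAGO2s
    if hypernym in yago_type_names:
        return "YAGO Exact Match"  # identical type name found
    if any(hypernym in name for name in yago_type_names):
        return "YAGO Approx Match"  # hypernym is a substring of a YAGO type name
    return "YAGO No Match"  # entity is typed, but no type matches


# Hypothetical, already normalized names, made up for illustration only.
categories = [
    yago_overlap("publisher", ["publisher", "company"]),
    yago_overlap("publisher", ["book_publisher"]),
    yago_overlap("publisher", ["company"]),
    yago_overlap("publisher", []),
]
```

The first two categories correspond to the redundant subsets and the last two to the YAGO enrichment subsets.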