This page provides downloads of the Linked Hypernyms datasets. The downloads are provided as gzipped N-Triples files. For most files, a "stat" file with the frequencies of types is also provided. For selected partitions of the dataset, we also provide accuracy estimates.
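Since all downloads are gzipped N-Triples, they can be streamed without unpacking them first. The following is a minimal sketch, not an official loader: the function names `parse_triples` and `read_nt_gz` are hypothetical, and the regular expression only handles lines in which subject, predicate and object are all IRIs (the shape of the entity-type statements on this page). Files whose objects are plain-text literals would need a full N-Triples parser.

```python
import gzip
import re
from typing import Iterable, Iterator, Tuple

# One N-Triples line whose subject, predicate and object are all IRIs.
# Assumption: this covers the entity-type statements; literal objects are skipped.
TRIPLE_RE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$')


def parse_triples(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (subject, predicate, object) IRIs, skipping blanks and comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        match = TRIPLE_RE.match(line)
        if match:
            yield match.group(1), match.group(2), match.group(3)


def read_nt_gz(path: str) -> Iterator[Tuple[str, str, str]]:
    """Stream triples from a gzipped N-Triples file without unpacking it."""
    with gzip.open(path, 'rt', encoding='utf-8') as handle:
        yield from parse_triples(handle)
```

Because both functions are generators, a large file can be processed one statement at a time, for example with `for s, p, o in read_nt_gz(path): ...`, where `path` points to a downloaded file.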
Main dataset partitions | |||
---|---|---|---|
Dataset | Dutch | English | German |
Ontology Hypernym Types | nt stat | nt stat | nt stat |
Ontology Hypernym Subclasses (superclass DBpedia resource) | nt | nt | nt |
Ontology Hypernym Subclasses (superclass DBpedia ontology concept) | nt |
Resolving linked hypernym to DBpedia Ontology classes
The types in the complete Linked Hypernyms Dataset are a mix of types in the DBpedia resource and DBpedia Ontology namespaces. Some of the types in the DBpedia resource namespace are resolvable to a DBpedia Ontology class using the mappings published below. In these mapping files, the DBpedia resource is modelled as a subclass of the DBpedia Ontology class.
Some partitions of the dataset contain exclusively resolvable types. In these partitions, most types are directly DBpedia Ontology classes and about 10% are DBpedia resources, which can be mapped to a DBpedia Ontology class via the subclass relation using the files listed above.
Additions in version LHD 2.0 (draft)
Version 2.0 of the Linked Hypernyms Dataset increases the number of entities with a type mappable to the DBpedia Ontology to nearly 100%. This is accomplished by mapping the original types (DBpedia resources) to DBpedia Ontology concepts.
Example
The LHD 1.0 dataset contains the following statement:
- <http://dbpedia.org/resource/O'Brien_Press> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/resource/Publisher> .
- <http://dbpedia.org/resource/Publisher> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://dbpedia.org/ontology/Company> .
- <http://dbpedia.org/resource/O'Brien_Press> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Company> .
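Read together, the three statements appear to show an original type assignment, a subclass mapping, and the type assignment that results from applying the mapping. The sketch below illustrates that replacement step under stated assumptions: triples are plain `(subject, predicate, object)` tuples of IRI strings, each DBpedia resource has at most one mapped superclass, and the function name `resolve_types` is hypothetical rather than part of any LHD tooling.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


def resolve_types(type_triples, mapping_triples):
    """Replace DBpedia-resource types by their mapped DBpedia Ontology class.

    Assumption: each resource has at most one superclass in the mapping;
    if there are several, the last one read wins.
    """
    superclass = {
        s: o for s, p, o in mapping_triples if p == RDFS_SUBCLASS_OF
    }
    resolved = []
    for s, p, o in type_triples:
        if p == RDF_TYPE and o in superclass:
            o = superclass[o]  # swap the resource for its ontology class
        resolved.append((s, p, o))
    return resolved


types = [(
    "http://dbpedia.org/resource/O'Brien_Press",
    RDF_TYPE,
    "http://dbpedia.org/resource/Publisher",
)]
mappings = [(
    "http://dbpedia.org/resource/Publisher",
    RDFS_SUBCLASS_OF,
    "http://dbpedia.org/ontology/Company",
)]
result = resolve_types(types, mappings)
```

Types without a mapping are passed through unchanged, which mirrors the fact that some types remain in the DBpedia resource namespace.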
The mapping was performed by a statistical, co-occurrence-based algorithm. Out of multiple mapping candidates, the one with the highest support is selected. The mapping algorithm used a heuristically determined constant of 0.2 as the tradeoff between the specificity of the type and its support. Details TBA.
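The selection step can be illustrated with a very small sketch. It covers only the "highest support wins" rule; the specificity tradeoff with the constant 0.2 is deliberately not modelled, because its exact form is not documented here. The function name `select_mapping`, the candidate classes and the support counts are all hypothetical.

```python
def select_mapping(candidates):
    """Return the candidate class with the highest support, or None.

    `candidates` maps a DBpedia Ontology class IRI to its support count.
    Assumption: ties are broken by insertion order (first candidate wins).
    """
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Hypothetical candidates for one DBpedia resource, with made-up supports.
example_candidates = {
    "http://dbpedia.org/ontology/Company": 12,
    "http://dbpedia.org/ontology/Organisation": 7,
}
best = select_mapping(example_candidates)
```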
Raw data and intermediary results | |||
---|---|---|---|
Dataset | Dutch | English | German |
Mapping from DBpedia resources to DBpedia Ontology classes via the subclass relation, each record is preceded by its support in the comment | nt | nt | nt |
Entities mapped via LHD 2.0 mappings to DBpedia Ontology (entity type assignment is confirmed in DBpedia) | nt stat | nt stat | nt stat |
Entities mapped via LHD 2.0 mappings to DBpedia Ontology (entity is in DBpedia, but the type is not confirmed) | nt stat | nt stat | nt stat |
Entities mapped via LHD 2.0 mappings to DBpedia Ontology (entity is not in DBpedia) | nt stat | nt stat | nt stat |
All entities with DBpedia Ontology classes replacing DBpedia resources (the complete LHD dataset with some type assignments overridden by LHD 2.0 mappings) | nt stat | nt stat | nt stat |
Supplementary files
- Ontology Hypernym Types - Dutch, English and German in one file: nt, stat
- The list of DBpedia resources used as types (with number of instances): stat
- The list of DBpedia Ontology concepts used as types (with number of instances): stat
Raw data and intermediary results | |||
---|---|---|---|
Dataset | Dutch | English | German |
Ontology Hypernym Types without DBpedia Ontology mappings (complete Linked Hypernyms dataset; all hypernyms are DBpedia resources) | nt stat | nt stat | nt stat |
Ontology Hypernym Types before the deletion rules were applied (the mapping rules were already applied) | nt stat | nt stat | nt stat |
Ontology Hypernym Types - raw disambiguation result (neither mapping rules nor deletion rules were applied) | nt stat | nt stat | nt stat |
Plain Text Hypernym Types (complete Hypernyms dataset; all hypernyms are textual strings) | nt | nt | nt |
The Ontology Hypernym Types file is partitioned into several subsets according to the following criteria:
- whether the entity-type statement is novel w.r.t. DBpedia
- whether the type is resolvable to a DBpedia ontology class (preferred) or only to a DBpedia resource (suitable class not found)
- whether the statement has lower confidence (such statements are listed separately)
Dataset | Dutch | English | German |
---|---|---|---|
Type is DBpedia resource, lower confidence (type is a hypernym) | nt stat | nt stat | nt stat |
Type is resolvable to a DBpedia ontology class, types redundant w.r.t. DBpedia | nt stat | nt stat | nt stat |
Type is resolvable to a DBpedia ontology class, types redundant w.r.t. English DBpedia (for localized datasets only) | nt stat | nt stat | |
Type is DBpedia resource, types redundant w.r.t. DBpedia | nt stat | nt stat | nt stat |
Type is DBpedia resource, types probably redundant w.r.t. English DBpedia (for localized datasets only) | nt stat | nt stat | |
Type is resolvable to a DBpedia ontology class, not redundant w.r.t. DBpedia (this dataset is further partitioned according to novelty w.r.t. YAGO 2s) | nt stat | nt stat 0.82 +- 2% | nt stat |
Type is DBpedia resource, not redundant w.r.t. DBpedia (this dataset is further partitioned according to novelty w.r.t. YAGO 2s) | nt stat | nt stat 0.83 +- 2% | nt stat |
Type is resolvable to a DBpedia ontology class, DBpedia assigns no type (this dataset is further partitioned according to novelty w.r.t. YAGO 2s) | nt stat | nt stat 0.94 +- 2% | nt stat |
Type is DBpedia resource, lower confidence (type is a DBpedia instance) | nt stat | nt stat | nt stat |
Description of the individual subsets
Below, we provide details on the individual partitions of the Ontology Hypernym Types file. Unless explicitly stated otherwise, the redundancy check was performed against the DBpedia language version that matches the dataset language (e.g. English dataset against English DBpedia). The partitions that contain types non-redundant w.r.t. DBpedia are further partitioned according to their redundancy w.r.t. the YAGO ontology.
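The redundancy check against DBpedia can be pictured as a lookup of the entity's existing types. The sketch below is an illustration under assumptions, not the procedure actually used to build the partitions: it assumes the DBpedia type assignments are available as a dictionary from entity IRI to a set of type IRIs, it compares types by exact IRI equality, and the function name and the three return labels are hypothetical.

```python
def classify_against_dbpedia(subject, lhd_type, dbpedia_types):
    """Classify one entity-type statement by its redundancy w.r.t. DBpedia.

    `dbpedia_types` maps an entity IRI to the set of types DBpedia assigns.
    Returns one of the hypothetical labels:
      "no_type"       - DBpedia assigns no type to the entity
      "redundant"     - DBpedia already contains the same type assignment
      "not_redundant" - the entity is typed in DBpedia, but not with this type
    """
    existing = dbpedia_types.get(subject, set())
    if not existing:
        return "no_type"
    if lhd_type in existing:
        return "redundant"
    return "not_redundant"


# Hypothetical DBpedia type assignments used only for illustration.
dbpedia = {
    "http://dbpedia.org/resource/A": {"http://dbpedia.org/ontology/Company"},
}
label_known = classify_against_dbpedia(
    "http://dbpedia.org/resource/A",
    "http://dbpedia.org/ontology/Company",
    dbpedia,
)
label_untyped = classify_against_dbpedia(
    "http://dbpedia.org/resource/B",
    "http://dbpedia.org/ontology/Company",
    dbpedia,
)
```

For a localized dataset, the same function could be called a second time with the English DBpedia types to mimic the additional check against English DBpedia.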
Redundant or low confidence subsets
Type is DBpedia resource, lower confidence (type is a hypernym)
Statements whose subject is used as an object (i.e. as a hypernym) in another extracted statement, while at the same time the object is considered an instance in DBpedia.
Type is resolvable to a DBpedia ontology class, types redundant w.r.t. DBpedia
Statements with the object mapped to the DBpedia Ontology. The statements are redundant w.r.t. existing statements in DBpedia.
Type is resolvable to a DBpedia ontology class, types redundant w.r.t. English DBpedia (for localized datasets only)
For entity-linked hypernyms extracted from non-English DBpedia, the redundancy check is also performed against English DBpedia. This dataset contains statements that were not found to be redundant with respect to the localized DBpedia, but were found to be redundant w.r.t. English DBpedia. The object of the statements (type) is mapped to the DBpedia Ontology.
Type is DBpedia resource, types redundant w.r.t. DBpedia
Statements where the name of the object does not match any of the DBpedia-owl types, but exactly matches the name of another type assigned in DBpedia (such as one from the schema.org ontology). These statements are thus (probably) redundant.
Type is DBpedia resource, types probably redundant w.r.t. English DBpedia (for localized datasets only)
Statements where the name of the object does not match any of the DBpedia-owl types, but exactly matches the name of another type assigned in DBpedia (such as one from the schema.org ontology). These statements are thus (probably) redundant.
DBpedia enrichment subsets
The three subsets listed below contain statements novel w.r.t. DBpedia (instance file, v 3.8).
Type is resolvable to a DBpedia ontology class, not redundant w.r.t. DBpedia
Statements with the type resolvable to the DBpedia Ontology. The statements are not redundant w.r.t. existing statements in DBpedia.
YAGO overlap | |||
---|---|---|---|
Dataset | Dutch | English | German |
YAGO Exact Match | nt stat | nt stat 0.994 [0.99;1] | nt stat |
YAGO Approx Match | nt stat | nt stat | nt stat |
YAGO No Match | nt stat | nt stat 0.9 +- 2% | nt stat |
YAGO No Type | nt stat | nt stat 0.91 +- 2% | nt stat |
Type is DBpedia resource, not redundant w.r.t. DBpedia
Statements with the type not resolvable (within the Linked Hypernyms dataset) to the DBpedia Ontology. No further check of the statement uniqueness w.r.t. existing statements in DBpedia was done.
YAGO overlap | |||
---|---|---|---|
Dataset | Dutch | English | German |
YAGO Exact Match | nt stat | nt stat 0.86 +- 2% | nt stat |
YAGO Approx Match | nt stat | nt stat | nt stat |
YAGO No Match | nt stat | nt stat 0.77 +- 2% | nt stat |
YAGO No Type | nt stat | nt stat 0.82 +- 2% | nt stat |
Type is resolvable to a DBpedia ontology class, DBpedia assigns no type
Statements with the type resolvable to the DBpedia Ontology. The statements are not redundant, since the subjects do not have any type in DBpedia.
YAGO overlap | |||
---|---|---|---|
Dataset | Dutch | English | German |
YAGO Exact Match | nt stat | nt stat 0.95 +- 2% | nt stat |
YAGO Approx Match | nt stat | nt stat | nt stat |
YAGO No Match | nt stat | nt stat 0.93 +- 2% | nt stat |
YAGO No Type | nt stat | nt stat 0.88 +- 2% | nt stat |
Type is DBpedia resource, lower confidence (type is a DBpedia instance)
Lists instances which have a DBpedia instance as a type.
YAGO Overlap
Subsets of the Linked Hypernyms Dataset that contain entity-type statements novel w.r.t. DBpedia are further partitioned into four subsets according to their overlap with YAGO 2s.
Redundant subsets
YAGO Exact Match
A perfect match between the linked hypernym and a YAGO2s type name was found.
YAGO Approx Match
A YAGO2s type containing the linked hypernym as a substring was found.
YAGO Enrichment subsets
YAGO No Type
Entity has no type assigned in YAGO2s.
YAGO No Match
None of the above applies: the entity has at least one type in YAGO, but none of the types matches the linked hypernym. The name of the overlapping class in YAGO is listed on the preceding line in a comment.
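The four YAGO overlap subsets described above can be summarised as a simple decision procedure. The following sketch is only an illustration: it assumes that the hypernym and the YAGO2s type names are available as plain strings, it compares them case-insensitively (an assumption, since the normalisation actually used is not described), and the function name `yago_overlap` is hypothetical.

```python
def yago_overlap(hypernym, yago_type_names):
    """Assign an entity-type statement to one of the four YAGO overlap subsets.

    `hypernym` is the name of the linked hypernym; `yago_type_names` is the
    collection of type names that YAGO2s assigns to the same entity.
    Assumption: names are compared case-insensitively.
    """
    if not yago_type_names:
        return "YAGO No Type"
    needle = hypernym.lower()
    names = [name.lower() for name in yago_type_names]
    if needle in names:
        return "YAGO Exact Match"
    if any(needle in name for name in names):
        return "YAGO Approx Match"  # hypernym is a substring of a type name
    return "YAGO No Match"
```

The order of the checks matters: an exact match is also a substring match, so the exact test has to come before the approximate one.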