Linked Hypernyms Dataset

The Czech Traveler dataset is based on a collection of 1,276 images taken by a professional photographer during trips to Albania, Corsica, Romania, Slovenia and Ukraine. The images have short textual annotations of 1 to 10 words, stored in the images' EXIF data. From these annotations we extracted 103 unique annotations. The annotations were then split into entities, and each entity was assigned a label (class).
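As an illustration of the deduplication step described above, here is a minimal sketch that reduces a list of annotation strings to its unique members. How the annotations were actually read from EXIF and compared is not stated here, so the whitespace normalization, the function name `unique_annotations`, and the sample strings are all assumptions made for the example.

```python
def unique_annotations(annotations):
    """Return distinct annotations in first-seen order.

    Assumption: annotations that differ only in leading, trailing or
    repeated whitespace are treated as identical; case is preserved.
    """
    seen = set()
    unique = []
    for text in annotations:
        normalized = " ".join(text.split())  # collapse whitespace runs
        if normalized and normalized not in seen:
            seen.add(normalized)
            unique.append(normalized)
    return unique


def word_count(annotation):
    """Number of whitespace-separated words in an annotation."""
    return len(annotation.split())


# Hypothetical sample strings, NOT taken from the dataset.
sample = ["mountain lake", "old  wooden church", "mountain lake "]
result = unique_annotations(sample)
```

With the hypothetical sample above, `result` holds two annotations, and `word_count` can be used to check that each one falls in the 1 to 10 word range.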

The Czech Traveler Dataset in numbers
Key metric	Count	Download
all images	1,276	zip (51.1 MB)
unique annotations	103
entities	186
labeled entities	184
labeled entities (to 9 classes) with inter-annotator agreement	143	ods (25 KB)
unique entities	151
named entities	101
unique named entities	76
unique entities with inter-annotator agreement	113
entities for which not even the head was mapped to WordNet	47
unique entities for which not even the head was mapped to WordNet	41
entities for which not even the head was mapped to WordNet among the 143 entities with inter-annotator agreement	30
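The counts in the table can be turned into shares of entities whose head was not mapped to WordNet. The sketch below only does this arithmetic; the variable names and the helper `share` are made up for the example, and the percentages are derived values, not figures reported with the dataset.

```python
# Counts copied from the table above.
entities = 186
unique_entities = 151
agreed_entities = 143          # entities with inter-annotator agreement
unmapped = 47                  # head not mapped to WordNet
unmapped_unique = 41
unmapped_agreed = 30


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


unmapped_share = share(unmapped, entities)                    # 25.3
unmapped_unique_share = share(unmapped_unique, unique_entities)  # 27.2
unmapped_agreed_share = share(unmapped_agreed, agreed_entities)  # 21.0
```

Roughly a quarter of all entities, and about a fifth of the entities with inter-annotator agreement, therefore have no WordNet mapping even for their head.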
