Targeted Hypernym Discovery

v0.3.9.2Beta

REST API v1 Resources

API v1 is available but we strongly recommend migration to API v2.

Resource:

POST api/v1/extraction

Description:

Performs entity extraction and classification for a given text. For each extracted entity, its types are discovered. For both the entities and the types, links to DBpedia or YAGO are provided where available. Note that the length of the input text affects the processing time.


Parameters Description
lang optional Language of the input text. Values: en/de/nl.
Default value: en.
Example: lang=en
format optional Requested response format. Values: xml/json.
Default value: xml
Example: format=xml
Note: The format can also be specified using the Accept request header. If the format parameter is specified as well, the format parameter takes priority.
Example: Accept: application/xml.

NIF 1.0 support has been suspended; NIF 2.0 support will be released once the specification is completed.

provenance optional Provenance of the results. Values: thd/dbpedia/yago. The client can choose one or more.
Default value: thd,dbpedia,yago.
Example: provenance=thd,dbpedia
knowledge_base optional Defines which knowledge base is used to retrieve types for thd. Currently applicable only when provenance is set to thd or all. Values: linkedHypernymsDataset/local/live
linkedHypernymsDataset - use the Linked Hypernyms Dataset (recommended),
local - local Wikipedia mirror (slight latency); for the date of the Wikipedia snapshot, please refer to the technical information about the application version in the page footer,
live - live Wikipedia (highest latency); please be considerate and do not submit large amounts of text or a high number of requests. Recommended use: issue a query containing a single candidate entity for which the other options failed to provide a type.
Note: Only one option can be chosen.
Default value: linkedHypernymsDataset
Example: knowledge_base=linkedHypernymsDataset
entity_type optional The types of entities to be extracted from the text. Values: ne/ce/all
ne - only named entities are extracted,
ce - only common entities are extracted,
all - both named entities and common entities are extracted.
Default value: all
Example: entity_type=ne
priority_entity_linking optional If set to true, the system prefers linking to the more precise DBpedia disambiguation (longer entity name). This option may result in fewer entities being assigned types. Values: true/false
Default value: false
Example: priority_entity_linking=true
apikey required Identifies the third-party application using the service. Write us an email to get an API key.
Example: apikey=123456789
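
The parameters above can be combined into a query string on the client side. The following Python sketch is only an illustration: build_extraction_url and the ALLOWED tables are hypothetical names that are not part of THD, the allowed values are copied from the table above, and the base URL is the one used in the request example on this page.

```python
from urllib.parse import urlencode

# Base URL as used in the request example on this page.
BASE_URL = "http://ner.vse.cz/thd/api/v1/extraction"

# Allowed values copied from the parameter table above.
ALLOWED = {
    "lang": {"en", "de", "nl"},
    "format": {"xml", "json"},
    "knowledge_base": {"linkedHypernymsDataset", "local", "live"},
    "entity_type": {"ne", "ce", "all"},
    "priority_entity_linking": {"true", "false"},
}
ALLOWED_PROVENANCE = {"thd", "dbpedia", "yago"}


def build_extraction_url(apikey, provenance=None, **options):
    """Hypothetical helper: validate parameters and return the request URL."""
    if not apikey:
        raise ValueError("apikey is required")
    query = {"apikey": apikey}
    if provenance:
        unknown = set(provenance) - ALLOWED_PROVENANCE
        if unknown:
            raise ValueError(f"unsupported provenance: {sorted(unknown)}")
        # One or more values, comma-separated, e.g. provenance=thd,dbpedia
        query["provenance"] = ",".join(provenance)
    for name, value in options.items():
        if name not in ALLOWED:
            raise ValueError(f"unknown parameter: {name}")
        if value not in ALLOWED[name]:
            raise ValueError(f"unsupported value for {name}: {value}")
        query[name] = value
    # safe="," keeps the comma in the provenance list unescaped.
    return BASE_URL + "?" + urlencode(query, safe=",")


print(build_extraction_url("123456789", provenance=["thd", "dbpedia"], lang="en"))
```

Omitted optional parameters are simply left out of the URL, so the server-side defaults listed above apply.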

Response Object

Field Type Description
startOffset Integer Start offset of the found entity in the input text. The offset is counted from 0 at the beginning of the input text.
"startOffset": 4
endOffset Integer End offset of the found entity in the input text. The offset is counted from 0 at the beginning of the input text.
"endOffset": 18
underlyingString String The string considered as an entity.
"underlyingString": "Charles Bridge"
entityType String The type of the extracted entity. Possible values: "named entity" or "common entity".
"entityType": "named entity"
types Array of types List of types for the found entity.
"types":  [
       {
        "typeLabel": "Country",
        "typeURI": "http://dbpedia.org/ontology/Country",
        "entityLabel": "Czech Republic",
        "entityURI": "http://dbpedia.org/resource/Czech_Republic",
        "confidence":  {
          "value": "0.85",
          "bounds": "+- 2.5%",
          "type": "extraction"
        },
        "provenance": "thd"
      } ]
The corresponding XML is:
<types>
    <type>
        <typeLabel>Country</typeLabel>
        <typeURI>http://dbpedia.org/ontology/Country</typeURI>
        <entityLabel>Czech Republic</entityLabel>
        <entityURI>http://dbpedia.org/resource/Czech_Republic</entityURI>
        <confidence type="extraction" bounds="+- 2.5%">0.85</confidence>
        <provenance>thd</provenance>
    </type>
</types>

typeLabel - name by which the type is formally known.

typeURI - DBpedia/YAGO URI describing the entity type.

entityLabel - name by which the disambiguated entity is formally known.

entityURI - DBpedia/YAGO URI describing the disambiguated entity.

confidence - estimated classification or disambiguation confidence.
Possible values for the type attribute are classification and disambiguation.
Classification confidence is the estimated probability that the typeLabel is correct for the given typeURI.
Disambiguation confidence is the estimated probability of the entityURI being correct given the surface form of the entity (not currently supported).

provenance - Provenance of the results. Possible values are: thd - produced by THD, thd-derived - also produced by THD, by searching for superclasses in the DBpedia ontology, dbpedia - produced by DBpedia, and yago - produced by the YAGO2s ontology.
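
As an illustration of how a client might consume these fields, here is a minimal Python sketch. The entity object is assembled by hand from the field examples on this page and is not a captured THD response; surface_form, types_from and confidence_value are hypothetical helper names. Treating endOffset as exclusive is an assumption inferred from the example values (4 and 18 for "Charles Bridge").

```python
import json

text = ("The Charles Bridge is a famous historic bridge that crosses "
        "the Vltava river in Prague, Czech Republic.")

# Hand-assembled from the field examples on this page (illustrative only).
entity = json.loads("""
{
  "startOffset": 4,
  "endOffset": 18,
  "underlyingString": "Charles Bridge",
  "entityType": "named entity",
  "types": [
    {
      "typeLabel": "Bridge",
      "typeURI": "http://dbpedia.org/ontology/Bridge",
      "entityLabel": "Charles Bridge",
      "entityURI": "http://dbpedia.org/resource/Charles_Bridge",
      "confidence": {"value": "0.85", "bounds": "+- 2.5%", "type": "extraction"},
      "provenance": "thd"
    }
  ]
}
""")


def surface_form(text, entity):
    """Slice the entity out of the input text (endOffset assumed exclusive)."""
    return text[entity["startOffset"]:entity["endOffset"]]


def types_from(entity, provenance):
    """Keep only the types that carry the given provenance value."""
    return [t for t in entity["types"] if t["provenance"] == provenance]


def confidence_value(type_entry):
    """The JSON example serializes the confidence value as a string."""
    return float(type_entry["confidence"]["value"])


print(surface_form(text, entity))
print([t["typeURI"] for t in types_from(entity, "thd")])
```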

HTTP Status Codes

The THD API attempts to return appropriate HTTP status codes for every request.

Code Text Description
200 OK Success!
400 Bad Request The request was invalid. An accompanying error message will explain why.
401 Unauthorized Authentication credentials were missing or incorrect.
406 Not Acceptable Returned by the API when an invalid format is specified in the request.
500 Internal Server Error Something is broken. Please write us an email so the THD team can investigate.

Error Messages

When the THD API returns error messages, it does so in your requested format. For example, an error returned in JSON might look like this:

{ "code": 45, "value": "Empty body request" }

The corresponding XML response would be:

<?xml version="1.0" encoding="UTF-8"?>
<error code="45">Empty body request</error>
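
Both error representations carry the same two pieces of information, so a client can reduce them to one shape. The sketch below is a hypothetical helper (parse_error is not part of THD) that assumes the error bodies look exactly like the two examples above.

```python
import json
import xml.etree.ElementTree as ET


def parse_error(body, fmt):
    """Return (code, message) from an error body in the given format."""
    if fmt == "json":
        data = json.loads(body)
        return int(data["code"]), data["value"]
    if fmt == "xml":
        # Parse from bytes so the encoding declaration is honoured.
        raw = body.encode("utf-8") if isinstance(body, str) else body
        root = ET.fromstring(raw)
        return int(root.attrib["code"]), root.text
    raise ValueError(f"unsupported format: {fmt}")


json_body = '{ "code": 45, "value": "Empty body request" }'
xml_body = ('<?xml version="1.0" encoding="UTF-8"?>\n'
            '<error code="45">Empty body request</error>')

print(parse_error(json_body, "json"))
print(parse_error(xml_body, "xml"))
```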

Error Codes

In addition to descriptive error text, error messages contain machine-parseable codes. The following table describes the codes which may appear when working with the API:

Code Text Description
41 Not supported format The format specified in the format parameter is not supported.
42 Not supported format The format specified in the Accept header is not supported.
43 Could not authenticate you Authentication credentials were missing. Needs security credentials specified by the apikey parameter.
44 Could not authenticate you Specified api key is not valid. The API could not authenticate you.
45 Empty body request The body of the request is empty.
46 Not valid knowledge base parameter Chosen knowledge base is not supported.
47 Not valid provenance parameter The value of the provenance parameter is not valid. You can choose between thd, dbpedia and yago.
48 Not correctly set entity_type parameter The value of the entity_type parameter is not valid. You can choose between ne, ce and all.
49 Not supported language Specified language in the lang parameter is not valid. You can choose between en, de and nl.
51 Internal error Something went wrong on the server side. Please write us an email so the THD team can investigate.
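
Because the codes are machine-parseable, a client can branch on them rather than on the error text. The Python sketch below copies the code/text pairs from the table above; the grouping into categories and the error_category helper are choices made for this illustration only and are not defined by THD.

```python
# Code -> text, copied from the table above.
ERROR_TEXT = {
    41: "Not supported format",
    42: "Not supported format",
    43: "Could not authenticate you",
    44: "Could not authenticate you",
    45: "Empty body request",
    46: "Not valid knowledge base parameter",
    47: "Not valid provenance parameter",
    48: "Not correctly set entity_type parameter",
    49: "Not supported language",
    51: "Internal error",
}

# Grouping chosen for this sketch only; THD does not define these categories.
CATEGORY = {
    "format": {41, 42},
    "auth": {43, 44},
    "body": {45},
    "parameter": {46, 47, 48, 49},
    "server": {51},
}


def error_category(code):
    """Map an error code to a coarse category, or 'unknown'."""
    for name, codes in CATEGORY.items():
        if code in codes:
            return name
    return "unknown"


print(error_category(44), "-", ERROR_TEXT[44])
```

A client could, for example, prompt for a new API key on "auth" and report "server" errors to the THD team.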

Request example

POST
http://ner.vse.cz/thd/api/v1/extraction?apikey=123456789&format=xml&provenance=thd&priority_entity_linking=true&entity_type=all

POST Data
The Charles Bridge is a famous historic bridge that crosses the Vltava river in Prague, Czech Republic.

curl -v "http://ner.vse.cz/thd/api/v1/extraction?apikey=123456789&format=xml&provenance=thd&priority_entity_linking=true&entity_type=all" -d "The Charles Bridge is a famous historic bridge that crosses the Vltava river in Prague, Czech Republic."
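
For clients that do not use curl, the same request can be prepared with the Python standard library. This is a sketch under the assumption that the service accepts the raw text as the POST body, as in the curl call above; the request object is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Same parameters, in the same order, as in the request example above.
params = {
    "apikey": "123456789",
    "format": "xml",
    "provenance": "thd",
    "priority_entity_linking": "true",
    "entity_type": "all",
}
text = ("The Charles Bridge is a famous historic bridge that crosses "
        "the Vltava river in Prague, Czech Republic.")

url = "http://ner.vse.cz/thd/api/v1/extraction?" + urlencode(params)

# The input text goes into the request body, encoded as UTF-8 bytes.
request = Request(url, data=text.encode("utf-8"), method="POST")

print(request.get_method(), request.full_url)

# To actually send it (requires network access and a valid API key):
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       body = response.read().decode("utf-8")
```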

Response example

<entities>
    <entity>
        <startOffset>4</startOffset>
        <endOffset>18</endOffset>
        <underlyingString>Charles Bridge</underlyingString>
        <entityType>named entity</entityType>
        <types>
          <type>
            <typeLabel>Bridge</typeLabel>
            <typeURI>http://dbpedia.org/ontology/Bridge</typeURI>
            <entityLabel>Charles Bridge</entityLabel>
            <entityURI>http://dbpedia.org/resource/Charles_Bridge</entityURI>
            <confidence type="extraction" bounds="+- 2.5%">0.85</confidence>
            <provenance>thd</provenance>
          </type>
          <type>
            <typeLabel>route of transportation</typeLabel>
            <typeURI>http://dbpedia.org/ontology/RouteOfTransportation</typeURI>
            <entityLabel>Charles Bridge</entityLabel>
            <entityURI>http://dbpedia.org/resource/Charles_Bridge</entityURI>
            <confidence type="extraction" bounds="+- 2.5%">0.85</confidence>
            <provenance>thd</provenance>
          </type>
          ...
        </types>
    </entity>
    ...
</entities>
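
An XML response of this shape can be read with the Python standard library. The sketch below is illustrative: SAMPLE is a trimmed, well-formed copy of the response example with the elided parts left out, and parse_entities is a hypothetical helper that assumes every type element carries a confidence element.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the response example (elided parts omitted).
SAMPLE = """<entities>
    <entity>
        <startOffset>4</startOffset>
        <endOffset>18</endOffset>
        <underlyingString>Charles Bridge</underlyingString>
        <entityType>named entity</entityType>
        <types>
          <type>
            <typeLabel>Bridge</typeLabel>
            <typeURI>http://dbpedia.org/ontology/Bridge</typeURI>
            <entityLabel>Charles Bridge</entityLabel>
            <entityURI>http://dbpedia.org/resource/Charles_Bridge</entityURI>
            <confidence type="extraction" bounds="+- 2.5%">0.85</confidence>
            <provenance>thd</provenance>
          </type>
          <type>
            <typeLabel>route of transportation</typeLabel>
            <typeURI>http://dbpedia.org/ontology/RouteOfTransportation</typeURI>
            <entityLabel>Charles Bridge</entityLabel>
            <entityURI>http://dbpedia.org/resource/Charles_Bridge</entityURI>
            <confidence type="extraction" bounds="+- 2.5%">0.85</confidence>
            <provenance>thd</provenance>
          </type>
        </types>
    </entity>
</entities>"""


def parse_entities(xml_text):
    """Convert an <entities> document into a list of plain dictionaries."""
    entities = []
    for entity in ET.fromstring(xml_text).findall("entity"):
        types = []
        for t in entity.findall("types/type"):
            confidence = t.find("confidence")
            types.append({
                "typeLabel": t.findtext("typeLabel"),
                "typeURI": t.findtext("typeURI"),
                "entityURI": t.findtext("entityURI"),
                "confidence": float(confidence.text),
                "confidenceType": confidence.get("type"),
                "provenance": t.findtext("provenance"),
            })
        entities.append({
            "startOffset": int(entity.findtext("startOffset")),
            "endOffset": int(entity.findtext("endOffset")),
            "underlyingString": entity.findtext("underlyingString"),
            "entityType": entity.findtext("entityType"),
            "types": types,
        })
    return entities


for item in parse_entities(SAMPLE):
    print(item["underlyingString"], [t["typeLabel"] for t in item["types"]])
```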