Unlock text
Unlock the location data implicit in your text documents.
Unlock Text is a simple RESTful web service. POST your text content (plain text, HTML pages, or XML containing metadata) and get back a feed of the named places found in the text, with best guesses as to their locations.
First, we extract likely placenames from a piece of text. Next, we look up the placenames in the gazetteer, and match the placenames to locations using the context provided by the text. For example, if "Leith" and "Portobello" are mentioned together, we're more likely to be talking about "Leith, Edinburgh" than "Leith, Ontario".
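The idea of using context to choose between candidate locations can be illustrated with a toy sketch. This is not the geoparser's actual algorithm: it simply picks, for each placename, the candidate that keeps all the chosen places closest together. The `CANDIDATES` table, its approximate coordinates, and the `resolve` function are hypothetical and exist only for this illustration.

```python
import math
from itertools import product

# Hypothetical candidate table: approximate coordinates for illustration only,
# not taken from the GeoNames or Unlock gazetteers.
CANDIDATES = {
    "Leith": [
        ("Leith, Edinburgh", 55.98, -3.17),
        ("Leith, Ontario", 44.62, -80.88),
    ],
    "Portobello": [
        ("Portobello, Edinburgh", 55.95, -3.11),
        ("Portobello, Dublin", 53.33, -6.27),
    ],
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def resolve(names, candidates=CANDIDATES):
    """Choose one candidate per name so the total pairwise distance is smallest."""
    best, best_cost = None, math.inf
    # Try every combination of one candidate per placename.
    for combo in product(*(candidates[name] for name in names)):
        cost = sum(
            haversine_km(a[1], a[2], b[1], b[2])
            for i, a in enumerate(combo)
            for b in combo[i + 1:]
        )
        if cost < best_cost:
            best, best_cost = combo, cost
    return {name: choice[0] for name, choice in zip(names, best)}


if __name__ == "__main__":
    print(resolve(["Leith", "Portobello"]))
```

With both names present, the Edinburgh pair wins because those two candidates are only a few kilometres apart. A brute-force search like this is only practical for a handful of placenames; it is meant to show the principle, not to scale.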
The text/places service can be used with either the open-data worldwide gazetteer GeoNames or the Ordnance Survey-derived UK gazetteer, Unlock.
Geo-Parser
Note: The Geo-Parser can take a while to process large documents. We recommend geo-parsing one page of text at a time.
| Parameter | Description |
|---|---|
| Type | The content type of the file to be geo-parsed: plain text, HTML or XML. |
| Gazetteer | Which gazetteer will be used to look up placenames: GeoNames (free access) or Unlock (Digimap key required). |
| Output Format | The format in which you want to receive results: XML, JSON or KML document. |
| Upload File | Browse for your local file to upload and Geo-parse! |
Text/Places API
An API for Geo-parsing is available. Make a POST request to this URL with the parameters shown below: http://unlock.edina.ac.uk/text/places
You can use a command-line client, such as curl, to make the POST request like so:
curl -d type=plain -d "document=Carnock is a small town in Fife near Dunfermline" -d gazetteer=geonames http://unlock.edina.ac.uk/text/places
Alternatively, in form-emulation mode:
curl -F "document=@filename.txt" -F "type=plain" -F "gazetteer=geonames" -F "format=json" http://unlock.edina.ac.uk/text/places
| Parameter | Value | Type | Description |
|---|---|---|---|
| document | Text contents to be geoparsed. | String | This may be plain text, HTML or XML - specified by the "type" parameter |
| type | One of ['plain','html','xml'] | String | The content type of the document |
| format | One of ['json','kml','basic'] | String | The output format for the set of results |
| gazetteer | Either 'geonames' or 'os' | String | The gazetteer to be used to resolve the placename locations. If requesting OS data, you must also specify an API key. |
| key | A registered API key. The presented key is mapped to the IP address of the request. | String | Unlock Web Services authentication. Only needed for OS data. |
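The parameters in the table above can also be sent from a script. The following is a minimal sketch using only the Python standard library; the helper names `build_request` and `geoparse` are hypothetical, and the 60-second timeout and the assumption that the response is UTF-8 text are illustrative choices, not documented behaviour of the service.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ENDPOINT = "http://unlock.edina.ac.uk/text/places"


def build_request(document, doc_type="plain", gazetteer="geonames",
                  fmt="json", key=None):
    """Build a form-encoded POST request for the text/places API."""
    # The table above says a key is needed when requesting OS data.
    if gazetteer == "os" and not key:
        raise ValueError("gazetteer='os' requires an API key")
    params = {
        "document": document,
        "type": doc_type,
        "gazetteer": gazetteer,
        "format": fmt,
    }
    if key:
        params["key"] = key
    body = urlencode(params).encode("utf-8")
    return Request(ENDPOINT, data=body, method="POST")


def geoparse(document, **options):
    """Send the request and return the raw response body as text.

    Assumes the service replies with UTF-8 text; the timeout is an
    arbitrary allowance for slow processing of larger documents.
    """
    with urlopen(build_request(document, **options), timeout=60) as response:
        return response.read().decode("utf-8")
```

A call such as `geoparse("Carnock is a small town in Fife near Dunfermline")` would mirror the first curl example, with `format=json` added. Keeping request construction separate from sending makes it possible to inspect the encoded parameters without contacting the service.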
The core geoparser software is a collaboration between EDINA and the Language Technology Group at the School of Informatics, University of Edinburgh. LTG have worked with us on enhancing the geoparser for very large bodies of similar text, including the BOPCRIS archive of 19th century parliamentary proceedings, and the HISTPOP archive of historic census and population data.


