Recent News

Event: CICLing 2015 Conference

Date: April 14 to April 20, 2015
Location: Cairo, Egypt

NLP Technologies is excited to present the findings of its latest research and development project, carried out in collaboration with the University of Ottawa, on NLP-based methods for detecting and disambiguating geo-location information in the content of Twitter messages. The resulting article will be presented at the 16th International Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2015.

The 2015 CICLing conference will be held in Cairo, Egypt, from April 14 to April 20, 2015.

Detecting and Disambiguating Locations Mentioned in Twitter Messages

Detecting the location entities mentioned in tweets is useful in text mining for business, marketing, or defence applications. Techniques for extracting location entities from the textual content of Twitter messages are therefore needed. In this work, we approach this task in a manner similar to the Named Entity Recognition (NER) task, focused only on locations, but we address a deeper task: classifying the detected locations into names of cities, provinces/states, and countries. We approach the task in a novel way, consisting of two stages. In the first stage, we train Conditional Random Fields (CRF) models with various sets of features; we collected and annotated our own dataset for training and testing. In the second stage, we resolve cases in which more than one place has the same name. We propose a set of heuristics for choosing the correct physical location in these cases. The results were positive for both tasks.
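
To make the second stage more concrete, the following is a minimal sketch in Python of how ambiguous place names could be resolved. It is not the method from the article: the abstract does not list the heuristics used, so the two rules shown here (prefer a candidate whose enclosing region is mentioned in the same message, otherwise prefer the most populous candidate) are assumptions chosen for illustration. The `Place` class, the `GAZETTEER` list, the `disambiguate` function, and the population figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    name: str
    kind: str        # "city", "province/state", or "country"
    parent: str      # name of the enclosing region ("" if none)
    population: int  # rough illustrative figure, not a real statistic


# Hypothetical toy gazetteer, for illustration only.
GAZETTEER = [
    Place("London", "city", "England", 8_000_000),
    Place("London", "city", "Ontario", 400_000),
    Place("Ontario", "province/state", "Canada", 13_000_000),
]


def disambiguate(name, other_mentions, gazetteer=GAZETTEER):
    """Pick one gazetteer entry for an ambiguous place name.

    `other_mentions` holds the other location names detected in the
    same message. Both heuristics below are assumptions, not the
    heuristics proposed in the article.
    """
    candidates = [p for p in gazetteer if p.name.lower() == name.lower()]
    if not candidates:
        return None

    context = {m.lower() for m in other_mentions}

    # Assumed heuristic 1: prefer a candidate whose enclosing region
    # is also mentioned in the message.
    for place in candidates:
        if place.parent.lower() in context:
            return place

    # Assumed heuristic 2: otherwise fall back to the most populous one.
    return max(candidates, key=lambda p: p.population)


if __name__ == "__main__":
    print(disambiguate("London", ["Ontario"]))
    print(disambiguate("London", []))
```

In this sketch, a message that mentions both "London" and "Ontario" resolves to the Canadian city, while a message that mentions "London" alone falls back to the larger candidate. A real system would draw candidates from a full gazetteer and would likely combine more signals than these two.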