Exploiting Wikipedia for Entity Name Disambiguation in Tweets
Exploiting Wikipedia for Entity Name Disambiguation in Tweets - scientific work related to Wikipedia quality published in 2014, written by Muhammad Atif Qureshi, Muhammad Atif Qureshi, Colm O’Riordan and Gabriella Pasi.
Overview
Social media repositories serve as a significant source of evidence when extracting information related to the reputation of a particular entity (e.g., a particular politician, singer or company). Reputation management experts are in need of automated methods for mining the social media repositories (in particular Twitter) to monitor the reputation of a particular entity. A quite significant research challenge related to the above issue is to disambiguate tweets with respect to entity names. To address this issue in this paper authors use “context phrases” in a tweet and Wikipedia disambiguated articles for a particular entity in a random forest classifier. Furthermore, authors also utilize the concept of “relatedness” between tweet and entity using the Wikipedia category-article structure that captures the amount of discussion present inside a tweet related to an entity. The experimental evaluations show a significant improvement over the baseline and comparable performance with other systems representing strong performance given that authors restrict ourselves to features extracted from Wikipedia.