Difference between revisions of "Georeferencing Wikipedia Documents Using Data from Social Media Sources"

From Wikipedia Quality
(Basic information on Georeferencing Wikipedia Documents Using Data from Social Media Sources)
 
(Adding wikilinks)
Line 1: Line 1:
'''Georeferencing Wikipedia Documents Using Data from Social Media Sources''' - scientific work related to Wikipedia quality published in 2014, written by Olivier Van Laere, Steven Schockaert, Vlad Tanasescu, Bart Dhoedt and Christopher B. Jones.
+
'''Georeferencing Wikipedia Documents Using Data from Social Media Sources''' - scientific work related to [[Wikipedia quality]] published in 2014, written by [[Olivier Van Laere]], [[Steven Schockaert]], [[Vlad Tanasescu]], [[Bart Dhoedt]] and [[Christopher B. Jones]].
  
 
== Overview ==
 
== Overview ==
Social media sources such as Flickr and Twitter continuously generate large amounts of textual information (tags on Flickr and short messages on Twitter). This textual information is increasingly linked to geographical coordinates, which makes it possible to learn how people refer to places by identifying correlations between the occurrence of terms and the locations of the corresponding social media objects. Recent work has focused on how this potentially rich source of geographic information can be used to estimate geographic coordinates for previously unseen Flickr photos or Twitter messages. In this article, the authors extend this work by analysing to what extent probabilistic language models trained on Flickr and Twitter can be used to assign coordinates to Wikipedia articles. The authors' results show that exploiting these language models substantially outperforms both (i) classical gazetteer-based methods (in particular, using Yahoo! Placemaker and Geonames) and (ii) language modelling approaches trained on Wikipedia alone. This supports the hypothesis that social media are important sources of geographic information, which are valuable beyond the scope of individual applications.
+
Social media sources such as Flickr and [[Twitter]] continuously generate large amounts of textual information (tags on Flickr and short messages on Twitter). This textual information is increasingly linked to geographical coordinates, which makes it possible to learn how people refer to places by identifying correlations between the occurrence of terms and the locations of the corresponding social media objects. Recent work has focused on how this potentially rich source of geographic information can be used to estimate geographic coordinates for previously unseen Flickr photos or Twitter messages. In this article, the authors extend this work by analysing to what extent probabilistic language models trained on Flickr and Twitter can be used to assign coordinates to [[Wikipedia]] articles. The authors' results show that exploiting these language models substantially outperforms both (i) classical gazetteer-based methods (in particular, using [[Yahoo]]! Placemaker and Geonames) and (ii) language modelling approaches trained on Wikipedia alone. This supports the hypothesis that social media are important sources of geographic information, which are valuable beyond the scope of individual applications.

Revision as of 10:10, 15 January 2020

Georeferencing Wikipedia Documents Using Data from Social Media Sources - scientific work related to Wikipedia quality published in 2014, written by Olivier Van Laere, Steven Schockaert, Vlad Tanasescu, Bart Dhoedt and Christopher B. Jones.

Overview

Social media sources such as Flickr and Twitter continuously generate large amounts of textual information (tags on Flickr and short messages on Twitter). This textual information is increasingly linked to geographical coordinates, which makes it possible to learn how people refer to places by identifying correlations between the occurrence of terms and the locations of the corresponding social media objects. Recent work has focused on how this potentially rich source of geographic information can be used to estimate geographic coordinates for previously unseen Flickr photos or Twitter messages. In this article, the authors extend this work by analysing to what extent probabilistic language models trained on Flickr and Twitter can be used to assign coordinates to Wikipedia articles. The authors' results show that exploiting these language models substantially outperforms both (i) classical gazetteer-based methods (in particular, using Yahoo! Placemaker and Geonames) and (ii) language modelling approaches trained on Wikipedia alone. This supports the hypothesis that social media are important sources of geographic information, which are valuable beyond the scope of individual applications.
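
The general idea of a location-conditioned language model can be illustrated with a toy sketch. The code below is not the authors' implementation: the fixed-size grid cells, the add-one smoothing, and the names `cell_of`, `train` and `predict` are all assumptions made for illustration. It counts how often each tag occurs in each cell of geotagged training items, then assigns a new text to the cell whose term distribution explains it best.

```python
import math
from collections import Counter, defaultdict


def cell_of(lat, lon, size=1.0):
    """Map a coordinate to a grid cell index (hypothetical discretisation)."""
    return (math.floor(lat / size), math.floor(lon / size))


def train(tagged_items, size=1.0):
    """Build per-cell term counts from (tags, lat, lon) training items."""
    term_counts = defaultdict(Counter)  # cell -> term -> count
    cell_counts = Counter()             # cell -> number of training items
    vocab = set()
    for tags, lat, lon in tagged_items:
        cell = cell_of(lat, lon, size)
        cell_counts[cell] += 1
        term_counts[cell].update(tags)
        vocab.update(tags)
    return term_counts, cell_counts, vocab


def predict(tokens, model, size=1.0):
    """Return the centre of the most likely cell for a bag of tokens."""
    term_counts, cell_counts, vocab = model
    total_items = sum(cell_counts.values())
    best_cell, best_score = None, -math.inf
    for cell, n_items in cell_counts.items():
        counts = term_counts[cell]
        denom = sum(counts.values()) + len(vocab)  # add-one smoothing
        score = math.log(n_items / total_items)    # log prior of the cell
        for tok in tokens:
            if tok in vocab:  # terms never seen in training are ignored
                score += math.log((counts[tok] + 1) / denom)
        if score > best_score:
            best_cell, best_score = cell, score
    return ((best_cell[0] + 0.5) * size, (best_cell[1] + 0.5) * size)
```

In this sketch, a model trained on a handful of tagged photos could be queried with the tokens of a Wikipedia article, for example `predict(article_text.lower().split(), model)`, where `article_text` is a hypothetical input string. The returned coordinate is only the centre of a grid cell, so the resolution is bounded by the chosen cell size.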