Mining Wikipedia Article Clusters for Geospatial Entities and Relationships
Mining Wikipedia Article Clusters for Geospatial Entities and Relationships is a scientific work related to Wikipedia quality, published in 2009 and written by Jeremy Witmer and Jugal K. Kalita.
Overview
In this paper, the authors present a method to extract geospatial entities and relationships from the unstructured text of the English-language Wikipedia. Using a novel approach that applies SVMs trained on purely structural features of text strings, they extract candidate geospatial entities and relationships. A combination of further techniques, together with an external gazetteer, is then used to disambiguate the candidate entities and relationships, and the Wikipedia article pages are modified to include the semantic information provided by the extraction process. The authors successfully extracted location entities with an F-measure of 81%, and location relations with an F-