Extending the Coverage of DBpedia Properties Using Distant Supervision over Wikipedia
Extending the Coverage of DBpedia Properties Using Distant Supervision over Wikipedia is a scientific work related to Wikipedia quality, published in 2013 and written by Alessio Palmero Aprosio, Claudio Giuliano and Alberto Lavelli.
Overview
DBpedia is a Semantic Web project that aims to extract structured data from Wikipedia articles. Because of the growing number of resources linked to it, DBpedia plays a central role in the Linked Open Data community. Currently, the information in DBpedia is collected mainly from Wikipedia infoboxes, sets of subject-attribute-value triples that summarize a Wikipedia page. Infoboxes are compiled manually by Wikipedia contributors, and more than 50% of Wikipedia articles have no infobox. In this article, the authors use the distant supervision paradigm to extract the missing information directly from the text of the Wikipedia article, using a Relation Extraction tool trained on the information already present in DBpedia. The authors evaluate the system on a data set consisting of seven DBpedia properties, demonstrating the suitability of the approach for extending DBpedia coverage.
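The labeling step at the heart of distant supervision can be illustrated with a minimal sketch. It is not the authors' system: the names `Triple`, `LabeledSentence` and `label_sentences`, the substring-matching heuristic, and the toy data are all assumptions made for illustration. The idea shown is only the general one: a triple already known for an article's subject is used to mark sentences in that article that mention the triple's value, and those sentences become training examples for the property.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """A known subject-attribute-value fact (hypothetical structure)."""
    subject: str
    prop: str
    value: str


class LabeledSentence(NamedTuple):
    """A sentence automatically labeled with the property it may express."""
    sentence: str
    prop: str
    value: str


def label_sentences(article_subject: str,
                    sentences: List[str],
                    triples: List[Triple]) -> List[LabeledSentence]:
    """Label sentences of one article using triples known for its subject.

    Assumption: a sentence that mentions the value of a known triple is
    treated as a positive example for that triple's property. Matching is
    a case-insensitive substring test, which is deliberately naive.
    """
    examples = []
    for triple in triples:
        if triple.subject != article_subject:
            continue  # only use facts about the article's own subject
        for sentence in sentences:
            if triple.value.lower() in sentence.lower():
                examples.append(
                    LabeledSentence(sentence, triple.prop, triple.value))
    return examples


# Toy data, written for this illustration only.
TRIPLES = [Triple("Ada Lovelace", "birthPlace", "London")]
SENTENCES = [
    "Ada Lovelace was born in London in 1815.",
    "She is known for her work on the Analytical Engine.",
]
EXAMPLES = label_sentences("Ada Lovelace", SENTENCES, TRIPLES)

for example in EXAMPLES:
    print(example.prop, "<-", example.sentence)
```

Sentences labeled this way could then serve as training data for a relation extraction model, which would in turn be applied to articles that lack an infobox. A heuristic this simple also produces noisy labels, since a sentence may mention the value without expressing the property.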