Automatically Generating Wikipedia Info-Boxes from Wikidata

From Wikipedia Quality
Revision as of 09:47, 4 June 2019 by Aurora (talk | contribs) (Creating a new page - Automatically Generating Wikipedia Info-Boxes from Wikidata)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Automatically Generating Wikipedia Info-Boxes from Wikidata - scientific work related to Wikipedia quality published in 2018, written by Tomás Sáez and Aidan Hogan.

Overview

Info-boxes provide a summary of the most important meta-data relating to a particular entity described by a Wikipedia article. However, many articles have no info-box or have info-boxes with only minimal information; furthermore, there is a huge disparity between the level of detail available for info-boxes in English articles and those for other languages. Wikidata has been proposed as a central repository of facts to try to address such disparities, and has been used as a source of information to generate info-boxes. However, current processes still rely on human intervention either to create generic templates for entities of a given type or to create a specific info-box for a specific article in a specific language. As such, there are still many articles of Wikipedia without info-boxes but where relevant data are provided by Wikidata. In this paper, authors investigate fully automatic methods to generate info-boxes for Wikipedia from the Wikidata knowledge graph. The primary challenge is to create ranking mechanisms that provide an intuitive prioritisation of the facts associated with an entity. Authors discuss this challenge, propose several straightforward metrics to prioritise information in info-boxes, and present an initial user evaluation to compare the quality of info-boxes generated by various metrics.