Miracle at the Spanish Wiqa Pilot: Using Named Entities and Cosine Similarity to Extend Wikipedia Articles

From Wikipedia Quality
Revision as of 07:11, 21 October 2020 by Hanna (talk | contribs) (Links)

Miracle at the Spanish Wiqa Pilot: Using Named Entities and Cosine Similarity to Extend Wikipedia Articles is a scientific work related to Wikipedia quality, published in 2006 and written by César de Pablo-Sánchez, José Luis Martínez-Fernández and Paloma Martínez.

Overview

The WiQA pilot task explores how to select new and useful information that could be included in Wikipedia articles. The authors' system explores how the combination of named entities (NE) and cosine similarity makes it possible to detect new and repeated information. The authors submitted two runs for the Spanish subtask, which differ in the way they select candidate sentences using the link structure of the WikipediaXML corpus. The approach obtains results that provide, on average, at least one new snippet per topic. The main limitation was found in the candidate selection strategy, which leaves some topics unanswered and in other cases yields too many noisy candidates.
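As an illustration of the general idea only, the following minimal Python sketch flags a candidate sentence as new when it is not too similar to any sentence already in the article and mentions at least one entity the article lacks. It is not the authors' system: the function names, the 0.8 threshold, the decision rule, and the capitalized-word heuristic standing in for a real named entity recognizer are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercased word tokens (a simplification; no stemming or stopword removal)."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between the term-frequency vectors of two sentences."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def entities(text):
    """Crude stand-in for NE recognition (assumption): capitalized words
    other than the first word of the sentence."""
    words = re.findall(r"\w+", text)
    return {w for w in words[1:] if w[0].isupper()}


def is_novel(candidate, article_sentences, threshold=0.8):
    """Hypothetical rule: novel if no article sentence is a near-duplicate
    and the candidate introduces at least one unseen entity."""
    known = set().union(*(entities(s) for s in article_sentences))
    max_sim = max((cosine(candidate, s) for s in article_sentences), default=0.0)
    return max_sim < threshold and bool(entities(candidate) - known)


article = [
    "Madrid is the capital of Spain.",
    "The city lies on the Manzanares river.",
]
print(is_novel("Madrid is the capital of Spain.", article))
print(is_novel("The capital hosts the Prado museum.", article))
```

Under these assumptions, the first call treats the sentence as repeated because it duplicates an article sentence, while the second treats it as new because it has low similarity to the article and introduces the entity "Prado". A real system would replace the capitalization heuristic with a proper NE tagger and tune the threshold on evaluation data.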