Wikipedia and Machine Translation: Killing Two Birds with One Stone

From Wikipedia Quality
Revision as of 09:54, 7 November 2019 by Serenity (talk | contribs) (New study: Wikipedia and Machine Translation: Killing Two Birds with One Stone)

Wikipedia and Machine Translation: Killing Two Birds with One Stone is a scientific work related to Wikipedia quality, published in 2014 and written by Iñaki Alegria, Unai Cabezon, Unai Fernandez de Betoño, Gorka Labaka, Aingeru Mayor, Kepa Sarasola and Arkaitz Zubiaga.

Overview

In this paper, the authors present the free/open-source language resources for machine translation created in the OpenMT-2 wikiproject, a collaboration framework that was tested with editors of Basque Wikipedia. Post-editing of Computer Science articles was used to improve the output of a Spanish-to-Basque MT system called Matxin. For the collaboration between editors and researchers, the authors selected a set of 100 articles from the Spanish Wikipedia. These articles were then used as the source texts to be translated into Basque with the MT engine. A group of volunteers from Basque Wikipedia reviewed and corrected the raw MT output. This collaboration produced two main benefits: (i) change logs that could help improve the MT engine through an automated statistical post-editing system, and (ii) the growth of Basque Wikipedia. The results show that this process can improve the accuracy of a rule-based machine translation system by nearly 10%, drawing on the post-editing of 50,000 words in the Computer Science domain. The authors believe that the conclusions can be extended to MT engines for other less-resourced languages that lack large parallel corpora or frequently updated lexical knowledge, as well as to other domains.
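The change logs mentioned above pair each raw MT sentence with its human-corrected version. As a rough illustration only, the following Python sketch shows how such pairs might be turned into word-level edit records and a small parallel corpus of the kind a statistical post-editing system could be trained on. It is not the authors' implementation: the function names `word_edits` and `build_post_editing_corpus`, the whitespace tokenization, and the placeholder sentences are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def word_edits(mt_output, post_edited):
    """List the word-level changes between a raw MT sentence and its post-edit.

    Hypothetical helper: tokenization is plain whitespace splitting, which is
    an assumption and would be too crude for a real system.
    """
    mt_tokens = mt_output.split()
    pe_tokens = post_edited.split()
    matcher = SequenceMatcher(a=mt_tokens, b=pe_tokens, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # tag is one of "replace", "delete" or "insert"
            edits.append((tag, " ".join(mt_tokens[i1:i2]), " ".join(pe_tokens[j1:j2])))
    return edits


def build_post_editing_corpus(pairs):
    """Keep only the (MT output, post-edited) pairs that the editor changed.

    Hypothetical helper: unchanged sentences carry no correction signal in
    this simplified view, so they are dropped.
    """
    return [(mt, pe) for mt, pe in pairs if mt.split() != pe.split()]


# Placeholder sentences, not data from the study.
sample_pairs = [
    ("w1 w2 w3", "w1 w2 w3"),      # left untouched by the editor
    ("w1 wrong w3", "w1 right w3"),  # one word corrected
]
corpus = build_post_editing_corpus(sample_pairs)
for mt, pe in corpus:
    print(word_edits(mt, pe))
```

Filtering out unchanged sentences is a simplification chosen for brevity; a real post-editing corpus might keep them so that the system also learns when no correction is needed.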