Extraction of Bilingual Cognates from Wikipedia

From Wikipedia Quality
Revision as of 19:58, 22 June 2019 by Alice (talk | contribs) (Overview - Extraction of Bilingual Cognates from Wikipedia)

Extraction of Bilingual Cognates from Wikipedia is a scientific work related to Wikipedia quality, published in 2012 and written by Pablo Gamallo and Marcos Garcia.

Overview

In this article, the authors propose a method to extract translation equivalents with similar spelling (cognates) from comparable corpora. The method was applied to Wikipedia to extract a large number of Portuguese-Spanish bilingual terminological pairs that were not found in existing dictionaries. The resulting bilingual lexicon consists of more than 27,000 new pairs of lemmas and multiwords, with about 92% accuracy.
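
To illustrate what "similar spelling" can mean in practice, the following is a minimal sketch of cognate candidate selection based on normalized edit distance. It is not the authors' implementation: the similarity measure, the function names (edit_distance, spelling_similarity, cognate_candidates) and the threshold of 0.7 are assumptions made for illustration only, and the sketch leaves out everything the method does with comparable corpora beyond comparing spellings.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def spelling_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]: 1.0 for identical strings, lower as edits grow."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def cognate_candidates(pt_terms, es_terms, threshold=0.7):
    """Return (pt, es, score) triples whose spelling similarity meets the
    threshold, best first. The threshold value is a hypothetical choice."""
    pairs = []
    for pt in pt_terms:
        for es in es_terms:
            score = spelling_similarity(pt, es)
            if score >= threshold:
                pairs.append((pt, es, score))
    return sorted(pairs, key=lambda p: -p[2])


if __name__ == "__main__":
    # Toy Portuguese and Spanish term lists, invented for this example.
    for pt, es, score in cognate_candidates(["universidade", "cachorro"],
                                            ["universidad", "perro"]):
        print(pt, es, round(score, 3))
```

A spelling filter of this kind only proposes candidates. Pairs that look alike but differ in meaning (false friends) would pass it, which is one reason an approach based purely on string similarity should not be expected to reach the accuracy reported above.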