Difference between revisions of "Building Bilingual Parallel Corpora based on Wikipedia"

From Wikipedia Quality
(Adding wikilinks)
(Infobox)
Line 1: Line 1:
+ {{Infobox work
+ | title = Building Bilingual Parallel Corpora based on Wikipedia
+ | date = 2010
+ | authors = [[Mehdi Mohammadi]]<br />[[Nasser GhasemAghaee]]
+ | doi = 10.1109/ICCEA.2010.203
+ | link = http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=true&amp;arnumber=5445653
+ }}
 
'''Building Bilingual Parallel Corpora based on Wikipedia''' is a scientific work related to [[Wikipedia quality]], published in 2010 and written by [[Mehdi Mohammadi]] and [[Nasser GhasemAghaee]].
  
 
== Overview ==
 
Aligned parallel corpora are an important resource for a wide range of [[multilingual]] research, particularly corpus-based [[machine translation]]. In this paper, the authors present a Persian-English sentence-aligned parallel corpus built by mining [[Wikipedia]]. They propose a method for extracting sentence-level alignments using an extended link-based bilingual lexicon. Experimental results show that the method increases precision while reducing the total number of generated candidate pairs.

Revision as of 11:29, 11 November 2020


Building Bilingual Parallel Corpora based on Wikipedia
Authors: Mehdi Mohammadi, Nasser GhasemAghaee
Publication date: 2010
DOI: 10.1109/ICCEA.2010.203
Links: Original

Building Bilingual Parallel Corpora based on Wikipedia is a scientific work related to Wikipedia quality, published in 2010 and written by Mehdi Mohammadi and Nasser GhasemAghaee.

Overview

Aligned parallel corpora are an important resource for a wide range of multilingual research, particularly corpus-based machine translation. In this paper, the authors present a Persian-English sentence-aligned parallel corpus built by mining Wikipedia. They propose a method for extracting sentence-level alignments using an extended link-based bilingual lexicon. Experimental results show that the method increases precision while reducing the total number of generated candidate pairs.
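
To make the idea of lexicon-based sentence alignment more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a bilingual lexicon has already been collected (for example, from pairs of article titles joined by interlanguage links) and scores each candidate sentence pair by how many lexicon terms found in the source sentence have their translation in the target sentence. The function names `lexicon_overlap` and `candidate_pairs`, the substring matching, the sample lexicon, and the threshold of 0.5 are all illustrative assumptions; the paper's "extended" lexicon and its filtering rules are not described on this page.

```python
from itertools import product


def lexicon_overlap(src_sentence, tgt_sentence, lexicon):
    """Share of lexicon terms in the source sentence whose translation
    also occurs in the target sentence (hypothetical scoring rule)."""
    src_hits = [term for term in lexicon if term in src_sentence]
    if not src_hits:
        return 0.0
    tgt_lower = tgt_sentence.lower()
    matched = sum(1 for term in src_hits if lexicon[term].lower() in tgt_lower)
    return matched / len(src_hits)


def candidate_pairs(src_sentences, tgt_sentences, lexicon, threshold=0.5):
    """Keep only the sentence pairs whose overlap score reaches the threshold.
    The threshold value is an assumption chosen for illustration."""
    pairs = []
    for src, tgt in product(src_sentences, tgt_sentences):
        score = lexicon_overlap(src, tgt, lexicon)
        if score >= threshold:
            pairs.append((src, tgt, score))
    return pairs


# Hypothetical lexicon, as might be derived from interlanguage title links.
lexicon = {"تهران": "Tehran", "ایران": "Iran", "حافظ": "Hafez", "شیراز": "Shiraz"}
persian = ["تهران پایتخت ایران است.", "حافظ در شیراز زاده شد."]
english = ["Tehran is the capital of Iran.", "Hafez was born in Shiraz."]

for src, tgt, score in candidate_pairs(persian, english, lexicon):
    print(f"{score:.2f}  {tgt}")
```

Raising the threshold in a sketch like this discards more of the cross-product of sentences, which illustrates the general trade-off the overview mentions: fewer candidate pairs in exchange for higher precision.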