Difference between revisions of "Building Bilingual Parallel Corpora based on Wikipedia"
(Adding wikilinks) | (Infobox)
Line 1: | Line 1:
+ {{Infobox work
+ | title = Building Bilingual Parallel Corpora based on Wikipedia
+ | date = 2010
+ | authors = [[Mehdi Mohammadi]]<br />[[Nasser GhasemAghaee]]
+ | doi = 10.1109/ICCEA.2010.203
+ | link = http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=true&arnumber=5445653
+ }}
'''Building Bilingual Parallel Corpora based on Wikipedia''' is a scientific work related to [[Wikipedia quality]], published in 2010 and written by [[Mehdi Mohammadi]] and [[Nasser GhasemAghaee]].
== Overview ==
Aligned parallel corpora are an important resource for a wide range of [[multilingual]] research, particularly corpus-based [[machine translation]]. In this paper, the authors present a Persian-English sentence-aligned parallel corpus built by mining [[Wikipedia]]. They propose a method for extracting sentence-level alignments using an extended link-based bilingual lexicon. Experimental results show that the method increases precision while reducing the total number of generated candidate pairs.
Revision as of 11:29, 11 November 2020
| Authors | Mehdi Mohammadi, Nasser GhasemAghaee |
|---|---|
| Publication date | 2010 |
| DOI | 10.1109/ICCEA.2010.203 |
| Links | Original |
Building Bilingual Parallel Corpora based on Wikipedia is a scientific work related to Wikipedia quality, published in 2010 and written by Mehdi Mohammadi and Nasser GhasemAghaee.
Overview
Aligned parallel corpora are an important resource for a wide range of multilingual research, particularly corpus-based machine translation. In this paper, the authors present a Persian-English sentence-aligned parallel corpus built by mining Wikipedia. They propose a method for extracting sentence-level alignments using an extended link-based bilingual lexicon. Experimental results show that the method increases precision while reducing the total number of generated candidate pairs.
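
The overview does not spell out the algorithm, so the following is only a minimal sketch of the general idea behind a link-based bilingual lexicon, not the authors' implementation. It assumes that titles connected by interlanguage links can serve as translation pairs, and that a candidate sentence pair is kept when enough of the lexicon terms found in the Persian sentence have their English counterpart in the English sentence. The function names, the substring matching, and the `threshold` parameter are all hypothetical choices made for illustration.

```python
from itertools import product


def build_link_lexicon(link_pairs):
    """Build a source-to-target term dictionary from (source_title, target_title) pairs.

    Assumption: each pair comes from an interlanguage link between two articles.
    """
    return {src.lower(): tgt.lower() for src, tgt in link_pairs}


def alignment_score(src_sentence, tgt_sentence, lexicon):
    """Return the fraction of lexicon terms found in the source sentence
    whose translation also occurs in the target sentence.

    Plain substring matching is used here for brevity; a real system would
    tokenize and normalize both languages.
    """
    src_text = src_sentence.lower()
    tgt_text = tgt_sentence.lower()
    found = [term for term in lexicon if term in src_text]
    if not found:
        return 0.0
    matched = sum(1 for term in found if lexicon[term] in tgt_text)
    return matched / len(found)


def candidate_pairs(src_sentences, tgt_sentences, lexicon, threshold=0.6):
    """Keep only sentence pairs whose score reaches the (hypothetical) threshold."""
    pairs = []
    for src, tgt in product(src_sentences, tgt_sentences):
        score = alignment_score(src, tgt, lexicon)
        if score >= threshold:
            pairs.append((src, tgt, score))
    return pairs


if __name__ == "__main__":
    # Toy data, invented for this example only.
    lexicon = build_link_lexicon([
        ("تهران", "Tehran"),
        ("ایران", "Iran"),
        ("اصفهان", "Isfahan"),
    ])
    persian = ["تهران پایتخت ایران است.", "اصفهان شهری در ایران است."]
    english = ["Tehran is the capital of Iran.", "Isfahan is a city in Iran."]
    for src, tgt, score in candidate_pairs(persian, english, lexicon):
        print(f"{score:.2f}\t{src}\t{tgt}")
```

Raising the threshold in a sketch like this discards more loosely matching pairs, which illustrates the trade-off the overview mentions: fewer generated candidate pairs in exchange for higher precision.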