Difference between revisions of "Building Bilingual Parallel Corpora based on Wikipedia"
Revision as of 10:25, 9 March 2021
Authors | Mehdi Mohammadi, Nasser GhasemAghaee |
---|---|
Publication date | 2010 |
DOI | 10.1109/ICCEA.2010.203 |
Links | Original |
Building Bilingual Parallel Corpora based on Wikipedia is a scientific work related to Wikipedia quality, published in 2010 and written by Mehdi Mohammadi and Nasser GhasemAghaee.
Overview
Aligned parallel corpora are an important resource for a wide range of multilingual research, particularly corpus-based machine translation. In this paper, the authors present a Persian-English sentence-aligned parallel corpus built by mining Wikipedia. They propose a method for extracting sentence-level alignments using an extended link-based bilingual lexicon. Experimental results show that the method increases precision while reducing the total number of generated candidate pairs.
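The overview does not spell out the alignment procedure, so the following is only a minimal sketch of the general idea behind lexicon-based sentence alignment, not the authors' implementation. It assumes a bilingual lexicon has already been collected (for example, from the titles of articles connected by interlanguage links) and scores each candidate sentence pair by how many source tokens have a lexicon translation in the target sentence. The function names, the whitespace tokenizer, the scoring formula, and the threshold are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple


def tokenize(sentence: str) -> List[str]:
    """Naive tokenizer (an assumption): lowercase and split on whitespace."""
    return sentence.lower().split()


def score_pair(src_tokens: List[str], tgt_tokens: List[str],
               lexicon: Dict[str, str]) -> float:
    """Fraction of source tokens whose lexicon translation occurs in the target."""
    if not src_tokens:
        return 0.0
    tgt_set = set(tgt_tokens)
    hits = sum(1 for tok in src_tokens if lexicon.get(tok) in tgt_set)
    return hits / len(src_tokens)


def align_sentences(src_sents: List[str], tgt_sents: List[str],
                    lexicon: Dict[str, str],
                    threshold: float = 0.3) -> List[Tuple[int, int, float]]:
    """For each source sentence, keep its best-scoring target sentence
    if the score reaches the (hypothetical) threshold."""
    pairs = []
    if not tgt_sents:
        return pairs
    tgt_tokens = [tokenize(t) for t in tgt_sents]
    for i, src in enumerate(src_sents):
        src_tok = tokenize(src)
        scores = [score_pair(src_tok, tt, lexicon) for tt in tgt_tokens]
        j = max(range(len(scores)), key=scores.__getitem__)
        if scores[j] >= threshold:
            pairs.append((i, j, scores[j]))
    return pairs


# Toy data: a two-entry Persian-to-English lexicon, as might come from linked titles.
toy_lexicon = {"تهران": "tehran", "ایران": "iran"}
persian = ["تهران پایتخت ایران است", "این جمله ترجمه ندارد"]
english = ["Tehran is the capital of Iran", "An unrelated sentence"]

print(align_sentences(persian, english, toy_lexicon))
```

Raising the threshold discards more low-overlap candidates, which illustrates the trade-off the overview mentions: fewer candidate pairs are generated, and the ones that remain are more likely to be true translations.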
Embed
Wikipedia Quality
Mohammadi, Mehdi; GhasemAghaee, Nasser. (2010). "[[Building Bilingual Parallel Corpora based on Wikipedia]]". DOI: 10.1109/ICCEA.2010.203.
English Wikipedia
{{cite journal |last1=Mohammadi |first1=Mehdi |last2=GhasemAghaee |first2=Nasser |title=Building Bilingual Parallel Corpora based on Wikipedia |date=2010 |doi=10.1109/ICCEA.2010.203 |url=https://wikipediaquality.com/wiki/Building_Bilingual_Parallel_Corpora_based_on_Wikipedia}}
HTML
Mohammadi, Mehdi; GhasemAghaee, Nasser. (2010). "<a href="https://wikipediaquality.com/wiki/Building_Bilingual_Parallel_Corpora_based_on_Wikipedia">Building Bilingual Parallel Corpora based on Wikipedia</a>". DOI: 10.1109/ICCEA.2010.203.