Difference between revisions of "Study on Wikipedia for Translation Mining for Clir"

From Wikipedia Quality
(wikilinks)
(Adding infobox)
{{Infobox work
| title = Study on Wikipedia for Translation Mining for Clir
| date = 2010
| authors = [[Jianmin Yao]]<br />[[Chang-Long Sun]]<br />[[Yu Hong]]<br />[[Yun-Dong Ge]]<br />[[Qiaoming Zhu]]
| doi = 10.1109/ICMLC.2010.5580683
| link = http://ieeexplore.ieee.org/document/5580683/
}}
 
'''Study on Wikipedia for Translation Mining for Clir''' is a scientific work related to [[Wikipedia quality]], published in 2010 and written by [[Jianmin Yao]], [[Chang-Long Sun]], [[Yu Hong]], [[Yun-Dong Ge]] and [[Qiaoming Zhu]].
 
  
 
== Overview ==
 
 
Translating out-of-vocabulary (OOV) query terms is one of the key factors affecting the performance of Cross-Language Information Retrieval (CLIR). Based on [[Wikipedia]]'s data structure and language [[features]], the paper divides translation environments into two kinds: target-existence and target-deficit. To overcome the difficulty of translation mining in the target-deficit environment, the authors use frequency change information and adjacency information to extract candidate units, and establish a mixed translation mining strategy based on three models: a frequency-distance model, a surface pattern matching model, and a summary-score model. Search-engine-based OOV translation mining serves as the baseline, and performance is measured on top-1 results. The Wikipedia-based mixed translation mining method is reported to achieve a precision of 0.6279, an improvement of 6.98% over the baseline.
 

Revision as of 10:37, 2 July 2020


Study on Wikipedia for Translation Mining for Clir
Authors: Jianmin Yao, Chang-Long Sun, Yu Hong, Yun-Dong Ge, Qiaoming Zhu
Publication date: 2010
DOI: 10.1109/ICMLC.2010.5580683
Links: Original

Study on Wikipedia for Translation Mining for Clir is a scientific work related to Wikipedia quality, published in 2010 and written by Jianmin Yao, Chang-Long Sun, Yu Hong, Yun-Dong Ge and Qiaoming Zhu.

Overview

Translating out-of-vocabulary (OOV) query terms is one of the key factors affecting the performance of Cross-Language Information Retrieval (CLIR). Based on Wikipedia's data structure and language features, the paper divides translation environments into two kinds: target-existence and target-deficit. To overcome the difficulty of translation mining in the target-deficit environment, the authors use frequency change information and adjacency information to extract candidate units, and establish a mixed translation mining strategy based on three models: a frequency-distance model, a surface pattern matching model, and a summary-score model. Search-engine-based OOV translation mining serves as the baseline, and performance is measured on top-1 results. The Wikipedia-based mixed translation mining method is reported to achieve a precision of 0.6279, an improvement of 6.98% over the baseline.
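
The overview gives the method's top-1 precision (0.6279) and a 6.98% improvement, but it does not say whether that improvement is an absolute difference in percentage points or a gain relative to the baseline. The short Python sketch below works out the baseline precision implied by each reading. The function name `implied_baseline` and both interpretations are assumptions made for illustration; they do not come from the paper.

```python
def implied_baseline(precision: float, gain: float) -> dict[str, float]:
    """Return the baseline top-1 precision implied by each reading of `gain`.

    Both readings are assumptions: the overview does not state which one
    the authors intended.
    """
    return {
        # Reading 1: the gain is an absolute difference,
        # i.e. precision = baseline + gain.
        "absolute": precision - gain,
        # Reading 2: the gain is relative to the baseline,
        # i.e. precision = baseline * (1 + gain).
        "relative": precision / (1.0 + gain),
    }


# Figures quoted in the overview: precision 0.6279, improvement 6.98%.
baselines = implied_baseline(precision=0.6279, gain=0.0698)
for reading, value in baselines.items():
    print(f"{reading} reading: implied baseline top-1 precision = {value:.4f}")
```

Under the absolute reading the implied baseline is about 0.5581, and under the relative reading about 0.5869. Only the original paper can settle which one applies.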