An Iterative Approach to Extract Dictionaries from Wikipedia for Under-Resourced Languages


Authors: Rohit Bharadwaj G, Niket Tandon, Vasudeva Varma
Publication date: 2010

An Iterative Approach to Extract Dictionaries from Wikipedia for Under-Resourced Languages is a scientific work related to Wikipedia quality, published in 2010 and written by Rohit Bharadwaj G, Niket Tandon and Vasudeva Varma.

Overview

Extracting bilingual dictionaries from Wikipedia is a well-known and well-researched problem. Given Wikipedia's structured and richly multilingual content, a language-independent approach is needed to extract dictionaries for many languages, and especially for under-resourced ones. To mine dictionaries for under-resourced languages, the authors developed an iterative approach that constructs a parallel corpus for building a dictionary. The approach draws on several kinds of Wikipedia article information: title, infobox information, category, article text, and the dictionaries already built at each phase. The average precision over various datasets is encouraging, with a maximum precision of 76.7%, which the authors report as better than existing systems. Because no language-specific resources are used, the method is applicable to any pair of languages, with a special focus on under-resourced languages, and thus helps break the language barrier.
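The summary above does not spell out the algorithm, so the following is only a minimal sketch of the general idea of phased, dictionary-reusing extraction. Everything in it is an assumption made for illustration: the toy English-German data, the names `PAIRS`, `seed_from_titles`, `extend_from_field` and `build_dictionary`, and the "one unmatched item on each side" alignment heuristic. It is not the authors' method.

```python
# Hypothetical toy data: each tuple pairs an article with its counterpart
# in another language edition, as a cross-language link would.
PAIRS = [
    (
        {"title": "Munich", "categories": ["Germany", "City"]},
        {"title": "München", "categories": ["Deutschland", "Stadt"]},
    ),
    (
        {"title": "Bavaria", "infobox": ["Munich", "Germany"]},
        {"title": "Bayern", "infobox": ["München", "Deutschland"]},
    ),
]


def seed_from_titles(pairs):
    """First phase: titles of linked articles become translation pairs."""
    return {src["title"].lower(): tgt["title"].lower() for src, tgt in pairs}


def extend_from_field(pairs, dictionary, field):
    """Later phase: use the current dictionary to explain away known items
    in `field`; if exactly one item is left on each side, pair them.
    (This heuristic is an assumption, chosen to keep the sketch short.)"""
    new_entries = {}
    for src, tgt in pairs:
        src_items = [s.lower() for s in src.get(field, [])]
        tgt_items = [t.lower() for t in tgt.get(field, [])]
        known_targets = {dictionary[s] for s in src_items if s in dictionary}
        src_left = [s for s in src_items if s not in dictionary]
        tgt_left = [t for t in tgt_items if t not in known_targets]
        if len(src_left) == 1 and len(tgt_left) == 1:
            new_entries[src_left[0]] = tgt_left[0]
    return new_entries


def build_dictionary(pairs, fields=("infobox", "categories")):
    """Run the phases in order, feeding each phase the dictionary so far."""
    dictionary = seed_from_titles(pairs)
    for field in fields:
        dictionary.update(extend_from_field(pairs, dictionary, field))
    return dictionary


if __name__ == "__main__":
    for source, target in sorted(build_dictionary(PAIRS).items()):
        print(source, "->", target)
```

In this toy run the infobox phase can only pair "germany" with "deutschland" because the title phase already supplied "munich", and the category phase can only pair "city" with "stadt" because the infobox phase supplied "germany". That dependency between phases is the point the sketch tries to illustrate; no language-specific resource is consulted at any step.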
