An Iterative Approach to Extract Dictionaries from Wikipedia for Under-Resourced Languages
Authors | Rohit Bharadwaj G, Niket Tandon, Vasudeva Varma |
---|---|
Publication date | 2010 |
Links | Original |
An Iterative Approach to Extract Dictionaries from Wikipedia for Under-Resourced Languages is a scientific work related to Wikipedia quality, published in 2010 and written by Rohit Bharadwaj G, Niket Tandon and Vasudeva Varma.
Overview
The problem of extracting bilingual dictionaries from Wikipedia is well known and well researched. Given the structured and rich multilingual content of Wikipedia, a language-independent approach is needed to extract dictionaries for many languages, especially under-resourced ones. To mine dictionaries for under-resourced languages, the authors developed an iterative approach that constructs a parallel corpus from which a dictionary is built. The approach draws on several kinds of Wikipedia article information: the title, infobox information, categories, article text, and the dictionaries already built in earlier phases. The average precision over various datasets is encouraging, with a maximum precision of 76.7%, which is better than existing systems. Because no language-specific resources are used, the method is applicable to any pair of languages, with a special focus on under-resourced languages, and thereby helps to break the language barrier.
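The abstract does not spell out the individual phases, so the following is only a minimal Python sketch of the general idea of reusing an earlier dictionary in a later phase. It is not the authors' method. All function names, the `min_count` threshold, and the "exactly one unexplained word on each side" heuristic are assumptions made for illustration. The sketch seeds a dictionary from cross-language linked titles, then repeatedly scans aligned text snippets (for example, matched infobox values) and adds a word pair when it is the only part of a snippet the current dictionary cannot explain.

```python
from collections import Counter


def seed_from_titles(title_pairs):
    """First phase (assumed): treat cross-language linked titles as translations."""
    dictionary = {}
    for src_title, tgt_title in title_pairs:
        # Keep the first translation seen for each source title.
        dictionary.setdefault(src_title.lower(), tgt_title.lower())
    return dictionary


def extend_from_aligned_text(dictionary, aligned_pairs, min_count=2):
    """Later phase (assumed heuristic): use the current dictionary to find new pairs.

    A snippet pair contributes a candidate only when exactly one source word
    and exactly one target word are not covered by the current dictionary.
    """
    known_targets = set(dictionary.values())
    candidates = Counter()
    for src_text, tgt_text in aligned_pairs:
        src_unknown = [w for w in src_text.lower().split() if w not in dictionary]
        tgt_unknown = [w for w in tgt_text.lower().split() if w not in known_targets]
        if len(src_unknown) == 1 and len(tgt_unknown) == 1:
            candidates[(src_unknown[0], tgt_unknown[0])] += 1

    extended = dict(dictionary)
    for (src, tgt), count in candidates.items():
        # Require repeated evidence before accepting a candidate pair.
        if count >= min_count and src not in extended:
            extended[src] = tgt
    return extended


def build_dictionary(title_pairs, aligned_pairs, max_iterations=5):
    """Iterate until no new entries are added or the iteration limit is hit."""
    dictionary = seed_from_titles(title_pairs)
    for _ in range(max_iterations):
        extended = extend_from_aligned_text(dictionary, aligned_pairs)
        if len(extended) == len(dictionary):
            break
        dictionary = extended
    return dictionary
```

The point of iterating is that an entry learned in one pass can resolve a snippet that was ambiguous in the previous pass. With toy data where "river" is known from titles, a snippet pairing "big river" with its translation yields "big" in the first pass, and a snippet containing "big city" becomes resolvable only in the second.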
Embed
Wikipedia Quality
G, Rohit Bharadwaj; Tandon, Niket; Varma, Vasudeva. (2010). "[[An Iterative Approach to Extract Dictionaries from Wikipedia for Under-Resourced Languages]]".
English Wikipedia
{{cite journal |last1=G |first1=Rohit Bharadwaj |last2=Tandon |first2=Niket |last3=Varma |first3=Vasudeva |title=An Iterative Approach to Extract Dictionaries from Wikipedia for Under-Resourced Languages |date=2010 |url=https://wikipediaquality.com/wiki/An_Iterative_Approach_to_Extract_Dictionaries_from_Wikipedia_for_Under-Resourced_Languages}}
HTML
G, Rohit Bharadwaj; Tandon, Niket; Varma, Vasudeva. (2010). "<a href="https://wikipediaquality.com/wiki/An_Iterative_Approach_to_Extract_Dictionaries_from_Wikipedia_for_Under-Resourced_Languages">An Iterative Approach to Extract Dictionaries from Wikipedia for Under-Resourced Languages</a>".