Language Independent Named Entity Identification Using Wikipedia
Authors | Mahathi Bhagavatula, Santosh Gsk, Vasudeva Varma |
---|---|
Publication date | 2012 |
Links | Original |
Language Independent Named Entity Identification Using Wikipedia is a scientific work related to Wikipedia quality, published in 2012 and written by Mahathi Bhagavatula, Santosh Gsk and Vasudeva Varma.
Overview
Recognizing Named Entities (NEs) is difficult in Indian languages such as Hindi and Telugu, where gazetteers and annotated corpora are scarce compared with English. This paper details a novel clustering and co-occurrence based approach that maps English NEs to their equivalent representations in other languages, which are recognized in a language-independent way. The authors replace the language-specific resources that would otherwise be required with the richly structured multilingual content of Wikipedia. The approach first clusters highly similar Wikipedia articles. The NEs in an English article are then mapped to terms in the interlinked articles of other languages based on co-occurrence frequencies. Both the cluster information and the term co-occurrences are used to extract NEs from the non-English languages, so the English Wikipedia bootstraps NE identification for other languages. In this way the authors make extensive use of Wikipedia's structured, semi-structured and multilingual content. The experimental results suggest that the approach is promising in terms of precision and recall.
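The co-occurrence mapping step can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `map_entities`, the input layout and the Dice-style score are all assumptions made for illustration, and the clustering step is skipped by assuming the interlinked article pairs are already available. Romanized toy tokens stand in for real Hindi text.

```python
from collections import Counter
from itertools import product


def map_entities(article_pairs):
    """Map each English NE to the foreign term it co-occurs with most strongly.

    article_pairs: list of (english_nes, foreign_terms) tuples, one per pair
    of interlinked articles (hypothetical input layout, not from the paper).
    Returns {english_ne: (best_foreign_term, score)}.
    """
    ne_count = Counter()    # article pairs containing each English NE
    term_count = Counter()  # article pairs containing each foreign term
    pair_count = Counter()  # article pairs containing both NE and term

    for english_nes, foreign_terms in article_pairs:
        nes, terms = set(english_nes), set(foreign_terms)
        ne_count.update(nes)
        term_count.update(terms)
        pair_count.update(product(nes, terms))

    best = {}
    for (ne, term), both in pair_count.items():
        # Dice coefficient (an assumed scoring choice): penalizes terms
        # that appear in many articles regardless of the entity.
        score = 2 * both / (ne_count[ne] + term_count[term])
        if ne not in best or score > best[ne][1]:
            best[ne] = (term, score)
    return best


# Toy data: English NEs paired with tokens of the interlinked article.
pairs = [
    (["Delhi", "India"], ["dillii", "bhaarat", "hai"]),
    (["Mumbai", "India"], ["mumbaii", "bhaarat", "hai"]),
    (["Delhi"], ["dillii", "hai"]),
]
mapping = map_entities(pairs)
for ne, (term, score) in sorted(mapping.items()):
    print(ne, term, round(score, 2))
```

Normalizing the raw co-occurrence count keeps a frequent function word such as `hai`, which appears in every toy article, from outscoring the actual equivalent of an entity. A raw frequency count would tie `hai` with the correct term in this toy data.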
Embed
Wikipedia Quality
Bhagavatula, Mahathi; Gsk, Santosh; Varma, Vasudeva. (2012). "[[Language Independent Named Entity Identification Using Wikipedia]]". Association for Computational Linguistics.
English Wikipedia
{{cite journal |last1=Bhagavatula |first1=Mahathi |last2=Gsk |first2=Santosh |last3=Varma |first3=Vasudeva |title=Language Independent Named Entity Identification Using Wikipedia |date=2012 |url=https://wikipediaquality.com/wiki/Language_Independent_Named_Entity_Identification_Using_Wikipedia |journal=Association for Computational Linguistics}}
HTML
Bhagavatula, Mahathi; Gsk, Santosh; Varma, Vasudeva. (2012). "<a href="https://wikipediaquality.com/wiki/Language_Independent_Named_Entity_Identification_Using_Wikipedia">Language Independent Named Entity Identification Using Wikipedia</a>". Association for Computational Linguistics.