Language Independent Named Entity Identification Using Wikipedia



Authors: Mahathi Bhagavatula, Santosh Gsk, Vasudeva Varma
Publication date: 2012

Language Independent Named Entity Identification Using Wikipedia is a scientific work related to Wikipedia quality, published in 2012 and written by Mahathi Bhagavatula, Santosh Gsk and Vasudeva Varma.

Overview

Recognition of Named Entities (NEs) is difficult in Indian languages such as Hindi and Telugu, for which gazetteers and annotated corpora are scarce compared with English. This paper describes a novel clustering- and co-occurrence-based approach that maps English NEs to their equivalent representations in other languages, recognized in a language-independent way. The authors replace the language-specific resources normally required with the richly structured multilingual content of Wikipedia. The approach first clusters highly similar Wikipedia articles. The NEs in an English article are then mapped to terms in the interlinked articles of other languages based on co-occurrence frequencies. Both the cluster information and the term co-occurrences are used to extract NEs from the non-English languages, so the English Wikipedia bootstraps NE identification for other languages. In this way, the authors make extensive use of Wikipedia's structured, semi-structured and multilingual content. Experimental results suggest that the proposed approach achieves promising precision and recall.
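The co-occurrence mapping step can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `map_entities`, the toy transliterated tokens, and the use of the Dice coefficient to normalize co-occurrence counts are all assumptions made for illustration, and the clustering step is not modelled. Each input pair stands for an English article and the article it is interlinked with in another language.

```python
from collections import Counter, defaultdict


def map_entities(article_pairs, english_entities):
    """Map each English NE to the other-language term it co-occurs with most.

    article_pairs: list of (english_tokens, foreign_tokens), one per pair of
    interlinked articles (hypothetical input format).
    english_entities: set of English NEs already recognized.
    Returns {entity: (best_term, score)}.
    """
    cooc = defaultdict(Counter)  # entity -> foreign term -> shared article pairs
    entity_df = Counter()        # entity -> number of English articles containing it
    foreign_df = Counter()       # foreign term -> number of foreign articles containing it

    for en_tokens, fo_tokens in article_pairs:
        en_set, fo_set = set(en_tokens), set(fo_tokens)
        foreign_df.update(fo_set)
        for entity in english_entities & en_set:
            entity_df[entity] += 1
            cooc[entity].update(fo_set)

    mapping = {}
    for entity, counts in cooc.items():
        # Assumption: Dice coefficient, so that terms frequent in every
        # article (function words) do not win on raw counts alone.
        def dice(term):
            return 2 * counts[term] / (entity_df[entity] + foreign_df[term])

        # Ties are broken alphabetically to keep the result deterministic.
        best = max(counts, key=lambda term: (dice(term), term))
        mapping[entity] = (best, dice(best))
    return mapping


# Toy data: transliterated placeholder tokens, not real corpus content.
pairs = [
    (["delhi", "is", "the", "capital"], ["dilli", "rajdhani", "hai"]),
    (["delhi", "metro"], ["dilli", "metro", "hai"]),
    (["mumbai", "is", "a", "city"], ["mumbai", "shahar", "hai"]),
    (["mumbai", "port"], ["mumbai", "bandargah"]),
]
mapping = map_entities(pairs, {"delhi", "mumbai"})
print(mapping["delhi"][0], mapping["mumbai"][0])
```

With more interlinked article pairs per entity, the term that consistently accompanies the English NE separates from incidental vocabulary. With a single pair, every term in the article ties, which is one reason a real system would likely also need the cluster information mentioned above.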

Embed

Wikipedia Quality

Bhagavatula, Mahathi; Gsk, Santosh; Varma, Vasudeva. (2012). "[[Language Independent Named Entity Identification Using Wikipedia]]". Association for Computational Linguistics.

English Wikipedia

{{cite journal |last1=Bhagavatula |first1=Mahathi |last2=Gsk |first2=Santosh |last3=Varma |first3=Vasudeva |title=Language Independent Named Entity Identification Using Wikipedia |date=2012 |url=https://wikipediaquality.com/wiki/Language_Independent_Named_Entity_Identification_Using_Wikipedia |journal=Association for Computational Linguistics}}

HTML

Bhagavatula, Mahathi; Gsk, Santosh; Varma, Vasudeva. (2012). &quot;<a href="https://wikipediaquality.com/wiki/Language_Independent_Named_Entity_Identification_Using_Wikipedia">Language Independent Named Entity Identification Using Wikipedia</a>&quot;. Association for Computational Linguistics.