Multilingual Document Clustering Using Wikipedia as External Knowledge

From Wikipedia Quality
Revision as of 08:22, 21 May 2019 by Agnieszka (Information about: Multilingual Document Clustering Using Wikipedia as External Knowledge)


Multilingual Document Clustering Using Wikipedia as External Knowledge
Authors: N. Kiran Kumar, K. Santosh, Vasudeva Varma
Publication date: 2011
DOI: 10.1007/978-3-642-21353-3_9

Multilingual Document Clustering Using Wikipedia as External Knowledge is a scientific work related to Wikipedia quality, published in 2011 and written by N. Kiran Kumar, K. Santosh and Vasudeva Varma.

Overview

This paper presents Multilingual Document Clustering (MDC) on comparable corpora. Wikipedia has evolved into a major structured multilingual knowledge base. It has been widely exploited in monolingual clustering approaches and in comparing multilingual corpora, but no prior work had studied its impact on MDC. The authors study how Wikipedia can be used to improve MDC performance. They leverage Wikipedia's knowledge structure (cross-lingual links, categories, outlinks, Infobox information, etc.) to enrich the document representation used for clustering multilingual documents. They implement the Bisecting k-means clustering algorithm and run experiments on a standard dataset provided by FIRE for its 2010 Ad-hoc Cross-Lingual document retrieval task on Indian languages, using the English and Hindi datasets. Because it avoids language-specific tools, the approach provides a general framework that can be easily extended to other languages. The system was evaluated using F-score and Purity measures, and the authors report encouraging results.
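The clustering and evaluation steps named in the overview can be illustrated with a minimal sketch. It is not the authors' implementation: it shows generic textbook Bisecting k-means followed by the Purity measure, and everything specific in it is an assumption made for illustration. The toy 2-D vectors stand in for the Wikipedia-enriched document representations, squared Euclidean distance is used because the summary does not say which similarity the paper used, and the helper names (`two_means`, `bisecting_kmeans`, `purity`), the deterministic seeding and the largest-SSE split rule are hypothetical choices.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def sse(vectors, idx):
    """Sum of squared distances of the indexed vectors to their centroid."""
    c = centroid([vectors[i] for i in idx])
    return sum(sq_dist(vectors[i], c) for i in idx)


def two_means(vectors, idx, iters=20):
    """Split the indexed vectors into two groups with plain 2-means.

    Seeding is deterministic (an assumption of this sketch): the first seed is
    the point farthest from the centroid, the second the point farthest from
    the first seed.
    """
    pts = [vectors[i] for i in idx]
    c = centroid(pts)
    s1 = max(pts, key=lambda p: sq_dist(p, c))
    s2 = max(pts, key=lambda p: sq_dist(p, s1))
    centers = [list(s1), list(s2)]
    groups = [list(idx), []]
    for _ in range(iters):
        new_groups = [[], []]
        for i in idx:
            d0 = sq_dist(vectors[i], centers[0])
            d1 = sq_dist(vectors[i], centers[1])
            new_groups[0 if d0 <= d1 else 1].append(i)
        if not new_groups[0] or not new_groups[1]:
            return new_groups  # degenerate split, e.g. identical points
        if new_groups == groups:
            break  # assignments stopped changing
        groups = new_groups
        centers = [centroid([vectors[i] for i in g]) for g in groups]
    return groups


def bisecting_kmeans(vectors, k):
    """Repeatedly bisect the cluster with the largest SSE until k clusters exist."""
    clusters = [list(range(len(vectors)))]
    while len(clusters) < k:
        worst = max(range(len(clusters)), key=lambda j: sse(vectors, clusters[j]))
        left, right = two_means(vectors, clusters[worst])
        if not left or not right:
            break  # the chosen cluster cannot be split further
        clusters[worst:worst + 1] = [left, right]
    return clusters


def purity(clusters, labels):
    """Fraction of documents that belong to the majority class of their cluster."""
    total = sum(len(c) for c in clusters)
    majority = sum(
        Counter(labels[i] for i in c).most_common(1)[0][1] for c in clusters
    )
    return majority / total


# Toy data: three well-separated groups with hypothetical gold labels.
vectors = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.0),
           (5.0, 5.1), (5.1, 5.0),
           (9.0, 0.0), (9.1, 0.1)]
labels = ["a", "a", "a", "b", "b", "c", "c"]

clusters = bisecting_kmeans(vectors, k=3)
print(sorted(clusters), purity(clusters, labels))
```

Clusters are returned as lists of document indices, so the same `labels` list can be reused to compute other external measures such as F-score. In a multilingual setting the vectors would have to live in a shared feature space for documents in different languages to be comparable, which is the role the overview assigns to the Wikipedia-derived features.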

Embed

Wikipedia Quality

Kumar, N. Kiran; Santosh, K.; Varma, Vasudeva. (2011). "[[Multilingual Document Clustering Using Wikipedia as External Knowledge]]". Springer, Berlin, Heidelberg. DOI: 10.1007/978-3-642-21353-3_9.

English Wikipedia

{{cite journal |last1=Kumar |first1=N. Kiran |last2=Santosh |first2=K. |last3=Varma |first3=Vasudeva |title=Multilingual Document Clustering Using Wikipedia as External Knowledge |date=2011 |doi=10.1007/978-3-642-21353-3_9 |url=https://wikipediaquality.com/wiki/Multilingual_Document_Clustering_Using_Wikipedia_as_External_Knowledge |journal=Springer, Berlin, Heidelberg}}

HTML

Kumar, N. Kiran; Santosh, K.; Varma, Vasudeva. (2011). &quot;<a href="https://wikipediaquality.com/wiki/Multilingual_Document_Clustering_Using_Wikipedia_as_External_Knowledge">Multilingual Document Clustering Using Wikipedia as External Knowledge</a>&quot;. Springer, Berlin, Heidelberg. DOI: 10.1007/978-3-642-21353-3_9.