Conceptual Hierarchical Clustering of Documents Using Wikipedia Knowledge

From Wikipedia Quality
Revision as of 08:35, 18 October 2019 by Madison (talk | contribs) (+ wikilinks)

Conceptual Hierarchical Clustering of Documents Using Wikipedia Knowledge is a scientific work related to Wikipedia quality, published in 2011 and written by Gerasimos Spanakis, Georgios Siolas and Andreas Stafylopatis.

Overview

In this paper, the authors propose a novel method for conceptual hierarchical clustering of documents using knowledge extracted from Wikipedia. A robust and compact document representation is built in real time using the Wikipedia API. The clustering process is hierarchical and creates cluster labels that are descriptive and important for the examined corpus. Experiments show that the proposed technique greatly improves over the baseline approach.
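
The general idea of clustering documents by Wikipedia concepts and labelling each cluster can be illustrated with a small sketch. This is not the authors' algorithm: it assumes each document has already been mapped to a set of Wikipedia article titles (the `DOCS` mapping below is hand-written and hypothetical, with no Wikipedia API call), and it swaps in plain average-linkage agglomerative clustering over Jaccard similarity, labelling each merged cluster with its most frequent concepts.

```python
from collections import Counter
from itertools import combinations

# Hypothetical document -> Wikipedia concept sets, written by hand for
# illustration. The paper builds its representation via the Wikipedia API.
DOCS = {
    "d1": {"Machine learning", "Cluster analysis", "Algorithm"},
    "d2": {"Cluster analysis", "Algorithm", "Data mining"},
    "d3": {"Football", "FIFA World Cup"},
    "d4": {"Football", "FIFA World Cup", "Stadium"},
}


def jaccard(a, b):
    """Jaccard similarity between two concept sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def label(members, docs, top=2):
    """Label a cluster with its most frequent concepts (ties broken by name)."""
    counts = Counter(c for m in members for c in docs[m])
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [concept for concept, _ in ranked[:top]]


def agglomerate(docs):
    """Average-linkage agglomerative clustering; returns the merge history."""
    clusters = [frozenset([d]) for d in sorted(docs)]
    merges = []

    def sim(x, y):
        # Average pairwise Jaccard similarity between two clusters.
        total = sum(jaccard(docs[a], docs[b]) for a in x for b in y)
        return total / (len(x) * len(y))

    while len(clusters) > 1:
        # Merge the most similar pair of clusters.
        x, y = max(combinations(clusters, 2), key=lambda pair: sim(*pair))
        merged = x | y
        clusters = [c for c in clusters if c not in (x, y)] + [merged]
        merges.append((sorted(merged), label(merged, docs)))
    return merges


if __name__ == "__main__":
    for members, concepts in agglomerate(DOCS):
        print(members, "->", concepts)
```

Each entry in the merge history pairs a cluster's members with a concept-based label, so the sequence of merges can be read as a labelled hierarchy from the leaves up to the root.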