Wikipedia-Based Smoothing for Enhancing Text Clustering


Authors
Elahe Rahimtoroghi
Azadeh Shakery
Publication date
2011
DOI
10.1007/978-3-642-25631-8_30

Wikipedia-Based Smoothing for Enhancing Text Clustering is a scientific work related to Wikipedia quality, published in 2011 and written by Elahe Rahimtoroghi and Azadeh Shakery.

Overview

Conventional text clustering algorithms based on the bag-of-words model fail to fully capture the semantic relations between words. As a result, documents describing the same topic may not be assigned to the same cluster if they use different sets of words. A generic solution to this issue is to use background knowledge to enrich the document contents. In this research, the authors adopt a language modeling approach to text clustering and propose smoothing the document language models with Wikipedia articles in order to improve clustering performance. The contents of Wikipedia articles, as well as their assigned categories, are used in three different ways to smooth the document language models, with the goal of enriching the document contents. Clustering is then performed on a document similarity graph constructed from the enhanced document collection. Experimental results confirm the effectiveness of the proposed methods.
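
The general idea of smoothing a document language model with background text can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the linear interpolation scheme, the weight `lam`, the use of cosine similarity for graph edges, the whitespace tokenizer, and the toy texts. The paper describes three Wikipedia-based smoothing variants, which this sketch does not reproduce.

```python
from collections import Counter
import math


def unigram_lm(text):
    """Maximum-likelihood unigram model: word -> probability."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def smooth(doc_lm, background_lm, lam=0.5):
    """Linear interpolation (assumed scheme, not necessarily the paper's):
    p(w) = (1 - lam) * p(w | doc) + lam * p(w | background)."""
    vocab = set(doc_lm) | set(background_lm)
    return {
        w: (1 - lam) * doc_lm.get(w, 0.0) + lam * background_lm.get(w, 0.0)
        for w in vocab
    }


def cosine(p, q):
    """Cosine similarity between two word -> probability mappings."""
    dot = sum(v * q.get(w, 0.0) for w, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


# Two toy documents on the same topic that share no words.
doc_a = unigram_lm("the car engine failed")
doc_b = unigram_lm("automobile motor repair")

# Hypothetical stand-in for the text of a related Wikipedia article.
wiki = unigram_lm("car automobile engine motor vehicle")

raw_sim = cosine(doc_a, doc_b)
smoothed_a = smooth(doc_a, wiki)
smoothed_b = smooth(doc_b, wiki)
smoothed_sim = cosine(smoothed_a, smoothed_b)

print("similarity without smoothing:", raw_sim)
print("similarity with smoothing:   ", smoothed_sim)
```

Without smoothing, the two documents have zero similarity because their vocabularies do not overlap. After interpolation with the background model, both assign probability to the shared background words, so an edge between them in a similarity graph would receive a nonzero weight.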

Embed

Wikipedia Quality

Rahimtoroghi, Elahe; Shakery, Azadeh. (2011). "[[Wikipedia-Based Smoothing for Enhancing Text Clustering]]". Springer, Berlin, Heidelberg. DOI: 10.1007/978-3-642-25631-8_30.

English Wikipedia

{{cite journal |last1=Rahimtoroghi |first1=Elahe |last2=Shakery |first2=Azadeh |title=Wikipedia-Based Smoothing for Enhancing Text Clustering |date=2011 |doi=10.1007/978-3-642-25631-8_30 |url=https://wikipediaquality.com/wiki/Wikipedia-based_smoothing_for_enhancing_text_clustering |journal=Springer, Berlin, Heidelberg}}

HTML

Rahimtoroghi, Elahe; Shakery, Azadeh. (2011). &quot;<a href="https://wikipediaquality.com/wiki/Wikipedia-Based_Smoothing_for_Enhancing_Text_Clustering">Wikipedia-Based Smoothing for Enhancing Text Clustering</a>&quot;. Springer, Berlin, Heidelberg. DOI: 10.1007/978-3-642-25631-8_30.