Sense Clustering Using Wikipedia
Sense Clustering Using Wikipedia is a scientific work related to Wikipedia quality, published in 2013 and written by Bharath Dandala, Chris Hokamp, Rada Mihalcea and Razvan C. Bunescu.
Overview
In this paper, the authors propose a novel method for generating a coarse-grained sense inventory from Wikipedia using a machine learning framework. Structural and content-based features are employed to induce clusters of articles that represent a single word sense. Additionally, multilingual features are shown to improve clustering accuracy, especially for languages whose Wikipedia coverage is less comprehensive than that of English. The authors demonstrate the effectiveness of their clustering methodology by testing it against both manually and automatically annotated datasets.
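The general idea of grouping articles into coarse senses from pairwise features can be illustrated with a minimal sketch. It is not the authors' implementation: the paper uses a trained machine learning model, whereas this sketch substitutes a fixed weighted score and a threshold. The feature definitions (Jaccard overlap of tokens, outgoing links and interlanguage links), the weights, the threshold, all function names and the toy data are assumptions made up for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(x, y):
    """Hypothetical pairwise features for two articles."""
    return (
        jaccard(x["tokens"], y["tokens"]),        # content-based
        jaccard(x["links"], y["links"]),          # structural
        jaccard(x["langlinks"], y["langlinks"]),  # multilingual
    )


def same_sense(features, weights=(0.4, 0.4, 0.2), threshold=0.3):
    """Stand-in for a trained classifier: fixed weights, assumed threshold."""
    score = sum(w * f for w, f in zip(weights, features))
    return score >= threshold


def cluster_senses(articles):
    """Merge articles judged to share a sense, using union-find."""
    parent = {title: title for title in articles}

    def find(t):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in combinations(articles, 2):
        if same_sense(pair_features(articles[a], articles[b])):
            parent[find(a)] = find(b)

    clusters = {}
    for t in articles:
        clusters.setdefault(find(t), set()).add(t)
    return sorted(sorted(c) for c in clusters.values())


# Made-up toy data for the ambiguous word "bar"; not taken from the paper.
TOY_ARTICLES = {
    "Bar (establishment)": {
        "tokens": ["drink", "alcohol", "serve", "counter"],
        "links": ["Beer", "Alcoholic drink", "Bartender"],
        "langlinks": ["de:Bar", "es:Bar"],
    },
    "Pub": {
        "tokens": ["drink", "alcohol", "serve", "beer"],
        "links": ["Beer", "Alcoholic drink", "Inn"],
        "langlinks": ["de:Kneipe", "es:Bar"],
    },
    "Bar (unit)": {
        "tokens": ["pressure", "unit", "pascal"],
        "links": ["Pascal (unit)", "Pressure"],
        "langlinks": ["de:Bar (Einheit)"],
    },
}
```

Calling cluster_senses(TOY_ARTICLES) groups the two drinking-venue articles into one coarse sense and leaves the pressure unit in a cluster of its own. In a setting closer to the paper, the fixed score in same_sense would be replaced by a classifier trained on annotated article pairs.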