Exploiting Negative Categories and Wikipedia Structures for Document Classification

From Wikipedia Quality

Exploiting Negative Categories and Wikipedia Structures for Document Classification is a scientific work related to Wikipedia quality, published in 2009 and written by Meenakshi Sundaram Murugeshan, K. Lakshmi and Saswati Mukherjee.

Overview

This paper explores the effect of a profile-based method for the classification of Wikipedia XML documents. The authors' approach builds two profiles, exploiting the whole content, Initial Descriptions and links in the Wikipedia documents. To build the profiles, the authors use negative category information, which has been shown to perform well for classifying unstructured texts. The performance of the Cosine and Fractional Similarity metrics is also compared. Using two classifiers and taking their weighted average improves classification performance.
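The general idea of profile-based classification with negative category information and a weighted average of two classifiers can be illustrated with a minimal sketch. It is not the authors' implementation: the weighting formula in `build_profile`, the `penalty` and `alpha` parameters, and the split into a content view and a link view are all assumptions made for illustration. Only the Cosine metric is shown, since the summary does not define Fractional Similarity.

```python
import math
from collections import Counter


def build_profile(positive_docs, negative_docs, penalty=0.5):
    """Build a term-weight profile for one category.

    Assumed weighting (hypothetical): average term frequency in the
    category's own documents minus a penalty for the average frequency
    in documents of the other (negative) categories.
    """
    pos = Counter(t for doc in positive_docs for t in doc)
    neg = Counter(t for doc in negative_docs for t in doc)
    profile = {}
    for term, count in pos.items():
        weight = count / len(positive_docs)
        weight -= penalty * neg[term] / max(len(negative_docs), 1)
        if weight > 0:  # drop terms that the negative categories cancel out
            profile[term] = weight
    return profile


def cosine(doc_tokens, profile):
    """Cosine similarity between a tokenized document and a profile."""
    doc = Counter(doc_tokens)
    dot = sum(doc[t] * w for t, w in profile.items())
    norm_doc = math.sqrt(sum(c * c for c in doc.values()))
    norm_prof = math.sqrt(sum(w * w for w in profile.values()))
    if norm_doc == 0 or norm_prof == 0:
        return 0.0
    return dot / (norm_doc * norm_prof)


def classify(content_tokens, link_tokens, content_profiles, link_profiles,
             alpha=0.6):
    """Combine two profile classifiers with a weighted average.

    alpha is a hypothetical weight for the content-based score; the
    link-based score receives 1 - alpha.
    """
    scores = {}
    for category in content_profiles:
        content_score = cosine(content_tokens, content_profiles[category])
        link_score = cosine(link_tokens, link_profiles[category])
        scores[category] = alpha * content_score + (1 - alpha) * link_score
    best = max(scores, key=scores.get)
    return best, scores
```

In this sketch a term shared with other categories loses weight or disappears from the profile, which is one simple way to make use of negative category information. In practice the `alpha` weight would be tuned on held-out data.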