A Novel Weighting Scheme for Efficient Document Indexing and Classification
Authors | Bashar Tahayna, Ramesh Kumar Ayyasamy, Saadat M. Alhashmi, Siew Eu-Gene |
---|---|
Publication date | 2010 |
ISBN | 978-142446718-1 |
DOI | 10.1109/ITSIM.2010.5561553 |
A Novel Weighting Scheme for Efficient Document Indexing and Classification is a scientific work related to Wikipedia quality, published in 2010 and written by Bashar Tahayna, Ramesh Kumar Ayyasamy, Saadat M. Alhashmi and Siew Eu-Gene.
Overview
In this paper, the authors propose a new topic-based document classification method and illustrate its effectiveness. The method draws on Wikipedia, a large-scale Web encyclopaedia with a very large collection of high-quality articles and a category system. An N-gram technique matches document text against Wikipedia, transforming each document from a "bag of words" into a "bag of concepts". Based on this transformation, a novel concept-based weighting scheme (denoted Conf.idf) is proposed to index the text in the spirit of the traditional tf.idf indexing scheme. In addition, a genetic algorithm-based support vector machine optimization method is used for feature subset and instance selection. Experimental results showed that the proposed weighting scheme outperforms the traditional indexing and weighting scheme.
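The bag-of-concepts idea can be illustrated with a minimal Python sketch. This is not the authors' Conf.idf formula, which the overview does not state; the sketch simply applies the standard tf.idf weight to concept counts instead of word counts. The `CONCEPTS` table is a hypothetical stand-in for Wikipedia article titles, and the function names `bag_of_concepts` and `concept_tfidf` are made up for this example.

```python
import math
from collections import Counter

# Hypothetical phrase-to-concept table standing in for Wikipedia article titles.
CONCEPTS = {
    ("support", "vector", "machine"): "Support vector machine",
    ("genetic", "algorithm"): "Genetic algorithm",
    ("wikipedia",): "Wikipedia",
}
MAX_N = max(len(phrase) for phrase in CONCEPTS)


def bag_of_concepts(text):
    """Map a text to concept counts by greedy longest-match N-gram lookup.

    Tokens that match no concept are dropped (an assumption of this sketch).
    """
    tokens = text.lower().split()
    counts = Counter()
    i = 0
    while i < len(tokens):
        # Try the longest N-gram first so multi-word concepts win.
        for n in range(min(MAX_N, len(tokens) - i), 0, -1):
            concept = CONCEPTS.get(tuple(tokens[i:i + n]))
            if concept is not None:
                counts[concept] += 1
                i += n
                break
        else:
            i += 1  # no concept starts at this token
    return counts


def concept_tfidf(docs):
    """Standard tf.idf computed over concepts (assumed form, not Conf.idf)."""
    bags = [bag_of_concepts(doc) for doc in docs]
    n_docs = len(bags)
    # Document frequency: number of documents in which each concept occurs.
    df = Counter(concept for bag in bags for concept in bag)
    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append({
            concept: (count / total) * math.log(n_docs / df[concept])
            for concept, count in bag.items()
        })
    return weights


docs = [
    "a genetic algorithm tunes a support vector machine",
    "wikipedia describes the genetic algorithm",
    "wikipedia wikipedia",
]
weights = concept_tfidf(docs)
```

In the first toy document, "Support vector machine" occurs in only one of the three documents, so it receives a higher weight than "Genetic algorithm", which occurs in two. The resulting weight dictionaries could then serve as feature vectors for a classifier such as a support vector machine.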
Embed
Wikipedia Quality
Tahayna, Bashar; Ayyasamy, Ramesh Kumar; Alhashmi, Saadat M.; Eu-Gene, Siew. (2010). "[[A Novel Weighting Scheme for Efficient Document Indexing and Classification]]". IET Information Security Volume 4, Issue 4, December 2010, pp. 273-282. ISBN: 978-142446718-1. DOI: 10.1109/ITSIM.2010.5561553.
English Wikipedia
{{cite journal |last1=Tahayna |first1=Bashar |last2=Ayyasamy |first2=Ramesh Kumar |last3=Alhashmi |first3=Saadat M. |last4=Eu-Gene |first4=Siew |title=A Novel Weighting Scheme for Efficient Document Indexing and Classification |date=2010 |isbn=978-142446718-1 |doi=10.1109/ITSIM.2010.5561553 |url=https://wikipediaquality.com/wiki/A_Novel_Weighting_Scheme_for_Efficient_Document_Indexing_and_Classification |journal=IET Information Security Volume 4, Issue 4, December 2010, pp. 273-282}}
HTML
Tahayna, Bashar; Ayyasamy, Ramesh Kumar; Alhashmi, Saadat M.; Eu-Gene, Siew. (2010). "<a href="https://wikipediaquality.com/wiki/A_Novel_Weighting_Scheme_for_Efficient_Document_Indexing_and_Classification">A Novel Weighting Scheme for Efficient Document Indexing and Classification</a>". IET Information Security Volume 4, Issue 4, December 2010, pp. 273-282. ISBN: 978-142446718-1. DOI: 10.1109/ITSIM.2010.5561553.