A Novel Weighting Scheme for Efficient Document Indexing and Classification

From Wikipedia Quality
Revision as of 00:43, 4 July 2018 by Librarian (talk | contribs) (New scientific work)
Authors
Bashar Tahayna
Ramesh Kumar Ayyasamy
Saadat M. Alhashmi
Siew Eu-Gene
Publication date
2010
ISBN
978-142446718-1
DOI
10.1109/ITSIM.2010.5561553

A Novel Weighting Scheme for Efficient Document Indexing and Classification is a scientific work related to Wikipedia quality, published in 2010 and written by Bashar Tahayna, Ramesh Kumar Ayyasamy, Saadat M. Alhashmi and Siew Eu-Gene.

Overview

In this paper, the authors propose a new topic-based document classification method and illustrate its effectiveness. The method draws on Wikipedia, a large-scale Web encyclopaedia that offers a huge number of high-quality articles and a category system. An N-gram technique matches the document's text against Wikipedia, transforming the document from a "bag of words" into a "bag of concepts". Based on this transformation, a novel concept-based weighting scheme (denoted Conf.idf) is proposed to index the text in the spirit of the traditional tf.idf indexing scheme. In addition, a genetic algorithm-based support vector machine optimization method is used for feature subset and instance selection. Experimental results showed that the proposed weighting scheme outperforms the traditional indexing and weighting scheme.
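
The overview does not give the exact Conf.idf formula, so the following is only a minimal sketch of the general idea, under stated assumptions: concepts are found by greedy longest-match N-gram lookup against a tiny hypothetical set of Wikipedia article titles, and each concept is weighted by its frequency in the document times a plain logarithmic inverse document frequency, by analogy with tf.idf. The names `CONCEPTS`, `extract_concepts` and `concept_idf_weights` are illustrative and do not come from the paper.

```python
import math
from collections import Counter

# Hypothetical stand-in for Wikipedia article titles (lowercased).
# A real system would look titles up in a Wikipedia dump instead.
CONCEPTS = {
    "genetic algorithm",
    "support vector machine",
    "document classification",
    "wikipedia",
}
MAX_N = 3  # longest N-gram tried; an assumption for this sketch


def extract_concepts(text, concepts=CONCEPTS, max_n=MAX_N):
    """Turn a text into a list of matched concepts (a "bag of concepts").

    At each position the longest N-gram that is a known concept wins.
    Tokenization is a plain whitespace split, with no punctuation handling.
    """
    tokens = text.lower().split()
    found = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_n, len(tokens) - i), 0, -1):
            gram = " ".join(tokens[i:i + n])
            if gram in concepts:
                found.append(gram)
                i += n  # skip past the matched N-gram
                break
        else:
            i += 1  # no concept starts at this token
    return found


def concept_idf_weights(docs):
    """Weight each concept by (frequency in document) * log(N / document frequency).

    This mirrors tf.idf with concepts in place of terms; it is an assumed
    form, not necessarily the Conf.idf definition used in the paper.
    """
    bags = [Counter(extract_concepts(d)) for d in docs]
    n_docs = len(bags)
    doc_freq = Counter()
    for bag in bags:
        doc_freq.update(bag.keys())  # count each concept once per document
    return [
        {c: cf * math.log(n_docs / doc_freq[c]) for c, cf in bag.items()}
        for bag in bags
    ]


docs = [
    "a genetic algorithm tunes a support vector machine",
    "wikipedia helps document classification",
    "a support vector machine for document classification",
]
weights = concept_idf_weights(docs)
```

In this toy corpus, "genetic algorithm" appears in only one of the three documents and therefore receives a higher weight in the first document than "support vector machine", which appears in two.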

Embed

Wikipedia Quality

Tahayna, Bashar; Ayyasamy, Ramesh Kumar; Alhashmi, Saadat M.; Eu-Gene, Siew. (2010). "[[A Novel Weighting Scheme for Efficient Document Indexing and Classification]]". IET Information Security Volume 4, Issue 4, December 2010, pp. 273-282. ISBN: 978-142446718-1. DOI: 10.1109/ITSIM.2010.5561553.

English Wikipedia

{{cite journal |last1=Tahayna |first1=Bashar |last2=Ayyasamy |first2=Ramesh Kumar |last3=Alhashmi |first3=Saadat M. |last4=Eu-Gene |first4=Siew |title=A Novel Weighting Scheme for Efficient Document Indexing and Classification |date=2010 |isbn=978-142446718-1 |doi=10.1109/ITSIM.2010.5561553 |url=https://wikipediaquality.com/wiki/A_Novel_Weighting_Scheme_for_Efficient_Document_Indexing_and_Classification |journal=IET Information Security Volume 4, Issue 4, December 2010, pp. 273-282}}
