Mining Wikipedia Knowledge to Improve Document Indexing and Classification

Authors
Ramesh Kumar Ayyasamy
Bashar Tahayna
Saadat M. Alhashmi
Siew Eu-Gene
Simon J. Egerton
Publication date
2010
ISBN
978-142447167-6
DOI
10.1109/ISSPA.2010.5605508

Mining Wikipedia Knowledge to Improve Document Indexing and Classification is a scientific work about Wikipedia quality, published in 2010 and written by Ramesh Kumar Ayyasamy, Bashar Tahayna, Saadat M. Alhashmi, Siew Eu-Gene and Simon J. Egerton.

Overview

Weblogs are an important source of information, and automatic techniques are needed to categorize them into "topic-based" content to facilitate future browsing and retrieval. In this paper, the authors propose a new tf.idf measure and illustrate its effectiveness. The proposed Conf.idf and Catf.idf measures are based solely on the terms-to-concepts-to-categories (TCONCAT) mapping method, which uses Wikipedia. Wikipedia serves as the knowledge base: a large-scale Web encyclopaedia with a huge number of high-quality articles and categorical indexes. Using this knowledge base, the proposed framework solves the weblog classification problem in two stages. The first stage finds the terms that belong to a unique concept (article) and disambiguates the terms that belong to more than one concept. The second stage determines the categories to which the identified concepts belong. Experimental results confirm that the proposed system can efficiently distinguish weblogs that belong to more than one category, and that it performs better than traditional statistical Natural Language Processing (NLP) approaches.
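
The two-stage mapping described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the overview does not give the exact Conf.idf and Catf.idf formulas, so the sketch assumes they follow the usual tf.idf shape with concept or category counts in place of term counts. The dictionaries TERM_TO_CONCEPTS and CONCEPT_TO_CATEGORIES are hypothetical toy data standing in for Wikipedia articles and category indexes, and the disambiguation rule (pick the candidate concept whose categories overlap most with the document's unambiguous concepts) is likewise an assumption made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy mappings (assumption): in the paper these would be
# derived from Wikipedia articles (concepts) and category indexes.
TERM_TO_CONCEPTS = {
    "python": ["Python (programming language)", "Pythonidae"],  # ambiguous
    "java": ["Java (programming language)"],
    "compiler": ["Compiler"],
    "snake": ["Snake"],
}
CONCEPT_TO_CATEGORIES = {
    "Python (programming language)": ["Programming languages"],
    "Java (programming language)": ["Programming languages"],
    "Compiler": ["Compilers"],
    "Pythonidae": ["Snakes"],
    "Snake": ["Snakes"],
}


def map_terms_to_concepts(tokens):
    """Stage 1: map terms to concepts, disambiguating ambiguous terms.

    Assumed heuristic: score each candidate concept by how often its
    categories are supported by the document's unambiguous terms.
    """
    context = Counter()
    for term in tokens:
        candidates = TERM_TO_CONCEPTS.get(term, [])
        if len(candidates) == 1:
            context.update(CONCEPT_TO_CATEGORIES.get(candidates[0], []))

    concepts = []
    for term in tokens:
        candidates = TERM_TO_CONCEPTS.get(term, [])
        if not candidates:
            continue  # term has no known concept
        # max() keeps the first candidate on ties.
        best = max(
            candidates,
            key=lambda c: sum(context[cat] for cat in CONCEPT_TO_CATEGORIES.get(c, [])),
        )
        concepts.append(best)
    return concepts


def map_concepts_to_categories(concepts):
    """Stage 2: collect the categories of the concepts that were found."""
    return [cat for c in concepts for cat in CONCEPT_TO_CATEGORIES.get(c, [])]


def frequency_idf(docs_items):
    """tf.idf-shaped weights over any items (concepts or categories)."""
    n_docs = len(docs_items)
    doc_freq = Counter()
    for items in docs_items:
        doc_freq.update(set(items))

    weights = []
    for items in docs_items:
        counts = Counter(items)
        total = sum(counts.values())
        weights.append(
            {item: (c / total) * math.log(n_docs / doc_freq[item]) for item, c in counts.items()}
        )
    return weights


# Two toy "weblogs", already tokenized and lowercased.
docs = [["python", "compiler", "java"], ["python", "snake"]]
doc_concepts = [map_terms_to_concepts(d) for d in docs]
doc_categories = [map_concepts_to_categories(c) for c in doc_concepts]
conf_idf = frequency_idf(doc_concepts)    # concept-level weights
catf_idf = frequency_idf(doc_categories)  # category-level weights
```

In this toy run the ambiguous term "python" resolves to a different concept in each document, and a document can carry non-zero weights for several categories at once, which is the multi-category behaviour the overview refers to.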

Embed

Wikipedia Quality

Ayyasamy, Ramesh Kumar; Tahayna, Bashar; Alhashmi, Saadat M.; Eu-Gene, Siew; Egerton, Simon J. (2010). "[[Mining Wikipedia Knowledge to Improve Document Indexing and Classification]]". Proceedings of the ACM SIGMOD International Conference on Management of Data 2010, Article number 4. ISBN: 978-142447167-6. DOI: 10.1109/ISSPA.2010.5605508.

English Wikipedia

{{cite journal |last1=Ayyasamy |first1=Ramesh Kumar |last2=Tahayna |first2=Bashar |last3=Alhashmi |first3=Saadat M. |last4=Eu-Gene |first4=Siew |last5=Egerton |first5=Simon J. |title=Mining Wikipedia Knowledge to Improve Document Indexing and Classification |date=2010 |isbn=978-142447167-6 |doi=10.1109/ISSPA.2010.5605508 |url=https://wikipediaquality.com/wiki/Mining_Wikipedia_Knowledge_to_Improve_Document_Indexing_and_Classification |journal=Proceedings of the ACM SIGMOD International Conference on Management of Data 2010, Article number 4}}

HTML

Ayyasamy, Ramesh Kumar; Tahayna, Bashar; Alhashmi, Saadat M.; Eu-Gene, Siew; Egerton, Simon J. (2010). &quot;<a href="https://wikipediaquality.com/wiki/Mining_Wikipedia_Knowledge_to_Improve_Document_Indexing_and_Classification">Mining Wikipedia Knowledge to Improve Document Indexing and Classification</a>&quot;. Proceedings of the ACM SIGMOD International Conference on Management of Data 2010, Article number 4. ISBN: 978-142447167-6. DOI: 10.1109/ISSPA.2010.5605508.