Using Wikipedia Categories for Discovering the Themes of Text Documents



Authors: Abdullah Bawakid
Publication date: 2015
DOI: 10.1109/IHMSC.2015.68

Using Wikipedia Categories for Discovering the Themes of Text Documents is a scientific work related to Wikipedia quality, published in 2015 and written by Abdullah Bawakid.

Overview

This paper describes a new unsupervised approach for identifying the main themes of any text document with the aid of Wikipedia. In contrast to other approaches, the proposed algorithm relies on only two aspects of Wikipedia: its article titles and its category structure. The body text of Wikipedia articles is not used by the algorithm. The author describes how to build a Term-Categories vector that defines how strongly a term is associated with a Wikipedia concept. The author also explains how this vector is employed when processing a text document to discover its main themes. The performance of the method is reported by attempting to predict the most representative categories for a subset of Wikipedia articles.
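
The general idea of a Term-Categories vector can be illustrated with a minimal sketch. The summary above does not give the paper's weighting scheme, so everything below is an assumption made for illustration: the toy `TITLE_CATEGORIES` and `CATEGORY_PARENTS` data, the `DECAY` factor applied per step up the category structure, and the function names `build_term_categories` and `discover_themes` are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter, defaultdict

# Toy stand-ins for Wikipedia data (hypothetical, not taken from a real dump).
TITLE_CATEGORIES = {
    "Neural network": ["Machine learning"],
    "Decision tree": ["Machine learning"],
    "Binary tree": ["Data structures"],
    "Hash table": ["Data structures"],
}
CATEGORY_PARENTS = {
    "Machine learning": ["Computer science"],
    "Data structures": ["Computer science"],
}
DECAY = 0.5  # assumed weight reduction for each step up the category structure


def tokenize(text):
    """Lowercase the text and split it into alphabetic terms."""
    return re.findall(r"[a-z]+", text.lower())


def build_term_categories(title_categories, category_parents, decay=DECAY):
    """Map each title term to a weighted vector of categories.

    Only article titles and the category structure are used; article
    bodies are never read. The weighting is an illustrative assumption.
    """
    vectors = defaultdict(Counter)
    for title, categories in title_categories.items():
        for term in tokenize(title):
            for category in categories:
                weight, frontier, seen = 1.0, [category], set()
                while frontier:
                    next_frontier = []
                    for cat in frontier:
                        if cat in seen:
                            continue
                        seen.add(cat)
                        vectors[term][cat] += weight
                        next_frontier.extend(category_parents.get(cat, []))
                    frontier = next_frontier
                    weight *= decay  # ancestors contribute less
    return vectors


def discover_themes(text, vectors, top_n=3):
    """Sum the category vectors of the document's terms and rank categories."""
    scores = Counter()
    for term in tokenize(text):
        scores.update(vectors.get(term, {}))
    return [category for category, _ in scores.most_common(top_n)]


vectors = build_term_categories(TITLE_CATEGORIES, CATEGORY_PARENTS)
document = "A decision tree and a neural network are trained on examples."
print(discover_themes(document, vectors))
```

In this sketch, an ambiguous term such as "tree" is associated with several categories at once, and the other terms in the document tip the ranking toward one of them. A realistic version would need to handle multi-word titles, stop words, and cycles in the real category graph, none of which are addressed here.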

Embed

Wikipedia Quality

Bawakid, Abdullah. (2015). "[[Using Wikipedia Categories for Discovering the Themes of Text Documents]]". IEEE Computer Society. DOI: 10.1109/IHMSC.2015.68.

English Wikipedia

{{cite journal |last1=Bawakid |first1=Abdullah |title=Using Wikipedia Categories for Discovering the Themes of Text Documents |date=2015 |doi=10.1109/IHMSC.2015.68 |url=https://wikipediaquality.com/wiki/Using_Wikipedia_Categories_for_Discovering_the_Themes_of_Text_Documents |journal=IEEE Computer Society}}

HTML

Bawakid, Abdullah. (2015). &quot;<a href="https://wikipediaquality.com/wiki/Using_Wikipedia_Categories_for_Discovering_the_Themes_of_Text_Documents">Using Wikipedia Categories for Discovering the Themes of Text Documents</a>&quot;. IEEE Computer Society. DOI: 10.1109/IHMSC.2015.68.