Multi-Level Topical Text Categorization with Wikipedia



Multi-Level Topical Text Categorization with Wikipedia
Authors
Nan Guo
Yuan He
ChunGang Yan
Lu Liu
Cheng Wang
Publication date
2016
DOI
10.1145/2996890.3007856
Links
Original

Multi-Level Topical Text Categorization with Wikipedia is a scientific work related to Wikipedia quality, published in 2016 and written by Nan Guo, Yuan He, ChunGang Yan, Lu Liu and Cheng Wang.

Overview

This paper introduces an automatic categorical-marking model for text categorization. Traditional classification algorithms generally rely on a labeled training set, which requires substantial manual work to tag documents beforehand. In addition, because texts are often ambiguous and fuzzy, the results of traditional text categorization algorithms may be neither clear nor rich in content. The paper presents an unsupervised, training-set-free, hierarchical categorization model called Folk-Topical Text Categorization (FTTC). FTTC applies a topic model to abstract documents into topical words and makes use of Wikipedia's crowdsourcing and collective control to extend hierarchical classifications. The results are not restricted to predefined categories; they also contain categories abstracted to deeper semantic levels, which greatly facilitates traditional text categorization applications. For a given document, the topical words are obtained using a popular topic model, Latent Dirichlet Allocation (LDA). The topical words are then used to build and traverse the category trees of Wikipedia. After filtering, the final classifications reflect the diverse, content-rich information in the text and cover its different aspects. Experimental results on different kinds of datasets show that the model improves classification accuracy, flexibility and intelligibility compared with traditional models.
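The category-tree step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the topical words have already been produced by a topic model such as LDA, and it replaces Wikipedia's category graph with two small hypothetical dictionaries (`PAGE_CATEGORIES`, `PARENT_CATEGORIES`). The function names, the `max_levels` depth limit and the `min_support` filter are likewise assumptions made for illustration, since the abstract does not specify how the results are filtered.

```python
from collections import Counter

# Hypothetical toy data standing in for Wikipedia's category graph.
# In practice these mappings would come from a Wikipedia dump or API.
PAGE_CATEGORIES = {
    "neuron": ["Neuroscience"],
    "gradient": ["Machine learning", "Calculus"],
    "classifier": ["Machine learning"],
}
PARENT_CATEGORIES = {
    "Machine learning": ["Artificial intelligence"],
    "Neuroscience": ["Biology"],
    "Calculus": ["Mathematics"],
    "Artificial intelligence": ["Computer science"],
}


def trace_categories(topical_words, max_levels=2):
    """Walk up the category tree from each topical word.

    Returns {level: Counter(category -> number of topical words reaching it)},
    where level 0 holds the categories directly attached to a word.
    """
    levels = {}
    for word in topical_words:
        frontier = set(PAGE_CATEGORIES.get(word, []))
        seen = set()
        for level in range(max_levels + 1):
            frontier -= seen  # guard against cycles in the category graph
            if not frontier:
                break
            levels.setdefault(level, Counter()).update(frontier)
            seen |= frontier
            # Move one level up: collect the parents of the current categories.
            frontier = {
                parent
                for category in frontier
                for parent in PARENT_CATEGORIES.get(category, [])
            }
    return levels


def filter_categories(levels, min_support=2):
    """Keep categories reached by at least `min_support` topical words (assumed rule)."""
    return {
        level: sorted(c for c, n in counts.items() if n >= min_support)
        for level, counts in levels.items()
    }


if __name__ == "__main__":
    words = ["neuron", "gradient", "classifier"]  # assumed LDA output
    print(filter_categories(trace_categories(words)))
```

In this toy setup, each level of the result corresponds to one step up the category hierarchy, so a document receives labels at several levels of abstraction rather than a single predefined class.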

Embed

Wikipedia Quality

Guo, Nan; He, Yuan; Yan, ChunGang; Liu, Lu; Wang, Cheng. (2016). "[[Multi-Level Topical Text Categorization with Wikipedia]]". DOI: 10.1145/2996890.3007856.

English Wikipedia

{{cite journal |last1=Guo |first1=Nan |last2=He |first2=Yuan |last3=Yan |first3=ChunGang |last4=Liu |first4=Lu |last5=Wang |first5=Cheng |title=Multi-Level Topical Text Categorization with Wikipedia |date=2016 |doi=10.1145/2996890.3007856 |url=https://wikipediaquality.com/wiki/Multi-Level_Topical_Text_Categorization_with_Wikipedia}}

HTML

Guo, Nan; He, Yuan; Yan, ChunGang; Liu, Lu; Wang, Cheng. (2016). &quot;<a href="https://wikipediaquality.com/wiki/Multi-Level_Topical_Text_Categorization_with_Wikipedia">Multi-Level Topical Text Categorization with Wikipedia</a>&quot;. DOI: 10.1145/2996890.3007856.