Wikilda: Towards More Effective Knowledge Acquisition in Topic Models Using Wikipedia

From Wikipedia Quality
'''Wikilda: Towards More Effective Knowledge Acquisition in Topic Models Using Wikipedia''' - scientific work related to [[Wikipedia quality]] published in 2017, written by [[Swapnil Hingmire]], [[Sutanu Chakraborti]], [[Girish Keshav Palshikar]] and [[Abhay Sodani]].

== Overview ==
Towards the goal of enhancing the interpretability of Latent Dirichlet Allocation (LDA) topics, the authors propose WikiLDA, an enhancement to LDA using [[Wikipedia]] concepts. In WikiLDA, the authors first "sprinkle" (append) onto each document in a corpus its most relevant Wikipedia concepts. They then use a Generalized Pólya Urn (GPU) model to incorporate word-word, word-concept, and concept-concept semantic [[relatedness]] into the generative process of LDA. Since the most probable concepts in the inferred topics can be looked up on Wikipedia, the topics are likely to become more interpretable and hence more usable for acquiring domain knowledge from humans for various text mining tasks (e.g. eliciting topic labels for text classification). Empirical results show that projecting documents with WikiLDA into a semantically enriched and coherent topic space improves performance on text-classification-like tasks, especially in domains where the classes are hard to separate.
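The "sprinkling" step above can be illustrated with a minimal sketch: score each candidate Wikipedia concept against a document and append the top-scoring concept titles as extra pseudo-tokens before running LDA. The cosine-over-bag-of-words scoring and the toy `concepts` mapping below are illustrative assumptions, not the paper's actual relevance measure.

```python
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def sprinkle(doc: str, concepts: dict, k: int = 2) -> list:
    """Append the k concepts most similar to the document.

    `concepts` maps a concept title to representative text (e.g. the
    lead section of its Wikipedia article) -- a hypothetical stand-in
    for however concept relevance is scored in practice.
    """
    doc_vec = Counter(doc.lower().split())
    ranked = sorted(
        concepts,
        key=lambda c: cosine(doc_vec, Counter(concepts[c].lower().split())),
        reverse=True,
    )
    # Concept titles become extra tokens appended to the document.
    return doc.lower().split() + ranked[:k]

concepts = {
    "Topic model": "statistical model discovering abstract topics in documents",
    "Cricket": "bat and ball game played between two teams",
}
tokens = sprinkle("documents exhibit abstract topics", concepts, k=1)
print(tokens[-1])  # most relevant concept title appended last
```

The enriched token lists would then feed a GPU-augmented LDA sampler, which additionally promotes semantically related words and concepts within the same topic.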

Revision as of 12:49, 8 May 2020