Probabilistic Explicit Topic Modeling Using Wikipedia

Authors
Joshua Hansen
Eric K. Ringger
Kevin D. Seppi
Publication date
2013
DOI
10.1007/978-3-642-40722-2_7

Probabilistic Explicit Topic Modeling Using Wikipedia is a scientific work related to Wikipedia quality, published in 2013 and written by Joshua Hansen, Eric K. Ringger and Kevin D. Seppi.

Overview

Despite the popularity of Latent Dirichlet Allocation (LDA) for automatic discovery of latent topics in document corpora, such topics lack connections with relevant knowledge sources such as Wikipedia, and they can be difficult to interpret because they lack meaningful topic labels. Furthermore, the topic analysis suffers from a lack of identifiability between topics, not only across independently analyzed corpora but also across distinct runs of the algorithm on the same corpus. This paper introduces two methods for probabilistic explicit topic modeling that address these issues: Latent Dirichlet Allocation with Static Topic-Word Distributions (LDA-STWD) and Explicit Dirichlet Allocation (EDA). Both methods estimate topic-word distributions a priori from Wikipedia articles, with each article corresponding to one topic and the article title serving as the topic label. LDA-STWD and EDA thereby overcome the nonidentifiability, isolation, and uninterpretability of LDA output. The authors assess the methods' effectiveness by means of crowd-sourced user studies on two tasks: topic label generation and document label generation. They find that LDA-STWD improves substantially upon the state of the art on the document labeling task, and that both methods otherwise perform on par with a state-of-the-art post hoc method.
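
The idea of fixing topic-word distributions in advance, with one topic per article and the article title as its label, can be illustrated with a small sketch. This is not the authors' implementation: the two toy "articles", the function names `topic_word_distributions` and `infer_topic_mixture`, the additive smoothing constant, and the simple EM update used for inference are all assumptions made for illustration, and the paper's own inference procedure may differ.

```python
from collections import Counter


def topic_word_distributions(articles, smoothing=0.01):
    """Estimate one fixed word distribution per article (one article = one topic).

    `articles` maps an article title (used as the topic label) to its text.
    Additive smoothing is an assumption of this sketch.
    """
    counts = {title: Counter(text.lower().split()) for title, text in articles.items()}
    vocab = sorted({word for c in counts.values() for word in c})
    phi = {}
    for title, c in counts.items():
        total = sum(c.values()) + smoothing * len(vocab)
        phi[title] = {word: (c[word] + smoothing) / total for word in vocab}
    return phi, vocab


def infer_topic_mixture(document, phi, iterations=50):
    """Estimate a document's topic proportions by EM, keeping phi fixed."""
    topics = list(phi)
    known = phi[topics[0]]  # every topic shares the same vocabulary keys
    words = [w for w in document.lower().split() if w in known]
    theta = {t: 1.0 / len(topics) for t in topics}
    if not words:
        return theta  # no in-vocabulary words: fall back to a uniform mixture
    for _ in range(iterations):
        expected = {t: 0.0 for t in topics}
        for w in words:
            # E-step: responsibility of each topic for this word occurrence
            weights = {t: theta[t] * phi[t][w] for t in topics}
            norm = sum(weights.values())
            for t in topics:
                expected[t] += weights[t] / norm
        # M-step: renormalize expected counts into topic proportions
        theta = {t: expected[t] / len(words) for t in topics}
    return theta


# Hypothetical toy stand-ins for Wikipedia articles (title -> text).
articles = {
    "Astronomy": "star planet galaxy telescope orbit star planet",
    "Cooking": "recipe oven flour sugar bake recipe oven",
}
phi, vocab = topic_word_distributions(articles)
theta = infer_topic_mixture("the telescope tracked a planet in orbit around a star", phi)
label = max(theta, key=theta.get)  # the article title doubles as the document label
print(label, theta)
```

Because the topic-word distributions are computed once from the articles and never updated, the same topic keeps the same label across corpora and across runs, which is the identifiability property the overview describes.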

Embed

Wikipedia Quality

Hansen, Joshua; Ringger, Eric K.; Seppi, Kevin D. (2013). "[[Probabilistic Explicit Topic Modeling Using Wikipedia]]". Springer, Berlin, Heidelberg. DOI: 10.1007/978-3-642-40722-2_7.

English Wikipedia

{{cite journal |last1=Hansen |first1=Joshua |last2=Ringger |first2=Eric K. |last3=Seppi |first3=Kevin D. |title=Probabilistic Explicit Topic Modeling Using Wikipedia |date=2013 |doi=10.1007/978-3-642-40722-2_7 |url=https://wikipediaquality.com/wiki/Probabilistic_Explicit_Topic_Modeling_Using_Wikipedia |journal=Springer, Berlin, Heidelberg}}

HTML

Hansen, Joshua; Ringger, Eric K.; Seppi, Kevin D. (2013). &quot;<a href="https://wikipediaquality.com/wiki/Probabilistic_Explicit_Topic_Modeling_Using_Wikipedia">Probabilistic Explicit Topic Modeling Using Wikipedia</a>&quot;. Springer, Berlin, Heidelberg. DOI: 10.1007/978-3-642-40722-2_7.