Collective Context-Aware Topic Models for Entity Disambiguation

Authors: Prithviraj Sen
Publication date: 2012
ISBN: 978-145031229-5
DOI: 10.1145/2187836.2187935

Collective Context-Aware Topic Models for Entity Disambiguation is a scientific work related to Wikipedia quality, published in 2012 and written by Prithviraj Sen.

Overview

A crucial step in adding structure to unstructured data is to identify references to entities and disambiguate them. Such disambiguated references can help enhance readability and draw similarities across different pieces of running text in an automated fashion. Previous research has tackled this problem by first forming a catalog of entities from a knowledge base, such as Wikipedia, and then using this catalog to disambiguate references in unseen text. However, most of the previously proposed models either do not use all text in the knowledge base, potentially missing out on discriminative features, or do not exploit word-entity proximity to learn high-quality catalogs. In this work, the authors propose topic models that keep track of the context of every word in the knowledge base, so that words appearing within the same context as an entity are more likely to be associated with that entity. Their topic models thus use all text present in the knowledge base and help learn high-quality catalogs. The models also learn groups of co-occurring entities, which enables collective disambiguation. Unlike most previous topic models, they are non-parametric and do not require the user to specify the exact number of groups present in the knowledge base. In experiments on an extract of Wikipedia containing almost 60,000 references, the models outperform SVM-based baselines by as much as 18% in disambiguation accuracy, which translates to almost 11,000 additional correctly disambiguated references.
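The idea that words near an entity reference are more likely to be associated with that entity can be illustrated with a small count-based sketch. This is not the non-parametric topic model described in the work; it is a simplified stand-in that builds a word-entity catalog from a context window and scores candidates with smoothed log-probabilities. The function names (`build_catalog`, `disambiguate`), the window size, the smoothing constant `alpha`, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def build_catalog(annotated_docs, window=3):
    """Count words that occur within `window` tokens of each entity mention.

    Each document is a list of tokens. A plain word is a str; an entity
    mention is a (surface_form, entity_id) tuple. (Hypothetical format.)
    """
    catalog = defaultdict(Counter)
    for tokens in annotated_docs:
        for i, tok in enumerate(tokens):
            if not isinstance(tok, tuple):
                continue
            _, entity = tok
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                neighbor = tokens[j]
                if j != i and isinstance(neighbor, str):
                    catalog[entity][neighbor.lower()] += 1
    return catalog

def disambiguate(context_words, candidates, catalog, alpha=1.0):
    """Return the candidate whose catalog best explains the context words.

    Uses add-alpha smoothed log-likelihood; ties go to the first candidate.
    """
    vocab = {w for counts in catalog.values() for w in counts}
    vocab_size = max(len(vocab), 1)
    best, best_score = None, -math.inf
    for entity in candidates:
        counts = catalog.get(entity, Counter())
        total = sum(counts.values()) + alpha * vocab_size
        score = sum(
            math.log((counts[w.lower()] + alpha) / total) for w in context_words
        )
        if score > best_score:
            best, best_score = entity, score
    return best

# Toy knowledge base with two senses of the surface form "Jaguar".
docs = [
    ["the", ("Jaguar", "Jaguar_(animal)"), "hunts", "prey", "in", "the", "rainforest"],
    ["the", ("Jaguar", "Jaguar_Cars"), "sedan", "has", "a", "powerful", "engine"],
]
catalog = build_catalog(docs, window=3)
candidates = ["Jaguar_(animal)", "Jaguar_Cars"]
car_sense = disambiguate(["fast", "sedan"], candidates, catalog)
animal_sense = disambiguate(["prey", "hunts"], candidates, catalog)
```

In this toy setup the context word "sedan" falls inside the window of the car mention, so it raises the score of `Jaguar_Cars`, while "prey" and "hunts" favor `Jaguar_(animal)`. The models in the work go further by using all text in the knowledge base and by grouping co-occurring entities for collective disambiguation, neither of which this sketch attempts.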

Embed

Wikipedia Quality

Sen, Prithviraj. (2012). "[[Collective Context-Aware Topic Models for Entity Disambiguation]]". Advances in Information Sciences and Service Sciences Volume 4, Issue 12, June 2012, pp. 140-151. ISBN: 978-145031229-5. DOI: 10.1145/2187836.2187935.

English Wikipedia

{{cite journal |last1=Sen |first1=Prithviraj |title=Collective Context-Aware Topic Models for Entity Disambiguation |date=2012 |isbn=978-145031229-5 |doi=10.1145/2187836.2187935 |url=https://wikipediaquality.com/wiki/Collective_Context-Aware_Topic_Models_for_Entity_Disambiguation |journal=Advances in Information Sciences and Service Sciences Volume 4, Issue 12, June 2012, pp. 140-151}}

HTML

Sen, Prithviraj. (2012). &quot;<a href="https://wikipediaquality.com/wiki/Collective_Context-Aware_Topic_Models_for_Entity_Disambiguation">Collective Context-Aware Topic Models for Entity Disambiguation</a>&quot;. Advances in Information Sciences and Service Sciences Volume 4, Issue 12, June 2012, pp. 140-151. ISBN: 978-145031229-5. DOI: 10.1145/2187836.2187935.