A Wikipedia-LDA Model for Entity Linking with Batch Size Changing Instance Selection

Authors: Wei Zhang, Jian Su, Chew Lim Tan
Publication date: 2011

A Wikipedia-LDA Model for Entity Linking with Batch Size Changing Instance Selection is a scientific work related to Wikipedia quality, published in 2011 and written by Wei Zhang, Jian Su and Chew Lim Tan.

Overview

Entity linking maps name mentions in context to entries in a knowledge base by resolving name variations and ambiguities. In this paper, the authors propose two advancements for entity linking. First, a Wikipedia-LDA method models contexts as probability distributions over Wikipedia categories, which allows context similarity to be measured in a semantic space instead of the literal term space used by other studies for disambiguation. Second, to automate training instance annotation without compromising accuracy, an instance selection strategy selects an informative, representative and diverse subset from an auto-generated dataset. During the iterative selection process, the batch size at each iteration changes according to the variance of the classifier's confidence or accuracy between consecutive batches, which not only makes the selection insensitive to the initial batch size but also leads to better performance. Each of the two advancements individually gives a significant improvement to entity linking. Together they lead to the highest performance on the KBP-10 task. Being a generic approach, the batch size changing method can also benefit active learning for other tasks.
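
The two ideas in the overview can be illustrated with a minimal Python sketch. It is not the authors' implementation. The summary does not say which similarity measure or which batch-size update rule the paper uses, so cosine similarity, the function names (cosine_similarity, next_batch_size), the parameters (threshold, factor, min_size, max_size) and the example category weights are all assumptions made for illustration only.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two sparse category distributions.

    p and q map Wikipedia category names to probabilities.
    Cosine is an assumed measure; the source does not name one.
    """
    dot = sum(weight * q.get(category, 0.0) for category, weight in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


def next_batch_size(current_size, prev_score, curr_score,
                    threshold=0.01, factor=2, min_size=1, max_size=1000):
    """Hypothetical batch-size update rule (an assumption, not the paper's).

    If the classifier score (confidence or accuracy) barely changed between
    two consecutive batches, grow the batch; otherwise shrink it.
    The result is clamped to the range [min_size, max_size].
    """
    change = abs(curr_score - prev_score)
    if change < threshold:
        new_size = current_size * factor
    else:
        new_size = current_size // factor
    return max(min_size, min(max_size, new_size))


# Made-up category distributions for one mention context and two candidates.
mention_context = {"Basketball players": 0.7, "American businesspeople": 0.3}
candidate_a = {"Basketball players": 0.8, "NBA": 0.2}
candidate_b = {"Machine learning researchers": 0.9, "American businesspeople": 0.1}

# Rank candidates by similarity in category space rather than by shared terms.
ranked = sorted(
    [("candidate_a", candidate_a), ("candidate_b", candidate_b)],
    key=lambda item: cosine_similarity(mention_context, item[1]),
    reverse=True,
)
print([name for name, _ in ranked])

# Adjust the batch size after observing scores on two consecutive batches.
print(next_batch_size(10, prev_score=0.80, curr_score=0.85))
```

In this sketch the candidate whose category distribution overlaps most with the mention context ranks first, even when the two contexts share few literal terms. The update rule shows only one plausible way to tie the batch size to the change in classifier score between batches. The direction of the adjustment and the threshold value are guesses and would need to be checked against the original paper.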