Leveraging Fine-Grained Wikipedia Categories for Entity Search

From Wikipedia Quality
Revision as of 06:23, 7 February 2021 by Serafina (talk | contribs) (wikilinks)

Leveraging Fine-Grained Wikipedia Categories for Entity Search is a scientific work related to Wikipedia quality, published in 2018 and written by Denghao Ma, Yueguo Chen, Kevin Chen Chuan Chang and Xiaoyong Du.

Overview

Ad-hoc entity search, the task of retrieving a ranked list of relevant entities in response to a natural-language question, has been widely studied. It has been shown that category matching of entities, especially matching against fine-grained entity categories, is critical to the performance of entity search. However, the potential of fine-grained Wikipedia categories has not been well exploited by existing studies. Based on the observation of how people describe entities of a specific type, the authors propose a headword-and-modifier model to deeply interpret both queries and fine-grained entity categories. Probabilistic generative models are designed to estimate the relevance of headwords and modifiers as a pattern-based matching problem, taking the Wikipedia type taxonomy as an important input to address the ad-hoc representations of concepts and entities in queries. Extensive experimental results on three widely used test sets (INEX-XER 2009, SemSearch-LS and TREC-Entity) show that the method achieves a significant improvement in entity search performance over state-of-the-art methods.
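
To make the headword-and-modifier idea concrete, the following is a minimal illustrative sketch and not the authors' probabilistic generative model. It assumes a simple heuristic in which the headword is the last word before the first preposition, and every other content word is a modifier. The function names (parse, match_score), the word lists and the smoothing constant are hypothetical choices made only for this example.

```python
import re

# Assumption: small hand-picked word lists, for illustration only.
PREPOSITIONS = {"of", "in", "from", "by", "for", "at", "on", "with"}
STOPWORDS = PREPOSITIONS | {"the", "a", "an", "and"}


def parse(phrase):
    """Split a phrase into (headword, set of modifiers).

    Heuristic (an assumption, not the paper's method): the headword is
    the last token before the first preposition, or the last token of
    the phrase when there is no preposition.
    """
    tokens = re.findall(r"[a-z0-9]+", phrase.lower())
    if not tokens:
        raise ValueError("empty phrase")
    # Index of the first preposition, or the phrase length if none.
    cut = next((i for i, t in enumerate(tokens) if t in PREPOSITIONS),
               len(tokens))
    if cut == 0:
        # Phrase starts with a preposition: fall back to the last token.
        cut = len(tokens)
    head_index = cut - 1
    headword = tokens[head_index]
    modifiers = {t for i, t in enumerate(tokens)
                 if i != head_index and t not in STOPWORDS}
    return headword, modifiers


def match_score(query, category, smoothing=0.1):
    """Toy relevance score in (0, 1] between a query and a category.

    The headword term and the modifier term are multiplied, and each
    is floored at `smoothing` so a single mismatch does not zero out
    the whole score.
    """
    q_head, q_mods = parse(query)
    c_head, c_mods = parse(category)
    head_term = 1.0 if q_head == c_head else smoothing
    if not q_mods:
        return head_term
    covered = len(q_mods & c_mods) / len(q_mods)
    mod_term = smoothing + (1 - smoothing) * covered
    return head_term * mod_term


if __name__ == "__main__":
    query = "Italian opera composers"
    for category in ("Italian opera composers",
                     "German composers",
                     "Italian painters"):
        print(category, match_score(query, category))
```

In this sketch a category that shares the headword but none of the modifiers still ranks above one that shares a modifier but has a different headword, which mirrors the intuition that the headword names the entity type. The sketch does no stemming and does not consult the Wikipedia type taxonomy, so plural and singular forms or related types would not match.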