Unsupervised Query Segmentation Using Generative Language Models and Wikipedia
Authors | Bin Tan, Fuchun Peng |
---|---|
Publication date | 2008 |
DOI | 10.1145/1367497.1367545 |
Links | Original Preprint |
Unsupervised Query Segmentation Using Generative Language Models and Wikipedia is a scientific work related to Wikipedia quality, published in 2008 and written by Bin Tan and Fuchun Peng.
Overview
In this paper, the authors propose a novel unsupervised approach to query segmentation, an important task in Web search. They use a generative query model to recover the underlying concepts that compose a query's original segmented form. The model's parameters are estimated with an expectation-maximization (EM) algorithm that optimizes a minimum description length objective on a partial corpus specific to the query. To augment this unsupervised learning, the authors incorporate evidence from Wikipedia. Experiments show that the approach dramatically improves performance over the traditional approach based on mutual information and produces results comparable to those of a supervised method. In particular, measured by segment F1 on the Intersection test set, the basic generative language model contributes a 7.4% improvement over the mutual-information-based method. EM optimization improves performance by a further 14.3%, and additional knowledge from Wikipedia provides another 24.3%, adding up to a total improvement of 46% (from 0.530 to 0.774).
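As a rough illustration of the generative view described above, the sketch below treats a query as a sequence of independently generated concepts and picks the segmentation with the highest total probability using dynamic programming. It is not the authors' implementation: the concept probabilities are invented for the example, and the EM estimation, the minimum description length objective, and the Wikipedia evidence are all omitted. The names `CONCEPT_PROB`, `segment`, and `max_len` are hypothetical.

```python
import math

# Hypothetical concept probabilities, invented purely for illustration.
# In the paper these would be estimated from data; here they are assumptions.
CONCEPT_PROB = {
    "new": 0.02,
    "york": 0.005,
    "times": 0.01,
    "subscription": 0.004,
    "new york": 0.008,
    "york times": 0.0005,
    "new york times": 0.006,
}


def segment(query, concept_prob, max_len=3):
    """Return the most probable segmentation of `query` into known concepts.

    Assumes each concept is generated independently, so the score of a
    segmentation is the sum of the log-probabilities of its segments.
    Returns an empty list if no segmentation into known concepts exists.
    """
    words = query.split()
    n = len(words)
    # best[i] holds (log-probability, segments) for the first i words.
    best = [(-math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        # Try every candidate last segment of up to max_len words.
        for start in range(max(0, end - max_len), end):
            concept = " ".join(words[start:end])
            prob = concept_prob.get(concept)
            if prob is None:
                continue  # unknown concept: skip this candidate
            score = best[start][0] + math.log(prob)
            if score > best[end][0]:
                best[end] = (score, best[start][1] + [concept])
    return best[n][1]


print(segment("new york times subscription", CONCEPT_PROB))
# → ['new york times', 'subscription']
```

With these made-up numbers, keeping "new york times" together scores higher than any split of it, because the single three-word concept is more probable than the product of its parts.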