An Exploration of Learning to Link with Wikipedia: Features, Methods and Training Collection

An Exploration of Learning to Link with Wikipedia: Features, Methods and Training Collection is a scientific work related to Wikipedia quality, published in 2009 and written by Jiyin He and Maarten de Rijke.

Overview

The authors describe their participation in the Link-the-Wiki track at INEX 2009. They apply machine learning methods to the anchor-to-best-entry-point task and explore the impact of three aspects of their approach: the features, the learning methods, and the collection used to train the models. They find that a learning-to-rank approach and a binary classification approach do not differ much. The new Wikipedia collection, which is larger and contains more links than the collection used previously, provides better training material for the models. In addition, a heuristic run that combines the two intuitively most useful features outperforms the machine-learning-based runs, which suggests that further analysis and selection of features is necessary.
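
To make the comparison between a heuristic run and a learned run more concrete, the following is a minimal Python sketch of how candidate targets for an anchor might be scored and the best one selected. It is an illustration only, not the authors' implementation: the summary does not name the two features, so `feature_a` and `feature_b` are hypothetical placeholders, the product used in `heuristic_score` is an assumed way of combining them, and `make_linear_score` is a stand-in for any learned model (a classifier or a ranker) that outputs one score per candidate.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass
class Candidate:
    anchor: str       # anchor text found in the source article
    target: str       # candidate target article for that anchor
    feature_a: float  # hypothetical feature (not named in the summary)
    feature_b: float  # hypothetical feature (not named in the summary)


Scorer = Callable[[Candidate], float]


def heuristic_score(c: Candidate) -> float:
    # Assumed combination: the product of the two placeholder features.
    return c.feature_a * c.feature_b


def make_linear_score(w_a: float, w_b: float) -> Scorer:
    # Stand-in for a learned model: a weighted sum with given weights.
    return lambda c: w_a * c.feature_a + w_b * c.feature_b


def best_entry_points(
    candidates: Iterable[Candidate],
    score: Scorer = heuristic_score,
) -> Dict[str, str]:
    # For each anchor, keep the target whose candidate scores highest.
    best: Dict[str, Candidate] = {}
    for c in candidates:
        current = best.get(c.anchor)
        if current is None or score(c) > score(current):
            best[c.anchor] = c
    return {anchor: c.target for anchor, c in best.items()}
```

Because the selection step only needs one score per candidate, swapping the heuristic for a learned scorer changes a single argument, which is one way to compare such runs on equal footing.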