Wikiwalk: Random Walks on Wikipedia for Semantic Relatedness

From Wikipedia Quality
Revision as of 23:21, 30 May 2019 by Sofia

Wikiwalk: Random Walks on Wikipedia for Semantic Relatedness is a scientific work related to Wikipedia quality, published in 2009 and written by Eric Yeh, Daniel Ramage, Christopher D. Manning, Eneko Agirre and Aitor Soroa.

Overview

Computing the semantic relatedness of natural language texts is a key component of tasks such as information retrieval and summarization, and often depends on knowledge of a broad range of real-world concepts and relationships. The authors address this knowledge integration problem by computing semantic relatedness with personalized PageRank (random walks) on a graph derived from Wikipedia. The paper evaluates methods for building the graph, including link selection strategies, and two methods for representing input texts as distributions over the graph nodes: one based on dictionary lookup, the other on Explicit Semantic Analysis. The authors evaluate these techniques on standard word relatedness and text similarity datasets and find that they capture similarity information complementary to existing Wikipedia-based relatedness measures, resulting in small improvements on a state-of-the-art measure.
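
The general idea of comparing two inputs through personalized PageRank can be illustrated with a minimal sketch. This is not the authors' implementation: the five-node toy graph, the damping factor of 0.85, the fixed iteration count, and the use of cosine similarity to compare the resulting distributions are all assumptions made for illustration. Each input is represented as a teleport (restart) distribution over graph nodes, a random walk with restarts is run from it, and the two stationary distributions are then compared.

```python
import math

def personalized_pagerank(graph, teleport, damping=0.85, iterations=50):
    """Power iteration for personalized PageRank.

    graph: dict mapping each node to a list of outgoing neighbours
           (every neighbour must also be a key of the dict).
    teleport: dict mapping nodes to restart probabilities summing to 1.
    damping and iterations are illustrative defaults, not values from the paper.
    """
    nodes = list(graph)
    rank = {n: teleport.get(n, 0.0) for n in nodes}
    for _ in range(iterations):
        # With probability (1 - damping) the walker restarts at the input's nodes.
        new_rank = {n: (1.0 - damping) * teleport.get(n, 0.0) for n in nodes}
        for node in nodes:
            out = graph[node]
            if out:
                # Otherwise it follows one outgoing link chosen uniformly.
                share = damping * rank[node] / len(out)
                for target in out:
                    new_rank[target] += share
            else:
                # Dangling node: send its mass back to the teleport distribution.
                for n in nodes:
                    new_rank[n] += damping * rank[node] * teleport.get(n, 0.0)
        rank = new_rank
    return rank

def cosine(p, q):
    """Cosine similarity between two distributions over the same node set."""
    dot = sum(p[n] * q[n] for n in p)
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0

# Hypothetical toy link graph standing in for a Wikipedia-derived graph.
toy_graph = {
    "Cat": ["Pet"],
    "Dog": ["Pet"],
    "Pet": ["Cat", "Dog"],
    "Car": ["Engine"],
    "Engine": ["Car"],
}

# Each input word is mapped to a single node here; a dictionary lookup or
# ESA-style mapping would generally spread the mass over several nodes.
rank_cat = personalized_pagerank(toy_graph, {"Cat": 1.0})
rank_dog = personalized_pagerank(toy_graph, {"Dog": 1.0})
rank_car = personalized_pagerank(toy_graph, {"Car": 1.0})

print("cat vs dog:", cosine(rank_cat, rank_dog))
print("cat vs car:", cosine(rank_cat, rank_car))
```

In this toy graph the walks started from "Cat" and "Dog" share the "Pet" neighbourhood, so their distributions overlap, while the walk started from "Car" stays in a disconnected component and has no overlap with either.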