Studying the Wikipedia Hyperlink Graph for Relatedness and Disambiguation

From Wikipedia Quality
Revision as of 15:38, 7 December 2019 by Maria


Authors: Eneko Agirre, Ander Barrena, Aitor Soroa
Publication date: 2015
Links: Original Preprint

Studying the Wikipedia Hyperlink Graph for Relatedness and Disambiguation is a scientific work related to Wikipedia quality, published in 2015 and written by Eneko Agirre, Ander Barrena and Aitor Soroa.

Overview

Hyperlinks and other relations in Wikipedia are an extraordinary resource that is still not fully understood. In this paper, the authors study the different types of links in Wikipedia and contrast the use of the full graph with the use of direct links only. They apply a well-known random walk algorithm to two tasks: word relatedness and named-entity disambiguation. They show that using the full graph is more effective than using just direct links by a large margin, that non-reciprocal links harm performance, and that categories and infoboxes bring no benefit, with consistent results on both tasks. They set new state-of-the-art figures for systems based on Wikipedia links, comparable to systems that exploit several information sources and/or supervised machine learning. The authors' approach is open source, comes with instructions to reproduce the results, and is amenable to integration with complementary text-based methods.
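
The ideas in the overview can be illustrated with a small sketch. It is not the authors' open-source implementation: it assumes personalized PageRank as the random walk algorithm (the overview does not name one), uses a hypothetical toy link graph in place of Wikipedia, and all function names and parameter values (damping of 0.85, 50 iterations) are illustrative choices. The sketch drops non-reciprocal links, runs a random walk seeded at one article, and scores the relatedness of two articles as the cosine similarity of their walk distributions.

```python
import math

# Hypothetical toy link graph: article -> set of articles it links to.
TOY_LINKS = {
    "Lion": {"Cat", "Africa"},
    "Cat": {"Lion", "Tiger"},
    "Tiger": {"Cat", "Asia"},
    "Africa": {"Lion"},
    "Asia": {"Tiger"},
    "Car": {"Engine", "Lion"},  # Car -> Lion is not reciprocated
    "Engine": {"Car"},
}


def reciprocal_only(links):
    """Keep a link a -> b only if b -> a is also present."""
    return {
        a: {b for b in targets if a in links.get(b, set())}
        for a, targets in links.items()
    }


def personalized_pagerank(links, seed, damping=0.85, iterations=50):
    """Power iteration for a random walk that teleports back to `seed`."""
    nodes = sorted(set(links) | {t for ts in links.values() for t in ts})
    rank = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for n in nodes:
            out = links.get(n, set())
            if out:
                # Follow an outgoing link with probability `damping`.
                share = damping * rank[n] / len(out)
                for t in out:
                    nxt[t] += share
            else:
                # Dangling node: send its walking mass back to the seed.
                nxt[seed] += damping * rank[n]
            # Teleport back to the seed with probability 1 - damping.
            nxt[seed] += (1.0 - damping) * rank[n]
        rank = nxt
    return rank


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def relatedness(links, a, b):
    """Compare the walk distributions seeded at articles a and b."""
    return cosine(personalized_pagerank(links, a),
                  personalized_pagerank(links, b))


if __name__ == "__main__":
    graph = reciprocal_only(TOY_LINKS)
    print("Lion vs Tiger:", relatedness(graph, "Lion", "Tiger"))
    print("Lion vs Car:", relatedness(graph, "Lion", "Car"))
```

In this toy graph, "Lion" and "Tiger" share no direct link, yet the walk reaches both through "Cat", which is the kind of signal a direct-links-only comparison would miss. Filtering to reciprocal links removes the one-way "Car" to "Lion" edge, so those two articles end up in separate components.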