Exploring the Use of Word Embeddings and Random Walks on Wikipedia for the Cogalex Shared Task

Exploring the Use of Word Embeddings and Random Walks on Wikipedia for the Cogalex Shared Task is a scientific work related to Wikipedia quality, published in 2014 and written by Josu Goikoetxea, Eneko Agirre and Aitor Soroa.

Overview

In participating in the task, the authors wanted to test three different kinds of relatedness algorithms: one based on embeddings induced from corpora, one based on random walks on WordNet, and one based on random walks on Wikipedia. All three perform similarly on noun relatedness datasets such as WordSim353, with results close to the highest reported values. Although the task definition gave examples of nouns, the training and test data were based on the Edinburgh Association Thesaurus, and around 50% of the target words were not nouns. The corpus-based algorithm performed much better than the other methods on the training dataset, and was therefore the one submitted for the test.
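
As a rough illustration of two of the ideas named above, the sketch below scores relatedness in two ways: as the cosine between word embeddings, and as the cosine between personalized PageRank vectors obtained by a random walk on a graph. It is not the authors' implementation. The toy vectors, the toy graph, and the function names (`cosine`, `personalized_pagerank`, `embedding_relatedness`, `walk_relatedness`) are all invented for the example, and comparing random-walk vectors by cosine is an assumption rather than something the summary states.

```python
import math

# Hypothetical 3-dimensional embeddings; real ones are induced from a corpus.
TOY_EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

# Hypothetical link graph standing in for WordNet or Wikipedia.
# Every neighbour must also appear as a key.
TOY_GRAPH = {
    "cat": ["animal", "pet"],
    "dog": ["animal", "pet"],
    "animal": ["cat", "dog"],
    "pet": ["cat", "dog"],
    "car": ["vehicle"],
    "vehicle": ["car"],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def personalized_pagerank(graph, seed, damping=0.85, iterations=50):
    """Power iteration for a random walk that restarts at `seed`."""
    nodes = sorted(graph)
    rank = {n: (1.0 if n == seed else 0.0) for n in nodes}
    for _ in range(iterations):
        # Restart mass always returns to the seed node.
        nxt = {n: ((1.0 - damping) if n == seed else 0.0) for n in nodes}
        for n in nodes:
            neighbours = graph[n]
            if not neighbours:
                # Dangling node: send its mass back to the seed.
                nxt[seed] += damping * rank[n]
                continue
            share = damping * rank[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        rank = nxt
    return [rank[n] for n in nodes]


def embedding_relatedness(w1, w2):
    return cosine(TOY_EMBEDDINGS[w1], TOY_EMBEDDINGS[w2])


def walk_relatedness(w1, w2):
    return cosine(
        personalized_pagerank(TOY_GRAPH, w1),
        personalized_pagerank(TOY_GRAPH, w2),
    )
```

With this toy data, both scores rank "cat" as more related to "dog" than to "car", which is the kind of ordering that noun relatedness datasets such as WordSim353 are used to evaluate.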
