Automatic Acquisition of Controlled Vocabularies from Wikipedia Using Wikilinks, Word Ranking, and a Dependency Parser

Authors	Ruben Dorado, Audrey Bramy, Camilo Mejía-Moncayo, Alix E. Rojas
Publication date	2017
DOI	10.1007/978-3-319-66562-7_3

Automatic Acquisition of Controlled Vocabularies from Wikipedia Using Wikilinks, Word Ranking, and a Dependency Parser is a scientific work related to Wikipedia quality, published in 2017 and written by Ruben Dorado, Audrey Bramy, Camilo Mejía-Moncayo and Alix E. Rojas.

Overview

Controlled vocabularies are important resources used in several tasks such as machine translation, text summarization, and text analysis. However, developing such resources is expensive and time-consuming. Wikipedia, a free collaborative encyclopedia, contains plenty of semi-structured information that an automatic process can use to create new resources. This paper proposes a method to extract semantic information from Wikipedia in the form of a controlled vocabulary. The method combines keywords obtained for a specific Wikipedia article using three different strategies: Wikipedia annotations called wikilinks, a ranking measure that obtains keywords from text, and a dependency parser. To evaluate the model, the authors analyzed the coverage and performance of the acquired vocabulary using WordNet as a gold standard.
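As a rough illustration of two of the strategies named in the overview, the sketch below collects wikilink targets from raw wikitext and ranks the remaining words. It is not the authors' implementation. The ranking measure here is plain term frequency, chosen only as a stand-in because the overview does not name the measure used. The dependency-parser strategy is omitted, and the coverage function (the share of acquired terms found in a reference set) is an assumed definition, with a small Python set standing in for WordNet. All function names and the stopword list are hypothetical.

```python
import re
from collections import Counter

# Matches [[target]], [[target|label]] and [[target#section|label]].
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "and", "for", "with", "that", "this", "from", "are"}


def wikilink_terms(wikitext):
    """Return the lowercased targets of all wikilinks in the wikitext."""
    return {m.group(1).strip().lower() for m in WIKILINK.finditer(wikitext)}


def ranked_terms(wikitext, top_n=5):
    """Return the top_n most frequent non-stopword words.

    Term frequency is a stand-in for the paper's ranking measure.
    """
    # Replace each wikilink with its visible label before counting words.
    plain = re.sub(r"\[\[(?:[^\[\]|]*\|)?([^\[\]]*)\]\]", r"\1", wikitext)
    words = [
        w for w in re.findall(r"[a-z]+", plain.lower())
        if len(w) > 2 and w not in STOPWORDS
    ]
    return [word for word, _ in Counter(words).most_common(top_n)]


def coverage(vocabulary, gold):
    """Fraction of acquired terms that also appear in the gold set."""
    if not vocabulary:
        return 0.0
    return len(vocabulary & gold) / len(vocabulary)


if __name__ == "__main__":
    sample = ("A [[parser]] analyses [[natural language|language]] text. "
              "The parser builds a tree.")
    vocabulary = wikilink_terms(sample) | set(ranked_terms(sample))
    print(sorted(vocabulary))
    print(coverage(vocabulary, {"parser", "tree", "language"}))
```

Taking the union of the two term sets is one simple way to combine the strategies; the paper may weight or filter them differently.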
