Language-Agnostic Relation Extraction from Wikipedia Abstracts

From Wikipedia Quality


Authors: Nicolas Heist, Heiko Paulheim
Publication date: 2017
DOI: 10.1007/978-3-319-68288-4_23

Language-Agnostic Relation Extraction from Wikipedia Abstracts is a scientific work related to Wikipedia quality, published in 2017 and written by Nicolas Heist and Heiko Paulheim.

Overview

Large-scale knowledge graphs such as DBpedia, Wikidata, and YAGO can be enhanced by extracting relations from text, with the data already in the knowledge graph serving as training data, an approach known as distant supervision. While most existing approaches rely on language-specific methods (usually for English), the authors present a language-agnostic approach that exploits background knowledge from the graph instead of language-specific techniques and builds machine learning models only from language-independent features. They demonstrate the extraction of relations from Wikipedia abstracts using the twelve largest language editions of Wikipedia. From those, they extract 1.6M new relations for DBpedia at a precision of 95%, using a RandomForest classifier trained only on language-independent features. The authors also show an exemplary geographical breakdown of the extracted information.
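The distant supervision idea described in the overview can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the entity names, the feature names (`position`, `links_in_abstract`, `object_type`), and the rule for negative examples (treating a linked entity as a negative when the graph already holds a different object for the same subject and relation) are all assumptions made for illustration. Each abstract is reduced to the ordered list of entities it links to, so no language-specific text processing is involved.

```python
from dataclasses import dataclass

# Toy knowledge graph of (subject, relation, object) triples. All data is invented.
KG = {
    ("Berlin", "country", "Germany"),
    ("Hamburg", "country", "Germany"),
}

# Entity types from the graph, used as language-independent background knowledge.
TYPES = {"Germany": "Country", "France": "Country", "Spree": "River", "Elbe": "River"}

# Each abstract is represented only by the ordered entities it links to,
# so no tokenization or language-specific parsing is needed.
ABSTRACT_LINKS = {
    "Berlin": ["Germany", "Spree"],
    "Hamburg": ["Germany", "Elbe"],
    "Paris": ["France"],
}


@dataclass
class Candidate:
    subject: str
    obj: str
    features: dict
    label: object  # True, False, or None when the graph says nothing


def build_candidates(relation):
    """Label every (abstract subject, linked entity) pair using the graph."""
    candidates = []
    for subject, links in ABSTRACT_LINKS.items():
        # Objects the graph already knows for this subject and relation.
        known = {o for (s, r, o) in KG if s == subject and r == relation}
        for position, obj in enumerate(links):
            features = {
                "position": position,              # rank of the link in the abstract
                "links_in_abstract": len(links),
                "object_type": TYPES.get(obj, "Unknown"),
            }
            if obj in known:
                label = True
            elif known:
                label = False  # assumption: a different known object implies a negative
            else:
                label = None   # unlabeled: a candidate for a new relation
            candidates.append(Candidate(subject, obj, features, label))
    return candidates


candidates = build_candidates("country")
training = [c for c in candidates if c.label is not None]
to_predict = [c for c in candidates if c.label is None]
```

In this sketch, the labeled pairs in `training` would be fed to a classifier (the paper reports using a RandomForest), and the pairs in `to_predict` are the ones for which a sufficiently confident positive prediction would yield a new relation in the graph.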

Embed

Wikipedia Quality

Heist, Nicolas; Paulheim, Heiko. (2017). "[[Language-Agnostic Relation Extraction from Wikipedia Abstracts]]". Springer, Cham. DOI: 10.1007/978-3-319-68288-4_23.

English Wikipedia

{{cite journal |last1=Heist |first1=Nicolas |last2=Paulheim |first2=Heiko |title=Language-Agnostic Relation Extraction from Wikipedia Abstracts |date=2017 |doi=10.1007/978-3-319-68288-4_23 |url=https://wikipediaquality.com/wiki/Language-Agnostic_Relation_Extraction_from_Wikipedia_Abstracts |journal=Springer, Cham}}

HTML

Heist, Nicolas; Paulheim, Heiko. (2017). &quot;<a href="https://wikipediaquality.com/wiki/Language-Agnostic_Relation_Extraction_from_Wikipedia_Abstracts">Language-Agnostic Relation Extraction from Wikipedia Abstracts</a>&quot;. Springer, Cham. DOI: 10.1007/978-3-319-68288-4_23.