Difference between revisions of "Pairing Wikipedia Articles Across Languages"

From Wikipedia Quality
(+ wikilinks)
(Adding infobox)
{{Infobox work
| title = Pairing Wikipedia Articles Across Languages
| date = 2016
| authors = [[Marcus Klang]]<br />[[Pierre Nugues]]
| link = https://lup.lub.lu.se/search/publication/10b176f2-f95c-492c-9cf7-f29c82acee70
}}
 
'''Pairing Wikipedia Articles Across Languages''' is a scientific work related to [[Wikipedia quality]], published in 2016 and written by [[Marcus Klang]] and [[Pierre Nugues]].
 
  
 
== Overview ==
 
 
Wikipedia has become a reference knowledge source for scores of NLP applications. One of its invaluable [[features]] lies in its [[multilingual]] nature, where articles on the same entity or concept can have from one to more than 200 different versions. The interlinking of [[language versions]] in [[Wikipedia]] has undergone a major renewal with the advent of [[Wikidata]], a unified scheme to identify entities and their properties using unique numbers. However, as the interlinking is still manually carried out by thousands of editors across the globe, errors may creep into the assignment of entities. The authors describe an optimization technique to automatically match language versions of articles, and hence entities, that is based only on bags of words and anchors. They created a dataset of all the articles on persons that they extracted from Wikipedia in six languages: English, French, German, Russian, Spanish, and Swedish. They report a correct match rate of at least 94.3% on each language pair.
 

Revision as of 07:39, 13 February 2021


Pairing Wikipedia Articles Across Languages
Authors: Marcus Klang, Pierre Nugues
Publication date: 2016
Links: Original (https://lup.lub.lu.se/search/publication/10b176f2-f95c-492c-9cf7-f29c82acee70)

Pairing Wikipedia Articles Across Languages is a scientific work related to Wikipedia quality, published in 2016 and written by Marcus Klang and Pierre Nugues.

Overview

Wikipedia has become a reference knowledge source for scores of NLP applications. One of its invaluable features lies in its multilingual nature, where articles on the same entity or concept can have from one to more than 200 different versions. The interlinking of language versions in Wikipedia has undergone a major renewal with the advent of Wikidata, a unified scheme to identify entities and their properties using unique numbers. However, as the interlinking is still manually carried out by thousands of editors across the globe, errors may creep into the assignment of entities. The authors describe an optimization technique to automatically match language versions of articles, and hence entities, that is based only on bags of words and anchors. They created a dataset of all the articles on persons that they extracted from Wikipedia in six languages: English, French, German, Russian, Spanish, and Swedish. They report a correct match rate of at least 94.3% on each language pair.
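
The overview names the ingredients (bags of words and anchors, plus an optimization step) but does not spell out the procedure. The following Python sketch is therefore only an illustration of the general idea, not the authors' method. It assumes each article is reduced to a bag of language-independent tokens such as names, years, and link anchors, scores candidate pairs by cosine similarity, and picks the one-to-one pairing with the highest total score. The function names `cosine` and `pair_articles` and the toy bags are hypothetical and invented for this example.

```python
from collections import Counter
from itertools import permutations
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bags (token -> count)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pair_articles(source: dict, target: dict) -> dict:
    """Return the one-to-one pairing that maximizes total similarity.

    Assumption: brute force over permutations stands in for the
    optimization step; it is only viable for tiny toy inputs and
    requires len(source) <= len(target).
    """
    src_titles = list(source)
    best_score, best_pairing = -1.0, {}
    for perm in permutations(target, len(src_titles)):
        score = sum(cosine(source[s], target[t])
                    for s, t in zip(src_titles, perm))
        if score > best_score:
            best_score, best_pairing = score, dict(zip(src_titles, perm))
    return best_pairing


# Toy bags invented for illustration; real bags would be built from
# the words and anchors of each article.
english = {
    "Marie Curie": Counter(["1867", "Warsaw", "Nobel", "Paris", "radium"]),
    "Alfred Nobel": Counter(["1833", "Stockholm", "Nobel", "dynamite"]),
}
swedish = {
    "Alfred Nobel (sv)": Counter(["1833", "Stockholm", "Nobel", "dynamit"]),
    "Marie Curie (sv)": Counter(["1867", "Warszawa", "Nobel", "Paris", "radium"]),
}

pairing = pair_articles(english, swedish)
print(pairing)
```

Enumerating permutations grows factorially with the number of articles, so a dataset of realistic size would need a polynomial-time assignment solver (for example `scipy.optimize.linear_sum_assignment`) or a similar optimization method in place of the brute-force loop.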