Difference between revisions of "Directions for Exploiting Asymmetries in Multilingual Wikipedia"

Revision as of 09:04, 18 October 2019

Directions for Exploiting Asymmetries in Multilingual Wikipedia is a scientific work related to Wikipedia quality, published in 2009 and written by Elena Filatova.

Overview

Multilingual Wikipedia has been used extensively for a variety of Natural Language Processing (NLP) tasks. Many Wikipedia entries (people, locations, events, etc.) have descriptions in several languages. These descriptions, however, are not identical. On the contrary, descriptions of the same Wikipedia entry written in different languages can vary greatly in length and in the information they include. These peculiarities must be kept in mind when using multilingual Wikipedia as a corpus for training and testing NLP applications. In this paper, the author presents preliminary results on quantifying Wikipedia multilinguality. The results support the observation that descriptions of Wikipedia entries vary substantially across languages. However, the author argues that these asymmetries do not make multilingual Wikipedia an undesirable corpus for training NLP applications. On the contrary, the author outlines research directions that can use the asymmetries in multilingual Wikipedia to bridge communication gaps in multilingual societies.