Difference between revisions of "Language Resources Extracted from Wikipedia"

[[Category:Scientific works]]
[[Category:English Wikipedia]]

Latest revision as of 10:27, 4 March 2021


Language Resources Extracted from Wikipedia
Authors: Denny Vrandecic, Philipp Sorg, Rudi Studer
Publication date: 2011
DOI: 10.1145/1999676.1999703
Links: Original

Language Resources Extracted from Wikipedia is a scientific work related to Wikipedia quality, published in 2011 and written by Denny Vrandecic, Philipp Sorg and Rudi Studer.

Overview

Wikipedia provides a substantial amount of text in more than a hundred languages, including languages for which no reference corpora or other linguistic resources are easily available. The authors extracted background language models built from the content of Wikipedia in various languages. The models generated from the Simple English and English Wikipedias are compared to language models derived from other established corpora, and the differences between the models with regard to term coverage, term distribution and correlation are described and discussed. The authors provide access to the full dataset and create visualizations of the language models that can be used for exploration. The paper describes the newly released dataset for 33 languages and the services that the authors provide on top of it.
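To make the comparison criteria concrete, the following is a minimal sketch of how two language models could be compared by term coverage and correlation. It assumes a simple unigram model (relative term frequencies), coverage defined as the share of a reference vocabulary found in the other model, and Pearson correlation over shared terms. These definitions, the function names and the toy texts are illustrative assumptions, not the measures or data used in the paper.

```python
from collections import Counter
import math
import re


def unigram_model(text):
    """Return relative term frequencies for lowercase word tokens."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def term_coverage(model, reference):
    """Fraction of the reference vocabulary that also occurs in model."""
    if not reference:
        return 0.0
    return sum(1 for term in reference if term in model) / len(reference)


def pearson_on_shared_terms(a, b):
    """Pearson correlation of term probabilities over terms in both models."""
    shared = sorted(set(a) & set(b))
    if len(shared) < 2:
        return float("nan")
    xs = [a[t] for t in shared]
    ys = [b[t] for t in shared]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


# Toy stand-ins for a Wikipedia-derived corpus and a reference corpus.
wiki = unigram_model("the cat sat on the mat and the dog sat too")
ref = unigram_model("the dog sat on the log")
coverage = term_coverage(wiki, ref)
correlation = pearson_on_shared_terms(wiki, ref)
```

With real corpora, the same functions would be applied to models built from full text dumps; tokenization in particular would need to be adapted per language.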

Embed

Wikipedia Quality

Vrandecic, Denny; Sorg, Philipp; Studer, Rudi. (2011). "[[Language Resources Extracted from Wikipedia]]". DOI: 10.1145/1999676.1999703.

English Wikipedia

{{cite journal |last1=Vrandecic |first1=Denny |last2=Sorg |first2=Philipp |last3=Studer |first3=Rudi |title=Language Resources Extracted from Wikipedia |date=2011 |doi=10.1145/1999676.1999703 |url=https://wikipediaquality.com/wiki/Language_Resources_Extracted_from_Wikipedia}}

HTML

Vrandecic, Denny; Sorg, Philipp; Studer, Rudi. (2011). &quot;<a href="https://wikipediaquality.com/wiki/Language_Resources_Extracted_from_Wikipedia">Language Resources Extracted from Wikipedia</a>&quot;. DOI: 10.1145/1999676.1999703.