The WikEd Error Corpus: A Corpus of Corrective Wikipedia Edits and Its Application to Grammatical Error Correction



Authors
Roman Grundkiewicz
Marcin Junczys-Dowmunt
Publication date
2014
DOI
10.1007/978-3-319-10888-9_47

The WikEd Error Corpus: A Corpus of Corrective Wikipedia Edits and Its Application to Grammatical Error Correction is a scientific work related to Wikipedia quality, published in 2014 and written by Roman Grundkiewicz and Marcin Junczys-Dowmunt.

Overview

This paper introduces the freely available WikEd Error Corpus. The authors describe how the data was mined from Wikipedia revision histories, along with the content and format of the corpus. The corpus consists of more than 12 million sentences with a total of 14 million edits of various types. As one possible application, the authors show that WikEd can be successfully adapted to improve a strong baseline by 2.63% in the task of grammatical error correction for the writing of English-as-a-Second-Language (ESL) learners. Used together with an ESL error corpus, the combined system gains 1.64% over the ESL-trained system.
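To illustrate what a corrective edit between two revisions of a sentence can look like, here is a minimal sketch using Python's standard `difflib` module. It is not the authors' extraction pipeline: the function name `extract_edits`, the whitespace tokenization, and the example sentences are all assumptions made for illustration only.

```python
import difflib


def extract_edits(old_sentence, new_sentence):
    """Return word-level edits between two versions of a sentence.

    Hypothetical helper for illustration; tokenization is plain
    whitespace splitting, which is an assumption, not the paper's method.
    Each edit is a tuple (operation, old_words, new_words), where
    operation is one of "replace", "delete", or "insert".
    """
    old_tokens = old_sentence.split()
    new_tokens = new_sentence.split()
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged spans carry no correction
        edits.append((tag, old_tokens[i1:i2], new_tokens[j1:j2]))
    return edits


# Made-up sentence pair standing in for two revisions of one sentence.
print(extract_edits("He go to school yesterday .",
                    "He went to school yesterday ."))
# → [('replace', ['go'], ['went'])]
```

A real revision-mining process would also need to align sentences across revisions and filter out vandalism and large rewrites; the sketch covers only the final diffing step on an already aligned pair.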

Embed

Wikipedia Quality

Grundkiewicz, Roman; Junczys-Dowmunt, Marcin. (2014). "[[The WikEd Error Corpus: A Corpus of Corrective Wikipedia Edits and Its Application to Grammatical Error Correction]]". Springer, Cham. DOI: 10.1007/978-3-319-10888-9_47.

English Wikipedia

{{cite journal |last1=Grundkiewicz |first1=Roman |last2=Junczys-Dowmunt |first2=Marcin |title=The WikEd Error Corpus: A Corpus of Corrective Wikipedia Edits and Its Application to Grammatical Error Correction |date=2014 |doi=10.1007/978-3-319-10888-9_47 |url=https://wikipediaquality.com/wiki/The_WikEd_Error_Corpus:_A_Corpus_of_Corrective_Wikipedia_Edits_and_Its_Application_to_Grammatical_Error_Correction |journal=Springer, Cham}}

HTML

Grundkiewicz, Roman; Junczys-Dowmunt, Marcin. (2014). &quot;<a href="https://wikipediaquality.com/wiki/The_WikEd_Error_Corpus:_A_Corpus_of_Corrective_Wikipedia_Edits_and_Its_Application_to_Grammatical_Error_Correction">The WikEd Error Corpus: A Corpus of Corrective Wikipedia Edits and Its Application to Grammatical Error Correction</a>&quot;. Springer, Cham. DOI: 10.1007/978-3-319-10888-9_47.