Learning Simple Wikipedia: a Cogitation in Ascertaining Abecedarian Language

'''Learning Simple Wikipedia: a Cogitation in Ascertaining Abecedarian Language''' - a scientific work related to [[Wikipedia quality]], published in 2010 and written by [[Courtney Napoles]] and [[Mark Dredze]].
  
 
== Overview ==
Text simplification is the process of changing vocabulary and grammatical structure to create a more accessible version of a text while preserving its underlying information and content. Automated tools for text simplification are a practical way to make large corpora of text accessible to a wider audience lacking high fluency in the corpus language. In this work, the authors investigate the potential of Simple [[Wikipedia]] to assist automatic text simplification by building a statistical classification system that discriminates simple English from ordinary English. Most text simplification systems are based on hand-written rules (e.g., PEST (Carroll et al., 1999) and its module SYSTAR (Canning et al., 2000)) and therefore face limitations in scaling and transferring across domains. The potential of Simple Wikipedia for text simplification is significant: it contains nearly 60,000 articles with revision histories, aligned to articles in the ordinary [[English Wikipedia]]. Using articles from Simple Wikipedia and ordinary Wikipedia, the authors evaluated different classifiers and feature sets to identify the most discriminative [[features]] of simple English for use across domains. These findings further the understanding of what makes text simple and can be applied as a tool to help writers craft simple text.
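
To make the approach concrete, the sketch below trains a classifier to separate simple English from ordinary English using shallow surface features (words per sentence, mean word length, type-token ratio). It is a minimal illustration of the kind of statistical system the paper describes; the specific features, the toy corpus, and the choice of logistic regression are assumptions made for this example, not the authors' exact setup.

<syntaxhighlight lang="python">
# Minimal sketch: discriminate "simple" from "ordinary" English with
# shallow surface features and a logistic regression classifier.
# Features and toy data are illustrative, not the paper's exact design.
import re
from sklearn.linear_model import LogisticRegression

def features(text):
    """Surface features common in readability work:
    words per sentence, mean word length, type-token ratio."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    return [
        n_words / max(len(sentences), 1),       # words per sentence
        sum(len(w) for w in words) / n_words,   # mean word length
        len(set(words)) / n_words,              # type-token ratio
    ]

# Tiny illustrative corpus (label 1 = simple English, 0 = ordinary English).
corpus = [
    ("A dog is an animal. Many people keep dogs as pets.", 1),
    ("The cat sat on the mat. It was warm.", 1),
    ("Canis familiaris exhibits remarkable behavioural plasticity, "
     "facilitating domestication across heterogeneous environments.", 0),
    ("Epistemological considerations notwithstanding, the proliferation "
     "of encyclopaedic content necessitates scalable curation.", 0),
]
X = [features(text) for text, _ in corpus]
y = [label for _, label in corpus]

clf = LogisticRegression().fit(X, y)
print(clf.predict([features("Water is a clear liquid. We drink it.")]))  # -> [1]
</syntaxhighlight>

In the paper itself the authors compared several classifiers and richer feature sets drawn from aligned Simple Wikipedia and ordinary Wikipedia articles; shallow features like those above are a common starting point for this kind of readability classification.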
