Learning to Simplify Sentences Using Wikipedia

From Wikipedia Quality
Revision as of 10:13, 9 December 2020 by Leslie (talk | contribs) (Adding new article - Learning to Simplify Sentences Using Wikipedia)

Learning to Simplify Sentences Using Wikipedia is a scientific work related to Wikipedia quality, published in 2011 and written by William Coster and David Kauchak.

Overview

In this paper, the authors treat sentence simplification as an English-to-English translation problem, using a corpus of 137K aligned sentence pairs extracted by aligning English Wikipedia with Simple English Wikipedia. The data set contains the full range of transformation operations, including rewording, reordering, insertion and deletion. The authors introduce a new translation model for text simplification that extends a phrase-based machine translation approach to include phrasal deletion. Evaluated on three metrics that compare system output against a human reference (BLEU, word-F1 and SSA), the new approach performs significantly better than two text compression techniques (including T3) and than the phrase-based translation system without deletion.
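
To make the reference-based evaluation more concrete, the following is a minimal sketch of a word-level F1 score between a system output and a human reference. The summary above does not define word-F1, so this assumes a common bag-of-words formulation (lowercasing, whitespace tokenization, multiset overlap); the function name `word_f1` is hypothetical, and the paper's exact definition may differ.

```python
from collections import Counter


def word_f1(candidate: str, reference: str) -> float:
    """Bag-of-words F1 between a candidate and a reference sentence.

    Assumption: tokens are lowercased and split on whitespace, and
    repeated words are matched as a multiset. This is an illustrative
    formulation, not necessarily the one used in the paper.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Multiset intersection: each word counts up to its minimum frequency.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    system_output = "the cat"
    human_reference = "the cat sat down"
    print(round(word_f1(system_output, human_reference), 3))
```

Under this formulation a system that deletes too much keeps high precision but loses recall, which is one reason a deletion-aware model is scored against a human reference rather than on output length alone.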