Harvesting Paragraph-Level Question-Answer Pairs from Wikipedia

From Wikipedia Quality
Revision as of 23:00, 16 June 2019 by Autumn (talk | contribs) (Starting a page: Harvesting Paragraph-Level Question-Answer Pairs from Wikipedia)

Harvesting Paragraph-Level Question-Answer Pairs from Wikipedia is a scientific work related to Wikipedia quality, published in 2018 and written by Xinya Du and Claire Cardie.

Overview

The authors study the task of generating question-answer pairs from Wikipedia articles that cover content beyond a single sentence. They propose a neural network approach that incorporates coreference knowledge via a novel gating mechanism. Compared to models that take only sentence-level information into account (Heilman and Smith, 2010; Du et al., 2017; Zhou et al., 2017), the authors find that the linguistic knowledge introduced by the coreference representation aids question generation significantly, producing models that outperform the state of the art at the time of publication. The authors apply their system (composed of an answer span extraction system and the passage-level question generation (QG) system) to the 10,000 top-ranking Wikipedia articles and create a corpus of over one million question-answer pairs. They also provide a qualitative analysis of this large-scale corpus generated from Wikipedia.
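
The overview does not spell out how the gating mechanism is formulated, so the following is only a minimal illustrative sketch of the general idea of gating a coreference representation. It is not the authors' model. The function name gate_coreference, the scalar confidence score, the sigmoid gate, and the weight and bias defaults are all assumptions introduced for illustration; the sketch scales the embedding of an antecedent mention by a gate in (0, 1) and concatenates it with the embedding of the referring word.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, used here to squash the gate into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_coreference(word_vec, antecedent_vec, confidence, weight=4.0, bias=-2.0):
    """Return (gated representation, gate value).

    Hypothetical formulation: `confidence` stands in for a score from some
    coreference resolver, and `weight`/`bias` stand in for learned parameters.
    """
    # Scalar gate: a high coreference confidence lets more antecedent signal through.
    gate = sigmoid(weight * confidence + bias)
    # Scale every dimension of the antecedent embedding by the gate.
    gated_antecedent = [gate * value for value in antecedent_vec]
    # Concatenate the word embedding with the gated antecedent embedding.
    return list(word_vec) + gated_antecedent, gate


pronoun = [0.2, -0.1]     # toy embedding for a pronoun such as "he"
antecedent = [1.0, 0.5]   # toy embedding for the mention it refers to

confident_vec, confident_gate = gate_coreference(pronoun, antecedent, confidence=0.9)
doubtful_vec, doubtful_gate = gate_coreference(pronoun, antecedent, confidence=0.1)
```

In this toy setup a confident coreference link passes most of the antecedent embedding into the word's representation, while a doubtful link passes little of it, which is the intuition behind letting an encoder use information from outside the current sentence only when the link is reliable.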