Augmenting Wikipedia-Extraction with Results from the Web

Authors
Feifei Wu
Raphael Hoffmann
Daniel S. Weld
Publication date
2008
ISBN
978-157735383-6

Augmenting Wikipedia-Extraction with Results from the Web is a scientific work about Wikipedia quality, published in 2008 and written by Feifei Wu, Raphael Hoffmann and Daniel S. Weld.

Overview

Not only is Wikipedia a comprehensive source of quality information, it also has several kinds of internal structure (e.g., relational summaries known as infoboxes) that enable self-supervised information extraction. While previous efforts at extraction from Wikipedia achieve high precision and recall on well-populated classes of articles, they fail in a larger number of cases, largely because incomplete articles and infrequent use of infoboxes lead to insufficient training data. The paper explains and evaluates a method for improving recall by extracting from the broader Web. Two key advances are needed to make Web supplementation effective: 1) a method to filter promising sentences from Web pages, and 2) a novel retraining technique to broaden extractor recall. Experiments show that, used in concert with shrinkage, the authors' techniques increase recall by a factor of up to 8 while maintaining or increasing precision.
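
To make the first advance concrete, the following is a minimal sketch of what filtering promising sentences from a Web page could look like. It is not the authors' method: the function names `split_sentences` and `filter_promising_sentences`, the cue-word heuristic, and the sample text are all assumptions made for illustration. The sketch keeps a sentence only if it mentions the article subject and at least one cue word associated with the attribute being extracted.

```python
import re


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def filter_promising_sentences(page_text, subject, cue_words, min_cues=1):
    """Keep sentences that mention `subject` and at least `min_cues` cue words.

    Hypothetical heuristic for illustration only; the paper's actual
    filtering criteria are not described in this summary.
    """
    subject_lc = subject.lower()
    cues = {w.lower() for w in cue_words}
    kept = []
    for sentence in split_sentences(page_text):
        lowered = sentence.lower()
        if subject_lc not in lowered:
            continue  # sentence is not about the article subject
        tokens = set(re.findall(r"[a-z0-9]+", lowered))
        if len(tokens & cues) >= min_cues:
            kept.append(sentence)
    return kept


# Illustrative input; the page text and cue words are made up.
page = (
    "Ada Lovelace was born in London in 1815. "
    "The weather in London is often rainy. "
    "Many people admire Ada Lovelace today."
)
promising = filter_promising_sentences(page, "Ada Lovelace", ["born", "birthplace"])
print(promising)
```

Only the first sentence survives here, since it is the only one that names the subject and contains a cue word. A real system would likely rank sentences with a learned model rather than a fixed word list.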

Embed

Wikipedia Quality

Wu, Feifei; Hoffmann, Raphael; Weld, Daniel S. (2008). "[[Augmenting Wikipedia-Extraction with Results from the Web]]". CEUR Workshop Proceedings Volume 355, 2008, 6p. ISBN: 978-157735383-6.

English Wikipedia

{{cite journal |last1=Wu |first1=Feifei |last2=Hoffmann |first2=Raphael |last3=Weld |first3=Daniel S. |title=Augmenting Wikipedia-Extraction with Results from the Web |date=2008 |isbn=978-157735383-6 |url=https://wikipediaquality.com/wiki/Augmenting_Wikipedia-Extraction_with_Results_from_the_Web |journal=CEUR Workshop Proceedings Volume 355, 2008, 6p}}

HTML

Wu, Feifei; Hoffmann, Raphael; Weld, Daniel S. (2008). &quot;<a href="https://wikipediaquality.com/wiki/Augmenting_Wikipedia-Extraction_with_Results_from_the_Web">Augmenting Wikipedia-Extraction with Results from the Web</a>&quot;. CEUR Workshop Proceedings Volume 355, 2008, 6p. ISBN: 978-157735383-6.