Augmenting Wikipedia-Extraction with Results from the Web
Authors | Feifei Wu, Raphael Hoffmann, Daniel S. Weld |
---|---|
Publication date | 2008 |
ISBN | 978-157735383-6 |
Augmenting Wikipedia-Extraction with Results from the Web is a scientific work about Wikipedia quality, published in 2008 and written by Feifei Wu, Raphael Hoffmann and Daniel S. Weld.
Overview
Not only is Wikipedia a comprehensive source of quality information, it also has several kinds of internal structure (e.g., relational summaries known as infoboxes) that enable self-supervised information extraction. Previous efforts at extraction from Wikipedia achieve high precision and recall on well-populated classes of articles, but they fail in a larger number of cases, largely because incomplete articles and infrequent use of infoboxes lead to insufficient training data. The paper explains and evaluates a method for improving recall by extracting from the broader Web. Two key advances make Web supplementation effective: 1) a method to filter promising sentences from Web pages, and 2) a novel retraining technique to broaden extractor recall. Experiments show that, used in concert with shrinkage, these techniques increase recall by a factor of up to 8 while maintaining or increasing precision.
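The first advance, filtering promising sentences from Web pages, can be illustrated with a minimal sketch. This is not the authors' method: the scoring rule (counting attribute-term hits in sentences that mention the article subject), the naive sentence splitter, and all names (`Candidate`, `split_sentences`, `filter_sentences`, `min_score`) are assumptions made here purely for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Candidate:
    sentence: str
    score: float


def split_sentences(text: str) -> list[str]:
    # Naive splitter: break after ., ! or ? followed by whitespace.
    # Hypothetical helper; it will mis-split abbreviations such as "S. Weld".
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def filter_sentences(page_text: str, subject: str, attribute_terms: list[str],
                     min_score: float = 1.0) -> list[Candidate]:
    """Keep sentences that mention the subject, scored by attribute-term hits.

    The score is an assumed heuristic, not the weighting used in the paper.
    """
    subject_lc = subject.lower()
    terms = [t.lower() for t in attribute_terms]
    kept = []
    for sentence in split_sentences(page_text):
        lowered = sentence.lower()
        if subject_lc not in lowered:
            continue  # sentence is unlikely to describe the article's subject
        score = float(sum(term in lowered for term in terms))
        if score >= min_score:
            kept.append(Candidate(sentence, score))
    # Highest-scoring sentences first, so an extractor sees the best candidates.
    return sorted(kept, key=lambda c: c.score, reverse=True)


page = ("Ada Lovelace was born in London in 1815. "
        "She wrote notes on the Analytical Engine. "
        "Many people admire Ada Lovelace today.")
candidates = filter_sentences(page, "Ada Lovelace", ["born", "birth"])
```

In this toy run only the first sentence survives: the second never names the subject, and the third names it but contains none of the attribute terms. A real filter would need a more robust notion of relevance than substring matching.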
Embed
Wikipedia Quality
Wu, Feifei; Hoffmann, Raphael; Weld, Daniel S. (2008). "[[Augmenting Wikipedia-Extraction with Results from the Web]]". CEUR Workshop Proceedings Volume 355, 2008, 6p. ISBN: 978-157735383-6.
English Wikipedia
{{cite journal |last1=Wu |first1=Feifei |last2=Hoffmann |first2=Raphael |last3=Weld |first3=Daniel S. |title=Augmenting Wikipedia-Extraction with Results from the Web |date=2008 |isbn=978-157735383-6 |url=https://wikipediaquality.com/wiki/Augmenting_Wikipedia-Extraction_with_Results_from_the_Web |journal=CEUR Workshop Proceedings Volume 355, 2008, 6p}}
HTML
Wu, Feifei; Hoffmann, Raphael; Weld, Daniel S. (2008). "<a href="https://wikipediaquality.com/wiki/Augmenting_Wikipedia-Extraction_with_Results_from_the_Web">Augmenting Wikipedia-Extraction with Results from the Web</a>". CEUR Workshop Proceedings Volume 355, 2008, 6p. ISBN: 978-157735383-6.