Dft-Extractor: a System to Extract Domain-Specific Faceted Taxonomies from Wikipedia

Dft-Extractor: a System to Extract Domain-Specific Faceted Taxonomies from Wikipedia is a scientific work related to Wikipedia quality, published in 2013 and written by Bifan Wei, Jun Liu, Jian Ma, Qinghua Zheng, Wei Zhang and Boqin Feng.

Overview

Extracting faceted taxonomies from the Web has received increasing attention from the web mining community in recent years. In this study, the authors demonstrate a novel system called DFT-Extractor, which automatically constructs domain-specific faceted taxonomies from Wikipedia in three steps: 1) it crawls domain terms from Wikipedia using a modified topical crawler; 2) it applies a classification model with motif-based features to extract hyponym relations; 3) it constructs a faceted taxonomy by applying a community detection algorithm and a set of heuristic rules. DFT-Extractor also provides a graphical user interface that visualizes the learned hyponym relations and the tree structure of the taxonomies.
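
The three-step flow described above can be illustrated with a small toy sketch. It is not the authors' implementation: the link graph `LINKS` is made-up data, the depth-bounded breadth-first crawl stands in for the modified topical crawler, the head-word rule in `is_hyponym` stands in for the classification model with motif-based features, and connected components stand in for the community detection algorithm. All function names are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical toy link graph standing in for Wikipedia pages and their links.
LINKS = {
    "Data structure": ["Tree", "Graph", "Array"],
    "Tree": ["Binary tree", "B-tree"],
    "Graph": ["Directed graph"],
    "Array": [],
    "Binary tree": [],
    "B-tree": [],
    "Directed graph": [],
}


def crawl(seed, links, max_depth=2):
    """Step 1 stand-in: depth-bounded breadth-first crawl from a seed term."""
    seen = {seed}
    queue = deque([(seed, 0)])
    edges = []
    while queue:
        term, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(term, []):
            edges.append((term, target))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen, edges


def is_hyponym(parent, child):
    """Step 2 stand-in: head-word rule, NOT a trained classifier."""
    head = parent.lower().split()[-1]
    return child != parent and child.lower().endswith(head)


def facets(hyponym_edges):
    """Step 3 stand-in: connected components, NOT real community detection."""
    adjacency = defaultdict(set)
    for parent, child in hyponym_edges:
        adjacency[parent].add(child)
        adjacency[child].add(parent)
    visited, groups = set(), []
    for start in sorted(adjacency):
        if start in visited:
            continue
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        visited |= group
        groups.append(sorted(group))
    return groups


def build_taxonomy(seed, links):
    """Chain the three stand-in steps and return their intermediate results."""
    terms, edges = crawl(seed, links)
    hyponyms = [(p, c) for p, c in edges if is_hyponym(p, c)]
    return {"terms": sorted(terms), "hyponyms": hyponyms, "facets": facets(hyponyms)}


if __name__ == "__main__":
    result = build_taxonomy("Data structure", LINKS)
    for facet in result["facets"]:
        print(facet)
```

Each stage only consumes the output of the previous one, so any stand-in could be swapped for a real crawler, classifier, or community detection routine without changing the others.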