Controversy Detection in Wikipedia Using Collective Classification

Revision as of 08:44, 30 April 2020 by Brooklyn (talk | contribs) (Wikilinks)

Controversy Detection in Wikipedia Using Collective Classification is a scientific work related to Wikipedia quality, published in 2016 and written by Shiri Dori-Hacohen, David D. Jensen and James Allan.

Overview

Concerns over personalization in information retrieval (IR) have sparked an interest in the detection and analysis of controversial topics. Accurate detection would enable many beneficial applications, such as alerting search users to controversy. Wikipedia's broad coverage and rich metadata offer a valuable resource for this problem. The authors hypothesize that the intensities of controversy among related pages are not independent; they therefore propose a stacked model that exploits the dependencies among related pages. Their approach improves classification of controversial web pages compared to a model that examines each page in isolation, demonstrating that controversial topics exhibit homophily. Using notions of similarity to construct a subnetwork for collective classification, rather than using the default network present in the relational data, leads to improved classification, with wider applications for semi-structured datasets. The effects are most pronounced when a small set of neighbors is used.
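The idea of stacking over a similarity-based neighborhood can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the bag-of-words cosine similarity, the neighbor count `k`, and the fixed mixing `weight` are all assumptions made for illustration. A real stacked model would typically train a second classifier on the aggregated neighbor scores rather than mix them with a hand-set weight.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k_neighbors(texts: dict, k: int) -> dict:
    """Build a similarity subnetwork: for each page, keep its k most similar pages.

    Assumption: similarity is cosine over whitespace-tokenized text. This stands
    in for whichever similarity notion is used to replace the default link network.
    """
    vecs = {page: Counter(text.lower().split()) for page, text in texts.items()}
    neighbors = {}
    for page in vecs:
        ranked = sorted(
            (other for other in vecs if other != page),
            key=lambda other: cosine(vecs[page], vecs[other]),
            reverse=True,
        )
        neighbors[page] = ranked[:k]
    return neighbors


def stacked_scores(base: dict, neighbors: dict, weight: float = 0.5) -> dict:
    """Second stage: mix each page's own score with the mean score of its neighbors.

    `base` holds first-stage controversy scores computed for each page in
    isolation. `weight` is a hypothetical, hand-set mixing parameter.
    """
    result = {}
    for page, own in base.items():
        nbrs = neighbors.get(page, [])
        nbr_mean = sum(base[q] for q in nbrs) / len(nbrs) if nbrs else own
        result[page] = (1 - weight) * own + weight * nbr_mean
    return result
```

Under these assumptions, a page whose isolated score is low but whose nearest neighbors score high is pulled upward, which is the homophily effect the overview describes. Keeping `k` small limits the aggregation to the most similar pages, in line with the observation that the effect is most pronounced for a small set of neighbors.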