Characterizing Discussions in the Spanish Wikipedia

Revision as of 13:02, 5 January 2020 by Mila

Characterizing Discussions in the Spanish Wikipedia is a scientific work related to Wikipedia quality, published in 2017 and written by Johnny Torres, Alfonsina Ochoa, Alberto Jimenez, Sixto Garcia, Enrique Pelaez and Xavier Ochoa.

Overview

Wikipedia, as the largest online encyclopedia, is edited collaboratively by hundreds of users. The content of some articles can be disputed, giving rise to discussions that are recorded in the corresponding talk pages. In this paper, the authors propose an annotation schema for Spanish Wikipedia talk pages in order to determine the types of opinions expressed in them. They apply the schema to a corpus of discussions covering 148 topics drawn from 25 Spanish Wikipedia talk pages, and make the resulting dataset publicly available for download on GitHub. Furthermore, the authors train and evaluate supervised machine learning models to identify the annotation labels automatically. A linear support vector classifier (LinearSVC) performs better than the other baseline models, achieving an F1 score of 0.71 in the experiments.
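For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1 score can be computed for a multi-label annotation task. It is not the authors' evaluation code. The label names (agree, disagree, neutral) and the toy predictions are hypothetical, and macro-averaging over labels is an assumption, since the summary does not say which averaging the paper uses.

```python
from collections import Counter


def f1_per_label(gold, predicted):
    """Return {label: F1} for two parallel lists of labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1      # correct prediction for label g
        else:
            fp[p] += 1      # p was predicted but is wrong
            fn[g] += 1      # g was the true label but was missed
    scores = {}
    for label in set(gold) | set(predicted):
        # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, predicted):
    """Unweighted mean of the per-label F1 scores (assumed averaging)."""
    scores = f1_per_label(gold, predicted)
    return sum(scores.values()) / len(scores)


# Hypothetical annotations; the paper's actual label set is not given here.
gold = ["agree", "disagree", "neutral", "agree", "disagree", "neutral"]
pred = ["agree", "disagree", "agree", "agree", "neutral", "neutral"]

if __name__ == "__main__":
    print(f1_per_label(gold, pred))
    print(macro_f1(gold, pred))
```

A score near 1 means the classifier recovers almost every annotated label without spurious predictions, while a score near 0 means it rarely does. Macro-averaging gives every label the same weight regardless of how often it occurs in the corpus.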