Automatic Content-Based Categorization of Wikipedia Articles



Authors: Zeno Gantner, Lars Schmidt-Thieme
Publication date: 2009
DOI: 10.3115/1699765.1699770
Links: Original Preprint

Automatic Content-Based Categorization of Wikipedia Articles is a scientific work related to Wikipedia quality, published in 2009 and written by Zeno Gantner and Lars Schmidt-Thieme.

Overview

Wikipedia's article contents and its category hierarchy are widely used to produce semantic resources which improve performance on tasks like text classification and keyword extraction. The reverse direction, using text classification methods to predict the categories of Wikipedia articles, has attracted less attention so far. The authors propose to "return the favor" and use text classifiers to improve Wikipedia. This could support the emergence of a virtuous circle between the wisdom of the crowds and machine learning/NLP methods.
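As a rough illustration of what predicting an article's category from its text can look like, the sketch below trains a small bag-of-words naive Bayes classifier. It is not the method used in the paper: the choice of classifier, the class name `NaiveBayesCategorizer`, the toy texts and the category labels are all assumptions made for this example, and it assigns a single category per article, whereas real Wikipedia articles usually belong to several.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesCategorizer:
    """Multinomial naive Bayes over word counts (hypothetical, illustrative only)."""

    def fit(self, texts, categories):
        self.doc_counts = Counter(categories)      # articles per category
        self.word_counts = defaultdict(Counter)    # word frequencies per category
        self.vocabulary = set()
        for text, category in zip(texts, categories):
            tokens = tokenize(text)
            self.word_counts[category].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_category, best_score = None, float("-inf")
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            # Log prior plus log likelihood with add-one (Laplace) smoothing.
            score = math.log(n_docs / total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Toy training data; texts and category names are invented for this sketch.
train_texts = [
    "The striker scored a goal in the football match",
    "The team won the league after a penalty shootout",
    "The composer wrote a symphony for the orchestra",
    "The band recorded an album of guitar songs",
]
train_categories = ["Sports", "Sports", "Music", "Music"]

model = NaiveBayesCategorizer().fit(train_texts, train_categories)
predicted = model.predict("The orchestra performed a new symphony")
print(predicted)
```

In a setting closer to the one the overview describes, the training texts would be existing articles with their editor-assigned categories, and the predictions could be offered to editors as category suggestions for articles that lack them.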

Embed

Wikipedia Quality

Gantner, Zeno; Schmidt-Thieme, Lars. (2009). "[[Automatic Content-Based Categorization of Wikipedia Articles]]". Association for Computational Linguistics. DOI: 10.3115/1699765.1699770.

English Wikipedia

{{cite journal |last1=Gantner |first1=Zeno |last2=Schmidt-Thieme |first2=Lars |title=Automatic Content-Based Categorization of Wikipedia Articles |date=2009 |doi=10.3115/1699765.1699770 |url=https://wikipediaquality.com/wiki/Automatic_Content-Based_Categorization_of_Wikipedia_Articles |journal=Association for Computational Linguistics}}

HTML

Gantner, Zeno; Schmidt-Thieme, Lars. (2009). &quot;<a href="https://wikipediaquality.com/wiki/Automatic_Content-Based_Categorization_of_Wikipedia_Articles">Automatic Content-Based Categorization of Wikipedia Articles</a>&quot;. Association for Computational Linguistics. DOI: 10.3115/1699765.1699770.