Detection of Text Quality Flaws as a One-Class Classification Problem

Authors: Maik Anderka, Benno Maria Stein, Nedim Lipka
Publication date: 2011
ISBN: 978-145030717-8
DOI: 10.1145/2063576.2063954

Detection of Text Quality Flaws as a One-Class Classification Problem is a scientific work about Wikipedia quality, published in 2011 and written by Maik Anderka, Benno Maria Stein and Nedim Lipka.

Overview

For Web applications that are based on user-generated content, the detection of text quality flaws is a key concern. This research contributes to automatic quality flaw detection. In particular, the authors propose to cast the detection of text quality flaws as a one-class classification problem: given only positive examples (texts containing a particular quality flaw), a classifier must decide whether or not an unseen text suffers from this flaw. The authors argue that common binary or multiclass classification approaches are ineffective here, and they underpin their approach with a real-world application: a dedicated one-class learning approach is employed to determine whether a given Wikipedia article suffers from certain quality flaws. Since the acquisition of sensible test data is quite intricate in the Wikipedia setting, the authors analyze the effects of a biased sample selection. In addition, they illustrate the classifier effectiveness as a function of the flaw distribution in order to cope with the unknown (real-world) flaw-specific class imbalances. Altogether, provided test data with little noise, four of ten important quality flaws in Wikipedia can be detected with a precision close to 1.
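
The one-class setting described above can be illustrated with a minimal sketch. It is not the learning approach used in the paper: a simple centroid-and-radius rule is swapped in for illustration, and the class name `CentroidOneClassClassifier`, the function `precision_at_flaw_ratio`, the two-dimensional feature vectors, and all numeric values are hypothetical. The first part fits a decision region from flawed examples only; the second part shows, under the assumption of fixed true-positive and false-positive rates, how precision depends on the share of flawed articles in the test data.

```python
import math
from statistics import mean


class CentroidOneClassClassifier:
    """Toy one-class classifier trained on positive (flawed) examples only.

    Illustrative assumption: a text is flagged as flawed if its feature
    vector lies within a radius around the centroid of the training data.
    """

    def __init__(self, quantile=0.9):
        # Fraction of training distances the radius should cover.
        self.quantile = quantile
        self.center = None
        self.radius = None

    def fit(self, positives):
        # Centroid: component-wise mean of the positive examples.
        self.center = [mean(column) for column in zip(*positives)]
        distances = sorted(math.dist(v, self.center) for v in positives)
        index = min(len(distances) - 1, int(self.quantile * len(distances)))
        self.radius = distances[index]
        return self

    def predict(self, vector):
        # True means "suffers from the flaw", False means "does not".
        return math.dist(vector, self.center) <= self.radius


def precision_at_flaw_ratio(tpr, fpr, flaw_ratio):
    """Precision for given true/false positive rates and a flaw ratio.

    flaw_ratio is the (assumed) fraction of flawed texts in the test data.
    """
    true_positives = tpr * flaw_ratio
    false_positives = fpr * (1.0 - flaw_ratio)
    flagged = true_positives + false_positives
    return true_positives / flagged if flagged > 0 else 0.0


# Hypothetical feature vectors of articles tagged with one particular flaw.
flawed_articles = [[0.90, 0.10], [0.80, 0.20], [0.85, 0.15], [0.95, 0.05]]
classifier = CentroidOneClassClassifier().fit(flawed_articles)

looks_flawed = classifier.predict([0.88, 0.12])    # near the training data
looks_clean = classifier.predict([0.10, 0.90])     # far from the training data

# Same classifier quality, different class imbalance in the test data.
balanced = precision_at_flaw_ratio(tpr=0.8, fpr=0.1, flaw_ratio=0.5)
imbalanced = precision_at_flaw_ratio(tpr=0.8, fpr=0.1, flaw_ratio=0.1)
```

In this sketch, no unflawed articles are needed for training, which mirrors the situation where only flaw-tagged texts are available. The second function indicates why the abstract reports effectiveness as a function of the flaw distribution: with unchanged error rates, precision drops as flawed texts become rarer.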

Embed

Wikipedia Quality

Anderka, Maik; Stein, Benno Maria; Lipka, Nedim. (2011). "[[Detection of Text Quality Flaws as a One-Class Classification Problem]]". International Conference on Information and Knowledge Management, Proceedings 2011, pp. 2313-2316. ISBN: 978-145030717-8. DOI: 10.1145/2063576.2063954.

English Wikipedia

{{cite journal |last1=Anderka |first1=Maik |last2=Stein |first2=Benno Maria |last3=Lipka |first3=Nedim |title=Detection of Text Quality Flaws as a One-Class Classification Problem |date=2011 |isbn=978-145030717-8 |doi=10.1145/2063576.2063954 |url=https://wikipediaquality.com/wiki/Detection_of_Text_Quality_Flaws_as_a_One-Class_Classification_Problem |journal=International Conference on Information and Knowledge Management, Proceedings 2011, pp. 2313-2316}}

HTML

Anderka, Maik; Stein, Benno Maria; Lipka, Nedim. (2011). &quot;<a href="https://wikipediaquality.com/wiki/Detection_of_Text_Quality_Flaws_as_a_One-Class_Classification_Problem">Detection of Text Quality Flaws as a One-Class Classification Problem</a>&quot;. International Conference on Information and Knowledge Management, Proceedings 2011, pp. 2313-2316. ISBN: 978-145030717-8. DOI: 10.1145/2063576.2063954.