Towards Effective Processing of Large Text Collections

Authors
Julian Szymański
Henryk Krawczyk
Publication date
2012
ISBN
978-146732678-0
DOI
10.1109/INTECH.2012.6457784

Towards Effective Processing of Large Text Collections is a scientific work related to Wikipedia quality, published in 2012 and written by Julian Szymański and Henryk Krawczyk.

Overview

In the article, the authors describe an approach to the parallel implementation of elementary operations for textual data categorization. In the experiments, they evaluate parallel computation of similarity matrices and of the k-means algorithm. The test datasets were prepared as graphs built from Wikipedia articles connected by links. When creating the clustering data packages, the authors compute pairs of eigenvectors and eigenvalues in order to visualize the datasets. They also describe the method used to evaluate clustering quality. Finally, they discuss the results achieved, point out some possible improvements, and outline perspectives for future development.
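To make the idea of a parallel similarity-matrix computation more concrete, the following is a minimal sketch and not the authors' implementation. It assumes documents are already represented as dense numeric vectors and uses cosine similarity, which is an assumption, since the summary does not name the measure. The function names (`cosine`, `similarity_row`, `similarity_matrix`) and the toy data are hypothetical. Each row of the matrix depends only on the input vectors, so rows can be handed to independent workers.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def cosine(u, v):
    """Cosine similarity of two equal-length dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def similarity_row(i, vectors):
    """Compute row i of the matrix; it does not depend on any other row."""
    return [cosine(vectors[i], v) for v in vectors]


def similarity_matrix(vectors, workers=4):
    """Distribute rows over a worker pool and collect them in row order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        rows = pool.map(lambda i: similarity_row(i, vectors), range(len(vectors)))
        return list(rows)


# Toy "documents" as term-weight vectors (made-up values for illustration).
docs = [
    [1.0, 0.0, 2.0],
    [2.0, 0.0, 4.0],
    [0.0, 3.0, 0.0],
]
sim = similarity_matrix(docs)
```

A thread pool is used here only to show the row-wise decomposition; in CPython it will not speed up CPU-bound work, so a real implementation would more likely rely on processes, MPI, or a vectorized numeric library.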

Embed

Wikipedia Quality

Szymański, Julian; Krawczyk, Henryk. (2012). "[[Towards Effective Processing of Large Text Collections]]". ACM International Conference Proceeding Series 2012, pp. 764-772. ISBN: 978-146732678-0. DOI: 10.1109/INTECH.2012.6457784.

English Wikipedia

{{cite journal |last1=Szymański |first1=Julian |last2=Krawczyk |first2=Henryk |title=Towards Effective Processing of Large Text Collections |date=2012 |isbn=978-146732678-0 |doi=10.1109/INTECH.2012.6457784 |url=https://wikipediaquality.com/wiki/Towards_Effective_Processing_of_Large_Text_Collections |journal=ACM International Conference Proceeding Series 2012, pp. 764-772}}

HTML

Szymański, Julian; Krawczyk, Henryk. (2012). &quot;<a href="https://wikipediaquality.com/wiki/Towards_Effective_Processing_of_Large_Text_Collections">Towards Effective Processing of Large Text Collections</a>&quot;. ACM International Conference Proceeding Series 2012, pp. 764-772. ISBN: 978-146732678-0. DOI: 10.1109/INTECH.2012.6457784.