Leveraging Wikipedia Knowledge to Cross-Language Classify Textual News



Authors: Marcos Mouriño-García, Roberto Pérez-Rodríguez, Luis E. Anido-Rifón
Publication date: 2017
DOI: 10.1109/iscmi.2017.8279619

Leveraging Wikipedia Knowledge to Cross-Language Classify Textual News is a scientific work related to Wikipedia quality, published in 2017 and written by Marcos Mouriño-García, Roberto Pérez-Rodríguez and Luis E. Anido-Rifón.

Overview

This paper presents a first attempt at leveraging Wikipedia knowledge to represent textual news stories as vectors of Wikipedia concepts, and analyzes the suitability of this representation for building a cross-language classifier that labels news stories written in Spanish after being trained only on English ones. The authors describe two approaches. The first (WikiBoC-CLCM) represents the news stories using Wikipedia concepts only. The second (Hybrid-WikiBoC) combines the WikiBoC-CLCM classifier with the state-of-the-art approach, which relies on the bag-of-words model together with machine translation (BoW-MT). To evaluate the proposed approaches, the authors present a dataset of news stories written in English and Spanish, extracted from several online newspapers and news agencies such as Reuters and Europa Press. The results show that the purely concept-based WikiBoC-CLCM approach offers the highest classification performance, with increases of up to 55.07% over the state-of-the-art BoW-MT approach. The Hybrid-WikiBoC approach also outperforms the BoW-MT model, with performance increases of up to 2.34%. The authors conclude that leveraging Wikipedia knowledge is of great advantage in cross-language classification of textual news stories.
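The idea behind a concept-based cross-language classifier can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon CONCEPTS, the concept IDs, the helper names and the nearest-centroid classifier are all assumptions made for illustration. The lexicon stands in for a Wikipedia-based annotator that maps terms in either language to the same language-independent concept, which is what allows a model trained on English text to label Spanish text without translation.

```python
from collections import Counter
import math

# Hypothetical toy lexicon: terms in English or Spanish mapped to a shared,
# language-independent concept ID. A real system would obtain concepts from
# a Wikipedia-based annotator; these entries are invented for illustration.
CONCEPTS = {
    "football": "C_FOOTBALL", "fútbol": "C_FOOTBALL",
    "goal": "C_GOAL", "gol": "C_GOAL",
    "election": "C_ELECTION", "elecciones": "C_ELECTION",
    "parliament": "C_PARLIAMENT", "parlamento": "C_PARLIAMENT",
}


def bag_of_concepts(text):
    """Represent a text as counts of concept IDs; unknown terms are ignored."""
    tokens = text.lower().split()
    return Counter(CONCEPTS[t] for t in tokens if t in CONCEPTS)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(labelled_docs):
    """Sum the concept vectors of each category (assumed nearest-centroid model)."""
    centroids = {}
    for text, label in labelled_docs:
        centroids.setdefault(label, Counter()).update(bag_of_concepts(text))
    return centroids


def classify(text, centroids):
    """Return the category whose centroid is most similar to the text."""
    vec = bag_of_concepts(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Train on English stories only.
english_train = [
    ("the football match ended with a late goal", "sports"),
    ("parliament called an early election", "politics"),
]
centroids = train_centroids(english_train)

# Classify Spanish stories directly, with no machine translation step.
prediction_politics = classify("el parlamento convoca elecciones", centroids)
prediction_sports = classify("gol decisivo en el partido de fútbol", centroids)
```

Because both languages map onto the same concept IDs, the Spanish stories share a feature space with the English training data. A bag-of-words baseline would instead have to translate one side first, as in the BoW-MT approach described above.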

Embed

Wikipedia Quality

Mouriño-García, Marcos; Pérez-Rodríguez, Roberto; Anido-Rifón, Luis E. (2017). "[[Leveraging Wikipedia Knowledge to Cross-Language Classify Textual News]]". DOI: 10.1109/iscmi.2017.8279619.

English Wikipedia

{{cite journal |last1=Mouriño-García |first1=Marcos |last2=Pérez-Rodríguez |first2=Roberto |last3=Anido-Rifón |first3=Luis E. |title=Leveraging Wikipedia Knowledge to Cross-Language Classify Textual News |date=2017 |doi=10.1109/iscmi.2017.8279619 |url=https://wikipediaquality.com/wiki/Leveraging_Wikipedia_Knowledge_to_Cross-Language_Classify_Textual_News}}

HTML

Mouriño-García, Marcos; Pérez-Rodríguez, Roberto; Anido-Rifón, Luis E. (2017). &quot;<a href="https://wikipediaquality.com/wiki/Leveraging_Wikipedia_Knowledge_to_Cross-Language_Classify_Textual_News">Leveraging Wikipedia Knowledge to Cross-Language Classify Textual News</a>&quot;. DOI: 10.1109/iscmi.2017.8279619.