Semantic Similarity Measurements for Multi-Lingual Short Texts Using Wikipedia

From Wikipedia Quality
Revision as of 10:36, 2 July 2020 by Zoey (Embed for English Wikipedia, HTML)


Semantic Similarity Measurements for Multi-Lingual Short Texts Using Wikipedia
Authors: Tatsuya Nakamura, Masumi Shirakawa, Takahiro Hara, Shojiro Nishio
Publication date: 2014
DOI: 10.1109/WI-IAT.2014.76

Semantic Similarity Measurements for Multi-Lingual Short Texts Using Wikipedia is a scientific work related to Wikipedia quality, published in 2014 and written by Tatsuya Nakamura, Masumi Shirakawa, Takahiro Hara and Shojiro Nishio.

Overview

In this paper, the authors propose two methods that use Wikipedia to measure the semantic similarity of short, multi-lingual texts. In recent years, people around the world have been continuously generating information about their local areas, in their own languages, on social networking services. Measuring the similarity between these texts is challenging because they are often short and written in various languages. The authors' methods address this problem by incorporating Wikipedia's inter-language links into extended naive Bayes (ENB), a probabilistic method for measuring the semantic similarity of short texts. The proposed methods represent a multi-lingual short text as a vector of articles (entities) from the English version of Wikipedia. The authors conducted an experiment on clustering tweets written in four languages (English, Spanish, Japanese and Arabic). The experimental results confirmed that their methods outperformed cross-lingual explicit semantic analysis (CL-ESA), a method for measuring the similarity between texts written in two different languages. Moreover, the methods were competitive with ENB applied to texts that had been translated into English using Google Translate. The authors' methods thus enable similarity measurements for multi-lingual short texts without the cost of machine translation.
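
The core idea of representing texts in any language as vectors over English Wikipedia articles can be illustrated with a minimal Python sketch. This is not the authors' implementation: it replaces the probabilistic ENB weighting with raw match counts and cosine similarity, and the dictionaries ENTITY_LINKS and INTERLANGUAGE_LINKS, along with both function names, are hypothetical toy stand-ins for data that would in practice be extracted from Wikipedia.

```python
import math
from collections import Counter

# Hypothetical toy data: surface term -> article title in that language edition.
ENTITY_LINKS = {
    "en": {"earthquake": "Earthquake", "tokyo": "Tokyo"},
    "es": {"terremoto": "Terremoto", "tokio": "Tokio"},
    "ja": {"地震": "地震", "東京": "東京都"},
}

# Hypothetical inter-language links: (language, local title) -> English title.
INTERLANGUAGE_LINKS = {
    ("es", "Terremoto"): "Earthquake",
    ("es", "Tokio"): "Tokyo",
    ("ja", "地震"): "Earthquake",
    ("ja", "東京都"): "Tokyo",
}


def to_english_entity_vector(text, lang):
    """Map a short text to a bag of English Wikipedia article titles."""
    vector = Counter()
    lowered = text.lower()
    for term, local_title in ENTITY_LINKS[lang].items():
        # Substring matching keeps the sketch usable for languages
        # written without spaces; real entity linking is more involved.
        count = lowered.count(term)
        if count == 0:
            continue
        if lang == "en":
            english_title = local_title
        else:
            english_title = INTERLANGUAGE_LINKS.get((lang, local_title))
        if english_title is not None:
            vector[english_title] += count
    return vector


def cosine_similarity(a, b):
    """Cosine similarity between two sparse entity vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


english = to_english_entity_vector("Earthquake hits Tokyo", "en")
japanese = to_english_entity_vector("東京で地震", "ja")
print(cosine_similarity(english, japanese))
```

Because both texts end up in the same English entity space, they can be compared directly without translating either one, which is the property the overview attributes to the proposed methods.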

Embed

Wikipedia Quality

Nakamura, Tatsuya; Shirakawa, Masumi; Hara, Takahiro; Nishio, Shojiro. (2014). "[[Semantic Similarity Measurements for Multi-Lingual Short Texts Using Wikipedia]]". IEEE Computer Society. DOI: 10.1109/WI-IAT.2014.76.

English Wikipedia

{{cite journal |last1=Nakamura |first1=Tatsuya |last2=Shirakawa |first2=Masumi |last3=Hara |first3=Takahiro |last4=Nishio |first4=Shojiro |title=Semantic Similarity Measurements for Multi-Lingual Short Texts Using Wikipedia |date=2014 |doi=10.1109/WI-IAT.2014.76 |url=https://wikipediaquality.com/wiki/Semantic_Similarity_Measurements_for_Multi-Lingual_Short_Texts_Using_Wikipedia |journal=IEEE Computer Society}}

HTML

Nakamura, Tatsuya; Shirakawa, Masumi; Hara, Takahiro; Nishio, Shojiro. (2014). &quot;<a href="https://wikipediaquality.com/wiki/Semantic_Similarity_Measurements_for_Multi-Lingual_Short_Texts_Using_Wikipedia">Semantic Similarity Measurements for Multi-Lingual Short Texts Using Wikipedia</a>&quot;. IEEE Computer Society. DOI: 10.1109/WI-IAT.2014.76.