Fast, Accurate, Multilingual Semantic Relatedness Measurement Using Wikipedia Links



Authors
Dante Degl’Innocenti
Dario De Nart
Muhammad Helmy
Carlo Tasso
Publication date
2018
DOI
10.1007/978-3-319-67056-0_27
Links
Original

Fast, Accurate, Multilingual Semantic Relatedness Measurement Using Wikipedia Links is a scientific work related to Wikipedia quality, published in 2018 and written by Dante Degl’Innocenti, Dario De Nart, Muhammad Helmy and Carlo Tasso.

Overview

In this chapter, the authors present a fast, accurate, and elegant metric for assessing semantic relatedness among entities in a hypertextual corpus by building a novel, language-independent Vector Space Model. The technique is based on the Jaccard similarity coefficient, approximated with the MinHash technique to generate a constant-size vector fingerprint for each entity in the corpus. This strategy allows pairwise semantic relatedness to be evaluated in constant time, regardless of how many entities the data contains or how dense the internal link structure is. Because semantic relatedness is a subtle and somewhat subjective matter, the authors evaluated the approach by running user tests on a crowdsourcing platform. For a more robust evaluation, the authors considered two collaboratively built corpora, the English Wikipedia and the Italian Wikipedia, which differ significantly in size, topology, and user base. The evaluation suggests that the proposed technique generates satisfactory results, outperforming commercial baseline systems regardless of the data employed and the cultural differences among the test users.
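The fingerprinting idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fingerprint length `NUM_HASHES`, the salted BLAKE2b hash family, the choice to represent an entity by a set of link targets, and the example link sets are all assumptions made for illustration. The sketch shows why comparison cost depends only on the fingerprint length and not on the number of links an entity has.

```python
from __future__ import annotations

import hashlib

# Fingerprint length. An assumed value for illustration, not taken from the chapter.
NUM_HASHES = 128


def _hash(seed: int, item: str) -> int:
    """Deterministic 64-bit hash of `item`, salted with `seed`."""
    digest = hashlib.blake2b(
        item.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(links: set[str], num_hashes: int = NUM_HASHES) -> list[int]:
    """Constant-size fingerprint: the minimum hash of the set under each seed."""
    if not links:
        raise ValueError("cannot fingerprint an empty link set")
    return [min(_hash(seed, link) for link in links) for seed in range(num_hashes)]


def estimated_jaccard(sig_a: list[int], sig_b: list[int]) -> float:
    """Fraction of matching positions, an estimate of the Jaccard coefficient."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a: set[str], b: set[str]) -> float:
    """Exact Jaccard coefficient |A ∩ B| / |A ∪ B|, for comparison."""
    return len(a & b) / len(a | b)


# Hypothetical link sets for two entities (made-up data).
rome = {"Italy", "Lazio", "Tiber", "Roman Empire", "Vatican City", "Colosseum"}
milan = {"Italy", "Lombardy", "Po Valley", "Roman Empire", "La Scala", "Duomo"}

sig_rome = minhash_signature(rome)
sig_milan = minhash_signature(milan)

print("exact Jaccard:    ", exact_jaccard(rome, milan))
print("MinHash estimate: ", estimated_jaccard(sig_rome, sig_milan))
```

Building a signature costs time proportional to the size of the link set, but it is done once per entity. Comparing two entities afterwards touches only `NUM_HASHES` integers, which is the constant-time property the overview refers to. A longer fingerprint reduces the variance of the estimate at the cost of more storage per entity.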

Embed

Wikipedia Quality

Degl’Innocenti, Dante; De Nart, Dario; Helmy, Muhammad; Tasso, Carlo. (2018). "[[Fast, Accurate, Multilingual Semantic Relatedness Measurement Using Wikipedia Links]]". Springer, Cham. DOI: 10.1007/978-3-319-67056-0_27.

English Wikipedia

{{cite journal |last1=Degl’Innocenti |first1=Dante |last2=De Nart |first2=Dario |last3=Helmy |first3=Muhammad |last4=Tasso |first4=Carlo |title=Fast, Accurate, Multilingual Semantic Relatedness Measurement Using Wikipedia Links |date=2018 |doi=10.1007/978-3-319-67056-0_27 |url=https://wikipediaquality.com/wiki/Fast,_Accurate,_Multilingual_Semantic_Relatedness_Measurement_Using_Wikipedia_Links |journal=Springer, Cham}}

HTML

Degl’Innocenti, Dante; De Nart, Dario; Helmy, Muhammad; Tasso, Carlo. (2018). &quot;<a href="https://wikipediaquality.com/wiki/Fast,_Accurate,_Multilingual_Semantic_Relatedness_Measurement_Using_Wikipedia_Links">Fast, Accurate, Multilingual Semantic Relatedness Measurement Using Wikipedia Links</a>&quot;. Springer, Cham. DOI: 10.1007/978-3-319-67056-0_27.