Difference between revisions of "Computing Semantic Relatedness Using Wikipedia Features"

From Wikipedia Quality
'''Computing Semantic Relatedness Using Wikipedia Features''' - scientific work related to [[Wikipedia quality]] published in 2013, written by [[Mohamed Ali Hadj Taieb]], [[Mohamed Ben Aouicha]] and [[Abdelmajid Ben Hamadou]].

== Overview ==
Measuring semantic [[relatedness]] is a critical task in many domains such as psychology, biology, linguistics, cognitive science and [[artificial intelligence]]. In this paper, the authors propose a novel system for computing semantic relatedness between words. Recent approaches have exploited [[Wikipedia]] as a huge semantic resource and have shown good performance. The authors therefore utilize Wikipedia [[features]] (articles, [[categories]], the Wikipedia category graph and redirections) in a system that combines this Wikipedia [[semantic information]] across its different components. The approach begins with a pre-processing step that provides, for each category in the Wikipedia category graph, a semantic description vector containing the weights of stems extracted from the articles assigned to that category. Next, for each candidate word, the authors collect its set of categories using an algorithm for category extraction from the Wikipedia category graph. They then compute the degree of semantic relatedness using existing vector similarity metrics (Dice, Overlap and Cosine) and a newly proposed metric that performs as well as the cosine formula. The basic system is followed by a set of modules that exploit further Wikipedia features to quantify the semantic relatedness between words as accurately as possible. The authors evaluate the measure on two tasks: comparison with human judgments using five datasets, and a specific application, ''solving choice problems''. The resulting system shows good performance and sometimes outperforms the ESA (Explicit Semantic Analysis) and TSA (Temporal Semantic Analysis) approaches.
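The vector similarity step described above can be sketched as follows. This is a minimal illustration only: the stem weights and example vectors are hypothetical, and the paper's own weighting scheme and proposed metric are not reproduced here — only the standard Dice, Overlap and Cosine formulas applied to sparse stem-weight vectors.

```python
from math import sqrt

# Each "semantic description vector" is modeled as a dict
# mapping a stem to its weight (sparse vector representation).

def _dot(u, v):
    """Dot product over the stems shared by both vectors."""
    return sum(u[s] * v[s] for s in set(u) & set(v))

def _norm_sq(u):
    """Squared Euclidean norm of a weight vector."""
    return sum(w * w for w in u.values())

def dice(u, v):
    den = _norm_sq(u) + _norm_sq(v)
    return 2 * _dot(u, v) / den if den else 0.0

def overlap(u, v):
    den = min(_norm_sq(u), _norm_sq(v))
    return _dot(u, v) / den if den else 0.0

def cosine(u, v):
    den = sqrt(_norm_sq(u)) * sqrt(_norm_sq(v))
    return _dot(u, v) / den if den else 0.0

# Hypothetical category description vectors for two words.
car = {"vehicl": 2.0, "engin": 1.0, "road": 0.5}
automobile = {"vehicl": 1.8, "engin": 1.2, "wheel": 0.4}

print(round(cosine(car, automobile), 3))
```

Identical vectors score 1.0 under all three metrics, and vectors with no shared stems score 0.0, which matches the intuition of comparing category-based stem profiles.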

Revision as of 09:07, 14 January 2021
