{{Infobox work
| title = An Approach for Deriving Semantically Related Category Hierarchies from Wikipedia Category Graphs
| date = 2013
| authors = [[Khaled A. Hejazy]]<br />[[Samhaa R. El-Beltagy]]
| doi = 10.1007/978-3-642-36981-0_8
| link = https://link.springer.com/chapter/10.1007/978-3-642-36981-0_8
}}
 
'''An Approach for Deriving Semantically Related Category Hierarchies from Wikipedia Category Graphs''' is a scientific work related to [[Wikipedia quality]], published in 2013 and written by [[Khaled A. Hejazy]] and [[Samhaa R. El-Beltagy]].
 
 
== Overview ==
 
 
Wikipedia is the largest online encyclopedia to date. Its rich content and semi-structured nature have made it a valuable research resource for classification, [[information extraction]], and semantic annotation, among other tasks. Many applications can benefit from a topic hierarchy in [[Wikipedia]]. However, what Wikipedia currently offers is a category graph built from hierarchical category links whose semantics are undefined. Because of this, a sub-category in Wikipedia does not necessarily comply with the concept of a sub-category in a hierarchy; all the link signifies is that some relationship exists between the parent category and its sub-category. As a result, traversing the category links of a given category can produce surprising results. For example, following the sub-category links down from the category “Computing” leads to the unrelated category “Theology”. In this paper, the authors introduce a novel algorithm that measures the semantic [[relatedness]] between a given Wikipedia category and the nodes in its sub-graph, and uses that measure to extract a category hierarchy containing only nodes relevant to the parent category. The algorithm was evaluated by comparing its output with a gold standard data set. The experimental setup and results are presented.
 
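The general idea of pruning a category sub-graph by relatedness to the root category can be illustrated with a minimal sketch. This is not the authors' algorithm: the overview does not state which relatedness measure or cut-off the paper uses, so the Jaccard word overlap, the threshold value, the function names, and the toy graph and keyword sets below are all assumptions made for illustration.

```python
from collections import deque


def relatedness(a, b):
    """Toy relatedness score: Jaccard overlap of two keyword sets.

    Assumption: a stand-in only; the paper's actual measure is not
    described in the overview above.
    """
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def derive_hierarchy(graph, keywords, root, threshold):
    """Breadth-first walk from `root` that keeps a sub-category only if
    its relatedness to the root reaches `threshold`.

    Returns a dict mapping each kept category to its kept children.
    Rejected categories are not expanded (a design choice of this sketch).
    """
    kept = {root: []}
    queue = deque([root])
    while queue:
        parent = queue.popleft()
        for child in graph.get(parent, []):
            if child in kept:
                continue  # already placed; also guards against cycles
            score = relatedness(keywords[root], keywords.get(child, set()))
            if score >= threshold:
                kept[parent].append(child)
                kept[child] = []
                queue.append(child)
    return kept


# Hypothetical toy data, invented for this example (not real Wikipedia links).
graph = {
    "Computing": ["Software", "Information"],
    "Software": ["Algorithms"],
    "Information": ["Theology"],
    "Algorithms": ["Computing"],  # back-link, to show the cycle guard
}
keywords = {
    "Computing": {"computer", "software", "algorithm", "data"},
    "Software": {"software", "program", "computer"},
    "Algorithms": {"algorithm", "data", "complexity"},
    "Information": {"data", "knowledge", "belief"},
    "Theology": {"religion", "faith", "god"},
}

hierarchy = derive_hierarchy(graph, keywords, "Computing", threshold=0.15)
for category, children in hierarchy.items():
    print(category, "->", children)
```

In this toy run, "Theology" is reachable from "Computing" through sub-category links but shares no keywords with it, so it is left out of the derived hierarchy. Not expanding rejected nodes keeps the walk small; whether the paper makes the same choice is not stated in the overview.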

Revision as of 08:42, 14 August 2020

