Extracting Ontologies from Arabic Wikipedia: a Linguistic Approach

Extracting Ontologies from Arabic Wikipedia: a Linguistic Approach is a scientific work related to Wikipedia quality, published in 2014 and written by Nora I. Al-Rajebah and Hend S. Al-Khalifa.

Overview

Building ontological models is one of the important aspects of the Semantic Web, and it has become a driving demand for developing a variety of Semantic Web applications. Over the years, much research has investigated how to generate ontologies automatically from semi-structured knowledge sources such as Wikipedia. Different ontology-building techniques have been explored, e.g., NLP tools and pattern matching, infoboxes, and structured knowledge sources (Cyc and WordNet). Looking at the results of previous approaches, the authors observe that the vast majority of the techniques employed did not consider the linguistic aspect of Wikipedia. In this article, the authors present a solution for extracting ontologies from Wikipedia using a linguistic approach based on the semantic field theory introduced by Jost Trier. Linguistic ontologies are significant in many applications for both linguists and Web researchers. The authors applied the proposed approach to the Arabic version of Wikipedia. The semantic relations were extracted from infoboxes, from hyperlinks within infoboxes, and from the list of categories that articles belong to. The authors' system successfully extracted approximately 760,000 triples from the Arabic Wikipedia. The authors conducted three experiments to evaluate the system output: a validation test, a crowd evaluation, and a domain experts' evaluation. The system output achieved an average precision of 65%.
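To make the idea of extracting triples from infoboxes, infobox hyperlinks and categories more concrete, the following is a minimal sketch and not the authors' system. It assumes simplified wikitext with one infobox field per line, and it uses English-style markup (`[[Category:...]]`) for readability, whereas the Arabic Wikipedia uses its own namespace names. The function name `extract_triples`, the `category` predicate and the sample article are all hypothetical, and the sketch does not model the semantic field theory on which the paper's approach is based.

```python
import re

# Wiki link: [[Target]] or [[Target|label]]; group 1 is the link target.
LINK_RE = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")
# Category link; "Category:" is an assumption made for readability.
CATEGORY_RE = re.compile(r"\[\[Category:([^\]|]+)(?:\|[^\]]*)?\]\]")


def extract_triples(title, wikitext):
    """Return (subject, predicate, object) triples for one article."""
    triples = []
    for line in wikitext.splitlines():
        line = line.strip()
        # Infobox fields are assumed to look like "| key = value".
        if line.startswith("|") and "=" in line:
            key, value = line[1:].split("=", 1)
            key, value = key.strip(), value.strip()
            if not value:
                continue  # skip empty fields
            # Prefer hyperlink targets; fall back to the raw field value.
            links = LINK_RE.findall(value)
            for target in (links if links else [value]):
                triples.append((title, key, target.strip()))
    # Each category the article belongs to yields one more triple.
    for category in CATEGORY_RE.findall(wikitext):
        triples.append((title, "category", category.strip()))
    return triples


# Hypothetical sample article, invented for illustration only.
SAMPLE = """{{Infobox city
| name = Cairo
| country = [[Egypt]]
| twin_cities = [[Istanbul]], [[Baghdad|Baghdad, Iraq]]
| mayor =
}}
[[Category:Capitals in Africa]]
[[Category:Cities in Egypt]]"""

for triple in extract_triples("Cairo", SAMPLE):
    print(triple)
```

Each infobox field becomes a predicate, each hyperlink inside a field becomes a separate object, and each category adds one further triple for the article. Real infoboxes contain nested templates and multi-line values, so a production extractor would need a proper wikitext parser.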
