Difference between revisions of "Wikipedia Link Structure and Text Mining for Semantic Relation Extraction Towards a Huge Scale Global Web Ontology"
Revision as of 13:22, 11 January 2021
Wikipedia Link Structure and Text Mining for Semantic Relation Extraction Towards a Huge Scale Global Web Ontology - scientific work related to Wikipedia quality published in 2008, written by Kotaro Nakayama, Takahiro Hara and Shojiro Nishio.
Overview
Wikipedia, a collaborative Wiki-based encyclopedia, has become a huge phenomenon among Internet users. It covers a huge number of concepts in various fields such as Arts, Geography, History, Science, Sports and Games. Since it is becoming a database storing all human knowledge, Wikipedia mining is a promising approach that bridges the Semantic Web and the Social Web (a.k.a. Web 2.0). In fact, previous research on Wikipedia mining has clearly demonstrated that Wikipedia has a remarkable capability as a corpus for knowledge extraction, especially for relatedness measurement among concepts. However, semantic relatedness is just the numerical strength of a relation and does not carry an explicit relation type. To extract inferable semantic relations with explicit relation types, the authors need to analyze not only the link structure but also the texts in Wikipedia. In this paper, the authors propose a consistent approach to semantic relation extraction from Wikipedia. The method consists of three sub-processes highly optimized for Wikipedia mining: 1) fast preprocessing, 2) POS (Part Of Speech) tag tree analysis, and 3) mainstay extraction. Furthermore, detailed evaluation showed that link structure mining improves both the accuracy and the scalability of semantic relation extraction.
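To make the idea of an explicit relation type more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a sentence from an article has already been tokenized and POS-tagged (the tags here are assigned by hand in Penn Treebank style), and that each token carries a flag saying whether it is the anchor of a wiki link. The function name `extract_triple`, the `Token` layout and the example sentence are all hypothetical. The sketch treats the article title as the subject, the first linked token after the first verb as the object, and the words in between as the relation type.

```python
from typing import List, Optional, Tuple

# (word, POS tag, is_link). Tags are hand-assigned for illustration;
# a linked token stands for the anchor text of a wiki link.
Token = Tuple[str, str, bool]


def extract_triple(title: str, tagged: List[Token]) -> Optional[Tuple[str, str, str]]:
    """Return (subject, predicate, object) for a 'title VERB ... link' sentence.

    Hypothetical heuristic: the subject is the article title, the object is
    the first linked token after the first verb, and the predicate is the
    run of words from that verb up to (not including) the link.
    """
    # Locate the first verb; it starts the predicate.
    verb_idx = next(
        (i for i, (_, pos, _) in enumerate(tagged) if pos.startswith("VB")),
        None,
    )
    if verb_idx is None:
        return None

    # Scan forward for the first linked token; it becomes the object.
    for j in range(verb_idx + 1, len(tagged)):
        word, _, is_link = tagged[j]
        if is_link:
            predicate = " ".join(w for w, _, _ in tagged[verb_idx:j])
            return (title, predicate, word)
    return None


# Hand-tagged example sentence: "Brescia is a city in Italy",
# where "Italy" is assumed to be a wiki link in the article text.
sentence = [
    ("Brescia", "NNP", False),
    ("is", "VBZ", False),
    ("a", "DT", False),
    ("city", "NN", False),
    ("in", "IN", False),
    ("Italy", "NNP", True),
]
print(extract_triple("Brescia", sentence))  # ('Brescia', 'is a city in', 'Italy')
```

Unlike a relatedness score, which would only say that "Brescia" and "Italy" are strongly related, the triple names the relation itself. Restricting objects to linked tokens is one plausible way that link structure could narrow the search, but how the paper actually combines links with POS tag tree analysis is not specified in this overview.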