Revision as of 08:03, 10 January 2020
Yago: a Core of Semantic Knowledge Unifying Wordnet and Wikipedia - a scientific work related to Wikipedia quality, published in 2007 and written by Fabian M. Suchanek, Gjergji Kasneci and Gerhard Weikum.
Overview
The authors present YAGO, a lightweight and extensible ontology with high coverage and quality. YAGO builds on entities and relations and currently contains more than 1 million entities and 5 million facts. This includes the Is-A hierarchy as well as non-taxonomic relations between entities (such as hasWonPrize). The facts have been automatically extracted from Wikipedia and unified with WordNet, using a carefully designed combination of rule-based and heuristic methods described in the paper. The resulting knowledge base is a major step beyond WordNet: in quality, by adding knowledge about individuals such as persons, organizations and products together with their semantic relationships, and in quantity, by increasing the number of facts by more than an order of magnitude. The authors' empirical evaluation of fact correctness shows an accuracy of about 95%. YAGO is based on a logically clean model, which is decidable, extensible, and compatible with RDFS. Finally, the authors show how YAGO can be further extended by state-of-the-art information extraction techniques.
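The data model described above can be illustrated with a minimal sketch (not the actual YAGO code): facts are subject-relation-object triples, individuals extracted from Wikipedia are linked via type facts into the WordNet-derived class hierarchy, and non-taxonomic relations like hasWonPrize sit alongside them. The entity and class names below are illustrative assumptions, not entries from the real knowledge base.

```python
# YAGO-style facts as (subject, relation, object) triples.
facts = [
    ("AlbertEinstein", "type", "physicist"),          # individual from Wikipedia
    ("physicist", "subClassOf", "scientist"),         # class hierarchy from WordNet
    ("scientist", "subClassOf", "person"),
    ("AlbertEinstein", "hasWonPrize", "NobelPrize"),  # non-taxonomic relation
]

def classes_of(entity):
    """All classes of an entity, following subClassOf transitively (Is-A)."""
    result = set()
    frontier = [o for s, r, o in facts if s == entity and r == "type"]
    while frontier:
        cls = frontier.pop()
        if cls in result:
            continue
        result.add(cls)
        frontier.extend(o for s, r, o in facts if s == cls and r == "subClassOf")
    return result

print(sorted(classes_of("AlbertEinstein")))
# → ['person', 'physicist', 'scientist']
```

The transitive lookup shows why unifying Wikipedia individuals with the WordNet taxonomy matters: a single type fact makes every superclass in the Is-A hierarchy reachable.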