Difference between revisions of "Identifying Document Topics Using the Wikipedia Category Network"
(+ cat.) |
(+ embed code) |
||
Line 2: | Line 2: | ||
| title = Identifying Document Topics Using the Wikipedia Category Network | | title = Identifying Document Topics Using the Wikipedia Category Network | ||
| date = 2006 | | date = 2006 | ||
− | | authors = [[ | + | | authors = [[Peter Sch]] |
− | + | | link = http://doi.ieeecomputersociety.org/10.1109/WI.2006.92 | |
− | | link = http:// | ||
− | |||
}} | }} | ||
− | '''Identifying Document Topics Using the Wikipedia Category Network''' - scientific work related to [[Wikipedia quality]] published in 2006, written by [[ | + | '''Identifying Document Topics Using the Wikipedia Category Network''' - scientific work related to [[Wikipedia quality]] published in 2006, written by [[Peter Sch]]. |
== Overview == | == Overview == | ||
− | In the last few years the size and coverage of | + | In the last few years the size and coverage of [[Wikipedia]], a freely available on-line encyclopedia has reached the point where it can be utilized similar to an [[ontology]] or taxonomy to identify the topics discussed in a document. In this paper authors will show that even a simple algorithm that exploits only the titles and [[categories]] of Wikipedia articles can characterize documents by [[Wikipedia categories]] surprisingly well. Authors test the [[reliability]] of method by predicting categories of Wikipedia articles themselves based on their bodies, and by performing classification and clustering on 20 Newsgroups and RCV1, representing documents by their Wikipedia categories instead of their texts. |
== Embed == | == Embed == | ||
Line 16: | Line 14: | ||
<code> | <code> | ||
<nowiki> | <nowiki> | ||
− | + | Sch, Peter. (2006). "[[Identifying Document Topics Using the Wikipedia Category Network]]". | |
</nowiki> | </nowiki> | ||
</code> | </code> | ||
Line 23: | Line 21: | ||
<code> | <code> | ||
<nowiki> | <nowiki> | ||
− | {{cite journal |last1= | + | {{cite journal |last1=Sch |first1=Peter |title=Identifying Document Topics Using the Wikipedia Category Network |date=2006 |url=https://wikipediaquality.com/wiki/Identifying_Document_Topics_Using_the_Wikipedia_Category_Network}} |
</nowiki> | </nowiki> | ||
</code> | </code> | ||
Line 30: | Line 28: | ||
<code> | <code> | ||
<nowiki> | <nowiki> | ||
− | + | Sch, Peter. (2006). &quot;<a href="https://wikipediaquality.com/wiki/Identifying_Document_Topics_Using_the_Wikipedia_Category_Network">Identifying Document Topics Using the Wikipedia Category Network</a>&quot;. | |
</nowiki> | </nowiki> | ||
</code> | </code> | ||
− | |||
− | |||
− | |||
− |
Revision as of 08:14, 19 November 2019
Authors | Peter Sch |
---|---|
Publication date | 2006 |
Links | Original |
Identifying Document Topics Using the Wikipedia Category Network is a scientific work related to Wikipedia quality, published in 2006 and written by Peter Sch.
Overview
In the last few years, the size and coverage of Wikipedia, a freely available online encyclopedia, have reached the point where it can be used much like an ontology or taxonomy to identify the topics discussed in a document. In this paper, the authors show that even a simple algorithm that exploits only the titles and categories of Wikipedia articles can characterize documents by Wikipedia categories surprisingly well. The authors test the reliability of the method by predicting the categories of Wikipedia articles themselves from their bodies, and by performing classification and clustering on 20 Newsgroups and RCV1, representing documents by their Wikipedia categories instead of their texts.
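The general idea of characterizing a document using only article titles and categories can be illustrated with a minimal sketch. This is not the authors' algorithm, which the overview does not describe in detail. It only assumes that article titles found in a text can vote for the categories of those articles. The mapping `TITLE_CATEGORIES`, the function `rank_categories`, and the toy data are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical toy mapping: article title -> categories of that article.
# A real system would load this from a Wikipedia dump instead.
TITLE_CATEGORIES = {
    "graphics card": ["Computer hardware"],
    "device driver": ["Computer hardware", "System software"],
    "ice hockey": ["Team sports"],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_categories(text, title_categories, max_title_len=3):
    """Rank categories by how many matched article titles point to them.

    Every word n-gram (up to max_title_len words) that equals a known
    article title casts one vote for each category of that article.
    """
    tokens = tokenize(text)
    votes = Counter()
    for n in range(1, max_title_len + 1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            for category in title_categories.get(phrase, ()):
                votes[category] += 1
    # Highest-voted categories first.
    return votes.most_common()


if __name__ == "__main__":
    doc = "My graphics card needs a new device driver."
    print(rank_categories(doc, TITLE_CATEGORIES))
```

The ranked categories could then stand in for the document's text as features for classification or clustering, which is the kind of representation the overview mentions for 20 Newsgroups and RCV1. Unweighted voting is the simplest choice here, and any weighting scheme would be a further assumption.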
Embed
Wikipedia Quality
Sch, Peter. (2006). "[[Identifying Document Topics Using the Wikipedia Category Network]]".
English Wikipedia
{{cite journal |last1=Sch |first1=Peter |title=Identifying Document Topics Using the Wikipedia Category Network |date=2006 |url=https://wikipediaquality.com/wiki/Identifying_Document_Topics_Using_the_Wikipedia_Category_Network}}
HTML
Sch, Peter. (2006). "<a href="https://wikipediaquality.com/wiki/Identifying_Document_Topics_Using_the_Wikipedia_Category_Network">Identifying Document Topics Using the Wikipedia Category Network</a>".