Identifying Document Topics Using the Wikipedia Category Network
Latest revision as of 21:47, 16 November 2020
Authors | Péter Schönhofen |
---|---|
Publication date | 2009 |
DOI | 10.3233/WIA-2009-0162 |
Links | Original (http://dl.acm.org/citation.cfm?id=1551707.1551712) |
Identifying Document Topics Using the Wikipedia Category Network is a scientific work related to Wikipedia quality, published in 2009 and written by Péter Schönhofen.
Overview
In the last few years, the size and coverage of Wikipedia, a community-edited, freely available online encyclopedia, have reached the point where it can be used effectively to identify the topics discussed in a document, much like an ontology or taxonomy. In this paper, the author shows that even a fairly simple algorithm that exploits only the titles and categories of Wikipedia articles can characterize documents by Wikipedia categories surprisingly well. The author tests the reliability of the method by predicting the categories of Wikipedia articles themselves from their bodies, and by performing classification and clustering on 20 Newsgroups and RCV1, representing documents by their Wikipedia categories instead of (or in addition to) their texts.
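The general idea of characterizing a document through article titles and their categories can be illustrated with a small sketch. This is not the algorithm from the paper, whose details are not given on this page; it is a toy illustration that assumes a hypothetical in-memory mapping from article titles to categories (`TITLE_TO_CATEGORIES`, invented for this example), matches titles as word n-grams in the text, and lets each match vote for the categories of the matched article.

```python
import re
from collections import Counter

# Hypothetical toy data, NOT real Wikipedia content: article title -> categories.
TITLE_TO_CATEGORIES = {
    "neural network": ["Machine learning", "Artificial intelligence"],
    "support vector machine": ["Machine learning", "Classification algorithms"],
    "ice hockey": ["Team sports", "Winter sports"],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_categories(text, title_to_categories, max_title_len=3):
    """Rank categories by how often titles belonging to them occur in the text.

    Every word n-gram (up to max_title_len words) that equals a known title
    adds one vote to each category of that title. This simple voting scheme
    is an assumption made for illustration only.
    """
    tokens = tokenize(text)
    scores = Counter()
    for n in range(1, max_title_len + 1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            for category in title_to_categories.get(phrase, ()):
                scores[category] += 1
    # Highest-scoring categories first.
    return scores.most_common()


if __name__ == "__main__":
    doc = ("A support vector machine and a neural network are compared; "
           "the neural network wins.")
    for category, votes in rank_categories(doc, TITLE_TO_CATEGORIES):
        print(category, votes)
```

A real system would need the full set of Wikipedia titles and categories, handling of ambiguous titles, and some weighting of matches; the sketch only shows how category labels can stand in for, or supplement, the raw text of a document.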
Embed
Wikipedia Quality
Schönhofen, Péter. (2009). "[[Identifying Document Topics Using the Wikipedia Category Network]]". IOS Press. DOI: 10.3233/WIA-2009-0162.
English Wikipedia
{{cite journal |last1=Schönhofen |first1=Péter |title=Identifying Document Topics Using the Wikipedia Category Network |date=2009 |doi=10.3233/WIA-2009-0162 |url=https://wikipediaquality.com/wiki/Identifying_Document_Topics_Using_the_Wikipedia_Category_Network |journal=IOS Press}}
HTML
Schönhofen, Péter. (2009). "<a href="https://wikipediaquality.com/wiki/Identifying_Document_Topics_Using_the_Wikipedia_Category_Network">Identifying Document Topics Using the Wikipedia Category Network</a>". IOS Press. DOI: 10.3233/WIA-2009-0162.