Difference between revisions of "Identifying Document Topics Using the Wikipedia Category Network"

From Wikipedia Quality
Jump to: navigation, search
(Links)
(Embed for English Wikipedia, HTML)
Line 1: Line 1:
'''Identifying Document Topics Using the Wikipedia Category Network''' - scientific work related to [[Wikipedia quality]] published in 2006, written by [[Peter Sch]].
+
{{Infobox work
 +
| title = Identifying Document Topics Using the Wikipedia Category Network
 +
| date = 2006
 +
| authors = [[Péter Schönhofen]]
 +
| doi = 10.1109/WI.2006.92
 +
| link = http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4061411
 +
| plink = https://www.researchgate.net/profile/Peter_Schoenhofen/publication/233748512_Identifying_Document_Topics_Using_the_Wikipedia_Category_Network/links/0912f50b141fcda6e7000000.pdf
 +
}}
 +
'''Identifying Document Topics Using the Wikipedia Category Network''' - scientific work related to [[Wikipedia quality]] published in 2006, written by [[Péter Schönhofen]].
  
 
== Overview ==
 
== Overview ==
In the last few years the size and coverage of [[Wikipedia]], a freely available on-line encyclopedia has reached the point where it can be utilized similar to an [[ontology]] or taxonomy to identify the topics discussed in a document. In this paper authors will show that even a simple algorithm that exploits only the titles and [[categories]] of Wikipedia articles can characterize documents by [[Wikipedia categories]] surprisingly well. Authors test the [[reliability]] of method by predicting categories of Wikipedia articles themselves based on their bodies, and by performing classification and clustering on 20 Newsgroups and RCV1, representing documents by their Wikipedia categories instead of their texts.
+
In the last few years the size and coverage of Wikipe- dia, a freely available on-line encyclopedia has reached the point where it can be utilized similar to an [[ontology]] or tax- onomy to identify the topics discussed in a document. In this paper authors will show that even a simple algorithm that exploits only the titles and [[categories]] of [[Wikipedia]] articles can characterize documents by [[Wikipedia categories]] sur- prisingly well. Authors test the [[reliability]] of method by pre- dicting categories ofWikipedia articles themselves based on their bodies, and by performing classification and cluster- ing on 20 Newsgroups and RCV1, representing documents by their Wikipedia categories instead of their texts.
 +
 
 +
== Embed ==
 +
=== Wikipedia Quality ===
 +
<code>
 +
<nowiki>
 +
Schönhofen, Péter. (2006). "[[Identifying Document Topics Using the Wikipedia Category Network]]".DOI: 10.1109/WI.2006.92.
 +
</nowiki>
 +
</code>
 +
 
 +
=== English Wikipedia ===
 +
<code>
 +
<nowiki>
 +
{{cite journal |last1=Schönhofen |first1=Péter |title=Identifying Document Topics Using the Wikipedia Category Network |date=2006 |doi=10.1109/WI.2006.92 |url=https://wikipediaquality.com/wiki/Identifying_Document_Topics_Using_the_Wikipedia_Category_Network}}
 +
</nowiki>
 +
</code>
 +
 
 +
=== HTML ===
 +
<code>
 +
<nowiki>
 +
Schönhofen, Péter. (2006). &amp;quot;<a href="https://wikipediaquality.com/wiki/Identifying_Document_Topics_Using_the_Wikipedia_Category_Network">Identifying Document Topics Using the Wikipedia Category Network</a>&amp;quot;.DOI: 10.1109/WI.2006.92.
 +
</nowiki>
 +
</code>

Revision as of 11:01, 15 September 2019


Identifying Document Topics Using the Wikipedia Category Network
Authors
Péter Schönhofen
Publication date
2006
DOI
10.1109/WI.2006.92
Links
Original Preprint

Identifying Document Topics Using the Wikipedia Category Network - scientific work related to Wikipedia quality published in 2006, written by Péter Schönhofen.

Overview

In the last few years the size and coverage of Wikipe- dia, a freely available on-line encyclopedia has reached the point where it can be utilized similar to an ontology or tax- onomy to identify the topics discussed in a document. In this paper authors will show that even a simple algorithm that exploits only the titles and categories of Wikipedia articles can characterize documents by Wikipedia categories sur- prisingly well. Authors test the reliability of method by pre- dicting categories ofWikipedia articles themselves based on their bodies, and by performing classification and cluster- ing on 20 Newsgroups and RCV1, representing documents by their Wikipedia categories instead of their texts.

Embed

Wikipedia Quality

Schönhofen, Péter. (2006). "[[Identifying Document Topics Using the Wikipedia Category Network]]".DOI: 10.1109/WI.2006.92.

English Wikipedia

{{cite journal |last1=Schönhofen |first1=Péter |title=Identifying Document Topics Using the Wikipedia Category Network |date=2006 |doi=10.1109/WI.2006.92 |url=https://wikipediaquality.com/wiki/Identifying_Document_Topics_Using_the_Wikipedia_Category_Network}}

HTML

Schönhofen, Péter. (2006). &quot;<a href="https://wikipediaquality.com/wiki/Identifying_Document_Topics_Using_the_Wikipedia_Category_Network">Identifying Document Topics Using the Wikipedia Category Network</a>&quot;.DOI: 10.1109/WI.2006.92.