Difference between revisions of "Path-Based Methods on Categorical Structures for Conceptual Representation of Wikipedia Articles"


Latest revision as of 20:27, 14 June 2019


Path-Based Methods on Categorical Structures for Conceptual Representation of Wikipedia Articles
Authors
Łukasz Kucharczyk
Julian Szymański
Publication date
2017
DOI
10.1007/s10844-016-0416-5
Links
Original

Path-Based Methods on Categorical Structures for Conceptual Representation of Wikipedia Articles is a scientific work related to Wikipedia quality, published in 2017 and written by Łukasz Kucharczyk and Julian Szymański.

Overview

Machine learning algorithms applied to text categorization mostly employ the Bag of Words (BoW) representation to describe the content of documents. This method has been used successfully in many applications, but it is known to have several limitations. One way of improving text representation is to use Wikipedia as the lexical knowledge base, an approach that has already shown promising results in many research studies. In this paper, the authors propose three path-based measures for computing document relatedness in the conceptual space formed by the hierarchical organization of the Wikipedia Category Graph (WCG). They compare the proposed approaches with the standard Path Length method to establish the best relatedness measure for the WCG representation. To test the overall efficiency of the WCG, they compare the proposed representations with the BoW method. The evaluation was performed with two different types of clustering algorithms (OPTICS and K-Means), used for categorization of keyword-based search results. The experiments have shown that the proposed approach outperforms the standard Path Length approach, and that the WCG representation achieves better results than BoW.
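As a rough illustration of the standard Path Length baseline mentioned above, the sketch below counts the edges on the shortest path between two categories in a small category graph. The toy graph, the category names, the function names and the choice to take the minimum over all category pairs of two documents are assumptions made for this example only. The three measures proposed in the paper are not described here, so they are not reproduced.

```python
from collections import deque

# Hypothetical toy category graph (child category -> parent categories).
# The real Wikipedia Category Graph is far larger; these names are
# illustrative only.
PARENTS = {
    "Machine learning": ["Artificial intelligence"],
    "Data mining": ["Artificial intelligence", "Information science"],
    "Artificial intelligence": ["Computer science"],
    "Information science": ["Science"],
    "Computer science": ["Science"],
    "Science": [],
}


def undirected(parents):
    """Build an undirected adjacency map from child -> parents links."""
    adj = {node: set() for node in parents}
    for child, parent_list in parents.items():
        for parent in parent_list:
            adj.setdefault(parent, set()).add(child)
            adj[child].add(parent)
    return adj


def path_length(adj, source, target):
    """Number of edges on the shortest path (BFS), or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in adj[node]:
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def document_distance(adj, cats_a, cats_b):
    """Assumed aggregation: smallest path length over all category pairs."""
    lengths = []
    for a in cats_a:
        for b in cats_b:
            d = path_length(adj, a, b)
            if d is not None:
                lengths.append(d)
    return min(lengths) if lengths else None


ADJ = undirected(PARENTS)
```

A shorter path means the two categories, and therefore the documents assigned to them, are treated as more closely related. A distance of this kind could then be handed to a clustering algorithm that accepts pairwise distances, which is one plausible way to connect it to the OPTICS experiments described in the overview.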

Embed

Wikipedia Quality

Kucharczyk, Łukasz; Szymański, Julian. (2017). "[[Path-Based Methods on Categorical Structures for Conceptual Representation of Wikipedia Articles]]". Springer US. DOI: 10.1007/s10844-016-0416-5.

English Wikipedia

{{cite journal |last1=Kucharczyk |first1=Łukasz |last2=Szymański |first2=Julian |title=Path-Based Methods on Categorical Structures for Conceptual Representation of Wikipedia Articles |date=2017 |doi=10.1007/s10844-016-0416-5 |url=https://wikipediaquality.com/wiki/Path-Based_Methods_on_Categorical_Structures_for_Conceptual_Representation_of_Wikipedia_Articles |journal=Springer US}}

HTML

Kucharczyk, Łukasz; Szymański, Julian. (2017). &quot;<a href="https://wikipediaquality.com/wiki/Path-Based_Methods_on_Categorical_Structures_for_Conceptual_Representation_of_Wikipedia_Articles">Path-Based Methods on Categorical Structures for Conceptual Representation of Wikipedia Articles</a>&quot;. Springer US. DOI: 10.1007/s10844-016-0416-5.