Difference between revisions of "User Interest Profile Identification Using Wikipedia Knowledge Database"

{{Infobox work
| title = User Interest Profile Identification Using Wikipedia Knowledge Database
| date = 2013
| authors = [[Huakang Li]]<br />[[Longbin Lai]]<br />[[Xiaofeng Xu]]<br />[[Yao Shen]]<br />[[Xiangyang Xu]]<br />[[Chunrong Xia]]
| doi = 10.1109/HPCC.and.EUC.2013.340
| link =
}}
 
'''User Interest Profile Identification Using Wikipedia Knowledge Database''' is a scientific work related to [[Wikipedia quality]], published in 2013 and written by [[Huakang Li]], [[Longbin Lai]], [[Xiaofeng Xu]], [[Yao Shen]], [[Xiangyang Xu]] and [[Chunrong Xia]].
 
  
 
== Overview ==
 
 
Targeted, relevant advertising that interests users is considered one of the most legitimate sources of revenue for personalized recommendation. Topic identification is the key technique for handling unstructured web pages. Conventional content classification approaches based on bag of words struggle to process massive numbers of web pages. In this paper, nodes of the [[Wikipedia]] Category Network (WCN) are used to identify the topic of a web page and to estimate a user's interest profile. Wikipedia is described as the largest knowledge database, and it is updated dynamically. A basic interest data set is marked on the WCN. The topic characterization of each WCN node is generated from the depth and breadth of the interest data set. To reduce the deviation introduced by breadth, a family generation algorithm is proposed to estimate the generation weight in the WCN. Finally, an interest decay model based on the number of URLs is proposed to represent a user's interest profile over a time period. Experimental results show that web page topic identification performs well when the WCN is used with the family model, and that the profile identification model responds dynamically for active users.
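The abstract does not give formulas, so the following Python sketch only illustrates the general idea of scoring a page against marked interest categories and decaying a profile per visited URL. It does not reproduce the paper's family generation algorithm. The function names, the toy category graph format, the geometric depth weight <code>decay</code> and the per-URL retention factor <code>retain</code> are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def topic_scores(parents, seeds, page_categories, decay=0.5):
    """Score marked interest categories for one page (illustrative only).

    parents: dict mapping a category to its parent categories (toy WCN).
    seeds: set of categories marked as basic interests.
    page_categories: categories attached to the page being classified.
    decay: hypothetical weight per level; a seed reached at depth d adds decay**d.
    """
    scores = defaultdict(float)
    for start in page_categories:
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            node, depth = queue.popleft()
            if node in seeds:
                scores[node] += decay ** depth
                continue  # do not climb past a marked interest
            for parent in parents.get(node, []):
                if parent not in seen:
                    seen.add(parent)
                    queue.append((parent, depth + 1))
    return dict(scores)


def update_profile(profile, page_scores, retain=0.9):
    """Decay every existing interest once per visited URL, then add the new page.

    retain: hypothetical fraction of each interest weight kept per URL.
    """
    updated = {topic: weight * retain for topic, weight in profile.items()}
    for topic, score in page_scores.items():
        updated[topic] = updated.get(topic, 0.0) + score
    return updated
```

Calling <code>update_profile</code> once per visited URL makes older interests fade as the user keeps browsing, which is one simple way to tie decay to the URL count rather than to wall-clock time.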
 

Revision as of 14:25, 23 November 2019

