Difference between revisions of "Selecting Keywords to Represent Web Pages Using Wikipedia Information"


Revision as of 17:16, 24 July 2019


Selecting Keywords to Represent Web Pages Using Wikipedia Information
Authors: Maisa Vidal, Guilherme Vale Menezes, Klessius Berlt, Edleno Silva de Moura, Karla Okada, Nivio Ziviani, David Fernandes, Marco Cristo
Publication date: 2012
DOI: 10.1145/2382636.2382714

Selecting Keywords to Represent Web Pages Using Wikipedia Information is a scientific work related to Wikipedia quality, published in 2012 and written by Maisa Vidal, Guilherme Vale Menezes, Klessius Berlt, Edleno Silva de Moura, Karla Okada, Nivio Ziviani, David Fernandes and Marco Cristo.

Overview

In this paper, the authors present three new methods to extract keywords from web pages using Wikipedia as an external source of information. The information used from Wikipedia includes the titles of articles, the co-occurrence of keywords and the categories associated with each Wikipedia definition. The authors compare their three methods with three keyword extraction methods used as baselines: (i) all the terms of a web page, (ii) a TF-IDF implementation that extracts single weighted words from a web page and (iii) a previously proposed Wikipedia-based keyword extraction method from the literature. The comparison covers three distinct scenarios, all related to the target application, which is the selection of ads in a context-based advertising system. In the first scenario, the target pages on which to place ads were extracted from Wikipedia articles, whereas the target pages in the other two scenarios were extracted from a news web site. Experimental results show that the proposed methods are competitive solutions for the task of selecting good keywords to represent target web pages, while being simple, effective and time efficient. For instance, in the first scenario, the best method used to extract keywords from Wikipedia articles achieved an improvement of 33% when compared to the second-best baseline, and a gain of 26% when considering all the terms.
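To make baseline (ii) more concrete, the following is a minimal sketch of TF-IDF extraction of single weighted words from a page. It is not the implementation used in the paper: the function names `tokenize` and `tfidf_keywords`, the smoothed IDF formula, the tie-breaking rule and the toy pages are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(target_page, corpus, top_k=3):
    """Return the top_k single-word keywords of target_page ranked by TF-IDF.

    corpus is a list of page texts (assumed to include target_page) that is
    used only to estimate document frequencies.
    """
    doc_tokens = [set(tokenize(doc)) for doc in corpus]
    n_docs = len(corpus)
    term_counts = Counter(tokenize(target_page))
    total_terms = sum(term_counts.values())

    scores = {}
    for term, count in term_counts.items():
        tf = count / total_terms
        df = sum(1 for tokens in doc_tokens if term in tokens)
        # Smoothed IDF (an assumption of this sketch) avoids division by zero.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[term] = tf * idf

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_k]]


# Hypothetical toy pages, not data from the paper.
pages = [
    "wikipedia keywords represent wikipedia pages with keywords",
    "news pages report daily events",
    "ads are placed on news pages",
]
print(tfidf_keywords(pages[0], pages, top_k=2))
```

In a context-based advertising setting such as the one described above, the returned words would then be matched against the keywords associated with candidate ads; that matching step is outside the scope of this sketch.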

Embed

Wikipedia Quality

Vidal, Maisa; Menezes, Guilherme Vale; Berlt, Klessius; Moura, Edleno Silva de; Okada, Karla; Ziviani, Nivio; Fernandes, David; Cristo, Marco. (2012). "[[Selecting Keywords to Represent Web Pages Using Wikipedia Information]]". DOI: 10.1145/2382636.2382714.

English Wikipedia

{{cite journal |last1=Vidal |first1=Maisa |last2=Menezes |first2=Guilherme Vale |last3=Berlt |first3=Klessius |last4=Moura |first4=Edleno Silva de |last5=Okada |first5=Karla |last6=Ziviani |first6=Nivio |last7=Fernandes |first7=David |last8=Cristo |first8=Marco |title=Selecting Keywords to Represent Web Pages Using Wikipedia Information |date=2012 |doi=10.1145/2382636.2382714 |url=https://wikipediaquality.com/wiki/Selecting_Keywords_to_Represent_Web_Pages_Using_Wikipedia_Information}}

HTML

Vidal, Maisa; Menezes, Guilherme Vale; Berlt, Klessius; Moura, Edleno Silva de; Okada, Karla; Ziviani, Nivio; Fernandes, David; Cristo, Marco. (2012). &quot;<a href="https://wikipediaquality.com/wiki/Selecting_Keywords_to_Represent_Web_Pages_Using_Wikipedia_Information">Selecting Keywords to Represent Web Pages Using Wikipedia Information</a>&quot;. DOI: 10.1145/2382636.2382714.