Difference between revisions of "Transforming Wikipedia into Named Entity Training Data"


Revision as of 11:16, 15 September 2019


Transforming Wikipedia into Named Entity Training Data
Authors: Joel Nothman, James R. Curran, Tara Murphy
Publication date: 2008
Links: Original Preprint

Transforming Wikipedia into Named Entity Training Data is a scientific work related to Wikipedia quality, published in 2008 and written by Joel Nothman, James R. Curran and Tara Murphy.

Overview

Statistical named entity recognisers require costly hand-labelled training data and, as a result, most existing corpora are small. The authors exploit Wikipedia to create a massive corpus of text annotated with named entities. They transform Wikipedia’s links into named entity annotations by classifying the target articles into common entity types (e.g. person, organisation and location). In comparisons with the MUC, CoNLL and BBN corpora, models trained on the Wikipedia-derived corpus generally perform better than other cross-corpus train/test pairs.
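
The link-to-annotation idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the lookup table ARTICLE_TYPES, the function name annotate and the label names PER, ORG, LOC and O are all assumptions made for illustration, and the article classifier that the paper relies on is replaced here by a hand-written dictionary. Tokenisation is plain whitespace splitting, and links whose target has no known type are simply left unlabelled.

```python
import re

# Hypothetical article-to-type lookup. In the paper this role is played by
# a classifier over Wikipedia articles, which is not reproduced here.
ARTICLE_TYPES = {
    "Tara Murphy": "PER",
    "University of Sydney": "ORG",
    "Sydney": "LOC",
}

# Matches [[Target]] and [[Target|anchor text]] wiki links.
LINK_RE = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def annotate(wikitext, article_types):
    """Return (token, label) pairs; linked tokens take the target's type."""
    pairs = []
    pos = 0
    for match in LINK_RE.finditer(wikitext):
        # Text before the link carries no annotation.
        pairs.extend((tok, "O") for tok in wikitext[pos:match.start()].split())
        target = match.group(1).strip()
        anchor = match.group(2) or match.group(1)
        # Assumption: targets with no known type are left unlabelled.
        label = article_types.get(target, "O")
        pairs.extend((tok, label) for tok in anchor.split())
        pos = match.end()
    # Remaining text after the last link.
    pairs.extend((tok, "O") for tok in wikitext[pos:].split())
    return pairs


sentence = "[[Tara Murphy]] works at the [[University of Sydney|university]] in [[Sydney]] ."
for token, label in annotate(sentence, ARTICLE_TYPES):
    print(token, label)
```

The label is looked up by the link target rather than the anchor text, so a link displayed as "university" still receives the type of the article it points to.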

Embed

Wikipedia Quality

Nothman, Joel; Curran, James R.; Murphy, Tara. (2008). "[[Transforming Wikipedia into Named Entity Training Data]]".

English Wikipedia

{{cite journal |last1=Nothman |first1=Joel |last2=Curran |first2=James R. |last3=Murphy |first3=Tara |title=Transforming Wikipedia into Named Entity Training Data |date=2008 |url=https://wikipediaquality.com/wiki/Transforming_Wikipedia_into_Named_Entity_Training_Data}}

HTML

Nothman, Joel; Curran, James R.; Murphy, Tara. (2008). &quot;<a href="https://wikipediaquality.com/wiki/Transforming_Wikipedia_into_Named_Entity_Training_Data">Transforming Wikipedia into Named Entity Training Data</a>&quot;.