Building Sense Tagged Corpus Using Wikipedia for Supervised Word Sense Disambiguation

Authors
Abdulgabbar Saif
Nazlia Omar
Ummi Zakiah Zainodin
Mohd Juziaddin Ab Aziz
Publication date
2018
DOI
10.1016/j.procs.2018.01.062

Building Sense Tagged Corpus Using Wikipedia for Supervised Word Sense Disambiguation is a scientific work related to Wikipedia quality, published in 2018 and written by Abdulgabbar Saif, Nazlia Omar, Ummi Zakiah Zainodin and Mohd Juziaddin Ab Aziz.

Overview

Abstract: Building sense-tagged data is a major challenge for supervised techniques, which have achieved promising results in word sense disambiguation. Building such data manually is laborious and time-consuming because linguistic experts must label each ambiguous word in every collected context. This paper therefore proposes a knowledge-based method for building an Arabic sense-tagged corpus from Wikipedia. The method starts by mapping Arabic WordNet to Wikipedia to select the Wikipedia article that corresponds to each WordNet sense. In this mapping step, a cross-lingual method measures the similarity between the features of a Wikipedia article and those of a WordNet sense. The incoming links of Wikipedia articles are then exploited to extract instances for the sense of each ambiguous word in WordNet. To handle the lack of instances for some Wikipedia articles, a multiword-based technique is proposed to increase the number of instances for each concept. Experimental results show that the cross-lingual method outperforms a monolingual method based on Arabic features only. The resulting sense-tagged corpus covers 50 ambiguous words, yielding 148 senses with 30,961 instances.
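
The two core steps described in the abstract (mapping a sense to an article, then harvesting instances from incoming links) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the features or the similarity measure, so plain bag-of-words cosine similarity is assumed here as a stand-in, the cross-lingual and multiword-based steps are omitted, and English toy data replaces Arabic WordNet and Arabic Wikipedia. All function names and sample values (`map_sense_to_article`, `extract_instances`, `ARTICLES`, `GLOSS`, `SENTENCES`) are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; an assumed stand-in for the paper's features."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two Counter-based feature vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def map_sense_to_article(sense_gloss, articles):
    """Return the title of the candidate article most similar to the gloss."""
    sense_vec = bag_of_words(sense_gloss)
    return max(articles, key=lambda t: cosine(sense_vec, bag_of_words(articles[t])))


# Matches wiki links of the form [[Title]] or [[Title|anchor text]].
WIKI_LINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def extract_instances(target_title, linking_sentences):
    """Keep sentences that contain an incoming link to the mapped article."""
    return [s for s in linking_sentences if target_title in WIKI_LINK.findall(s)]


# Toy data (hypothetical): two candidate articles for the ambiguous word "bank".
ARTICLES = {
    "Bank (finance)": "A bank is a financial institution that accepts deposits and makes loans",
    "Bank (geography)": "A bank is the land alongside a river or lake",
}
GLOSS = "a financial institution that accepts deposits of money"
SENTENCES = [
    "She deposited the cheque at the [[Bank (finance)|bank]] on Monday.",
    "They walked along the [[Bank (geography)|bank]] of the river.",
    "The [[Bank (finance)]] raised its interest rates.",
]

mapped_title = map_sense_to_article(GLOSS, ARTICLES)
instances = extract_instances(mapped_title, SENTENCES)
```

In this sketch each retained sentence acts as a sense-tagged instance: the link target identifies the sense, so no manual labeling is needed once the sense-to-article mapping is fixed.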

Embed

Wikipedia Quality

Saif, Abdulgabbar; Omar, Nazlia; Zainodin, Ummi Zakiah; Aziz, Mohd Juziaddin Ab. (2018). "[[Building Sense Tagged Corpus Using Wikipedia for Supervised Word Sense Disambiguation]]". Elsevier BV. DOI: 10.1016/j.procs.2018.01.062.

English Wikipedia

{{cite journal |last1=Saif |first1=Abdulgabbar |last2=Omar |first2=Nazlia |last3=Zainodin |first3=Ummi Zakiah |last4=Aziz |first4=Mohd Juziaddin Ab |title=Building Sense Tagged Corpus Using Wikipedia for Supervised Word Sense Disambiguation |date=2018 |doi=10.1016/j.procs.2018.01.062 |url=https://wikipediaquality.com/wiki/Building_Sense_Tagged_Corpus_Using_Wikipedia_for_Supervised_Word_Sense_Disambiguation |journal=Elsevier BV}}

HTML

Saif, Abdulgabbar; Omar, Nazlia; Zainodin, Ummi Zakiah; Aziz, Mohd Juziaddin Ab. (2018). &quot;<a href="https://wikipediaquality.com/wiki/Building_Sense_Tagged_Corpus_Using_Wikipedia_for_Supervised_Word_Sense_Disambiguation">Building Sense Tagged Corpus Using Wikipedia for Supervised Word Sense Disambiguation</a>&quot;. Elsevier BV. DOI: 10.1016/j.procs.2018.01.062.