Cross Script Hindi English Ner Corpus from Wikipedia

Revision as of 10:47, 12 December 2019 by Stella (talk | contribs) (Cats.)


Authors: Mohd Zeeshan Ansari, Tanvir Ahmad, Arshad Ali
Publication date: 2018
Links: Original Preprint

Cross Script Hindi English Ner Corpus from Wikipedia is a scientific work related to Wikipedia quality, published in 2018 and written by Mohd Zeeshan Ansari, Tanvir Ahmad and Arshad Ali.

Overview

Text generated on social media platforms is essentially mixed-lingual. Language mixing in any form creates considerable difficulty for language processing systems. Moreover, advances in language processing research depend on the availability of standard corpora. The development of mixed-lingual Indian Named Entity Recognition (NER) systems is facing obstacles due to the unavailability of standard evaluation corpora. Such corpora may be mixed-lingual in nature, with text written in multiple languages but predominantly in a single script. The motivation of the work is to emphasize the automatic generation of such corpora in order to encourage mixed-lingual Indian NER. The paper presents the preparation of a Cross Script Hindi-English Corpus from Wikipedia category pages. The corpus is annotated with the standard CoNLL-2003 categories PER, LOC, ORG, and MISC. It is evaluated with a variety of machine learning algorithms, and favorable results are reported.
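To make the annotation scheme concrete, the following is a minimal sketch of how text tagged with the CoNLL-2003 categories might be read and grouped into entities. The page does not describe the corpus file layout, so the two-column "token, tab, tag" format and the B-/I- prefixes are assumptions. The sample sentences and the function names `read_conll` and `extract_entities` are hypothetical and are not taken from the paper or its corpus.

```python
# Hypothetical sample, NOT from the corpus: Hindi written in Roman script mixed
# with English. Assumed layout: one "token<TAB>tag" pair per line, with a blank
# line between sentences, and B-/I- prefixes on the CoNLL-2003 categories.
SAMPLE = (
    "Sachin\tB-PER\n"
    "Tendulkar\tI-PER\n"
    "ka\tO\n"
    "janm\tO\n"
    "Mumbai\tB-LOC\n"
    "mein\tO\n"
    "hua\tO\n"
    "\n"
    "BCCI\tB-ORG\n"
    "ne\tO\n"
    "announcement\tO\n"
    "ki\tO\n"
)


def read_conll(text):
    """Split two-column text into sentences of (token, tag) pairs."""
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        token, tag = line.split("\t")
        current.append((token, tag))
    if current:
        sentences.append(current)
    return sentences


def extract_entities(sentence):
    """Group consecutive B-/I- tagged tokens into (text, category) pairs."""
    entities, tokens, label = [], [], None
    for token, tag in sentence:
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != label
        )
        if starts_new:
            # Close any open entity before starting a new one.
            if tokens:
                entities.append((" ".join(tokens), label))
            tokens, label = [token], tag[2:]
        elif tag.startswith("I-"):
            tokens.append(token)
        else:
            # An "O" tag ends the open entity, if any.
            if tokens:
                entities.append((" ".join(tokens), label))
            tokens, label = [], None
    if tokens:
        entities.append((" ".join(tokens), label))
    return entities


sentences = read_conll(SAMPLE)
for sentence in sentences:
    print(extract_entities(sentence))
```

Under these assumptions, each sentence yields a list of entity spans paired with one of the four categories, which is the form a downstream evaluation would typically compare against predicted spans.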

Embed

Wikipedia Quality

Ansari, Mohd Zeeshan; Ahmad, Tanvir; Ali, Arshad. (2018). "[[Cross Script Hindi English Ner Corpus from Wikipedia]]".

English Wikipedia

{{cite journal |last1=Ansari |first1=Mohd Zeeshan |last2=Ahmad |first2=Tanvir |last3=Ali |first3=Arshad |title=Cross Script Hindi English Ner Corpus from Wikipedia |date=2018 |url=https://wikipediaquality.com/wiki/Cross_Script_Hindi_English_Ner_Corpus_from_Wikipedia}}

HTML

Ansari, Mohd Zeeshan; Ahmad, Tanvir; Ali, Arshad. (2018). &quot;<a href="https://wikipediaquality.com/wiki/Cross_Script_Hindi_English_Ner_Corpus_from_Wikipedia">Cross Script Hindi English Ner Corpus from Wikipedia</a>&quot;.