Multiword Noun Compound Bracketing Using Wikipedia

Revision as of 17:11, 24 July 2019 by Madison (talk | contribs) (Multiword Noun Compound Bracketing Using Wikipedia - new page)

Multiword Noun Compound Bracketing Using Wikipedia is a scientific work related to Wikipedia quality, published in 2014 and written by Caroline Barri.


This research makes two contributions to the multiword noun compound bracketing problem: first, it demonstrates the usefulness of Wikipedia for the task, and second, it presents a novel bracketing method relying on a word association model. The association model is intended to represent combined evidence about the possibly lexical, relational or coordinate nature of the links between all pairs of words within a compound. Wikipedia is promoted for its encyclopedic nature, meaning that it describes terms and named entities, and for its size, which is large enough for corpus-based statistical analysis. Both types of information are used to measure evidence about lexical units, noun relations and noun coordinates, which feeds the association model in the bracketing algorithm. Using a gold standard of around 4800 multiword noun compounds, the authors report a performance of 73% in a strict match evaluation, which compares favourably with results reported in the literature for unsupervised approaches.
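
To make the idea of bracketing from pairwise word associations more concrete, here is a minimal Python sketch. It is not the authors' algorithm: it assumes a simple greedy strategy that repeatedly merges the two adjacent constituents with the strongest average association, and the names `bracket`, `strict_match_accuracy` and `toy_assoc`, as well as the scores themselves, are hypothetical and invented for illustration. In the work described above, the association evidence would instead come from Wikipedia.

```python
from itertools import product


def bracket(words, assoc):
    """Greedily bracket a noun compound into a nested tuple (binary tree).

    `assoc` maps a frozenset of two words to an association score.
    Assumption for this sketch: the link between two adjacent constituents
    is the average association over all cross-constituent word pairs.
    """
    # Each node is (subtree, list of the words it covers).
    nodes = [(w, [w]) for w in words]

    def link(left, right):
        pairs = list(product(left[1], right[1]))
        return sum(assoc.get(frozenset(p), 0.0) for p in pairs) / len(pairs)

    while len(nodes) > 1:
        # Pick the adjacent pair with the strongest link (leftmost on ties).
        i = max(range(len(nodes) - 1),
                key=lambda k: link(nodes[k], nodes[k + 1]))
        left, right = nodes[i], nodes[i + 1]
        nodes[i:i + 2] = [((left[0], right[0]), left[1] + right[1])]
    return nodes[0][0]


def strict_match_accuracy(predicted, gold):
    """Share of compounds whose whole predicted tree equals the gold tree."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Toy scores, invented for illustration only.
toy_assoc = {
    frozenset(("liver", "cell")): 0.9,
    frozenset(("cell", "antibody")): 0.4,
    frozenset(("liver", "antibody")): 0.2,
}

tree = bracket(["liver", "cell", "antibody"], toy_assoc)
print(tree)  # (('liver', 'cell'), 'antibody')
```

Under a strict match evaluation as sketched here, a compound counts as correct only if its complete bracketing equals the gold bracketing; partially correct trees earn no credit.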