Wikipedia based Semantic Metadata Annotation of Audio Transcripts
Authors | Giulio Paci, Giorgio Pedrazzi, Roberta Turra |
---|---|
Publication date | 2010 |
Links | Original |
Wikipedia based Semantic Metadata Annotation of Audio Transcripts is a scientific work related to Wikipedia quality, published in 2010 and written by Giulio Paci, Giorgio Pedrazzi and Roberta Turra.
Overview
A method for automatically annotating video items with semantic metadata is presented. The method was developed in the context of the Papyrus project to annotate documentary-like broadcast videos with a set of relevant keywords, using automatic speech recognition (ASR) transcripts as a primary complementary resource. The task is complicated by the high word error rate (WER) of ASR on this kind of video. For this reason, a novel relevance criterion based on domain information is proposed. Wikipedia is used both as a source of metadata and as a linguistic resource for disambiguating keywords and for eliminating off-topic or out-of-domain keywords. Documents are annotated with relevant links to Wikipedia pages, concept definitions, synonyms, translations and concept categories.
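The idea of filtering keywords by domain information can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the scoring rule (the share of a keyword's categories that fall inside the target domain), the threshold, the function names and the small in-memory category table are all assumptions made for illustration. In practice the categories would come from Wikipedia rather than a hard-coded dictionary.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical toy data standing in for Wikipedia page categories.
TOY_CATEGORIES: Dict[str, Set[str]] = {
    "volcano": {"Geology", "Natural hazards"},
    "magma": {"Geology", "Volcanology"},
    "banana": {"Fruit", "Agriculture"},
}


def domain_relevance(
    keyword: str,
    domain_categories: Set[str],
    categories: Dict[str, Set[str]] = TOY_CATEGORIES,
) -> float:
    """Return the fraction of the keyword's categories that lie in the domain.

    Keywords with no known page (for example ASR misrecognitions) score 0.0.
    """
    cats = categories.get(keyword, set())
    if not cats:
        return 0.0
    return len(cats & domain_categories) / len(cats)


def annotate(
    transcript_words: Iterable[str],
    domain_categories: Set[str],
    threshold: float = 0.5,
) -> List[str]:
    """Keep each distinct word whose domain relevance reaches the threshold."""
    kept: List[str] = []
    for word in transcript_words:
        word = word.lower()
        if word in kept:
            continue  # already annotated, skip the duplicate
        if domain_relevance(word, domain_categories) >= threshold:
            kept.append(word)
    return kept


if __name__ == "__main__":
    words = ["Volcano", "banana", "magma", "volcno", "volcano"]
    print(annotate(words, {"Geology", "Volcanology"}))
```

In this toy run, the misrecognised word "volcno" and the off-domain word "banana" are dropped, while "volcano" and "magma" are kept. This shows how a domain criterion can compensate for a noisy transcript.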
Embed
Wikipedia Quality
Paci, Giulio; Pedrazzi, Giorgio; Turra, Roberta. (2010). "[[Wikipedia based Semantic Metadata Annotation of Audio Transcripts]]".
English Wikipedia
{{cite journal |last1=Paci |first1=Giulio |last2=Pedrazzi |first2=Giorgio |last3=Turra |first3=Roberta |title=Wikipedia based Semantic Metadata Annotation of Audio Transcripts |date=2010 |url=https://wikipediaquality.com/wiki/Wikipedia_based_Semantic_Metadata_Annotation_of_Audio_Transcripts}}
HTML
Paci, Giulio; Pedrazzi, Giorgio; Turra, Roberta. (2010). "<a href="https://wikipediaquality.com/wiki/Wikipedia_based_Semantic_Metadata_Annotation_of_Audio_Transcripts">Wikipedia based Semantic Metadata Annotation of Audio Transcripts</a>".