Wikipedia based Semantic Metadata Annotation of Audio Transcripts


Authors: Giulio Paci, Giorgio Pedrazzi, Roberta Turra
Publication date: 2010
Links: Original

Wikipedia based Semantic Metadata Annotation of Audio Transcripts is a scientific work related to Wikipedia quality, published in 2010 and written by Giulio Paci, Giorgio Pedrazzi and Roberta Turra.

Overview

A method for automatically annotating video items with semantic metadata is presented. The method was developed in the context of the Papyrus project to annotate documentary-like broadcast videos with a set of relevant keywords, using automatic speech recognition (ASR) transcripts as a primary complementary resource. The task is complicated by the high word error rate (WER) of ASR on this kind of video. For this reason, a novel relevance criterion based on domain information is proposed. Wikipedia is used both as a source of metadata and as a linguistic resource for disambiguating keywords and for eliminating off-topic or out-of-domain keywords. Documents are annotated with relevant links to Wikipedia pages, concept definitions, synonyms, translations and concept categories.
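The idea of a domain-based relevance criterion can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the overview does not specify. It assumes a simple stand-in rule: a candidate keyword is kept only if it shares at least one Wikipedia-style category with enough of the other candidates. The category table, the function names and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy data: candidate keyword -> Wikipedia-style categories.
# A real system would look these up in Wikipedia; these values are made up.
TOY_CATEGORIES = {
    "glacier": {"Glaciology", "Climate"},
    "ice sheet": {"Glaciology", "Climate"},
    "sea level": {"Oceanography", "Climate"},
    "guitar": {"Musical instruments"},  # stands in for an ASR misrecognition
}


def domain_relevance(keywords, categories):
    """Score each keyword by the fraction of the other keywords
    with which it shares at least one category."""
    scores = {}
    for kw in keywords:
        others = [o for o in keywords if o != kw]
        if not others:
            scores[kw] = 0.0
            continue
        own = categories.get(kw, set())
        shared = sum(1 for o in others if own & categories.get(o, set()))
        scores[kw] = shared / len(others)
    return scores


def filter_keywords(keywords, categories, threshold=0.5):
    """Keep keywords whose score reaches the (assumed) threshold."""
    scores = domain_relevance(keywords, categories)
    return [kw for kw in keywords if scores[kw] >= threshold]


candidates = ["glacier", "ice sheet", "sea level", "guitar"]
kept = filter_keywords(candidates, TOY_CATEGORIES)
print(kept)
```

In this toy input, "guitar" shares no category with the other candidates and is dropped, while the three climate-related keywords support each other and are kept. A production system would likely need a richer notion of domain than direct category overlap.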

Embed

Wikipedia Quality

Paci, Giulio; Pedrazzi, Giorgio; Turra, Roberta. (2010). "[[Wikipedia based Semantic Metadata Annotation of Audio Transcripts]]".

English Wikipedia

{{cite journal |last1=Paci |first1=Giulio |last2=Pedrazzi |first2=Giorgio |last3=Turra |first3=Roberta |title=Wikipedia based Semantic Metadata Annotation of Audio Transcripts |date=2010 |url=https://wikipediaquality.com/wiki/Wikipedia_based_Semantic_Metadata_Annotation_of_Audio_Transcripts}}

HTML

Paci, Giulio; Pedrazzi, Giorgio; Turra, Roberta. (2010). &quot;<a href="https://wikipediaquality.com/wiki/Wikipedia_based_Semantic_Metadata_Annotation_of_Audio_Transcripts">Wikipedia based Semantic Metadata Annotation of Audio Transcripts</a>&quot;.