Text Saliency

To estimate word-level saliency scores in multimedia data, we first need to annotate or automatically recognize the spoken language information in the audio stream. In addition, the transcripts (whether produced automatically or manually) have to be time-aligned with the audio stream. In this work, we use the annotation available in the subtitles of commercial video streams, although the proposed approach can also be applied to the output of an automatic speech recognizer.
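As a rough illustration of what time-aligning a subtitle transcript could look like, the sketch below parses subtitle cues and assigns each word an approximate time interval. It rests on two assumptions that are not stated in the text: that the subtitles are in SubRip (.srt) format, and that words are spread uniformly across the duration of their cue. The uniform spreading is only a crude stand-in for a proper alignment method, and the function names `parse_srt` and `word_intervals` are hypothetical.

```python
import re
from typing import List, Tuple

# Assumption: SubRip (.srt) timing lines such as "00:00:01,000 --> 00:00:03,000".
TIME_RE = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)

Cue = Tuple[float, float, str]    # (start_sec, end_sec, text)
Word = Tuple[str, float, float]   # (token, start_sec, end_sec)


def parse_srt(text: str) -> List[Cue]:
    """Parse SubRip text into (start, end, text) cues, times in seconds."""
    cues: List[Cue] = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIME_RE.search(line)
            if match is None:
                continue
            h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
            start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
            end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
            # Everything after the timing line is the subtitle text.
            cue_text = " ".join(lines[i + 1:]).strip()
            if cue_text:
                cues.append((start, end, cue_text))
            break
    return cues


def word_intervals(cues: List[Cue]) -> List[Word]:
    """Spread the words of each cue uniformly over the cue's duration.

    Illustrative approximation only, not the alignment used in the work.
    """
    words: List[Word] = []
    for start, end, text in cues:
        tokens = text.split()
        if not tokens:
            continue
        step = (end - start) / len(tokens)
        for k, token in enumerate(tokens):
            words.append((token, start + k * step, start + (k + 1) * step))
    return words
```

Calling `word_intervals(parse_srt(srt_text))` on the contents of a subtitle file yields one `(token, start, end)` tuple per word, which is the kind of word-level timing that a saliency score could then be attached to. Output from an automatic speech recognizer that already provides word timestamps would skip the interpolation step entirely.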

Processing steps:
