
ICASSP-2017 Tutorial: Multimodal Signal Processing, Saliency and Summarization

Presenters: Petros Maragos, Alexandros Potamianos, Athanasia Zlatintsi, Petros Koutras


The goal of the ICASSP-2017 tutorial is to provide a concise overview of the computational aspects of human attention as applied to multimodal signal processing and to the detection of salient events in multimodal (i.e., audio-visual-text) information streams, such as videos with audio and text. It will present state-of-the-art work in multimodal signal processing, audio-visual saliency models, related audio processing and computer vision algorithms, semantic saliency computation for text, multimodal fusion, technological applications such as audio and movie summarization, and outstanding research frontiers in this area. Application areas of saliency computation include audio-visual event detection, video abstraction and summarization, image/video retrieval, scene analysis, action recognition, object recognition, and perception-based video processing. The tutorial will also present specific state-of-the-art algorithms: a unified energy-based audio-visual framework for frontend processing, a method for text saliency computation, the detection of perceptually salient events in videos, and a movie summarization system for the automatic production of summaries. Finally, a state-of-the-art multimodal video database, COGNIMUSE, will be presented. The database is annotated with sensory and semantic saliency, events, cross-media semantics, and emotion, and can be used for training and evaluating event detection and summarization algorithms, for classifying and recognizing audio-visual and cross-media events, and for emotion tracking.


13:30 - 14:15 Part 1: Multimodal Signal Processing, Audio-Visual Perception & Fusion (Petros Maragos)
14:15 - 15:00 Part 2: Visual Processing & Saliency (Petros Koutras)
15:30 - 16:00 Part 3: Audio Processing & Saliency (Athanasia Zlatintsi)
16:00 - 16:30 Part 4: Text Processing & Saliency (Alexandros Potamianos)
16:30 - 17:00 Part 5: Video Summarization (All)




If you use any of this material, please cite the tutorial as:

Petros Maragos, Alexandros Potamianos, Athanasia Zlatintsi and Petros Koutras, Multimodal Signal Processing, Saliency and Summarization, Tutorial presented at the 42nd IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), New Orleans, LA, USA, 2017.