SalAR: Attentive recurrent network for dynamic visual saliency

Date: 2019-12-19
Author: Asif, Muhammad Arsal
Abstract
In this thesis, we present a visual saliency model for predicting the locations of human eye fixations in videos, as recorded in eye-tracking datasets. While most visual saliency models are designed for image saliency, the development of models for video saliency is a topic of ongoing interest. The current state-of-the-art models in video saliency include a dynamic component to learn temporal features, whereas we leverage ideas from other domains of computer vision to build a static network architecture. Since gaze tends to fixate on objects, and object representations are encoded in image classification networks, we design our model by extending the VGG-16 network. We present a comprehensive quantitative and qualitative analysis on three large-scale video saliency datasets, which shows that the proposed model outperforms the current state-of-the-art models. Furthermore, we examine why existing saliency metrics are inadequate for comparing the top-performing video saliency models, and we highlight discrepancies in the available video saliency datasets.
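A rough PyTorch sketch of the idea the abstract describes (reading a saliency map out of VGG-16 features, then scoring it against recorded fixations) is given below. This is not the thesis code: the class name `SaliencyReadout`, the 1x1-convolution head, and the sigmoid output are illustrative assumptions. NSS (Normalized Scanpath Saliency) is one of the standard fixation metrics of the kind the abstract refers to.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16

class SaliencyReadout(nn.Module):
    """Illustrative sketch: VGG-16 convolutional trunk with a
    1x1-conv saliency head (not the thesis architecture)."""
    def __init__(self):
        super().__init__()
        # Convolutional features of VGG-16; in practice one would
        # load ImageNet-pretrained weights here.
        self.encoder = vgg16().features
        self.head = nn.Conv2d(512, 1, kernel_size=1)

    def forward(self, x):
        h = self.encoder(x)          # (B, 512, H/32, W/32)
        s = self.head(h)             # raw saliency logits
        # Upsample to the input resolution and squash to [0, 1].
        s = F.interpolate(s, size=x.shape[-2:], mode="bilinear",
                          align_corners=False)
        return torch.sigmoid(s)

def nss(saliency, fixations):
    """Normalized Scanpath Saliency: the mean of the z-scored
    saliency map at fixated pixels (`fixations` is a binary map of
    recorded gaze locations)."""
    z = (saliency - saliency.mean()) / (saliency.std() + 1e-8)
    return z[fixations > 0].mean()
```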
Keywords: Computer vision, Deep learning, Computer science, Visual saliency, Saliency, Video saliency, Image saliency, Neural networks