SalAR: Attentive recurrent network for dynamic visual saliency

Date: 2019-12-19
Author: Asif, Muhammad Arsal
Abstract
In this thesis, we present a visual saliency model for predicting the locations of human eye fixations in videos, as recorded in eye-tracking datasets. While most visual saliency models are designed for image saliency, the development of models for video saliency is a topic of ongoing interest. The current state-of-the-art models in video saliency include a dynamic component to learn temporal features, whereas we leverage ideas from other domains of computer vision to build a static network architecture. Since gaze tends to fixate on objects, and object representations are encoded in image classification networks, we design our model by extending the VGG-16 network. We present a comprehensive quantitative and qualitative analysis on three large-scale video saliency datasets, which shows that the proposed model outperforms the current state-of-the-art models. Furthermore, we examine why existing saliency metrics are inadequate for comparing the top-performing video saliency models, and we highlight discrepancies in the available video saliency datasets.
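A rough PyTorch sketch of the idea the abstract describes (reading a saliency map out of VGG-16 features, then scoring it against recorded fixations) is given below. This is not the thesis code: the class name `SaliencyReadout`, the 1x1-convolution head, and the sigmoid output are illustrative assumptions. NSS (Normalized Scanpath Saliency) is one of the standard fixation metrics of the kind the abstract refers to.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16

class SaliencyReadout(nn.Module):
    """Illustrative sketch: VGG-16 convolutional trunk with a
    1x1-conv saliency head (not the thesis architecture)."""
    def __init__(self):
        super().__init__()
        # Convolutional features of VGG-16; in practice one would
        # load ImageNet-pretrained weights here.
        self.encoder = vgg16().features
        self.head = nn.Conv2d(512, 1, kernel_size=1)

    def forward(self, x):
        h = self.encoder(x)          # (B, 512, H/32, W/32)
        s = self.head(h)             # raw saliency logits
        # Upsample to the input resolution and squash to [0, 1].
        s = F.interpolate(s, size=x.shape[-2:], mode="bilinear",
                          align_corners=False)
        return torch.sigmoid(s)

def nss(saliency, fixations):
    """Normalized Scanpath Saliency: the mean of the z-scored
    saliency map at fixated pixels (`fixations` is a binary map of
    recorded gaze locations)."""
    z = (saliency - saliency.mean()) / (saliency.std() + 1e-8)
    return z[fixations > 0].mean()
```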
Keywords: Computer vision, Deep learning, Computer science, Visual saliency, Saliency, Video saliency, Image saliency, Neural networks