Recurrent Neural Network for Learning Spatial and Temporal Information from Videos

dc.contributor.author: Nabavi, Seyed shahabeddin
dc.contributor.examiningcommittee: Hu, Pingzhao (BMG/Computer Science); Ashraf, Ahmed (ECE) [en_US]
dc.contributor.supervisor: Wang, Yang (Computer Science) [en_US]
dc.date.accessioned: 2019-07-19T13:40:35Z
dc.date.available: 2019-07-19T13:40:35Z
dc.date.issued: 2019 [en_US]
dc.date.submitted: 2019-06-22T21:58:41Z [en]
dc.degree.discipline: Computer Science [en_US]
dc.degree.level: Master of Science (M.Sc.) [en_US]
dc.description.abstract: The recurrent neural network is a well-established tool for sequential modelling. It encompasses a variety of techniques and models for extracting temporal information from a sequence of data (e.g. frames of a video sequence). This thesis presents novel end-to-end, recurrent-based deep learning architectures for two computer vision problems: semantic segmentation prediction and camera pose estimation. First, we investigate the problem of extracting temporal information in the context of semantic segmentation prediction. We demonstrate the capability of recurrent architectures in feature prediction by presenting a novel encoder-decoder convolutional LSTM architecture. We also utilize a bidirectional convolutional LSTM as an extension of our work. Furthermore, we explore step-by-step extraction of spatial information in the problem of monocular camera pose estimation, with an end-to-end unsupervised training scheme that relies on a recurrent-based pose estimator. We illustrate the contribution of recurrent estimation (a.k.a. step-by-step estimation) to the estimation of large displacements and complex transformations. We also show the impact of this process on monocular depth estimation. [en_US]
dc.description.note: October 2019 [en_US]
dc.identifier.citation: Nabavi, Seyed shahabeddin; Rochan, Mrigank; Wang, Yang. (2018). Future Semantic Segmentation with Convolutional LSTM. British Machine Vision Conference. [en_US]
dc.identifier.uri: http://hdl.handle.net/1993/34039
dc.language.iso: eng [en_US]
dc.rights: open access [en_US]
dc.subject: Future semantic segmentation [en_US]
dc.subject: Recurrent neural network [en_US]
dc.subject: Unsupervised camera pose estimation [en_US]
dc.subject: Spatial information [en_US]
dc.subject: Deep learning [en_US]
dc.subject: Computer vision [en_US]
dc.subject: temporal information [en_US]
dc.subject: Video prediction [en_US]
dc.title: Recurrent Neural Network for Learning Spatial and Temporal Information from Videos [en_US]
dc.type: master thesis [en_US]
Files
Original bundle
Name: Nabavi_Seyed shahabeddin.pdf
Size: 12.92 MB
Format: Adobe Portable Document Format
Description: Master's Thesis
License bundle
Name: license.txt
Size: 2.2 KB
Format: Item-specific license agreed to upon submission