Gated recurrent networks for scene parsing

Karim, Rezaul
In this thesis, we consider the problem of feedback routing and gating mechanisms in deep neural networks for dense pixel labeling tasks, including scene parsing and semantic segmentation. The goal of semantic segmentation is to label every pixel in an image or video frame according to a specific set of object classes, while scene parsing involves labeling both objects (e.g. person, car) and stuff (e.g. sky, road, field). Semantic segmentation and scene parsing have a wide variety of practical applications, including robot navigation and autonomous vehicles. Recently, there has been great progress with deep convolutional neural network-based solutions for scene parsing and semantic segmentation, achieved by increasing the depth and architectural complexity of the networks. Current successful feedforward architectures lack recurrent feedback connections that allow for information routing and dynamics, a phenomenon that is ubiquitous in the human brain. Such networks appear to be approaching a limit on inference performance, possibly because their inference involves only a single feedforward pass. Motivated by the dynamics of feedforward and recurrent processing in the brain, we propose a recurrent feedback gating mechanism that makes strong inference possible in an iterative manner. Our initially proposed Recurrent Iterative Gating Networks (RIGNet) reveal the powerful capability of feedback to improve the inference capability of almost any network. Based on this observation, we later propose Distributed Iterative Gating Networks (DIGNet), which can be considered a canonical feedback routing mechanism with appropriate gating modules, capable of boosting inference capabilities to an even greater extent than RIGNet. Experimental results on several benchmark datasets demonstrate the effectiveness of feedback gating in deep neural networks for scene parsing and the superiority of the proposed feedback gating mechanism.
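As a rough illustration of the general idea of iterative feedback gating, the following is a minimal sketch under simplifying assumptions. It is not the RIGNet or DIGNet architecture: the toy one-layer "network", the sigmoid gate, and all names (`feedforward`, `iterative_gating`, `feedback_weight`) are hypothetical and chosen only to show how the output of one pass can gate the input of the next.

```python
import math


def sigmoid(z):
    """Squash a value into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def feedforward(features, weight):
    """Toy stand-in for a feedforward network: a scaled ReLU per position."""
    return [max(0.0, weight * f) for f in features]


def iterative_gating(features, weight=1.0, feedback_weight=1.0, steps=3):
    """Run one ungated pass, then `steps` passes with feedback gating.

    Assumption for illustration: the gate is a sigmoid of the previous
    output, applied multiplicatively to the original input features.
    Returns the list of outputs, one per pass (length steps + 1).
    """
    output = feedforward(features, weight)  # initial single feedforward pass
    history = [output]
    for _ in range(steps):
        # Feedback: the previous output decides how much of each input passes.
        gates = [sigmoid(feedback_weight * o) for o in output]
        gated_input = [f * g for f, g in zip(features, gates)]
        output = feedforward(gated_input, weight)
        history.append(output)
    return history


if __name__ == "__main__":
    for step, out in enumerate(iterative_gating([2.0, -1.0, 0.5])):
        print(step, [round(v, 4) for v in out])
```

In this sketch each additional pass reuses the same feedforward weights, so the only thing that changes between iterations is the gating signal fed back from the previous output.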
Computer vision, Semantic segmentation, Scene parsing, CNN, Recurrent gating, Iterative gating, Feedback networks, Pascal VOC 2012, COCO-Stuff, ADE20K
Karim, Rezaul, Md Amirul Islam, and Neil DB Bruce. "Recurrent Iterative Gating Networks for Semantic Segmentation." In 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1070-1079. IEEE, 2019.