Object localization in weakly labeled images and videos

2014-05, 2014-12, 2015-06
Rochan, Mrigank
Springer International Publishing
We consider the problem of localizing objects in weakly labeled images and videos. An image or video (e.g., a Flickr image or a YouTube video) is weakly labeled if it is associated with a tag describing the main object it contains. The label is weak because the tag indicates only the presence or absence of the object and gives no information about its spatial location. Given an image or video with an object tag, our goal is to localize the tagged object in it. In this thesis, we propose two novel techniques for this challenging problem. First, we build a video-specific object appearance model and then incorporate temporal consistency information to localize the object. Second, we use existing detectors for other object classes (which we call "familiar objects") to build the appearance model of an unseen object class (i.e., the object of interest). Experimental results show the effectiveness of the proposed methods.
Computer Vision, Object Localization
Rochan, Mrigank, et al. "Segmenting objects in weakly labeled videos." Conference on Computer and Robot Vision (CRV), 2014. IEEE, 2014.
Rochan, Mrigank, and Yang Wang. "Efficient object localization and segmentation in weakly labeled videos." International Symposium on Visual Computing. Springer International Publishing, 2014.
Rochan, Mrigank, and Yang Wang. "Weakly supervised localization of novel objects using appearance transfer." 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2015.