Visual Representations for Navigation and Object Discovery
Deliberate navigation in previously unseen environments and detection of novel object instances are key capabilities of intelligent agents engaged in fetch and delivery tasks. While data-driven deep learning approaches have fueled rapid progress in object category recognition and semantic segmentation by exploiting large amounts of labelled data, extending this learning paradigm to robotic settings comes with challenges.
To overcome the need for large amounts of labelled data for training object instance detectors, we use active self-supervision provided by a robot traversing an environment. Knowledge of ego-motion enables the agent to effectively associate multiple object hypotheses, which serve as training data for learning novel object embeddings from unlabelled data. Object detectors trained in this manner achieve higher mean average precision (mAP) than off-the-shelf detectors trained on this limited data.
I will describe an approach to semantic target-driven navigation, which entails finding a way through a complex environment to a target object. The proposed approach learns navigation policies on top of representations that capture spatial layout and semantic contextual cues. This choice of representation exploits models trained on large standard vision datasets and enables better generalization, as well as the joint use of simulated environments and real images for effective training of navigation policies.
Jana Kosecka is a Professor in the Department of Computer Science, George Mason University. She obtained her PhD in Computer Science from the University of Pennsylvania. Following her PhD, she was a postdoctoral fellow in the EECS Department at the University of California, Berkeley. She is a recipient of the David Marr Prize and the National Science Foundation CAREER Award. Jana is a chair of the IEEE Technical Committee of Robot Perception, an Associate Editor of IEEE Robotics and Automation Letters and the International Journal of Computer Vision, and a former editor of IEEE Transactions on Pattern Analysis and Machine Intelligence. She has held visiting positions at Stanford University, Google and Nokia Research. She is a co-author of a monograph titled Invitation to 3D Vision: From Images to Geometric Models. Her general research interests are in Computer Vision and Robotics. In particular, she is interested in 'seeing' systems engaged in autonomous tasks, the acquisition of static and dynamic models of environments by means of visual sensing, and human-computer interaction.