I am a first-year PhD student at the Robotics Institute at CMU, advised by Prof. Zico Kolter and Prof. Zac Manchester. I am broadly interested in machine learning problems faced by robotic systems, and I like to approach them through the more foundational lens of optimization and graphical models, sometimes taking inspiration from neuroscience and cognitive psychology. I previously completed my Master's in Robotics at the Robotics Institute, where I was advised by Prof. Katia Sycara. My publications and descriptions of selected projects are available below and on my Google Scholar page. I also recently interned at Nuro, a self-driving car startup, where I worked on developing a differentiable planning pipeline.

Education

August 2020 - Present Ph.D. in Robotics Research (0.00/0.00)
Carnegie Mellon University
Aug 2017 - Dec 2019 M.S. in Robotics Research (4.09/4.33)
Carnegie Mellon University
July 2013 - May 2017 B.S. in Electrical Engineering (8.99/10.00)
IIT-BHU Varanasi

Publications and Selected Projects

Google Scholar

MAME : Model Agnostic Meta Exploration
Swaminathan Gurumurthy, Sumit Kumar, Katia Sycara
CoRL 2019
[1] [pdf] [code]
We propose to explicitly model a separate exploration policy for the task distribution in Meta-RL given the requirements on sample efficiency. Having two different policies gives more flexibility during training and makes adaptation to any specific task easier. We show that using self-supervised or supervised learning objectives for adaptation stabilizes the training process and improves performance.
Community Regularization of Visually-Grounded Dialog
Akshat Agarwal*, Swaminathan Gurumurthy*, Vasu Sharma*, Katia Sycara, Michael Lewis
AAMAS 2019 [Oral talk]
[2] [pdf] [code]
We aim to train two agents on the visual dialogue dataset, where one agent is given access to an image and the other is tasked with guessing the contents of the image by establishing a dialogue with the first agent. The two agents are initially trained using supervision, followed by REINFORCE. To combat the resulting drift from natural language when training with REINFORCE, we introduce a community regularization scheme that trains a population of agents.
3D Point Cloud Completion using Latent Optimization in GANs
Shubham Agarwal*, Swaminathan Gurumurthy*
WACV 2019
[3] [pdf]
We address a fundamental problem with neural-network-based point cloud completion methods, which reconstruct the entire structure rather than preserving the points already provided as input. These methods struggle when tested on unseen deformities. We address this problem by introducing a GAN-based latent optimization procedure that performs output-constrained optimization using the regions provided in the input.
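As a rough illustration of the latent optimization idea (not the paper's implementation), the sketch below searches for a latent vector whose generated output agrees with the observed part of the input, then reads the missing part off the generator. The linear `generator`, the `weights` matrix, the `observed` index list and all hyperparameters are hypothetical stand-ins for a pretrained GAN generator and a partial point cloud.

```python
import random

def generator(z, weights):
    """Toy linear 'generator' (an assumption): maps a latent vector to output coordinates."""
    return [sum(w * zj for w, zj in zip(row, z)) for row in weights]

def optimize_latent(weights, target, observed, steps=500, lr=0.05, seed=0):
    """Gradient descent on z so that the generator output matches the observed entries only."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(len(weights[0]))]
    for _ in range(steps):
        out = generator(z, weights)
        grad = [0.0] * len(z)
        for i in observed:  # the loss ignores unobserved (missing) outputs
            err = 2.0 * (out[i] - target[i])
            for j in range(len(z)):
                grad[j] += err * weights[i][j]
        z = [zj - lr * gj for zj, gj in zip(z, grad)]
    return z

# Hypothetical setup: 4 output coordinates, of which only the first two are observed.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
target = generator([1.0, -2.0], weights)   # full shape, used here only to build the input
observed = [0, 1]
z_hat = optimize_latent(weights, target, observed)
completed = generator(z_hat, weights)       # observed entries preserved, missing ones filled in
```

In this toy case the two observed outputs determine the latent vector uniquely, so the completed output also matches the unobserved coordinates. With a real generator the objective is non-convex and would typically need additional terms or a good initialization.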
Exploiting Data and Human Knowledge for Predicting Wildlife Poaching
Swaminathan Gurumurthy, Lantao Yu, Chenyan Zhang, Yongchao Jin, Weiping Li, Haidong Zhang, Fei Fang
COMPASS 2019 [Oral talk]
[4] [pdf] [code]
Using past data on traps and snares found in a wildlife sanctuary, we predict the regions with a high probability of traps/snares to guide rangers to patrol those regions. We use novel frameworks for incorporating expert domain knowledge into the dynamic sampling of data points in order to tackle the imbalance in the data. We further use these predicted regions to produce optimal patrol routes for the rangers. This has now been deployed in a conservation area in China.
DeLiGAN: GANs for Diverse and Limited Data
Swaminathan Gurumurthy*, Ravi Kiran S.* and R. Venkatesh Babu
CVPR 2017
[5] [pdf] [code]
We explore the idea of finding high-probability regions in the latent space of GANs by learning a latent space representation as a learnable mixture of Gaussians. This enables the GAN to model a multimodal distribution and stabilizes training, as observed visually and through the intra-class variance measured using a modified inception score. Our modification is especially useful when the dataset is very small and diverse.
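A minimal sketch of sampling from a mixture-of-Gaussians latent space, assuming uniform mixture weights and a diagonal covariance per component. The log-sigma parameterization, the function name and the example values are assumptions made for illustration; in training, `means` and `log_sigmas` would be learnable parameters updated together with the generator.

```python
import math
import random

def sample_mixture_latent(means, log_sigmas, rng):
    """Draw z = mu_k + sigma_k * eps, with k chosen uniformly and eps ~ N(0, I)."""
    k = rng.randrange(len(means))
    z = [mu + math.exp(ls) * rng.gauss(0.0, 1.0)
         for mu, ls in zip(means[k], log_sigmas[k])]
    return k, z

# Hypothetical 2-D latent space with three components.
means = [[-2.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
log_sigmas = [[-1.0, -1.0] for _ in means]
rng = random.Random(0)
samples = [sample_mixture_latent(means, log_sigmas, rng) for _ in range(1000)]
```

Writing the sample as a deterministic function of the component parameters and independent noise is what would let gradients reach the means and scales in an autodiff framework.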
Query Efficient Black Box Attacks in Neural Networks
Swaminathan Gurumurthy, Fei Fang and Martial Hebert
[6] [pdf]
We test various methods to increase the sample efficiency of adversarial black-box attacks on neural networks. In one of the methods, we analyze the transferability of gradients and find that it has two components: a network-specific component and a task-specific component. The task-specific component corresponds to the properties of adversarial examples that transfer between architectures. Hence, we attempted to isolate this component and enhance the transfer properties. We then perform multiple queries on the black-box network to obtain the network-specific component using evolution strategies (ES).
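For context, a plain ES gradient estimate from black-box queries can be sketched as below. This is the generic antithetic-sampling estimator, not the query-efficient method from the project; the function name, `sigma`, `num_pairs` and the linear test function are assumptions for illustration.

```python
import random

def es_gradient(f, x, sigma=0.1, num_pairs=2000, seed=0):
    """Estimate the gradient of f at x using only function queries (2 per sampled direction)."""
    rng = random.Random(seed)
    grad = [0.0] * len(x)
    for _ in range(num_pairs):
        eps = [rng.gauss(0.0, 1.0) for _ in x]
        f_plus = f([xi + sigma * e for xi, e in zip(x, eps)])
        f_minus = f([xi - sigma * e for xi, e in zip(x, eps)])
        # Finite difference along eps, averaged over all sampled directions.
        scale = (f_plus - f_minus) / (2.0 * sigma * num_pairs)
        grad = [g + scale * e for g, e in zip(grad, eps)]
    return grad

def black_box(x):
    """Hypothetical stand-in for a network loss that can only be queried."""
    return 3.0 * x[0] - 2.0 * x[1]

estimate = es_gradient(black_box, [0.5, -1.0])  # true gradient is [3, -2]
```

Each direction costs two queries, so the query budget grows quickly with the number of samples; reducing that cost is the motivation for combining the estimate with a transferable prior.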
Visual SLAM based SfM for Boreholes
Swaminathan Gurumurthy, Tat-Jun Chin and Ian Reid
[7] [code]
Built a package to construct a sparse map and camera trajectory using SIFT features, fine-tuned using bundle adjustment and loop closure. It was tailored for boreholes and underground scenes with forward motion, where most current state-of-the-art approaches such as LSD-SLAM, ORB-SLAM and SVO struggled at both localization and mapping.
Off-on policy learning
Swaminathan Gurumurthy, Bhairav Mehta, Anirudh Goyal
[8]
On-policy methods are known to exhibit stable behavior, and off-policy methods are known to be sample efficient. The goal here was to get the best of both worlds. We first developed a self-imitation-based method to learn from a diverse set of exploratory policies that perform coordinated exploration. We also tried a meta-learning objective to ensure that the off-policy updates to the policies are aligned with future on-policy updates. This leads to more stable training but fails to reach peak performance in most of the continuous control tasks we tested on.
Exploring interpretability in Atari Games for RL policies using Counterfactuals
Swaminathan Gurumurthy, Akshat Agarwal, Prof. Katia Sycara
[9]
We aimed to understand what RL agents learn in simple games such as those in Atari. We developed a GAN-based method to find counterfactuals for the policies, i.e., small perturbations in the scene that change the agent's action, and use these to interpret agent behavior. The GAN is used here to avoid adversarial examples and produce semantically meaningful perturbations.

Last updated on 2020-06-13