Ansh Jain
I am currently working at Amazon as an Applied Scientist. My team is responsible for predicting, identifying and mitigating risks that a product or service might face when launched.
Prior to Amazon, I graduated from the University of Wisconsin-Madison with a Master's in Computer Science. During my time there, I also worked with Prof. Mohit Gupta on efficient
video understanding. My courses included:
- Fall 2021 - CS 760 Machine Learning, CS 532 Matrix Methods in Machine Learning, CS 764 Topics in DBMS
- Spring 2022 - CS 769 Advanced NLP, CS 839 Deep Learning for Visual Recognition
- Fall 2022 - CS 561 Probability and Information Theory in ML, CS 537 Operating System, CS 759 High Performance Computing
I completed my Bachelor's degree in Information Technology at the Netaji Subhas Institute of Technology, Delhi University (now Netaji Subhas University of Technology). Afterward, I worked at the Samsung R&D Institute India, Bangalore (SRIB) for two years as a software developer.
My work focused on on-device video analytics, such as video classification, for incorporation into Samsung's video editor engine. I also contributed to features such as 360 video editing, video summarization, Single Take, and video filters.
Previously, my research has mainly been in Machine Learning, Computer Vision, and NLP. I worked with Prof. Vinay P. Namboodiri in the Vision and Language lab at IIT Kanpur on attention, multi-modal question answering, and machine translation.
These experiences motivated me to continue working in the field and to explore more topics in depth as a graduate student at UW-Madison.
My enthusiasm for learning is unceasing, and I hope to continue creating an impact in Computer Science.
Email  / 
Resume  / 
Google Scholar  / 
Github
Applied Scientist | Amazon (Amazon.com)
Jan'23 - present
Working on a project to improve the sustainability of features, services, and products across Amazon. The goal is to develop CV algorithms for understanding architecture diagrams and recommending better architectural alternatives.
Applied Science Intern | Amazon (Amazon Lab126)
May '22 - Aug'22
As part of the Halo Health CV team, I worked on 3D shape and pose estimation from 2D images. The research project involved implementing
techniques in PyTorch for accurate prediction of 3D shape and pose from unconstrained 2D image input. I worked with state-of-the-art body models such as SMPL and SMPL-X and
gained experience with 3D geometry concepts such as human meshes, perspective projection, and camera intrinsics.
Senior Software Engineer | Samsung RnD Institute Bangalore
June '19 - July'21
At Samsung, I worked on the Video Editor team and was fortunate to contribute to the application. I spent most of my time there
on the Video Classification project, where I gained experience with PyTorch, TensorFlow, Keras, and video processing. I also worked with
state-of-the-art models such as MobileNets and Inception networks. The rest of my time went to developing video editing features such as
360 video editing, tone filters, and video highlight creation. In addition, I was the main contributor from our team to the Single Take
feature, which was the USP of the Samsung S21 and was presented at the launch event.
Researcher | Indian Institute of Technology Kanpur
December '18 - March'19, Jan'20 - June'20
I worked full-time for the first two months and remotely part-time thereafter with the Computer Vision and Language lab.
The project started as an attempt to improve the state of the art in Visual Question Answering (VQA) but later evolved
into a general approach for improving attention modules in deep learning networks, thereby increasing their accuracy. The research was published
at WACV, one of the top conferences in computer vision.
Software Development Intern | Samsung RnD Institute Bangalore
May '18 - July'18
The two-month internship was challenging and helped me understand how the industry works.
During my project, I built a GUI with a backend for Lightweight Machine-to-Machine (LWM2M) device management
of IoT devices communicating over the Constrained Application Protocol (CoAP), using the Java Swing framework in Eclipse. In addition, I was among
the top 3% of interns to clear the advanced-level coding test on data structures and algorithms on the first attempt.
Research and Publications
Self Supervision for Attention Networks
Ansh Jain*,
Badri N Patro*,
Kasturi GS*,
Vinay P Namboodiri
(*equal contribution), Winter Conference on Applications of Computer Vision (WACV) 2021
In this paper, we propose a novel method to improve the attention mechanism by inducing “self-supervision”. The method
generalizes to any deep learning model that uses an attention module and can be extended to different input modalities such as text and images.
[Code] | [Initial Research]
SCALe: Supervised Contrastive approach for Active Learning
Hardik Chauhan, Ansh Jain, Kaushal Rai, Ritu Raut
CS 769: Advanced NLP
Active Learning approaches in NLP use Masked Language Modeling for tasks such as text classification. In this work, we show that this
approach has many limitations and that a supervised contrastive training procedure provides much more intuitive results. We successfully achieve an improvement
over SOTA with only a fraction of the training data using the proposed technique.
[Code]
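To make the idea of supervised contrastive training concrete, here is a minimal sketch of one common formulation of a supervised contrastive loss: embeddings that share a label are pulled together and all others are pushed apart. Whether the project used this exact form is an assumption, and the names `supcon_loss` and `temperature` are illustrative rather than taken from the project code.

```python
import math


def normalize(vector):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss averaged over anchors that have a positive.

    Illustrative sketch only: function name, argument names, and the default
    temperature are assumptions, not the project's actual implementation.
    """
    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        # Positives: other samples that share the anchor's label.
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Scaled similarity of the anchor to every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        log_denominator = math.log(sum(math.exp(s) for s in sims.values()))
        # Mean negative log-probability of picking a positive for this anchor.
        total -= sum(sims[p] - log_denominator for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0
```

With embeddings that form two tight clusters, the loss is close to zero when the labels match the clusters and much larger when they do not.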
Detection and Classification of Radio Frequency Jamming Attacks using Machine learning
Kasturi GS*,
Ansh Jain*,
Jagdeep Singh*
(*equal contribution), JoWUA, Vol. 11, No. 4, 2020
In this project, we explore the detection and classification of Radio Frequency jamming attacks in wireless ad-hoc networks. This is necessary
to take appropriate countermeasures against different types of attacks and prevent them.
[Code]
Paraphrase Question Generation Through Graph Convolution Network
Ansh Jain,
Kasturi GS,
Badri N Patro
In this project, we propose an encoder-decoder model to create a better sentence-level embedding and evaluate the model on the task of Paraphrase Question
generation. Semantics are captured by a pairwise decoder that forces the encodings of similar sentences to be close to each other. Further, the syntax
of sentences is captured by stacking a GCN over the LSTM states.
[Code]
Event Neural Network implementation
Ansh Jain
CS 799: Master's Research
This work compares a traditional neural network with an event-based one to
evaluate the practical computational savings. The aim is to evaluate how the actual savings measure up against the theoretical
claims of the paper. The code is written from scratch in C++ (for VGG16) using basic data structures, as the concepts of the paper
cannot be tested with standard deep learning frameworks and libraries.
[Code]
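The kind of saving being measured can be illustrated with a small sketch of an event-based linear layer: it keeps its previous output and only does work for inputs that changed since the last frame. This is a simplified Python illustration under my own assumptions (a single layer without bias, a fixed change threshold, hypothetical names `EventLinear` and `dense_forward`); it is not the C++ VGG16 implementation.

```python
class EventLinear:
    """Linear layer that updates its output only from inputs that changed.

    Hypothetical sketch: the class name, the thresholding rule, and the
    multiply counter are assumptions made for illustration.
    """

    def __init__(self, weights, threshold=0.0):
        self.weights = weights              # weights[i][j]: output i, input j
        self.threshold = threshold          # minimum change that counts as an event
        n_out, n_in = len(weights), len(weights[0])
        self.last_input = [0.0] * n_in      # input values as last propagated
        self.output = [0.0] * n_out         # accumulated output state
        self.mults = 0                      # multiplies performed so far

    def forward(self, x):
        for j, value in enumerate(x):
            delta = value - self.last_input[j]
            if abs(delta) <= self.threshold:
                continue                    # no event: skip this input column
            self.last_input[j] = value
            for i, row in enumerate(self.weights):
                self.output[i] += row[j] * delta
                self.mults += 1
        return list(self.output)


def dense_forward(weights, x):
    """Conventional layer: recomputes every product for every frame."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]
```

With a threshold of zero the event-based output matches the dense output, while the multiply count grows only with the number of changed inputs. A positive threshold trades accuracy for further savings.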
Context Associated Object Removal from Images
Ansh Jain, Kaushal Rai, Avinash Kumar, Kriti Goyal
CS 839: Deep Learning for Visual Recognition
The goal of the project was to create an end-to-end pipeline that removes unwanted objects from an image and produces a semantically coherent output. Therefore,
along with each object, its corresponding context (the shadow, in our case) also needs to be removed. We combined multiple modules to achieve this, including
object and shadow segmentation, superpixel creation to fine-tune the segmentation output, and inpainting using a generative model.
Developing GAN for Image-to-Image-Translation
Ansh Jain,
Kaushal Rai
CS 760: Machine Learning (Course Project)
We analyze the hypothesis that generative models trained through an adversarial process achieve appreciable results on Image-to-Image Translation tasks.
For this purpose, we develop a GAN and train it on the "maps" dataset from pix2pix. We further compare the developed model with a style transfer
architecture and present the results.
[Code]
Intelligent Query Response Time Prediction
Ansh Jain,
Kaushal Rai
CS 764: Topics in DBMS (Course Project)
The aim of the project is to apply supervised learning algorithms to predict query response time. We develop operator-level
query predictors for some of the most common database operations. The operator-level models are easier to train and can generalize
well to unseen queries.
[Code]
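As a rough illustration of operator-level prediction, the sketch below fits one small model per operator and estimates a query's response time by summing the estimates for the operators in its plan. The single feature (input rows), the linear model, the additive combination, and all names (`fit_line`, `train_operator_models`, `predict_query_time`) are assumptions made for this example, not the project's actual design.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def train_operator_models(samples):
    """Fit one model per operator.

    samples: {operator_name: [(input_rows, seconds), ...]} (hypothetical format)
    """
    models = {}
    for operator, points in samples.items():
        xs, ys = zip(*points)
        models[operator] = fit_line(xs, ys)
    return models


def predict_query_time(models, plan):
    """Sum per-operator estimates for a plan given as [(operator_name, input_rows)].

    Assumes operator costs are additive, which ignores pipelining and overlap.
    """
    return sum(models[op][0] * rows + models[op][1] for op, rows in plan)
```

Because each model only sees one operator type, a query with an unseen combination of operators can still be scored as long as every operator in its plan has a trained model.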
The source code for the website can be found here. This is a modification of the template
by Jon Barron.