Ansh Jain

I am currently working at Amazon as an Applied Scientist. My team is responsible for predicting, identifying and mitigating risks that a product or service might face when launched.
Prior to Amazon, I graduated from the University of Wisconsin-Madison with a Master's in Computer Science. During my time there, I also worked with Prof. Mohit Gupta on efficient video understanding. My courses included:

Fall 2021 - CS 760 Machine Learning, CS 532 Matrix Methods in Machine Learning, CS 764 Topics in DBMS
Spring 2022 - CS 769 Advanced NLP, CS 839 Deep Learning for Visual Recognition
Fall 2022 - CS 561 Probability and Information Theory in ML, CS 537 Operating Systems, CS 759 High Performance Computing

I completed my Bachelor's degree in Information Technology at the Netaji Subhas Institute of Technology, Delhi University (now Netaji Subhas University of Technology). Afterward, I worked at the Samsung R&D Institute India, Bangalore (SRIB) for 2 years as a software developer. My work focused on on-device video analytics, such as video classification, for integration into Samsung's video editor engine. I also contributed to the development of features such as 360 video editing, video summarization, Single Take, and video filters.

Previously, my research work has been mainly in the fields of Machine Learning, Computer Vision, and NLP. I worked with Prof. Vinay P. Namboodiri in the Vision and Language lab at IIT Kanpur on attention, multi-modal question answering, and machine translation.

These experiences and learnings motivated me to continue working in the field and to explore more topics in depth as a graduate student at UW-Madison. My enthusiasm for learning is unceasing, and I hope to continue creating an impact in the Computer Science field.

Email / Resume / Google Scholar / Github

Experience

	Applied Scientist \| Amazon (Amazon.com) Jan'23 - present Working on a project to improve the sustainability of features, services, and products across Amazon. The goal is to develop CV algorithms that understand architecture diagrams and recommend better architectural alternatives.
	Applied Science Intern \| Amazon (Amazon Lab126) May '22 - Aug'22 As part of the Halo Health CV team, I worked on 3D shape and pose estimation from 2D images. The research project involved implementing techniques in PyTorch for accurate prediction of 3D shape and pose from unconstrained 2D image input. I worked with state-of-the-art models such as SMPL and SMPL-X and gained experience with 3D geometry concepts such as human meshes, perspective projection, and camera intrinsics.
	Senior Software Engineer \| Samsung RnD Institute Bangalore June '19 - July'21 At Samsung I worked in the Video Editor team and was fortunate to contribute to the application. I spent the majority of my time at Samsung on the Video Classification project, where I gained experience in PyTorch, TensorFlow, Keras, and video processing. I also worked with state-of-the-art models such as MobileNets and Inception Nets. The rest of my time went to developing video editing features such as 360 video editing, tone filters, and video highlight creation. In addition, I was the main contributor from our team to the Single Take feature, which was the USP of the Samsung S21 and was presented at the launch event.
	Researcher \| Indian Institute of Technology Kanpur December '18 - March'19, Jan'20 - June'20 I worked full-time for the first two months and remotely part-time for the remaining time with the Computer Vision and Language lab. The project initially started as an improvement of the state of the art for Visual Question Answering (VQA) but later evolved into a general approach to improving attention networks in deep learning models, thus increasing their accuracy. The research work was published at WACV, one of the top conferences in computer vision.
	Software Development Intern \| Samsung RnD Institute Bangalore May '18 - July'18 The two-month internship was challenging and helped me develop an understanding of how the industry works. During my project, I created a GUI with a backend for Lightweight Machine-to-Machine (LWM2M) device management for IoT devices working on the Constrained Application Protocol (CoAP), using the Java Swing framework in Eclipse. In addition, I was among the top 3% of interns to clear the advanced-level coding test based on data structures and algorithms on the first attempt.

Research and Publications

	Self Supervision for Attention Networks Ansh Jain, Badri N Patro, Kasturi GS, Vinay P Namboodiri (equal contribution), Winter Conference on Applications of Computer Vision (WACV) 2021 In this paper, we propose a novel method to improve the attention mechanism by inducing “self-supervision”. The method can be generalized to any deep learning model that utilizes an attention module, and it can be extended to different input modalities such as text and image. [Code] \| [Initial Research]
	SCALe: Supervised Contrastive approach for Active Learning Hardik Chauhan, Ansh Jain, Kaushal Rai, Ritu Raut CS:769 Advanced NLP Active Learning approaches in NLP use Masked Language Modeling for tasks such as text classification. In this research work, we show that this approach has many limitations and that a supervised contrastive training procedure provides much more intuitive results. We successfully achieve an improvement over SOTA with only a fraction of the training data using the proposed technique. [Code]
	Detection and Classification of Radio Frequency Jamming Attacks using Machine learning Kasturi GS*, Ansh Jain, Jagdeep Singh (equal contribution)*, JoWUA journal, Vol. 11, No. 4, 2020 In this project, we explore the detection and classification of Radio Frequency Jamming attacks in wireless ad-hoc networks. This is necessary to take appropriate countermeasures against different types of attacks and prevent them. [Code]
	Paraphrase Question Generation Through Graph Convolution Network Ansh Jain, Kasturi GS, Badri N Patro In this project, we propose an encoder-decoder model to create a better sentence-level embedding and evaluate the model on the task of Paraphrase Question Generation. The semantics are captured by a pairwise decoder that forces the encodings of similar sentences to be close to each other. Further, the syntax of sentences is captured by stacking a GCN over LSTM states. [Code]

Other Projects

	Event Neural Network implementation Ansh Jain CS:799 Master's Research This work focused on comparing a traditional and an event-based neural network to evaluate the practical computational savings. The aim is to evaluate how the actual savings measure up against the theoretical claims of the paper. The code is developed from scratch in C++ (for VGG16) with basic data structures, as the concepts of the paper cannot be tested with standard deep learning frameworks and libraries. [Code]
	Context Associated Object Removal from Images Ansh Jain, Kaushal Rai, Avinash Kumar, Kriti Goyal CS 839: Deep Learning for Visual Recognition The goal of the project was to create an end-to-end pipeline that removes unwanted objects from an image and produces a semantically coherent output. Therefore, along with each object, its corresponding context (the shadow, in our case) also needs to be removed. We combined multiple modules to achieve this, including object and shadow segmentation, superpixel creation to fine-tune the segmentation output, and inpainting using a generative model.
	Developing GAN for Image-to-Image-Translation Ansh Jain, Kaushal Rai CS:760 Machine Learning: Course Project We analyze the hypothesis that generative models trained through an adversarial process achieve appreciable results on Image-to-Image Translation tasks. For this purpose, we develop a GAN and train it on the "maps" dataset from pix2pix. We further compare the developed model with a style transfer architecture and present the results. [Code]
	Intelligent Query Response Time Prediction Ansh Jain, Kaushal Rai CS:764 Topics in DBMS: Course Project The aim of the project is to apply supervised learning algorithms to predict query response time. We develop operator-level query predictors for some of the most common database operations. The operator-level models are easier to train and can generalize well to unseen queries. [Code]

The source code for the website can be found here. This is a modification of the template by Jon Barron.