Student Developer June 2021 - August 2021
International Neuroinformatics Coordinating Facility (INCF), Google Summer of Code 2021
- Developed a robust open-source eye tracker
- Check out the project here
I'm a second-year Master's student at the University of Michigan, Ann Arbor, specializing in Computer Vision.
I am passionate about technology and keep up to date with recent developments in the field.
With a love for both hardware and software, I enjoy tinkering and creating. Check out this prototype instant noodle maker my friends and I built!
I also like taking simple games and making them more fun. Check out CompleXO, an interesting twist on an all-time classic.
My main interest lies in Artificial Intelligence, specifically Deep Learning for Computer Vision. The field has limitless use cases and applications, and I want to contribute to it by building systems and algorithms that aid and ease human life.
Computers have been an integral part of my life, and I've been coding since I was 12. While my language of choice is Python, I am also comfortable in Java. I am passionate about Computer Vision and Deep Learning and have been learning and building projects related to them since 2017.
I pick up new skills very quickly, which has given me a varied skill set. I have worked with both PyTorch and TensorFlow and can quickly and efficiently debug code.
International Neuroinformatics Coordinating Facility (INCF), Google Summer of Code 2021
Fynd, Mumbai, India
SensoVision Systems, Bengaluru, India
Ericsson R&D, Chennai, India
Electrical Engineering and Computer Science
Specialization in Computer Vision
University of Michigan, Ann Arbor, USA
2021 - present
CGPA: 3.953/4.00
Course work: Foundations of Computer Vision (EECS 504), Advanced Topics in Computer Vision (EECS 542), Deep Learning for Computer Vision (EECS 598), Artificial General Intelligence (EECS 598), Robotics Systems Lab (ROB 550), AI in Biomedical Engineering (BIOMEDE 499), Computational Data Science and Machine Learning (EECS 505), Probability and Random Processes (EECS 501), Interpersonal Skills (ENTR 550)
Electronics and Computer Engineering
Vellore Institute of Technology, Chennai, India
2016 - 2020
CGPA: 9.12/10.0
The Indian School Certificate (ISC)
The Cathedral and John Connon School, Mumbai, India
2014 - 2016
Aggregate: 94%
The Indian Certificate of Secondary Education (ICSE)
The Cathedral and John Connon School, Mumbai, India
2014 - 2016
Aggregate: 95%
Check out my Google Scholar page
Abstract:
In this paper, we propose an image matting framework called Salient Image Matting (SIM) to estimate the per-pixel opacity value of the most salient foreground in an image. To deal with the large amount of semantic diversity in images, a trimap is conventionally required, as it provides important guidance about object semantics to the matting process. However, creating a good trimap is often expensive and time-consuming. The SIM framework simultaneously deals with the challenge of learning a wide range of semantics and salient object types in a fully automatic, end-to-end manner. Specifically, our framework is able to produce accurate alpha mattes, directly from an RGB input, for a wide range of foreground objects and for cases where the foreground class, such as human, appears in a very different context than the training data. This is done by employing a salient object detection model to produce a trimap of the most salient object in the image in order to guide the matting model about higher-level object semantics. Our framework leverages large amounts of coarse annotations coupled with a heuristic trimap generation scheme to train the trimap prediction network so it can produce trimaps for arbitrary foregrounds. Moreover, we introduce a multi-scale fusion architecture for the task of matting to better capture finer, low-level opacity semantics. With high-level guidance provided by the trimap network, our framework requires only a fraction of the expensive matting data needed by other automatic methods while being able to produce alpha mattes for a diverse range of inputs. We demonstrate our framework on a range of diverse images, and experimental results show that it compares favourably against state-of-the-art matting methods without the need for a trimap.
Abstract:
5G is the next wave in the communication industry, where end customers enjoy multiple services with minimum latency. Since 5G towers operate in the high-frequency spectrum, their coverage is very limited compared with earlier generations, which necessitates installing a larger number of mobile towers. One of the key requirements for enabling the full-scale deployment of new towers alongside the existing mobile towers is ensuring their regular inspection and maintenance without human intervention. While research has been carried out to develop autonomous drones for monitoring mobile towers, the development is limited to the inspection stage and is not capable of fixing faults without human assistance. The fundamental limitation that prevented carrying out repair tasks was the absence of a competent robotic arm that can act as a human arm. Traditional drones are equipped only with a bottom-mounted robotic manipulator, which cannot access all the points on a typical wireless infrastructure with a positive slope. This paper presents the design, from scratch, of an autonomous drone with a robotic arm on top to inspect and repair faults in mobile towers. The main contribution is a novel coupling algorithm that takes proactive action to stabilize the drone, with less energy, while the arm carries out the mending tasks. The drone is also tested in a real-time windy environment emulating the mobile tower scenario.
Abstract:
Most automobiles lack reliable smart systems that can constantly track the driver's behaviour and raise alarms as required. Extant systems are either too slow or not robust enough to cope with different types of drivers and conditions. In this paper, a robust system to continuously track the driver's eye and detect its state (open/closed) is proposed. Frames from a live camera feed are constantly processed. The Viola-Jones algorithm, using Haar filters, extracts the eye. The extraction is efficient both with and without (translucent) spectacles, and the system can even estimate the Region of Interest (RoI) where it is most likely to find the eye in the event that no eyes are explicitly detected. A trained CNN model using the LeNet architecture classifies the extracted eyes. The rate at which predictions are made is also higher than that of existing systems. The system raises an alarm if, after analysing the data points, it detects any anomalies.
Abstract:
Deep learning and other big data technologies have become very powerful and accurate over time. Algorithms and models have been developed that achieve near-human accuracy at their tasks. In health care, the amount of data available is massive, and hence these technologies have great potential. This paper reviews a few interesting contributions to the field, specifically in medical imaging, genomics and patient health records.
A Deep Learning-based anomaly detection module that can be trained on only good/non-defective images and is able to locate anomalies/defects.
A super simple implementation of Semantic Segmentation using PyTorch.
A simple implementation of accurate, fast object detection using YOLOv5.
A super simple, object oriented, classifier in TensorFlow 2.x.
A complete heads-up display designed for formula student teams. Collects and displays information from multiple sensors.