Experience
Total Experience: 3 years and 10 months +
Technology Innovation Institute (TII)📍Abu Dhabi, UAE
👨💻 Machine Learning Engineer ⌛ 2023/Oct - Present (~7 months +)
Introduced image-understanding capabilities into text-only Large Language Models (LLMs) by fine-tuning on visual question-answering datasets on AWS SageMaker using Hugging Face's Transformers, TRL, Datasets, and Accelerate packages
Leading the development of a vision-language dataset based on web documents with interleaved image-text, aiming to enhance the contextual understanding of LLMs in visual tasks
Collaborating to upgrade FalconLLM, TII's flagship open-source LLM, focusing on refining Gigatron, the distributed training framework similar to NVIDIA's Megatron and Hugging Face's Nanotron
G42’s Inception📍Abu Dhabi, UAE
👨💻 Applied Scientist ⌛ 2023/Jun - 2023/Aug (~3 months)
Collaborated with a team to develop Jais, an English-Arabic Large Language Model (LLM)
Set up and managed the LLM Arena using the FastChat package to benchmark in-house, open-source, and GPT-4 models with human annotators through Elo ratings
Processed and used LLM Arena data to align the fine-tuned models on harmlessness and usefulness using Reinforcement Learning from Human Feedback (RLHF) and Direct Preference Optimization (DPO) with the TRL package
Microsoft Research 📍Bengaluru, India
👨💻 Machine Learning Research Intern ⌛ 2022/May - 2022/Aug (~3 months)
Extensively evaluated the design choices for Text-To-Speech systems and open-sourced state-of-the-art models for 13 Indian languages [ICASSP Paper]
Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI) 📍Abu Dhabi, UAE
👨💻Machine Learning Research Assistant ⌛2021/Jan - 2021/Jul (~7 months)
Explored the Double Variational Autoencoder Network (DoENet) for unsupervised adversarial example classification in computer vision, demonstrating that it is agnostic to both the target classifier and the attack, with improved performance
TCS Research 📍Chennai, India
👨💻 Machine Learning Developer ⌛2019/Jul - 2020/Nov (~1 year and 5 months)
👨💻 Machine Learning Developer Intern ⌛2019/Apr - 2019/Jul (~3 months)
Enhanced the sales forecasting models for a dynamic pricing system of a prominent retail client, leveraging deep neural networks such as RNNs and LSTNet, along with a novel N-gram-based method [US Patent]
Implemented RNN-based spatiotemporal travel time prediction models to evaluate against temporal difference-based methods [IJCNN Paper]
Developed an employee profile retrieval system that parses information documents and responds to input text queries, as part of an internal profile hunt challenge [Special Initiative Award]
Developed a yield prediction model for hybrid corn crops to recommend effective crossing of species, winning an internal hackathon among 8 teams [Innovation Pride Award]
Designed computer vision pipelines for tasks such as curved text extraction and perspective correction, winning an internal image enrichment hackathon with over 100 participants [Innovation Pride Award]
IIT Madras | AI4Bharat | One Fourth Labs 📍Chennai, India
👨💻 Machine Learning Project Intern ⌛2018/Dec - 2019/Mar (~4 months)
👨💻 Summer Research Fellow ⌛2018/Jun - 2018/Aug (~2 months)
Created a scene text translation system for Indian languages using synthetic datasets, incorporating the Efficient and Accurate Scene Text Detector (EAST) for detection and Convolutional Recurrent Neural Networks (CRNN) for classification and recognition [Dataset] [Detection] [Classification] [Recognition]
Set up programming competitions for the One Fourth Labs Deep Learning course [Course]
Investigated attention models in Deep Learning, analyzing foreground region detection through proxy data with distinct statistical properties [Report]