Atharva Anand Joshi

Hi everyone! I am Atharva Anand Joshi, a graduate student at Carnegie Mellon University. I am pursuing a master’s in Electrical and Computer Engineering with a concentration in AI/ML Systems. I am intrigued by the application of machine learning algorithms to audio, finance and marketing use cases. My research interests include speech processing, deep learning and representation learning. Currently, I am a collaborator at the WAVLab under the guidance of Professor Shinji Watanabe.

At Hewlett-Packard - Poly, I explored DL-based algorithms for Noise Suppression. The pipeline I developed over the summer removes both stationary and impulsive background noise directly on the headset in real time. We also introduced a scalable deployment pipeline intended to make integration into upcoming products faster and more cost-effective.

I completed my B.E. in Electrical and Electronics Engineering from BITS Pilani in July 2022. During my undergrad, I worked on research projects under the guidance of Professor Syed Mohammad Zafaruddin (SM Research Group) and Professor Ananthakrishna Chintanpalli.

Before joining CMU, I worked as an Analyst at American Express, AI Labs. Here, I explored modelling approaches that blend Tabular deep learning with Tree-based algorithms for Credit Default Prediction: what is the probability that a given customer will fail to repay their outstanding debt in the near future? This estimate helps in setting the credit line and making other credit-related decisions.

During my internship at AmEx, I developed a template-based framework that allows users to seamlessly create and deploy their end-to-end Self Learning pipelines for Sequence Models. I have also interned at Adobe Research, India with mentors Dr Atanu Sinha and Dr Sunav Choudhary. Here, we created concise user representations that can be projected onto edge servers, enabling faster marketing services.

Feel free to reach out for research collaboration, academic guidance (if you are an undergraduate student from BITS) or even just for a chat! For GRE and TOEFL preparation guidance, especially under time constraints, do check out this repository.

Email / Resume / Google Scholar / LinkedIn / GitHub

Education

Carnegie Mellon University, Pittsburgh, PA

Master of Science in Electrical and Computer Engineering (December 2024)
Concentration: AI/ML Systems

GPA: 4.0/4.0

Birla Institute of Technology and Science, Pilani

B.E. Electrical and Electronics Engineering (July 2022)

GPA: 9.49/10
GRE: 331/340 (AWA: 4/6)
TOEFL: 109/120

Experience

Watanabe’s Audio and Voice (WAV) Lab

Graduate Research Project
January 2024 - Present

Speech Processing: Enhancement and Separation

Hewlett-Packard Inc.

Research and Development
May 2023 - August 2023

Deep Learning for Noise Suppression on Poly Headsets

American Express, Artificial Intelligence Labs

AiDa Deploy Team
January 2022 - December 2022 (Intern and Full-time)

1) Credit Default Prediction through Deep Learning

2) Template-based Development and Deployment on AiDa

Adobe Research, India

Big Data Experience Lab
May 2021 - August 2021

Edge Computing for Marketing Technology

Publications

[1] A. A. Joshi, H. Settibhaktini and A. Chintanpalli, "Modeling Concurrent Vowel Scores Using the Time Delay Neural Network and Multitask Learning," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 2452-2459, 2022, doi: 10.1109/TASLP.2022.3192096. (Link)

[2] A. A. Joshi, P. Bhardwaj and S. M. Zafaruddin, "Terahertz Wireless Transmissions with Maximal Ratio Combining over Fluctuating Two-Ray Fading," 2022 IEEE Wireless Communications and Networking Conference (WCNC), 2022, pp. 1575-1580, doi: 10.1109/WCNC51071.2022.9771926. (Link)

Patent

S. Chakraborty, S. Choudhary, A. Sinha, S. Nair, M. Ghuhan, Y. Gagneja, A. Joshi, A. Tyagi, S. Gupta, “Generating Concise and Common User Representations for Edge Systems from Event Sequence Data Stored on Hub Systems”, US 17/849,320, Filed Jun 24, 2022. (Link)

Teaching and Mentoring

Teaching Assistant for 10-605: ML with Large Datasets

Instructors: Prof. Ameet Talwalkar and Prof. Geoff Gordon

Course Website: 10-405/10-605

Teaching Assistant for 18-661: Introduction to Machine Learning for Engineers

Instructors: Prof. Yuejie Chi and Prof. Beidi Chen

Course Website: 18-461/18-661

Teaching Assistant for BITS F312: Neural Networks and Fuzzy Logic

Instructors: Prof. Surekha Bhanot and Prof. Bijoy Krishna Mukherjee

Course Website: Team NNFL

Here's a presentation on how to read research papers: Demystifying Research Papers

Projects

ESPnet: End-to-end speech processing toolkit

Contributed an end-to-end reproducible deep learning pipeline for the Kinect-WSJ dataset – a multichannel, reverberated and noisy version of the WSJ0-2mix dataset.

Performed benchmark analysis on this speech separation dataset using current state-of-the-art models.

Code / Model

Proactive Servicing: Guess What? (Amex ML Challenge)

Combined event sequences and demographic data to predict customer intent at the start of the Ask Amex chat session.

The approach involved joint training of Bidirectional GRU with Feedforward Networks.

Attained a validation top-5 accuracy score of 0.768. Our solution made it to the top 10 leaderboard and was selected for internal presentation.

High Performance Parallel Implementations for Convolutional Neural Networks

Provided fast OpenMP and CUDA implementations for various subroutines corresponding to the convolution layer.

Investigated several design choices in detail, aiming to maximize speedup over a simple sequential C implementation (simple_cnn).

Achieved a maximum speedup of 4.23x on the Intel(R) Xeon(R) Silver 4208 CPU and 73.87x on the Nvidia Tesla T4 GPU.

Report

Concurrent Vowel Identification using TDNN-MTL

Predicted the effect of fundamental frequency (F0) difference on the identification scores in a concurrent vowel identification experiment using Deep Learning.

A temporal network architecture was used to model short-term and long-term dependencies in the neuron responses generated by the Auditory Nerve Model.

Paper

Quasi-Newton Methods for Deep Learning

Replication of the NeurIPS 2020 paper: Practical Quasi-Newton Methods for Training Deep Neural Networks (Donald Goldfarb, Yi Ren, Achraf Bahamou).

Theoretical analysis of the algorithm, along with a comparison against other first- and second-order descent methods across multiple datasets.

Report

Terahertz wireless transmissions with MRC receiver over FTR fading

Framework to perform numerical analysis on FTR channel models obtained by combining small-scale fading and antenna misalignment effects.

This analysis can also be verified using Monte Carlo simulations.

Code / Paper

Biomedical Image Segmentation using Convolutional Neural Networks

A Keras implementation for the research paper U-Net: Convolutional Networks for Biomedical Image Segmentation (Olaf Ronneberger, Philipp Fischer, and Thomas Brox).

Includes Overlap Tile Strategy and Random Elastic Transformations.

Code

Drone Object Detection

Lightweight CNN model trained on the DOTA-v1.0 dataset and deployed to the Nvidia Jetson Nano for real-time object detection.

Code

Music

I have been an avid practitioner and performer of Hindustani Classical Vocal Music for the past fourteen years.
Here's a short article on Raga that I wrote in my sophomore year (2020). I would love to hear your thoughts!

Here is my musical journey during college and beyond:

My Instagram Channel

On this channel, I post Classical music, Ghazals and Bollywood covers in my free time.

Link

Ragamalika, the Classical Music and Dance Club of BITS Pilani

Joint Coordinator

Composed and performed music for our semester productions: Nritya Ranjani and Sangamam

Managed external professional concerts: We've had the opportunity to host some wonderful artists in the past, including Pandit Jayateerth Mevundi, Pandit Abhishek Raghuram and IndoSoul by Karthick Iyer.

Instagram / Youtube / Facebook

Thanks to Jon Barron for this template.