Karthik Bhaskar
Senior AI Scientist
About Me
I am a Senior AI Scientist at CIBC. Previously, I worked as an ML Researcher at WangLab, which is affiliated with the Vector Institute and University Health Network, proudly advised by Prof. Bo Wang.
I completed my Master's degree in ECE, specializing in Machine Learning, at the University of Toronto, advised by Prof. Deepa Kundur and Prof. Yuri Lawryshyn. In my free time, I work on projects at the intersection of Deep Learning, Natural Language Processing, and Healthcare. I am also interested in Large Language Models and Deep Reinforcement Learning.
My ultimate goal is to build robust, privacy-preserving, and interpretable algorithms with a human-like ability to generalize in real-world environments by using data as its own supervision.
I am a Stream Owner and Discussion Group Lead of the "Machine Learning in Healthcare" stream at AISC (Aggregate Intellect), where we discuss one paper each week, ranging from the basics to state-of-the-art ML papers in healthcare.
Throughout my life, I have approached every challenge with enthusiasm, creativity, and a ceaseless desire to achieve success. This passion and drive have paved the way to countless opportunities, unique experiences, and excellent relationships, both personally and professionally. I enjoy working with people and discussing ideas.
Research Interests
- Large Language Models
- Deep Learning
- Natural Language Processing
- Deep Reinforcement Learning
- Privacy-Preserving ML
- Recommender Systems
Experience
Senior AI Scientist
CIBC
2020 - Present
- Designed and implemented Retrieval-Augmented Generation (RAG) systems using vector databases, improving information retrieval accuracy by 35% and enabling more context-aware AI interactions.
- Implemented robust LLMOps practices, leveraging MLflow for experiment tracking and Databricks ML for managing machine learning workflows, resulting in a 25% reduction in model deployment time and improved reproducibility of results.
- Architected and optimized distributed compute environments for running large-scale language models, implementing data and model parallelism techniques to efficiently handle massive datasets and reduce training time.
- Developed comprehensive observability and monitoring solutions for LLMs, tracking performance metrics, model drift, and resource usage, leading to an improvement in model reliability and a reduction in operational costs.
- Engineered and deployed an innovative HR chatbot powered by GPT-4, analyzing HR documents and providing instant responses to employee queries, reducing average response time by 99% (from 15-20 minutes to seconds) and significantly enhancing organizational efficiency.
- Engineered a cutting-edge Strategic Workforce Management tool leveraging Gaussian Deep Learning Models and Graph Neural Networks, enabling HR to predict 3-5 year workforce dynamics (attrition, hiring, internal moves, retirements) with uncertainty quantification and scenario planning capabilities, resulting in a 40% reduction in decision-making time for data-driven workforce planning.
- Optimized data pipeline efficiency by identifying and resolving I/O bottlenecks and inefficient data shuffling, resulting in a 50% reduction in data processing time for large-scale ML workloads.
- Engineered and deployed cutting-edge Large Language Models (LLMs) including GPT-3.5, Llama, and Falcon, implementing fine-tuning techniques and prompt engineering to enhance search functionality and efficiency of our internal platform, resulting in a 30% improvement in query response accuracy.
- Developed Search tools and ML models using NLP techniques to enhance the team's ability to investigate potential money laundering activity and identify trends, resulting in a 35% reduction in investigation time.
- Decreased call-center call volume by 25% by building a Conversational AI system that integrates Automatic Speech Recognition and Neural Agent Assistance with state-of-the-art Transformer models. Published this work in the EMNLP 2022 Industry Track.
- Reduced manual labour by 80% by deploying an end-to-end, self-serve BERT model to analyze and extract insights from client feedback.
- Built a Semantic Search Engine using Transformers to retrieve relevant data for downstream ML tasks such as building dashboards.
- Engineered and deployed a high-performance Semantic Search Engine using Transformers, implementing data and model parallelism to improve training. Utilized Streamlit, Docker, and Docker Compose for seamless deployment and scaling.
- Built Transformer-based topic modelling on deficiency data and used the results in a downstream classification model to predict deficiency categories.
- Built a DistilGPT-2-based Transformer neural network to generate synthetic financial data for downstream ML models.
- Managed, trained, and interviewed full-time employees and interns in Data Science and Software Engineering roles.
ML Researcher
WangLab - Vector Institute
2020 - 2021
- WangLab is affiliated with the Vector Institute and University Health Network; advised by Prof. Bo Wang.
- Worked on a project at the intersection of Computational Biology, Deep Learning and Natural Language Processing.
- Applied Self-Supervised Learning, Weak Supervision, and Data Programming to the MIMIC-III database.
- Built a Transformer-based model to improve accuracy on a massively multi-task classification problem.
Graduate Machine Learning Researcher
University of Toronto
2018 - 2020
- Responsible for developing a state-of-the-art deep learning-based recommendation system.
- Built a Deep Learning-based Recommendation System for Wolseley's e-commerce website from scratch to production.
- Worked with a large dataset of more than 200,000 unique customers and 500,000 unique SKUs.
- Achieved a personalized NDCG score of 72.4% and improved the One-Product Hit Ratio to 100%.
Data Science/Machine Learning Intern
Cybersecurity Research
Summer 2018 and Summer 2019
- Worked with cybersecurity professionals and built a defense mechanism using deep learning and unsupervised learning techniques to prevent cyberattacks.
- Built an unsupervised auto-tagging algorithm and automatic rule synthesis for Octavius.
- Investigated an automatic rule synthesis algorithm for Octavius using Deep Reinforcement Learning to improve the overall defense mechanism.
- Devised a Deep Learning model that detects and prevents cyberattacks before they happen (ProjectX).
- Used NLP, neural networks, and knowledge graphs for tasks such as keyword extraction.
Projects
Multi-Agent Deep Deterministic Policy Gradient
Defense GAN & Physical Adversarial Examples
Deep Deterministic Policy Gradient
Human Like Chess Engine
Navigation: Deep Q Networks
Detecting Deforestation from Satellite Images
Hyper Face
Self Driving Cars
Music Generation
Microsoft Malware Detection
Facebook Social Network Graph Prediction
Google Advertisement Click Prediction
Research Publications
Bringing the State-of-the-Art to Customers: A Neural Agent Assistant Framework for Customer Service Support
Building Agent Assistants that can help improve customer service support requires inputs from industry users and their customers, as well as knowledge about state-of-the-art Natural Language Processing (NLP) technology. We combine expertise from academia and industry to bridge the gap and build task/domain-specific Neural Agent Assistants (NAA) with three high-level components for: (1) Intent Identification, (2) Context Retrieval, and (3) Response Generation. In this paper, we outline the pipeline of the NAA core system and also present three case studies in which three industry partners successfully adapt the framework to find solutions to their unique challenges. Our findings suggest that a collaborative process is instrumental in spurring the development of emerging NLP models for Conversational AI tasks in industry. The full reference implementation code and results are available at https://github.com/VectorInstitute/NAA.
Implicit Feedback Deep Collaborative Filtering Product Recommendation System
In this paper, several Collaborative Filtering (CF) approaches with latent variable methods were studied using user-item interactions to capture important hidden variations in sparse customer purchasing behaviours. The latent factors are used to generalize the purchasing patterns of the customers and to provide product recommendations. CF with Neural Collaborative Filtering (NCF) was shown to produce the highest Normalized Discounted Cumulative Gain (NDCG) performance on the real-world proprietary dataset provided by a large parts supply company. Different hyperparameters were tested using Bayesian Optimization (BO) for applicability in the CF framework. External data sources like click-data and metrics like Clickthrough Rate (CTR) were reviewed for potential extensions to the work presented. The work shown in this paper provides techniques the Company can use to provide product recommendations to enhance revenues, attract new customers, and gain advantages over competitors.
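For readers unfamiliar with the NDCG metric mentioned above, the following is a minimal sketch of how it can be computed for implicit feedback. It assumes binary relevance (an item is relevant if the customer purchased it) and a cutoff `k`; the function names and the toy data are hypothetical, and the paper's exact evaluation protocol may differ.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k positions (rank 0 is the top)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_items, purchased, k):
    """NDCG@k with binary relevance: 1.0 if a recommended item was purchased.

    ranked_items: recommended item IDs, best first (hypothetical model output).
    purchased: set of held-out item IDs the customer actually bought.
    """
    relevances = [1.0 if item in purchased else 0.0 for item in ranked_items]
    # The ideal ranking places every purchased item at the top of the list.
    ideal = [1.0] * min(len(purchased), k)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no held-out purchases, so the score is defined as 0 here
    return dcg_at_k(relevances, k) / ideal_dcg


# Toy example: two of the four recommended SKUs were purchased.
score = ndcg_at_k(["a", "b", "c", "d"], {"a", "c"}, k=4)
print(round(score, 4))
```

A ranking that places all purchased items first scores 1.0, and the score drops as purchased items appear lower in the list.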