Karthik Bhaskar
Senior AI Scientist
About Me
I am a Senior AI Scientist at CIBC. Previously, I worked as an ML Researcher at WangLab, affiliated with the Vector Institute and University Health Network, proudly advised by Prof. Bo Wang.
I completed my Master's degree in ECE, specializing in Machine Learning, at the University of Toronto, advised by Prof. Deepa Kundur and Prof. Yuri Lawryshyn. In my free time I work on projects at the intersection of Deep Learning, Natural Language Processing, and Healthcare. I am also interested in Large Language Models and Deep Reinforcement Learning.
My ultimate goal is to build robust, privacy-preserving, and interpretable algorithms with a human-like ability to generalize in real-world environments, using data as its own supervision.
I am a Stream Owner and Discussion Group Lead of the "Machine Learning in Healthcare" stream at AISC (Aggregate Intellect), where we discuss one paper each week, ranging from fundamentals to state-of-the-art ML research in healthcare.
Throughout my life, I have approached every challenge with enthusiasm, creativity, and a ceaseless desire to achieve success. This passion and drive have paved the way to countless opportunities, unique experiences, and excellent relationships, both personally and professionally. I enjoy working with people and discussing ideas.
Research Interests
- Large Language Models
- Deep Learning
- Natural Language Processing
- Deep Reinforcement Learning
- Privacy-Preserving ML
- Recommender Systems
Experience
Senior AI Scientist
CIBC
2020 - Present
- Designed and implemented Retrieval-Augmented Generation (RAG) systems using vector databases, improving information retrieval accuracy by 35% and enabling more context-aware AI interactions.
- Implemented robust LLMOps practices, leveraging MLflow for experiment tracking and Databricks ML for managing machine learning workflows, resulting in a 25% reduction in model deployment time and improved reproducibility of results.
- Architected and optimized distributed compute environments for running large-scale language models, implementing data and model parallelism techniques to efficiently handle massive datasets and reduce training time.
- Developed comprehensive observability and monitoring solutions for LLMs, tracking performance metrics, model drift, and resource usage, improving model reliability and reducing operational costs.
- Engineered and deployed an innovative HR chatbot powered by GPT-4, analyzing HR documents and providing instant responses to employee queries, reducing average response time by 99% (from 15-20 minutes to seconds) and significantly enhancing organizational efficiency.
- Engineered a cutting-edge Strategic Workforce Management tool leveraging Gaussian Deep Learning Models and Graph Neural Networks, enabling HR to predict 3-5 year workforce dynamics (attrition, hiring, internal moves, retirements) with uncertainty quantification and scenario planning capabilities, resulting in a 40% reduction in decision-making time for data-driven workforce planning.
- Optimized data pipeline efficiency by identifying and resolving I/O bottlenecks and inefficient data shuffling, resulting in a 50% reduction in data processing time for large-scale ML workloads.
- Engineered and deployed cutting-edge Large Language Models (LLMs) including GPT-3.5, Llama, and Falcon, implementing fine-tuning techniques and prompt engineering to enhance search functionality and efficiency of our internal platform, resulting in a 30% improvement in query response accuracy.
- Developed search tools and ML models using NLP techniques to enhance the team's ability to investigate potential money-laundering activity and identify trends, resulting in a 35% reduction in investigation time.
- Decreased call-center call volumes by 25% by building a Conversational AI system integrating Automatic Speech Recognition and Neural Agent Assistance with state-of-the-art Transformer models; published this work in the EMNLP 2022 Industry Track.
- Reduced manual labour by 80% by deploying an end-to-end, self-serve BERT model to analyze and extract insights from client feedback.
- Engineered and deployed a high-performance Semantic Search Engine using Transformers to retrieve relevant data for downstream ML tasks such as dashboards, implementing data and model parallelism to accelerate training; used Streamlit, Docker, and Docker Compose for seamless deployment and scaling.
- Built Transformer-based topic modelling on deficiency data and used the results in a downstream classification model to predict deficiency categories.
- Built a DistilGPT-2-based Transformer neural network to generate synthetic financial data for downstream ML models.
- Managed, trained, and interviewed full-time employees and interns in Data Science and Software Engineering roles.
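As an illustration of the retrieval step behind the RAG systems mentioned above, here is a minimal, self-contained sketch. It uses a toy bag-of-words embedding and an in-memory store; the names `embed`, `VectorStore`, and `build_prompt` are illustrative only, not the production system:

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real RAG system would use a neural encoder.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    # Stand-in for a vector database: documents stored with their embeddings.
    def __init__(self, docs):
        self.docs = [(d, embed(d)) for d in docs]

    def top_k(self, query, k=2):
        q = embed(query)
        ranked = sorted(self.docs, key=lambda dv: cosine(q, dv[1]), reverse=True)
        return [d for d, _ in ranked[:k]]

def build_prompt(query, store):
    # Retrieved passages are prepended so the LLM answers with grounded context.
    context = "\n".join(store.top_k(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

store = VectorStore([
    "Vacation requests are submitted through the HR portal.",
    "The cafeteria opens at 8 am.",
    "Payroll questions go to the HR service desk.",
])
prompt = build_prompt("How do I submit a vacation request?", store)
```

In a production setting the toy embedding and linear scan would be replaced by a learned encoder and an approximate-nearest-neighbour index, but the retrieve-then-prompt flow is the same.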
ML Researcher
WangLab - Vector Institute
2020 - 2021
- WangLab is affiliated with the Vector Institute and University Health Network; advised by Prof. Bo Wang.
- Worked on a project at the intersection of Computational Biology, Deep Learning and Natural Language Processing.
- Applied Self-Supervised Learning, Weak Supervision, and Data Programming to the MIMIC-III database.
- Built a Transformer-based architecture to improve accuracy on a massively multi-task classification problem.
Graduate Machine Learning Researcher
University of Toronto
2018 - 2020
- Developed a state-of-the-art, deep learning-based recommendation system for Wolseley's e-commerce website, taking it from scratch to production.
- Worked with a massive dataset of more than 200,000 unique customers and 500,000 unique SKUs.
- Achieved a personalized NDCG score of 72.4% and improved the One-Product Hit Ratio to 100%.
Data Science/Machine Learning Intern
Cybersecurity Research
Summer 2018 and Summer 2019
- Worked with cybersecurity professionals and built a defense mechanism using deep learning and unsupervised learning techniques to prevent cyberattacks.
- Built an unsupervised auto-tagging algorithm and automatic rule synthesis for Octavius.
- Investigated an Automatic Rule synthesis algorithm for Octavius using Deep Reinforcement Learning to improve overall defense mechanism.
- Devised a deep learning model that detects and prevents cyberattacks before they happen (ProjectX).
- Applied NLP, neural networks, and knowledge graphs for tasks such as keyword extraction.
Projects
- Multi-Agent Deep Deterministic Policy Gradient
- Defense GAN & Physical Adversarial Examples
- Deep Deterministic Policy Gradient
- Human Like Chess Engine
- Navigation: Deep Q Networks
- Detecting Deforestation from Satellite Images
- Hyper Face
- Self Driving Cars
- Music Generation
- Microsoft Malware Detection
- Facebook Social Network Graph Prediction
- Google Advertisement Click Prediction
Research Publications
Bringing the State-of-the-Art to Customers: A Neural Agent Assistant Framework for Customer Service Support
Building Agent Assistants that can help improve customer service support requires inputs from industry users and their customers, as well as knowledge about state-of-the-art Natural Language Processing (NLP) technology. We combine expertise from academia and industry to bridge the gap and build task/domain-specific Neural Agent Assistants (NAA) with three high-level components for: (1) Intent Identification, (2) Context Retrieval, and (3) Response Generation. In this paper, we outline the pipeline of the NAA core system and also present three case studies in which three industry partners successfully adapt the framework to find solutions to their unique challenges. Our findings suggest that a collaborative process is instrumental in spurring the development of emerging NLP models for Conversational AI tasks in industry. The full reference implementation code and results are available at https://github.com/VectorInstitute/NAA.
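The three NAA components named in the abstract can be sketched schematically. This is a toy pipeline with keyword-based intent matching and a stubbed generator — the function names, intents, and knowledge base are invented for illustration and are not the reference implementation linked above:

```python
def identify_intent(utterance, intents):
    # (1) Intent Identification: naive keyword match standing in for a trained classifier.
    scores = {name: sum(kw in utterance.lower() for kw in kws)
              for name, kws in intents.items()}
    return max(scores, key=scores.get)

def retrieve_context(intent, knowledge_base):
    # (2) Context Retrieval: look up passages by intent; a real system would use dense retrieval.
    return knowledge_base.get(intent, [])

def generate_response(utterance, context):
    # (3) Response Generation: stub for a generative model conditioned on retrieved context.
    return f"Based on: {context[0]}" if context else "Let me connect you to an agent."

intents = {"billing": ["invoice", "charge", "bill"],
           "shipping": ["deliver", "ship", "track"]}
kb = {"billing": ["Invoices are emailed on the 1st of each month."],
      "shipping": ["Orders ship within 2 business days."]}

utterance = "Why was there an extra charge on my bill?"
intent = identify_intent(utterance, intents)
reply = generate_response(utterance, retrieve_context(intent, kb))
```

Each stage is independently swappable, which is what lets industry partners adapt the framework to their own domains.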
Implicit Feedback Deep Collaborative Filtering Product Recommendation System
In this paper, several Collaborative Filtering (CF) approaches with latent variable methods were studied using user-item interactions to capture important hidden variations of the sparse customer purchasing behaviours. The latent factors are used to generalize the purchasing pattern of the customers and to provide product recommendations. CF with Neural Collaborative Filtering (NCF) was shown to produce the highest Normalized Discounted Cumulative Gain (NDCG) performance on the real-world proprietary dataset provided by a large parts supply company. Different hyperparameters were tested using Bayesian Optimization (BO) for applicability in the CF framework. External data sources like click-data and metrics like Clickthrough Rate (CTR) were reviewed for potential extensions to the work presented. The work shown in this paper provides techniques the Company can use to provide product recommendations to enhance revenues, attract new customers, and gain advantages over competitors.
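For reference, the NDCG metric reported in this paper can be computed as follows — a minimal sketch of the standard formulation, not the evaluation code used in the study:

```python
import math

def dcg(relevances):
    # Discounted cumulative gain: each gain is discounted by log2 of its rank position
    # (rank 1 -> log2(2), rank 2 -> log2(3), ...), so early hits count more.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances):
    # Normalize by the ideal ordering (relevances sorted descending),
    # giving a score in [0, 1] where 1 means a perfectly ranked list.
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal else 0.0
```

A perfectly ordered ranking scores 1.0, and any relevant item pushed down the list lowers the score, which is why NDCG is a natural fit for comparing recommendation models.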