Karthik Bhaskar

Senior AI Scientist @ CIBC

ML Researcher @ Vector Institute

Education

Master of Applied Science (M.A.Sc)

University of Toronto

Specialization: Deep Learning, Natural Language Processing, Healthcare

Advisors: Prof. Bo Wang, Prof. Deepa Kundur, Prof. Yuri Lawryshyn

Hi, I'm

Karthik Bhaskar

Senior AI Scientist

About Me

I am a Senior AI Scientist at CIBC. Previously, I worked as an ML Researcher at WangLab, affiliated with the Vector Institute and University Health Network, proudly advised by Prof. Bo Wang.

I completed my Master's degree in ECE, specializing in Machine Learning, at the University of Toronto, advised by Prof. Deepa Kundur and Prof. Yuri Lawryshyn. In my free time, I work on projects at the intersection of Deep Learning, Natural Language Processing, and Healthcare. I am also interested in Large Language Models and Deep Reinforcement Learning.

My ultimate goal is to build robust, privacy-preserving, and interpretable algorithms with a human-like ability to generalize in real-world environments by using data as its own supervision.

I am a Stream Owner and Discussion Group Lead of the "Machine Learning in Healthcare" stream at AISC (Aggregate Intellect), where we discuss one paper each week, ranging from the basics to state-of-the-art ML papers in healthcare.

Throughout my life, I have approached every challenge with enthusiasm, creativity, and a ceaseless desire to succeed. This passion and drive have paved the way for countless opportunities, unique experiences, and excellent relationships, both personally and professionally. I enjoy working with people and discussing ideas.

Research Interests

  • Large Language Models
  • Deep Learning
  • Natural Language Processing
  • Deep Reinforcement Learning
  • Privacy-Preserving ML
  • Recommender Systems

Experience

Senior AI Scientist

CIBC

2020 - Present

  • Designed and implemented Retrieval-Augmented Generation (RAG) systems using vector databases, improving information retrieval accuracy by 35% and enabling more context-aware AI interactions (a minimal retrieval sketch follows this list).
  • Implemented robust LLMOps practices, leveraging MLFlow for experiment tracking and Databricks ML for managing machine learning workflows, resulting in a 25% reduction in model deployment time and improved reproducibility of results.
  • Architected and optimized distributed compute environments for running large-scale language models, implementing data and model parallelism techniques to efficiently handle massive datasets and reduce training time.
  • Developed comprehensive observability and monitoring solutions for LLM models, tracking performance metrics, model drift, and resource usage, leading to an improvement in model reliability and a reduction in operational costs.
  • Engineered and deployed an innovative HR chatbot powered by GPT-4, analyzing HR documents and providing instant responses to employee queries, reducing average response time by 99% (from 15-20 minutes to seconds) and significantly enhancing organizational efficiency.
  • Engineered a cutting-edge Strategic Workforce Management tool leveraging Gaussian Deep Learning Models and Graph Neural Networks, enabling HR to predict 3-5 year workforce dynamics (attrition, hiring, internal moves, retirements) with uncertainty quantification and scenario planning capabilities, resulting in a 40% reduction in decision-making time for data-driven workforce planning.
  • Optimized data pipeline efficiency by identifying and resolving I/O bottlenecks and inefficient data shuffling, resulting in a 50% reduction in data processing time for large-scale ML workloads.
  • Engineered and deployed cutting-edge Large Language Models (LLMs) including GPT-3.5, Llama, and Falcon, implementing fine-tuning techniques and prompt engineering to enhance search functionality and efficiency of our internal platform, resulting in a 30% improvement in query response accuracy.
  • Developed Search tools and ML models using NLP techniques to enhance the team’s ability to investigate potential money laundering activity and identify trends, resulting in a 35% reduction in investigation time.
  • Decreased 25% of call-center call volumes by building a Conversational AI system, integrating Automatic Speech Recognition and Neural Agent Assistance with state-of-the-art Transformer models. Published our work at the EMNLP 2022 - Industry Track conference.
  • Reduced 80% of manual labour by deploying an end-to-end self-served BERT model to analyze and extract insights from client feedback.
  • Engineered and deployed a high-performance Semantic Search Engine using Transformers to surface relevant data for downstream ML tasks such as dashboards, implementing data and model parallelism to speed up training; utilized Streamlit, Docker, and Docker Compose for seamless deployment and scaling.
  • Built Transformer-based topic modelling on deficiency data and used the results in a downstream classification model to predict deficiency categories.
  • Built a DistilGPT-2-based Transformer model to generate synthetic financial data for downstream ML models.
  • Managed, trained, and interviewed full-time employees and interns in Data Science and Software Engineering roles.
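
Below is a minimal, illustrative sketch of the retrieval-augmented generation pattern referenced in the first bullet: documents are embedded, indexed in a vector store, and the top matches are stitched into a grounded prompt. It assumes sentence-transformers and FAISS purely for illustration (the systems described above used different models and vector databases), and llm_complete is a stand-in for whatever completion endpoint is available.

```python
# Minimal RAG sketch (illustrative only, not production code).
import numpy as np
import faiss
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "Employees accrue 3 weeks of vacation per year.",
    "Expense reports are due within 30 days of purchase.",
    "The benefits enrolment window opens every November.",
]

# Embed and L2-normalise so inner product equals cosine similarity.
doc_vecs = encoder.encode(documents, normalize_embeddings=True)
index = faiss.IndexFlatIP(doc_vecs.shape[1])
index.add(np.asarray(doc_vecs, dtype="float32"))

def retrieve(query: str, k: int = 2) -> list[str]:
    q = encoder.encode([query], normalize_embeddings=True)
    _, ids = index.search(np.asarray(q, dtype="float32"), k)
    return [documents[i] for i in ids[0]]

def llm_complete(prompt: str) -> str:
    # Placeholder: swap in the actual LLM client call available to you.
    return prompt

def answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    return llm_complete(
        f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    )

print(answer("How much vacation do employees get?"))
```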

ML Researcher

WangLab - Vector Institute

2020 - 2021

  • WangLab is affiliated with Vector Institute and University Health Network - Advised by Bo Wang
  • Worked on a project at the intersection of Computational Biology, Deep Learning and Natural Language Processing.
  • Applied Self-Supervised Learning, Weak Supervision, and Data Programming on the MIMIC-III database.
  • Built a Transformer-based architecture to improve the accuracy of a massively multi-task classification problem.

Graduate Machine Learning Researcher

University of Toronto

2018 - 2020

  • Responsible for developing a state-of-the-art, deep learning-based recommendation system.
  • Built a Deep Learning-based Recommendation System for Wolseley's e-commerce website from scratch to production.
  • The dataset was massive, covering more than 200,000 unique customers and 500,000 unique SKUs.
  • Achieved a personalized NDCG score of 72.4% and improved the One-Product Hit Ratio to 100% (the NDCG metric is sketched below).
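
For reference, the NDCG metric cited in the last bullet can be computed as in the short sketch below; this illustrates the metric itself, not the project's evaluation code, and the toy relevance list is made up.

```python
# Illustrative NDCG@k: discounted gain of the produced ranking divided by
# the gain of the ideal ranking.
import numpy as np

def ndcg_at_k(relevances: list[float], k: int) -> float:
    """relevances: graded relevance of recommended items, in ranked order."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = float((rel * discounts).sum())
    ideal = np.sort(np.asarray(relevances, dtype=float))[::-1][:k]
    idcg = float((ideal * discounts[: ideal.size]).sum())
    return dcg / idcg if idcg > 0 else 0.0

print(ndcg_at_k([1, 0, 1, 1, 0], k=5))  # ~0.91 for this toy ranking
```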

Data Science/Machine Learning Intern

Cybersecurity Research

Summer 2018 and Summer 2019

  • Worked with cybersecurity professionals and built a defense mechanism using deep learning and unsupervised learning techniques to prevent cyberattacks.
  • Built Unsupervised Auto Tagging algorithm and Automatic Rule Synthesis for Octavius.
  • Investigated an Automatic Rule Synthesis algorithm for Octavius using Deep Reinforcement Learning to improve the overall defense mechanism.
  • Devised a Deep Learning model that detects and prevents cyberattacks before they happen (ProjectX).
  • Used NLP, neural networks, and knowledge graphs for keyword extraction.

Projects

Multi-Agent Deep Deterministic Policy Gradient

View Project

Defense GAN & Physical Adversarial Examples

View Project

Deep Deterministic Policy Gradient

View Project

Human Like Chess Engine

View Project

Navigation: Deep Q Networks

View Project

Detecting Deforestation from Satellite Images

View Project

Hyper Face

View Project

Self Driving Cars

View Project

Music Generation

View Project

Microsoft Malware Detection

View Project

Facebook Social Network Graph Prediction

View Project

Google Advertisement Click Prediction

View Project

Research Publications

Bringing the State-of-the-Art to Customers: A Neural Agent Assistant Framework for Customer Service Support

Karthik Raja Kalaiselvi Bhaskar, Stephen Obadinma, et al. ACL Anthology - EMNLP, 2022

Building Agent Assistants that can help improve customer service support requires inputs from industry users and their customers, as well as knowledge about state-of-the-art Natural Language Processing (NLP) technology. We combine expertise from academia and industry to bridge the gap and build task/domain-specific Neural Agent Assistants (NAA) with three high-level components for: (1) Intent Identification, (2) Context Retrieval, and (3) Response Generation. In this paper, we outline the pipeline of the NAA core system and also present three case studies in which three industry partners successfully adapt the framework to find solutions to their unique challenges. Our findings suggest that a collaborative process is instrumental in spurring the development of emerging NLP models for Conversational AI tasks in industry. The full reference implementation code and results are available at https://github.com/VectorInstitute/NAA.
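
As a rough illustration of the three-component pipeline described in the abstract, the skeleton below wires intent identification, context retrieval, and response generation together. It is not the NAA reference implementation (see the linked repository); every function body here is a placeholder standing in for a trained model or retriever.

```python
# Illustrative skeleton of an agent-assist pipeline:
# intent identification -> context retrieval -> response generation.
def identify_intent(query: str) -> str:
    # Placeholder for a classifier over the support-intent taxonomy.
    return "billing_question"

def retrieve_context(query: str, intent: str) -> list[str]:
    # Placeholder for dense retrieval over intent-scoped knowledge articles.
    return ["Refunds are issued within 5 business days."]

def generate_response(query: str, context: list[str]) -> str:
    # Placeholder for a generator conditioned on the retrieved passages.
    return f"Based on our policy: {context[0]}"

def assist(query: str) -> dict:
    intent = identify_intent(query)
    context = retrieve_context(query, intent)
    return {
        "intent": intent,
        "context": context,
        "response": generate_response(query, context),
    }

print(assist("When will I get my refund?"))
```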

Implicit Feedback Deep Collaborative Filtering Product Recommendation System

Karthik Raja Kalaiselvi Bhaskar, Deepa Kundur, Yuri Lawryshyn arXiv, 2020

In this paper, several Collaborative Filtering (CF) approaches with latent variable methods were studied using user-item interactions to capture important hidden variations of the sparse customer purchasing behaviours. The latent factors are used to generalize the purchasing pattern of the customers and to provide product recommendations. CF with Neural Collaborative Filtering (NCF) was shown to produce the highest Normalized Discounted Cumulative Gain (NDCG) performance on the real-world proprietary dataset provided by a large parts supply company. Different hyperparameters were tested using Bayesian Optimization (BO) for applicability in the CF framework. External data sources like click-data and metrics like Clickthrough Rate (CTR) were reviewed for potential extensions to the work presented. The work shown in this paper provides techniques the Company can use to provide product recommendations to enhance revenues, attract new customers, and gain advantages over competitors.
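
For readers unfamiliar with NCF, the sketch below shows the general shape of a neural collaborative filtering model in PyTorch: user and item embeddings are concatenated and scored by a small MLP for implicit feedback. The layer sizes and embedding dimension are assumptions for illustration, not the configuration reported in the paper.

```python
# Illustrative NCF model (not the paper's exact architecture or hyperparameters).
import torch
import torch.nn as nn

class NCF(nn.Module):
    def __init__(self, n_users: int, n_items: int, dim: int = 32):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)
        self.item_emb = nn.Embedding(n_items, dim)
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, users, items):
        x = torch.cat([self.user_emb(users), self.item_emb(items)], dim=-1)
        return torch.sigmoid(self.mlp(x)).squeeze(-1)  # interaction probability

# Training would minimise binary cross-entropy over observed purchases
# (positives) and sampled non-interactions (negatives), then rank items per
# user and evaluate with NDCG.
model = NCF(n_users=200_000, n_items=500_000)
scores = model(torch.tensor([0, 1]), torch.tensor([10, 42]))
```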

Technical Skills

Machine Learning

PyTorch
TensorFlow
Keras
Scikit-learn
MLflow

LLMs & NLP

LangChain
LlamaIndex
HuggingFace
Spacy

Programming

Python
SQL
Git
Linux
Spark

Data Processing & Analysis

Numpy
Pandas
SciPy
PySpark

Visualization

Plotly
Seaborn
Matplotlib
Tableau

Web Development

Streamlit
Flask
AngularJS
NodeJS
Bootstrap

Cloud & DevOps

Azure
AWS
GCP
Docker
Kubernetes

Database & Vector Search

Pinecone
Ray

MLOps & LLMOps

Model Lifecycle
Experiment Tracking
Observability
Monitoring

Research Areas

Deep Learning
LLMs
Generative AI
Reinforcement Learning
Recommendation Systems

Conference Talks & Presentations

Deep Learning in HealthCare and its Practical Limitations

Jan 20, 2021 7:30 PM
AISC
Virtual Talk
Deep learning uses statistical techniques to give computer systems the ability to learn from incoming data, identify patterns, and make decisions with minimal human direction. Armed with such targeted analytics, doctors may be better able to assess risk, make correct diagnoses, and offer patients more effective treatments. Deep Learning has a lot of potential in Healthcare. But why aren't these techniques adopted in hospitals yet? What are the gaps between academic research and production-level code in Deep Learning and Healthcare? How can we mitigate this production-level gap, and what are some of the tools and techniques we can deploy?
Talk outline:
  • History of Deep Learning
  • Why Deep Learning for Healthcare?
  • Practical Limitations
  • Research vs Production
  • Data Augmentation
  • Data Synthesis
  • Pretraining
  • Deep Learning Engineering
  • ML Lifecycle
Watch Talk