👋 Hey, I'm Dweep Pandya
Senior AI Engineer

Building enterprise-scale RAG systems, MCP endpoints, and high-fidelity AI retrieval pipelines at Thoughtworks. 6+ years turning complex AI challenges into production-grade solutions.

🏢 Thoughtworks 🤖 RAG Architecture ⚡ MCP ☁️ Azure Databricks 🔗 LangChain

Experience

Thoughtworks · Senior AI Engineer · Sept 2025 – Present

📍 Pune, Maharashtra

  • Led the end-to-end architecture of an enterprise-scale RAG system on Azure Databricks, designed for scalability, security, and low latency.
  • Orchestrated complex data pipelines for unstructured data processing, implementing sophisticated chunking and vectorization strategies.
  • Engineered context-aware AI capabilities exposed as Model Context Protocol (MCP) endpoints for seamless agentic interoperability.
  • Designed hybrid search pipelines combining dense vector and sparse retrieval for high-fidelity document retrieval.
  • Implemented dynamic context assembly and advanced prompt optimization techniques to maximise LLM output quality.

Ecolibrium · Lead Data Scientist · July 2022 – Sept 2025

📍 Pune, Maharashtra

  • Led a team of 4 Data Scientists, driving technical mentoring, sprint planning, and cross-functional collaboration.
  • Delivered over USD 1M in ROI for clients using AI-driven Operational Excellence and Sustainability solutions.
  • Spearheaded the company's transition to Generative AI: built RAG-based agentic architectures using LangChain, vector databases, and fine-tuned open-source LLMs.
  • Architected scalable production ML infrastructure using Terraform (IaC), Kubernetes (Kubeflow), and MLflow.
  • Built predictive maintenance, energy forecasting, and anomaly detection models at scale on AWS (EC2, ECS, S3, EKS, Lambda, Glue).

NICE Actimize · Data Scientist · June 2019 – June 2022

📍 Pune, Maharashtra

  • Developed and deployed supervised/unsupervised ML models for Fraud Detection and Anti-Money Laundering (AML) on AWS SageMaker.
  • Reduced manual analytical effort from 6 weeks to 2 hours through ML-based process optimization.
  • Managed major client accounts directly, including Goldman Sachs.
  • Worked extensively with large datasets (100M+ rows) using Athena, S3, and Python-based pipelines.

Skills

🤖 AI & GenAI

RAG Architectures · Hybrid Search · Embedding Models · Prompt Optimization · MCP · LangChain · Vector Databases · LLM Fine-tuning

🧠 Machine Learning

PyTorch · TensorFlow · Scikit-Learn · XGBoost · LSTM / Transformers · Anomaly Detection · Time-Series

☁️ Cloud & Infrastructure

Azure Databricks · AWS SageMaker · EC2 / ECS / EKS · S3 / Athena / Glue · Kubernetes · Terraform · Kubeflow · MLflow

🛠️ Languages & Data

Python · SQL · Pandas · PySpark · Tableau · Git · Docker

Publications

Named Entity Recognition: A Survey for Indian Languages

IEEE — 2020

Hindi Named Entities Recognition (NER) using Natural Language Processing and Machine Learning

IJAECS — International Journal of Advanced Engineering, Management and Science

Education

🎓 Vishwakarma Institute of Information Technology (VIIT)

Bachelor of Engineering — Computer Engineering

2015 – 2019 · Pune, Maharashtra

SGPA: 9.32 / 10

🏆 College Topper — 9.86 SGPA (First Year)

Get in Touch

Whether you want to talk AI, collaborate on something interesting, or just say hi — my inbox is always open.

☁️ AWS Certified Cloud Practitioner

Amazon Web Services

🌍 Languages

English · Hindi · Gujarati

🎮 High-Performance Gamer

Counter-Strike enthusiast