
Varun
From United States
Varun – AI, LLM, AWS
Varun is a versatile Senior AI/ML and Backend Engineer with extensive experience delivering production-grade AI systems at scale. He brings strong expertise in LLMs, transformers, and retrieval-augmented generation, alongside a solid foundation in classical ML and backend engineering with Python. With hands-on experience designing scalable architectures, fine-tuning models, and deploying enterprise-grade AI solutions for high-volume use cases, Varun excels at bridging research concepts and practical business applications. He is well-suited to founding engineer roles or senior AI/ML positions at startups and scale-ups.
11 years of commercial experience
Main technologies
Additional skills
Direct hire: Possible

Experience Highlights
Sr. Data Scientist / AI/ML Engineer
One of the world’s largest financial services companies, serving millions of customers across payments, lending, and banking products. The project was an Agentic AI platform designed to modernize dispute resolution by applying large language models and retrieval-augmented generation to financial document analysis and decision support. The product enabled compliant and more accurate handling of disputes by combining GPT-4 with enterprise retrieval systems, ensuring low-latency decisioning across millions of financial transactions.
- Led an Agentic AI initiative with Azure OpenAI GPT-4, LangChain, and RAG to modernize dispute resolution and improve customer experience.
- Designed and deployed FastAPI microservices on AKS with Cognitive Search and PGVector, enabling low-latency, compliant financial decisioning (a minimal sketch of the retrieval flow follows this list).
- Fine-tuned GPT-4 with LoRA on Azure ML using financial corpora, strengthening classification accuracy and reducing escalations.
- Built preprocessing pipelines with Dask on Databricks to enrich embeddings, improve recall, and reduce token usage in RAG workflows.
- Managed end-to-end MLOps pipelines with Azure ML and DevOps, integrating blue-green rollouts, rollback gates, and SOC2-aligned governance in GitHub.
- Secured production deployments with Azure Key Vault and Managed Identity, ensuring compliant secrets management and zero audit exceptions.
- Applied LangSmith, OpenAI Evals, and Application Insights to monitor LangChain pipelines, enforce Responsible AI, and catch regressions early.
- Built real-time streaming ingestion with Event Hubs and Synapse Spark, powering fraud alerts and improving grounding quality for RAG systems.
- Documented prompt-engineering playbooks in Confluence to optimize templates and reduce token costs while safeguarding answer quality.
- Delivered executive Power BI dashboards linked with LangChain traces, giving leadership visibility into KPIs, costs, and model performance.
- Led Agile delivery in Jira across cross-functional teams, delivering production increments consistently with strong acceptance rates.
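To give a flavor of the retrieval flow behind the first two bullets, here is a minimal sketch, assuming an Azure OpenAI deployment and a PGVector table named `dispute_docs`; the table, deployment names, and connection string are hypothetical, and the production system would add Managed Identity auth, retries, and tracing on top.

```python
# Minimal RAG sketch: embed the query, pull the nearest dispute documents
# from a pgvector table, then ask GPT-4 to decide with that context.
# Table and deployment names are illustrative, not the production ones.
import psycopg2
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<key>",  # in production this would come from Key Vault
    api_version="2024-02-01",
)

def retrieve(query: str, k: int = 5) -> list[str]:
    """Return the k dispute documents closest to the query embedding."""
    emb = client.embeddings.create(
        model="text-embedding-ada-002",  # embedding deployment name
        input=query,
    ).data[0].embedding
    with psycopg2.connect("dbname=disputes") as conn, conn.cursor() as cur:
        cur.execute(
            "SELECT content FROM dispute_docs "
            "ORDER BY embedding <-> %s::vector LIMIT %s",
            (str(emb), k),
        )
        return [row[0] for row in cur.fetchall()]

def decide(dispute_text: str) -> str:
    """Ground a GPT-4 decision in the retrieved dispute context."""
    context = "\n---\n".join(retrieve(dispute_text))
    resp = client.chat.completions.create(
        model="gpt-4",  # Azure deployment name
        messages=[
            {"role": "system",
             "content": "Decide the dispute using ONLY the context below.\n"
                        f"Context:\n{context}"},
            {"role": "user", "content": dispute_text},
        ],
        temperature=0,  # deterministic output for auditable decisions
    )
    return resp.choices[0].message.content
```

Temperature 0 and strict grounding in retrieved context keep decisions reproducible, which matters when every outcome may be reviewed for compliance.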
Data Scientist / AI/ML Engineer
A leading U.S. health insurance and healthcare services provider. The project was a claims automation and healthcare AI platform designed to help insurers and clinicians streamline claims processing, detect fraud, and improve decision support. The product enabled end-to-end digitization and analysis of claims data, integrating OCR, NLP, and ML pipelines to convert handwritten and structured claims into actionable insights. It supported faster reimbursement cycles, clinical summarization, and compliance-ready data pipelines for sensitive health records.
- Built and fine-tuned BERT, BioBERT, and ClinicalBERT models on AWS SageMaker to extract ICD-10 codes and triage healthcare claims.
- Integrated OpenAI GPT-3 APIs and early RAG pipelines with Amazon Kendra and FAISS to power clinical summarization and policy search assistants.
- Designed OCR and PHI redaction workflows using AWS Textract, Python, and Comprehend Medical to digitize handwritten claims and ensure secure data handling.
- Developed data preprocessing pipelines in Pandas, NumPy, and AWS Glue to cleanse messy claims and improve downstream feature quality.
- Built asynchronous batch processing pipelines with Python workers and AWS SQS to handle high-volume OCR claims efficiently (sketched after this list).
- Deployed low-latency adjudication services via Flask, AWS API Gateway, and SQS to streamline enterprise-scale claim processing.
- Delivered Tableau dashboards powered by Amazon Redshift, providing leadership visibility into SLA compliance, fraud alerts, and triage KPIs.
- Managed MLOps lifecycle with SageMaker Pipelines, MLflow, and CodePipeline, ensuring reproducible, compliant, and production-ready AI workflows.
- Implemented observability and explainability with SHAP, CloudWatch, Prometheus, and Grafana to maintain trust, transparency, and resilience in AI-driven decisions.
- Documented audit artifacts in GitHub and facilitated staged rollouts with rollback safeguards, supporting compliance and operational reliability.
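A minimal sketch of the asynchronous OCR pattern from the SQS bullet above, assuming a queue URL, S3 bucket, and message shape that are purely illustrative; PHI redaction and NLP triage would happen downstream of the OCR step.

```python
# Long-poll SQS for claim-document messages, OCR each scanned page
# (PNG/JPEG) with Textract, and delete the message only after success.
# Queue URL, bucket, and message shape are illustrative assumptions.
import json
import boto3

sqs = boto3.client("sqs")
s3 = boto3.client("s3")
textract = boto3.client("textract")

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/claims-ocr"  # hypothetical

def process_queue() -> None:
    while True:
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,  # long polling cuts empty receives
        )
        for msg in resp.get("Messages", []):
            body = json.loads(msg["Body"])  # e.g. {"bucket": ..., "key": ...}
            doc = s3.get_object(Bucket=body["bucket"], Key=body["key"])
            ocr = textract.detect_document_text(
                Document={"Bytes": doc["Body"].read()}
            )
            lines = [b["Text"] for b in ocr["Blocks"]
                     if b["BlockType"] == "LINE"]
            print("\n".join(lines))  # downstream: PHI redaction, triage
            # Delete only after success so failures are retried via the queue
            sqs.delete_message(
                QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"]
            )
```

Deleting the message only after successful processing gives at-least-once semantics: if a worker crashes mid-claim, the message simply reappears after the visibility timeout.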
Data Scientist
A large private hospital network serving millions of patients across specialties. The project was a healthcare data science and analytics platform designed to help doctors, insurers, and administrators predict patient risks, optimize resources, and detect fraud in claims. The product enabled data-driven healthcare decisions by integrating EHR records, billing data, and lab results into unified pipelines and applying predictive modeling, forecasting, and fraud detection.
- Developed predictive models in Python, R, and Scikit-learn for readmission risk, ICU demand forecasting, and chronic disease modeling.
- Designed fraud detection pipelines using XGBoost, PostgreSQL, and AWS Athena to identify anomalies in claims data.
- Automated EHR data preprocessing with FHIR/HL7 standards, Pandas, and AWS Glue to streamline feature engineering and modeling workflows.
- Applied unsupervised learning techniques such as K-Means and PCA for patient segmentation based on comorbidity and medication adherence patterns (see the sketch after this list).
- Built interactive Power BI and Tableau dashboards for hospital administrators and insurers to track utilization, costs, and fraud alerts.
- Used SHAP and statistical validation methods to improve model explainability and ensure confidence in decision-making.
- Implemented batch scoring scripts on AWS EC2 with Cron and Glue to operationalize regular model inference on healthcare data.
- Maintained reproducible workflows using Git, Jupyter, and Markdown while collaborating with teams through Jira, Slack, and Confluence.
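A compact sketch of the segmentation approach from the K-Means/PCA bullet, with hypothetical feature names (comorbidity_count, adherence_rate, and so on) standing in for the real EHR-derived features.

```python
# Patient segmentation sketch: scale features, cluster with K-Means,
# and project to 2-D with PCA for inspection. Feature names are illustrative.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Hypothetical per-patient features (comorbidity count, adherence rate, ...)
df = pd.read_csv("patients.csv")
cols = ["comorbidity_count", "adherence_rate", "age", "num_prescriptions"]

X = StandardScaler().fit_transform(df[cols])  # K-Means is scale-sensitive

kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)
df["segment"] = kmeans.fit_predict(X)

# 2-D projection for visual sanity checks of the clusters
df[["pc1", "pc2"]] = PCA(n_components=2).fit_transform(X)

print(df.groupby("segment")[cols].mean())  # per-segment feature profiles
```

Standardizing before clustering keeps any single wide-range feature (for example, age) from dominating the distance metric.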
Data Scientist
One of India’s largest supermarket and hypermarket chains serving millions of daily shoppers. The project focused on developing early-stage data science and analytics solutions to optimize customer segmentation, inventory forecasting, and marketing effectiveness. The product was a set of analytics and predictive modeling solutions built on retail transaction, POS, and inventory data.
- Built data cleaning and transformation pipelines with Python, R, SQL, and Excel, standardizing POS and inventory data for reliable downstream analytics.
- Developed early-stage regression, classification, and ARIMA forecasting models in scikit-learn and R, improving demand forecasting and customer segmentation accuracy (a minimal ARIMA example follows this list).
- Performed feature extraction and correlation analysis using Pandas, Excel, and RStudio, identifying key drivers for marketing and inventory decisions.
- Leveraged Amazon S3 and EC2 with Hive queries for batch training and reporting, supporting large-scale data storage and compute needs.
- Designed dashboards and reports in Tableau, Excel, Matplotlib, and Seaborn, enabling stakeholders to track promotions, stock turnover, and customer behavior.
- Conducted EDA, statistical tests, and dataset validation with Pandas, R, and SQL, ensuring quality inputs and trustworthy results.
- Maintained code reproducibility and documentation using Jupyter, RMarkdown, Word, and PowerPoint, supporting collaboration and audit readiness.
- Supported dashboard enhancements and deployment reviews, contributing to production scoring tools and peer-reviewed Python workflows.
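A minimal sketch of the ARIMA demand forecast from the bullet above; the file, column names, and (p, d, q) order are illustrative assumptions rather than the production configuration.

```python
# Minimal ARIMA demand-forecast sketch with statsmodels.
# File, column names, and the (p, d, q) order are illustrative assumptions.
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical weekly unit sales for one SKU, indexed by week
sales = (
    pd.read_csv("weekly_sales.csv", parse_dates=["week"])
    .set_index("week")["units_sold"]
    .asfreq("W")  # enforce a regular weekly frequency
)

# Order chosen here for illustration; in practice select via AIC/ACF/PACF
model = ARIMA(sales, order=(1, 1, 1)).fit()

forecast = model.forecast(steps=8)  # next 8 weeks of expected demand
print(forecast)
```

Differencing once (d=1) handles the trend typical of retail demand series; seasonal promotions would call for a seasonal extension such as SARIMA.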