Shubhankar – Data Science, Python, MLOps, experts in Lemon.io

Shubhankar

From United States (UTC-5)

Data Scientist | Senior

MLOps Engineer | Senior

Machine Learning Engineer | Senior

Lemon.io stats

1 offer now 🔥

Skills and seniority verified on Apr 29, 2026

Shubhankar – Data Science, Python, MLOps

Shubhankar is a senior Data Scientist, MLOps Engineer, and Machine Learning Engineer with strong experience in large-scale ML systems, remote sensing, and climate data pipelines. He has led teams and architected distributed solutions for terabyte-scale scientific datasets, with a practical focus on production readiness. His strengths lie in scalable data processing, MLOps practices, and domain-driven feature engineering, with solid exposure to real-world scientific and environmental use cases.

11 years of commercial experience in

AI

Analytics

Climate tech

Data analytics

Scientific research

Main technologies

Data Science

5 years

Python

5 years

MLOps

4 years

Machine learning

5 years

NumPy

7 years

Additional skills

Claude LLM

Pandas

Snowflake

Keras

Tensorflow

LLM

Tableau

SQL

Scikit-learn

PyTorch

Terraform

MLflow

Direct hire

Possible

Experience Highlights

Lead ML & MLOps Architect

Dec 2025 - May 2026 (5 months)

Project Overview

An internal full-stack simulation platform for long-horizon forest ecosystem modeling under real-world climate change scenarios. It supports 100-year iLand forest projections, experiment tracking, and artifact management for internal research teams working across insurance, finance, and forest management.

Responsibilities:

Architected and built an end-to-end experiment management platform for iLand forest simulations from scratch, including a custom launcher UI, Azure Blob Storage artifact pipeline, and MLflow 3.7 tracking integration;
Designed simulation orchestration for 100-year forest projections using ICHEC-EC-EARTH RCP8.5 climate scenarios with disturbance modeling enabled;
Instrumented and visualized 11 forest ecosystem metrics, including carbon sequestration, tree volume, basal area, height, and NPP, across factorial and batch experiment runs;
Delivered a reproducible ML experimentation system with run comparison, metric visualization, and artifact versioning for internal research teams.
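
As a rough illustration of what a factorial experiment run can look like, the sketch below enumerates every combination of a few experiment factors and gives each run a stable name. The factor names, their levels, and the helper functions are hypothetical and not taken from the project; only the standard library is used, and no MLflow or iLand calls are shown.

```python
from itertools import product

# Hypothetical experiment factors; the names and levels are illustrative only.
FACTORS = {
    "climate_scenario": ["RCP8.5"],
    "disturbance": [True, False],
    "management": ["baseline", "thinning"],
}


def factorial_runs(factors):
    """Return one parameter dict per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


def run_name(params):
    """Build a stable, sortable run name from the parameters."""
    return "_".join(f"{key}={params[key]}" for key in sorted(params))


runs = factorial_runs(FACTORS)
for params in runs:
    print(run_name(params))
```

In a tracking setup, each parameter dict would typically become one tracked run, so that runs can later be compared factor by factor.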

Project Tech stack:

AI system design

AI API integration

AI deployment

Data analysis

Python

Microsoft Azure

Docker

MLflow

Lead ML & MLOps Architect

Dec 2025 - May 2026 (5 months)

Project Overview

A large-scale data orchestration platform for ingesting, processing, and delivering geospatial and environmental datasets. It pulls data from satellite imagery providers, climate model outputs, and government land registries, then transforms raw inputs into analysis-ready formats for GIS and analytics consumers across Europe.

Responsibilities:

Architected a production Airflow environment managing 45+ scheduled pipelines with full health monitoring and 128 concurrent task slots;
Built automated ingestion pipelines pulling from satellite providers, climate agencies, and government land registries, including European and French national sources;
Developed format conversion pipelines for Cloud-Optimized GeoTIFF and PMTiles to optimize large raster datasets for web and tile-based delivery;
Integrated Google Earth Engine and cloud blob storage as data sources across multiple pipeline families;
Implemented tagging, scheduling, and dependency strategies to coordinate 45+ DAGs with varying cadences;
Monitored pipeline health and resolved failures to maintain high success rates across all scheduled runs.
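
The idea behind coordinating dependent pipelines can be sketched without Airflow itself. The example below is a minimal sketch using Python's standard-library graphlib: it groups a handful of pipelines into "waves" that could run in parallel once their upstream pipelines finish. The pipeline names and the dependency map are hypothetical, and Airflow resolves dependencies with its own scheduler rather than with this code.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline names; each key depends on the pipelines in its set.
DEPENDENCIES = {
    "ingest_satellite": set(),
    "ingest_land_registry": set(),
    "convert_to_cog": {"ingest_satellite"},
    "build_pmtiles": {"convert_to_cog", "ingest_land_registry"},
}


def execution_waves(dependencies):
    """Group pipelines into waves; everything in a wave can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for deterministic output
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(DEPENDENCIES)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```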

Project Tech stack:

Apache Airflow

Python

GoogleAPI

Azure DevOps

Docker

PostgreSQL

AI Integration Engineer

Mar 2026 - Apr 2026 (1 month)

Project Overview

A prototype demonstrating how to use ComfyUI as a backend for customer-facing image generation applications. The system routes user requests through ComfyUI's workflow engine, which orchestrates a custom node that dispatches inference jobs to FAL.ai's cloud GPU infrastructure. A Gradio-based frontend submits jobs, tracks queue status in real time, and returns generated images to end users.

Responsibilities:

Designed and implemented a custom ComfyUI node (FalTextToImage) that integrates FAL.ai cloud GPU inference as a drop-in backend;
Built a customer-facing Gradio UI that routes image generation requests through ComfyUI's workflow queue;
Implemented real-time job status tracking with progress updates between the Gradio frontend and ComfyUI API;
Configured ComfyUI workflow pipelines for multiple image generation models (Lightning SDXL, Fast SDXL, AuraFlow, SD3 Medium);
Designed the system architecture to decouple workflow orchestration (ComfyUI) from model inference (FAL.ai cloud GPUs).
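
To make the decoupling concrete, here is a minimal sketch of how a custom ComfyUI node can delegate generation to a remote service. It follows ComfyUI's usual node conventions (INPUT_TYPES, RETURN_TYPES, FUNCTION, CATEGORY, NODE_CLASS_MAPPINGS), but it is not the project's actual implementation: the backend hook, the model identifiers, and the category name are assumptions, and the real FAL.ai call is deliberately left out.

```python
from typing import Any, Callable, Dict, Tuple


def _unconfigured_backend(model: str, prompt: str) -> Any:
    """Placeholder for the remote inference call (hypothetical)."""
    raise RuntimeError("No inference backend configured")


class FalTextToImage:
    """Sketch of a ComfyUI node that delegates generation to a remote backend."""

    # Hypothetical hook: replace with a function that calls the cloud service
    # and returns an image in the format ComfyUI expects.
    backend: Callable[[str, str], Any] = staticmethod(_unconfigured_backend)

    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "generate"
    CATEGORY = "examples/remote"  # assumed category name

    @classmethod
    def INPUT_TYPES(cls) -> Dict[str, Dict[str, Tuple]]:
        return {
            "required": {
                "prompt": ("STRING", {"multiline": True, "default": ""}),
                # Assumed model identifiers, shown as a dropdown in the UI.
                "model": (["fast-sdxl", "sd3-medium"],),
            }
        }

    def generate(self, prompt: str, model: str) -> Tuple[Any]:
        # ComfyUI expects a tuple matching RETURN_TYPES.
        return (self.backend(model, prompt),)


# ComfyUI discovers custom nodes through these module-level mappings.
NODE_CLASS_MAPPINGS = {"FalTextToImage": FalTextToImage}
NODE_DISPLAY_NAME_MAPPINGS = {"FalTextToImage": "FAL Text to Image (sketch)"}
```

Because the node only knows about the backend hook, the workflow graph stays unchanged if the inference provider is swapped.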

Project Tech stack:

Python

REST API

PyTorch

ComfyUI

CTO

Dec 2024 - Dec 2025 (1 year)

Project Overview

An AI meteorology platform for automated climate and solar forecasting. It combines LLM agents, MCP tools, and an interactive global weather map to support time-series forecasts, solar analysis, and climate data exploration.

Responsibilities:

Led a 5-person team in architecting and shipping an AI meteorologist product with a natural language chat interface, interactive global solar radiation map, and time-series forecast playback powered by Claude LLM agents and MCP tools;
Built a distributed climate data pipeline using Dask, Ray, and GCP, processing 40TB of ECMWF global climate data and reducing processing time from months to days;
Achieved 10-15% RMSE improvement in temperature forecasts and 5% RMSE improvement in solar power forecasts through large-scale bias correction and ensemble modeling;
Integrated pvlib ModelChain with CEC models and 17,544 hours of historical weather reanalysis (2023-2024) for professional-grade solar energy analysis;
Secured Techstars 2025 ($120k), Stanford StartX, and Stanford TomKat Sustainability sponsorship to scale R&D.
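
For readers unfamiliar with the metric, the sketch below shows how an RMSE improvement from a simple mean-bias correction can be measured. The temperature values are made up for illustration, and the single-offset correction is an assumed, simplified stand-in for the large-scale bias correction and ensemble modeling mentioned above.

```python
import math


def rmse(predictions, observations):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predictions, observations)]
    return math.sqrt(sum(squared) / len(squared))


def mean_bias(predictions, observations):
    """Average signed error of the predictions."""
    return sum(p - o for p, o in zip(predictions, observations)) / len(predictions)


def relative_improvement(baseline_rmse, new_rmse):
    """Fractional RMSE reduction relative to the baseline."""
    return (baseline_rmse - new_rmse) / baseline_rmse


# Made-up temperatures in degrees Celsius, for illustration only.
observed = [10.0, 12.0, 15.0, 11.0]
raw_forecast = [11.5, 13.0, 17.0, 12.5]

bias = mean_bias(raw_forecast, observed)
corrected = [value - bias for value in raw_forecast]

before = rmse(raw_forecast, observed)
after = rmse(corrected, observed)
print(f"RMSE before: {before:.3f}, after: {after:.3f}")
print(f"Relative improvement: {relative_improvement(before, after):.1%}")
```

In practice the bias would be estimated on a historical period and evaluated on held-out data; this toy example uses the same four points for both, which overstates the gain.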

Project Tech stack:

PyTorch

Python

XGBoost

Claude API

Claude LLM

MCP

MLOps

LLM

GCP Compute Engine

Ray

Dask

CTO

Dec 2024 - Dec 2025 (1 year)

Project Overview

A high-performance distributed ML preprocessing pipeline for climate and weather data stored in Google Cloud Storage. It processes 160TB of data across a small distributed cluster and reduced end-to-end preprocessing time from 8 days to 1.3 days.

Responsibilities:

Reduced ML preprocessing time from 8 days to 1.3 days by architecting a distributed pipeline across 3 machines processing 160TB+ of GCS-hosted climate data;
Achieved a 35x GCS read speedup (70 min → 2 min) through aggressive gcsfuse tuning, including a 512GB file cache, 80 parallel connections per host, and 200 parallel downloads;
Designed distributed workload splitting across 680 variables (land and ocean features) with data locality optimization, writing approximately 17TB of preprocessed output per machine;
Built full observability into the pipeline with automated progress tracking, ETA estimation, and structured logging for 30+ hour production runs;
Developed a test-mode framework validating the full pipeline in 40 minutes before committing to multi-day production runs.
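
The figures above can be checked with a little arithmetic. The sketch below splits 680 variables evenly across 3 machines, derives the two speedups from the quoted timings, and shows a simple constant-rate ETA estimate. The variable names and helper functions are hypothetical, and the even split ignores the data-locality optimization described above.

```python
def split_evenly(items, workers):
    """Split items into contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), workers)
    chunks, start = [], 0
    for index in range(workers):
        size = base + (1 if index < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def eta_hours(done, total, elapsed_hours):
    """Estimate remaining hours assuming a constant processing rate."""
    if done == 0:
        return float("inf")
    return elapsed_hours * (total - done) / done


variables = [f"var_{i:03d}" for i in range(680)]  # placeholder names
chunks = split_evenly(variables, workers=3)
print([len(chunk) for chunk in chunks])  # [227, 227, 226]

print(f"GCS read speedup: {70 / 2:.0f}x")        # 70 min -> 2 min
print(f"End-to-end speedup: {8 / 1.3:.1f}x")     # 8 days -> 1.3 days
```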

Project Tech stack:

Python

GCP

Dask

Distributed Systems

Keep in mind, the experience summary might exclude non-relevant projects

Education

2018

Computer Science

Masters

Languages

English

Advanced

Copyright © 2026 lemon.io. All rights reserved.