Rishabh – SQL, Airflow, PySpark, experts in Lemon.io

Rishabh

From United States (UTC-7)

Data Engineer | Middle-to-senior


Rishabh is a middle-to-senior data engineer with solid production experience in PySpark, Airflow, Big Data, AWS/GCP, SQL, and Python, building and maintaining large-scale pipelines in product environments such as Meta and Wayfair. He has a good grasp of modern data engineering practices, though part of his background is based on internal Meta tooling, so he may need a short ramp-up on the more standard, off-the-shelf solutions and frameworks your team uses. On the soft-skills side, he comes across as friendly, collaborative, and calm under light pressure. He is motivated by startup environments, open to owning specific domains, and comfortable with EU time zone overlap.

10 years of commercial experience in
Adtech
Advertising
AI
Data analytics
Retail
Supply chain
Main technologies
SQL: 8 years
Airflow: 5 years
PySpark: 3 years
Python: 8 years
Big Data: 5 years
Additional skills
BigQuery
Apache Spark
Snowflake
Data Warehouse
Data visualization
Data Modeling
User-centered design
Data analysis
CI/CD
Core Data
PostgreSQL
Database design
AWS
Pandas
Direct hire: Possible

Experience Highlights

Data Engineer
Apr 2025 - Ongoing (10 months)
Project Overview

A centralized, cross‑functional usability analytics framework that combines data pipelines and an executive dashboard into a single source of truth on how advertisers use key workflows across multiple ad tools. It unifies fragmented UI interaction logs into a high‑performance data lake, computes granular KPIs (latency, funnel completion, adoption trends) with rich country and vertical segmentation, and surfaces an Advertiser Usability Score to highlight friction points across 100+ critical workflows and guide product and UX roadmaps.

Responsibilities:
  • Enabled product teams to define and prioritize data‑backed product roadmaps.
  • Identified and reduced key sources of advertiser friction across critical workflows.
  • Supported revenue growth by informing UX improvements that increased advertiser efficiency and satisfaction.
Project Tech stack:
Python
SQL
Apache Spark
Data Modeling
Data visualization
Data Warehouse
Data Engineer
Jun 2025 - Nov 2025 (5 months)
Project Overview

A foundational benchmarking data layer that powers competitive insights in an ads management platform, providing advertisers with a consistent yardstick to compare their performance against industry peers. It delivers high‑granularity ad‑level datasets and revamped aggregate layers that feed benchmark models and UI surfaces, enabling advertisers and ML teams to track CTR, CPC, and CPM against their vertical, train real‑time benchmarking models, and rely on more complete, validated, and reliable coverage.

Responsibilities:

  • Engineered the core training dataset;
  • Designed a rigorous data validation and parity framework;
  • Streamlined the backend architecture;
  • Managed upstream data source integration.

Accomplishments:

  • Enhanced data coverage by 30%;
  • Reduced data maintenance overhead and storage costs by 60%;
  • Improved pipeline reliability and data freshness;
  • Achieved high-accuracy benchmarking through data validation;
  • Enabled advanced AI capabilities.
Project Tech stack:
SQL
Python
Data Engineer
Jan 2025 - Nov 2025 (9 months)
Project Overview

A centralized ads reporting data engine that acts as the foundational source of truth behind the Ads Reporting surface, transforming raw interaction logs into a reliable, queryable layer for performance and spend analytics. It processes billions of user events (clicks, scrolls, views) into structured facts and dimensions used by internal product and data teams to monitor feature usage, close reporting gaps, and drive data‑informed UX and roadmap decisions.

Responsibilities:

  • Architected a foundational data engine with facts and dimensions;
  • Defined and implemented a standardized event/sub-event framework;
  • Integrated multi-dimensional advertiser attributes;
  • Designed a high-complexity data layer for analytics;
  • Built a dashboard for stakeholders.

Accomplishments:

  • Eliminated data silos;
  • Enhanced platform health monitoring;
  • Enabled data-driven UI/UX roadmaps.
Project Tech stack:
SQL
Python
User-centered design
Data Engineer
Feb 2022 - Nov 2022 (9 months)
Project Overview

A mission‑critical automated replenishment and optimization tool that serves as the central decision layer for physical retail stores, integrating supply‑chain, warehouse, and storefront data to automate inventory planning. It replaces manual workflows with a single analytical engine that accounts for aisle geometry and capacity, then provides prescriptive, store‑level recommendations so planning teams and store managers can reduce stockouts, overstocks, and operational costs.

Responsibilities:

  • Architected an end-to-end Replenishment Data Pipeline;
  • Developed automated ETL scripts;
  • Integrated spatial metadata;
  • Built a comprehensive Looker analytics suite.

Accomplishments:

  • Automated 100% of the replenishment workflow, migrating the Planning and Allocation team from manual Excel-based tracking to a real-time, automated dashboard;
  • Optimized store inventory levels;
  • Improved data accessibility and reliability;
  • Drove strategic distribution decisions.
Project Tech stack:
Python
SQL
Data Modeling
Data analysis
Data visualization

Education

2017
Information Systems
Master's Degree

Languages

English: Advanced

Copyright © 2026 lemon.io. All rights reserved.