Rishabh – SQL, Airflow, PySpark
Rishabh is a mid-to-senior data engineer with solid production experience in PySpark, Airflow, Big Data, AWS/GCP, SQL, and Python, building and maintaining large-scale pipelines in product environments such as Meta and Wayfair. He has a good grasp of modern data engineering practices, though part of his background is based on internal Meta tooling, so he may need a short ramp-up on the more standard, off-the-shelf solutions and frameworks your team uses. On the soft side, he comes across as friendly, collaborative, and calm under light pressure. He is motivated by startup environments, open to owning specific domains, and comfortable with EU time zone overlap.
10 years of commercial experience in
Main technologies
Additional skills
Direct hire
Possible
Experience Highlights
Data Engineer
A centralized, cross‑functional usability analytics framework that combines data pipelines and an executive dashboard into a single source of truth on how advertisers use key workflows across multiple ad tools. It unifies fragmented UI interaction logs into a high‑performance data lake, computes granular KPIs (latency, funnel completion, adoption trends) with rich country and vertical segmentation, and surfaces an Advertiser Usability Score to highlight friction points across 100+ critical workflows and guide product and UX roadmaps.
Data Engineer
A foundational benchmarking data layer that powers competitive insights in an ads management platform, providing advertisers with a consistent yardstick to compare their performance against industry peers. It delivers high‑granularity ad‑level datasets and revamped aggregate layers that feed benchmark models and UI surfaces, enabling advertisers and ML teams to track CTR, CPC, and CPM against their vertical, train real‑time benchmarking models, and rely on more complete, validated, and reliable coverage.
Responsibilities:
- Engineered the core training dataset;
- Designed a rigorous data validation and parity framework;
- Streamlined the backend architecture;
- Managed upstream data source integration.
Accomplishments:
- Enhanced data coverage by 30%;
- Reduced data maintenance overhead and storage costs by 60%;
- Improved pipeline reliability and data freshness;
- Achieved high-accuracy benchmarking through data validation;
- Enabled advanced AI capabilities.
Data Engineer
A centralized ads reporting data engine that acts as the foundational source of truth behind the Ads Reporting surface, transforming raw interaction logs into a reliable, queryable layer for performance and spend analytics. It processes billions of user events (clicks, scrolls, views) into structured facts and dimensions used by internal product and data teams to monitor feature usage, close reporting gaps, and drive data‑informed UX and roadmap decisions.
Responsibilities:
- Architected a foundational data engine with facts and dimensions;
- Defined and implemented a standardized event/sub-event framework;
- Integrated multi-dimensional advertiser attributes;
- Designed a high-complexity data layer for analytics;
- Built a dashboard for stakeholders.
Accomplishments:
- Eliminated data silos;
- Enhanced platform health monitoring;
- Enabled data-driven UI/UX roadmaps.
Data Engineer
A mission‑critical automated replenishment and optimization tool that serves as the central decision layer for physical retail stores, integrating supply‑chain, warehouse, and storefront data to automate inventory planning. It replaces manual workflows with a single analytical engine that accounts for aisle geometry and capacity, then provides prescriptive, store‑level recommendations so planning teams and store managers can reduce stockouts, overstocks, and operational costs.
Responsibilities:
- Architected an end-to-end Replenishment Data Pipeline;
- Developed automated ETL scripts;
- Integrated spatial metadata;
- Built a comprehensive Looker analytics suite.
Accomplishments:
- Automated 100% of the replenishment workflow, migrating the Planning and Allocation team from manual Excel-based tracking to a real-time, automated dashboard;
- Optimized store inventory levels;
- Improved data accessibility and reliability;
- Drove strategic distribution decisions.