Juan – SQL, Python, AWS, experts in Lemon.io

Juan

From Mexico (UTC-6)

Data Engineer | Senior

Juan is a Senior Data Engineer and Architect with strong hands-on expertise in SQL, Spark, Airflow, and multi-cloud ecosystems (AWS, GCP, Azure). He demonstrates solid knowledge of large-scale data processing, ETL design, and workflow orchestration, with clear technical reasoning. Juan brings 20+ years of experience building scalable, secure data platforms and integrating AI solutions, and combines deep engineering expertise with strategic insight into data architecture. He is also currently pursuing postgraduate studies in Artificial Intelligence and Machine Learning at The University of Texas at Austin.

10 years of commercial experience in
Accounting
Administration
Analytics
Data analytics
Fintech
Govtech
Healthcare
Healthtech
Product management
Project management
Social impact
Data monetization
Hardware
Main technologies
SQL
14 years
Python
7 years
AWS
5 years
Microsoft Azure
7 years
GCP
1 year
Additional skills
Apache Spark
Snowflake
DBT
RAG
LLM
Databricks
C#
ETL
MySQL
NoSQL
MongoDB
Airflow
Direct hire
Possible

Experience Highlights

Tech Lead
Jun 2025 - Ongoing (4 months)
Project Overview

The company's mission is to empower Americans by providing access to factual and transparent data. By aggregating information from federal, state, and local government sources, we make comprehensive government data easily accessible via our online platforms.

Responsibilities:
  • Designed and optimized Databricks Lakehouse pipelines unifying 1,000+ federal, state, and local datasets, improving ETL performance by 45% and reducing compute costs by 30%;
  • Implemented Delta Lake and Unity Catalog for reproducible, auditable data powering public dashboards on Builder.com and Flourish;
  • Built API integrations and visualization feeds enabling near-real-time civic data access for millions of users.
Project Tech stack:
Databricks
Python
JavaScript
SQL
PySpark
Amazon S3
Senior Data Architect
Jan 2024 - Jun 2025 (1 year 4 months)
Project Overview

A medical data lakehouse that imports data from several SQL Server and MySQL sources into Snowflake for patient and clinical data analytics. It handles data from over 30 cardiovascular practices across America that care for 1.1 million patients.

Responsibilities:
  • Engineered a Snowflake Data Lakehouse integrating multi-source data from SQL Server and MySQL systems across 30+ cardiovascular practices, consolidating 1.1M+ patient records for clinical and operational analytics;
  • Designed and optimized ELT pipelines for patient, procedure, and EHR data, improving processing efficiency by 40% and enabling daily refreshes of key clinical KPIs;
  • Implemented data quality, lineage, and governance frameworks, ensuring HIPAA compliance and consistent metrics across sites;
  • Partnered with clinical and analytics teams to deliver interactive dashboards supporting physician performance tracking, patient outcomes, and RVU-based financial reporting.
Project Tech stack:
Snowflake
Microsoft SQL Server
Python
SQL
Transact-SQL (T-SQL)
SQL Server
DBT
MySQL
Airflow
PowerBI
Fivetran
Project Technical Manager
Jan 2024 - Oct 2024 (9 months)
Project Overview

A hardware lifecycle management platform designed to support OEM operations and device division projects.

Responsibilities:
  • Managed end-to-end delivery of a hardware lifecycle management platform, coordinating cross-functional teams across engineering, UX, and operations to streamline OEM device tracking and lifecycle visibility;
  • Defined and governed Master Data Management (MDM) and UX requirements, standardizing device metadata, improving data quality, and unifying the user experience across multiple product lines;
  • Established data governance frameworks ensuring secure, traceable, and ethical use of training and inference data across AI-enabled systems;
  • Partnered with UX and engineering teams to refine AI-driven user flows, aligning interface design with model capabilities and business objectives;
  • Led Agile project planning, stakeholder engagement, and sprint delivery, ensuring roadmap alignment and seamless integration with Microsoft’s global supply chain systems;
  • Improved platform usability and data consistency, reducing manual reconciliation by ~35% and enhancing reporting accuracy across global operations.
Project Tech stack:
.NET
React
Microsoft SQL Server
Azure DevOps
SQL Server
PowerBI
Agile
Jira
Confluence
REST API
Tech Lead
Nov 2022 - Feb 2023 (3 months)
Project Overview

Migration of external file processing from Scala to PySpark on Databricks to modernize Mexico’s tax data infrastructure.

Responsibilities:
  • Migrated legacy Scala-based ETL pipelines to PySpark within Databricks, modernizing SAT’s large-scale tax data processing framework and improving maintainability and performance;
  • Optimized data ingestion and transformation workflows for high-volume fiscal datasets, reducing processing time by 40% and enabling more efficient reconciliation of taxpayer and fiscal records;
  • Implemented Delta Lake architecture and parameterized notebooks for scalable, auditable, and reusable data pipelines across multiple tax data domains;
  • Collaborated with internal data governance teams to ensure data lineage, compliance, and auditability within Mexico’s national tax data ecosystem.
Project Tech stack:
Apache Spark
PySpark
Scala
Python
Databricks
SQL
Git

Education

2024
Technology and Systems
Bachelor's
2026
Artificial Intelligence and Machine Learning
Postgraduate

Languages

Spanish
Advanced
English
Advanced

Copyright © 2025 lemon.io. All rights reserved.