
Luis

From Colombia (UTC-5)

Senior Data Engineer

Luis – Python, AWS, SQL

Luis is a Senior Data Engineer with over 14 years of dedicated experience in data engineering and 17 years in software overall. He brings strong expertise in Python, SQL, and AWS, with hands-on experience building and maintaining data pipelines and cloud-based solutions. Luis has delivered data warehouse migrations, improved data workflows, and ensured high data quality across multiple projects. He pairs solid engineering discipline with the pace of fast-moving, product-focused teams, making him a great fit for startups scaling their data operations. He has led development teams in the financial and healthcare sectors, knows how to set up robust pipelines under tight timelines, and can quickly translate complex requirements into reliable, scalable solutions.

18 years of commercial experience in
Accounting
AI
Banking
Fintech
Healthcare
Transportation
Subscription
Main technologies
Python: 8.5 years
AWS: 3.5 years
SQL: 16.5 years
Snowflake: 5 years
Redshift: 1 year
Additional skills
PySpark
Data visualization
Data Warehouse
Microsoft Power BI
PL/SQL
Oracle
Airflow
dbt
Vertex AI
GCP
Databricks
BigQuery
Tableau
RAG
Direct hire: Possible

Experience Highlights

Senior Data Engineer
Oct 2024 - Sep 2025 (10 months)
Project Overview

Build a governed, reusable transportation data foundation on Databricks using a medallion (bronze–silver–gold) architecture to ingest raw operational feeds, standardize and cleanse trip/shipment/asset data, and publish analytics-ready, business-aligned datasets for planning, tracking, and cost/performance insights across logistics.
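
A minimal PySpark sketch of the bronze-to-silver step this pattern implies is below; the topic, table, and column names (trips_raw, trip_id, event_ts) and the watermark window are illustrative assumptions, not the project's actual schema.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("trips_silver").getOrCreate()

# Bronze: raw Kafka events already landed as-is into a Delta table.
bronze = spark.readStream.table("bronze.trips_raw")

# Silver: enforce types, tolerate late arrivals, and drop replayed duplicates
# before publishing analytics-ready rows for downstream consumers.
silver = (
    bronze
    .withColumn("event_ts", F.to_timestamp("event_ts"))  # schema enforcement
    .withWatermark("event_ts", "2 hours")                # late-arrival handling
    .dropDuplicates(["trip_id", "event_ts"])             # deduplication
    .filter(F.col("trip_id").isNotNull())                # basic quality gate
)

(silver.writeStream
    .option("checkpointLocation", "/chk/trips_silver")
    .toTable("silver.trips"))
```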

Responsibilities:
  • Built and maintained Databricks (PySpark) workflows following a medallion (bronze–silver–gold) architecture to ingest and refine transportation data from multiple Kafka topics;
  • Standardized, cleansed, and consolidated trip, shipment, and asset events into analytics-ready datasets to support logistics planning, tracking, and performance reporting;
  • Implemented data-quality controls (schema enforcement, deduplication, late-arrival handling) to guarantee reliable, trustworthy transportation data for downstream consumers.
Project Tech stack:
Databricks
PySpark
SQL
Lead Developer
Jul 2024 - Oct 2024 (3 months)
Project Overview

A Retrieval-Augmented Generation (RAG) solution that ingests and interprets applicants' financial documents (payslips, bank statements, tax returns, collateral docs), cross-checks them against business rules and risk thresholds, and produces explainable approve/reject recommendations, reducing manual review time, improving consistency, and strengthening auditability of the loan origination process.
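
A hedged sketch of that flow using the vertexai Python SDK is shown below; the model names (text-embedding-004, gemini-1.5-pro), project id, rule text, and sample chunks are illustrative assumptions, not the project's actual configuration.

```python
import numpy as np
import vertexai
from vertexai.language_models import TextEmbeddingModel
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-gcp-project", location="us-central1")  # hypothetical
embedder = TextEmbeddingModel.from_pretrained("text-embedding-004")
llm = GenerativeModel("gemini-1.5-pro")

def embed(texts):
    # One embedding vector per input text.
    return np.array([e.values for e in embedder.get_embeddings(texts)])

# Chunks extracted from an applicant's documents (illustrative values).
chunks = [
    "Payslip: net monthly salary 4,200 USD",
    "Bank statement: average balance last 90 days 9,800 USD",
]
index = embed(chunks)

def recommend(question, rules, k=2):
    # Retrieve the top-k most similar chunks, then ask for a grounded verdict.
    q = embed([question])[0]
    sims = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    context = "\n".join(chunks[i] for i in sims.argsort()[::-1][:k])
    prompt = (
        f"Business rules:\n{rules}\n\nEvidence:\n{context}\n\n"
        f"Question: {question}\nAnswer APPROVE or REJECT with reasons."
    )
    return llm.generate_content(prompt).text

print(recommend("Does income cover a 1,000 USD monthly installment?",
                "Installment must stay below 40% of net monthly income."))
```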

Responsibilities:
  • Designed and implemented a Python-based RAG system on GCP using Vertex AI to ingest, extract, structure, classify, and analyze financial documents, including scanned and multi-image formats;
  • Enabled support for multiple applicant document types (payslips, bank statements, tax returns, collateral docs) and variable-quality inputs to improve coverage and automation;
  • Integrated business/risk rules to generate explainable approve/reject recommendations, reducing manual loan-document review time and improving consistency;
  • Led and coached a cross-functional team of 5 engineers (one data scientist, 4 Python developers), defining delivery milestones, enforcing MLOps and quality practices, and ensuring the solution was scalable, secure, and auditable.
Project Tech stack:
Python
GCP
Vertex AI
RAG
Senior Data Engineer
Mar 2022 - Jul 2024 (2 years 4 months)
Project Overview

Modernize and re-platform existing data pipelines from Redshift to Snowflake to ensure consistent data models, optimized performance, and lower platform fragmentation, leveraging Snowflake-native features (tasks, streams, warehouse sizing) and CI/CD to deliver reliable, maintainable, and cost-aware data workflows.
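
Below is a minimal Airflow sketch of how such a re-platforming flow could be orchestrated; the DAG id, helper scripts (unload_redshift.py, copy_into_snowflake.py), S3 paths, and dbt invocation are hypothetical, not the project's actual jobs.

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="redshift_to_snowflake",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Unload source tables from Redshift to S3 (helper script assumed).
    unload = BashOperator(
        task_id="unload_redshift",
        bash_command="python unload_redshift.py --target s3://lake/raw/",
    )
    # COPY the staged files into Snowflake (helper script assumed).
    load = BashOperator(
        task_id="load_snowflake",
        bash_command="python copy_into_snowflake.py --source s3://lake/raw/",
    )
    # Transform with dbt Core once the raw layer is in place.
    transform = BashOperator(task_id="dbt_run", bash_command="dbt run")

    unload >> load >> transform
```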

Responsibilities:
  • Implemented, optimized, and maintained new data ingestions into Snowflake using dbt Core (and later Cloud);
  • Implemented new data ingestions from Redshift into Snowflake using Airflow within a fully AWS-based data lake architecture;
  • Maintained AWS Glue (PySpark) workflows populating warehouse data in Redshift;
  • Successfully proposed and implemented a data lake architecture across new and legacy ELT processes, improving data quality and governance.
Project Tech stack:
Snowflake
dbt
Airflow
AWS
SQL
PySpark
Redshift
Senior Data Engineer
Mar 2021 - Mar 2022 (1 year)
Project Overview

Deliver a secure, scalable, and cost-efficient data platform by migrating the enterprise data warehouse from an on-premise Oracle database to Snowflake, fully hosted on AWS, modernizing ingestion, storage, and analytics while minimizing downtime and ensuring data quality.
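
As one illustration of the data-quality piece of such a migration, the sketch below reconciles row counts between legacy Oracle tables and their Snowflake counterparts; connection details and table names are assumptions, not the project's actual environment.

```python
import oracledb
import snowflake.connector

def count_rows(conn, table):
    # Simple row-count probe; sufficient for a first-pass reconciliation.
    cur = conn.cursor()
    cur.execute(f"SELECT COUNT(*) FROM {table}")
    return cur.fetchone()[0]

ora = oracledb.connect(user="etl", password="***", dsn="legacy-host/ORCL")
snow = snowflake.connector.connect(
    user="etl", password="***", account="acme-xy12345", warehouse="ETL_WH",
    database="ANALYTICS", schema="PUBLIC",
)

for table in ["TRANSACTIONS", "ACCOUNTS"]:  # illustrative table list
    src, dst = count_rows(ora, table), count_rows(snow, table)
    status = "OK" if src == dst else "MISMATCH"
    print(f"{table}: oracle={src} snowflake={dst} -> {status}")
```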

Responsibilities:
  • Researched, designed, and developed a new data integration system using modern technologies for process orchestration (Airflow) and data transformation (dbt, Great Expectations, Snowflake) within a fully AWS environment;
  • Optimized procedures and SQL statements in legacy ETL processes across Oracle and Snowflake databases;
  • Maintained and documented a Python-based system delivering transactional data to clients via unstructured text files;
  • Reduced execution time of a major daily legacy ETL from 10 to 8 hours (–20%) by improving procedure call orchestration and optimizing SQL statements.
Project Tech stack:
Snowflake
dbt
Airflow
AWS
PySpark
SQL
Python
Business Intelligence Lead
Feb 2020 - Mar 2021 (1 year 1 month)
Project Overview

A new Business Intelligence function that standardizes data sources, defines governance and KPIs, and provides self-service reporting and dashboards to business units, improving decision-making speed, accuracy, and transparency. The project involved integrating multiple data sources and performing cleaning, quality control, transformation, and normalization using SQL, Pentaho, and Great Expectations.
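
For illustration only, the sketch below shows the kind of cleaning and expectation-style checks described, using pandas in place of the project's Pentaho jobs; the file and column names (policies_export.csv, policy_id, premium) are hypothetical.

```python
import pandas as pd

raw = pd.read_csv("policies_export.csv")  # hypothetical source extract

# Cleaning: dedupe keys, coerce numeric fields, drop unusable rows.
clean = (
    raw
    .drop_duplicates(subset="policy_id")
    .assign(premium=lambda df: pd.to_numeric(df["premium"], errors="coerce"))
    .dropna(subset=["policy_id", "premium"])
)

# Expectation-style quality gates before loading into the warehouse.
assert clean["policy_id"].is_unique, "duplicate policy ids after cleanup"
assert (clean["premium"] >= 0).all(), "negative premiums found"

clean.to_parquet("staging/policies.parquet")  # staged for the DWH load
```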

Responsibilities:
  • Created a new data warehouse in the company's on-premises infrastructure to consolidate all organizational information;
  • Detected data silos and defined their integration into the new data warehouse;
  • Defined minimum policies for data governance;
  • Managed, researched, and developed two projects generating reports for the company’s financial area and top management;
  • Built ETLs using Pentaho and Great Expectations, and developed dashboards in Power BI;
  • Designed and implemented the Technical Provisions process to ensure company solvency control.
Project Tech stack:
Oracle
PL/SQL
Microsoft Power BI
SQL
Python

Education

2008
Bachelor's degree, Informatics Engineering
2018
Master's degree, Computer Science

Languages

French
Intermediate
English
Advanced
