
Luis

From Colombia (UTC-5)

Senior Data Engineer

Luis – Python, AWS, SQL

Luis is a Senior Data Engineer with over 14 years of dedicated experience in data engineering and 17 years in software overall. He brings strong expertise in Python, SQL, and AWS, with hands-on experience building and maintaining data pipelines and cloud-based solutions. Luis has delivered data warehouse migrations, improved data workflows, and ensured high data quality across multiple projects. He pairs solid engineering discipline with the pace of fast-moving, product-focused teams, making him a great fit for startups scaling their data operations. He has led development teams in the financial and healthcare sectors, knows how to set up robust pipelines under tight timelines, and can quickly translate complex requirements into reliable, scalable solutions.

18 years of commercial experience in
Accounting
AI
Banking
Fintech
Healthcare
Transportation
Subscription
Main technologies
Python: 8.5 years
AWS: 3.5 years
SQL: 16.5 years
Snowflake: 5 years
Redshift: 1 year
Additional skills
PySpark
Data visualization
Data Warehouse
Microsoft Power BI
PL/SQL
Oracle
Airflow
dbt
Vertex AI
GCP
Databricks
BigQuery
Tableau
RAG
Direct hire: Possible

Experience Highlights

Senior Data Engineer
Oct 2024 - Sep 2025 (10 months)
Project Overview

Build a governed, reusable transportation data foundation on Databricks using a medallion (bronze–silver–gold) architecture to ingest raw operational feeds, standardize and cleanse trip/shipment/asset data, and publish analytics-ready, business-aligned datasets for planning, tracking, and cost/performance insights across logistics.
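
A minimal PySpark sketch of the bronze-to-silver step this pattern implies is below; the topic, table, and column names (trips_raw, trip_id, event_ts) and the watermark window are illustrative assumptions, not the project's actual schema.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("trips_silver").getOrCreate()

# Bronze: raw Kafka events already landed as-is into a Delta table.
bronze = spark.readStream.table("bronze.trips_raw")

# Silver: enforce types, tolerate late arrivals, and drop replayed duplicates
# before publishing analytics-ready rows for downstream consumers.
silver = (
    bronze
    .withColumn("event_ts", F.to_timestamp("event_ts"))  # schema enforcement
    .withWatermark("event_ts", "2 hours")                # late-arrival handling
    .dropDuplicates(["trip_id", "event_ts"])             # deduplication
    .filter(F.col("trip_id").isNotNull())                # basic quality gate
)

(silver.writeStream
    .option("checkpointLocation", "/chk/trips_silver")
    .toTable("silver.trips"))
```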

Responsibilities:
  • Built and maintained Databricks (PySpark) workflows following a medallion (bronze–silver–gold) architecture to ingest and refine transportation data from multiple Kafka topics;
  • Standardized, cleansed, and consolidated trip, shipment, and asset events into analytics-ready datasets to support logistics planning, tracking, and performance reporting;
  • Implemented data-quality controls (schema enforcement, deduplication, late-arrival handling) to guarantee reliable, trustworthy transportation data for downstream consumers.
Project Tech stack:
Databricks
PySpark
SQL
Lead Developer
Jul 2024 - Oct 2024 (3 months)
Project Overview

A Retrieval-Augmented Generation (RAG) solution that ingests and interprets applicants' financial documents (payslips, bank statements, tax returns, collateral docs), cross-checks them against business rules and risk thresholds, and produces explainable approve/reject recommendations, reducing manual review time, improving consistency, and strengthening auditability of the loan origination process.
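
A hedged sketch of that flow using the vertexai Python SDK is shown below; the model names (text-embedding-004, gemini-1.5-pro), project id, rule text, and sample chunks are illustrative assumptions, not the project's actual configuration.

```python
import numpy as np
import vertexai
from vertexai.language_models import TextEmbeddingModel
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-gcp-project", location="us-central1")  # hypothetical
embedder = TextEmbeddingModel.from_pretrained("text-embedding-004")
llm = GenerativeModel("gemini-1.5-pro")

def embed(texts):
    # One embedding vector per input text.
    return np.array([e.values for e in embedder.get_embeddings(texts)])

# Chunks extracted from an applicant's documents (illustrative values).
chunks = [
    "Payslip: net monthly salary 4,200 USD",
    "Bank statement: average balance last 90 days 9,800 USD",
]
index = embed(chunks)

def recommend(question, rules, k=2):
    # Retrieve the top-k most similar chunks, then ask for a grounded verdict.
    q = embed([question])[0]
    sims = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    context = "\n".join(chunks[i] for i in sims.argsort()[::-1][:k])
    prompt = (
        f"Business rules:\n{rules}\n\nEvidence:\n{context}\n\n"
        f"Question: {question}\nAnswer APPROVE or REJECT with reasons."
    )
    return llm.generate_content(prompt).text

print(recommend("Does income cover a 1,000 USD monthly installment?",
                "Installment must stay below 40% of net monthly income."))
```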

Responsibilities:
  • Designed and implemented a Python-based RAG system on GCP using Vertex AI to ingest, extract, structure, classify, and analyze financial documents, including scanned and multi-image formats;
  • Enabled support for multiple applicant document types (payslips, bank statements, tax returns, collateral docs) and variable-quality inputs to improve coverage and automation;
  • Integrated business/risk rules to generate explainable approve/reject recommendations, reducing manual loan-document review time and improving consistency;
  • Led and coached a cross-functional team of 5 engineers (one data scientist, 4 Python developers), defining delivery milestones, enforcing MLOps and quality practices, and ensuring the solution was scalable, secure, and auditable.
Project Tech stack:
Python
GCP
Vertex AI
RAG
Senior Data Engineer
Mar 2022 - Jul 2024 (2 years 4 months)
Project Overview

Modernize and re-platform existing data pipelines from Redshift to Snowflake to ensure consistent data models, optimized performance, and lower platform fragmentation, leveraging Snowflake-native features (tasks, streams, warehouse sizing) and CI/CD to deliver reliable, maintainable, and cost-aware data workflows.
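
Below is a minimal Airflow sketch of how such a re-platforming flow could be orchestrated; the DAG id, helper scripts (unload_redshift.py, copy_into_snowflake.py), S3 paths, and dbt invocation are hypothetical, not the project's actual jobs.

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="redshift_to_snowflake",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Unload source tables from Redshift to S3 (helper script assumed).
    unload = BashOperator(
        task_id="unload_redshift",
        bash_command="python unload_redshift.py --target s3://lake/raw/",
    )
    # COPY the staged files into Snowflake (helper script assumed).
    load = BashOperator(
        task_id="load_snowflake",
        bash_command="python copy_into_snowflake.py --source s3://lake/raw/",
    )
    # Transform with dbt Core once the raw layer is in place.
    transform = BashOperator(task_id="dbt_run", bash_command="dbt run")

    unload >> load >> transform
```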

Responsibilities:
  • Implemented, optimized, and maintained new data ingestions into Snowflake using dbt Core (and later Cloud);
  • Implemented new data ingestions from Redshift into Snowflake using Airflow within a fully AWS-based data lake architecture;
  • Maintained AWS Glue (PySpark) workflows populating warehouse data in Redshift;
  • Successfully proposed and implemented a data lake architecture across new and legacy ELT processes, improving data quality and governance.
Project Tech stack:
Snowflake
dbt
Airflow
AWS
SQL
PySpark
Redshift
Senior Data Engineer
Mar 2021 - Mar 2022 (1 year)
Project Overview

Deliver a secure, scalable, and cost-efficient data platform by migrating the enterprise data warehouse from an on-premise Oracle database to Snowflake, fully hosted on AWS, modernizing ingestion, storage, and analytics while minimizing downtime and ensuring data quality.
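
As one illustration of the data-quality piece of such a migration, the sketch below reconciles row counts between legacy Oracle tables and their Snowflake counterparts; connection details and table names are assumptions, not the project's actual environment.

```python
import oracledb
import snowflake.connector

def count_rows(conn, table):
    # Simple row-count probe; sufficient for a first-pass reconciliation.
    cur = conn.cursor()
    cur.execute(f"SELECT COUNT(*) FROM {table}")
    return cur.fetchone()[0]

ora = oracledb.connect(user="etl", password="***", dsn="legacy-host/ORCL")
snow = snowflake.connector.connect(
    user="etl", password="***", account="acme-xy12345", warehouse="ETL_WH",
    database="ANALYTICS", schema="PUBLIC",
)

for table in ["TRANSACTIONS", "ACCOUNTS"]:  # illustrative table list
    src, dst = count_rows(ora, table), count_rows(snow, table)
    status = "OK" if src == dst else "MISMATCH"
    print(f"{table}: oracle={src} snowflake={dst} -> {status}")
```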

Responsibilities:
  • Researched, designed, and developed a new data integration system using modern technologies for process orchestration (Airflow) and data transformation (dbt, Great Expectations, Snowflake) within a fully AWS environment;
  • Optimized procedures and SQL statements in legacy ETL processes across Oracle and Snowflake databases;
  • Maintained and documented a Python-based system delivering transactional data to clients via unstructured text files;
  • Reduced execution time of a major daily legacy ETL from 10 to 8 hours (–20%) by improving procedure call orchestration and optimizing SQL statements.
Project Tech stack:
Snowflake
dbt
Airflow
AWS
PySpark
SQL
Python
Business Intelligence Lead
Feb 2020 - Mar 2021 (1 year 1 month)
Project Overview

A new Business Intelligence function that standardizes data sources, defines governance and KPIs, and provides self-service reporting and dashboards to business units, improving decision-making speed, accuracy, and transparency. The project involved integrating multiple data sources and performing cleaning, quality control, transformation, and normalization using SQL, Pentaho, and Great Expectations.
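
For illustration only, the sketch below shows the kind of cleaning and expectation-style checks described, using pandas in place of the project's Pentaho jobs; the file and column names (policies_export.csv, policy_id, premium) are hypothetical.

```python
import pandas as pd

raw = pd.read_csv("policies_export.csv")  # hypothetical source extract

# Cleaning: dedupe keys, coerce numeric fields, drop unusable rows.
clean = (
    raw
    .drop_duplicates(subset="policy_id")
    .assign(premium=lambda df: pd.to_numeric(df["premium"], errors="coerce"))
    .dropna(subset=["policy_id", "premium"])
)

# Expectation-style quality gates before loading into the warehouse.
assert clean["policy_id"].is_unique, "duplicate policy ids after cleanup"
assert (clean["premium"] >= 0).all(), "negative premiums found"

clean.to_parquet("staging/policies.parquet")  # staged for the DWH load
```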

Responsibilities:
  • Created a new data warehouse in the company's on-premises infrastructure to consolidate all organizational information;
  • Detected data silos and defined their integration into the new data warehouse;
  • Defined minimum policies for data governance;
  • Managed, researched, and developed two projects generating reports for the company’s financial area and top management;
  • Built ETLs using Pentaho and Great Expectations, and developed dashboards in Power BI;
  • Designed and implemented the Technical Provisions process to ensure company solvency control.
Project Tech stack:
Oracle
PL/SQL
Microsoft Power BI
SQL
Python

Education

2008
Bachelor's degree, Informatics Engineering
2018
Master's degree, Computer Science

Languages

French
Intermediate
English
Advanced
