João – SQL, Python, AWS expert at Lemon.io

João

From Brazil (UTC-5)

Data Engineer · Senior
10 years of commercial experience
Analytics
Banking
Cloud computing
Data analytics
Healthcare
Healthtech
Insurance
Modeling software

João – SQL, Python, AWS

Meet our Senior Data Engineer: a seasoned problem solver with a passion for crafting impactful solutions. With diverse experience across insurance, finance, banking, and sales, he brings a wealth of expertise. Proficient in Python, SQL, AWS, and data warehousing, João excels at developing source-to-target mappings, data analysis, and ETL processes. His skills span extraction, lineage, quality assurance, conversion, transformation, and loading. Beyond data, he enjoys the gym, books, and relaxation.

Main technologies
SQL
7 years
Python
6 years
AWS
3 years
Additional skills
Microsoft Azure
PostgreSQL
Snowflake
Apache Kafka
Microsoft SQL Server
Apache Airflow
AWS Lambda
GCP
MySQL
Data Warehouse
Data Modeling
Data analysis
Big Data
Database design
Oracle
Tableau
PySpark
Databricks
ETL
Pandas
Ready to start
40h (1-2 weeks), part-time when needed
Direct hire
Potentially possible

Experience Highlights

Senior Data Engineer
Jan 2024 - Ongoing · 1 year 2 months
Project Overview

João is helping global clients—primarily in the United States—modernize their data platforms and optimize data pipelines. His focus is on building robust, scalable solutions using tools like Snowflake, dbt, Fivetran, AWS, and Azure.

Key projects & responsibilities:

  • Data Warehouse Project with EDI Integration: Participated in a global initiative to build a modern Data Warehouse ingesting EDI files from major retail partners such as Target, Amazon, Best Buy US, and Best Buy Canada. The project centralized data related to returns, sell-through, sales, inventory, product performance, and more. João contributed from the initial setup of the pipelines to the development of data models and monitoring.
  • Cloudera to Snowflake Migration: Led the migration of enterprise pipelines from Cloudera to Snowflake to improve performance and modernize the architecture (a minimal loading sketch follows this list).
  • Cloudera & DB2 Modernization: Worked on a hybrid migration and modernization project involving Cloudera and DB2, moving workloads to a cloud-native stack.
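
Below is a minimal, illustrative sketch of the kind of Snowflake loading step involved in the work above, using snowflake-connector-python. The warehouse, database, table, and stage names are assumptions for illustration only, not the project's actual objects.

```python
# Illustrative Snowflake loading step (all object names are placeholders).
# Assumes snowflake-connector-python is installed and an external stage @edi_stage
# already points at the bucket that receives partner EDI extracts.
import os
import snowflake.connector

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="LOAD_WH",    # placeholder warehouse
    database="RETAIL_RAW",  # placeholder database
    schema="EDI",           # placeholder schema
)
try:
    cur = conn.cursor()
    # Copy any new sell-through files from the stage into the raw table;
    # Snowflake skips files it has already loaded.
    cur.execute("""
        COPY INTO EDI_SELL_THROUGH
        FROM @edi_stage/sell_through/
        FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
        ON_ERROR = 'ABORT_STATEMENT'
    """)
    print(cur.fetchall())  # per-file load results, useful for monitoring
finally:
    conn.close()
```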
Project Tech stack:
AWS
Amazon S3
Amazon EC2
Amazon RDS
Azure DevOps
Redshift
Data Warehouse
Data Modeling
Big Data
Algorithms and Data Structures
Microsoft Azure
Snowflake
Databricks
Fivetran
Data Structures
Data analysis
Data mining
Data Security
GitHub Actions
Senior Data Engineer / Tech Lead
Oct 2022 - Ongoing · 2 years 5 months
Project Overview

The Data Warehouse project aimed to create indicators mapping the duration of each stage/task across the bank's thousands of products and services offered to customers. The main objective was to identify operational bottlenecks impacting the end customer and answer business questions about average lead time, SLA compliance, customer counts, values, and the various stages.

Target Audience: Operations teams, business analysts, and decision-makers at Santander Bank.

Responsibilities:
  • Built the warehouse from the ground up, including requirements gathering and dashboard development support;
  • Collaborated with Product Owners to identify and prioritize business requirements;
  • Designed and implemented data models to analyze operational data effectively;
  • Developed ETL processes using PostgreSQL and the BIX tool to extract and transform data;
  • Established connections and scheduled updates with the PowerBI Report Server via ODBC;
  • Automated procedure execution through job scheduling in Control-M;
  • Contributed to identifying operational bottlenecks and providing actionable insights to improve business processes (the lead-time sketch after this list illustrates one such indicator);
  • Created dashboards and reports to visualize key performance indicators and support decision-making processes.
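
As a rough illustration of the lead-time and SLA indicators mentioned above, the sketch below computes average stage duration and SLA compliance with pandas. The column names, sample data, and the 48-hour threshold are assumptions, not the bank's actual definitions.

```python
# Illustrative lead-time / SLA indicator calculation (columns and SLA threshold are assumed).
import pandas as pd

SLA_HOURS = 48  # hypothetical SLA threshold per stage

# One row per completed stage of a product/service request.
stages = pd.DataFrame({
    "product":     ["loan", "loan", "card", "card"],
    "stage":       ["analysis", "approval", "analysis", "issuance"],
    "started_at":  pd.to_datetime(["2024-01-02 09:00", "2024-01-03 10:00",
                                   "2024-01-02 11:00", "2024-01-05 08:00"]),
    "finished_at": pd.to_datetime(["2024-01-03 09:00", "2024-01-06 10:00",
                                   "2024-01-02 15:00", "2024-01-05 20:00"]),
})

stages["lead_time_h"] = (stages["finished_at"] - stages["started_at"]).dt.total_seconds() / 3600
stages["within_sla"] = stages["lead_time_h"] <= SLA_HOURS

# Aggregate into the indicators surfaced on the dashboards: average lead time,
# SLA compliance rate, and request volume per product and stage.
indicators = (
    stages.groupby(["product", "stage"])
          .agg(avg_lead_time_h=("lead_time_h", "mean"),
               sla_compliance=("within_sla", "mean"),
               requests=("lead_time_h", "size"))
          .reset_index()
)
print(indicators)
```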
Project Tech stack:
Amazon S3
Amazon EC2
Azure DevOps
Big Data
Data Warehouse
Data mining
Data visualization
Data Structures
Data Security
Data Science
PostgreSQL
Azure SQL
NoSQL
MySQL
Transact-SQL (T-SQL)
Oracle SQL Developer
Snowflake
API
Microsoft Power BI
Data Modeling
Apache Spark
Senior Data Engineer
May 2023 - Dec 2023 · 7 months
Project Overview

The project involved migrating on-premise data systems (Cloudera, Oracle, PostgreSQL) to a cloud-based infrastructure. The primary goal was to enhance the banking experience for customers by leveraging advanced, secure, and efficient digital solutions. This initiative significantly improved data accessibility and processing capabilities.

Responsibilities:
  • Engineered batch data pipelines using AWS Glue and Data Factory, facilitating efficient data lake ingestion;
  • Developed Near Real-Time (NRT) and stream-processing pipelines using Kafka, EventHub, Azure Stream Analytics, and AWS Kinesis, ensuring timely data availability (see the streaming sketch after this list);
  • Optimized data processing and query execution using PySpark and SparkSQL;
  • Automated testing processes for improved reliability and efficiency in data handling;
  • Managed data storage and processing across various platforms, including Amazon S3, PostgreSQL, Oracle DB, Azure DataLake Storage, and Snowflake.
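
A minimal PySpark Structured Streaming sketch in the spirit of the NRT pipelines above. The broker address, topic, and lake paths are placeholders, and it assumes the spark-sql-kafka connector is available; the real pipelines also covered EventHub, Azure Stream Analytics, and Kinesis.

```python
# Minimal Kafka -> data lake streaming sketch with PySpark (broker, topic, and paths are placeholders).
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("nrt-transactions").getOrCreate()

schema = StructType([
    StructField("account_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Read raw events from Kafka and parse the JSON payload into typed columns.
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
         .option("subscribe", "transactions")                # placeholder topic
         .load()
         .select(from_json(col("value").cast("string"), schema).alias("e"))
         .select("e.*")
)

# Land parsed events in the data lake as Parquet for downstream batch consumers.
query = (
    events.writeStream.format("parquet")
          .option("path", "s3a://example-lake/transactions/")                  # placeholder bucket
          .option("checkpointLocation", "s3a://example-lake/_chk/transactions/")
          .trigger(processingTime="1 minute")
          .start()
)
query.awaitTermination()
```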
Project Tech stack:
AWS
Azure SQL
Azure Functions
Microsoft Azure
Azure DevOps
PostgreSQL
Oracle SQL Developer
NoSQL
SQL Server
Transact-SQL (T-SQL)
Data Engineer
Nov 2021 - Sep 2022 · 9 months
Project Overview

This project aimed to establish robust data management and governance practices within a global team using Azure DevOps. The work involved implementing data governance frameworks, developing data dictionaries, and creating data lineage to ensure data quality and compliance. The project targeted improving data management efficiency, facilitating data analysis, and enabling better decision-making across the organization.

Responsibilities:
  • Collaborated with stakeholders to define data requirements and created a comprehensive Data Glossary;
  • Developed a robust Data Dictionary and established Data Lineage to ensure effective data governance;
  • Designed and validated Technical Data Quality Rules and Business Rules using SQL, contributing to improved data quality (see the rule-check sketch after this list);
  • Implemented validation and consistency pipelines using SSIS/SQL to automate data validation processes;
  • Developed and implemented a normalized Data Warehouse model, resulting in significant performance improvements when connecting with PowerBI;
  • Created pipelines using SSIS for ETL/ELT processes, enabling efficient data extraction, transformation, and loading;
  • Designed and implemented Data Marts for different business areas, facilitating targeted data analysis;
  • Conducted Data Modeling for both normalized and denormalized databases, ensuring efficient data storage and retrieval;
  • Utilized data mining techniques to identify insights and trends, contributing to informed decision-making;
  • Managed Master Data and worked as part of a DevOps team to ensure smooth deployment and maintenance of the solution.
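
A simplified sketch of how technical data-quality rules like those above can be expressed as SQL and checked from Python. The connection string, table names, and rule definitions are illustrative assumptions; the actual project implemented the checks as SSIS/SQL pipelines.

```python
# Illustrative data-quality rule runner (connection, tables, and rules are assumed).
# Each rule is a SQL query that returns the rows violating the rule.
import os
from sqlalchemy import create_engine, text

engine = create_engine(os.environ["DQ_DB_URL"])  # e.g. a SQL Server connection string

RULES = {
    "customer_id_not_null":    "SELECT customer_id FROM dbo.customers WHERE customer_id IS NULL",
    "policy_dates_consistent": "SELECT policy_id FROM dbo.policies WHERE end_date < start_date",
}

with engine.connect() as conn:
    for name, sql in RULES.items():
        violations = conn.execute(text(sql)).fetchall()
        status = "OK" if not violations else f"{len(violations)} violations"
        print(f"{name}: {status}")
```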
Project Tech stack:
GCP
Azure SQL
Microsoft Azure
Grafana
Apache Kafka
Big Data
Data Modeling
Data Warehouse
Python
SQL
Oracle
PostgreSQL
Developer / Data Engineer
Oct 2018 - Sep 2021 · 2 years 10 months
Project Overview

The project aimed to modernize and optimize data management and analysis processes for Life and Pension services.

Responsibilities:
  • Contributed as a developer and data engineer within the development team;
  • Conducted ETL/ELT processes to manipulate, transform, and load data for analysis;
  • Played a key role in modernizing data management processes and infrastructure;
  • Successfully migrated data warehouses to cloud platforms for improved scalability and accessibility;
  • Led the migration of dashboards from PowerPoint to Tableau for enhanced visualization and interactivity;
  • Implemented a centralized data warehouse for operational indicators, improving data consistency and accessibility;
  • Delivered results related to life and pension services;
  • Collaborated across the data, reports, indicators, and models teams;
  • Provided support to teams in business/data analytics, UX/UI, and data architecture to influence solution design and implementation;
  • Migrated specific databases (System of Record - SOR) from Oracle and SQL Server Management Studio (SSMS) environments to Google Cloud Platform (GCP), ensuring seamless data storage and accessibility;
  • Developed batch processing pipelines to efficiently handle large data volumes, facilitating timely data analysis and reporting (a minimal loading sketch follows this list);
  • Utilized skills in Google Cloud Platform (GCP), Tableau, data governance, data engineering, SQL, ETL processes, Microsoft SQL Server, Oracle SQL Developer, Oracle Database, Python, PySpark, and SAS to contribute to project success.
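
A minimal sketch of a GCP batch-loading step like the ones described above, using the google-cloud-bigquery client. The bucket, dataset, and table names are placeholders rather than the real SOR objects.

```python
# Illustrative batch load from Cloud Storage into BigQuery (all names are placeholders).
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

# Load all CSV extracts for the batch window from GCS into the target table.
load_job = client.load_table_from_uri(
    "gs://example-sor-extracts/policies/*.csv",   # placeholder bucket/path
    "example-project.life_pension.policies",      # placeholder table
    job_config=job_config,
)
load_job.result()  # wait for the load job to finish

table = client.get_table("example-project.life_pension.policies")
print(f"Loaded table now has {table.num_rows} rows")
```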
Project Tech stack:
GCP
Apache Kafka
Data Warehouse
Data mining
SQL
Oracle
Apache Airflow
Python
PySpark
Apache Spark
Tableau
Firebase DB and Storage
SAS
Data Engineer
Dec 2019 - Nov 2020 · 10 months
Project Overview

The project was aimed at developing a comprehensive data warehouse encompassing life and pension information. It included detailed sales data, comparisons of budget versus actuals, breakdowns by sales channels and branches, and commercial productivity. The primary objective was to centralize and streamline data for better business intelligence and decision-making.

Responsibilities:
  • Designed and implemented the architecture of the data warehouse, ensuring scalability and efficiency.
  • Optimized data processing pipelines using PySpark/SparkSQL, enhancing performance and data throughput.
  • Conducted automated testing to ensure data integrity and reliability of the data warehouse.
  • Utilized Google Cloud Platform (GCP) technologies, including BigQuery, Cloud Storage, and Dataflow, for robust data management and analysis.
  • Employed Airflow for workflow orchestration, ensuring smooth and automated ETL processes (a skeletal DAG sketch follows this list).
  • Integrated SQL Server for database management, supporting complex queries and data storage.
  • Developed dashboards and reports in Tableau, providing insightful analytics and visualizations for business stakeholders.
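
A skeletal Airflow DAG in the spirit of the orchestration described above. The DAG id, schedule, and task callables are placeholders, not the production workflow.

```python
# Skeletal extract -> transform -> load DAG (ids, schedule, and callables are placeholders).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_sales():    # placeholder: pull daily sales extracts into staging
    ...


def transform_sales():  # placeholder: apply budget-vs-actual calculations
    ...


def load_warehouse():   # placeholder: load curated tables into the warehouse
    ...


with DAG(
    dag_id="life_pension_sales_dw",
    start_date=datetime(2020, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_sales", python_callable=extract_sales)
    transform = PythonOperator(task_id="transform_sales", python_callable=transform_sales)
    load = PythonOperator(task_id="load_warehouse", python_callable=load_warehouse)

    extract >> transform >> load
```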
Project Tech stack:
GCP
Apache Airflow
PySpark
Oracle SQL Developer
PL/SQL
Transact-SQL (T-SQL)

Education

2020
Bachelor of Actuarial Science
Bachelor's Degree
2023
Big Data & Data Analytics
Master's degree
2025
AWS Practitioner
Certification

Languages

Portuguese
Advanced
English
Advanced

Hire João or someone with similar qualifications in days
All developers are ready for interview and are just waiting for your request.