
João
From Brazil (UTC-5)
10 years of commercial experience
João – SQL, Python, AWS
Meet our Senior Data Engineer - a seasoned problem solver with a passion for crafting impactful solutions. With experience across insurance, finance, banking, and sales, he brings a wealth of expertise. Proficient in Python, SQL, AWS, and data warehousing, João excels at developing source-to-target mappings, data analysis, and ETL processes. His skills span extraction, lineage, quality assurance, conversion, transformation, and loading. Beyond data, he enjoys the gym, books, and relaxation.
Ready to start: 40h (1–2 weeks), part-time when needed
Direct hire: Potentially possible
Experience Highlights
Sr Data Engineer
João is helping global clients—primarily in the United States—modernize their data platforms and optimize data pipelines. His focus is on building robust, scalable solutions using tools like Snowflake, dbt, Fivetran, AWS, and Azure.
Key projects & responsibilities:
- Data Warehouse Project with EDI Integration: Participated in a global initiative to build a modern Data Warehouse ingesting EDI files from major retail partners such as Target, Amazon, Best Buy US, and Best Buy Canada. The project centralized data related to returns, sell-through, sales, inventory, product performance, and more. João contributed from the initial setup of the pipelines to the development of data models and monitoring.
- Cloudera to Snowflake Migration: Led the migration of enterprise pipelines from Cloudera to Snowflake, with the goals of improving performance and modernizing the architecture.
- Cloudera & DB2 Modernization: Worked on a hybrid migration and modernization project involving Cloudera and DB2, moving workloads to a cloud-native stack.
Senior Data Engineer / Tech Lead
The Data Warehouse project aimed to create indicators mapping the duration of each stage/task across the thousands of products and services offered to customers. The main objective was to identify operational bottlenecks impacting the end customer and answer business questions about average lead time, SLA compliance, customer volumes, values, and the individual stages.
Target Audience: Operations teams, business analysts, and decision-makers at Santander Bank.
- Built the warehouse from the ground up, including requirements gathering and dashboard development support;
- Collaborated with Product Owners to identify and prioritize business requirements;
- Designed and implemented data models to analyze operational data effectively;
- Developed ETL processes using PostgreSQL and the BIX tool to extract and transform data;
- Established connections and scheduled updates with the PowerBI Report Server via ODBC;
- Automated procedure execution through job scheduling in Control-M;
- Contributed to identifying operational bottlenecks and providing actionable insights to improve business processes;
- Created dashboards and reports to visualize key performance indicators and support decision-making processes.
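Indicators like these typically reduce to aggregations over stage timestamps. A minimal sketch of the kind of lead-time and SLA query involved, using an in-memory SQLite stand-in for the actual PostgreSQL tables (table names, columns, and the 4-hour SLA threshold are illustrative, not the project's real schema):

```python
import sqlite3

# In-memory stand-in for the operational stage-tracking table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE stage_events (
        request_id INTEGER,
        stage      TEXT,
        started    TEXT,   -- ISO timestamps
        finished   TEXT
    )
""")
conn.executemany(
    "INSERT INTO stage_events VALUES (?, ?, ?, ?)",
    [
        (1, "analysis", "2024-01-01 09:00", "2024-01-01 11:00"),
        (2, "analysis", "2024-01-01 09:00", "2024-01-01 15:00"),
        (1, "approval", "2024-01-01 11:00", "2024-01-02 11:00"),
    ],
)

# Average lead time per stage in hours, plus SLA compliance rate
# against an illustrative 4-hour threshold.
rows = conn.execute("""
    SELECT stage,
           AVG((julianday(finished) - julianday(started)) * 24)        AS avg_hours,
           AVG((julianday(finished) - julianday(started)) * 24 <= 4.0) AS sla_rate
    FROM stage_events
    GROUP BY stage
    ORDER BY stage
""").fetchall()

for stage, avg_hours, sla_rate in rows:
    print(stage, round(avg_hours, 1), round(sla_rate, 2))
```

In production the same aggregation would run as a scheduled procedure (here, via Control-M) feeding the Power BI dashboards.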
Senior Data Engineer
The project involved migrating on-premise data systems (Cloudera, Oracle, PostgreSQL) to a cloud-based infrastructure. The primary goal was to enhance the banking experience for customers by leveraging advanced, secure, and efficient digital solutions. This initiative significantly improved data accessibility and processing capabilities.
- Engineered batch data pipelines with AWS Glue and Azure Data Factory, facilitating efficient data lake ingestion;
- Developed Near Real-Time (NRT) and stream-processing pipelines using Kafka, Azure Event Hubs, Azure Stream Analytics, and AWS Kinesis, ensuring timely data availability;
- Optimized data processing and query execution using PySpark and SparkSQL;
- Automated testing processes for improved reliability and efficiency in data handling;
- Managed data storage and processing across various platforms, including Amazon S3, PostgreSQL, Oracle DB, Azure DataLake Storage, and Snowflake.
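The NRT pattern above (broker consumers feeding windowed aggregates) can be sketched in plain Python, with a static event list standing in for the Kafka/Kinesis client; event fields and the 60-second window are illustrative assumptions:

```python
from collections import defaultdict

# Illustrative event stream: (epoch_seconds, account, amount).
events = [
    (0,  "acc-1", 10.0),
    (30, "acc-2", 5.0),
    (65, "acc-1", 7.5),
    (70, "acc-1", 2.5),
]

def windowed_totals(events, window_s=60):
    """Tumbling-window aggregation: total amount per account per window."""
    totals = defaultdict(float)
    for ts, account, amount in events:
        window = ts // window_s  # index of the window the event falls in
        totals[(window, account)] += amount
    return dict(totals)

print(windowed_totals(events))
# window 0: acc-1 -> 10.0, acc-2 -> 5.0; window 1: acc-1 -> 10.0
```

A real deployment would replace the list with a consumer loop and let the streaming engine (Kinesis Analytics, Azure Stream Analytics) manage window state and late arrivals.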
Data Engineer
This project aimed to establish robust data management and governance practices within a global team using Azure DevOps. The work involved implementing data governance frameworks, developing data dictionaries, and creating data lineage to ensure data quality and compliance. The project targeted improving data management efficiency, facilitating data analysis, and enabling better decision-making processes for the organization.
- Collaborated with stakeholders to define data requirements and created a comprehensive Data Glossary;
- Developed a robust Data Dictionary and established Data Lineage to ensure effective data governance;
- Designed and validated Technical Data Quality Rules and Business Rules using SQL, contributing to improved data quality;
- Implemented validation and consistency pipelines using SSIS/SQL to automate data validation processes;
- Developed and implemented a normalized Data Warehouse model, resulting in significant performance improvements when connecting with PowerBI;
- Created pipelines using SSIS for ETL/ELT processes, enabling efficient data extraction, transformation, and loading;
- Designed and implemented Data Marts for different business areas, facilitating targeted data analysis;
- Conducted Data Modeling for both normalized and denormalized databases, ensuring efficient data storage and retrieval;
- Utilized data mining techniques to identify insights and trends, contributing to informed decision-making;
- Managed Master Data and worked as part of a DevOps team to ensure smooth deployment and maintenance of the solution.
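Technical data-quality rules of this kind usually boil down to named, declarative checks applied per record. A hedged Python sketch of the pattern (the rule names and fields are illustrative, not the project's actual rules, which were expressed in SQL/SSIS):

```python
# Minimal data-quality rule engine: each rule is a name plus a
# predicate applied row by row; failures are collected for review.
rules = {
    "customer_id_not_null": lambda r: r["customer_id"] is not None,
    "premium_non_negative": lambda r: r["premium"] >= 0,
    "status_in_domain":     lambda r: r["status"] in {"active", "lapsed", "cancelled"},
}

def run_rules(rows, rules):
    failures = []
    for i, row in enumerate(rows):
        for name, check in rules.items():
            if not check(row):
                failures.append((i, name))
    return failures

sample = [
    {"customer_id": 1,    "premium": 120.0, "status": "active"},
    {"customer_id": None, "premium": -5.0,  "status": "unknown"},
]
print(run_rules(sample, rules))
```

Keeping rules as named predicates mirrors how a Data Dictionary entry maps to an executable validation, which is what makes automating the checks in a pipeline straightforward.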
Developer Data Engineer
The project aimed to modernize and optimize data management and analysis processes for Life and Pension services.
- Contributed as a developer and data engineer within the development team;
- Conducted ETL/ELT processes to manipulate, transform, and load data for analysis;
- Played a key role in modernizing data management processes and infrastructure;
- Successfully migrated data warehouses to cloud platforms for improved scalability and accessibility;
- Led the migration of dashboards from PowerPoint to Tableau for enhanced visualization and interactivity;
- Implemented a centralized data warehouse for operational indicators, improving data consistency and accessibility;
- Demonstrated domain expertise in life and pension services;
- Worked collaboratively across the data, reports, indicators, and models teams;
- Provided support to teams in business/data analytics, UX/UI, and data architecture to influence solution design and implementation;
- Migrated System-of-Record (SOR) databases from Oracle and Microsoft SQL Server environments to Google Cloud Platform (GCP), ensuring seamless data storage and accessibility;
- Developed batch processing pipelines to efficiently handle large data volumes, facilitating timely data analysis and reporting;
- Utilized skills in Google Cloud Platform (GCP), Tableau, data governance, data engineering, SQL, ETL processes, Microsoft SQL Server, Oracle SQL Developer, Oracle Database, Python, PySpark, and SAS to contribute to project success.
Data Engineer
The project was aimed at developing a comprehensive data warehouse encompassing life and pension information. It included detailed sales data, comparisons of budget versus actuals, breakdowns by sales channels and branches, and commercial productivity. The primary objective was to centralize and streamline data for better business intelligence and decision-making.
- Designed and implemented the architecture of the data warehouse, ensuring scalability and efficiency.
- Optimized data processing pipelines using PySpark/SparkSQL, enhancing performance and data throughput.
- Conducted automated testing to ensure data integrity and reliability of the data warehouse.
- Utilized Google Cloud Platform (GCP) technologies, including BigQuery, Cloud Storage, and Dataflow, for robust data management and analysis.
- Employed Airflow for workflow orchestration, ensuring smooth and automated ETL processes.
- Integrated SQL Server for database management, supporting complex queries and data storage.
- Developed dashboards and reports in Tableau, providing insightful analytics and visualizations for business stakeholders.
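The orchestration role Airflow plays above comes down to running ETL tasks in dependency order. A minimal stand-in sketch in plain Python using the standard library's topological sorter (task names are illustrative; a real deployment would declare these as Airflow operators in a DAG file):

```python
from graphlib import TopologicalSorter

# Illustrative ETL dependency graph: each task maps to its upstream tasks.
dag = {
    "extract_sales":   set(),
    "extract_budget":  set(),
    "transform_facts": {"extract_sales", "extract_budget"},
    "load_warehouse":  {"transform_facts"},
    "refresh_tableau": {"load_warehouse"},
}

# A valid execution order: every task runs after all of its upstreams.
order = list(TopologicalSorter(dag).static_order())
print(order)
```

Airflow adds scheduling, retries, and parallel execution of independent branches (the two extracts here) on top of exactly this ordering guarantee.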