Rodrigo
From Brazil (GMT-3)
7 years of commercial experience
Lemon.io stats
Rodrigo is a seasoned Data Engineer with over 5 years of experience. His areas of expertise include Python, Scala, and SQL. He is adept at designing and implementing data pipelines for real-time analytics and at deploying machine learning projects with Python, TensorFlow, and Pandas. His proficiency extends to database systems such as MS SQL Server, Databricks, Snowflake, and PostgreSQL, and his work has improved decision-making processes in sectors such as Finance and Telecommunications, among others. Rodrigo excels in dynamic environments and drives projects that increase operational efficiency and intelligence.
Main technologies
Python, Apache Airflow, SQL
Additional skills
Ready to start
ASAP
Direct hire
Potentially possible
Experience Highlights
Senior Technical Leader
A migration project for the world's largest company by market capitalization, which is also the largest smartphone manufacturer by volume. The project covers migrating Apache Airflow, including its workflows and related components, to AWS: assessing the current environment, designing the AWS infrastructure, planning the migration, deployment and configuration, data migration, testing, documentation, and post-migration support.
- Created and implemented ELT pipelines using Airflow, Snowflake, DBT, and AWS services like Glue/S3;
- Optimized SQL and Jinja code for data transformation within DBT, improving API response times by 25% through streamlined data processing and efficient query optimization;
- Developed data models that support business requirements, optimizing query performance by 20% while decreasing resource utilization by 10%;
- Implemented data quality checks and DBT tests to ensure the accuracy and completeness of the data, achieving a 15% increase in data reliability (a pipeline sketch follows this list).
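A minimal sketch of what such an Airflow-orchestrated DBT run can look like. The DAG id, schedule, and project path are placeholders, not the project's actual configuration:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical ELT orchestration: raw data is assumed to land in Snowflake
# via Glue/S3; DBT then builds the models and runs the data quality tests.
with DAG(
    dag_id="elt_snowflake_dbt",  # placeholder name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt",
    )
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/dbt",
    )
    dbt_run >> dbt_test  # tests run only after the models build successfully
```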
Senior Data Engineer
A biomedical engineering project built around code that reads an XML file and extracts data from it. The extracted data is then used to create nodes and relationships in a Neo4j graph database. The code used the py2neo library to connect to the database and the xmltodict library to parse the XML file (a minimal loading sketch follows the highlights below).
- Designed and implemented the ETL process to ingest XML data from the UniProt database and transform it for optimal storage and querying within a Neo4j graph database;
- Configured and managed the Neo4j database, ensuring data integrity and optimizing queries for performance enhancements;
- Utilized Apache Airflow to orchestrate the data pipeline workflows, ensuring orderly and efficient execution of data processing tasks;
- Worked with Docker containers for various project components, including Neo4j, Python applications, and Airflow, to maintain consistent environments across development and production;
- Implemented robust testing strategies to validate the data pipeline and its integration with the graph database, ensuring accurate data and reliable system performance;
- Created comprehensive documentation for the data pipeline architecture and setup and developed reports to communicate insights derived from the data to stakeholders.
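A compressed sketch of the core pattern (XML → dict → graph) using the two libraries named above. The file name, credentials, and element paths are invented; real UniProt entries have a richer, more nested structure:

```python
import xmltodict
from py2neo import Graph, Node, Relationship

# Hypothetical connection details and a simplified UniProt-like layout;
# the real element paths depend on the export format.
graph = Graph("bolt://localhost:7687", auth=("neo4j", "password"))

with open("uniprot_entry.xml", "rb") as f:
    doc = xmltodict.parse(f)

entry = doc["uniprot"]["entry"]
protein = Node("Protein", accession=entry["accession"], name=entry["name"])
organism = Node("Organism", name=entry["organism"]["name"])

# merge() keeps reruns idempotent: nodes are updated, not duplicated.
graph.merge(protein, "Protein", "accession")
graph.merge(organism, "Organism", "name")
graph.create(Relationship(protein, "FOUND_IN", organism))
```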
Senior Data Engineer
ML project for robust model development and deployment to enhance predictive accuracy for the Phoenix team of the Brain project conducting Score analysis for legal entities of a Brazilian Bank.
- Developed OLAP cubes and deployed an Azure Machine Learning project, incorporating TensorFlow and Pandas for predictive modeling, and applied MLOps practices to make AI model deployment more efficient (see the sketch after this list);
- Focused on enhancing predictive accuracy for score analysis of legal entities at a Brazilian bank, reducing response time from 2 days to under 2 hours.
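The bank's data and model are confidential, so the following is only an illustrative Pandas + TensorFlow shape for tabular score modeling, with a made-up file and target column:

```python
import pandas as pd
import tensorflow as tf

# Entirely hypothetical dataset: tabular features with a binary target,
# standing in for the confidential score-analysis data.
df = pd.read_csv("legal_entities.csv")
X = df.drop(columns=["defaulted"]).to_numpy(dtype="float32")
y = df["defaulted"].to_numpy(dtype="float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=[tf.keras.metrics.AUC()],
)
model.fit(X, y, epochs=5, batch_size=256, validation_split=0.2)
```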
Senior Data Engineer
The project involved aggregating WiFi modem data from the customer's database. This data was then fed into a Qlik dashboard, enabling real-time KPI monitoring and significantly boosting operational efficiency for a leading telecommunications provider.
- Executed advanced T-SQL scripting to automate and optimize database tasks, reducing processing times by over 30%;
- Collaborated closely with cross-functional teams to deploy the SSIS package;
- Contributed to the integration of generative AI models into the data pipeline utilizing Databricks on Azure and AWS EMR, enhancing predictive analytics capabilities and supporting collaboration with AI engineers and data scientists;
- Integrated secure data access protocols with OAuth, employed Postman for robust API testing, and managed data security using Azure Identity, reducing data processing times by 30%;
- Developed ETL routines using PySpark, SQL, and Hadoop to streamline data processing and integration for the client's data engineering team, resulting in a 25% reduction in data processing time (sketched below).
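To illustrate the kind of PySpark routine this bullet describes, here is a sketch with invented paths and column names (the provider's actual telemetry schema is proprietary):

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical layout: one row per modem heartbeat, rolled up into the
# daily KPIs a dashboard such as Qlik could read.
spark = SparkSession.builder.appName("modem_kpis").getOrCreate()

modems = spark.read.parquet("s3://telco-raw/modem_telemetry/")
daily_kpis = (
    modems
    .withColumn("day", F.to_date("event_ts"))
    .groupBy("day", "region")
    .agg(
        F.avg("signal_strength").alias("avg_signal"),
        F.countDistinct("modem_id").alias("active_modems"),
    )
)
daily_kpis.write.mode("overwrite").parquet("s3://telco-curated/daily_kpis/")
```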
Data Engineer
The product aims to enhance the learning experience and outcomes for students by providing real-time analytics on their learning progress, while also offering course creators and instructors actionable insights to improve course content and delivery.
- Redesigned the data storage strategy by implementing PostgreSQL and MongoDB to efficiently manage both structured and unstructured data;
- Cleaned, transformed, and prepared complex data for analysis, using Power BI for data exploration and storytelling; this improved data comprehension and decision-making for an edtech company's analytics project;
- Implemented high-throughput data processing solutions using Python's psycopg2 and PySpark for PostgreSQL databases, resulting in a 15% reduction in processing time (a batching sketch follows this list). This optimization enhanced data accessibility and strengthened the data science team's analytical capabilities;
- Refactored on-premises pipelines into Azure Cloud infrastructure, increasing scalability and reliability for the data engineering project. This initiative led to a 25% decrease in pipeline processing time.
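One common way to get this kind of throughput from psycopg2 is batching writes with execute_values; a minimal sketch with an invented DSN and table:

```python
import psycopg2
from psycopg2.extras import execute_values

# Hypothetical schema: batched inserts cut client-server round trips
# versus per-row INSERTs, a typical lever for throughput gains.
conn = psycopg2.connect("dbname=learning_analytics user=etl")
rows = [(1, "lesson_completed"), (2, "quiz_passed")]  # sample events

with conn, conn.cursor() as cur:  # commits on success, rolls back on error
    execute_values(
        cur,
        "INSERT INTO student_events (student_id, event_type) VALUES %s",
        rows,
    )
conn.close()
```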
Data Engineer/Analytic Engineer
The project aimed to develop a data-driven supply chain analytics platform. The platform was designed to integrate smoothly with existing supply chain management systems and offer predictive analytics on various aspects, such as inventory levels, demand forecasting, and supplier performance.
- Developed the data processing pipeline using Spark and Python to analyze vast amounts of supply chain data, enhancing the capability to derive real-time insights and predictive analytics (see the sketch after this list);
- Configured and managed PostgreSQL and MongoDB databases, ensuring efficient data storage and rapid retrieval for analytics purposes and facilitating real-time decision-making in supply chain operations;
- Utilized Azure Data Factory and SSIS packages to streamline ETL workflows, improving data accuracy and availability while reducing processing times by 25%;
- Created dynamic, interactive dashboards in Power BI, offering comprehensive visibility into inventory levels, supplier performance, and demand forecasts, enabling data-driven decision-making across the supply chain.
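A sketch of the demand-forecasting side, reduced to one representative feature (a 7-day moving average of demand per SKU) over an invented orders schema:

```python
from pyspark.sql import SparkSession, Window, functions as F

# Hypothetical orders table: one row per SKU per day. The rolling average
# is the kind of signal a downstream forecasting model might consume.
spark = SparkSession.builder.appName("demand_features").getOrCreate()

orders = spark.read.parquet("s3://supply-chain/orders/")
w = Window.partitionBy("sku").orderBy("order_date").rowsBetween(-6, 0)

features = orders.withColumn(
    "demand_7d_avg", F.avg("units_ordered").over(w)
)
features.write.mode("overwrite").parquet("s3://supply-chain/features/")
```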