Rodrigo – Python, Apache Airflow, SQL, experts in Lemon.io

Rodrigo

From Brazil (GMT-3)

Senior Data Engineer
7 years of commercial experience
AI
Analytics
Architecture
Banking
Biotech
Business intelligence
Cloud computing
Computer science
Data analytics
E-learning
Edtech
Healthtech
Logistics
Machine learning
Telecommunications
Platforms

Rodrigo – Python, Apache Airflow, SQL

Rodrigo is a seasoned Data Engineer with over 5 years of experience. His areas of expertise include Python, Scala, and SQL. He is adept at designing and implementing data pipelines for real-time analytics and deploying machine learning projects using Python, TensorFlow, and Pandas. Rodrigo's proficiency extends to various database systems such as MS SQL Server, Databricks, Snowflake, and PostgreSQL. His work has resulted in improved decision-making processes in sectors such as Finance, Telecommunications, and more. Rodrigo excels in dynamic environments and drives projects that increase operational efficiency and intelligence.

Main technologies
Python: 5 years
Apache Airflow: 4 years
SQL: 5 years
Apache Spark: 5 years
Microsoft Azure: 4 years
Additional skills
AWS
Elasticsearch
GitLab
Kafka
Docker Compose
API
Scikit-learn
Jira
Machine learning
GraphQL
Terraform
Data analysis
Bash
Git
PyTorch
MongoDB
Azure DevOps
Keras
Kubernetes
TensorFlow
Scala
PostgreSQL
Data visualization
MySQL
Data Science
Neo4j
ETL
PySpark
Data Modeling
Apache Hadoop
Amazon RDS
Amazon S3
Amazon SQS
Splunk
AWS Lambda
Amazon EC2
Apache Kafka
Big Data
Data Warehouse
Microsoft Power BI
Databricks
Tableau
Datadog
Ready to start: ASAP
Direct hire: Potentially possible

Experience Highlights

Senior Technical Leader
May 2023 - Feb 2024 (9 months)
Project Overview

This is a migration project for the world's largest company by market capitalization and smartphone manufacturer by volume. The project covers the migration of Apache Airflow, including its workflows and related components, to AWS. It includes assessing the current environment, designing AWS infrastructure, migration planning, deployment and configuration, data migration, testing, documentation, and post-migration support.

Responsibilities:
  • Created and implemented ELT pipelines using Airflow, Snowflake, DBT, and AWS services like Glue/S3;
  • Optimized SQL and Jinja code for data transformation within DBT, improving API response times by 25% through streamlined data processing and more efficient queries;
  • Developed data models that support business requirements, optimizing query performance by 20% while decreasing resource utilization by 10%;
  • Implemented data quality checks and DBT tests to ensure the accuracy and completeness of the data, achieving a 15% increase in data reliability.
Project Tech stack:
Apache Airflow
AWS
Docker Compose
GraphQL
MongoDB
PostgreSQL
MySQL
Python
Amazon RDS
Amazon S3
Amazon EC2
AWS Lambda
Terraform
Bitbucket
Elasticsearch
GitHub Actions
Kubernetes
Splunk
API Gateway
API testing
Apache Kafka
Tableau
Containers
Data Warehouse
Big Data
Database design
Datadog
Senior Data Engineer
Feb 2023 - Apr 2023 (2 months)
Project Overview

A biomedical engineering project built around code that reads an XML file and extracts data from it. The extracted data is then used to create nodes and relationships in a Neo4j graph database. The code used the py2neo library to connect to the database and the xmltodict library to parse the XML file.
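
The XML-to-graph step described above can be illustrated with a minimal sketch. It is not the project's code: it uses Python's standard-library XML parser instead of xmltodict, it stops before the database write that py2neo would handle, and the element names, the `FOUND_IN` relationship type, and the `xml_to_graph` function are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified XML; a real protein-database export is richer and namespaced.
SAMPLE_XML = """
<entries>
  <entry accession="P12345">
    <name>EXAMPLE_PROTEIN</name>
    <organism>Homo sapiens</organism>
  </entry>
  <entry accession="Q67890">
    <name>OTHER_PROTEIN</name>
    <organism>Homo sapiens</organism>
  </entry>
</entries>
"""


def xml_to_graph(xml_text):
    """Return (nodes, relationships) extracted from the XML text."""
    root = ET.fromstring(xml_text.strip())
    nodes = {}          # keyed by (label, id) so repeated organisms are stored once
    relationships = []  # tuples of (start_key, relationship_type, end_key)
    for entry in root.findall("entry"):
        protein_key = ("Protein", entry.get("accession"))
        nodes[protein_key] = {"name": entry.findtext("name")}
        organism = entry.findtext("organism")
        if organism:
            organism_key = ("Organism", organism)
            nodes.setdefault(organism_key, {})
            # Assumed relationship type, chosen only for illustration.
            relationships.append((protein_key, "FOUND_IN", organism_key))
    return nodes, relationships


nodes, relationships = xml_to_graph(SAMPLE_XML)
```

Keying nodes by label and identifier keeps shared entities (here, the organism) from being duplicated, which mirrors what a merge-style write to a graph database would do.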

Responsibilities:
  • Designed and implemented the ETL process to ingest XML data from the UniProt database and transform it for optimal storage and querying within a Neo4j graph database;
  • Configured and managed the Neo4j database, ensuring data integrity and optimizing queries for performance enhancements;
  • Utilized Apache Airflow to orchestrate the data pipeline workflows, ensuring orderly and efficient execution of data processing tasks;
  • Worked with Docker containers for various project components, including Neo4j, Python applications, and Airflow, to maintain consistent environments across development and production;
  • Implemented robust testing strategies to validate the data pipeline and its integration with the graph database, ensuring accurate data and reliable system performance;
  • Created comprehensive documentation for the data pipeline architecture and setup and developed reports to communicate insights derived from the data to stakeholders.
Project Tech stack:
Apache Airflow
Python
Neo4j
Docker
Bash
Linux
Senior Data Engineer
Sep 2022 - Apr 2023 (7 months)
Project Overview

An ML project for robust model development and deployment, aimed at improving predictive accuracy for the Phoenix team of the Brain project, which conducts score analysis for legal entities of a Brazilian bank.

Responsibilities:
  • Developed OLAP cubes and deployed an Azure Machine Learning project, incorporating TensorFlow and Pandas for predictive modeling, and demonstrated proficiency in MLOps practices to enhance AI model deployment efficiency;
  • Focused on enhancing predictive accuracy for score analysis of legal entities in a Brazilian bank. As a result, successfully reduced response time from 2 days to under 2 hours.
Project Tech stack:
Python
TensorFlow
API
Pandas
PyTorch
PySpark
pytest
Keras
Scikit-learn
Machine learning
SQL
Data Science
Data analysis
ETL
Azure DevOps
Apache Kafka
Data Modeling
Data Warehouse
Big Data
Algorithms and Data Structures
Senior Data Engineer
Dec 2021 - Jul 2022 (7 months)
Project Overview

The project involved aggregating Wi-Fi modem data from the customer's database. This data was then integrated into a Qlik dashboard, enabling real-time KPI monitoring and boosting operational efficiency for a leading telecommunications provider.

Responsibilities:
  • Developed OLAP cubes and deployed an Azure Machine Learning project, incorporating TensorFlow and Pandas for predictive modeling. Demonstrated proficiency in MLOps practices to enhance AI model deployment efficiency. Focused on enhancing predictive accuracy for score analysis of legal entities in a Brazilian bank, successfully reducing response time from 2 days to under 2 hours;
  • Executed advanced T-SQL scripting to automate and optimize database tasks, reducing processing times by over 30%;
  • Collaborated closely with cross-functional teams to deploy the SSIS package;
  • Contributed to the integration of generative AI models into the data pipeline utilizing Databricks on Azure and AWS EMR, enhancing predictive analytics capabilities and supporting collaboration with AI engineers and data scientists;
  • Integrated secure data access protocols with OAuth, employed Postman for robust API testing, and managed data security using Azure Identity, reducing data processing times by 30%;
  • Developed ETL routines using PySpark, SQL, and Hadoop to streamline data processing and integration for the bank’s data engineering team, resulting in a 25% reduction in data processing time.
Project Tech stack:
Scala
Jira
Confluence
Apache Hadoop
ETL
Apache Kafka
MongoDB
GitLab
Python
Data Modeling
Data visualization
Data Engineer
Dec 2020 - Jun 2021 (6 months)
Project Overview

The product aims to enhance the learning experience and outcomes for students by providing real-time analytics on their learning progress, while also offering course creators and instructors actionable insights to improve course content and delivery.

Responsibilities:
  • Redesigned the data storage strategy by implementing PostgreSQL and MongoDB to efficiently manage both structured and unstructured data;
  • Treated, manipulated, and prepared complex data for analysis, utilizing Power BI for data exploration and storytelling. This improved data comprehension and decision-making for an edtech company's analytics project;
  • Implemented high-throughput data processing solutions using Python's psycopg2 and PySpark for PostgreSQL databases, resulting in a 15% reduction in processing time. This optimization enhanced data accessibility and strengthened the data science team's analytical capabilities;
  • Refactored on-premises pipelines into Azure Cloud infrastructure, increasing scalability and reliability for the data engineering project. This initiative led to a 25% decrease in pipeline processing time.
Project Tech stack:
Apache Spark
PySpark
Azure DevOps
Databricks
Microsoft Azure
Microsoft Power BI
MySQL
SQL
PostgreSQL
ETL
Apache Hadoop
API
Docker
Data Engineer/Analytics Engineer
May 2018 - Nov 2020 (2 years 6 months)
Project Overview

The project aimed to develop a data-driven supply chain analytics platform. The platform was designed to integrate smoothly with existing supply chain management systems and offer predictive analytics on various aspects, such as inventory levels, demand forecasting, and supplier performance.

Responsibilities:
  • Developed the data processing pipeline using Spark and Python to analyze vast amounts of supply chain data, enhancing the capability to derive real-time insights and predictive analytics;
  • Configured and managed PostgreSQL and MongoDB databases, ensuring efficient data storage and rapid retrieval for analytics purposes and facilitating real-time decision-making in supply chain operations;
  • Utilized Azure Data Factory and SSIS packages to streamline ETL workflows, improving data accuracy and availability while reducing processing times by 25%;
  • Created dynamic, interactive dashboards in Power BI, offering comprehensive visibility into inventory levels, supplier performance, and demand forecasts, enabling data-driven decision-making across the supply chain.
Project Tech stack:
Python
Apache Spark
PySpark
Microsoft Azure
Databricks
AWS
PostgreSQL
SQL
ETL
Apache Hadoop
MySQL
API
Microsoft Power BI

Education

2022
Engineering / Civil Engineer
BSc.
2024
Data Architecture / AAS. in Data Architecture
AAS.
2024
Engineering / Biomedical Engineering
Specialization
2024
Engineering / Big Data Engineering
Specialization

Copyright © 2024 lemon.io. All rights reserved.