
Raphael

From Brazil

Data Engineer | Senior

Raphael – Python, SQL, PySpark

Raphael is a senior data engineer with 8 years of experience, specializing in AWS-based data pipelines built with S3, Lambda, Glue, Athena, Redshift, and Databricks. He has hands-on expertise in batch and near-real-time ingestion, medallion-style data layering, and cost-efficient data lake design. Feedback highlights practical delivery skills, clear communication, and a client-first approach, though his architectural expertise leans more toward hands-on implementation than deep specialization.

8 years of commercial experience in
AI
Banking
Fintech
Healthcare
Retail
Telecommunications
Main technologies
  • Python: 6 years
  • SQL: 8 years
  • PySpark: 3 years
  • AWS: 5 years
  • Cloud Computing: 6 years
Additional skills
Apache Spark
Airflow
Unit testing
Claude LLM
GitHub Copilot
Terraform
CloudWatch
AWS CodeBuild
ETL
Databricks
Amazon ECS
Amazon S3
DynamoDB
Amazon EC2
AWS Lambda
PL/SQL
Microsoft SQL Server
Transact-SQL (T-SQL)
Data Modeling
Oracle
Direct hire
Possible

Experience Highlights

Senior Data Engineer
Jun 2025 - Ongoing · 9 months
Project Overview

A healthcare technology company that provides solutions to improve medication intelligence, compliance, and supply chain visibility across hospitals and pharmacies.

Raphael worked on data platforms powering 340B-related products used by hospitals and pharmacies across the U.S. These systems ingest data from multiple sources, including electronic health records (EHRs), pharmacy systems, and third-party APIs.

His role focused on building scalable AWS-based pipelines to process, standardize, and validate this data, enabling accurate tracking of drug transactions and compliance auditing.

Key features included data validation, compliance checks, and near-real-time processing, ensuring reliable, auditable data for healthcare providers operating under strict regulatory requirements.

Responsibilities:
  • Designed, developed, and maintained scalable AWS-based data pipelines for healthcare clients.
  • Provisioned and managed cloud infrastructure using Terraform (IaC) for secure, consistent deployments.
  • Developed and optimized ETL processes using Python and PySpark to process and transform large-scale datasets.
  • Orchestrated data pipelines using AWS Step Functions and Airflow.
  • Implemented CI/CD pipelines using GitHub Actions integrated with AWS CodeBuild.
  • Used GitHub Copilot and Claude LLM to accelerate ETL development and improve code quality.
  • Developed and expanded unit testing frameworks using pytest to increase test coverage.
  • Ensured data security, reliability, and compliance aligned with healthcare requirements.
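
The validation and compliance checks described above can be sketched as record-level rule evaluation. The following is an illustrative example only: the field names (ndc, quantity, facility_id) and the 11-digit NDC normalization rule are assumptions for the sketch, not the client's actual schema or logic.

```python
# Hypothetical sketch of record-level validation of the kind described above.
# Field names and rules are illustrative assumptions, not the real schema.

REQUIRED_FIELDS = ("ndc", "quantity", "facility_id")

def validate_transaction(record: dict) -> list[str]:
    """Return a list of human-readable validation errors (empty list = valid)."""
    errors = []
    # Presence checks: every required field must be set and non-empty.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            errors.append(f"missing required field: {field}")
    # Range check: dispensed quantity must be a positive number.
    qty = record.get("quantity")
    if qty is not None and (not isinstance(qty, (int, float)) or qty <= 0):
        errors.append("quantity must be a positive number")
    # Format check: NDCs are commonly normalized to an 11-digit code.
    ndc = record.get("ndc", "")
    if ndc and not (ndc.isdigit() and len(ndc) == 11):
        errors.append("ndc must be an 11-digit code")
    return errors
```

Returning a list of errors rather than raising on the first failure lets a pipeline quarantine a bad record with all of its problems attached, which suits the auditability requirement mentioned above.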
Project Tech stack:
AWS
Terraform
ETL
Python
PySpark
Airflow
GitHub Actions
AWS CodeBuild
GitHub Copilot
Claude LLM
Unit testing
CloudWatch
Senior Data Platform Engineer
Feb 2024 - Jun 2025 · 1 year 4 months
Project Overview

A Brazilian fintech platform that provides financial services and payment infrastructure for small and medium-sized businesses. The product enables companies to manage billing, payments, and cash flow through a centralized platform.

Raphael worked on the data platform that supports both operational and analytical use cases, serving internal teams such as finance, risk, and product, and enabling data-driven decision-making across the company.

The platform ingests data from multiple sources, including transactional systems, external integrations, and event streams. It processes and transforms this data using scalable pipelines built on AWS and Databricks, making it available for analytics, reporting, and downstream services.

Key features included reliable data ingestion, event-driven processing, data standardization, and orchestration of complex workflows using Airflow.

Responsibilities:
  • Designed and implemented scalable and resilient data architectures.
  • Built serverless data pipelines using AWS and Databricks.
  • Implemented data ingestion from different sources using Amazon DMS.
  • Orchestrated data pipelines with Airflow.
  • Implemented monitoring and management mechanisms for data pipelines.
  • Ensured data security and compliance with industry standards.
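
The medallion-style layering used across these platforms (raw "bronze" data refined into "silver" and aggregated "gold" layers) can be sketched as a path convention. The bucket name and prefix layout below are illustrative assumptions, not the platform's actual layout.

```python
# Minimal sketch of medallion-style lake layout (bronze -> silver -> gold).
# Bucket name and prefix convention are assumptions for illustration.

LAYERS = ("bronze", "silver", "gold")

def lake_path(layer: str, domain: str, table: str,
              bucket: str = "my-datalake") -> str:
    """Build an S3 prefix like s3://my-datalake/silver/payments/invoices/."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer!r}, expected one of {LAYERS}")
    return f"s3://{bucket}/{layer}/{domain}/{table}/"
```

Centralizing the convention in one helper keeps ingestion jobs, Airflow DAGs, and Databricks notebooks pointed at consistent prefixes as tables move between layers.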
Project Tech stack:
AWS
Databricks
Airflow
PySpark
Python
Terraform
Data Modeling
Senior Data Engineer
Apr 2021 - Feb 2024 · 2 years 10 months
Project Overview

A technology platform focused on the retail supply chain, connecting industries, distributors, and small retailers into a single digital ecosystem. The product helps optimize commercial execution, improve sales performance, and enable better decision-making through data.

Raphael worked on building cloud-based data platforms that supported this ecosystem, ingesting and processing data from multiple sources, including sales systems, distributors, and APIs. These pipelines enabled both batch and near-real-time use cases across the platform.

The solutions provided visibility into sales, inventory, and market behavior, helping retailers and suppliers act more efficiently.

Key features included scalable data ingestion, event-driven processing, and reliable data pipelines designed to support real-time insights and operational decision-making across the supply chain.

Responsibilities:
  • Designed and implemented scalable and resilient data architectures.
  • Built AWS serverless data pipelines using AWS Lambda, Amazon ECS, Amazon EC2, Amazon S3, DynamoDB, and EventBridge.
  • Leveraged event-driven data processing for real-time solutions.
  • Collected, transformed, and moved data from diverse sources for integration.
  • Implemented monitoring and management mechanisms for data pipelines.
  • Ensured data security and compliance with industry standards.
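
The event-driven pattern above typically means a Lambda function triggered by S3 object-created notifications. A minimal sketch, not the project's actual code, of such a handler (the bucket contents and downstream step are placeholders):

```python
import json
from urllib.parse import unquote_plus

# Illustrative sketch of an event-driven AWS Lambda handler: S3 object-created
# notifications arrive as an event with a "Records" list, and the handler
# extracts bucket/key pairs for downstream processing. Keys in S3 events are
# URL-encoded, hence unquote_plus.

def handler(event: dict, context=None) -> dict:
    processed = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        key = unquote_plus(s3.get("object", {}).get("key", ""))
        if bucket and key:
            # Placeholder: here the real pipeline would read the object,
            # transform it, and write to DynamoDB or another S3 prefix.
            processed.append({"bucket": bucket, "key": key})
    return {"statusCode": 200, "body": json.dumps({"processed": processed})}
```

Because each invocation handles one notification independently, throughput scales with event volume without any servers to manage, which is the appeal of the serverless design mentioned above.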
Project Tech stack:
AWS
AWS Lambda
Amazon ECS
Amazon EC2
Amazon S3
DynamoDB
Senior Data Engineer
Feb 2019 - May 2021 · 2 years 3 months
Project Overview

A consultancy engagement for Claro, one of the largest telecommunications companies in Brazil, providing mobile, broadband, and digital services to millions of customers. The project focused on building and maintaining data solutions to support business intelligence and operational reporting across the company.

Raphael worked on developing and maintaining ETL pipelines using Informatica PowerCenter and Oracle (PL/SQL), processing large volumes of telecom data such as customer activity, billing, and service usage.

These pipelines enabled internal teams to generate reports and insights used for decision-making, performance monitoring, and operational analysis.

Key features included data integration across multiple systems, ETL performance optimization, and ensuring data consistency and reliability across environments, with a strong focus on production stability.

Responsibilities:
  • Performed analysis and data modeling for business intelligence.
  • Developed PL/SQL routines in Oracle.
  • Built ETL processes in Informatica PowerCenter.
  • Maintained shell scripts.
  • Monitored approval stages and production/post-production deployments.
  • Improved ETL process performance.
  • Managed files and folders in Unix environments.
Project Tech stack:
PL/SQL
Oracle
ETL
Informatica PowerCenter

Education

2022
Big Data and Data Science (MBA)
2017
Computer Engineering

Languages

English
Advanced

Hire Raphael or someone with similar qualifications in days
All developers are ready for interview and are just waiting for your request.
Copyright © 2026 lemon.io. All rights reserved.