Raphael – Python, SQL, PySpark
Raphael is a senior data engineer with 8 years of experience, specializing in AWS-based data pipelines using S3, Lambda, Glue, Athena, Redshift, and Databricks. He has hands-on expertise in batch and near-real-time ingestion, medallion-style data layering, and cost-efficient data lake design. Feedback highlights practical delivery skills, clear communication, and a client-first approach, though his architectural depth leans toward implementation rather than deep specialization.
Experience Highlights
Senior Data Engineer
A healthcare technology company that provides solutions to improve medication intelligence, compliance, and supply chain visibility across hospitals and pharmacies.
Raphael worked on data platforms powering 340B-related products used by hospitals and pharmacies across the U.S. These systems ingest data from multiple sources, including electronic health records (EHRs), pharmacy systems, and third-party APIs.
His role focused on building scalable AWS-based pipelines to process, standardize, and validate this data, enabling accurate tracking of drug transactions and compliance auditing.
Key features included data validation, compliance checks, and near-real-time processing, ensuring reliable, auditable data for healthcare providers operating under strict regulatory requirements.
- Designed, developed, and maintained scalable AWS-based data pipelines for healthcare clients.
- Provisioned and managed cloud infrastructure using Terraform (IaC) for secure, consistent deployments.
- Developed and optimized ETL processes using Python and PySpark to process and transform large-scale datasets.
- Orchestrated data pipelines using AWS Step Functions and Airflow.
- Implemented CI/CD pipelines using GitHub Actions integrated with AWS CodeBuild.
- Used GitHub Copilot and Claude LLM to accelerate ETL development and improve code quality.
- Developed and expanded unit testing frameworks using pytest to increase test coverage.
- Ensured data security, reliability, and compliance aligned with healthcare requirements.
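The validation and pytest work above can be sketched with a minimal, hypothetical example: a transform that standardizes a raw drug-transaction record, plus pytest-style unit tests. All names here (`standardize_transaction`, the field names, the NDC format check) are illustrative assumptions, not the client's actual schema or rules.

```python
from datetime import datetime

# Hypothetical transform: standardize a raw drug-transaction record.
# Field names and the 11-digit NDC check are illustrative assumptions.
def standardize_transaction(raw: dict) -> dict:
    ndc = raw.get("ndc", "").replace("-", "").strip()
    if len(ndc) != 11 or not ndc.isdigit():
        raise ValueError(f"invalid NDC: {raw.get('ndc')!r}")
    quantity = int(raw["quantity"])
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return {
        "ndc": ndc,
        "quantity": quantity,
        # Normalize to an ISO date for downstream compliance auditing.
        "dispensed_at": datetime.strptime(raw["date"], "%m/%d/%Y").date().isoformat(),
    }

# pytest-style unit tests (run with `pytest`, or call directly).
def test_standardize_valid_record():
    out = standardize_transaction(
        {"ndc": "00002-3227-30", "quantity": "5", "date": "01/15/2024"}
    )
    assert out == {"ndc": "00002322730", "quantity": 5, "dispensed_at": "2024-01-15"}

def test_standardize_rejects_bad_ndc():
    try:
        standardize_transaction({"ndc": "123", "quantity": "1", "date": "01/15/2024"})
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

Keeping transforms as pure functions like this is what makes broad pytest coverage of an ETL codebase practical.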
Senior Data Platform Engineer
A Brazilian fintech platform that provides financial services and payment infrastructure for small and medium-sized businesses. The product enables companies to manage billing, payments, and cash flow through a centralized platform.
Raphael worked on the data platform that supports both operational and analytical use cases, serving internal teams such as finance, risk, and product, and enabling data-driven decision-making across the company.
The platform ingests data from multiple sources, including transactional systems, external integrations, and event streams. It processes and transforms this data using scalable pipelines built on AWS and Databricks, making it available for analytics, reporting, and downstream services.
Key features included reliable data ingestion, event-driven processing, data standardization, and orchestration of complex workflows using Airflow.
- Designed and implemented scalable and resilient data architectures.
- Built serverless data pipelines using AWS and Databricks.
- Implemented data ingestion from different sources using Amazon DMS.
- Orchestrated data pipelines with Airflow.
- Implemented monitoring and management mechanisms for data pipelines.
- Ensured data security and compliance with industry standards.
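The monitoring and reliability work above can be illustrated with a minimal sketch: a retry-with-backoff wrapper of the kind commonly placed around ingestion steps before failures are surfaced to the orchestrator. The wrapper, its parameters, and `flaky_ingest` are assumptions for illustration, not the platform's actual tooling.

```python
import time
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

# Illustrative retry wrapper: re-run a flaky pipeline step with
# exponential backoff, logging each failure for monitoring/alerting.
def run_with_retries(task, *, attempts: int = 3, base_delay: float = 0.01):
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception as exc:  # in practice, catch narrower error types
            log.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise  # surface to the orchestrator (e.g. Airflow) to alert
            time.sleep(base_delay * 2 ** (attempt - 1))

# Usage: a hypothetical ingestion step that fails twice, then succeeds.
calls = {"n": 0}
def flaky_ingest():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient source error")
    return "ingested"
```

Letting the final failure propagate (rather than swallowing it) is the design choice that lets Airflow's own retry and alerting machinery take over.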
Senior Data Engineer
A technology platform focused on the retail supply chain, connecting industries, distributors, and small retailers into a single digital ecosystem. The product helps optimize commercial execution, improve sales performance, and enable better decision-making through data.
Raphael worked on building cloud-based data platforms that supported this ecosystem, ingesting and processing data from multiple sources, including sales systems, distributors, and APIs. These pipelines enabled both batch and near-real-time use cases across the platform.
The solutions provided visibility into sales, inventory, and market behavior, helping retailers and suppliers act more efficiently.
Key features included scalable data ingestion, event-driven processing, and reliable data pipelines designed to support real-time insights and operational decision-making across the supply chain.
- Designed and implemented scalable and resilient data architectures.
- Built serverless data pipelines using AWS Lambda, Amazon ECS, Amazon EC2, Amazon S3, DynamoDB, and EventBridge.
- Leveraged event-driven data processing for real-time solutions.
- Collected, transformed, and moved data from diverse sources for integration.
- Implemented monitoring and management mechanisms for data pipelines.
- Ensured data security and compliance with industry standards.
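The event-driven pieces above can be sketched as a Lambda-style handler reacting to an EventBridge event. The event shape, field names, and the low-stock rule are illustrative assumptions, not the platform's actual contract.

```python
import json

# Illustrative Lambda-style handler for a hypothetical EventBridge
# "sale recorded" event; field names are assumptions for the sketch.
def handler(event: dict, context=None) -> dict:
    detail = event.get("detail", {})
    record = {
        "sku": detail["sku"],
        "quantity": int(detail["quantity"]),
        # A real pipeline would write to DynamoDB / S3; here we just
        # derive a flag the downstream consumers could act on.
        "low_stock": detail.get("stock_after", 0) < 10,
    }
    return {"statusCode": 200, "body": json.dumps(record)}

# Sample invocation with an EventBridge-shaped payload.
sample_event = {
    "source": "retail.sales",
    "detail-type": "SaleRecorded",
    "detail": {"sku": "SKU-42", "quantity": 3, "stock_after": 7},
}
```

Because the handler is a plain function taking a dict, it can be unit-tested locally with sample payloads before being wired to an EventBridge rule.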
Senior Data Engineer
A consultancy serving Claro, one of the largest telecommunications companies in Brazil, which provides mobile, broadband, and digital services to millions of customers. The project focused on building and maintaining data solutions to support business intelligence and operational reporting across the company.
Raphael worked on developing and maintaining ETL pipelines using Informatica PowerCenter and Oracle (PL/SQL), processing large volumes of telecom data such as customer activity, billing, and service usage.
These pipelines enabled internal teams to generate reports and insights used for decision-making, performance monitoring, and operational analysis.
Key features included data integration across multiple systems, ETL performance optimization, and ensuring data consistency and reliability across environments, with a strong focus on production stability.
- Performed analysis and modeling for business intelligence.
- Developed Oracle PL/SQL procedures and queries.
- Developed ETL workflows in Informatica PowerCenter.
- Maintained shell scripts.
- Monitored approval stages and production/post-production implementations.
- Improved ETL process performance.
- Manipulated files and folders on Unix.