
Mario

From Guatemala (GMT-6)

Senior Data Engineer
18 years of commercial experience
Accounting
Administration
Analytics
Asset management
Business intelligence
Cloud computing
Consulting services
Data analytics
Lemon.io stats
1
projects done
1024
hours worked
Open to new offers

Mario – Python, AWS, ETL

Versatile Data Engineer with 17 years of expertise, including leadership roles such as Team Lead and CTO of a 15-person team. Mario has strong self-presentation skills, a business-oriented approach, and demonstrated leadership qualities. During the technical interview, he justified architectural decisions well and showed proficiency in both written and theoretical SQL.

Main technologies
Python
7 years
AWS
7 years
ETL
7 years
SQL
17 years
Additional skills
Apache Airflow
Microsoft SQL Server
Data Warehouse
Apache Hadoop
Git
Apache Spark
Kubernetes
Tableau
Docker
Snowflake
Azure DevOps
Redshift
API
PostgreSQL
AWS Lambda
MySQL
Ready to start
To be verified
Direct hire
Potentially possible


Experience Highlights

Senior Data Engineer
Jul 2023 - Ongoing (1 year 3 months)
Project Overview

The project focused on enhancing the company's datamart ETL processes through the development of stored procedures in Snowflake, AWS Lambda functions, and Data Definition Language (DDL) scripts.

Responsibilities:
  • Migrated the Adyen and PayPal SFTP process pipelines;
  • Developed new Snowflake procedures;
  • Built Lambda functions for importing new Accounts Receivable Aging data;
  • Created views and procedures for Intacct and Zuora information;
  • Developed Lambda functions to process incoming files for storage in the AWS S3 data lake;
  • Wrote complex queries for views, ALTER statements, and new table creation as part of DDL scripting for ETL operations;
  • Conducted rigorous unit testing before tickets progressed to peer review and Quality Assurance (QA);
  • Deciphered and adapted codebases inherited from external teams.
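The file-processing Lambdas described above route incoming drops into a partitioned data lake. A minimal sketch of that routing logic (the bucket name, prefixes, and partitioning scheme are hypothetical, and the actual boto3 copy call is omitted):

```python
from datetime import datetime, timezone

# Illustrative only: bucket and prefix names are hypothetical,
# not taken from the actual project.
LAKE_BUCKET = "example-data-lake"

def lake_key(source_key: str, received_at: datetime) -> str:
    """Map an incoming SFTP drop (e.g. 'adyen/settlement_123.csv')
    to a date-partitioned data-lake key."""
    source, _, filename = source_key.partition("/")
    return (
        f"raw/{source or 'unknown'}/"
        f"year={received_at:%Y}/month={received_at:%m}/day={received_at:%d}/"
        f"{filename or source_key}"
    )

def handler(event, context=None):
    """Minimal S3-event-shaped Lambda handler: returns the copy plan
    (a real handler would perform the S3 copy via boto3)."""
    now = datetime.now(timezone.utc)
    plans = []
    for record in event.get("Records", []):
        key = record["s3"]["object"]["key"]
        plans.append({"source_key": key,
                      "dest_bucket": LAKE_BUCKET,
                      "dest_key": lake_key(key, now)})
    return plans
```

Date-based partitioning like this keeps downstream warehouse loads incremental, since each day's files land under their own prefix.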
Project Tech stack:
Snowflake
AWS
Pandas
Python
Amazon S3
AWS Lambda
MySQL
PostgreSQL
SQL
CTO, Senior Data Engineer, SREs
Aug 2022 - Jul 2023 (11 months)
Project Overview

This project addressed the infrastructure upgrade needs of a small firm managing data from over 15 clients, primarily received via email or WhatsApp in CSV or Excel formats. The initiative aimed to modernize their practices to industry standards. It involved deploying 15 distinct organizations, each tailored to a specific client, and establishing S3 buckets for data lake storage. Additionally, an ETL pipeline utilizing Glue/Airflow was implemented, with RDS PostgreSQL serving as the Data Warehouse solution due to budget constraints preventing the adoption of Redshift. This comprehensive overhaul enabled the firm to efficiently manage and process client data while aligning with contemporary data management practices.

Responsibilities:
  • Created 15 distinct ETL pipelines;
  • Established servers for the Data Scientist team to mitigate RAM issues on their local computers;
  • Introduced QuickSight as a BI tool, replacing R for graph creation;
  • Adopted Slack for internal communication, replacing email/WhatsApp;
  • Deployed 1Password as a standard practice for password storage;
  • Collaborated directly with clients to integrate ETL for each;
  • Initiated migration to Jira for Agile methodology adoption.
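Because client data arrived as ad-hoc CSV or Excel files, each pipeline had to normalize inconsistent headers before loading into the RDS PostgreSQL warehouse. A minimal sketch of that normalization step (the column names below are hypothetical):

```python
import csv
import io
import re

def normalize_header(name: str) -> str:
    """Lower-case, trim, and snake_case a column header so that ad-hoc
    client files ('Client Name', ' Total $') map onto one schema."""
    name = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip().lower())
    return name.strip("_")

def read_client_csv(text: str) -> list:
    """Parse one client drop into rows keyed by normalized headers."""
    reader = csv.DictReader(io.StringIO(text))
    # Accessing .fieldnames consumes the header row; overwrite it with
    # the normalized names before reading the data rows.
    reader.fieldnames = [normalize_header(h) for h in (reader.fieldnames or [])]
    return list(reader)
```

In a Glue/Airflow pipeline a step like this would run per file, so every client's dump lands in the warehouse under the same column names.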
Project Tech stack:
AWS
ETL
Amazon S3
Amazon RDS
Senior Data Engineer
Aug 2020 - Aug 2022 (2 years)
Project Overview

The project focused on the punctual delivery of data for corporate and shareholder meetings, the primary ETL process, which involved overseeing numerous Airflow Directed Acyclic Graphs (DAGs). It also involved spearheading new ETL processes, including data extraction from APIs such as Shopify and iAuditor, and facilitating the migration from MSSQL stored procedures to Redshift for improved data management. Notably, optimizations identified within the Airflow operators improved efficiency despite initial skepticism, and advocating for a transition from Athena to Redshift yielded substantial performance gains for query operations, extending the project's impact beyond routine duties.

Responsibilities:
  • Provided support for Airflow DAGs;
  • Was responsible for fixing DAGs in non-prod and prod environments;
  • Created and maintained new Airflow operators;
  • Monitored over 400 DAGs in non-prod and prod daily;
  • Established new data pipelines for Airflow;
  • Managed and enhanced existing pipelines;
  • Granted permissions to access the data lake to other team members;
  • Offered primary Git support to other team members;
  • Provided AWS support on S3, Docker, Glue, Athena, EMR, Lake Formation, and Redshift.
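Extracting data from REST APIs such as Shopify typically means walking cursor-based pages. A hedged sketch of that pattern (the `fetch` callable stands in for the real HTTP client, which is not shown, and the `items`/`next` response shape is illustrative):

```python
from typing import Callable, Iterator, Optional

def paginate(fetch: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Walk cursor-style pages until the API stops returning a cursor.
    `fetch(cursor)` must return {'items': [...], 'next': cursor_or_None};
    Shopify's 'page_info' pagination follows this same shape."""
    cursor = None
    while True:
        page = fetch(cursor)
        yield from page["items"]
        cursor = page.get("next")
        if not cursor:
            return
```

Keeping the HTTP call behind an injected callable also makes an Airflow task built on this generator easy to unit-test without network access.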
Project Tech stack:
AWS
Python
Redshift
Docker
Kubernetes
Apache Airflow
SQL
API
Application Architect, Senior Data Engineer, SREs
Nov 2017 - Aug 2020 (2 years 9 months)
Project Overview

The project aimed to address a bottleneck in tenant data processing for a client with over 400 tenants. Due to inefficiencies with a service provider, there was a backlog of 300+ pending installations, causing delays of up to 10 months. To resolve this, the project replaced the service provider's processor and implemented two distinct pipelines for image and text files. These pipelines underwent sequential stages, including file sorting, text file cleansing/OCR, invoice detail extraction, and metadata addition, resulting in efficient data handling for the client's extensive tenant network.

Responsibilities:
  • Crafted the entire back-end architecture;
  • Generated the initial reports on Sisense, our BI tool;
  • Established everything from scratch in a new AWS Organization dedicated to this project;
  • Transitioned from a batch process on EC2 servers to a serverless solution using Lambda functions;
  • Increased file read accuracy from 45% to 95%-99%;
  • Reduced configuration time for each tenant from 3 days to a maximum of 15 minutes;
  • Implemented an approach to minimize reconfigurations compared to the provider queue;
  • Enabled reading SKUs from invoices, a capability not previously available;
  • Processed over 1.5 million files per month;
  • Reduced the time from receiving the file to having it in our Data Warehouse in Redshift to 7 seconds;
  • Developed a functional ETL Pipeline process within 2-3 months.
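The invoice-detail extraction stage relied on regular expressions over OCR output (RegExp appears in the tech stack below). A simplified sketch of that step (the patterns and field names are illustrative, not the project's actual ones):

```python
import re

# Patterns are illustrative; real invoice layouts vary per vendor.
INVOICE_NO = re.compile(r"Invoice\s*(?:No\.?|#)\s*[:\-]?\s*(\w[\w-]*)", re.I)
TOTAL = re.compile(r"Total\s*[:\-]?\s*\$?\s*([0-9][0-9,]*\.?[0-9]{0,2})", re.I)

def extract_invoice_details(ocr_text: str) -> dict:
    """Pull an invoice number and total from OCR'd text -- the kind of
    step that sits between the OCR and metadata-addition stages."""
    no = INVOICE_NO.search(ocr_text)
    total = TOTAL.search(ocr_text)
    return {
        "invoice_no": no.group(1) if no else None,
        "total": float(total.group(1).replace(",", "")) if total else None,
    }
```

Per-vendor pattern sets like this are one plausible way accuracy figures such as the 45% to 95%-99% improvement above get won: each new layout gets its own patterns instead of one brittle global rule.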
Project Tech stack:
AWS
Python
RegExp
AWS Lambda
Amazon EC2
Amazon RDS
Amazon S3
Auth0
Cloud Architecture
Cloud Computing
CloudWatch
Linux
PostgreSQL
Redshift

Education

2008
Computer Science
BS

Languages

English
Advanced

Copyright © 2024 lemon.io. All rights reserved.