Mario
From Guatemala (GMT-6)
18 years of commercial experience
Lemon.io stats
1 project done
1024 hours worked
Open to new offers

Mario – Python, AWS, ETL
Versatile Data Engineer with 17 years of expertise, including leadership roles such as Team Lead and CTO of a 15-person team. Mario has strong self-presentation skills, a business-oriented approach, and proven leadership qualities. During the technical interview, he justified his architectural decisions well and demonstrated solid SQL proficiency, both practical and theoretical.
Main technologies
Additional skills
Ready to start: To be verified
Direct hire: Potentially possible

Experience Highlights
Senior Data Engineer
The project focused on enhancing the company's datamart ETL processes through the development of stored procedures in Snowflake, AWS Lambda functions, and Data Definition Language (DDL) scripts.
- Migrated the Adyen and PayPal SFTP processing pipelines;
- Developed new Snowflake stored procedures;
- Built Lambda functions to import new Accounts Receivable Aging data;
- Created views and procedures for Intacct and Zuora data;
- Developed Lambda functions to process incoming files for storage in the AWS S3 data lake (see the sketch after this list);
- Wrote complex queries for views, ALTER statements, and new table creation as part of DDL scripting for ETL operations;
- Unit-tested tickets rigorously before they progressed to peer review and Quality Assurance (QA);
- Deciphered and modified codebases inherited from external teams, which required meticulous code comprehension and adaptation.
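A minimal sketch of what one of those file-ingestion Lambda functions could look like, assuming an S3-triggered handler. The bucket name, prefix layout, and routing convention are illustrative assumptions, not the project's actual code:

```python
import urllib.parse

import boto3  # bundled with the AWS Lambda Python runtime

s3 = boto3.client("s3")

# Hypothetical data lake bucket; the real project's names are not public.
DATA_LAKE_BUCKET = "example-datalake-bucket"


def lambda_handler(event, context):
    """Copy each file dropped into a landing bucket into the data lake,
    prefixed by source so downstream Snowflake loads can locate it."""
    for record in event.get("Records", []):
        src_bucket = record["s3"]["bucket"]["name"]
        # S3 event keys arrive URL-encoded.
        src_key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        # e.g. "adyen/report.csv" -> "raw/adyen/report.csv"
        s3.copy_object(
            Bucket=DATA_LAKE_BUCKET,
            Key=f"raw/{src_key}",
            CopySource={"Bucket": src_bucket, "Key": src_key},
        )
```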
CTO, Senior Data Engineer, SRE
This project addressed the infrastructure upgrade needs of a small firm managing data from over 15 clients, received mostly via email or WhatsApp as CSV or Excel files. The initiative modernized the firm's practices to industry standards: it involved deploying 15 distinct organizations, each tailored to a specific client, and establishing S3 buckets for data lake storage. An ETL pipeline was implemented with Glue/Airflow, and RDS PostgreSQL served as the data warehouse because budget constraints ruled out Redshift. This overhaul enabled the firm to manage and process client data efficiently while aligning with contemporary data management practices.
- Created 15 distinct ETL pipelines, one per client (a simplified sketch follows this list);
- Established servers for the Data Scientist team to mitigate RAM issues on their local computers;
- Introduced QuickSight as a BI tool, replacing R for graph creation;
- Adopted Slack for internal communication, replacing email/WhatsApp;
- Deployed 1Password as a standard practice for password storage;
- Collaborated directly with clients to integrate ETL for each;
- Initiated migration to Jira for Agile methodology adoption.
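A minimal sketch of the shape one per-client Airflow DAG could take under this setup. The DAG id, bucket path, table, and connection id are hypothetical, and reading from S3 via pandas assumes s3fs is installed:

```python
from datetime import datetime

import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.postgres.hooks.postgres import PostgresHook


def load_client_csv(**context):
    """Read one client's CSV drop and append it to the warehouse table."""
    # Hypothetical path and connection id, for illustration only.
    df = pd.read_csv("s3://example-client-bucket/incoming/latest.csv")
    hook = PostgresHook(postgres_conn_id="warehouse_postgres")
    df.to_sql(
        "client_sales_raw",
        con=hook.get_sqlalchemy_engine(),
        if_exists="append",
        index=False,
    )


with DAG(
    dag_id="client_acme_daily_load",  # hypothetical client name
    start_date=datetime(2023, 1, 1),
    schedule="@daily",  # Airflow 2.4+ keyword
    catchup=False,
):
    PythonOperator(task_id="load_csv", python_callable=load_client_csv)
```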
Senior Data Engineer
The project focused on the punctual delivery of data for corporate and shareholder meetings, the company's primary ETL process, which meant overseeing the functionality of numerous Airflow Directed Acyclic Graphs (DAGs). Mario also spearheaded the development of new ETL processes, including data extraction from APIs such as Shopify and iAuditor, and facilitated the migration from MSSQL stored procedures to Redshift for improved data management. Notably, he identified and implemented optimizations within Airflow operators, improving efficiency despite initial skepticism, and his advocacy for transitioning from Athena to Redshift yielded substantial performance gains for query operations, amplifying the project's impact beyond routine duties.
- Provided support for Airflow DAGs;
- Fixed DAGs in non-prod and prod environments;
- Created and maintained new Airflow operators (see the sketch after this list);
- Monitored over 400 DAGs in non-prod and prod daily;
- Established new data pipelines for Airflow;
- Managed and enhanced existing pipelines;
- Granted other team members access to the data lake;
- Served as the primary Git support for other team members;
- Provided AWS support on S3, Docker, Glue, Athena, EMR, Lake Formation, and Redshift.
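As an illustration of what a custom Airflow operator like those mentioned above could look like, here is a minimal sketch. The operator name, its behavior (running a Redshift query and logging the row count), and the connection id are assumptions, not the project's actual operators:

```python
from airflow.models.baseoperator import BaseOperator
from airflow.providers.amazon.aws.hooks.redshift_sql import RedshiftSQLHook


class RedshiftRowCountOperator(BaseOperator):
    """Hypothetical operator: run a SQL statement on Redshift and log
    the row count, leaving an auditable trace in the task logs."""

    template_fields = ("sql",)  # let Airflow template the query

    def __init__(self, *, sql, redshift_conn_id="redshift_default", **kwargs):
        super().__init__(**kwargs)
        self.sql = sql
        self.redshift_conn_id = redshift_conn_id

    def execute(self, context):
        hook = RedshiftSQLHook(redshift_conn_id=self.redshift_conn_id)
        records = hook.get_records(self.sql)
        self.log.info("Query returned %d rows", len(records))
        return len(records)
```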
Application Architect, Senior Data Engineer, SRE
The project aimed to eliminate a bottleneck in tenant data processing for a client with over 400 tenants. Inefficiencies at a service provider had created a backlog of 300+ pending installations, with delays of up to 10 months. To resolve this, the project replaced the service provider's processor and implemented two distinct pipelines, one for image files and one for text files. Each pipeline ran through sequential stages, including file sorting, text file cleansing/OCR, invoice detail extraction, and metadata addition, resulting in efficient data handling for the client's extensive tenant network (a sketch of the file-sorting stage follows the list below).
- Crafted the entire back-end architecture;
- Generated the initial reports on Sisense, our BI tool;
- Established everything from scratch in a new AWS Organization dedicated to this project;
- Transitioned from a batch process on EC2 servers to a serverless solution using Lambda functions;
- Increased file read accuracy from 45% to 95%-99%;
- Reduced configuration time for each tenant from 3 days to a maximum of 15 minutes;
- Implemented an approach to minimize reconfigurations compared to the provider queue;
- Enabled reading SKUs from invoices, a capability not previously available;
- Processed over 1.5 million files per month;
- Reduced the time from receiving a file to having it available in the Redshift data warehouse to 7 seconds;
- Developed a functional ETL Pipeline process within 2-3 months.
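A minimal sketch of the file-sorting stage described above, assuming an S3-triggered Lambda that routes files by extension into the image or text branch. The bucket layout, prefixes, and extension list are illustrative assumptions:

```python
import os
import urllib.parse

import boto3

s3 = boto3.client("s3")

# Hypothetical routing rule: these extensions go to the image/OCR branch.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".pdf"}


def lambda_handler(event, context):
    """Stage 1 (file sorting): route each uploaded file into the image
    (OCR) branch or the text-cleansing branch, based on its extension."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        ext = os.path.splitext(key)[1].lower()

        branch = "image-pipeline" if ext in IMAGE_EXTENSIONS else "text-pipeline"
        s3.copy_object(
            Bucket=bucket,
            Key=f"{branch}/{os.path.basename(key)}",
            CopySource={"Bucket": bucket, "Key": key},
        )
```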