Jhan – Python, SQL, AWS
Senior Data Engineer with 5+ years of experience building scalable batch and real-time data pipelines. Hands-on GenAI expertise: developing pipelines for LLM embeddings and RAG applications and integrating data from multiple sources into vector databases. Deep experience with dbt, plus rare expertise in Cube.js for semantic layers. Skilled in managing infrastructure with IaC tools such as Terraform and in delivering end-to-end data solutions from ingestion to semantic modeling.
Experience Highlights
Senior AI Engineer
Developed an LLM-powered Document Generation Expert in two versions: one using ChatGPT with advanced prompt engineering, and another using Gemini deployed via AWS Lambda and CloudFormation. The system automates professional-grade document creation and integrates into existing workflows, improving accuracy, efficiency, and scalability. A minimal sketch of the provider routing follows the highlights below.
- Designed and deployed serverless architecture with AWS Lambda;
- Migrated infrastructure from CloudFormation to Terraform;
- Integrated LLM APIs with business logic to support multiple providers;
- Addressed technical debt and improved model inference performance.
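The highlights above describe a serverless, multi-provider setup. A minimal sketch of what the Lambda routing could look like, assuming hypothetical names (the handler helpers, PROMPT_TEMPLATE, the LLM_PROVIDER variable) and illustrative model IDs rather than the actual production code:

```python
# Minimal sketch of a provider-agnostic document-generation Lambda.
# All names and model IDs here are illustrative assumptions.
import json
import os

from openai import OpenAI                  # ChatGPT provider
import google.generativeai as genai        # Gemini provider

PROMPT_TEMPLATE = "Draft a {doc_type} using the following facts:\n{facts}"


def _generate_with_chatgpt(prompt: str) -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


def _generate_with_gemini(prompt: str) -> str:
    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-pro")
    return model.generate_content(prompt).text


def handler(event, context):
    """AWS Lambda entry point: builds the prompt and routes to a provider."""
    body = json.loads(event.get("body", "{}"))
    prompt = PROMPT_TEMPLATE.format(
        doc_type=body.get("doc_type", "summary"),
        facts=body.get("facts", ""),
    )
    provider = os.environ.get("LLM_PROVIDER", "chatgpt")
    text = (_generate_with_gemini(prompt) if provider == "gemini"
            else _generate_with_chatgpt(prompt))
    return {"statusCode": 200, "body": json.dumps({"document": text})}
```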
Senior Data Engineer
Built and optimized data pipelines that ingest and transform data from multiple sources (Oracle, S3, Athena) into Snowflake, providing reliable near-real-time analytics while ensuring governance and scalability. An illustrative orchestration sketch follows the highlights below.
- Designed dbt models for Snowflake across multiple environments;
- Implemented and monitored Airflow DAGs for ingestion and transformations;
- Automated CI/CD pipelines with Jenkins-based in-house tools;
- Validated streaming events and ensured data integrity across ingestion systems.
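An illustrative Airflow DAG of the kind described above, chaining ingestion into Snowflake with dbt build and test steps; the DAG id, schedule, scripts, and dbt selectors are assumptions for the sketch, not the project's real configuration:

```python
# Illustrative Airflow DAG: land raw extracts, then build and test dbt models
# in Snowflake. Names and commands below are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="oracle_to_snowflake_hourly",   # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",
    catchup=False,
) as dag:
    # Land the latest Oracle/S3/Athena extracts into the Snowflake RAW schema.
    ingest = BashOperator(
        task_id="ingest_raw",
        bash_command="python ingest.py --target snowflake_raw",  # placeholder script
    )

    # Build the dbt models for the target environment.
    transform = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt --target prod --select staging+",
    )

    # Validate the freshly built models before exposing them to BI.
    test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/dbt --target prod",
    )

    ingest >> transform >> test
```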
Senior Data Engineer
Built and maintained data infrastructure for advertising pipelines, enabling near-real-time analytics and scalable workflows to support Reddit’s growing ad ecosystem, with automated CI/CD for reliability. A sketch of a typical rollup task follows the highlights below.
- Built and optimized dbt models on BigQuery;
- Automated CI/CD pipelines using DroneCI and Docker, pushing images to ECR;
- Monitored Airflow DAGs powering ad rollups and transformations;
- Designed GCP resource optimizations to reduce costs and improve SLA compliance.
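A minimal sketch of an hourly ad-rollup task of the kind such DAGs typically run on BigQuery; the project, dataset, and table names, and the query itself, are placeholders rather than the actual pipeline:

```python
# Illustrative hourly ad rollup on BigQuery, orchestrated by Airflow.
# Project, dataset, and table names are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

ROLLUP_SQL = """
INSERT INTO `my-project.ads_marts.impressions_hourly` (hour, campaign_id, impressions, clicks)
SELECT
  TIMESTAMP_TRUNC(event_ts, HOUR) AS hour,
  campaign_id,
  COUNT(*)                        AS impressions,
  COUNTIF(event_type = 'click')   AS clicks
FROM `my-project.ads_raw.ad_events`
WHERE event_ts >= TIMESTAMP('{{ data_interval_start }}')
  AND event_ts <  TIMESTAMP('{{ data_interval_end }}')
GROUP BY hour, campaign_id
"""

with DAG(
    dag_id="ads_hourly_rollup",            # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",
    catchup=False,
) as dag:
    rollup = BigQueryInsertJobOperator(
        task_id="rollup_impressions",
        configuration={"query": {"query": ROLLUP_SQL, "useLegacySql": False}},
    )
```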
Senior Data Engineer
Designed a robust near-real-time ingestion system that pulled data from multiple APIs into Snowflake. A sketch of the queue-driven landing step follows the highlights below.
- Automated data migrations with the Airbyte CLI and validated datasets using dbt;
- Provisioned AWS and Snowflake resources using Terraform, ensuring infrastructure-as-code practices;
- Built scalable ingestion pipelines with AWS Lambda, ECS, SQS, and SNS;
- Developed data models and transformations in dbt to support BI layers;
- Automated CI/CD pipelines with GitHub Actions.
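A minimal sketch of the queue-driven landing step, assuming a placeholder bucket, key layout, and validation rules: an SQS-triggered Lambda validates API payloads and writes them to S3 for downstream loading into Snowflake (for example via Snowpipe or COPY INTO):

```python
# Illustrative SQS-triggered Lambda: validates API payloads from the queue
# and lands them under an S3 prefix that Snowflake loads from.
# Bucket name, key layout, and required fields are assumptions for the sketch.
import json
import uuid

import boto3

s3 = boto3.client("s3")
LANDING_BUCKET = "example-ingestion-landing"      # placeholder bucket
REQUIRED_FIELDS = {"source", "event_type", "payload"}


def handler(event, context):
    """Lambda entry point for SQS batches (one record per queued message)."""
    for record in event["Records"]:
        message = json.loads(record["body"])

        # Basic validation before the data reaches the warehouse.
        missing = REQUIRED_FIELDS - set(message)
        if missing:
            raise ValueError(f"Message {record['messageId']} missing fields: {missing}")

        # Partition by source so the downstream COPY INTO / Snowpipe load stays incremental.
        key = f"raw/{message['source']}/{uuid.uuid4()}.json"
        s3.put_object(
            Bucket=LANDING_BUCKET,
            Key=key,
            Body=json.dumps(message).encode("utf-8"),
            ContentType="application/json",
        )

    return {"landed": len(event["Records"])}
```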