Thosan – Python, SQL, Apache Spark, experts in Lemon.io

Thosan

From Indonesia (GMT+7)

Senior Data Engineer
7 years of commercial experience
AI
Analytics
Business intelligence
Cloud computing
Data analytics
E-commerce
Farming
Govtech
IoT
Machine learning
Maritime
Scientific research
Marketplace
Trade
Chatbots
Geospatial software

Thosan – Python, SQL, Apache Spark

Thosan is a seasoned Senior Data Engineer with over 5 years of diverse commercial experience spanning both enterprise and start-up environments. His unique background includes a tenure as a public servant at governmental institutions, where he honed his skills amidst bureaucratic challenges and on-premises systems. Transitioning to big and medium-sized startups, he seamlessly adapted to fast-paced environments and cloud infrastructure, showcasing his ability to be flexible. Thosan's proficiency in Python and adeptness at selecting the right technologies, combined with his keen eye for code optimization and business-oriented mindset, make him an invaluable asset for elevating projects to new heights.

Main technologies
Python
4 years
SQL
4 years
Additional skills
Apache Spark
Big Data
GCP
Apache Airflow
Data Warehouse
Data Security
BigQuery
PySpark
Terraform
Ready to start
ASAP
Direct hire
Potentially possible

Experience Highlights

Data Engineer
Sep 2023 - Ongoing (9 months)
Project Overview

The client was a company specializing in chat commerce solutions that use conversational AI to improve interactions between businesses and their customers. The main challenge was to architect an event-driven data lakehouse that harvests insights from hundreds of streaming sources to optimize recommendations, conversion, and retention rates for customer-facing companies worldwide.

Responsibilities:

Thosan managed the following tasks:

  • developed Airflow-based data ingestion using Dataproc batch;
  • built a data lakehouse with an event-based architecture on BQ-Iceberg using Spark;
  • handled the multi-tenant analytics page migration using GoodData and DBT.
Project Tech stack:
Apache Airflow
BigQuery
Big Data
Data Modeling
Data Warehouse
Apache Spark
Scala
Python
SQL
GCP
Tableau
Senior Data Engineer
Nov 2022 - Sep 2023 (9 months)
Project Overview

This client was an Indonesian agritech company that provides technology solutions for fish and shrimp farming. Their products included automatic feeders and data-driven platforms to optimize feeding, reduce waste, and improve yields. The aim of the project was to build a modern, high-quality data pipeline, data warehouse, and data analytics platform to help hundreds of data users and thousands of aquaculture farmers across Indonesia.

Responsibilities:

Among other things, Thosan carried out the following tasks:

  • kickstarted and designed data quality initiatives using multiple custom Great Expectations scripts, with Trino as the query engine;
  • built a data observability platform using Datahub, including a custom data lineage error tracing mechanism;
  • set up and migrated the data warehouse modeling platform using DBT.
Project Tech stack:
PostgreSQL
AWS
Kubernetes
BigQuery
Apache Airflow
Apache Kafka
Python
SQL
Big Data
IoT
Data Modeling
Data Security
Data Warehouse
Data Engineer
Sep 2021 - Nov 2022 (1 year 2 months)
Project Overview

The customer was a leading Indonesian e-commerce platform, providing a marketplace for small and medium-sized enterprises (SMEs) to sell their products online. They offered a wide range of goods and services, including electronics, fashion, and household items, catering to millions of users across Indonesia. The project aimed to empower local businesses by facilitating digital transformation and improving access to the online market.

Responsibilities:
  • optimized the GCP data architecture to handle hundreds of queries per minute;
  • assessed the performance of multiple data architectures across GCP, Azure, and Databricks;
  • spearheaded the development of a customized data modeling platform (DBT-like), tailored to the client's specific requirements;
  • maintained multi-instance Airflow for data modeling and pipelining;
  • ensured a high BigQuery success rate (>90%) by optimizing DS queries and data modeling while decreasing operational cost;
  • built cost anomaly detection for multiple GCP projects;
  • built a custom-curated dashboard website using Looker, ReactJS, and Spring for C-level executives as a daily company overview;
  • mentored and guided junior data engineers for a better onboarding process.
Project Tech stack:
BigQuery
GCP
Big Data
Apache Spark
Looker
Apache Airflow
Terraform
Kubernetes
Databricks
Web developer / Data Engineer
Sep 2018 - Sep 2021 (3 years)
Project Overview

The client was the government agency responsible for collecting, analyzing, and disseminating statistical data in Indonesia. Established to support national development, they provided comprehensive data on various sectors, including population, economy, agriculture, and industry. The agency aimed to deliver accurate and timely statistics to inform policy-making, development planning, and public awareness.

Responsibilities:

Thosan implemented the following:

  • set up automated web crawlers;
  • built a pipeline for data collection, data wrangling, and data visualization;
  • analyzed data to extract actionable insights and presented the findings to B-level executives as a policy recommendation for the president;
  • conducted multiple research studies on calculating the poverty rate based on nighttime light satellite imagery and approximating the occupancy rate using public API data from multiple OTAs.
Project Tech stack:
Apache Spark
Apache Hadoop
Python
SQL
PostgreSQL
MongoDB
Apache Airflow
Linux
Hive
JavaScript
Web scraping
Scrapy

Education

2016
Computational Statistics
Bachelor
2022
Data Engineering Certification
Professional Data Engineer

Copyright © 2024 lemon.io. All rights reserved.