Daniel – Pandas, Tableau, SQL
Daniel is a senior Data Scientist and Data Analyst with over 7 years of hands-on experience in Python, Pandas, Tableau, and classical machine learning (scikit-learn, gradient boosting). He has led end-to-end analytics projects in manufacturing, supply chain, and fintech, building robust ETL pipelines and dashboards. Daniel demonstrates strong communication, stakeholder alignment, and ownership of production deployments, consistently translating complex data into actionable insights. He would be an excellent addition to any team, bringing not only technical expertise but also a collaborative mindset, mentoring experience, and a proactive approach to problem-solving.
8 years of commercial experience in
Direct hire: Possible
Experience Highlights
Lead Data Scientist
In semiconductor wafer manufacturing, pick tools may collect defective dies because of their close proximity to functional ones on the wafer. To minimize this risk, dies are often manually screened to guide the tool toward clusters with a higher likelihood of reliability.
This project focused on developing an automated die screening approach using machine learning techniques. The objective was to identify reliable die clusters more efficiently and consistently, reducing reliance on manual inspection while improving the scalability and overall efficiency of the manufacturing workflow.
- Re-architected the screening solution by shifting decision-making from manual inspection to a machine-learning-based approach. Gathered requirements, including the data collected during manual screening, for ML modelling;
- Performed advanced feature engineering to generate spatial die features reflecting patterns previously applied in manual screening;
- Trained and validated an XGBoost classification model to identify dies suitable for picking;
- Applied SHAP analysis to interpret model predictions at the die level and confirm that the model captured physically meaningful screening patterns;
- Evaluated model performance using precision–recall curves and ROC analysis, selecting a probability threshold aligned with the operational objective of high-precision screening;
- Packaged the trained model as a reusable pickle bundle and deployed it to production by integrating it into the existing ETL pipeline.
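As an illustration of the threshold-selection step in the bullets above, here is a minimal sketch, not the project's actual code. It picks the lowest probability threshold whose precision still meets a target, which keeps recall as high as possible under a high-precision constraint. The function name `pick_threshold`, the precision target, and the toy labels and scores are all assumptions made for this example.

```python
def pick_threshold(y_true, y_score, min_precision=0.95):
    """Return (threshold, precision, recall) for the lowest threshold whose
    precision is at least min_precision, or None if no threshold qualifies.

    y_true holds 0/1 labels (1 = die suitable for picking, by assumption);
    y_score holds the model's predicted probabilities for class 1.
    """
    total_pos = sum(y_true)
    best = None
    # Candidate thresholds are the distinct scores, scanned from high to low.
    for t in sorted(set(y_score), reverse=True):
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 0)
        precision = tp / (tp + fp)  # tp + fp >= 1 because t is itself a score
        recall = tp / total_pos if total_pos else 0.0
        if precision >= min_precision:
            best = (t, precision, recall)  # a lower qualifying threshold wins
    return best


# Hypothetical toy data, for illustration only.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
y_score = [0.95, 0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.1]
print(pick_threshold(y_true, y_score, min_precision=1.0))  # (0.8, 1.0, 0.75)
```

In practice the candidate thresholds would come from a held-out validation set, so the chosen cut-off is not tuned on the training data.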
Lead Data Scientist
A machine learning solution for early-stage screening of low-yield semiconductor wafers. The approach enabled teams to prioritize high-potential wafers, reduce unnecessary testing, optimize resource allocation, and accelerate production ramp-up.
- Collaborated with test engineers and operations teams to gather requirements, propose the solution, and ensure alignment with production needs;
- Built an ETL pipeline to collect and integrate wafer parametric test data and historical yield data from multiple sources;
- Performed data profiling, cleaning, exploratory analysis, and feature engineering to prepare datasets and identify key parameters influencing yield;
- Developed, evaluated, and validated a supervised machine learning model to predict wafer yield using identified parametric features and appropriate performance metrics;
- Built dashboards to show predictions, parameter trends, and insights to engineering teams and trained them on how to independently use the tool;
- Documented modelling approaches, assumptions, and workflows to ensure reproducibility and support future improvements.
Machine Learning Engineer
A machine learning model to detect fraudulent transactions in a fintech application. The solution aimed to improve transaction security, reduce financial risk, and enhance user trust while supporting early-stage product development.
- Performed exploratory data analysis to understand transaction patterns, fraud distribution, and key risk indicators across transaction types;
- Engineered and selected meaningful features, including transaction balances, log-transformed transaction amounts, and temporal variables. Encoded categorical variables and prepared structured datasets for model training and evaluation;
- Implemented strategies to address class imbalance, including balanced class weights during training;
- Trained a fraud detection model using XGBoost, optimised for imbalanced classification, and evaluated its performance using recall as a primary metric;
- Structured the project into a reproducible pipeline with modules for ETL, feature engineering, and prediction;
- Deployed the model in the app, stress-tested it by simulating fraudulent transactions, and validated functionality.
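Two of the steps listed above, log-transforming amounts and balancing class weights, can be sketched in a few lines of standard-library Python. This is an illustrative sketch under stated assumptions, not the project's code: the helper names are hypothetical, the label counts are invented, and the weight formula is the common "balanced" heuristic `n_samples / (n_classes * class_count)`.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so that rare classes (e.g. fraud) count for more during training."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def log_amount(amount):
    """Compress heavy-tailed transaction amounts; log1p keeps 0 mapped to 0."""
    return math.log1p(amount)


# Hypothetical label distribution: 98 legitimate (0) and 2 fraudulent (1).
labels = [0] * 98 + [1] * 2
weights = balanced_class_weights(labels)
print(weights[1])  # 25.0
```

The ratio `weights[1] / weights[0]` equals the negative-to-positive count ratio, which is the value typically suggested for XGBoost's `scale_pos_weight` parameter on imbalanced binary problems.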
Senior Data Scientist
A machine learning solution to predict potential delays along the supply chain’s critical path. The system provides early alerts to stakeholders, enabling proactive mitigation, improving operational efficiency, and minimizing disruption risks.
- Collaborated with stakeholders to understand the causes and impact of process delays and define requirements for a predictive solution;
- Collected and integrated historical supply chain data, including process timelines, order information, and operational metrics from multiple sources;
- Performed data profiling, cleaning, and feature engineering to structure time-series datasets suitable for sequential modelling;
- Developed and trained an LSTM (Long Short-Term Memory) neural network to model temporal dependencies and predict delays in critical supply chain processes;
- Evaluated and validated model performance using historical outcomes and appropriate forecasting metrics to ensure reliability;
- Collaborated with app developers to integrate the model into the team's app and to develop a delay notification feature.
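Structuring a time series for sequential modelling, as mentioned in the bullets above, usually means slicing it into fixed-length input windows paired with a future target. The sketch below shows one common way to do that. It is an assumption-laden illustration: the function name `make_windows`, the window length, and the sample durations are hypothetical and not taken from the project.

```python
def make_windows(series, window, horizon=1):
    """Split a sequence into (inputs, target) pairs for a sequence model.

    Each input is `window` consecutive values; the target is the value
    `horizon` steps after the end of that window.
    """
    pairs = []
    for start in range(len(series) - window - horizon + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        pairs.append((inputs, target))
    return pairs


# Hypothetical process durations (in days), for illustration only.
durations = [1, 2, 3, 4, 5]
print(make_windows(durations, window=3))  # [([1, 2, 3], 4), ([2, 3, 4], 5)]
```

A series shorter than `window + horizon` yields an empty list, so callers can filter out processes with too little history before training.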
Lead Data Analyst
Developed the foundational analytics infrastructure by creating automated data pipelines, stored procedures, a custom Python module, and interactive dashboards. The solution standardized data processing and reporting, enabling more efficient analysis, consistent insights, and scalable decision-making.
- Built the foundational analytics infrastructure, including automated data pipelines and stored procedures;
- Designed and developed a custom Python library to standardise common data processing, analysis, and modelling tasks, enabling reusable and efficient workflows across projects;
- Built interactive dashboards and reporting tools to visualise key business metrics and provide stakeholders with real-time operational insights;
- Established documentation and coding practices to ensure reproducibility, maintainability, and scalability of analytical solutions.