Daniel – Pandas, Tableau, SQL, experts in Lemon.io

Daniel

From United Kingdom (UTC+1)

Data Analyst | Middle-to-senior

Data Scientist | Senior

Skills and seniority verified on Mar 11, 2026

Daniel is a senior Data Scientist and Data Analyst with over 7 years of hands-on experience in Python, Pandas, Tableau, and classical machine learning (scikit-learn, gradient boosting). He has led end-to-end analytics projects in manufacturing, supply chain, and fintech, building robust ETL pipelines and dashboards. Daniel demonstrates strong communication, stakeholder alignment, and ownership of production, consistently translating complex data into actionable insights. He would be an excellent addition to any team, bringing not only technical expertise but also a collaborative mindset, mentoring capabilities, and a proactive approach to problem-solving.

8 years of commercial experience in

Accounting

Advertising

AI

Apparel

Banking

Beauty

Customer support

Data analytics

E-commerce

Electronics

Fintech

Machine learning

Manufacturing

Supply chain

Geospatial software

Financial asset management

Main technologies

Pandas

7.5 years

Tableau

5 years

SQL

7.5 years

Python

7 years

Additional skills

Scikit-learn

SQL Server

GCP

Microsoft Azure

Data visualization

Machine learning

AWS

ETL

Data Warehouse

Data Modeling

Plotly

Data Science

Cloud Computing

Data analysis

Docker

AI

Airflow

Cron

Databricks

GPT

Claude Code

Direct hire

Possible

Experience Highlights

Lead Data Scientist

Jun 2025 - Dec 2025 (6 months)

Project Overview

In semiconductor wafer manufacturing, pick tools may collect defective dies because of their close proximity to functional ones on the wafer. To minimize this risk, dies are often manually screened to guide the tool toward clusters with a higher likelihood of reliability.

This project focused on developing an automated die screening approach using machine learning techniques. The objective was to identify reliable die clusters more efficiently and consistently, reducing reliance on manual inspection while improving the scalability and overall efficiency of the manufacturing workflow.

Responsibilities:

Re-architected the screening solution by shifting decision-making from manual review to a machine-learning-based approach. Gathered requirements and collected data from manual screening for ML modelling;
Performed advanced feature engineering to generate spatial die features reflecting patterns previously applied in manual screening;
Trained and validated an XGBoost classification model to identify dies suitable for picking;
Applied SHAP analysis to interpret model predictions at the die level and confirm that the model captured physically meaningful screening patterns;
Evaluated model performance using precision–recall curves and ROC analysis, selecting a probability threshold aligned with the operational objective of high-precision screening;
Packaged the trained model as a reusable pickle bundle and deployed it to production by integrating it into the existing ETL pipeline.
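
The threshold-selection step described above can be illustrated with a minimal sketch. It is not the project's actual code: the function name `pick_threshold`, the sample labels and scores, and the 0.9 precision target are all hypothetical. The idea is to scan candidate probability thresholds and keep the one with the highest recall among those that meet a minimum precision, which is one common way to align a classifier with a high-precision screening objective.

```python
def pick_threshold(y_true, scores, min_precision=0.9):
    """Return (threshold, precision, recall) with the highest recall among
    thresholds whose precision is at least min_precision, or None if no
    threshold qualifies. All names and defaults here are illustrative."""
    total_pos = sum(y_true)
    best = None
    # Every distinct score is a candidate threshold; predict positive if score >= t.
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, y_true) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, y_true) if s >= t and y == 0)
        if tp + fp == 0:
            continue
        precision = tp / (tp + fp)
        recall = tp / total_pos if total_pos else 0.0
        if precision >= min_precision and (best is None or recall > best[2]):
            best = (t, precision, recall)
    return best


# Hypothetical validation data: 1 = die suitable for picking, 0 = not suitable.
y_true = [0, 0, 1, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
result = pick_threshold(y_true, scores, min_precision=0.9)
```

Raising `min_precision` pushes the threshold up and trades recall for fewer false picks; lowering it does the opposite.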

Project Tech stack:

Python

Scikit-learn

Machine learning

GCP

Data Science

ETL

Pandas

Cloud Computing

Microsoft SQL Server

Data analysis

Linux

Lead Data Scientist

Dec 2024 - Jun 2025 (6 months)

Project Overview

A machine learning solution for early-stage screening of low-yield semiconductor wafers. The approach enabled teams to prioritize high-potential wafers, reduce unnecessary testing, optimize resource allocation, and accelerate production ramp-up.

Responsibilities:

Collaborated with test engineers and operations teams to gather requirements, propose the solution, and ensure alignment with production needs;
Built an ETL pipeline to collect and integrate wafer parametric test data and historical yield data from multiple sources;
Performed data profiling, cleaning, exploratory analysis, and feature engineering to prepare datasets and identify key parameters influencing yield;
Developed, evaluated, and validated a supervised machine learning model to predict wafer yield using identified parametric features and appropriate performance metrics;
Built dashboards to present predictions, parameter trends, and insights to engineering teams, and trained the teams to use the tool independently;
Documented modelling approaches, assumptions, and workflows to ensure reproducibility and support future improvements.

Project Tech stack:

Python

Machine learning

ETL

Pandas

scikit-learn

Plotly

AWS

Data visualization

Data Modeling

Data Warehouse

Azure DevOps

Microsoft SQL Server

GitHub

API

Machine Learning Engineer

Jan 2024 - Aug 2024 (6 months)

Project Overview

A machine learning model to detect fraudulent transactions in a fintech application. The solution aimed to improve transaction security, reduce financial risk, and enhance user trust while supporting early-stage product development.

Responsibilities:

Performed exploratory data analysis to understand transaction patterns, fraud distribution, and key risk indicators across transaction types;
Engineered and selected meaningful features, including transaction balances, log-transformed transaction amounts, and temporal variables. Encoded categorical variables and prepared structured datasets for model training and evaluation;
Implemented strategies to address class imbalance, including balanced class weights during training;
Trained a fraud detection model using XGBoost, optimised for imbalanced classification, and evaluated its performance using recall as a primary metric;
Structured the project into a reproducible pipeline with modules for ETL, feature engineering, and prediction;
Deployed the model in the app, stress-tested it by simulating fraudulent transactions, and validated functionality.
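
Two of the steps above, log-transforming transaction amounts and weighting classes to counter imbalance, can be sketched in a few lines. This is an illustration under stated assumptions, not the project's code: the function names and sample data are hypothetical, `log1p` is assumed as the transform (the profile only says "log-transformed"), and the weights use the common "balanced" heuristic `n_samples / (n_classes * class_count)`.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count), so that
    every class contributes the same total weight during training."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def log_amount(amount):
    """Compress a heavy-tailed, non-negative amount; log1p maps 0 to 0."""
    return math.log1p(amount)


# Hypothetical data: 2% of transactions are fraudulent (label 1).
labels = [0] * 98 + [1] * 2
weights = balanced_class_weights(labels)

# Per-row weights of this kind could be passed to a model as sample weights.
sample_weights = [weights[y] for y in labels]
amount_features = [log_amount(a) for a in (0.0, 9.99, 25000.0)]
```

With these weights the rare fraud class carries the same total weight as the majority class, which tends to favour recall on fraud at the cost of more false alarms.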

Project Tech stack:

Python

AI

Scikit-learn

Database Management Systems

Airflow

Pandas

Plotly

AWS

Data Warehouse

Data Modeling

Data visualization

Senior Data Scientist

Jan 2023 - Aug 2023 (7 months)

Project Overview

A machine learning solution to predict potential delays along the supply chain’s critical path. The system provides early alerts to stakeholders, enabling proactive mitigation, improving operational efficiency, and minimizing disruption risks.

Responsibilities:

Collaborated with stakeholders to understand the causes and impact of process delays and define requirements for a predictive solution;
Collected and integrated historical supply chain data, including process timelines, order information, and operational metrics from multiple sources;
Performed data profiling, cleaning, and feature engineering to structure time-series datasets suitable for sequential modelling;
Developed and trained an LSTM (Long Short-Term Memory) neural network to model temporal dependencies and predict delays in critical supply chain processes;
Evaluated and validated model performance using historical outcomes and appropriate forecasting metrics to ensure reliability;
Collaborated with app developers to integrate the model into the team's app and to develop a delay notification feature.

Project Tech stack:

Python

Machine learning

Data Modeling

Data Warehouse

Docker

Microsoft SQL Server

Cloud Computing

Data visualization

Database Management Systems

Data analysis

scikit-learn

Microsoft Azure

Lead Data Analyst

Sep 2022 - Feb 2023 (5 months)

Project Overview

Developed the foundational analytics infrastructure by creating automated data pipelines, stored procedures, a custom Python module, and interactive dashboards. The solution standardized data processing and reporting, enabling more efficient analysis, consistent insights, and scalable decision-making.

Responsibilities:

Built the foundational analytics infrastructure, including automated data pipelines and stored procedures;
Designed and developed a custom Python library to standardise common data processing, analysis, and modelling tasks, enabling reusable and efficient workflows across projects;
Built interactive dashboards and reporting tools to visualise key business metrics and provide stakeholders with real-time operational insights;
Established documentation and coding practices to ensure reproducibility, maintainability, and scalability of analytical solutions.

Project Tech stack:

Python

Tableau

ETL

Data Warehouse

Data visualization

Data Modeling

eCommerce

Pandas

Databricks

Cron

Keep in mind, the experience summary might exclude non-relevant projects

Education

2023

Data Science (MSc)

Masters Degree

2018

Electronics and Computer Engineering

Bachelors

Languages

English

Advanced

Copyright © 2026 lemon.io. All rights reserved.
