Alejandro

From Spain (UTC+2)

Data Scientist (Senior)

Machine Learning Engineer (Senior)

7 years of commercial experience

Automotive

E-commerce

Machine learning

Pharmaceutics

Real estate

NLP software

Lemon.io stats

1 project done

163 hours worked

Skills and seniority verified on May 7, 2024

Alejandro – Python, Tensorflow, AWS

Alejandro is a Senior Data Scientist and ML Engineer with an educational background in Mechanical Engineering and Big Data & Business Intelligence, complemented by financial market education and trading experience. He has team leadership experience and a versatile portfolio spanning the e-commerce, proptech, airline, and insurance sectors, with a particular passion for finance.

Main technologies

Python: 7 years

Tensorflow: 4 years

AWS: 4 years

Machine learning: 5 years

Additional skills

API

Deep Learning

Apache Spark

PyTorch

LangChain

SQL

Apache Airflow

Microsoft Power BI

GPT

GCP

Kubernetes

Scikit-learn

Flask

Pandas

MLOps

AWS SageMaker

Snowflake

Business intelligence

Claude LLM

NLP

OpenAI API

Hugging Face

Docker

Testimonials

#21932098173 Data Scientist for AI Network Engineer Super Agent

"Alejandro has been fantastic! We actually bumped him up to 40 hrs / week a week in and finished the 160 hours contracted. We are really happy with his deliverables."

Direct hire: Possible

Experience Highlights

Senior Data Scientist

Sep 2023 - Feb 2024 (5 months)

Project Overview

The project integrated advanced AI technologies into a pharmaceutical client's development project, delivered through the company's SaaS offering. The initiative used state-of-the-art AI models to automate and refine the analysis of manufacturing deviations and root-cause topics in unstructured data from the client's manufacturing plants, significantly enhancing the client's operational processes and decision-making capabilities.

Responsibilities:

Improved an AI-enhanced topic modeling system using cutting-edge tools from Hugging Face and AWS Bedrock/OpenAI, crucial for parsing vast datasets related to manufacturing deviations;
Led the development of multi-class classification models using TensorFlow, adapted for accurately categorizing manufacturing observations, reducing manual review times, and increasing precision;
Implemented Bayesian hyperparameter optimization and advanced sentence transformer strategies to refine model performance and enhance label and topic selection;
Orchestrated end-to-end pipeline productization, integrating data preprocessing, feature engineering, model training, and prediction serving into the in-house SaaS infrastructure;
Reviewed coding and development planning activities, providing expert mentorship to junior data scientists to ensure high code quality and project goal adherence;
Engaged with client stakeholders to align AI solutions with strategic objectives, participating in product roadmap discussions to consistently deliver value and address client needs.

Project Tech stack:

Python

Deep Learning

Machine learning

NLP

AWS

Claude LLM

OpenAI API

Scikit-learn

Tensorflow

LangChain

Hugging Face

Senior Data Scientist

Oct 2022 - Mar 2023 (4 months)

Project Overview

A demand forecasting system covering over 100,000 products in the company's e-commerce store, built with advanced preprocessing techniques and time series modeling frameworks. Integrating this system through automated workflows in AWS, orchestrated via Apache Airflow and managed through Terraform, significantly enhanced operational efficiency and decision-making processes across the company.

Responsibilities:

Developed Apache Spark and AWS Glue pipelines for efficient data processing in the retail e-commerce environment, ensuring scalability and automation;
Implemented Facebook's Prophet and hierarchical forecasting techniques through Nixtla for dynamic model selection, training, and evaluation;
Enhanced forecasting accuracy by integrating seasonality, promotions, and website traffic into models, improving demand prediction during peaks;
Conducted cross-validation to ensure forecast reliability, analyzing performance across seasons and time windows for model refinement;
Deployed ML infrastructure using Terraform and AWS, managing complex workflows with Apache Airflow for efficient end-to-end pipelines.

Project Tech stack:

AWS SageMaker

Apache Spark

Apache Airflow

Scikit-learn

Machine learning

Deep Learning

AWS

MLOps

Python

Pandas

Snowflake

Data Scientist

Dec 2019 - May 2020 (4 months)

Project Overview

A sophisticated real estate scoring system at a VC-funded startup. The project required advanced preprocessing techniques, machine learning models, and integration of the system through a custom API into the company's infrastructure to enhance estate valuation processes, optimize sales strategies, and improve operational efficiency.

Responsibilities:

Leveraged Google Cloud Platform to manage distributed data preprocessing workloads on high-volume datasets of up to 200M rows;
Built the production pipeline to ingest the input CSV files and insert the processed data into a database within the GCP stack, using Apache Beam's Python SDK through Dataflow and BigQuery for storage;
Created regression and classification models with LightGBM to predict estate prices and time-to-sell, ensuring generalization through cross-validation;
Implemented extensive feature engineering, enhancing model predictions with geospatial data, improving understanding of estate values and market dynamics;
Developed a custom scoring system to target and classify estates based on model outputs;
Integrated Data Version Control (DVC) to manage datasets and models for reproducibility;
Designed a pipeline for automated model retraining, ensuring accuracy with new data;
Developed a Flask API for real-time property scoring and valuation, deployed on Kubernetes for scalability and reliability.

Project Tech stack:

Python

GCP

BigQuery

Machine learning

Scikit-learn

Flask

Kubernetes

Pandas

Business intelligence

Data Scientist

Aug 2019 - Dec 2019 (4 months)

Project Overview

A propensity score system for an online car dealership business. It was designed to calculate customer propensity scores to refine targeting strategies and improve conversion rates on the platform. Integrating this model into the company's operational framework through a custom API enhanced marketing conversion rates, user targeting strategy, and call center efficiency.

Responsibilities:

Employed data ingestion, preprocessing, and data-wrangling techniques to handle extensive customer behavior data spanning multiple years;
Built a custom binary classification model using XGBoost to predict customer propensity scores, incorporating features processed from various data sources;
Evaluated the model with cross-validation techniques to ensure prediction generalization and conducted probability calibration for more accurate predictions;
Conducted A/B testing to compare model-driven targeting against standard methods, resulting in a significant increase in conversion rates;
Developed an API for real-time scoring and seamlessly integrated the propensity scoring system into the existing tech infrastructure;
Collaborated with the in-house data analyst throughout the project and provided comprehensive documentation upon code delivery, explaining the code base and development logic.

Project Tech stack:

Python

Scikit-learn

Machine learning

GCP

Pandas

Flask

Keep in mind, the experience summary might exclude non-relevant projects

Education

2020

Big Data and Business Intelligence

Master's

Languages

English

Advanced
