Alejandro
From Spain (GMT+2)
6 years of commercial experience
Lemon.io stats
1 project done
0 hours worked
Open to new offers
Alejandro – Python, TensorFlow, AWS
Alejandro is a Senior Data Scientist and ML Engineer with an educational background in Mechanical Engineering and Big Data & Business Intelligence, complemented by financial market education and trading experience. He has team leadership experience and a versatile portfolio spanning the e-commerce, proptech, airline, and insurance sectors, with a particular passion for finance.
Ready to start: To be verified
Direct hire: Potentially possible
Experience Highlights
Senior Data Scientist
The project integrated advanced AI technologies into a pharmaceutical client's development project, delivered through the company's SaaS offering. The initiative used state-of-the-art AI models to automate and refine the analysis of manufacturing deviations and root-cause topics in unstructured data from the client's manufacturing plants, significantly enhancing the client's operational processes and decision-making capabilities.
- Improved an AI-enhanced topic modeling system using cutting-edge tools from Hugging Face and AWS Bedrock/OpenAI, which was crucial for parsing vast datasets related to manufacturing deviations;
- Led the development of multi-class classification models using TensorFlow, adapted for accurately categorizing manufacturing observations, reducing manual review times, and increasing precision;
- Implemented Bayesian hyperparameter optimization and advanced sentence transformer strategies to refine model performance and enhance label and topic selection;
- Orchestrated end-to-end pipeline productization, integrating data preprocessing, feature engineering, model training, and prediction serving into the in-house SaaS infrastructure;
- Reviewed coding and development planning activities, providing expert mentorship to junior data scientists to ensure high code quality and project goal adherence;
- Engaged with client stakeholders to align AI solutions with strategic objectives, participating in product roadmap discussions to consistently deliver value and address client needs.
Senior Data Scientist
A demand forecasting system covering over 100,000 products in the company's e-commerce store, built with advanced preprocessing techniques and time series modeling frameworks. Integrating this system through automated workflows in AWS, orchestrated via Apache Airflow and managed through Terraform, significantly enhanced operational efficiency and decision-making processes across the company.
- Developed Apache Spark and AWS Glue pipelines for efficient data processing in the retail e-commerce environment, ensuring scalability and automation;
- Implemented Facebook's Prophet and hierarchical forecasting techniques through Nixtla for dynamic model selection, training, and evaluation;
- Enhanced forecasting accuracy by integrating seasonality, promotions, and website traffic into models, improving demand prediction during peaks;
- Conducted cross-validation to ensure forecast reliability, analyzing performance across seasons and time windows for model refinement;
- Deployed ML infrastructure using Terraform and AWS, managing complex workflows with Apache Airflow for efficient end-to-end pipelines.
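The cross-validation across seasons and time windows mentioned above can be illustrated with a rolling-origin evaluation. The following is a minimal standard-library sketch, not the project's code: the seasonal-naive forecaster, the function names, and the toy weekly series are all assumptions standing in for the Prophet and Nixtla models used in practice.

```python
from statistics import mean


def seasonal_naive(history, horizon, season_length):
    """Forecast by repeating the last observed season (a stand-in model)."""
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


def rolling_origin_cv(series, initial_train, horizon, step, season_length):
    """Evaluate the forecaster on successive expanding training windows.

    Returns one mean-absolute-error value per fold.
    """
    fold_errors = []
    origin = initial_train
    while origin + horizon <= len(series):
        train = series[:origin]
        actual = series[origin:origin + horizon]
        forecast = seasonal_naive(train, horizon, season_length)
        fold_errors.append(mean(abs(a - f) for a, f in zip(actual, forecast)))
        origin += step
    return fold_errors


# Hypothetical toy data: four weeks of a perfectly repeating weekly pattern.
weekly_pattern = [10, 12, 11, 13, 20, 30, 25]
demand = weekly_pattern * 4

errors = rolling_origin_cv(
    demand, initial_train=14, horizon=7, step=7, season_length=7
)
print(errors)
```

Each fold trains only on data before its forecast origin, so the per-fold errors show how accuracy varies across time windows without leaking future observations into training.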
Data Scientist
A sophisticated real estate scoring system at a VC-funded startup. The project required advanced preprocessing techniques, machine learning models, and integration of the system through a custom API into the company's infrastructure to enhance estate valuation processes, optimize sales strategies, and improve operational efficiency.
- Leveraged Google Cloud Platform to manage distributed data preprocessing workloads on high-volume datasets of up to 200M rows;
- Built the production pipeline to ingest the input CSV files and insert the processed data into a database within the GCP stack, using Apache Beam's Python SDK on Dataflow, with BigQuery for storage;
- Created regression and classification models with LightGBM to predict estate prices and time-to-sell, ensuring generalization through cross-validation;
- Implemented extensive feature engineering, enhancing model predictions with geospatial data, improving understanding of estate values and market dynamics;
- Developed a custom scoring system to target and classify estates based on model outputs;
- Integrated Data Version Control (DVC) to manage datasets and models for reproducibility;
- Designed a pipeline for automated model retraining, ensuring accuracy with new data;
- Developed a Flask API for real-time property scoring and valuation, deployed on Kubernetes for scalability and reliability.
Data Scientist
A propensity score system for an online car dealership business. It was designed to calculate customer propensity scores to refine targeting strategies and improve conversion rates on the platform. Integrating this model into the company's operational framework through a custom API enhanced marketing conversion rates, user targeting strategy, and call center efficiency.
- Applied data ingestion, preprocessing, and data-wrangling techniques to handle several years of extensive customer behavior data;
- Built a custom binary classification model using XGBoost to predict customer propensity scores, incorporating features processed from various data sources;
- Evaluated the model with cross-validation techniques to ensure prediction generalization and conducted probability calibration for more accurate predictions;
- Conducted A/B testing to compare model-driven targeting against standard methods, resulting in a significant increase in conversion rates;
- Developed an API for real-time scoring and seamlessly integrated the propensity scoring system into the existing tech infrastructure;
- Collaborated with the in-house data analyst throughout the project and provided comprehensive documentation upon code delivery to explain the code base and development logic.
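As an illustration of the A/B comparison described above, a difference in conversion rates between two groups is commonly checked with a two-proportion z-test. The sketch below is an assumption about how such a check might look, not the analysis actually used in the project. The function name and the conversion counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    Returns (z statistic, p-value) using the pooled standard error.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF evaluated at |z| via the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical counts: group A is standard targeting, group B is model-driven.
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A positive z indicates that group B converts at a higher rate than group A. A small p-value suggests the difference is unlikely to be due to chance alone.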