
Jacob
From New Zealand (UTC+13)
16 years of commercial experience
Jacob – Python, PyTorch, LLM
Meet Jacob, an accomplished AI Engineer with over 5 years of experience specializing in MLOps, CI/CD, and cloud platforms like Azure and AWS. Jacob combines a strong academic & scientific foundation with practical expertise, having led teams and made impactful architectural decisions in complex projects such as transport optimization and deep learning models.
He possesses solid technical skills in PyTorch and large language models, and approaches AI engineering with a balance of theory and hands-on application. Jacob communicates clearly and prefers a structured, collaborative workflow.
Direct hire: Possible
Experience Highlights
Architect & Developer
A statistical and machine learning software solution designed to uncover patterns in large-scale datasets. Developed using Python and R, the platform combined probabilistic modeling, time series analysis, and statistical simulations to support rapid processing of millions of data rows.
Jacob was responsible for the following:
- engineered the system to efficiently process millions of data rows within minutes;
- designed and implemented a Java-based user interface for intuitive result visualization;
- implemented core features such as Bayesian network learning and inference;
- conducted statistical data exploration and time series analysis;
- developed stochastic simulation capabilities within the platform;
- designed the software architecture for easy integration of additional machine learning techniques;
- built interactive data visualization tools including line charts, histograms, bar charts, regression plots, and network graph visualizations.
Data Scientist / Developer
The client was Kaggle, a data science community platform that hosts machine learning competitions and collaborative coding challenges. Participants access real-world datasets and problem statements across diverse domains (such as finance, health, and image processing) and compete to develop the most accurate predictive models. The platform handles model evaluation and leaderboard ranking, and supports collaboration through code sharing and discussion forums.
Jacob participated in several high-profile Kaggle challenges, applying advanced machine learning and optimization techniques:
- CommonLit Readability Scoring – developed models to evaluate the readability of educational texts using spaCy NLP pipelines combined with multiple regression techniques;
- Acea Smart Water Analytics – applied evolutionary algorithms to optimize parameters for hydrological models, accurately forecasting water supply from wells across Italian basins;
- Heritage Health Prize – built ensemble learning pipelines using neural networks, genetic programming, and association rule mining to predict hospital admissions from U.S. insurance claims data. Achieved a score of 0.47 after 15 entries, closely approaching the 0.46 score of the winning team (who submitted 671 entries).
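To illustrate the kind of parameter fitting described for the Acea challenge, here is a minimal sketch of an elitist evolutionary search. It is not Jacob's actual solution: the toy linear-reservoir model, the function names (`simulate`, `mse`, `evolve`), and all settings such as population size and mutation width are assumptions chosen for illustration.

```python
import random

def simulate(k, s0, rain):
    """Toy linear-reservoir model (hypothetical): outflow each step is k * storage."""
    storage, flows = s0, []
    for r in rain:
        storage += r          # rainfall recharges the reservoir
        q = k * storage       # a fixed fraction drains out
        storage -= q
        flows.append(q)
    return flows

def mse(params, rain, observed):
    """Mean squared error between simulated and observed outflow."""
    k, s0 = params
    sim = simulate(k, s0, rain)
    return sum((a - b) ** 2 for a, b in zip(sim, observed)) / len(observed)

def clip(params):
    """Keep k in [0, 1] and the initial storage non-negative."""
    k, s0 = params
    return (min(max(k, 0.0), 1.0), max(s0, 0.0))

def evolve(rain, observed, pop_size=30, generations=200, sigma=0.1, seed=0):
    """Elitist (mu + lambda) search: mutate every parent, keep the best pop_size."""
    rng = random.Random(seed)
    pop = [(rng.random(), rng.uniform(0.0, 20.0)) for _ in range(pop_size)]
    for _ in range(generations):
        children = [
            clip((k + rng.gauss(0, sigma), s0 + rng.gauss(0, sigma * 10)))
            for k, s0 in pop
        ]
        # Parents compete with children, so the best error never gets worse.
        pop = sorted(pop + children, key=lambda p: mse(p, rain, observed))[:pop_size]
    return pop[0]
```

Calling `evolve(rain, observed)` returns the best `(k, s0)` pair found. Because parents survive alongside their children, the best error in the population cannot increase from one generation to the next.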
Director of Machine Learning
A machine learning–driven solution designed to extract and deliver personalized health insights based on user inputs (text and image).
Jacob's scope of work included, but was not limited to, the following:
- fine-tuned a Mistral 7B model to extract domain-specific DSL code from natural language queries related to lifestyle choices, enabling agentic behavior in an LLM-powered system;
- developed and fine-tuned large language models (LLMs) to accurately retrieve food ingredient information for use in a Flutter-based mobile app;
- implemented Retrieval-Augmented Generation (RAG) to enhance model responses with stored and contextual data;
- built an OCR pipeline—including pre-processing and post-processing—to extract food ingredient information from images and videos of food labels, tackling challenges such as variable image quality and natural scene text recognition;
- implemented computer vision models to recognize ingredients and recipes from food images, integrating with downstream data pipelines;
- designed and trained deep learning models to predict sleep stages using mobile-recorded audio data by transforming raw audio into spectrograms and training on labeled sleep stage datasets.
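The audio-to-spectrogram step mentioned in the last item can be sketched as a short-time Fourier transform. This is a standard-library illustration only, not the project's pipeline: the function names and the `frame_len` and `hop` values are assumptions, and a production system would normally rely on an FFT-based audio library rather than the direct DFT used here.

```python
import cmath
import math

def hann(n):
    """Symmetric Hann window of length n, used to reduce spectral leakage."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def spectrogram(samples, frame_len=64, hop=32):
    """Return one list of magnitudes (bins 0..frame_len // 2) per overlapping frame."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [s * w for s, w in zip(samples[start:start + frame_len], window)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n, x in enumerate(frame)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames
```

Each row of the result is one time slice and each column one frequency bin, so the output can be treated as a single-channel image and passed to a classifier.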
Scientific Technical Team Lead
A logistics optimization system developed to streamline the transportation of water tanks across various delivery routes for industrial and commercial clients. The system was designed to minimize fuel consumption, reduce delivery times, and maximize vehicle utilization through intelligent route planning and scheduling.
Jacob contributed in the following ways:
- adapted GIS software to support geospatial data analysis and visualization for logistics optimization;
- integrated packing and routing algorithms to optimize delivery efficiency and vehicle utilization;
- customized and applied a clustering algorithm originally used in bioinformatics to group delivery destinations effectively;
- combined clustering logic with an evolutionary algorithm to create a robust transport optimization solution;
- developed and deployed the system for an Australian water tank manufacturer, achieving real-world performance improvements;
- contributed to a publication in the OR Insight operations research journal based on the project;
- reapplied the custom clustering algorithm in a document clustering search engine to improve content discovery and retrieval.
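The cluster-then-route idea behind this project can be sketched as follows. The profile does not name the bioinformatics clustering algorithm or describe the evolutionary routing step, so this example swaps in plain k-means and a nearest-neighbour heuristic. All function names and parameters are hypothetical.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means with a deterministic start: the first k points are the centroids."""
    centroids = list(points[:k])
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its closest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return assignment

def nearest_neighbour_route(depot, stops):
    """Greedy route: always drive to the closest unvisited stop."""
    route, current, remaining = [], depot, list(stops)
    while remaining:
        nxt = min(remaining, key=lambda s: math.dist(current, s))
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    return route

def plan_routes(depot, destinations, vehicles):
    """Cluster destinations into one group per vehicle, then order each group."""
    assignment = kmeans(destinations, vehicles)
    routes = []
    for c in range(vehicles):
        stops = [p for p, a in zip(destinations, assignment) if a == c]
        routes.append(nearest_neighbour_route(depot, stops))
    return routes
```

Clustering first keeps each vehicle's stops geographically close together, which leaves the routing step with a much smaller problem to solve for each vehicle.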