Guillaume
From France (UTC+2)
Guillaume – Python, LLM, MLOps
Guillaume is a Senior Data Scientist, machine learning and MLOps engineer with strong expertise in Python, GCP, Terraform, and production ML systems. He has led end-to-end delivery of fraud detection, LLM-based assistants, and large-scale scoring platforms, demonstrating robust architectural judgment and stakeholder management. He is best suited for enterprise or impact-driven projects requiring both technical depth and client-facing skills.
9 years of commercial experience
Main technologies
Additional skills
Direct hire: Possible
Experience Highlights
Lead Data Scientist
An international biopharmaceutical group needed to accelerate the drafting of regulatory documents by Medical Writers (ICF, Investigator Brochure, Safety Reports, DSUR, PSUR) and automate content tagging across large document corpora.
Technical Approach
- Developed a RAG GenAI assistant for regulatory authoring (ICF, Investigator Brochure) with writer-in-the-loop review
- Designed and delivered a writing assistant for Safety Reports using multi-document structured information extraction
- Industrialized automated LLM / VLM analysis of documents (PDFs, slides, DOCX, images, video) to identify referenced products and extract key marketing messages
- Low-cost pipeline processing thousands of documents per day
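The retrieval step of a RAG assistant like the one described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the bag-of-words similarity stands in for whatever embedding model was actually used, and the passages, `retrieve`, and `build_prompt` are hypothetical names invented here. The LLM call and the writer review step are left out.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query, passages, k=2):
    """Assemble a grounded prompt; the draft it yields still goes to a writer."""
    sources = retrieve(query, passages, k)
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(sources))
    return f"Use only the sources below.\n{context}\n\nTask: {query}"

# Hypothetical corpus snippets, invented for illustration only.
PASSAGES = [
    "Adverse events observed in the study were mild and transient.",
    "The informed consent form must describe study risks in plain language.",
    "Dosing schedule: one tablet daily for twelve weeks.",
]

QUERY = "describe study risks for the informed consent form"
PROMPT = build_prompt(QUERY, PASSAGES, k=1)
```

Numbering the sources in the prompt lets a reviewer trace each drafted sentence back to a passage, which is what makes writer-in-the-loop review practical.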
Key Results
- Enabled faster drafting of regulatory documents for Medical Writers
- Pipeline processing thousands of documents/day across multiple formats
Lead Data Scientist
A major European telco needed to classify fraudulent websites at scale, both to protect customers and to automate analyst triage.
Responsibilities
- Lead ML Engineer on a 3-person team
- Built the end-to-end scraping and classification pipeline
- Delivered MVP with production-grade precision/recall
Technical Approach
- Built a high-velocity scraping and extraction system for business labeling
- Multi-stage GenAI + ML pipeline: category classification → risk qualification
- Automated analyst triage to shorten turnaround time and enable large-scale URL review
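The two-stage flow above (category classification, then risk qualification, then triage) can be sketched as below. The keyword rules are hypothetical stand-ins for the real GenAI and ML models, and every name, keyword, and threshold is an assumption made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    url: str
    category: str
    risk: str
    needs_analyst: bool

# Hypothetical keyword rules standing in for the real models.
CATEGORY_KEYWORDS = {
    "shop": ("cart", "checkout", "discount"),
    "banking": ("login", "account", "iban"),
}
RISK_KEYWORDS = ("urgent", "verify your", "prize")

def classify_category(text):
    """Stage 1: assign a business category to the scraped page text."""
    lowered = text.lower()
    for category, words in CATEGORY_KEYWORDS.items():
        if any(w in lowered for w in words):
            return category
    return "other"

def qualify_risk(text, category):
    """Stage 2: grade risk, conditioned on the stage-1 category."""
    lowered = text.lower()
    hits = sum(w in lowered for w in RISK_KEYWORDS)
    if category == "banking":
        hits += 1  # assumption: sensitive categories get a stricter prior
    if hits >= 2:
        return "high"
    return "medium" if hits == 1 else "low"

def triage(url, text):
    """Run both stages and route only ambiguous cases to a human analyst."""
    category = classify_category(text)
    risk = qualify_risk(text, category)
    return Verdict(url, category, risk, needs_analyst=(risk == "medium"))
```

Sending only the middle band to analysts is one way such a pipeline can shorten turnaround: clear-cut cases are resolved automatically and human time goes to the uncertain ones.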
Key Results
- MVP launched with >75% correct classification (comparable to human analysts)
- Strong precision/recall on POC
- Automated triage enabling large-scale URL review
Lead Data Scientist
An international home improvement retail group aimed to understand the causal impact of 600+ marketing drivers on Customer Lifetime Value across France and Spain.
Responsibilities
- Led a 5-person squad (DS/DE/DA)
- Delivered automated causal inference pipeline and dashboards
- Productionized workflows
Technical Approach
- Automated pipeline using DoubleML for causal estimation
- 6 dashboards (3 per Business Unit) covering causal exploration, scenario planning, and impact projections
- Analyzed 12M customers over 3 years across FR & ES
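The idea behind double machine learning can be illustrated with a small partialling-out sketch. It does not use the DoubleML library's API: plain least squares replaces the ML learners, there is a single confounder, and the data is simulated, so all names and numbers here are assumptions for illustration only.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with a single regressor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

def residuals(train, test, col):
    """Fit col ~ x on the training fold, return residuals on the test fold."""
    a, b = fit_line([r["x"] for r in train], [r[col] for r in train])
    return [r[col] - (a + b * r["x"]) for r in test]

def dml_effect(rows):
    """Two-fold cross-fitted partialling-out estimate of the effect of d on y."""
    half = len(rows) // 2
    folds = [(rows[:half], rows[half:]), (rows[half:], rows[:half])]
    num = den = 0.0
    for train, test in folds:
        ry = residuals(train, test, "y")  # outcome with x partialled out
        rd = residuals(train, test, "d")  # driver with x partialled out
        num += sum(u * v for u, v in zip(rd, ry))
        den += sum(v * v for v in rd)
    return num / den

def simulate(n=2000, effect=2.0, seed=0):
    """Simulated customers: x confounds both the driver d and the outcome y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        d = 0.8 * x + rng.gauss(0, 1)
        y = effect * d + 1.5 * x + rng.gauss(0, 1)
        rows.append({"x": x, "d": d, "y": y})
    return rows

ROWS = simulate()
NAIVE = fit_line([r["d"] for r in ROWS], [r["y"] for r in ROWS])[1]
ADJUSTED = dml_effect(ROWS)
```

On this simulated data the naive slope of `y` on `d` absorbs part of the confounder's influence, while the cross-fitted estimate lands close to the simulated effect of 2.0.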
Key Results
- 6 dashboards delivered across 2 countries and 3 BUs
- 600+ marketing drivers analyzed for causal impact
- 12M customers analyzed over 3-year period
- Productionized workflows for ongoing use
Lead Data Scientist
A leading French retail bank needed to detect novel fraud patterns in anti-money laundering (AML) operations that traditional rule-based systems could not catch.
Responsibilities
- Lead ML Engineer on a 3-person team
- Designed and delivered the anomaly detection pipeline
- Built analyst tooling for investigating detected patterns
Key Results
- Novel fraud patterns detected that rule-based systems missed
- Analyst tooling delivered for on-prem environment
Lead Data Scientist
A leading European banking group aimed to create a real-time fraud detection system for retail banking transactions, operating under strict on-premises constraints.
Responsibilities
- Solo ML Engineer — architected and built the entire system end-to-end
- Designed training pipeline, inference path, and monitoring
- Aligned decision thresholds and workflows with business stakeholders
Technical Approach
- Architected an end-to-end real-time fraud detection system under on-prem constraints
- Designed and industrialized the PySpark/Hadoop training pipeline with MLflow for experiment tracking and model registry
- Delivered a low-latency inference path in pandas (on-prem runtime), tuning XGBoost and I/O/serialization to achieve < 500 ms
- Integrated Kafka for streaming events and Cassandra for state/signal storage
- Defined SLOs and conducted load testing
- Put an MLOps pipeline into production with canary testing and shadow RC deployment
- Operationalized monitoring and alerting
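The shadow deployment pattern mentioned above can be sketched as follows. This is a simplified, synchronous illustration under stated assumptions: the two threshold "models", the event shape, and the `serve` function are hypothetical, and a real system would typically score the shadow candidate off the request path.

```python
import time

def serve(event, production_model, shadow_model, log, budget_ms=500.0):
    """Answer with the production model; score the shadow model on the side."""
    start = time.perf_counter()
    decision = production_model(event)
    elapsed_ms = (time.perf_counter() - start) * 1000.0

    try:
        shadow_decision = shadow_model(event)
    except Exception as exc:  # a failing candidate must never affect callers
        shadow_decision = f"error: {exc}"

    # The log is what gets compared offline before promoting the candidate.
    log.append({
        "event_id": event["id"],
        "production": decision,
        "shadow": shadow_decision,
        "agree": decision == shadow_decision,
        "within_budget": elapsed_ms <= budget_ms,
    })
    return decision

# Hypothetical stand-ins for the current model and the release candidate.
def production_model(event):
    return event["amount"] > 1000

def candidate_model(event):
    return event["amount"] > 500

def broken_model(event):
    raise ValueError("boom")
```

Only the production decision is returned to the caller, so a release candidate can be evaluated on live traffic without changing any customer-facing outcome.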
Key Results
- +40%/year fraud prevented
- Millions of events processed per day
- 8M retail customers covered
- < 500 ms end-to-end latency
- Production MLOps pipeline with canary testing and shadow deployment
Senior Data Scientist
A top European food retailer needed to detect theft at self-checkout (SCO) stations using camera streams, which required a hybrid cloud/on-prem architecture for real-time inference.
Responsibilities
- Co-led the DE team within a 7-person squad (3 DS, 3 DE, 1 PO), alongside a DS lead
- Designed the hybrid architecture for real-time inference
- Contributed to rollout across stores
Technical Approach
- Designed a hybrid GCP / on-prem architecture for real-time inference on Self-Checkout (SCO) camera streams
- Computer vision models for detecting fraudulent behavior at checkout
- Edge deployment for low-latency inference on camera feeds
Key Results
- 65% of fraudulent transactions detected
- 450% projected ROI (50% in conservative estimates)
- Rolled out to 3 stores before client internalization of the solution
Data Scientist
A global consumer electronics leader needed a daily propensity-scoring platform to predict purchase intent across its product catalog, using GA4 and CRM data to power targeted marketing campaigns.
Responsibilities
- Architected and built the entire scoring platform
- Productionized with full MLOps pipeline
- Standardized multi-market configuration for global rollout
- Enabled client teams to take ownership
Technical Approach
- Built a daily propensity-scoring solution using GA4 + CRM data, RFM features, and XGBoost with probability calibration
- Predictions 15x better than the client's existing solution and off-the-shelf solutions on the top 5% of scores
- Productionized the platform with dbt data models, Vertex AI Pipelines for orchestrated training/inference, and MLflow for experiment tracking and model registry
- Set up CI/CD, monitoring, and data-quality checks
- Standardized multi-market configuration, onboarding, and enablement for smooth rollout
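One common way to compare propensity models on their highest-scored users is lift at the top fraction: the conversion rate among the top-scored users divided by the overall rate. Whether this is the metric behind the top 5% comparison above is an assumption. The function name and the toy data below are hypothetical.

```python
def lift_at_top(scores, outcomes, fraction=0.05):
    """Conversion rate in the top-scored fraction divided by the overall rate."""
    if len(scores) != len(outcomes) or not scores:
        raise ValueError("scores and outcomes must be non-empty and aligned")
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    top_rate = sum(outcome for _, outcome in ranked[:k]) / k
    base_rate = sum(outcomes) / len(outcomes)
    if base_rate == 0:
        raise ValueError("no positive outcomes, lift is undefined")
    return top_rate / base_rate

# Toy data: 100 users, 8 buyers overall, 4 of them among the 5 highest scores.
SCORES = [i / 100 for i in range(100)]
BUYERS = {10, 30, 50, 70, 95, 96, 97, 98}
OUTCOMES = [1 if i in BUYERS else 0 for i in range(100)]

LIFT = lift_at_top(SCORES, OUTCOMES)
```

In this toy example the top 5% converts at 80% against an 8% base rate, so targeting that slice reaches buyers roughly ten times more densely than targeting at random.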
Key Results
- +61% ROAS on marketing campaigns
- -35% Cost per Purchase Intent
- 2x reach on warm & hot audiences
- 24 countries deployed
- Platform acquired and operated centrally by Global HQ