Tarik – AWS, Python, Docker, experts in Lemon.io

Tarik

From Turkey (GMT+3)

AI Engineer, Senior
Machine Learning Engineer, Senior
Data Scientist, Senior
11 years of commercial experience
AI
Analytics
Customer support
Entertainment
Gamedev
Healthcare
Healthtech
Human resources
Information services
Machine learning
Management
Sales
Scientific research
B2B
AI software
Chatbots
Enterprise software
Mobile apps
NLP software
SaaS
Virtual assistants
Gaming software

Tarik – AWS, Python, Docker

A seasoned data scientist, Tarik has almost a decade of commercial experience and a PhD in mathematics (NLP and Graph ML). His strongest suits are Machine Learning, Data Engineering, LLMs, and Neural Networks. As a cherry on top, Tarik can bring value as both an individual contributor and a manager, so do not hesitate to make him a part of your team.

Main technologies
AWS
3 years
Python
8 years
Docker
3 years
LLM
2 years
NLP
7 years
Pandas
1 year
Scikit-learn
1 year
Neural Networks
1 year
Additional skills
BigQuery
ElasticSearch
Apache Kafka
AI
PostgreSQL
Ubuntu
API
Deep Learning
Redis
OpenAI API
FastAPI
Firebase
Microsoft Azure
Apache Airflow
Snowflake
GCP
LangChain
PyTorch
Ready to start
September 9th
Direct hire
Potentially possible

Experience Highlights

Senior Data Scientist
Jan 2024 - Mar 2024 (1 month)
Project Overview

A hyper-personalized content tagging system. It uses small LLMs such as Microsoft's Phi model, fine-tuned on user profile descriptions and content metadata, to identify relevant and irrelevant content for individual users. This project significantly improved the accuracy of the content recommendation engine, leading to increased user retention.

Responsibilities:
  • Implemented a fine-tuning pipeline for small LLMs to adapt them for personalized content relevance classification;
  • Created a scalable system to process and tag large volumes of content in batch inference mode;
  • Integrated the tagging system with the existing content recommendation engine, reducing false positives by 79%.
Project Tech stack:
Hugging Face
PyTorch
LLM
Python
Senior Data Scientist
Jun 2023 - Feb 2024 (8 months)
Project Overview

An AI-powered virtual assistant chatbot for the enterprise communication platform, enabling contextual searches and actions through natural language conversations. This assistant enhances user productivity by providing quick access to information and automating routine tasks within the organization's digital workspace.

Responsibilities:
  • Implemented the initial POC to showcase the contextual conversational capabilities of LLMs;
  • Presented progress and potential directions to the CTO, taking into account the current state of the art (SOTA) in LLMs, to determine the development strategy until the product management team took it over;
  • Architected and implemented a conversational AI system using state-of-the-art language models and natural language understanding techniques;
  • Integrated the assistant with various internal systems to enable actions like scheduling meetings, retrieving documents, and answering company-specific queries;
  • Implemented context-aware conversation handling to maintain coherent multi-turn dialogues;
  • Developed a robust intent classification and parameter extraction system to accurately route user requests to the appropriate handlers.
Project Tech stack:
OpenAI API
Python
ElasticSearch
Redis
AWS
Microsoft Azure
Docker
LangChain
Vector Databases
Senior Data Scientist
Oct 2023 - Dec 2023 (2 months)
Project Overview

A customer churn prediction system. The system analyzes customer behavior, product usage patterns, and engagement metrics to identify at-risk accounts and enable proactive retention strategies. Tarik developed a machine learning model to predict the churn risk of tenants for the SaaS platform, potentially saving $5M in Annual Recurring Revenue (ARR).

Responsibilities:
  • Communicated with the customer success, sales, and product management teams to understand their needs, identify important features, and set the project goals;
  • Developed the churn prediction model, end to end, from data preparation to model deployment;
  • Engineered features from various data sources, including product usage logs, customer support tickets, and financial data;
  • Implemented and compared multiple machine learning algorithms, including basic regression models, RNNs, and LSTMs, ultimately selecting LightGBM for its performance and interpretability;
  • Developed an automated ML pipeline using MLflow for model training, validation, and deployment;
  • Created a dashboard for the customer success team to visualize churn risk and key factors contributing to potential churn;
  • Achieved an 87% recall in predicting churn 60 days in advance, allowing for timely interventions.
Project Tech stack:
Snowflake
Power BI
Python
Data visualization
Machine learning
Senior Data Scientist
Dec 2022 - Jun 2023 (5 months)
Project Overview

A Retrieval-Augmented Generation (RAG) solution to generate accurate answers based on search results for user queries. This system combines the power of large language models with a company's specific knowledge base to provide contextually relevant and up-to-date responses.

Responsibilities:
  • Designed and implemented a RAG pipeline that efficiently retrieves relevant documents and generates coherent answers;
  • Optimized the document indexing and retrieval process using Milvus and ElasticSearch;
  • Implemented a mechanism that assesses the quality of each generated answer to decide whether to show or hide it at the top of the search results page.
Project Tech stack:
Python
Docker
AWS
Vector Databases
LangChain
LLM
Senior Data Scientist
Jun 2022 - Nov 2022 (5 months)
Project Overview

A scalable, multi-tenant content recommendation system for a modern enterprise intranet SaaS platform serving over 700 tenants with 700K+ users. The system provides personalized content suggestions and related content features, enhancing user engagement and information discovery within organization intranets.

Responsibilities:
  • Architected and implemented a scalable recommendation engine using collaborative filtering techniques;
  • Created an auto-modeling method that tailors the training process for each tenant to that tenant's usage patterns;
  • Designed the model for multi-tenant scenarios, ensuring data isolation and personalized recommendations for each client;
  • Optimized the training process to update each model with only the new data per tenant;
  • Provided endpoints to retrieve real-time recommendations for the user using Redis indices for user and item embeddings, which reduces memory usage in the inference stage;
  • Deployed the solution using Snowflake, Airflow, MLflow, Redis, and Kubernetes, enabling easy scaling to 100K+ recommendations per day.
Project Tech stack:
Python
Redis
MLOps
Snowflake
AWS
Apache Airflow
Machine Learning Team Lead
Dec 2020 - Apr 2021 (3 months)
Project Overview

An automated system to detect duplicate or near-duplicate questions in a large-scale trivia game database. This project aimed to maintain the quality and uniqueness of the question set as third-party providers continuously added new questions.

Responsibilities:
  • Analyzed the existing question database to understand the scope and nature of duplication issues;
  • Created a representative dataset of question-answer pairs that were considered near-duplicates based on predefined criteria;
  • Fine-tuned the BERTurk model to detect semantic similarities between questions;
  • Implemented a batch processing system to efficiently check new questions against the existing database;
  • Developed a user interface for content managers to review and act on potential duplicates;
  • Established a continuous monitoring process to ensure ongoing question set quality.
Project Tech stack:
PyTorch
Hugging Face
Python
FastAPI
Docker
NLP
Machine Learning Team Lead
May 2020 - Sep 2020 (3 months)
Project Overview

An in-depth market research project to analyze audience sentiment towards TRT's flagship TV drama on social media. This project combined advanced NLP techniques with traditional market research methods to inform strategic decisions for the upcoming season.

Responsibilities:
  • Collected and processed large-scale Twitter data related to the TV drama;
  • Fine-tuned BERTurk, a BERT-based model pre-trained on Turkish content, for multi-label emotion classification;
  • Collaborated with a third-party company to create an annotated dataset for emotion tags such as anger, jealousy, love, and hate;
  • Developed a detailed emotional landscape analysis for audience reactions towards each actor and actress;
  • Integrated findings with offline market research conducted by IPSOS to provide comprehensive insights;
  • Presented results to decision-makers, directly influencing contract negotiations for the next season.
Project Tech stack:
PyTorch
Python
NLP
Hugging Face
Data analysis
Twitter API
Data visualization

Education

2021
Mathematics (NLP and Graph ML)
PhD

Languages

Turkish
Advanced
English
Advanced
Copyright © 2024 lemon.io. All rights reserved.