Tarik

From Turkey (UTC+3)

AI Engineer|Senior

Machine Learning Engineer|Senior

Data Scientist|Senior

Back-end Web Developer|Senior

Lemon.io stats

1 project done

1,120 hours worked

Skills and seniority verified on Aug 12, 2024

Tarik – AWS, Python, Docker

Tarik is an expert in data engineering, machine learning, and large language models, and holds a PhD in Graph Theory. He is proficient in Python, Pandas, and Scikit-learn, and has designed and delivered advanced NLP and AI solutions, including RAG systems. With a deep understanding of machine learning concepts and strong problem-solving skills, he has delivered measurable results in AI-driven and data-intensive projects.

12 years of commercial experience in

Analytics

Customer support

E-learning

Edtech

Entertainment

Gamedev

Healthcare

Healthtech

Human resources

Information services

Job and career services

Machine learning

Management

Media

Recruiting

Sales

Scientific research

B2B

Content creation and licensing

AI software

Chatbots

Enterprise software

Mobile apps

NLP software

SaaS

Virtual assistants

Gaming software

Voice-first system

Agentic automation

Main technologies

AWS: 3 years

Python: 8 years

Docker: 3 years

LLM: 4 years

FastAPI: 4 years

NLP: 7 years

Pandas: 8 years

Scikit-learn: 8 years

AI agent development: 2 years

Additional skills

BigQuery

ElasticSearch

Apache Kafka

PostgreSQL

Ubuntu

API

Deep Learning

Redis

OpenAI API

Firebase

Microsoft Azure

Apache Airflow

Snowflake

GCP

MLOps

LangChain

PyTorch

Neural Networks

TensorFlow

OAuth

Kafka

Airflow

Flask

Rewards and achievements

Tech interviewer

Direct hire

Possible

Experience Highlights

Freelance Consultant

Dec 2024 - Mar 2025 (3 months)

Project Overview

An intelligent news-retrieval system for a global newsroom that answers user queries with accurate, editorially compliant information from a 100k+ document repository. Built from scratch as a modular RAG pipeline (intent classification, retrieval, summarization, editorial guardrailing) exposed through a flexible API with real-time streaming responses. Designed for scale and low latency while keeping the newsroom's editorial standards enforceable in code.

Responsibilities:

Designed and implemented the complete RAG pipeline with LangChain and LangGraph as distinct, swappable components for intent classification, retrieval, summarization, and guardrailing;
Built a configuration system that dynamically switches between cloud LLMs (OpenAI, OpenRouter) and local LLMs (Ollama) to support production, degradation, and A/B-testing scenarios;
Engineered a guardrailing system that retrieves and applies editorial guidelines based on topic context so all generated content keeps brand integrity;
Integrated Elasticsearch and Typesense knowledge bases with recency-based reranking to prioritize the latest news;
Built a prompt-management layer with PromptLayer and integrated LangSmith for debugging, monitoring, and continuous improvement;
Created evaluation frameworks measuring intent-classification accuracy, retrieval precision, summarization quality, and editorial-guideline adherence.
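
The recency-based reranking mentioned above can be illustrated with a minimal sketch. It is not the project's actual implementation: the `Hit` type, the exponential half-life decay, and the `half_life_days` and `recency_weight` parameters are all assumptions chosen for illustration, and the relevance score is assumed to be normalized to the 0-1 range.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Hit:
    """A search result (hypothetical shape, not a real client type)."""
    doc_id: str
    score: float            # relevance score, assumed normalized to 0-1
    published_at: datetime  # publication time of the article


def recency_rerank(hits, now, half_life_days=7.0, recency_weight=0.5):
    """Sort hits by a blend of relevance and an exponential recency decay.

    The decay halves every `half_life_days`; `recency_weight` sets how much
    freshness counts relative to the original relevance score.
    """
    def combined(hit):
        age_days = max((now - hit.published_at).total_seconds() / 86400.0, 0.0)
        decay = 0.5 ** (age_days / half_life_days)
        return (1.0 - recency_weight) * hit.score + recency_weight * decay

    return sorted(hits, key=combined, reverse=True)


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
hits = [
    Hit("old", 0.9, now - timedelta(days=30)),
    Hit("new", 0.8, now),
]
reranked = recency_rerank(hits, now)
```

With these sample values the fresher article outranks the slightly more relevant but month-old one; setting `recency_weight=0.0` restores the pure relevance order.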

Project Tech stack:

LangChain

AWS

Python

Prompt engineering

FastAPI

WebSocket

REST API

Amazon S3

Docker Compose

Docker

ElasticSearch

Bedrock

Multi-Agent Systems

Multi-agent systems architecture

Founder and Sole AI Builder

Jun 2024 - Feb 2025 (8 months)

Project Overview

A consumer AI app that turns a child into the hero of their own illustrated, narrated bedtime story. Tarik co-founded the company, went through the Winter Cold Start acceleration program at Founders Inc. in San Francisco, and personally built the entire AI system: a multi-modal pipeline that orchestrates an LLM for the story, diffusion models for the illustrations, and voice synthesis for narration into one coherent, child-safe output. Beyond the engineering, he handled the early-stage founder work: investor conversations, product-market-fit iteration, a soft launch in Middle Eastern markets, and organic acquisition through a curated YouTube and social presence.

Responsibilities:

Architected and implemented the core multi-modal generation pipeline coordinating LLM story creation, image generation, and voice synthesis into a single low-latency workflow;
Designed structured output schemas and constraints so generated stories follow sensible narrative arcs and stay child-friendly;
Built the Firebase Cloud Functions backend with Flask: scalable endpoints for story generation, user preferences, and content management;
Built a prompt-management system with PromptLayer for version control and A/B testing across story themes and age groups;
Integrated multiple AI services (OpenRouter for frontier LLMs, diffusion-based image endpoints, Flux) and kept the stack swappable as models improved;
Ran a soft launch with localized, themed story collections and built organic acquisition channels from scratch.

Project Tech stack:

OpenAI API

Google API and Services

Firebase

Firebase Analytics

Firebase Crashlytics

Firebase DB and Storage

Flask

Flux

Flutter

Senior Data Scientist

Jan 2024 - Mar 2024 (1 month)

Project Overview

A hyper-personalized content tagging system. It uses small LLMs such as Microsoft's Phi, fine-tuned on user profile descriptions and content metadata, to separate relevant from irrelevant content for individual users. The project significantly improved the accuracy of the content recommendation engine, leading to increased user retention.

Responsibilities:

Implemented a fine-tuning pipeline for small LLMs to adapt them for personalized content relevance classification;
Created a scalable system to process and tag large volumes of content in batch inference mode;
Integrated the tagging system with the existing content recommendation engine, reducing false positives by 79%.

Project Tech stack:

Hugging Face

PyTorch

LLM

Python

Senior Data Scientist

Jun 2023 - Feb 2024 (8 months)

Project Overview

An AI-powered virtual assistant chatbot for an enterprise communication platform, enabling contextual searches and actions through natural language conversations. The assistant improves user productivity by providing quick access to information and automating routine tasks within the organization's digital workspace.

Responsibilities:

Implemented the initial POC to showcase the contextual conversational capabilities of LLMs;
Presented progress and potential directions to the CTO, taking into account the current state of the art (SOTA) in LLMs, to set the development strategy until the product management team took over;
Architected and implemented a conversational AI system using state-of-the-art language models and natural language understanding techniques;
Integrated the assistant with various internal systems to enable actions like scheduling meetings, retrieving documents, and answering company-specific queries;
Implemented context-aware conversation handling to maintain coherent multi-turn dialogues;
Developed a robust intent classification and parameter extraction system to accurately route user requests to the appropriate handlers.

Project Tech stack:

OpenAI API

Python

ElasticSearch

Redis

AWS

Microsoft Azure

Docker

LangChain

Vector Databases

Senior Data Scientist

Oct 2023 - Dec 2023 (2 months)

Project Overview

A customer churn prediction system. The system analyzes customer behavior, product usage patterns, and engagement metrics to identify at-risk accounts and enable proactive retention strategies. Tarik developed a machine learning model to predict the churn risk of tenants for the SaaS platform, potentially saving $5M in Annual Recurring Revenue (ARR).

Responsibilities:

Communicated with the customer success, sales, and product management teams to understand their needs, identify important features, and set the project goals;
Developed the churn prediction model, end to end, from data preparation to model deployment;
Engineered features from various data sources, including product usage logs, customer support tickets, and financial data;
Implemented and compared multiple machine learning algorithms, including basic regression models, RNNs, and LSTMs, ultimately selecting LightGBM for its performance and interpretability;
Developed an automated ML pipeline using MLflow for model training, validation, and deployment;
Created a dashboard for the customer success team to visualize churn risk and key factors contributing to potential churn;
Achieved an 87% recall in predicting churn 60 days in advance, allowing for timely interventions.

Project Tech stack:

Snowflake

PowerBI

Python

Data visualization

Machine learning

Senior Data Scientist

Jan 2023 - Jul 2023 (6 months)

Project Overview

A Retrieval-Augmented Generation (RAG) solution to generate accurate answers based on search results for user queries. This system combines the power of large language models with a company's specific knowledge base to provide contextually relevant and up-to-date responses.

Responsibilities:

Designed and implemented a RAG pipeline that efficiently retrieves relevant documents and generates coherent answers;
Optimized the document indexing and retrieval process using Milvus and ElasticSearch;
Implemented a mechanism to qualify the generated answer to show or hide it on top of the search results page.

Project Tech stack:

Python

Docker

AWS

Vector Databases

LangChain

LLM

RAG

Senior Data Scientist

Jun 2022 - Nov 2022 (5 months)

Project Overview

A scalable, multi-tenant content recommendation system for a modern enterprise intranet SaaS platform serving over 700 tenants with 700K+ users. The system provides personalized content suggestions and related-content features, improving user engagement and information discovery within organization intranets.

Responsibilities:

Architected and implemented a scalable recommendation engine using collaborative filtering techniques;
Created an auto-modeling method that optimizes the training process for each tenant based on that tenant's usage;
Designed the model for multi-tenant scenarios, ensuring data isolation and personalized recommendations for each client;
Optimized the training process to update each model with only the new data per tenant;
Provided endpoints to retrieve real-time recommendations for the user using Redis indices for user and item embeddings, which reduces memory usage in the inference stage;
Deployed the solution using Snowflake, Airflow, MLflow, Redis, and Kubernetes, enabling easy scaling to 100K+ recommendations per day.

Project Tech stack:

Python

Redis

MLOps

Snowflake

AWS

Apache Airflow

Machine Learning Team Lead

Dec 2020 - Apr 2021 (3 months)

Project Overview

An automated system to detect duplicate or near-duplicate questions in a large-scale trivia game database. This project aimed to maintain the quality and uniqueness of the question set as third-party providers continuously added new questions.

Responsibilities:

Analyzed the existing question database to understand the scope and nature of duplication issues;
Created a representative dataset of question-answer pairs that were considered near-duplicates based on predefined criteria;
Fine-tuned the BERTurk model to detect semantic similarities between questions;
Implemented a batch processing system to efficiently check new questions against the existing database;
Developed a user interface for content managers to review and act on potential duplicates;
Established a continuous monitoring process to ensure ongoing question set quality.
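
The batch check of new questions against the existing database can be sketched as a thresholded similarity search over question embeddings. This is an illustrative sketch only, not the project's code: the tiny hand-written vectors stand in for embeddings that a fine-tuned model such as BERTurk would produce, and the function names and the `threshold` value are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_near_duplicates(new_vecs, existing_vecs, threshold=0.9):
    """Return (new_id, existing_id, similarity) for pairs at or above threshold.

    Brute-force comparison for clarity; a large database would normally
    use an approximate nearest-neighbour index instead.
    """
    matches = []
    for new_id, new_vec in new_vecs.items():
        for old_id, old_vec in existing_vecs.items():
            sim = cosine(new_vec, old_vec)
            if sim >= threshold:
                matches.append((new_id, old_id, sim))
    return sorted(matches, key=lambda m: m[2], reverse=True)


# Placeholder embeddings (hypothetical values, two dimensions for readability).
existing = {"q1": [1.0, 0.0], "q2": [0.0, 1.0]}
incoming = {"n1": [0.99, 0.1], "n2": [0.7, 0.7]}
flagged = find_near_duplicates(incoming, existing)
```

Here only `n1` is flagged as a likely duplicate of `q1`; flagged pairs would then go to content managers for review rather than being removed automatically.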

Project Tech stack:

PyTorch

Hugging Face

Python

FastAPI

Docker

NLP

Machine Learning Team Lead

May 2020 - Sep 2020 (3 months)

Project Overview

An in-depth market research project to analyze audience sentiment towards TRT's flagship TV drama on social media. This project combined advanced NLP techniques with traditional market research methods to inform strategic decisions for the upcoming season.

Responsibilities:

Collected and processed large-scale Twitter data related to the TV drama;
Fine-tuned BERTurk, a BERT-based model pre-trained on Turkish content, for multi-label emotion classification;
Collaborated with a third-party company to create an annotated dataset for emotion tags such as anger, jealousy, love, and hate;
Developed a detailed emotional landscape analysis for audience reactions towards each actor and actress;
Integrated findings with offline market research conducted by IPSOS to provide comprehensive insights;
Presented results to decision-makers, directly influencing contract negotiations for the next season.

Project Tech stack:

PyTorch

Python

NLP

Hugging Face

Data analysis

Twitter API

Data visualization

Keep in mind, the experience summary might exclude non-relevant projects

Education

PhD, NLP & Graph Machine Learning (2021)

Languages

Turkish: Advanced

Italian: Pre-intermediate

Arabic: Pre-intermediate

English: Advanced
