Mark

From United States (UTC-4)

AI Engineer: Strong senior

Machine Learning Engineer: Middle-to-senior

6 years of commercial experience

Computer science

Edtech

Govtech

Sales

Social media

NLP software


Skills and seniority verified on May 7, 2025

Mark – Python, SQL, AWS

Mark is a skilled AI/ML Engineer with strong senior-level experience in building and integrating GenAI and classical machine learning systems. He has a solid track record working with startups, where his proactive mindset and broad technical range have been key to delivering impactful solutions. Mark brings hands-on expertise in LLM-based workflows, including RAG pipelines, prompt engineering, agentic orchestration, and vector database design. He combines a practical understanding of Transformers and attention mechanisms with a system-level view of AI pipelines — from data enrichment and drift handling to evaluation and deployment. His classical ML foundation further strengthens his ability to deliver end-to-end solutions in real-world, fast-paced environments.

Main technologies

Python: 5 years

SQL: 5 years

AWS: 5 years

LLM: 3 years

OpenAI: 3 years

NLP: 4 years

Additional skills

GPT

Deep Learning

Big Data

Machine learning

Docker

PyTorch

Terraform

RAG

LangChain

Hubspot

Salesforce

Direct hire

Possible


Experience Highlights

Lead Engineer

May 2024 - Dec 2024 (7 months)

Project Overview

A scalable system was developed to enable natural language querying over large-scale tweet datasets (~2TB) stored in Parquet format. The solution empowered non-technical users, such as researchers and analysts, to run flexible, human-language queries like “What are the top hashtags about inflation last month?” and receive accurate, explainable results. Leveraging LLMs to translate natural language into SQL, the system executed these queries over the database and returned well-structured responses, streamlining data access and analysis.

Responsibilities:

Designed and implemented a stateless query pipeline over partitioned tweet data using DuckDB and GPT-based SQL generation;
Led architectural decisions for scaling from 12GB to over 1TB of data with under-1-minute latency constraints;
Developed per-day sharded aggregation pattern and materialized view strategy to support real-time analysis;
Built prompt-engineered GPT workflows for robust NL-to-SQL translation, with query fallback and validation;
Integrated caching, semantic filtering, and keyword indexing to reduce scan costs and improve UX;
Enabled cross-functional stakeholders to query and explore social media data without writing code.
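
The validation-and-fallback step for generated SQL can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the standard-library sqlite3 module stands in for DuckDB, the hard-coded candidate list stands in for GPT output, and the names is_safe_select and run_with_fallback are hypothetical.

```python
import sqlite3

# Keywords that should never appear in a read-only analytics query (assumed policy).
FORBIDDEN = ("insert", "update", "delete", "drop", "alter", "create", "attach", "pragma")


def is_safe_select(sql: str) -> bool:
    """Accept only a single read-only SELECT (or WITH ... SELECT) statement."""
    stripped = sql.strip().rstrip(";").strip()
    if not stripped or ";" in stripped:
        return False  # empty input or more than one statement
    lowered = stripped.lower()
    if not (lowered.startswith("select") or lowered.startswith("with")):
        return False
    tokens = lowered.replace("(", " ").replace(")", " ").split()
    return not any(tok in FORBIDDEN for tok in tokens)


def run_with_fallback(conn, candidates):
    """Return (sql, rows) for the first candidate that validates and executes."""
    for sql in candidates:
        if not is_safe_select(sql):
            continue  # rejected before touching the database
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue  # execution failed, fall back to the next candidate
        return sql, rows
    return None, []


# Tiny in-memory table standing in for the partitioned tweet data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (day TEXT, hashtag TEXT)")
conn.executemany(
    "INSERT INTO tweets VALUES (?, ?)",
    [("2024-05-01", "#inflation"), ("2024-05-01", "#inflation"), ("2024-05-02", "#rates")],
)

# Hypothetical model outputs, ordered from first attempt to last fallback.
candidates = [
    "DROP TABLE tweets",
    "SELECT hashtag, COUNT(*) FROM twets GROUP BY hashtag",
    "SELECT hashtag, COUNT(*) AS n FROM tweets GROUP BY hashtag ORDER BY n DESC",
]
sql, rows = run_with_fallback(conn, candidates)
```

In this sketch the first candidate is rejected by the validator, the second fails at execution because of the misspelled table name, and the third is the one that runs.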

Project Tech stack:

Python

Machine learning

Deep Learning

LLM

Big Data

Lead AI Engineer

Feb 2024 - May 2024 (3 months)

Project Overview

A custom AI solution was developed to analyze sentiment across large-scale social media datasets, such as Twitter and Reddit. The project involved training advanced sentiment models tailored to domain-specific corpora, enabling the extraction of nuanced signals and attributes from online conversations. These models were deployed across tens of millions of posts to analyze sentiment trends over time. A novel data augmentation method was implemented by combining gold-standard labels with large language models (LLMs) to improve model performance and generalization. This system provided valuable insights into evolving public sentiment and discourse.

Responsibilities:

Led model architecture development and model training;
Created a custom data augmentation method using a combination of gold labels and LLMs;
Designed a custom retraining scheme and schedule to prevent model drift;
Added proper MLOps monitoring and telemetry;
Deployed the model to analyze sentiment in tens of millions of social media conversations across various platforms.
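
One common way to decide when a retraining schedule should fire is to compare the distribution of predicted sentiment classes against a baseline. The sketch below uses the Population Stability Index (PSI) for that comparison. The profile does not say which drift signal was used, so PSI, the 0.2 threshold (a frequently quoted rule of thumb), and the function names are all assumptions made for illustration.

```python
import math


def psi(expected, actual, eps=1e-6):
    """Population Stability Index between two class-share distributions."""
    total = 0.0
    for e, a in zip(expected, actual):
        e = max(e, eps)  # guard against log(0) and division by zero
        a = max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


def needs_retraining(baseline, current, threshold=0.2):
    """Flag drift when PSI exceeds the (assumed) threshold."""
    return psi(baseline, current) > threshold


# Shares of negative / neutral / positive predictions (made-up numbers).
baseline_shares = [0.5, 0.3, 0.2]
stable_shares = [0.5, 0.3, 0.2]
shifted_shares = [0.2, 0.3, 0.5]

stable_flag = needs_retraining(baseline_shares, stable_shares)
shifted_flag = needs_retraining(baseline_shares, shifted_shares)
```

A check like this would typically run on a schedule over recent predictions, with a flagged result triggering the retraining job.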

Project Tech stack:

Python

PyTorch

AWS

Lead Engineer

Dec 2023 - Feb 2024 (2 months)

Project Overview

A RAG-based application was developed to identify and surface political bias in both published news articles and social media content. The system embedded and classified tens of thousands of articles from a wide range of U.S. news outlets, analyzing political bias across multiple dimensions validated by academic research. It enabled the tracking of bias and sentiment trends over time. The solution was powered by agentic infrastructure that autonomously analyzed, aggregated, and synthesized relevant data sources to generate comprehensive and explainable insights.

Responsibilities:

Created end-to-end RAG application;
Designed an AWS-powered data ingestion and processing pipeline, responsible for sourcing information from dozens of news sources;
Created agentic infrastructure to analyze, synthesize, and aggregate various data sources over time.
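
The retrieval half of a RAG application can be illustrated with a toy example. This is only a sketch under stated assumptions: a bag-of-words vector stands in for a real embedding model, a Python list stands in for a vector database, and embed, cosine, retrieve, and build_prompt are hypothetical names, not the project's code.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a real system would call an embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, docs, k=2):
    """Assemble retrieved context and the question into one prompt string."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Made-up headlines standing in for the embedded news articles.
articles = [
    "Senate passes budget bill after long debate",
    "Local team wins championship game",
    "Budget debate divides Senate leaders",
]
top = retrieve("senate budget debate", articles, k=2)
prompt = build_prompt("senate budget debate", articles, k=2)
```

The prompt produced by build_prompt would then be passed to the language model, which answers using only the retrieved articles as context.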

Project Tech stack:

Python

RAG

AWS

Senior ML Engineer

Dec 2022 - Jul 2023 (6 months)

Project Overview

A modular, production-grade ML pipeline was designed to process multimodal communication data such as Zoom transcripts and emails. The system performs low-latency inference using machine learning models to generate actionable follow-up suggestions and sentiment insights for end users. It incorporates AI-powered agentic components across various communication platforms to provide a unified and intelligent view of user interactions.

Responsibilities:

Designed and deployed a scalable ML pipeline integrating Zoom and email data into contextual AI suggestions;
Developed models for sentiment classification and follow-up generation;
Built and managed a model serving infrastructure;
Implemented Redis and S3-based hybrid feature store with online/offline sync, using Feast for feature management;
Tracked and versioned models and metrics with MLflow, enabling reproducible experiments and staged deployment;
Orchestrated ingestion, preprocessing, and inference via AWS Step Functions and Prefect;
Applied Terraform for infrastructure-as-code and modular deployment of all components.

Project Tech stack:

Python

Machine learning

PyTorch

Big Data

AWS

Docker

Terraform

Keep in mind, the experience summary might exclude non-relevant projects

Education

2020: Bachelor's, Statistics

2025: Master's, Computer Science

Languages

English: Advanced
