Mark – Python, SQL, AWS, experts in Lemon.io

Mark

From United States (UTC-4)

AI Engineer: Strong senior
Machine Learning Engineer: Middle-to-senior
5 years of commercial experience
AI
Computer science
Edtech
Govtech
Sales
Social media
NLP software
Lemon.io stats

Mark – Python, SQL, AWS

Mark is a skilled AI/ML Engineer with strong senior-level experience in building and integrating GenAI and classical machine learning systems. He has a solid track record working with startups, where his proactive mindset and broad technical range have been key to delivering impactful solutions. Mark brings hands-on expertise in LLM-based workflows, including RAG pipelines, prompt engineering, agentic orchestration, and vector database design. He combines a practical understanding of Transformers and attention mechanisms with a system-level view of AI pipelines — from data enrichment and drift handling to evaluation and deployment. His classical ML foundation further strengthens his ability to deliver end-to-end solutions in real-world, fast-paced environments.

Main technologies
Python
5 years
SQL
5 years
AWS
5 years
AI
5 years
LLM
3 years
OpenAI
3 years
NLP
4 years
Additional skills
GPT
Deep Learning
Big Data
Machine learning
Docker
PyTorch
Terraform
RAG
LangChain
Hubspot
Salesforce
Ready to start
June 1st, 2025
Direct hire
Potentially possible

Experience Highlights

Lead Engineer
May 2024 - Dec 2024 (7 months)
Project Overview

A scalable system was developed to enable natural language querying over large-scale tweet datasets (~2TB) stored in Parquet format. The solution empowered non-technical users, such as researchers and analysts, to run flexible, human-language queries like “What are the top hashtags about inflation last month?” and receive accurate, explainable results. Leveraging LLMs to translate natural language into SQL, the system executed these queries over the database and returned well-structured responses, streamlining data access and analysis.

Responsibilities:
  • Designed and implemented a stateless query pipeline over partitioned tweet data using DuckDB and GPT-based SQL generation;
  • Led architectural decisions for scaling from 12GB to over 1TB of data with under-1-minute latency constraints;
  • Developed a per-day sharded aggregation pattern and a materialized view strategy to support real-time analysis;
  • Built prompt-engineered GPT workflows for robust NL-to-SQL translation, with query fallback and validation;
  • Integrated caching, semantic filtering, and keyword indexing to reduce scan costs and improve UX;
  • Enabled cross-functional stakeholders to query and explore social media data without writing code.
Project Tech stack:
Python
AI
Machine learning
Deep Learning
LLM
Big Data
Lead AI Engineer
Feb 2024 - May 2024 (3 months)
Project Overview

A custom AI solution was developed to analyze sentiment across large-scale social media datasets, such as Twitter and Reddit. The project involved training advanced sentiment models tailored to domain-specific corpora, enabling the extraction of nuanced signals and attributes from online conversations. These models were deployed across tens of millions of posts to analyze sentiment trends over time. A novel data augmentation method was implemented by combining gold-standard labels with large language models (LLMs) to improve model performance and generalization. This system provided valuable insights into evolving public sentiment and discourse.

Responsibilities:
  • Led model architecture development and model training;
  • Created a custom data augmentation method using a combination of gold labels and LLMs;
  • Designed a custom retraining scheme and schedule to mitigate model drift;
  • Added MLOps monitoring and telemetry;
  • Deployed the model to analyze sentiment in tens of millions of social media conversations across various platforms.
Project Tech stack:
Python
PyTorch
AWS
Lead Engineer
Dec 2023 - Feb 2024 (2 months)
Project Overview

A RAG-based application was developed to identify and surface political bias in both published news articles and social media content. The system embedded and classified tens of thousands of articles from a wide range of U.S. news outlets, analyzing political bias across multiple dimensions validated by academic research. It enabled the tracking of bias and sentiment trends over time. The solution was powered by agentic infrastructure that autonomously analyzed, aggregated, and synthesized relevant data sources to generate comprehensive and explainable insights.

Responsibilities:
  • Created an end-to-end RAG application;
  • Designed an AWS-powered data ingestion and processing pipeline, responsible for sourcing information from dozens of news sources;
  • Created agentic infrastructure to analyze, synthesize, and aggregate various data sources over time.
Project Tech stack:
Python
RAG
AI
ML
AWS
Senior ML Engineer
Dec 2022 - Jul 2023 (6 months)
Project Overview

A modular, production-grade ML pipeline was designed to process multimodal communication data such as Zoom transcripts and emails. The system performs low-latency inference using machine learning models to generate actionable follow-up suggestions and sentiment insights for end users. It incorporates AI-powered agentic components across various communication platforms to provide a unified and intelligent view of user interactions.

Responsibilities:
  • Designed and deployed a scalable ML pipeline integrating Zoom and email data into contextual AI suggestions;
  • Developed models for sentiment classification and follow-up generation;
  • Built and managed a model serving infrastructure;
  • Implemented a hybrid Redis- and S3-based feature store with online/offline sync, using Feast for feature management;
  • Tracked and versioned models and metrics with MLflow, enabling reproducible experiments and staged deployment;
  • Orchestrated ingestion, preprocessing, and inference via AWS Step Functions and Prefect;
  • Applied Terraform for infrastructure-as-code and modular deployment of all components.
Project Tech stack:
Python
AI
Machine learning
ML
PyTorch
Big Data
AWS
Docker
Terraform

Education

2020
Statistics
Bachelor's
2025
Computer Science
Master's

Languages

English
Advanced

Copyright © 2025 lemon.io. All rights reserved.