Vinay – AI agent development, LLM, LangChain, experts in Lemon.io

Vinay

From Ireland (UTC+1)

AI Engineer | Senior
Machine Learning Engineer | Senior
AI Agent Architect | Strong senior


Vinay brings 10 years of experience in AI/ML engineering, with a strong focus on production-ready AI agent systems, RAG architectures, and LLM integration. He's built enterprise-scale solutions across healthcare and fintech — including custom frameworks — and is comfortable working across the full stack from Python and FastAPI to vector databases and compliance-sensitive design. What sets Vinay apart is how he combines deep technical chops with a consultative, client-facing style. He's led hands-on projects in both startup and enterprise environments, and tends to gravitate toward pragmatic, end-to-end ownership — from architecture decisions all the way through to delivery.

10 years of commercial experience in
AI
Analytics
Banking
Consumer services
Fintech
Healthcare
Healthtech
Human resources
Insurance
Recruiting
Chatbots
HRMS
Main technologies
AI agent development: 2.5 years
LLM: 2 years
LangChain: 2 years
LangGraph: 2 years
Pinecone: 1 year
AI system design: 2.5 years
MCP: 1.5 years
Python: 8 years
Additional skills
FastAPI
OpenAI API
Multi-Agent Systems
AI agent orchestration
Multi-agent systems architecture
AI telemetry
RAG
Vector Databases
AWS Lambda
FastMCP
Qdrant
OpenAI
AI API integration
AWS
AI
AI chatbot development
Anthropic
Direct hire: Possible

Experience Highlights

AI Engineer / Founder
Oct 2025 - Apr 2026 (5 months)
Project Overview

An AI hiring platform for SMB retail and QSR teams in the US and Canada. It enables low-latency voice-based candidate screening, signal-based scoring across stated, demonstrated, and behavioral dimensions, and hybrid human-in-the-loop interview workflows.

Project gallery: Portfolio example for Awesom Hires by Vinay, AI Engineer & Founder
Responsibilities:
  • Implemented signals-based candidate scoring across stated, demonstrated, and behavioral dimensions;
  • Built voice AI interviewer agents (Avon and Avey) with low-latency turn-taking and interruption handling;
  • Designed a pluggable workflow engine using strategy and factory patterns for customizable hiring pipelines;
  • Integrated ATS providers via Merge.dev for unified candidate data synchronization;
  • Implemented credit-based usage tracking and billing integration;
  • Hardened voice interviewers against prompt injection and jailbreak attacks;
  • Shipped a human-in-the-loop interview room for hybrid AI and human screening;
  • Built a multi-tenant FastAPI backend with PostgreSQL and pgvector for semantic candidate matching;
  • Implemented the voice pipeline with streaming STT, LLM routing, and ElevenLabs TTS;
  • Integrated Langfuse observability across voice and scoring pipelines;
  • Containerized services with Docker and deployed to AWS with CI/CD via GitHub Actions;
  • Tuned retrieval and prompt strategy to keep per-interview LLM costs predictable.
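As a rough illustration of the strategy and factory patterns named in the workflow-engine bullet above, a pluggable pipeline might be sketched as follows. This is not the project's code: every class, function, and step name here is hypothetical, and the steps only tag a candidate record instead of doing real work.

```python
from abc import ABC, abstractmethod


class ScreeningStep(ABC):
    """Strategy interface: one interchangeable stage of a hiring pipeline."""

    @abstractmethod
    def run(self, candidate: dict) -> dict:
        ...


class VoiceInterviewStep(ScreeningStep):
    def run(self, candidate: dict) -> dict:
        # Placeholder behavior: a real step would start a voice interview.
        return {**candidate, "voice_interview": "scheduled"}


class HumanReviewStep(ScreeningStep):
    def run(self, candidate: dict) -> dict:
        # Placeholder behavior: a real step would enqueue a human review.
        return {**candidate, "human_review": "queued"}


# Factory registry (hypothetical keys): maps a config name to a step class.
STEP_REGISTRY = {
    "voice_interview": VoiceInterviewStep,
    "human_review": HumanReviewStep,
}


def build_pipeline(step_names: list[str]) -> list[ScreeningStep]:
    """Factory: turn configured step names into step objects."""
    unknown = [name for name in step_names if name not in STEP_REGISTRY]
    if unknown:
        raise ValueError(f"Unknown pipeline steps: {unknown}")
    return [STEP_REGISTRY[name]() for name in step_names]


def run_pipeline(candidate: dict, steps: list[ScreeningStep]) -> dict:
    """Apply each step in order, passing the updated record along."""
    for step in steps:
        candidate = step.run(candidate)
    return candidate
```

With this shape, a customer-specific pipeline is just a different list of step names, for example `build_pipeline(["voice_interview", "human_review"])`, and new steps can be added to the registry without touching the runner.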
Project Tech stack:
Python
FastMCP
FastAPI
AI
AI agent development
RAG
Multi-Agent Systems
AI agent orchestration
AI chatbot development
AWS
Docker
PostgreSQL
Redis
React
Vite
AI Engineer / Consultant
Jan 2026 - Mar 2026 (1 month)
Project Overview

An AI recruitment platform used by multiple clients on a daily basis. As an AI consultant and engineer, he helped standardize multilingual support across 7+ AI services, established Langfuse as the main observability and prompt-governance layer, improved voice AI cost efficiency through prompt caching, and strengthened context handling and safety defenses across the platform. He also built a golden-dataset evaluation framework for prompt regression testing and resolved several production incidents through formal root-cause analysis, contributing to a 35% overall LLM cost reduction per candidate interview.

Responsibilities:
  • Introduced AI governance and traceability with Langfuse across 10+ microservices;
  • Implemented layered memory for agents to learn user-specific behavior via tool calling;
  • Implemented prompt caching for conversational Voice AI, reducing LLM costs by 30%;
  • Implemented strict guardrails and safety boundaries to prevent system abuse;
  • Improved RAG retrieval with agentic RAG loops and RRF combining BM25 and semantic matches;
  • Shipped to production with complete observability and tracing;
  • Implemented PII redaction using Presidio with a local LLM to protect user data;
  • Added multilingual Voice AI support for Spanish, French, and German;
  • Implemented intelligent model selection per use case, reducing overall LLM costs by 5%.
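For readers unfamiliar with RRF (reciprocal rank fusion), the technique mentioned in the retrieval bullet above can be sketched in a few lines. This is a generic illustration, not the platform's implementation: the function name, the document ids, and the smoothing constant `k = 60` (a commonly used default) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k = 60 is an assumed default.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a BM25 search and a semantic search.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
semantic_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([bm25_hits, semantic_hits])
```

Documents that rank well in both lists (here `doc_b` and `doc_a`) rise above documents that appear in only one, which is the main reason to fuse ranks instead of raw scores from two differently scaled retrievers.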
Project Tech stack:
AWS Lambda
Amazon EC2
Python
FastAPI
Claude API
OpenAI
AI API integration
AI telemetry
Claude LLM
OpenAI API
LangGraph
REST API
AI Engineer
Aug 2025 - Feb 2026 (6 months)
Project Overview

An FCA-compliant multi-agent AI platform for UK fintech users. It routes requests through an orchestrator agent to specialized workers for transaction analysis, subscription tracking, savings coaching, and money leak detection, with guardrails for PII redaction and compliance. The platform integrates with open banking and transaction enrichment services, uses token-based subscription tiers, and runs in production on AWS with Langfuse observability.

Responsibilities:
  • Designed a multi-agent architecture in LangGraph using orchestrator-worker and hierarchical agent patterns;
  • Built specialized agents for transaction analysis, subscription tracking, savings coaching, and money leak detection;
  • Implemented FCA-compliant input and output guardrails with PII redaction and prompt injection defense;
  • Integrated TrueLayer for open banking and Ntropy for transaction enrichment;
  • Developed a FastAPI backend with async LLM orchestration and streaming responses;
  • Deployed on AWS ECS with Auto Scaling Groups and Celery workers for background processing;
  • Built Langfuse observability for per-agent tracing, token cost tracking, and latency monitoring;
  • Designed token-based subscription economics across Free, Plus, and Pro tiers;
  • Implemented a PostgreSQL data layer and Redis session caching with rate limiting;
  • Wrote unit and integration tests for agent flows and compliance guardrails;
  • Set up CI/CD pipelines via GitHub Actions with infrastructure as code.
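The orchestrator-worker pattern listed above can be pictured with a small framework-free sketch. It deliberately does not use LangGraph, and it replaces the LLM-based routing decision with a keyword check; all function names, intent labels, and return strings are hypothetical.

```python
from typing import Callable

Worker = Callable[[str], str]


def transaction_worker(request: str) -> str:
    return "transaction_analysis: " + request


def subscription_worker(request: str) -> str:
    return "subscription_tracking: " + request


def savings_worker(request: str) -> str:
    return "savings_coaching: " + request


# Registry of specialized workers, keyed by an assumed intent label.
WORKERS: dict[str, Worker] = {
    "transactions": transaction_worker,
    "subscriptions": subscription_worker,
    "savings": savings_worker,
}


def classify_intent(request: str) -> str:
    """Stand-in for the orchestrator's routing step.

    A production orchestrator would likely ask an LLM to choose the worker;
    a keyword check keeps this sketch runnable offline.
    """
    text = request.lower()
    if "subscription" in text:
        return "subscriptions"
    if "save" in text or "saving" in text:
        return "savings"
    return "transactions"


def orchestrate(request: str) -> str:
    """Route the request to exactly one worker and return its result."""
    return WORKERS[classify_intent(request)](request)
```

For example, `orchestrate("Cancel my gym subscription")` is handled by `subscription_worker`, while a request that matches no keyword falls through to `transaction_worker`.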
Project Tech stack:
Python
LangChain
LangGraph
FastAPI
React
React Native
OpenAI
Claude API
AI API integration
AI agent orchestration
Multi-Agent Systems
Senior AI Engineer / Lead
May 2024 - Apr 2025 (10 months)
Project Overview

A production RAG system for healthcare insurance documentation, built to support executive-facing Q&A over 100,000+ policy and regulatory documents. It uses hybrid retrieval, reranking, metadata-preserving chunking, source citation, and confidence-based refusal logic to improve answer quality and reduce hallucinations. The platform also includes HIPAA-aligned data handling, PII protection, and an evaluation pipeline for retrieval recall and answer faithfulness.

Responsibilities:
  • Built a production RAG pipeline over 100,000+ healthcare insurance documents;
  • Implemented hybrid retrieval combining BM25 sparse search and dense semantic search;
  • Added a reranking layer to improve top-k relevance for complex policy queries;
  • Designed a metadata-preserving chunking strategy for regulatory and policy documents;
  • Implemented source citation and confidence-threshold-based refusal logic to control hallucinations;
  • Implemented a self-evaluation loop using LLM-as-a-judge for relevance checks;
  • Ensured HIPAA-aligned data handling and PII protection across the pipeline;
  • Built an evaluation pipeline measuring retrieval recall and answer faithfulness;
  • Developed a FastAPI backend with pgvector for embedding storage and similarity search;
  • Integrated observability and tracing across retrieval and generation stages;
  • Tuned prompts and retrieval parameters to optimize cost and latency for executive-facing use cases.
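One simple way to read "source citation and confidence-threshold-based refusal" is sketched below. It is an assumption-heavy illustration, not the system's logic: the `RetrievedChunk` type, the 0.7 threshold, the refusal wording, and the idea that confidence is a reranker score in [0, 1] are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RetrievedChunk:
    source: str   # document identifier used for the citation
    text: str     # chunk content (unused here, kept for realism)
    score: float  # assumed relevance score in [0, 1], e.g. from a reranker


REFUSAL = "I cannot answer this confidently from the available documents."


def answer_with_citations(
    draft_answer: str,
    chunks: list[RetrievedChunk],
    threshold: float = 0.7,  # assumed cutoff; would be tuned on evaluation data
) -> str:
    """Return the draft answer with citations, or refuse if support is weak."""
    supporting = [chunk for chunk in chunks if chunk.score >= threshold]
    if not supporting:
        # No chunk clears the threshold: refuse instead of risking a guess.
        return REFUSAL
    citations = ", ".join(sorted({chunk.source for chunk in supporting}))
    return f"{draft_answer} [sources: {citations}]"
```

Only chunks that clear the threshold are cited, so a low-scoring passage never appears as evidence, and raising the threshold trades answer coverage for fewer unsupported answers.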
Project Tech stack:
Python
FastAPI
OpenAI
Anthropic
AI API integration
PostgreSQL
Qdrant
Kubernetes
Docker
Next.js

Education

2022
Cloud Computing and Machine Learning
Post Graduate Diploma
2016
Information Science
Bachelor of Engineering

Languages

English
Advanced

Copyright © 2026 lemon.io. All rights reserved.