Fernando – AI agent orchestration, Multi-agent systems architecture, AI telemetry, experts in Lemon.io

Fernando

From Brazil (UTC-3)

AI Agent Architect | Senior
AI Engineer | Senior

Fernando is a senior AI Agent Architect with 8 years of experience and deep expertise in Python, LLMs, multi-agent systems architecture, AI agent orchestration, and RAG. He has led end-to-end delivery of AI-driven platforms in healthtech and legal domains, demonstrating strong product judgment, stakeholder communication, and technical ownership.

8 years of commercial experience in
AI
Healthtech
Legal tech
Productivity
UI/UX
B2B
B2C
AI software
Mobile apps
SaaS
Web development
Software development
Main technologies
AI agent orchestration
2 years
Multi-agent systems architecture
2 years
AI telemetry
2 years
Python
7 years
LLM
2 years
Additional skills
RAG
MCP
LLM orchestration
AI agent development
LangChain
Anthropic
FastAPI
OpenAI
LangGraph
CI/CD
Vector Databases
Prompt engineering
Firebase
LLM integration
Claude Code
Docker
SQLAlchemy
GitHub Actions
Multi-Agent Systems
Voice AI integration
Direct hire
Possible

Experience Highlights

Senior AI Engineer
Feb 2026 - May 2026 (3 months)
Project Overview

A personal AI assistant and multi-agent orchestration layer built around Claude Code and a LangGraph-based agent core, coordinating multiple simultaneous LLM agent instances on long-running development tasks. The system combines an autonomous "brain" agent with a fleet of Claude Code sessions, exposing them through a FastAPI backend (REST + WebSocket), a Typer CLI, and a Flutter client for desktop and mobile. It supports always-on, wake-word-activated voice interaction and event-based triggers, real-time streaming of agent output, self-evolving tool creation, permission-gated tool execution, autonomous decision-making, long-term semantic memory, and multi-LLM routing across providers, all packaged as a one-command installable distribution.

The project explores production patterns for agentic systems beyond single-agent prompting, focusing on observability, controlled autonomy, extensibility, and reliability in multi-agent workflows.
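As an illustration of the permission-gated tool execution described above, here is a minimal sketch of the general pattern. It is not the project's code: `ToolRegistry`, `Policy`, `PermissionDenied`, and the `approver` callback are hypothetical names chosen for this example, and the sketch only shows how an approval policy can be checked before a tool runs.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict


class Policy(Enum):
    ALLOW = "allow"  # run without asking
    ASK = "ask"      # require explicit approval for every call
    DENY = "deny"    # never run


class PermissionDenied(Exception):
    """Raised when a tool call is blocked by policy or by the approver."""


@dataclass
class ToolRegistry:
    # Hypothetical approval callback: (tool_name, kwargs) -> True if approved.
    approver: Callable[[str, dict], bool]
    _tools: Dict[str, Callable[..., object]] = field(default_factory=dict)
    _policies: Dict[str, Policy] = field(default_factory=dict)

    def register(self, name: str, fn: Callable[..., object],
                 policy: Policy = Policy.ASK) -> None:
        """Register a tool; tools default to requiring approval."""
        self._tools[name] = fn
        self._policies[name] = policy

    def call(self, name: str, **kwargs):
        """Run a tool only if its policy (and the approver, if asked) allows it."""
        policy = self._policies.get(name, Policy.DENY)  # unknown tools are denied
        if policy is Policy.DENY:
            raise PermissionDenied(f"{name} is denied by policy")
        if policy is Policy.ASK and not self.approver(name, kwargs):
            raise PermissionDenied(f"{name} was not approved")
        return self._tools[name](**kwargs)
```

In this sketch a sensitive tool would be registered with `Policy.ASK`, so every call passes through the approver, while a read-only tool could use `Policy.ALLOW`. Denying unknown tool names by default keeps tools created at runtime from bypassing the gate.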

Responsibilities:
  • Architected and built a multi-agent orchestration layer coordinating multiple Claude Code instances in parallel, with an autonomous "brain" agent on top of LangGraph with persistent checkpointing, capable of monitoring sessions and deciding when to act, respond, or escalate.
  • Designed an extensible tool ecosystem combining a unified tool registry (filesystem, shell, web, PTY, semantic memory), a self-evolving system where the agent autonomously designs and registers new tools at runtime, and an MCP (Model Context Protocol) client for interoperability with external tool servers.
  • Implemented a multi-LLM routing layer dispatching tasks across providers (Anthropic, OpenAI, local models) based on cost, latency, and task profile.
  • Built long-term semantic memory with RAG using a local vector store (Chroma + sentence-transformers), and applied advanced prompt engineering with persona- and config-driven behavior (YAML-based prompts, tools, and triggers) to improve tool selection accuracy and decision quality.
  • Developed real-time WebSocket streaming for agent output with per-session subscribe/unsubscribe and connection lifecycle handling, alongside a permission-gated tool execution model with explicit approval policies for sensitive actions.
  • Integrated a full voice interaction stack: speech-to-text and text-to-speech (OpenAI Whisper + ElevenLabs) with a bidirectional audio I/O pipeline and always-on wake-word activation (e.g. "Hey Charles") via a continuous low-power keyword spotter for hands-free use.
  • Built async backend infrastructure with FastAPI for concurrent agent session management, including metrics, admin, and event-trigger subsystems.
  • Developed a Flutter client (mobile/desktop) consuming the same REST + WebSocket API as the CLI, and packaged the system as a one-command installable distribution with a macOS launchd service for background operation.
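
The multi-LLM routing responsibility above (dispatching by cost, latency, and task profile) could look roughly like the following sketch. The `Provider` fields, the weights, and the provider names and numbers in the usage note are all illustrative assumptions, not the project's actual configuration or real pricing.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Provider:
    name: str
    cost_per_1k_tokens: float    # illustrative units, not real pricing
    typical_latency_s: float     # assumed average response time
    capabilities: FrozenSet[str]  # task profiles this provider is considered fit for


def route(task_profile: str,
          providers: List[Provider],
          max_latency_s: Optional[float] = None,
          cost_weight: float = 1.0,
          latency_weight: float = 0.1) -> Provider:
    """Pick the cheapest-scoring provider that can handle the task profile."""
    candidates = [
        p for p in providers
        if task_profile in p.capabilities
        and (max_latency_s is None or p.typical_latency_s <= max_latency_s)
    ]
    if not candidates:
        raise LookupError(f"no provider can handle {task_profile!r}")
    # Lower score is better: a weighted sum of cost and latency.
    return min(
        candidates,
        key=lambda p: cost_weight * p.cost_per_1k_tokens
        + latency_weight * p.typical_latency_s,
    )
```

With three hypothetical providers, say a large hosted model, a small hosted model, and a local model, a call such as `route("summarize", providers, max_latency_s=2.0)` first filters by capability and latency budget and then picks the lowest weighted score. A real router would likely also handle provider failures and rate limits, which this sketch leaves out.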
Project Tech stack:
Python
FastAPI
LangGraph
LangChain
Claude Code
Claude API
OpenAI API
Pydantic
WebSocket
SQLite
RAG
MCP
Voice AI integration
Flutter
AI agent development
AI agent orchestration
Multi-Agent Systems
Tech Lead / AI & Full-Stack Engineer
Feb 2025 - Mar 2026 (1 year 1 month)
Project Overview

A healthtech platform for pediatricians in Brazil that combines a specialized clinical forum, AI-driven decision support, and intelligent case management. It bridges the gap between static medical content platforms and real clinical workflows by integrating AI directly into forum discussions, with autonomous participation, RAG over historical cases, and clinical modules for diagnostic analysis, medical chat, and case triage. The main challenge was creating AI that clinicians would trust and adopt in a high-stakes domain, while balancing technical reliability with product judgment.

Project gallery:
Portfolio example for Conecped by Fernando, Tech Lead - AI & Software Engineer
Responsibilities:
  • Led end-to-end architecture and delivery of the platform, owning AI integration, backend, and product decisions;
  • Designed and implemented AI clinical modules (diagnostic analysis, intelligent medical chat, case triage) with structured prompt engineering, chain-of-thought reasoning, output validation, and safety guardrails for medical content;
  • Refactored the AI layer through LangChain to standardize prompt templates, chains, and structured output parsing across all modules;
  • Built an autonomous AI agent that monitors forum discussions via Cloud Functions (used as an asynchronous message queue) and applies contribution heuristics to decide when to act versus stay silent;
  • Implemented a RAG pipeline over the forum's historical case base with embedding pipelines and vector search, grounding AI responses in past clinical discussions and surfacing similar cases at posting time;
  • Implemented subscription-based access via Stripe Billing, role-based authentication, and normalized transactional data modeling;
  • Built CI/CD pipelines with unit and integration testing, plus performance observability;
  • Shaped the product UX of AI participation, moving away from generic AI-style responses toward concise doctor-to-doctor communication patterns.
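
The "contribution heuristics" mentioned in the responsibilities above are not specified in this profile. As a hedged illustration of what such a gate might look like, the sketch below uses invented signals and thresholds (`ThreadState`, `min_score`, `quiet_minutes`, `max_human_replies` are all hypothetical) to decide whether an agent should reply to a thread or stay silent.

```python
from dataclasses import dataclass


@dataclass
class ThreadState:
    has_clinical_question: bool     # e.g. flagged by an upstream classifier
    human_replies: int              # replies from clinicians so far
    minutes_since_last_post: float
    best_retrieval_score: float     # similarity of the closest historical case, 0..1
    agent_already_replied: bool


def should_contribute(thread: ThreadState,
                      min_score: float = 0.75,
                      quiet_minutes: float = 30.0,
                      max_human_replies: int = 2) -> bool:
    """Return True only when a reply is likely to add grounded value."""
    if thread.agent_already_replied or not thread.has_clinical_question:
        return False
    if thread.best_retrieval_score < min_score:
        return False  # nothing well-grounded to add
    if thread.human_replies > max_human_replies:
        return False  # colleagues are already discussing the case
    # Give humans a chance to answer first.
    return thread.minutes_since_last_post >= quiet_minutes
```

The design choice illustrated here is that silence is the default: every check is a reason not to post, and the agent contributes only when all of them pass.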
Project Tech stack:
Python
LangChain
RAG
LLM integration
Vector Databases
Prompt engineering
Node.js
TypeScript
React
Flutter
Firebase
Stripe API
CI/CD
Lead AI Engineer / Co-founder
Nov 2025 - Mar 2026 (4 months)
Project Overview

An on-premises RAG assistant for law firms that cannot send confidential client data to external LLM providers. The product runs entirely locally, including a locally hosted LLM and a retrieval stack tuned for legal documents, and supports the full workflow from ingestion and semantic chunking to embedding generation, vector search, and retrieval-aware response generation. The main challenge was delivering production-grade RAG infrastructure under strict data isolation requirements, with no external API dependencies in the critical path.

Responsibilities:
  • Co-founded the project and led technical design end-to-end, including the full RAG architecture;
  • Designed and implemented the embedding generation pipeline tailored to legal document structure;
  • Built the vector search layer on Qdrant with semantic chunking strategies optimized for legal content;
  • Deployed a locally hosted LLM (Qwen3-32B) on Apple Silicon, evaluating model trade-offs against task and latency requirements;
  • Implemented retrieval-aware prompt engineering with citation grounding, ensuring responses reference specific source passages;
  • Integrated MCP-based document ingestion from Google Drive, enabling automated and structured document onboarding;
  • Architected the system to operate fully air-gapped, meeting the data isolation requirements of legal clients;
  • Defined product positioning and go-to-market approach in partnership with stakeholders, including hardware selection and operational cost modeling.
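
To make the citation-grounding responsibility above more concrete, here is a minimal sketch of one way to do it: number the retrieved passages in the prompt, then check that every citation in the model's answer points at a passage that was actually supplied. The function names, the prompt wording, and the `[passage-id]` citation format are assumptions for illustration, not the product's implementation.

```python
import re
from typing import Dict, List

# Matches citations such as [doc1-p3]; the id format is an assumption.
CITATION = re.compile(r"\[([A-Za-z0-9_.-]+)\]")


def build_prompt(question: str, passages: Dict[str, str]) -> str:
    """Assemble a retrieval-aware prompt that lists each passage with its id."""
    sources = "\n".join(f"[{pid}] {text}" for pid, text in passages.items())
    return (
        "Answer using only the sources below. "
        "Cite the supporting source id in square brackets after each claim.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )


def check_citations(answer: str, passages: Dict[str, str]) -> Dict[str, List[str]]:
    """Split cited ids into those that exist in the context and those that do not."""
    cited = set(CITATION.findall(answer))
    return {
        "valid": sorted(c for c in cited if c in passages),
        "unknown": sorted(c for c in cited if c not in passages),
    }


def is_grounded(answer: str, passages: Dict[str, str]) -> bool:
    """Grounded here means: at least one citation, and none pointing outside the context."""
    result = check_citations(answer, passages)
    return bool(result["valid"]) and not result["unknown"]
```

An answer that fails `is_grounded` could be regenerated or flagged for review. Note that this check only verifies that cited ids exist; it does not confirm that the cited passage actually supports the claim.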
Project Tech stack:
Python
Qdrant
LLM
AI API integration
Vector Databases
RAG
Prompt engineering
MCP
FastMCP
Senior AI & Full-Stack Engineer
Jun 2024 - Jun 2025 (1 year)
Project Overview

A no-code landing page builder that lets users describe what they need in natural language and generates fully structured, editable landing pages. It was built to turn LLM output into reliable, deterministic UI content that the editor can consume directly, with validation, fallbacks, and idempotent updates instead of free-form text. The product was designed and delivered as a single-founder effort, from the AI generation pipeline to the editor and deployment infrastructure.

Project gallery:
Portfolio example for Kreat.me by Fernando, Senior AI & Software Engineer
Responsibilities:
  • Built the frontend and backend of the platform, from initial concept to production deployment;
  • Designed the product strategy, user experience, and technical architecture end-to-end;
  • Engineered the LLM layer that translates natural language input into deterministic JSON/YAML mapped to UI components (layout, copy, CTAs, forms);
  • Implemented validation, fallback handling, content scoring, and idempotent updates to ensure reliable generation across varied user inputs;
  • Defined the structured output contract between the LLM and the rendering layer, enforcing schema validity end-to-end;
  • Built the editing flow that allows end users to iterate on AI-generated pages without breaking structural integrity;
  • Set up automated deployment pipelines with GitHub Actions for reliable releases;
  • Balanced model selection, prompt design, and latency to maintain responsive UX during interactive AI generation.
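
The structured output contract with validation and fallback handling described above can be sketched as follows. The section types, the required `title` field, and the fallback page are hypothetical; the profile does not document the real schema.

```python
import json
from typing import Any, Dict

# Hypothetical component types the renderer is assumed to understand.
ALLOWED_TYPES = ("hero", "features", "cta", "form")

# Hypothetical safe default used when the model output is unusable.
FALLBACK_PAGE = {"sections": [{"type": "hero", "title": "Untitled page"}]}


def validate_page(page: Any) -> Dict[str, Any]:
    """Raise ValueError unless the page matches the (assumed) contract."""
    if not isinstance(page, dict) or not isinstance(page.get("sections"), list):
        raise ValueError("page must be an object with a 'sections' list")
    if not page["sections"]:
        raise ValueError("page needs at least one section")
    for i, section in enumerate(page["sections"]):
        if not isinstance(section, dict):
            raise ValueError(f"section {i} must be an object")
        if section.get("type") not in ALLOWED_TYPES:
            raise ValueError(f"section {i} has an unsupported type")
        title = section.get("title")
        if not isinstance(title, str) or not title.strip():
            raise ValueError(f"section {i} needs a non-empty title")
    return page


def parse_llm_output(raw: str) -> Dict[str, Any]:
    """Parse and validate model output, falling back to a safe default page."""
    try:
        return validate_page(json.loads(raw))
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return json.loads(json.dumps(FALLBACK_PAGE))  # fresh copy of the fallback
```

Because the renderer only ever receives output that passed `validate_page` (or the fallback), free-form or malformed model text cannot reach the UI layer. A production version might retry generation with the validation error before falling back.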
Project Tech stack:
Python
React
Flutter Web
LLM
Prompt engineering
Data Structures
JavaScript
TypeScript
Node.js
Firebase
GitHub Actions

Education

2022
Electronics and Automation Engineering
Master's degree.

Languages

French
Advanced
Portuguese
Advanced
English
Advanced

Copyright © 2026 lemon.io. All rights reserved.