Fernando – AI agent orchestration, Multi-agent systems architecture, AI telemetry, experts in Lemon.io

Fernando

From Brazil (UTC-3)

AI Agent Architect | Senior

AI Engineer | Senior

Skills and seniority verified on May 14, 2026

Fernando is a senior AI Agent Architect with 8 years of experience and deep expertise in Python, LLMs, multi-agent systems architecture, AI agent orchestration, and RAG. He has led end-to-end delivery of AI-driven platforms in healthtech and legal domains, demonstrating strong product judgment, stakeholder communication, and technical ownership.

8 years of commercial experience in

AI

Healthtech

Legal tech

Productivity

UI/UX

B2B

B2C

AI software

Mobile apps

SaaS

Web development

Software development

Main technologies

AI agent orchestration

2 years

Multi-agent systems architecture

2 years

AI telemetry

2 years

Python

7 years

LLM

2 years

Additional skills

RAG

MCP

LLM orchestration

AI agent development

LangChain

Anthropic

FastAPI

OpenAI

LangGraph

CI/CD

Vector Databases

Prompt engineering

Firebase

LLM integration

Claude Code

Docker

SQLAlchemy

GitHub Actions

Multi-Agent Systems

Voice AI integration

Direct hire

Possible

Experience Highlights

Senior AI Engineer

Feb 2026 - May 2026 (3 months)

Project Overview

A personal AI assistant and multi-agent orchestration layer built around Claude Code and a LangGraph-based agent core, coordinating multiple simultaneous LLM agent instances on long-running development tasks. The system combines an autonomous "brain" agent with a fleet of Claude Code sessions, exposing them through a FastAPI backend (REST + WebSocket), a Typer CLI, and a Flutter client for desktop and mobile. It supports always-on, wake-word-activated voice interaction and event-based triggers, real-time streaming of agent output, self-evolving tool creation, permission-gated tool execution, autonomous decision-making, long-term semantic memory, and multi-LLM routing across providers, all packaged as a one-command installable distribution. The project explores production patterns for agentic systems beyond single-agent prompting, focusing on observability, controlled autonomy, extensibility, and reliability in multi-agent workflows.
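To make the idea of permission-gated tool execution concrete, here is a minimal sketch. It is not taken from the project: the `Policy`, `Tool`, and `ToolRegistry` names and the `approver` callback are hypothetical, and a real system would likely add logging, per-argument rules, and async handling.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict


class Policy(Enum):
    ALLOW = "allow"  # run without asking
    ASK = "ask"      # require explicit approval for each call
    DENY = "deny"    # never run


@dataclass
class Tool:
    name: str
    func: Callable[..., str]
    policy: Policy = Policy.ASK  # default to the cautious option


@dataclass
class ToolRegistry:
    # Hypothetical approval hook: receives the tool name and its arguments
    # and returns True only if the action is approved.
    approver: Callable[[str, dict], bool]
    tools: Dict[str, Tool] = field(default_factory=dict)

    def register(self, tool: Tool) -> None:
        self.tools[tool.name] = tool

    def execute(self, name: str, **kwargs) -> str:
        tool = self.tools[name]
        if tool.policy is Policy.DENY:
            raise PermissionError(f"tool '{name}' is denied by policy")
        if tool.policy is Policy.ASK and not self.approver(name, kwargs):
            raise PermissionError(f"tool '{name}' was not approved")
        return tool.func(**kwargs)


# Example: an approver that rejects everything, so only ALLOW tools run.
registry = ToolRegistry(approver=lambda name, args: False)
registry.register(Tool("read_file", lambda path: f"read {path}", Policy.ALLOW))
registry.register(Tool("shell", lambda cmd: f"ran {cmd}", Policy.ASK))
```

In this sketch the gate sits in a single `execute` method, so every tool call passes through the same policy check regardless of which agent requested it.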

Responsibilities:

Architected and built a multi-agent orchestration layer coordinating multiple Claude Code instances in parallel, with an autonomous "brain" agent on top of LangGraph with persistent checkpointing, capable of monitoring sessions and deciding when to act, respond, or escalate.
Designed an extensible tool ecosystem combining a unified tool registry (filesystem, shell, web, PTY, semantic memory), a self-evolving system where the agent autonomously designs and registers new tools at runtime, and an MCP (Model Context Protocol) client for interoperability with external tool servers.
Implemented a multi-LLM routing layer dispatching tasks across providers (Anthropic, OpenAI, local models) based on cost, latency, and task profile.
Built long-term semantic memory with RAG using a local vector store (Chroma + sentence-transformers), and applied prompt engineering with persona- and config-driven behavior (YAML-based prompts, tools, and triggers) to improve tool selection accuracy and decision quality.
Developed real-time WebSocket streaming for agent output with per-session subscribe/unsubscribe and connection lifecycle handling, alongside a permission-gated tool execution model with explicit approval policies for sensitive actions.
Integrated a full voice interaction stack: speech-to-text and text-to-speech (OpenAI Whisper + ElevenLabs) with a bidirectional audio I/O pipeline and always-on wake-word activation (e.g. "Hey Charles") via a continuous low-power keyword spotter for hands-free use.
Built async backend infrastructure with FastAPI for concurrent agent session management, including metrics, admin, and event-trigger subsystems.
Developed a Flutter client (mobile/desktop) consuming the same REST + WebSocket API as the CLI, and packaged the system as a one-command installable distribution with a macOS launchd service for background operation.
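The multi-LLM routing mentioned in the responsibilities above (dispatch by cost, latency, and task profile) could look roughly like the following sketch. The provider names, numbers, and scoring formula are illustrative assumptions, not real pricing or the project's actual logic.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Provider:
    name: str
    cost_per_1k_tokens: float  # illustrative value, not real pricing
    typical_latency_s: float   # illustrative value
    task_profiles: Set[str]    # task types this provider is assumed to handle well


def route(
    task_profile: str,
    providers: List[Provider],
    max_latency_s: float,
    cost_weight: float = 1.0,
    latency_weight: float = 1.0,
) -> Provider:
    """Pick the cheapest/fastest provider that fits the latency budget."""
    # 1. Hard constraint: drop providers that exceed the latency budget.
    candidates = [p for p in providers if p.typical_latency_s <= max_latency_s]
    if not candidates:
        raise ValueError("no provider meets the latency budget")
    # 2. Prefer providers matching the task profile; otherwise use any candidate.
    capable = [p for p in candidates if task_profile in p.task_profiles]
    pool = capable or candidates
    # 3. Soft constraint: minimize a weighted cost + latency score.
    return min(
        pool,
        key=lambda p: cost_weight * p.cost_per_1k_tokens
        + latency_weight * p.typical_latency_s,
    )


# Hypothetical provider table for demonstration only.
PROVIDERS = [
    Provider("local", 0.0, 4.0, {"summarize"}),
    Provider("hosted_a", 3.0, 1.5, {"code", "summarize"}),
    Provider("hosted_b", 1.0, 1.0, {"chat"}),
]

chosen = route("code", PROVIDERS, max_latency_s=5.0)
```

Treating latency as a hard budget and cost as a weighted score is one possible design; a production router would probably also account for rate limits, failures, and context-window size.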

Project Tech stack:

Python

FastAPI

LangGraph

LangChain

Claude Code

Claude API

OpenAI API

Pydantic

WebSocket

SQLite

RAG

MCP

Voice AI integration

Flutter

AI agent development

AI agent orchestration

Multi-Agent Systems

Tech Lead / AI & Full-Stack Engineer

Feb 2025 - Mar 2026 (1 year 1 month)

Project Overview

A healthtech platform for pediatricians in Brazil that combines a specialized clinical forum, AI-driven decision support, and intelligent case management. It bridges the gap between static medical content platforms and real clinical workflows by integrating AI directly into forum discussions, with autonomous participation, RAG over historical cases, and clinical modules for diagnostic analysis, medical chat, and case triage. The main challenge was creating AI that clinicians would trust and adopt in a high-stakes domain, while balancing technical reliability with product judgment.

Project gallery:

Portfolio example for Conecped by Fernando, Tech Lead - AI & Software Engineer

Responsibilities:

Led end-to-end architecture and delivery of the platform, owning AI integration, backend, and product decisions;
Designed and implemented AI clinical modules (diagnostic analysis, intelligent medical chat, case triage) with structured prompt engineering, chain-of-thought reasoning, output validation, and safety guardrails for medical content;
Refactored the AI layer with LangChain to standardize prompt templates, chains, and structured output parsing across all modules;
Built an autonomous AI agent that monitors forum discussions through Cloud Functions (used as an async message queue) and applies contribution heuristics to decide when to act versus stay silent;
Implemented a RAG pipeline over the forum's historical case base with embedding pipelines and vector search, grounding AI responses in past clinical discussions and surfacing similar cases at posting time;
Implemented subscription-based access via Stripe Billing, role-based authentication, and normalized transactional data modeling;
Built CI/CD pipelines with unit and integration testing, plus performance observability;
Shaped the product UX of AI participation, moving away from generic AI-style responses toward concise doctor-to-doctor communication patterns.
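The "surfacing similar cases at posting time" step from the RAG responsibility above can be illustrated with a small top-k cosine-similarity search. This is a simplified stand-in: the vectors are toy values, the `similar_cases` function and case IDs are hypothetical, and the actual platform would use an embedding model and a vector database rather than an in-memory dictionary.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def similar_cases(
    query_vec: Sequence[float],
    case_index: Dict[str, Sequence[float]],
    k: int = 3,
    min_score: float = 0.0,
) -> List[Tuple[str, float]]:
    """Return up to k (case_id, score) pairs, best match first."""
    scored = [(case_id, cosine(query_vec, vec)) for case_id, vec in case_index.items()]
    scored = [pair for pair in scored if pair[1] >= min_score]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings" for three past cases (hypothetical data).
CASE_INDEX = {
    "case-1": [1.0, 0.0, 0.0],
    "case-2": [0.9, 0.1, 0.0],
    "case-3": [0.0, 1.0, 0.0],
}

matches = similar_cases([1.0, 0.0, 0.0], CASE_INDEX, k=2)
```

The `min_score` threshold is one way to avoid showing weak matches to clinicians; the retrieved cases would then be passed to the model as grounding context.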

Project Tech stack:

Python

LangChain

RAG

LLM integration

Vector Databases

Prompt engineering

Node.js

TypeScript

React

Flutter

Firebase

Stripe API

CI/CD

Lead AI Engineer / Co-founder

Nov 2025 - Mar 2026 (4 months)

Project Overview

An on-premises RAG assistant for law firms that cannot send confidential client data to external LLM providers. The product runs entirely locally, including a hosted LLM and a retrieval stack tuned for legal documents, and supports the full workflow from ingestion and semantic chunking to embedding generation, vector search, and retrieval-aware response generation. The main challenge was delivering production-grade RAG infrastructure under strict data isolation requirements, with no external API dependencies in the critical path.

Responsibilities:

Co-founded the project and led technical design end-to-end, including the full RAG architecture;
Designed and implemented the embedding generation pipeline tailored to legal document structure;
Built the vector search layer on Qdrant with semantic chunking strategies optimized for legal content;
Deployed a locally hosted LLM (Qwen3-32B) on Apple Silicon, evaluating model trade-offs against task and latency requirements;
Implemented retrieval-aware prompt engineering with citation grounding, ensuring responses reference specific source passages;
Integrated MCP-based document ingestion from Google Drive, enabling automated and structured document onboarding;
Architected the system to operate fully air-gapped, meeting the data isolation requirements of legal clients;
Defined product positioning and go-to-market approach in partnership with stakeholders, including hardware selection and operational cost modeling.
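As an illustration of the citation-grounding idea in the responsibilities above, the sketch below numbers the retrieved passages in the prompt and then checks that an answer cites only passages that exist. The prompt wording, the `Passage` type, and the `is_grounded` check are assumptions made for this example, not the product's actual implementation.

```python
import re
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Passage:
    doc_id: str
    text: str


def build_prompt(question: str, passages: List[Passage]) -> str:
    """Number each retrieved passage so the model can cite it as [n]."""
    sources = "\n".join(
        f"[{i}] ({p.doc_id}) {p.text}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer using only the numbered sources below. "
        "Cite every claim with its source number in square brackets.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )


def cited_indices(answer: str) -> Set[int]:
    """Extract all [n] citation markers from the answer."""
    return {int(n) for n in re.findall(r"\[(\d+)\]", answer)}


def is_grounded(answer: str, passages: List[Passage]) -> bool:
    """True if the answer cites at least one source and every citation is valid."""
    cited = cited_indices(answer)
    return bool(cited) and all(1 <= i <= len(passages) for i in cited)


# Hypothetical passages for demonstration.
PASSAGES = [
    Passage("contract-a.pdf", "The notice period is 30 days."),
    Passage("contract-b.pdf", "Either party may terminate for cause."),
]
prompt = build_prompt("What is the notice period?", PASSAGES)
```

A check like `is_grounded` only confirms that citations point at real passages; verifying that the cited passage actually supports the claim would need an additional step.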

Project Tech stack:

Python

Qdrant

LLM

AI API integration

Vector Databases

RAG

Prompt engineering

MCP

FastMCP

Senior AI & Full-Stack Engineer

Jun 2024 - Jun 2025 (1 year)

Project Overview

A no-code landing page builder that lets users describe what they need in natural language and generates fully structured, editable landing pages. It was built to turn LLM output into reliable, deterministic UI content that the editor can consume directly, with validation, fallbacks, and idempotent updates instead of free-form text. The product was designed and delivered as a single-founder effort, from the AI generation pipeline to the editor and deployment infrastructure.

Project gallery:

Portfolio example for Kreat.me by Fernando, Senior AI & Software Engineer

Responsibilities:

Built the frontend and backend of the platform, from initial concept to deployment;
Designed the product strategy, user experience, and technical architecture end-to-end;
Engineered the LLM layer that translates natural language input into deterministic JSON/YAML mapped to UI components (layout, copy, CTAs, forms);
Implemented validation, fallback handling, content scoring, and idempotent updates to ensure reliable generation across varied user inputs;
Defined the structured output contract between the LLM and the rendering layer, enforcing schema validity end-to-end;
Built the editing flow that allows end users to iterate on AI-generated pages without breaking structural integrity;
Set up automated deployment pipelines with GitHub Actions for reliable releases;
Balanced model selection, prompt design, and latency to maintain responsive UX during interactive AI generation.
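The validation and fallback handling described above can be sketched as follows: parse the model's raw output as JSON, check it against a small structural contract, and fall back to a safe default page when anything is off. The section types, the `headline` field, and the fallback page are hypothetical; the real output contract is not documented here.

```python
import copy
import json
from typing import Any, Dict

# Hypothetical set of UI components the renderer is assumed to understand.
ALLOWED_TYPES = {"hero", "features", "cta", "form"}

# Safe default returned when the model output cannot be used.
FALLBACK_PAGE: Dict[str, Any] = {
    "sections": [{"type": "hero", "headline": "Untitled page"}]
}


def validate_page(page: Any) -> Dict[str, Any]:
    """Raise ValueError unless the page matches the expected structure."""
    if not isinstance(page, dict) or not isinstance(page.get("sections"), list):
        raise ValueError("page must be an object with a 'sections' list")
    if not page["sections"]:
        raise ValueError("page must contain at least one section")
    for section in page["sections"]:
        if not isinstance(section, dict) or section.get("type") not in ALLOWED_TYPES:
            raise ValueError("unknown or missing section type")
        headline = section.get("headline")
        if not isinstance(headline, str) or not headline.strip():
            raise ValueError("each section needs a non-empty headline")
    return page


def parse_llm_output(raw: str) -> Dict[str, Any]:
    """Return a validated page, or a copy of the fallback on any failure."""
    try:
        return validate_page(json.loads(raw))
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return copy.deepcopy(FALLBACK_PAGE)


good = parse_llm_output('{"sections": [{"type": "cta", "headline": "Sign up"}]}')
bad = parse_llm_output("Sure! Here is your landing page:")
```

Returning a deep copy of the fallback keeps later edits to one page from mutating the shared default.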

Project Tech stack:

Python

React

Flutter Web

LLM

Prompt engineering

Data Structures

JavaScript

TypeScript

Node.js

Firebase

GitHub Actions

Keep in mind, the experience summary might exclude non-relevant projects

Education

2022

Electronics and Automation Engineering

Master's degree.

Languages

French

Advanced

Portuguese

Advanced

English

Advanced
