One designer. Nine AI agents.
A 24/7 autonomous office.
A fully autonomous multi-agent AI system: 9 specialized agents running 24/7 on real infrastructure. One paid coordinator brain, eight free local workers, connected across cloud and GPU hardware via VPN.
I applied the same principles I use to design product teams to orchestrate AI agents: clear roles, communication protocols, information hierarchy, and a single source of truth.
Built from scratch. Running in production. Not a demo.
Chapter 1
The next frontier of AI isn't better chat. It's autonomous execution with memory, structure, and defined roles. The Frodo Project is my exploration into how multiple agents can form a coherent system with strategic oversight, clear delegation, and measurable outcomes: a digital team rather than a collection of isolated assistants.
The goal was not another chatbot. It was an intelligent operational layer with strategic awareness, persistent memory, and real autonomy. A team, not a tool.
Chapter 2
Building a multi-agent system isn't a prompt engineering exercise. It's a systems design problem. The same kind of problem I solve when I'm designing product teams, information architectures, or interaction flows, except the "team members" are AI models with very specific capabilities and limitations.
Chapter 3
Building an autonomous multi-agent system meant solving problems no documentation covered. Each milestone came with a failure that reshaped the architecture.
Started with a VPS (cloud server) and connected it to my home network using Tailscale VPN. Installed OpenClaw as the agent framework. Got the first agent, Frodo, running and responding to commands. The foundation was a Linux server that could talk to the outside world and run AI models.
Designed 9 specialized agents, each with a distinct identity, role, and skill set. Scout hunts jobs. Quill writes cover letters. Echo handles outreach. Forge builds code. Pixel designs. Every agent got an IDENTITY.md file defining its personality, responsibilities, and boundaries: the same way you'd write a design system, but for AI behavior.
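To make that concrete, here's a minimal sketch of how those identity files could be checked at startup. The directory layout, section headers, and agent folder names are my own assumptions for illustration, not OpenClaw's actual schema.

```python
# Sketch: validate each agent's IDENTITY.md before the system starts.
# Directory layout and required section headers are illustrative
# assumptions, not OpenClaw's real schema.
from pathlib import Path

REQUIRED_SECTIONS = ["## Personality", "## Responsibilities", "## Boundaries"]
AGENTS = ["frodo", "scout", "quill", "echo", "forge", "pixel"]

def validate_identity(agent_dir: Path) -> list[str]:
    """Return a list of problems found in one agent's IDENTITY.md."""
    problems = []
    identity = agent_dir / "IDENTITY.md"
    if not identity.exists():
        return [f"{agent_dir.name}: IDENTITY.md missing"]
    text = identity.read_text(encoding="utf-8")
    for section in REQUIRED_SECTIONS:
        if section not in text:
            problems.append(f"{agent_dir.name}: missing '{section}' section")
    return problems

if __name__ == "__main__":
    for name in AGENTS:
        for issue in validate_identity(Path("agents") / name):
            print("IDENTITY CHECK:", issue)
```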
Independent scheduled jobs with no shared memory meant hundreds of API requests per day, millions of tokens burned, and costs spiraling from processes running without oversight. The cloud bill doubled overnight.
Each session started from scratch. Agents re-researched completed work, repeated tasks, asked the same questions. No persistent context, no deduplication. Every morning was their first day on the job.
A social media agent was running every 30 minutes with no coordinator oversight. The coordinator would say "waiting for reports" while background crons fired independently, burning through API credits without its knowledge.
Root cause: cron jobs hiding in two layers, the app scheduler and the system crontab. Agents bypassing the coordinator entirely. No oversight, no deduplication. The equivalent of half your team working nights on projects nobody asked for.
Deleted every independent cron. Established one rule: Frodo controls everything. Sub-agents never self-trigger. Only the coordinator reads state, decides what needs to happen, spawns the right agent, reviews the output, and updates state.
This is the same principle that makes great product teams work. One decision-maker. Clear delegation. Every action traceable back to a decision.
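In code terms, the rule reduces to a single loop. A minimal sketch, where read_state, spawn_agent, and the rest are hypothetical stand-ins rather than OpenClaw's real API; the shape of the loop is the point.

```python
# Sketch of the single-coordinator pattern: only the coordinator reads
# state, decides, spawns a worker, reviews the output, and writes state
# back. All function bodies are illustrative stubs, not OpenClaw calls.

def read_state() -> list[dict]:
    # In the real system this parses the shared markdown state files.
    return [{"role": "scout", "task": "find new job postings", "done": False}]

def write_state(tasks: list[dict]) -> None:
    # Persist back to the single source of truth (stubbed out here).
    print("state updated:", tasks)

def spawn_agent(role: str, task: str) -> str:
    # Stand-in for launching one worker session and awaiting its output.
    return f"[{role}] completed: {task}"

def review(output: str) -> bool:
    # Coordinator sanity-checks the worker's output before accepting it.
    return bool(output.strip())

def coordinator_tick() -> None:
    tasks = read_state()
    pending = [t for t in tasks if not t["done"]]
    if not pending:
        return                      # nothing to do: no rogue crons firing
    task = pending[0]               # one decision, one delegation
    output = spawn_agent(task["role"], task["task"])
    if review(output):
        task["done"] = True
    write_state(tasks)

if __name__ == "__main__":
    coordinator_tick()              # in production: the only scheduled entry point
```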
Set up "DevMan", an AMD Ryzen 9 5900HX with RTX 3080 (16GB VRAM), as a dedicated local inference server. Ollama for model serving, connected to the VPS via Tailscale VPN, configured as a model provider in OpenClaw.
Result: worker agents run on free local models, only the coordinator uses the paid cloud API. Daily operating cost dropped from dollars to cents.
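Conceptually, the routing table is tiny. A rough sketch, where the hostname, model IDs, and dict format are my own shorthand rather than OpenClaw's actual provider configuration:

```python
# Sketch of hybrid inference routing: the coordinator gets the paid
# cloud model, every worker gets the free local Ollama endpoint over
# the tailnet. Hostname and config format are illustrative.
OLLAMA_URL = "http://devman:11434"   # GPU rig reachable via Tailscale (assumed hostname)

MODEL_ROUTES = {
    "frodo": {"provider": "gemini", "model": "gemini-2.5-flash"},  # paid brain
    "scout": {"provider": "ollama", "model": "gpt-oss:20b", "base_url": OLLAMA_URL},
    "quill": {"provider": "ollama", "model": "gpt-oss:20b", "base_url": OLLAMA_URL},
    "forge": {"provider": "ollama", "model": "qwen2.5-coder:32b", "base_url": OLLAMA_URL},
    # ...remaining workers all route to local models
}

def resolve_model(agent: str) -> dict:
    # Default any unknown agent to the free local tier, never the paid API.
    return MODEL_ROUTES.get(
        agent,
        {"provider": "ollama", "model": "gpt-oss:20b", "base_url": OLLAMA_URL},
    )
```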
Tested every model I could find. phi4:14b had no tool support at all, returning "400 does not support tools." gpt-oss:20b worked on the first tool call, then degraded, hallucinating function names like "container.exec" that don't exist. Kimi K2.5 struggled to follow multi-step commands.
The workaround: design tasks as single-shot shell scripts. If the model only needs to succeed at one tool call instead of five sequential ones, reliability goes way up. Design the system around the model's limitations, not against them.
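For example, a five-step research task can be compiled into one script the agent only has to execute and report on. A hedged sketch of that packaging step; paths, filenames, and the script's contents are illustrative.

```python
# Sketch: collapse a multi-step job into one shell script so a weak
# local model only has to succeed at a single tool call ("run this").
from pathlib import Path
import stat

def build_single_shot_task(urls: list[str], out_dir: str = "/tmp/scout") -> Path:
    """Write one script that does the fetching, filtering, and logging
    in a single invocation, instead of five sequential tool calls."""
    script = Path("/tmp/scout_task.sh")
    lines = ["#!/usr/bin/env bash", "set -euo pipefail", f"mkdir -p {out_dir}"]
    for i, url in enumerate(urls):
        lines.append(f"curl -sL '{url}' -o {out_dir}/page_{i}.html")
    lines.append(f"grep -ril 'product designer' {out_dir} >> job-board.md")
    script.write_text("\n".join(lines) + "\n")
    script.chmod(script.stat().st_mode | stat.S_IEXEC)
    return script

# The agent's prompt then reduces to: "run /tmp/scout_task.sh and report
# whether it exited 0" -- one tool call, one chance to fail.
```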
GPU memory conflicts between image generation and LLM serving caused Ollama's models to get evicted from VRAM. VPN connections went cold after inactivity. In both cases, the system silently fell back to the paid cloud API with no visible error. Everything looked fine on the surface while costs climbed invisibly.
The fix was layered: connection keepalives every 2 minutes, model usage monitoring (not just configuration), and alerts for unexpected fallbacks. The lesson: design for silent failures from the start.
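A minimal sketch of that monitoring layer, assuming Ollama's standard /api/tags endpoint on the GPU rig; the hostname, usage-log format, and alert hook are assumptions.

```python
# Sketch: keepalive + silent-fallback detection for the local inference path.
import json
import urllib.request
from pathlib import Path

OLLAMA_URL = "http://devman:11434"     # GPU rig over the tailnet (assumed hostname)
USAGE_LOG = Path("logs/model-usage.jsonl")   # hypothetical per-call log
LOCAL_WORKERS = {"scout", "quill", "echo", "forge", "pixel"}

def alert(message: str) -> None:
    print("ALERT:", message)           # swap in Telegram/email in practice

def keepalive() -> None:
    """Touch the Ollama API so the VPN path and the server stay warm."""
    try:
        with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags", timeout=10) as resp:
            json.load(resp)
    except OSError as exc:
        alert(f"Ollama unreachable over VPN: {exc}")

def check_for_silent_fallbacks() -> None:
    """Flag any worker call that ended up on the paid cloud model."""
    if not USAGE_LOG.exists():
        return
    for line in USAGE_LOG.read_text().splitlines():
        entry = json.loads(line)
        if entry["agent"] in LOCAL_WORKERS and entry["provider"] != "ollama":
            alert(f"{entry['agent']} silently fell back to {entry['provider']}")

if __name__ == "__main__":             # run every 2 minutes from the coordinator
    keepalive()
    check_for_silent_fallbacks()
```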
Built a real-time Next.js monitoring dashboard: agent status, cron health, activity feeds, model routing, and cost tracking. The recurring lesson from every roadblock was the same: you can't manage what you can't see. So I made everything visible.
Chapter 4
The final architecture follows a pattern I've used in every product team I've designed for: one clear leader, specialized roles, defined communication channels, and a single source of truth.
The design principle: One brain pays for quality. Eight workers run for free. The coordinator keeps reasoning-heavy decisions for itself and delegates volume tasks to local models. It's the same budget philosophy you'd apply to any team: senior talent on strategy, junior talent on execution.
Chapter 5
Every hard problem in this project mapped directly to a design principle I already use in product work. The skills that make great product designers transfer directly to AI systems architecture.
Each agent has an IDENTITY.md, a structured definition of personality, responsibilities, communication style, and boundaries. It's a design system for AI behavior. The same way design tokens prevent visual drift across a product, identity files prevent role drift across agents. Without them, agents slowly blend into generic assistants that overlap and conflict.
The single-coordinator pattern is information hierarchy applied to AI. One focal point. One decision-maker. Sub-agents report up, not sideways. It's the same reason a well-designed dashboard has one primary action per screen: clarity comes from constraint.
Agents don't dump raw data. They summarize, prioritize, and flag only what requires human attention. One daily Telegram recap instead of per-agent notifications. It's the same principle behind good notification design: respect the user's attention as a finite resource.
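A sketch of how that daily recap might be assembled and sent, using the standard Telegram Bot API sendMessage method; the summary-file layout and environment variable names are assumptions.

```python
# Sketch: one daily digest instead of per-agent notifications.
import os
import urllib.parse
import urllib.request
from pathlib import Path

def build_digest(summary_dir: str = "state/daily") -> str:
    """Collapse each agent's one-line summary into a single message."""
    parts = []
    for f in sorted(Path(summary_dir).glob("*.md")):
        first_line = (f.read_text().strip().splitlines() or ["(empty)"])[0]
        parts.append(f"- {f.stem}: {first_line}")
    if not parts:
        return "Daily recap: all quiet."
    return "Daily recap\n" + "\n".join(parts)

def send_telegram(text: str) -> None:
    token = os.environ["TELEGRAM_BOT_TOKEN"]
    chat_id = os.environ["TELEGRAM_CHAT_ID"]
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode()
    urllib.request.urlopen(
        f"https://api.telegram.org/bot{token}/sendMessage", data=data, timeout=10
    )

if __name__ == "__main__":
    send_telegram(build_digest())   # scheduled once a day by the coordinator
```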
Markdown files (job-board.md, applications-log.md, bounties-log.md) serve as the system's single source of truth. Every agent reads from and writes to the same state files. No conflicting copies. No stale data. Same principle as a design system's token library: one source, many consumers.
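A sketch of the write path, with a simple file lock so two agents never produce conflicting copies; the file names come from the system above, but the locking approach and directory are my assumptions.

```python
# Sketch: markdown files as the single source of truth. Every write goes
# through one append helper guarded by an advisory lock, so agents never
# hold conflicting copies. fcntl is POSIX-only (the VPS runs Linux).
import fcntl
from datetime import date
from pathlib import Path

STATE_DIR = Path("state")

def append_entry(filename: str, entry: str) -> None:
    """Append one markdown line to a shared state file, under a lock."""
    path = STATE_DIR / filename
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "a", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_EX)          # one writer at a time
        f.write(f"- {date.today()}: {entry}\n")
        fcntl.flock(f, fcntl.LOCK_UN)

if __name__ == "__main__":
    # Example: Scout logs a lead; Quill later reads the same file.
    append_entry("job-board.md", "Senior Product Designer @ ExampleCo (remote)")
```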
Every roadblock taught the same lesson: assume things will break silently. Model evicted from VRAM? Silent fallback to the paid API. VPN goes cold? Silent timeout. Model can't handle tools? Silent degradation. The system needed observability baked in, the same way good UX needs error states designed up front, not treated as an afterthought.
Chapter 6
This isn't a wrapper around ChatGPT. It's real infrastructure that I built, configured, and maintain.
OpenClaw
Open-source multi-agent platform. Session management, tool calling, cron scheduling, agent spawning.
Gemini 2.5 Flash
Google's fast reasoning model. Reliable multi-step tool calling. ~$0.02-0.08/call with thinking controls.
Ollama + gpt-oss:20b
Free inference on local GPU. Also running phi4:14b, deepseek-r1:32b, qwen2.5-coder:32b, qwen3-coder:30b.
RTX 3080 16GB VRAM
AMD Ryzen 9 5900HX, 32GB RAM. Running Ollama for model serving + ComfyUI for image generation.
Linux VPS
Runs OpenClaw gateway, agent sessions, cron scheduler. Connected to GPU rig via Tailscale VPN.
Tailscale VPN
Mesh VPN connecting VPS, GPU rig, and development machine. Private network, no exposed ports.
Next.js + pm2
Real-time monitoring dashboard. Agent status, cron health, activity feeds. Process-managed for uptime.
ComfyUI + FLUX Schnell
Local AI image generation pipeline. CLIP-L, T5-XXL, VAE. Runs on the same GPU rig.
Hybrid Architecture
Paid API for brain only. Free local models for workers. Token limits, thinking-level controls, concurrent session caps.
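As a rough illustration of those guardrails, per-role limits might be expressed like this; the keys and values are illustrative placeholders, not OpenClaw's real configuration schema.

```python
# Sketch of the cost guardrails: token limits, thinking-level controls,
# and a concurrent-session cap per role. Values are illustrative.
LIMITS = {
    "frodo": {"provider": "gemini", "max_output_tokens": 4096,
              "thinking": "low", "max_concurrent_sessions": 1},
    "workers": {"provider": "ollama", "max_output_tokens": 2048,
                "thinking": None, "max_concurrent_sessions": 2},
}

def within_budget(role: str, active_sessions: int) -> bool:
    """Refuse to spawn a session once a role hits its concurrency cap."""
    cfg = LIMITS["frodo"] if role == "frodo" else LIMITS["workers"]
    return active_sessions < cfg["max_concurrent_sessions"]
```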
Chapter 7
Frodo is one system. But the problems it solves are universal.
Frodo proves that coordinator-pattern architecture, hybrid inference routing, and behavioral governance make autonomous AI systems viable for small teams and solo operators. The same patterns apply at enterprise scale.
Chapter 8
What started as an experiment became a fully operational system. Here's what it delivered.
What I built: A scalable coordinator architecture with hybrid inference routing, an AI identity and governance system that prevents role drift, a full observability layer for silent failure detection, and a cost engineering framework that reduced operating expenses by 95%. The system scales. Adding new agents requires no architectural redesign.
What's next: Browser automation for real-world execution. The infrastructure, coordination model, and agent team are production-ready. The next phase is giving agents the ability to interact directly with the open web, moving from analysis and generation into full autonomous action.
Chapter 9
I named the project after Frodo Baggins. One small person carrying something way too big for them, walking straight into the unknown anyway. That felt right when I started. It still does.
The hardest part of agentic AI isn't the technology. It's the system design. Defining clear roles, establishing communication protocols, knowing when to delegate versus decide, building for failure, designing information hierarchy so the right signal reaches the right person at the right time.
Orchestrating intelligence at scale is not an engineering trick. It is a design problem.