Production on David Lang

Recent content in Production on David Lang: https://www.davidlang.tech/tags/production/

Building Reliable AI Agents: Lessons from Production
https://www.davidlang.tech/posts/building-reliable-ai-agents-lessons-from-production/
Sat, 28 Feb 2026 00:00:00 +0000

<p>Production agents fail in boring ways: timeouts, tool errors, runaway loops, and silent wrong answers. Reliability engineering applies to agents too.</p>
<h2 id="hardening-checklist">Hardening Checklist</h2>
<ul>
<li>Max steps and token budgets per session</li>
<li>Idempotent tools with clear error messages</li>
<li>Checkpoint state for long workflows</li>
<li>Circuit breakers when external APIs fail</li>
<li>Structured logging of every tool call</li>
</ul>
<h2 id="graceful-degradation">Graceful Degradation</h2>
<p>When the agent fails, fall back to search-only RAG or a human handoff, never an empty error.</p>
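<p>To make three of these ideas concrete (a step budget, a circuit breaker, and a fallback instead of an empty error), here is a minimal Python sketch. It is an illustration under stated assumptions, not the post's implementation: <code>CircuitBreaker</code>, <code>CircuitOpenError</code>, <code>run_agent</code>, <code>step_fn</code>, and <code>fallback_fn</code> are hypothetical names, and the thresholds are arbitrary defaults.</p>

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the breaker is open."""


@dataclass
class CircuitBreaker:
    # Hypothetical defaults; tune per external API.
    max_failures: int = 3
    reset_after_s: float = 30.0
    clock: Callable[[], float] = time.monotonic
    failures: int = 0
    opened_at: Optional[float] = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open; skipping call")
            # Cool-down elapsed: allow one trial call (half-open).
            # A single further failure re-opens the breaker.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the breaker fully
        return result


def run_agent(step_fn, fallback_fn, max_steps=8):
    """Run step_fn(i) until it returns an answer or the budget runs out.

    step_fn returns a string when done, or None to continue.
    fallback_fn(reason) stands in for search-only RAG or a human handoff.
    """
    for i in range(max_steps):
        try:
            answer = step_fn(i)
        except Exception as exc:
            return fallback_fn(f"tool error at step {i}: {exc}")
        if answer is not None:
            return answer
    return fallback_fn(f"step budget of {max_steps} exhausted")
```

<p>In this sketch every exit path of <code>run_agent</code> returns either an answer or the fallback's result, so the caller never sees an empty error. Tool calls made inside <code>step_fn</code> could be wrapped in <code>breaker.call(...)</code> so that a failing external API is skipped for a cool-down period rather than retried on every step.</p>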