Technical diagram showing the five-layer architecture of the Relay job-search Agent system: UI layer, API orchestration layer, Agent execution layer, shared services layer, and data and integration layer

Building a Production-Grade AI Agent System from Scratch: A Full Architecture Breakdown of Relay

“Most Agent projects die in the unmapped wilderness between PoC and production.” I wrote that line while reading through the Relay project documentation. Relay is an open-source AI Agent system for job searching: not a demo built on three lines of LangChain plus GPT-4, but a project with complete architectural documentation, 172 engineering tasks, a hybrid tech stack, and explicit counterexamples for every major design decision. It is not fully running yet, and the Agent layer code is still being written. That is exactly why I think this article is worth writing: the system's design has been thought through in depth, and that design thinking, wherever the project ultimately lands, is a valuable reference for anyone doing Agent engineering. ...

June 24, 2026 · 20 min · 4223 words · Xinwei Xiong, Me
A technical diagram with a tiny agent loop at the center, surrounded by concentric rings of the eight pillars: orchestration, context, memory, tools, reliability, evaluation, cost, governance

The Agent Engineering Map: Where Does That 98.4% of the Work Actually Live?

“The agent loop is 10 lines of code. Agent engineering is 100,000 lines of code.” The first time I read that, I paused, and the more I sat with it, the sharper it cut. It punctures the single biggest illusion in this whole field: people think building an agent means writing a good prompt and wiring up an LLM API. But 99% of the actual work of pushing a demo to production, of running safely and unattended all night long, lies outside that loop. ...

June 17, 2026 · 30 min · 6377 words · Xinwei Xiong, Me