Narrow by Design: The Case for Composable AI Teams

Originally published at https://openenvelope.org/writing/narrow-by-design

There is a temptation, when building with AI, to make agents as capable as possible. Give them access to everything. Write prompts that cover every edge case. Hope that the model figures out the rest. This feels like leverage — one agent, infinite surface area.

It is the wrong instinct. And the teams built this way tend to prove it quickly.

The agents that perform best are not the most capable ones. They are the most focused ones. The ones with a single, unambiguous job. This is not a quirk of current models — it is a structural property of how AI systems work, and it has real consequences for how AI teams should be designed.

Composable AI is not a new concept, but it is often described in abstract terms — modular components, interoperable systems, reusable pieces. The practical version is simpler: you build specialised agents that do one thing well, and you connect them into a structure that handles the complexity.

The insight is that a capable agent and a small team doing the same job are the same abstraction at different levels of resolution. A Support Lead agent and a support team with an L1 triage agent, an L2 technical agent, and a comms agent present the same interface to the outside world. The parent system — whatever is routing tasks to them — does not know or care which one it is talking to. The team version just performs better, because each component has one clear job.

This makes the agent abstraction fractal. Any agent in a hierarchy could, in theory, be expanded into a sub-team without changing anything upstream. The question is not whether to decompose — it is how far to go.
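The interchangeability described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the names `Handler`, `Agent`, `Team`, `pick`, and `parent_route` are all hypothetical, and the lambdas stand in for real model calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


class Handler(Protocol):
    """Anything the parent system can route a task to (hypothetical interface)."""

    def handle(self, task: str) -> str: ...


@dataclass
class Agent:
    """A single agent; `respond` is a stand-in for a model call."""

    name: str
    respond: Callable[[str], str]

    def handle(self, task: str) -> str:
        return self.respond(task)


@dataclass
class Team:
    """A sub-team that exposes the same `handle` interface as a single agent."""

    name: str
    pick: Callable[[str], str]  # returns the member key that should take the task
    members: dict[str, Handler] = field(default_factory=dict)

    def handle(self, task: str) -> str:
        # Members are Handlers too, so any member may itself be a Team.
        return self.members[self.pick(task)].handle(task)


def parent_route(target: Handler, task: str) -> str:
    # The parent only sees `handle`; it cannot tell an agent from a team.
    return target.handle(task)


support_lead = Agent("support", lambda t: f"lead handled: {t}")

support_team = Team(
    name="support",
    pick=lambda t: "l2" if "error" in t else "l1",
    members={
        "l1": Agent("l1", lambda t: f"l1 triaged: {t}"),
        "l2": Agent("l2", lambda t: f"l2 debugged: {t}"),
    },
)

print(parent_route(support_lead, "reset password"))
print(parent_route(support_team, "reset password"))
print(parent_route(support_team, "error 500 on login"))
```

Because `Team.members` holds `Handler` values rather than `Agent` values, swapping `support_lead` for `support_team` changes nothing upstream, and any member can later be expanded into a team of its own.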

When an agent’s role is narrow, several things happen simultaneously:

Instructions become unambiguous. A prompt that covers one job function can be specific and precise. A prompt that covers many job functions has to hedge, qualify, and cover edge cases — and models respond to that ambiguity by hedging too.

Context gets smaller. A focused agent only needs to hold the context relevant to its job. A generalist agent carries everything and has to reason about what matters. Smaller, cleaner context produces more reliable outputs.

The input-output contract gets clear. When an agent knows exactly what it receives and exactly what it is expected to produce, the model can optimise for that path. When the contract is fuzzy, so is the output.

Together, these mean that narrowing an agent’s role is not just a design preference; it is a performance lever. The same underlying model, given a focused role, will tend to outperform the same model given a broad one.
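One way to make an input-output contract concrete is to write it down as types. The sketch below assumes a hypothetical L1 triage agent; `TriageInput`, `TriageOutput`, `Severity`, and the keyword rule inside `triage` are invented for illustration, with the rule standing in for a model call.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    LOW = "low"
    HIGH = "high"


@dataclass(frozen=True)
class TriageInput:
    """Exactly what the triage agent receives (hypothetical fields)."""

    ticket_id: str
    body: str


@dataclass(frozen=True)
class TriageOutput:
    """Exactly what the triage agent must produce (hypothetical fields)."""

    ticket_id: str
    severity: Severity
    escalate: bool


def triage(ticket: TriageInput) -> TriageOutput:
    # Placeholder keyword rule; a real agent would call a model here.
    urgent = "outage" in ticket.body.lower()
    severity = Severity.HIGH if urgent else Severity.LOW
    return TriageOutput(ticket.ticket_id, severity, escalate=urgent)


result = triage(TriageInput("T-1", "Full OUTAGE in eu-west"))
print(result)
```

A narrow role makes a contract like this short enough to state in full. A generalist agent would need a union of many such shapes, which is where the fuzziness creeps in.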

This does not mean narrower is always better. There is a crossover point.

The gains from specialisation are real, but they operate on a curve. Once a role is narrow enough to be genuinely unambiguous — once there is no meaningful ambiguity left to resolve — splitting it further adds coordination overhead without adding focus. An agent that must wait on three upstream agents before it can act may produce worse end-to-end results than a slightly broader agent that handles the full task itself. Latency compounds. Error propagation between agents adds surface area. The seams between components become the weakness.
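The crossover can be illustrated with simple arithmetic. The sketch below assumes agents run in sequence and fail independently, which is a simplification, and the reliability and latency figures are made up for illustration rather than measured.

```python
import math


def chain_success(step_success: list[float]) -> float:
    """Probability that every step in a sequential chain succeeds,
    assuming the steps fail independently of one another."""
    return math.prod(step_success)


def chain_latency(step_seconds: list[float]) -> float:
    """Total latency when each agent must wait for the one before it."""
    return sum(step_seconds)


# Hypothetical numbers: one broad agent at 90% reliability, versus
# narrow agents that each reach 97% on their own slice of the job.
one_broad = chain_success([0.90])
two_narrow = chain_success([0.97] * 2)
four_narrow = chain_success([0.97] * 4)

print(round(one_broad, 4))    # 0.9
print(round(two_narrow, 4))   # 0.9409
print(round(four_narrow, 4))  # 0.8853

# Latency under the same assumption of strictly sequential handoffs.
print(chain_latency([2.0] * 4))  # 8.0
```

Under these assumed figures, splitting the job in two beats the broad agent. Splitting it in four does not, because further splitting adds handoffs without raising per-agent reliability.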

The right question is not “how narrow can we go?” but “where does a role become coherent?” A coherent role has a clear job function, a defined escalation path, and inputs and outputs that are specific enough that the agent can act without ambiguity. Below that level, you are decomposing a skill rather than a role — and that is where the performance curve flattens.

Designing that boundary well is a skill. The best AI teams are not the ones with the most agents. They are the ones where each agent’s role was thought through carefully enough that the decomposition reflects the actual structure of the work.

There is a structural question underneath all of this: if agents and teams are the same abstraction at different scales, where does the composition get managed?

The answer depends on what you mean by composition.

The execution layer. How agents wake up, how tasks route between them, how context passes across a conversation — this is a runtime concern. Something has to manage the event bus, the task handoffs, the agent lifecycles. It needs to be close to the metal: fast, stateful, and aware of what is happening right now.

The distribution layer. How teams are defined, packaged, discovered, and deployed — this is a design-time concern. A builder decides how many agents to include, what their roles are, how they escalate, which tools they need access to. That definition gets packaged and shipped to wherever it runs. The distribution layer does not need to know how the agents will coordinate at runtime — it just needs to produce a definition clean enough that the execution layer can do its job.
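A design-time definition of this kind can be treated as plain data. The sketch below is hypothetical: `AgentSpec`, `TeamSpec`, and their fields are invented for illustration and are not the format of any particular schema. It shows a definition being packaged to JSON and loaded again without any knowledge of how the agents will coordinate at runtime.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class AgentSpec:
    """Design-time description of one role (hypothetical fields)."""

    role: str
    tools: list[str] = field(default_factory=list)
    escalates_to: Optional[str] = None


@dataclass
class TeamSpec:
    """Design-time description of a whole team (hypothetical fields)."""

    name: str
    agents: list[AgentSpec]


def package(team: TeamSpec) -> str:
    # Serialise the definition; nothing here knows how agents will run.
    return json.dumps(asdict(team), indent=2, sort_keys=True)


def load(blob: str) -> TeamSpec:
    # Rebuild the definition on whatever system will execute it.
    raw = json.loads(blob)
    return TeamSpec(raw["name"], [AgentSpec(**agent) for agent in raw["agents"]])


team = TeamSpec(
    name="support",
    agents=[
        AgentSpec("l1-triage", tools=["ticket-reader"], escalates_to="l2-technical"),
        AgentSpec("l2-technical", tools=["log-search"], escalates_to="comms"),
        AgentSpec("comms"),
    ],
)

blob = package(team)
print(blob)
```

The round trip through `package` and `load` is the whole interface between the two layers in this sketch: the execution side receives a definition and never sees how it was authored.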

Conflating these two layers is a common mistake. Trying to build runtime orchestration into a distribution product, or trying to make a runtime aware of packaging concerns, creates systems that are hard to reason about and harder to evolve. The separation is not just architectural cleanliness — it is what allows each layer to improve independently.

Design to the role, not to the model. The unit of composition should be a coherent job function — the kind of role you would put on an org chart. Not a feature, not a skill, not a capability. A role. That is the level at which decomposition produces reliable performance gains without excessive coordination cost.

Escalation paths matter as much as prompts. A narrow agent is only as good as its ability to hand off work it cannot handle. The escalation structure — who passes to whom, under what conditions — is load-bearing. Get it wrong and the team fragments. Get it right and the team behaves like a coherent unit.
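Because the escalation structure is load-bearing, it can be worth checking before a team ever runs. The sketch below assumes escalation is declared as a simple mapping from each role to the role it hands off to, with `None` marking a terminal role. The function name and the mapping shape are hypothetical.

```python
from typing import Optional


def validate_escalation(paths: dict[str, Optional[str]]) -> list[str]:
    """Return a list of problems; an empty list means every chain
    ends at a terminal role (one that escalates to None)."""
    problems = []
    for start in paths:
        seen = [start]
        current = paths[start]
        while current is not None:
            if current not in paths:
                problems.append(f"{start}: escalates to unknown role '{current}'")
                break
            if current in seen:
                problems.append(f"{start}: escalation loop via '{current}'")
                break
            seen.append(current)
            current = paths[current]
    return problems


good = {"l1": "l2", "l2": "lead", "lead": None}
looped = {"l1": "l2", "l2": "l1"}
dangling = {"l1": "ghost"}

print(validate_escalation(good))
print(validate_escalation(looped))
print(validate_escalation(dangling))
```

A loop or a dangling target is the kind of mistake that lets a team fragment: work gets passed along and never lands anywhere. Catching it at design time is cheaper than discovering it in a live conversation.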

The boundary between layers should be explicit. Whatever manages orchestration at runtime should not need to understand how teams were packaged and distributed. Whatever manages distribution should not need to know how agents will coordinate when running. When those concerns bleed into each other, both get harder to build and maintain.

Narrow roles compound. A team of five focused agents, each with a well-designed role and a clear escalation path, will outperform a team of ten broadly scoped agents trying to cover the same ground. The performance gains are not additive: they multiply, because each agent’s clarity reinforces the clarity of the agents it connects to.

Composable AI is sometimes talked about as a future state — something that will be possible once models are more capable, once tooling matures, once standards emerge. But the core principle is available right now, and it does not require any of those things.

The principle is just this: narrow roles, connected deliberately, with clear interfaces between them, outperform broad roles trying to cover the same ground. This is how good organisations work. It is how good software systems work. And it is how good AI teams work.

The teams worth building are not the ones with the most surface area. They are the ones where every agent knows exactly what its job is — and exactly when to hand it to someone else.

This is the design principle the Envelope Schema is built around. Narrow roles, explicit escalation paths, and declared input-output contracts — so the structure of the team is encoded up front, not improvised at runtime.
