Thomas M.·Apr 21, 2026·8 min

Agent sprawl: when your AI agents become unmanageable

Going from 3 to 40 agents in four months is the new default. Here's why it breaks, and how to avoid it from your very first sub-process.


3 agents in January, 40 in May: a story we keep seeing

A platform engineer recently shared their team's story on Reddit. In early January, they had 3 agents in production: a code assistant, incident triage, and a deployment helper. Clean, maintainable, everyone knew what each one did. Four months later, they had around 40. The key word is "around": nobody had the exact number anymore.

In the meantime, each team shipped its own agents. PR reviews, log analysis, on-call summaries, data pipeline monitoring, support ticket routing, documentation updates. Some lived in personal Cursor configs, others in Claude Code sessions, others in Friday-afternoon n8n workflows. No registry. No ownership. When the person who built one went on vacation, the agent kept running unattended — or silently stopped, and nobody noticed until something broke.

This pattern has a name: agent sprawl. Uncontrolled proliferation of AI agents inside an organization. And it is no longer a theoretical risk.

Why this is exploding now

Three dynamics combine to create agent sprawl, and all three accelerated over the last six months.

Agents are invisible infrastructure

A microservice, even a badly managed one, still lives in a repo with a Dockerfile and a CI pipeline. You can find it. An AI agent can live in an IDE config, a low-code workflow, a system prompt buried in a Slack bot, or a notebook. There is no docker ps for agents. When you attempt an inventory, half the fleet is invisible.

Dataiku’s 2026 build-vs-buy report highlights exactly this risk: “proliferation of redundant, uncoordinated agents creating vulnerabilities, hidden costs, and organizational inconsistencies.” CERT-FR also issued formal guidance: do not deploy these tools in production without strict controls.

MCP democratizes integration — and risk

Model Context Protocol (MCP) is a great idea on paper: standardized tool access for agents. In practice, each developer connects agents to whatever they want via MCP servers. One team's agent gets read-write access to the production database. Another can push to main without review. A third pulls customer data through an MCP server nobody security-reviewed.

Tool poisoning, injecting malicious instructions into tool metadata, is now a real attack vector. Researchers have hijacked Gemini through a poisoned calendar invitation. A vulnerability in a Microsoft agentic browser enabled takeover via path traversal. Every unreviewed MCP server widens the attack surface.

The 2018 microservices replay

People who were there remember: in 2018 everyone was “breaking the monolith.” Six months later, companies had 200 services, no service mesh, no ownership map, and one bad deploy cascading through 15 dependencies nobody knew. Cleanup took years and millions.

We are replaying that scene with agents. Except it is worse in two ways: agents make autonomous decisions and act on production systems, and non-determinism makes incidents slower to reproduce.

4 symptoms that show you are already there

If one or more of these signals sound familiar, agent sprawl has already started:

A 2026 Gravitee report estimates that of the roughly 3 million enterprise AI agents in production, fewer than 48% are actively monitored. In other words, more than 1.5 million run unsupervised. OutSystems' April 2026 survey goes further: 94% of IT leaders say sprawl increases complexity, tech debt, and security risk, yet only 12% have a centralized platform to manage it.

The real cost: data, dollars, incidents

Amazon recently lived the nightmare version: four high-severity incidents in one week on the retail site, including a 6-hour checkout meltdown. Root cause: internal agents made decisions based on outdated wiki pages. An agent reads stale docs, makes a confidently wrong choice, and the cascade hits millions of users. They had to put humans back in the loop and call emergency meetings.

If this can happen at Amazon, with their platform engineering capabilities, it can happen anywhere. Agent sprawl does not only cost unused compute. It causes production incidents because autonomous systems keep acting even when context is wrong.

The second cost is data exposure. Agents create emergent data flows nobody explicitly designed. A finance agent pulls HR data to contextualize budget. A sales agent reads support tickets to personalize outreach. These flows cross functional boundaries without review. GDPR does not like that. The AI Act even less.

What we do at Origin 137 to avoid this

The dominant 2026 narrative says you need a governance platform layered on top of your agents. We disagree. Governance is not a tool you add later — it is an architectural property from the very first agent.

Registry from the first sub-process

At Origin 137, a process is not one monolithic agent. It is a set of sub-processes, each with an owner, a trigger, a model, and tools. The registry is not a feature you "turn on later." It is the data model itself. When you declare a sub-process, it must have a name, an owner, a purpose, and a list of allowed tools. No registry entry = no running sub-process.

It feels strict for two days. Then it is exactly what lets 3 agents become 40 without losing control.
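To make the idea concrete, here is a minimal sketch of a registry where the declaration itself is the data model. All names (SubProcess, Registry, the field names) are illustrative assumptions, not Origin 137's actual API: the point is that an undeclared sub-process simply cannot be looked up, so it cannot run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubProcess:
    """A declared sub-process. Every field is mandatory by design."""
    name: str
    owner: str                 # team or person accountable
    purpose: str               # one-line description of what it does
    allowed_tools: frozenset   # explicit tool allow-list
    model: str                 # which model backs it


class Registry:
    """The registry IS the data model: unregistered means unrunnable."""

    def __init__(self):
        self._entries = {}

    def register(self, sp: SubProcess) -> SubProcess:
        # Refuse incomplete declarations up front.
        if not (sp.name and sp.owner and sp.purpose):
            raise ValueError("sub-process must declare name, owner, and purpose")
        self._entries[sp.name] = sp
        return sp

    def get(self, name: str) -> SubProcess:
        # No registry entry = no running sub-process.
        if name not in self._entries:
            raise LookupError(f"unregistered sub-process: {name}")
        return self._entries[name]
```

With this shape, "who owns the lead-scoring agent?" is a dictionary lookup, not an archaeology project.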

Tool allow-list by default

Security guidance converges on one principle: the default posture should be deny, not allow. Every Origin 137 sub-process has an explicit allow-list of tools, not "everything connected to your MCP account" but a precise list. Adding a tool happens in config, not at runtime. If an agent finds a tool outside its list, it cannot call it.

Concrete result: a lead-scoring sub-process cannot accidentally send emails through prompt injection. Even if the LLM “wants” to, the tool is unavailable. Prompts are not a security mechanism; architecture is.
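The enforcement point can be sketched in a few lines. This is a hypothetical dispatcher, not Origin 137's implementation: the model's output is just a request, and the allow-list check sits between that request and any real side effect.

```python
class ToolDeniedError(Exception):
    """Raised when a sub-process requests a tool outside its allow-list."""


def call_tool(allowed_tools: frozenset, tool_name: str, dispatch: dict, **kwargs):
    """Deny-by-default dispatch.

    Even if a prompt-injected model asks for 'email.send', a tool that is
    not in the sub-process's allow-list is structurally unreachable.
    """
    if tool_name not in allowed_tools:
        raise ToolDeniedError(f"{tool_name!r} is not in the allow-list")
    return dispatch[tool_name](**kwargs)
```

The design choice is the point: the guardrail lives in the architecture, so no amount of clever prompting can route around it.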

Shared observability, not one dashboard per agent

The classic trap is giving each agent its own monitoring: you end up with 40 dashboards. The observability we build is shared: every sub-process emits traces and cost, latency, and error metrics to the same bus. You can ask "how much did the lead qualification orchestration cost this week?" and answer without stitching data together from 8 sources.

When a sub-process fails above its normal baseline, that is a signal, not a mystery. When a new sub-process appears, it shows up in the same view as all the others. The registry and observability are two sides of the same coin.
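A shared bus is simple enough to sketch. Again, the class and field names here are assumptions for illustration; the idea is one sink for every sub-process, so fleet-wide questions become single queries instead of 40 dashboard exports.

```python
from collections import defaultdict


class TraceBus:
    """One sink that every sub-process emits to, instead of per-agent dashboards."""

    def __init__(self):
        self.events = []

    def emit(self, subprocess: str, cost_usd: float, latency_ms: float,
             error: bool = False):
        # Every sub-process uses the same event shape.
        self.events.append({"subprocess": subprocess, "cost_usd": cost_usd,
                            "latency_ms": latency_ms, "error": error})

    def cost_by_subprocess(self) -> dict:
        # One query answers "how much did X cost?" across the whole fleet.
        totals = defaultdict(float)
        for e in self.events:
            totals[e["subprocess"]] += e["cost_usd"]
        return dict(totals)

    def error_rate(self, subprocess: str) -> float:
        # Deviation from a known baseline becomes a signal, not a mystery.
        evts = [e for e in self.events if e["subprocess"] == subprocess]
        return sum(e["error"] for e in evts) / len(evts) if evts else 0.0
```

Because every sub-process reports into the same schema, a newly registered one is observable on day one with zero extra dashboard work.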

Where to start if you are already there

If you are reading this thinking “too late, we already have 20,” here is an action plan that works:

The real question is not how many agents

The market noise suggests 40 agents means AI maturity. It does not. What matters is knowing what each agent does, who owns it, and what it can touch. A team with 5 well-governed agents manages risk better than one with 40 and no inventory.

This aligns with our product thesis: value is not in models, but in the layer that orchestrates, observes, and governs them. True at 3 agents. Critical at 40.

Agent sprawl is not inevitable. It is the default result of unarchitected adoption. You choose it — or you avoid it from day one.

At Origin 137, we help teams audit their agent fleet before it doubles. If you went from 3 to 15 in a few months and the exact count is getting fuzzy, now is the right time to talk.

Talk to an expert · See pricing
