Field guide

8 ways agentic AI breaks production.

The failure modes we see across teams running agents in real workloads — and the inline mitigation that actually stops each one.

01

Rogue tool calls

The agent invents a customer ID, a SKU, an email address, or a refund amount and calls a real tool with it. The model is confident, the arguments parse, the tool fires.

Mitigation

Validate every tool argument against a schema and a live data check before execution. Block calls referencing entities that don't exist. Require approval for destructive verbs (delete, refund, deploy).
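A minimal sketch of that pre-execution gate. The schema shape, the `known_customers` lookup (standing in for a live data check), and the verb list are illustrative assumptions, not a SafeRun API:

```python
# Hypothetical guard that runs before any tool call fires.
DESTRUCTIVE_VERBS = {"delete", "refund", "deploy"}

def check_tool_call(tool, args, schema, known_customers):
    """Return (allowed, reason). Destructive verbs route to human approval."""
    # 1. Schema check: every required field present, with the right type.
    for field, ftype in schema.items():
        if field not in args:
            return False, f"missing argument: {field}"
        if not isinstance(args[field], ftype):
            return False, f"bad type for {field}"
    # 2. Live data check: block calls referencing entities that don't exist.
    cid = args.get("customer_id")
    if cid is not None and cid not in known_customers:
        return False, f"unknown customer_id: {cid}"
    # 3. Destructive verbs never auto-execute, even with valid arguments.
    verb = tool.split("_")[0]
    if verb in DESTRUCTIVE_VERBS:
        return False, "needs_approval"
    return True, "ok"
```

The key property: an invented customer ID fails at step 2 even though it parses cleanly at step 1, which is exactly the gap a schema-only check leaves open.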

02

Runaway loops

An agent gets stuck. It re-plans, re-tries, re-emails the same lead twelve times in five minutes, or hammers a downstream API until rate limits trip and on-call wakes up.

Mitigation

Loop detection on tool-call signatures, per-agent call budgets, and circuit breakers that trip on repeated identical actions. Halt the run, page no one, log everything.
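One way to sketch that breaker: hash each tool call into a signature and trip when the same signature repeats or the run's call budget runs out. The thresholds are illustrative assumptions:

```python
from collections import Counter

class CircuitBreaker:
    """Trips on repeated identical actions or an exhausted call budget.
    max_identical and call_budget are hypothetical defaults."""

    def __init__(self, max_identical=3, call_budget=50):
        self.max_identical = max_identical
        self.call_budget = call_budget
        self.seen = Counter()   # tool-call signature -> count
        self.calls = 0          # total calls this run

    def allow(self, tool, args):
        # Signature = tool name + sorted arguments, so arg order doesn't matter.
        sig = (tool, tuple(sorted(args.items())))
        self.seen[sig] += 1
        self.calls += 1
        if self.calls > self.call_budget:
            return False  # per-agent budget exhausted: halt the run
        if self.seen[sig] > self.max_identical:
            return False  # same action repeated: almost certainly a loop
        return True
```

The twelfth identical email never leaves the building; the run halts on the fourth.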

03

Prompt injection

User content, scraped pages, email bodies, or tool results contain instructions that hijack the agent — exfiltrate data, call the wrong tool, ignore previous policy.

Mitigation

Don't trust the prompt to enforce safety. Constrain the action surface: tool allowlists per context, argument-level policies, and human approval for any tool that touches sensitive data.
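A constrained action surface can be as simple as a lookup the agent cannot talk its way around. Context names, tool names, and the sensitive-tool set below are hypothetical:

```python
# Per-context allowlists: injected instructions can change what the model
# *wants* to do, but not what it is *allowed* to do.
ALLOWLISTS = {
    "email_triage": {"read_email", "label_email"},
    "support":      {"read_ticket", "reply_ticket"},
}
SENSITIVE_TOOLS = {"export_records"}  # always require a human, in any context

def authorize(context, tool):
    """Policy decision made outside the model, so prompt text can't override it."""
    if tool in SENSITIVE_TOOLS:
        return "needs_approval"
    if tool not in ALLOWLISTS.get(context, set()):
        return "blocked"
    return "allowed"
```

Even a perfectly crafted injection in an email body can only reach the two tools the triage context exposes.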

04

Data exfiltration

The agent reads a sensitive record, then helpfully includes it in an outbound email, a webhook payload, or a debug log shared with a third party.

Mitigation

Tag fields as sensitive at the source. Validate outbound tool arguments against the tags. Block egress that mixes sensitive reads with external sends, even if the model 'meant well'.
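The egress check amounts to lightweight taint tracking: remember sensitive values the run has read, then scan outbound arguments for them. The external-tool set and substring matching below are simplifying assumptions (a real check would normalize and tokenize values):

```python
class EgressGuard:
    """Blocks outbound calls whose arguments contain values read from
    fields tagged sensitive earlier in the run. Tool names are illustrative."""

    EXTERNAL_TOOLS = {"send_email", "post_webhook"}

    def __init__(self):
        self.tainted = set()  # sensitive values seen during this run

    def record_read(self, record, sensitive_fields):
        # Called whenever the agent reads a record; tags come from the source.
        for field in sensitive_fields:
            if field in record:
                self.tainted.add(str(record[field]))

    def allow_call(self, tool, args):
        if tool not in self.EXTERNAL_TOOLS:
            return True  # internal tools may handle sensitive data
        payload = " ".join(str(v) for v in args.values())
        # Block egress that mixes a sensitive read with an external send.
        return not any(value in payload for value in self.tainted)
```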

05

Cost blowups

A misconfigured agent burns $4,000 in OpenAI credits overnight, or a tool retries on failure and racks up Twilio charges nobody budgeted for.

Mitigation

Per-agent and per-tool cost ceilings enforced inline. Soft cap pages a human; hard cap halts the agent. Cost is a first-class signal, not a monthly invoice surprise.
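The soft-cap/hard-cap split can be sketched as a running meter checked inline on every charge. Dollar amounts and return values are illustrative:

```python
class CostGuard:
    """Inline cost ceiling for one agent (or one tool). Soft cap alerts
    a human; hard cap halts the agent. Caps are hypothetical defaults."""

    def __init__(self, soft_cap=50.0, hard_cap=100.0):
        self.soft_cap = soft_cap
        self.hard_cap = hard_cap
        self.spent = 0.0

    def charge(self, amount):
        self.spent += amount
        if self.spent >= self.hard_cap:
            return "halt"   # stop the agent before the next call fires
        if self.spent >= self.soft_cap:
            return "alert"  # page a human, keep running
        return "ok"
```

Because the meter sits in the call path rather than in a billing dashboard, the $4,000 overnight burn becomes a halted run at the hard cap.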

06

Silent regressions

A model upgrade or prompt tweak ships. Quality drops 8%, the agent now picks the wrong tool in 1 of every 50 runs, and you find out from a customer ticket two weeks later.

Mitigation

Per-agent reliability scoring tied to policy decisions and tool outcomes. Anomaly alerts when block rate, retry rate, or approval rate moves outside its baseline.
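One simple form of that baseline check: flag a metric (block rate, retry rate, approval rate) when it drifts more than a few standard deviations from its history. The window and tolerance are assumptions:

```python
def outside_baseline(history, current, tolerance=3.0):
    """Return True when `current` deviates from the historical baseline
    by more than `tolerance` standard deviations. A hypothetical anomaly
    check; production systems would use a rolling window per metric."""
    mean = sum(history) / len(history)
    variance = sum((x - mean) ** 2 for x in history) / len(history)
    std = variance ** 0.5 or 1e-9  # guard against a flat history
    return abs(current - mean) / std > tolerance
```

Run it per agent on each metric after every deploy, and the 8% regression pages you on day one instead of surfacing in a ticket on day fourteen.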

07

Untraceable incidents

Something went wrong. The user is upset. Your logs show 600 LLM tokens and a tool error. You can't reconstruct what the agent saw, planned, or decided.

Mitigation

Capture the full agent run — model output, tool selected, arguments, policy decision, tool result — as a replayable timeline. Every incident becomes a click, not an investigation.
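The timeline is just structured capture at every step. Field names below are illustrative, not a SafeRun schema:

```python
import json
import time

class RunTrace:
    """Records each step of an agent run as a replayable timeline:
    what the model said, which tool it picked, what policy decided,
    and what the tool returned."""

    def __init__(self, run_id):
        self.run_id = run_id
        self.steps = []

    def record(self, model_output, tool, args, policy_decision, tool_result):
        self.steps.append({
            "ts": time.time(),
            "model_output": model_output,
            "tool": tool,
            "args": args,
            "policy_decision": policy_decision,
            "tool_result": tool_result,
        })

    def replay(self):
        # Serialize the whole run so an incident review starts from the
        # agent's actual decisions, not from token counts.
        return json.dumps({"run_id": self.run_id, "steps": self.steps})
```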

08

Compliance gaps

Audit asks who approved this action, what the policy was at the time, and whether the data the agent touched left the region. The honest answer is 'we don't know'.

Mitigation

Tamper-evident, append-only action log with policy version, approver, inputs, outputs, and decision per call. The control evidence is generated automatically as agents run.
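Tamper evidence typically comes from hash-chaining: each entry includes the previous entry's digest, so rewriting history breaks the chain. A minimal sketch, with illustrative entry fields:

```python
import hashlib
import json

class AuditLog:
    """Append-only, hash-chained action log. Editing or deleting any
    past entry invalidates every digest after it."""

    def __init__(self):
        self.entries = []

    def append(self, policy_version, approver, inputs, outputs, decision):
        prev = self.entries[-1]["hash"] if self.entries else "genesis"
        body = {
            "policy_version": policy_version,
            "approver": approver,
            "inputs": inputs,
            "outputs": outputs,
            "decision": decision,
            "prev": prev,
        }
        digest = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        self.entries.append({**body, "hash": digest})

    def verify(self):
        # Walk the chain; any edited entry or broken link fails the audit.
        prev = "genesis"
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev"] != prev:
                return False
            digest = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest()
            if digest != entry["hash"]:
                return False
            prev = entry["hash"]
        return True
```

Because every call writes its own entry inline, the answer to "who approved this, under which policy?" is a lookup rather than an excavation.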

Free up to 10k actions a month

Ship agents your on-call won't dread.

Add SafeRun in three lines. Validate, block, and replay every risky tool call — before it touches production.