Pillar guide

How to secure AI agents in production.

Five controls every team running agents needs — and where the gate sits in your stack. Practical, opinionated, code-first.

The problem

Agents take real actions. Logs aren't enough.

An AI agent isn't a chat box. It calls tools — refund APIs, email senders, database writes, deploy pipelines. Once it acts, the action is done. Traditional observability tells you what already broke. Securing agents means putting a decision point before the action, not a dashboard after it.

The 5 controls

What an agent security stack looks like.

1. Declarative tool-call policies

Define what each tool is allowed to do — allowed arguments, forbidden values, rate limits, cost caps — as code that lives next to your agent. Versioned, testable, deployable. Bad calls fail at the gate, not at the database.
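
A minimal sketch of what such a policy could look like as code. The `ToolPolicy` shape, the `refundPolicy` values, and the `checkPolicy` function are hypothetical names chosen for illustration, not the SafeRun SDK's schema. Rate limits are left out to keep it short.

```typescript
// Hypothetical policy shape for illustration; not the SafeRun SDK schema.
type ArgValue = string | number | boolean;
type ToolArgs = Record<string, ArgValue>;

interface ToolPolicy {
  tool: string;
  allowedArgs: string[];                        // argument names the tool may receive
  forbiddenValues: Record<string, ArgValue[]>;  // per-argument deny-list
  maxCostUsd: number;                           // cap applied to the "amountUsd" argument
}

interface PolicyDecision {
  allowed: boolean;
  reason: string;
}

const refundPolicy: ToolPolicy = {
  tool: "issue_refund",
  allowedArgs: ["orderId", "amountUsd", "currency"],
  forbiddenValues: { currency: ["XXX"] },
  maxCostUsd: 200,
};

// Evaluates a proposed call before the tool runs and returns the first violation found.
function checkPolicy(policy: ToolPolicy, args: ToolArgs): PolicyDecision {
  for (const [name, value] of Object.entries(args)) {
    if (!policy.allowedArgs.includes(name)) {
      return { allowed: false, reason: `argument "${name}" is not allowed` };
    }
    const forbidden = policy.forbiddenValues[name] ?? [];
    if (forbidden.includes(value)) {
      return { allowed: false, reason: `value for "${name}" is forbidden` };
    }
  }
  const amount = args["amountUsd"];
  if (typeof amount === "number" && amount > policy.maxCostUsd) {
    return { allowed: false, reason: `amount ${amount} exceeds cap ${policy.maxCostUsd}` };
  }
  return { allowed: true, reason: "ok" };
}
```

Because the policy is plain data plus a pure function, it can sit in version control and be unit-tested like any other module.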

2. Human-in-the-loop approvals

Route ambiguous or high-blast-radius actions to a human via Slack, email, or webhook before they execute. Approvers see full agent context: prompt, plan, tool, arguments, policy match.
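
One way to picture the approval step, as a sketch under stated assumptions: the `PendingAction` fields mirror the context listed above, while `needsApproval`, `runWithApproval`, and the injected `Approver` callback are hypothetical names. The callback stands in for whatever Slack, email, or webhook integration delivers the request.

```typescript
// Hypothetical types for illustration; the approver transport is injected
// as a callback rather than implemented here.
interface PendingAction {
  tool: string;
  args: Record<string, unknown>;
  prompt: string;       // what the user asked for
  plan: string;         // the agent's stated plan
  policyMatch: string;  // which policy rule flagged the call
  blastRadius: "low" | "high";
}

type Verdict = "approved" | "rejected";
type Approver = (request: PendingAction) => Promise<Verdict>;

type ApprovalOutcome<T> =
  | { status: "executed"; result: T }
  | { status: "rejected" };

// In this sketch, only high-blast-radius actions wait for a human.
function needsApproval(action: PendingAction): boolean {
  return action.blastRadius === "high";
}

async function runWithApproval<T>(
  action: PendingAction,
  approver: Approver,
  execute: () => Promise<T>,
): Promise<ApprovalOutcome<T>> {
  if (needsApproval(action)) {
    const verdict = await approver(action);
    if (verdict === "rejected") {
      return { status: "rejected" }; // the tool is never called
    }
  }
  return { status: "executed", result: await execute() };
}
```

The key property is ordering: `execute` runs only after the approver resolves, so a rejection leaves no side effects to undo.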

3. Loop and circuit breakers

Detect runaway agents repeating the same tool call, cap retries, and trip a circuit breaker when cost or error rate spikes. That is what stops a $4,500 refund, or 12 emails to the same lead, before it goes out.
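
A rough sketch of the breaker logic, assuming a hypothetical `LoopBreaker` class with made-up thresholds. It covers repeated identical calls and a spend cap; error-rate tracking is omitted for brevity.

```typescript
// Hypothetical breaker for illustration; thresholds are chosen by the caller.
class LoopBreaker {
  private readonly maxRepeats: number;
  private readonly maxSpendUsd: number;
  private repeats = new Map<string, number>(); // identical-call counts for this run
  private spentUsd = 0;
  private tripped = false;

  constructor(maxRepeats: number, maxSpendUsd: number) {
    this.maxRepeats = maxRepeats;
    this.maxSpendUsd = maxSpendUsd;
  }

  // Returns true if the call may proceed; false once the breaker has tripped.
  allow(tool: string, args: unknown, costUsd: number): boolean {
    if (this.tripped) return false;

    // Calls count as identical when tool name and serialized arguments match.
    // JSON.stringify is key-order sensitive, which is acceptable for a sketch.
    const key = `${tool}:${JSON.stringify(args)}`;
    const count = (this.repeats.get(key) ?? 0) + 1;
    this.repeats.set(key, count);

    if (count > this.maxRepeats || this.spentUsd + costUsd > this.maxSpendUsd) {
      this.tripped = true; // stays tripped until reset() is called
      return false;
    }
    this.spentUsd += costUsd;
    return true;
  }

  reset(): void {
    this.repeats.clear();
    this.spentUsd = 0;
    this.tripped = false;
  }
}
```

With `new LoopBreaker(3, 5000)`, the fourth identical call returns `false`, and so does every later call until `reset()` runs.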

4. Replay debugger

Step through every agent run frame by frame: model output, tool selected, arguments, policy decision, tool result, latency, cost. Turn 'the agent did something weird' into a reproducible incident.
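
To make the idea concrete, here is a sketch of a recorded frame and a replay pass. The `RunFrame` fields follow the list above; `replayRun` and `PolicyFn` are hypothetical names, and the replay re-evaluates decisions only, without calling any tool.

```typescript
// Hypothetical frame record for illustration; not the SafeRun SDK's format.
type Decision = "allowed" | "blocked";

interface RunFrame {
  step: number;
  modelOutput: string;
  tool: string;
  args: Record<string, unknown>;
  decision: Decision;
  toolResult: string | null; // null when the call was blocked
  latencyMs: number;
  costUsd: number;
}

type PolicyFn = (tool: string, args: Record<string, unknown>) => Decision;

interface DecisionDiff {
  step: number;
  tool: string;
  was: Decision;
  now: Decision;
}

// Re-evaluates each recorded frame under a candidate policy and
// reports only the steps whose decision would change.
function replayRun(frames: RunFrame[], candidate: PolicyFn): DecisionDiff[] {
  return frames
    .map((f) => ({ step: f.step, tool: f.tool, was: f.decision, now: candidate(f.tool, f.args) }))
    .filter((d) => d.was !== d.now);
}
```

Replaying a past run against a stricter policy shows exactly which steps the new rule would have stopped, which is one way to check a rule before deploying it.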

5. Tamper-evident action log

Every decision — allowed, blocked, approved, rejected — is written to an append-only log with full context. The audit trail your security and compliance teams will ask for on day one.
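
Hash chaining is one common way to make an append-only log tamper-evident; the guide does not say which mechanism SafeRun uses, so treat this as an illustrative sketch. `LogEntry`, `appendEntry`, and `verifyChain` are hypothetical names, and the only library call is Node's built-in `createHash`.

```typescript
import { createHash } from "node:crypto";

type Outcome = "allowed" | "blocked" | "approved" | "rejected";

// Hypothetical entry shape for illustration.
interface LogEntry {
  seq: number;
  outcome: Outcome;
  tool: string;
  context: string;  // serialized prompt, arguments, policy match, and so on
  prevHash: string; // hash of the previous entry, or GENESIS for the first
  hash: string;     // SHA-256 over this entry's fields, including prevHash
}

const GENESIS = "0".repeat(64);

function hashEntry(e: Omit<LogEntry, "hash">): string {
  return createHash("sha256")
    .update(JSON.stringify([e.seq, e.outcome, e.tool, e.context, e.prevHash]))
    .digest("hex");
}

function appendEntry(log: LogEntry[], outcome: Outcome, tool: string, context: string): LogEntry {
  const last = log[log.length - 1];
  const body = { seq: log.length, outcome, tool, context, prevHash: last ? last.hash : GENESIS };
  const entry: LogEntry = { ...body, hash: hashEntry(body) };
  log.push(entry);
  return entry;
}

// Recomputes every hash; editing, reordering, or removing an entry breaks the chain.
function verifyChain(log: LogEntry[]): boolean {
  let prev = GENESIS;
  let expectedSeq = 0;
  for (const e of log) {
    if (e.seq !== expectedSeq || e.prevHash !== prev || hashEntry(e) !== e.hash) return false;
    prev = e.hash;
    expectedSeq += 1;
  }
  return true;
}
```

Each entry commits to the one before it, so changing any past decision invalidates every hash that follows. Truncating the tail is only detectable if the latest hash is also stored somewhere the writer cannot modify.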

Architecture

Where the gate sits.

Inline, between the agent runtime and your tools. Same process, no proxy.

Agent (LLM runtime) → SafeRun (Validate · Approve · Log) → Tools (APIs · DBs · Email)

Three lines

Wrap the tool. That's it.

tools.ts (TypeScript)
import { guard } from "@saferun/sdk";

// `tool` is your existing tool; `args` are the arguments the agent chose for this call.
const safeTool = guard(tool, { policy: "production", approval: "slack" });
await safeTool.execute(args);
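
For readers who want to see the inline pattern itself, here is a sketch of a same-process wrapper that validates, logs, and only then calls the tool. `wrapWithGate`, `GateHooks`, and `BlockedError` are hypothetical names used to illustrate the architecture; this is not how the SafeRun SDK's `guard` is implemented, and the approval step is left out.

```typescript
// Hypothetical in-process gate; a sketch of the pattern, not the SafeRun SDK.
type ToolFn<A, R> = (args: A) => Promise<R>;

interface GateEvent<A> {
  outcome: "allowed" | "blocked";
  args: A;
  reason?: string;
}

interface GateHooks<A> {
  validate: (args: A) => string | null; // returns a reason to block, or null to allow
  log: (event: GateEvent<A>) => void;
}

class BlockedError extends Error {}

// Returns a function with the same signature as the tool, so callers
// do not change; the decision happens before the tool is invoked.
function wrapWithGate<A, R>(tool: ToolFn<A, R>, hooks: GateHooks<A>): ToolFn<A, R> {
  return async (args: A) => {
    const reason = hooks.validate(args);
    if (reason !== null) {
      hooks.log({ outcome: "blocked", args, reason });
      throw new BlockedError(reason); // the underlying tool is never invoked
    }
    hooks.log({ outcome: "allowed", args });
    return tool(args);
  };
}
```

Because the wrapper is an ordinary function in the agent's own process, there is no network hop and no proxy to deploy, which matches the "same process" placement described in the architecture section.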

Runtime action-control for AI agents

Give production agents a checkpoint before they act.

Wrap risky tool calls, pause or block what shouldn't run, and replay the decision so teams can turn each near-miss into a rule.