Pillar guide

LLM observability, extended to the whole agent.

Token traces tell you what the model said. SafeRun records what the agent did — every tool call, argument, return value, and policy decision — and lets you replay any failure step by step.

The gap

LLM tracing ends where the agent begins.

Classic LLM observability — LangSmith, Langfuse, Helicone, Arize Phoenix — does one thing well: it captures prompts, completions, tokens, and latency. That's enough when your product is a chat completion.

But agents are loops. They call tools, pass arguments, get results back, retry, branch, and sometimes do real damage. To debug them you need the action trace, not just the model trace — and you need to replay it.

Six pillars

What agent-grade observability looks like.

Full-trace capture

Every prompt, model response, tool call, argument, return value, latency, and cost — recorded at the action layer, not just the token layer.
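To make "recorded at the action layer" concrete, here is a minimal sketch of what an action-level trace record could look like and how a tool gets wrapped so every call is captured. The record shape and `recordTool` helper are illustrative assumptions, not the real SafeRun schema or SDK API.

```typescript
// Hypothetical shape of one action-layer trace record (illustration only).
interface ActionRecord {
  step: number;
  tool: string;      // which tool the agent invoked
  args: unknown;     // arguments exactly as passed
  result: unknown;   // return value from the tool
  latencyMs: number;
  costUsd: number;
}

// Minimal in-memory recorder: wraps a tool function so every call is captured.
function recordTool<A, R>(
  trace: ActionRecord[],
  tool: string,
  fn: (args: A) => R,
  costUsd = 0,
) {
  return (args: A): R => {
    const start = Date.now();
    const result = fn(args);
    trace.push({
      step: trace.length,
      tool,
      args,
      result,
      latencyMs: Date.now() - start,
      costUsd,
    });
    return result;
  };
}

const trace: ActionRecord[] = [];
const search = recordTool(trace, "search", (q: string) => `results for ${q}`);
search("refund policy");
```

The point of the sketch: the trace carries arguments and return values, not just the surrounding prompt and completion, which is what makes a failed step reproducible.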

Time-travel replay

Re-run any failed agent step with the exact context. Reproduce production bugs locally without copy-pasting JSON between Slack threads.
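"Re-run with the exact context" reduces to a simple idea: take the recorded tool name and arguments, and feed them to a local copy of the tool. The `StepRecord` shape and registry below are assumptions for illustration, not the SDK's replay interface.

```typescript
// Replay sketch: re-execute a recorded step against a local tool registry.
interface StepRecord {
  tool: string;
  args: unknown[];
}

type ToolRegistry = Record<string, (...args: unknown[]) => unknown>;

function replayStep(record: StepRecord, tools: ToolRegistry): unknown {
  const fn = tools[record.tool];
  if (!fn) throw new Error(`unknown tool: ${record.tool}`);
  // Exact recorded arguments, so the local run sees the production context.
  return fn(...record.args);
}

// Reproduce a hypothetical production failure locally:
const failed: StepRecord = { tool: "refund", args: [{ orderId: "A-17", amount: -5 }] };
const tools: ToolRegistry = {
  refund: (req: any) => {
    if (req.amount < 0) throw new Error("negative amount");
    return "ok";
  },
};

let error = "";
try {
  replayStep(failed, tools);
} catch (e) {
  error = (e as Error).message;
}
```

Because the arguments come from the trace rather than a Slack paste, the local run fails the same way production did.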

Live action stream

Watch agent runs as they happen. Filter by agent, tool, status, or user — find the bad run before the customer files a ticket.

Diffs across model versions

Compare runs across model upgrades. Catch the silent regression where GPT-5 picks the wrong tool 3% more often than 4o.
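Catching that kind of regression is, at its core, a per-model comparison over traced runs. A minimal sketch, with invented run data and a hypothetical `wrongToolRate` metric:

```typescript
// Sketch: compare tool-selection accuracy between two model versions
// to surface a silent regression. Run data is illustrative.
interface Run {
  model: string;
  toolChosen: string;
  expectedTool: string;
}

function wrongToolRate(runs: Run[], model: string): number {
  const mine = runs.filter((r) => r.model === model);
  const wrong = mine.filter((r) => r.toolChosen !== r.expectedTool).length;
  return mine.length ? wrong / mine.length : 0;
}

const runs: Run[] = [
  { model: "4o", toolChosen: "search", expectedTool: "search" },
  { model: "4o", toolChosen: "search", expectedTool: "search" },
  { model: "gpt-5", toolChosen: "search", expectedTool: "search" },
  { model: "gpt-5", toolChosen: "refund", expectedTool: "search" },
];

// Positive drift means the new model picks the wrong tool more often.
const drift = wrongToolRate(runs, "gpt-5") - wrongToolRate(runs, "4o");
```

The diff only works if tool choices were recorded per run in the first place, which is why it sits downstream of full-trace capture.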

Inline policy decisions

Every block, allow, and approval recorded next to the action it touched. Observability and control in the same trace.

Anomaly + cost alerts

Page when block rate, retry rate, or per-run cost drifts off baseline. Stop runaway loops before they touch your AWS bill.
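"Drifts off baseline" can be sketched as a relative-threshold check; the 50% tolerance below is an assumed default, not a SafeRun setting:

```typescript
// Sketch of a baseline-drift check: flag when a metric moves more than
// `tolerance` (relative) away from its baseline.
function driftsOffBaseline(
  current: number,
  baseline: number,
  tolerance = 0.5,
): boolean {
  if (baseline === 0) return current > 0;
  return Math.abs(current - baseline) / baseline > tolerance;
}

// e.g. per-run cost jumped from a $0.04 baseline to $0.09 — page.
const shouldPage = driftsOffBaseline(0.09, 0.04);
```

The same check applies to block rate and retry rate; a runaway loop shows up as all three drifting at once.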

Side by side

LLM observability vs agent observability.

Layer                   | LLM tracing | SafeRun (agent layer)
Prompts & completions   | Yes         | Yes
Token + cost per call   | Yes         | Yes
Tool calls + arguments  | Partial     | Yes
Tool return values      | Rare        | Yes
Step-by-step replay     | No          | Yes
Policy decisions inline | No          | Yes
Human approvals         | No          | Yes
Loop + cost breakers    | No          | Yes
Three lines

Wire observability in once.

agent.ts
import { observe } from "@saferun/sdk";

observe(agent, { service: "support-agent-v2", env: "prod" });
// Step-by-step traces, replay, and policy decisions in one timeline.
FAQ

Common questions about LLM observability.

Free up to 10k actions a month

Ship agents your on-call won't dread.

Add SafeRun in three lines. Validate, block, and replay every risky tool call — before it touches production.