AI Agent Observability Platform

Debug your AI agents before your users do.

OopsTrace gives you a live trace tree for every agent run. Error spans light up the moment they fail — click one to open an AI chat pre-loaded with full context and fix the bug instantly.


Works with every major AI framework

OpenAI, Anthropic, LangChain, LlamaIndex, CrewAI, AutoGen, Haystack, DSPy

Built for production AI teams

Everything you need to observe, debug, and improve your agents — in one place.

Real-time trace streaming
Watch spans arrive as your agent runs. Live streaming shows you exactly what's happening the moment it happens — no refresh needed.
Instant error detection
Error spans light up the moment they fail. Stack traces, status codes, and messages — surfaced automatically in the trace tree.
One-click LLM fix
Click a failing span. An AI assistant opens with full trace context pre-loaded — inputs, outputs, parent chain, token usage. Fix in seconds.
Multi-agent tracing
Trace across agent boundaries, tool calls, and sub-agents in a single unified tree. See the full execution graph of complex pipelines.
Session & user analytics
Group traces by user session, correlate failures across runs, and drill into cohort-level behaviour — no separate BI tool needed.
Prompt version tracking
Track which prompt version caused a failure. Compare outputs across model versions and prompt iterations directly from the trace view.

From deploy to fix in minutes

No complex setup. Add two lines, open the dashboard, and start debugging.

01
Instrument in 2 lines
Add the OopsTrace SDK to your agent. One decorator captures the full execution tree — all tool calls, LLM requests, and sub-agents automatically.
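from oopstrace import OopsTrace

# Initialise once at startup with your API key.
OopsTrace.init(api_key="ot_...")

# The decorator records my_agent and everything it calls as one trace.
@OopsTrace.trace
async def my_agent(query: str):
    # llm_call is a placeholder for your own model call.
    result = await llm_call(query)
    return result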
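Curious how one decorator can capture a whole execution tree? Below is a minimal sketch of the general technique, not OopsTrace's actual implementation. It assumes a context variable that remembers the currently running span, so nested calls attach themselves to their parent. The names `Span`, `trace`, `ROOTS`, and `_current_span` are hypothetical and exist only for this illustration.

```python
import asyncio
import contextvars
import functools
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    # Hypothetical span record; a real SDK would carry far more metadata.
    name: str
    parent: Optional["Span"] = field(default=None, repr=False)
    children: list = field(default_factory=list)
    duration_s: float = 0.0
    error: Optional[str] = None


# Remembers which span is executing in the current async context.
_current_span = contextvars.ContextVar("current_span", default=None)
ROOTS: list = []  # top-level spans, one per traced agent run


def trace(fn):
    """Wrap an async function so each call is recorded as a span."""
    @functools.wraps(fn)
    async def wrapper(*args, **kwargs):
        parent = _current_span.get()
        span = Span(name=fn.__name__, parent=parent)
        (parent.children if parent else ROOTS).append(span)
        token = _current_span.set(span)
        start = time.perf_counter()
        try:
            return await fn(*args, **kwargs)
        except Exception as exc:
            # Mark the span as failed, then let the error propagate.
            span.error = f"{type(exc).__name__}: {exc}"
            raise
        finally:
            span.duration_s = time.perf_counter() - start
            _current_span.reset(token)
    return wrapper


@trace
async def llm_call(query: str):
    raise RuntimeError("quota exceeded")  # simulated failure


@trace
async def my_agent(query: str):
    try:
        return await llm_call(query)
    except RuntimeError:
        return "fallback answer"


result = asyncio.run(my_agent("hello"))
```

After the run, `ROOTS[0]` is the `my_agent` span with a single failed `llm_call` child, which is the shape a trace tree needs.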
02
Observe in real time
Open the dashboard and watch spans arrive as your agent executes. Each span shows timing, token usage, cost, and status as it runs.
# Spans appear as your agent runs:
#
# ● my_agent                    2.4s  ✓
#   ● retrieve_context           320ms ✓
#   ● llm_call                   1.1s  ✗  ← Oops!
#     ● tool_execution            240ms ✗
#   ● generate_response          680ms ✓
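To make the tree view concrete, here is a rough sketch of how a nested span structure could be flattened into indented lines like the ones above. It is an illustration only: the `Span` fields, `format_duration`, and `render` are assumed names, not part of the OopsTrace SDK, and the sample values simply mirror the example trace.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    # Hypothetical span shape used only for this sketch.
    name: str
    duration_ms: int
    ok: bool
    children: list = field(default_factory=list)


def format_duration(ms: int) -> str:
    """Show milliseconds below one second, seconds otherwise."""
    return f"{ms}ms" if ms < 1000 else f"{ms / 1000:.1f}s"


def render(span: Span, depth: int = 0) -> list:
    """Depth-first walk that indents each span under its parent."""
    status = "✓" if span.ok else "✗"
    line = f"{'  ' * depth}● {span.name} {format_duration(span.duration_ms)} {status}"
    lines = [line]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


tree = Span("my_agent", 2400, True, [
    Span("retrieve_context", 320, True),
    Span("llm_call", 1100, False, [Span("tool_execution", 240, False)]),
    Span("generate_response", 680, True),
])

lines = render(tree)
print("\n".join(lines))
```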
03
Fix with AI
Click the red span. An AI assistant opens with the full trace context — error message, inputs, outputs, parent chain. Diagnose and fix instantly.
# AI assistant context (auto-injected):
# Span:  llm_call
# Error: RateLimitError — quota exceeded
# Input: { model: "gpt-4o", messages: [...] }
# Parent: my_agent
#
> Why did this span fail?
> Suggest a retry strategy with backoff.
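For a rate-limit failure like the one above, a suggested fix might look something like the following. This is a minimal sketch of retrying with exponential backoff and full jitter, not output from the assistant. `RateLimitError` here is a stand-in class rather than any provider's real exception, `retry_with_backoff` and `flaky_llm_call` are hypothetical helpers, and the code is synchronous for brevity even though the agent in the earlier example is async.

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for a provider's rate-limit exception (hypothetical)."""


def retry_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=8.0,
                       sleep=time.sleep):
    """Call fn, retrying on RateLimitError with exponentially growing waits."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            # The cap doubles each attempt: 0.5s, 1s, 2s, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: wait a random time in [0, cap] to spread out retries.
            sleep(random.uniform(0, cap))


calls = {"n": 0}


def flaky_llm_call():
    """Simulated call that is rate-limited twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError("quota exceeded")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = retry_with_backoff(flaky_llm_call, sleep=delays.append)
```

Passing `sleep=delays.append` keeps the example instant to run; in real code the default `time.sleep` would apply the waits.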

Deploy on your infrastructure

Run OopsTrace on your own servers. Your traces stay in your environment — no data leaves your infrastructure. Full control, no vendor lock-in.

$ git clone https://github.com/oopstrace/oopstrace
$ cd oopstrace
$ docker compose up -d
$ # Dashboard ready at localhost:3000

Stop guessing.
Start tracing.

Free plan available. No credit card required. Up and running in under 5 minutes.