Every bot run is captured end-to-end — messages, tool calls, guardrail evaluations, verification outcomes, and latency. Inspect one trace in a split-view replay, or zoom out to workflow-level analytics across thousands of runs in a single screen.
Nothing sampled at capture
Every turn stores the full replay payload — prompts, tool traces, guardrail evaluations, timeline events. The analytics screen samples for speed; the archive keeps the truth.
One pane for health + debug
Traces answer “what happened in that one run?”; Analytics answers “what’s happening across all of them?”. Same filters, same environment split, same bot scope — zero context switching.
Replay + export built in
Open any trace to a full timeline replay. Copy it out as a plain-text report your on-call engineer or a language model can triage in seconds.
One pipeline, two surfaces
The same data model powers the Trace Inspector and the Analytics dashboard. You never have to wonder whether a chart and a trace disagree — they don’t.
Run → Inspect → Understand
one pipeline · two surfaces
Run
bot handles a turn
Capture
every field, every call
Store
trace doc + replay payload
Inspect
search, filter, split-view
Aggregate
sampled analytics window
Replay
export to any LLM
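To make the Store step concrete, here is a minimal TypeScript sketch of the two records it could keep per turn. The field names are illustrative, mirroring the inspector fields shown below, not the product's actual schema:

```ts
// Hypothetical shapes for the two records stored per turn. Field names are
// illustrative (drawn from the inspector fields on this page), not the real schema.
interface TraceDoc {
  trace_id: string;                       // e.g. "trc_9f82…b14"
  session_id: string;
  bot: string;                            // e.g. "support-bot"
  environment: "test" | "prod";
  source?: "dashboard";                   // set on dashboard bot-chat runs
  final_stage: string;                    // e.g. "investigation"
  action: string;                         // e.g. "refund.issue"
  verification: "pass" | "warn" | "fail";
  latency_ms: number;
  tokens: { prompt: number; completion: number; total: number };
}

interface ReplayPayload {
  trace_id: string;
  events: unknown[];                      // full timeline, modeled further below
}
```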
Traces
Routing, verification, tools, evidence, and timing — captured with the same schema for dashboard test runs and production API traffic.
What every trace captures
full-fidelity · no sampling
Routing decisions
Verification
Escalation
Tool calls
Evidence & retrieval
Performance
Trace Inspector — timeline replay
pass
Trace ID
trc_9f82…b14
Bot
support-bot
Stage → Action
investigation → refund.issue
Verification
PASS · no reason codes
Latency
1,248 ms
Tokens
prompt 412 · completion 196 · total 608
Timeline (8 events)
stage.classified
Stage predicted with component + version — inspect reason codes.
action.selected
Selected action + class; confidence and verifier inputs saved.
tool.invoked
Tool name, inputs, success, latency, and full JSON output.
guardrail.evaluated
Guardrail name, type, action, and whether the condition matched.
guardrail.triggered
Which guardrail blocked or rewrote — full context preserved.
mcp.called
MCP server, tool, disposition (approved / denied / auto), cost, latency.
missionControl.augmented
Operator whispers, pauses, takeovers — recorded in the turn.
response.composed
Final response text + tokens used (prompt, completion, total).
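As a rough picture of what the replay payload holds, the eight event types above could be modeled as a discriminated union. The payload fields below are assumptions inferred from the descriptions, not the real event schema:

```ts
// Hypothetical discriminated union for the eight timeline event types above.
// Payload fields are inferred from their descriptions, not the real schema.
type TimelineEvent =
  | { type: "stage.classified"; stage: string; component: string;
      version: string; reasonCodes: string[] }
  | { type: "action.selected"; action: string; actionClass: string;
      confidence: number }
  | { type: "tool.invoked"; tool: string; inputs: unknown; success: boolean;
      latencyMs: number; output: unknown }
  | { type: "guardrail.evaluated"; name: string; guardrailType: string;
      action: string; matched: boolean }
  | { type: "guardrail.triggered"; name: string;
      disposition: "blocked" | "rewrote"; context: unknown }
  | { type: "mcp.called"; server: string; tool: string;
      disposition: "approved" | "denied" | "auto"; cost: number; latencyMs: number }
  | { type: "missionControl.augmented"; kind: "whisper" | "pause" | "takeover" }
  | { type: "response.composed"; text: string;
      tokens: { prompt: number; completion: number; total: number } };
```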
The Traces workspace
Search across everything
Instant substring search over trace_id, session_id, final_stage, action ID, and bot name — no query language to learn.
Verification + bot + environment filters
Narrow to pass / warn / fail, filter to a single bot, and flip between Test and Prod traffic on one page.
Date-range scoping
From/To pickers scope the list to any window — debug a specific incident or review yesterday’s runs.
Split-pane inspector
Pick a row on the left, see the full replay on the right — summary, verification, escalation, tools, timeline.
Full decision replay
Every trace opens to the exact stage classifier outputs, action selection, verification checks, tool results, and final response.
Copy trace for LLM review
One-click export builds a BOT TRACE REPORT (performance, guardrails, resources, Mission Control augmentations) — paste it into any model for root-cause analysis.
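For a sense of what the one-click export produces, here is a hypothetical builder that flattens a trace into a plain-text report. It reuses the TraceDoc and ReplayPayload shapes sketched earlier; the real button's layout may differ:

```ts
// Hypothetical version of the Copy Trace export; the actual report layout
// may differ. Reuses the TraceDoc and ReplayPayload sketches from earlier.
function buildTraceReport(doc: TraceDoc, replay: ReplayPayload): string {
  return [
    "BOT TRACE REPORT",
    `trace: ${doc.trace_id} · bot: ${doc.bot} · env: ${doc.environment}`,
    `stage → action: ${doc.final_stage} → ${doc.action}`,
    `verification: ${doc.verification} · latency: ${doc.latency_ms} ms`,
    `tokens: prompt ${doc.tokens.prompt} · completion ${doc.tokens.completion} · total ${doc.tokens.total}`,
    "",
    `TIMELINE (${replay.events.length} events)`,
    ...replay.events.map((e, i) => `${i + 1}. ${JSON.stringify(e)}`),
  ].join("\n");
}
```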
Analytics
Verification trends, latency distribution, top stages, action mix, and reason-code buckets — with a bot filter and environment split on the same page.
Ten KPIs on one screen
workflow health
Total Runs
with sampled window size
Pass
verification passed
Warn
verification warnings
Fail
verification failed
Blocked
policy or guardrail block
Avg Latency
mean over runs
Escalated
handed to human
Executed Actions
action was taken
Safety Overrides
policy override fired
Avg Events / Run
pipeline density
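A minimal sketch of how a few of these tiles could be rolled up from a sampled window, reusing the hypothetical TraceDoc shape sketched earlier; the remaining tiles (Blocked, Escalated, Safety Overrides, and so on) would follow the same pattern over their respective flags:

```ts
// Illustrative rollup of a few KPI tiles over a sampled window of runs.
// Not the dashboard's own computation; TraceDoc is the earlier sketch.
function computeKpis(runs: TraceDoc[]) {
  const count = (p: (t: TraceDoc) => boolean) => runs.filter(p).length;
  return {
    totalRuns: runs.length, // reported alongside the sampled window size
    pass: count(t => t.verification === "pass"),
    warn: count(t => t.verification === "warn"),
    fail: count(t => t.verification === "fail"),
    avgLatencyMs:
      runs.reduce((s, t) => s + t.latency_ms, 0) / Math.max(runs.length, 1),
  };
}
```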
Six built-in charts
trends · distributions · reason codes
Verification Trend
Last 40 runs as vertical bars, colored by outcome. Spot regressions the moment they appear.
Strip chart · pass / warn / fail
Verification Distribution
Donut chart of pass vs warn vs fail across the sampled window.
Donut · three segments
Latency Distribution
Bars for <100ms, 100–250ms, 250–500ms, 500ms–1s, and >1s so you can see the shape, not just the mean (bucketing sketched after this list).
Histogram · 5 buckets
Top Stages
Which workflow stages dominate traffic — useful for capacity planning and prompt tuning.
Horizontal bars · top N
Top Actions
The actions your bot actually takes most often, ranked.
Horizontal bars · top N
Reason Codes
Verifier and escalation reason-code buckets — go from “what failed” to “why it failed” in one click.
Bucket list · two rails
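The latency bins are fixed, so the histogram is a single pass over the window. A minimal bucketing sketch, with bin edges taken from the chart description above:

```ts
// Single-pass bucketing for the latency histogram; bin edges come straight
// from the chart description above. A sketch, not the dashboard's code.
const LATENCY_BUCKETS = [
  { label: "<100ms", maxMs: 100 },
  { label: "100–250ms", maxMs: 250 },
  { label: "250–500ms", maxMs: 500 },
  { label: "500ms–1s", maxMs: 1000 },
  { label: ">1s", maxMs: Infinity },
];

function latencyHistogram(latenciesMs: number[]): Record<string, number> {
  const counts: Record<string, number> = Object.fromEntries(
    LATENCY_BUCKETS.map(b => [b.label, 0])
  );
  for (const ms of latenciesMs) {
    counts[LATENCY_BUCKETS.find(b => ms < b.maxMs)!.label]++;
  }
  return counts;
}
```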
Dashboard widgets — glance-level signals
Quick Stats
Total traces, plan usage %, avg latency, pass rate, guardrails triggered, inputs blocked.
Request Volume (7d)
Daily bar chart of traffic across the last seven days, by weekday.
Latency gauge
Average, P95, and P99 latency — so a spike can’t hide behind the mean.
Model Usage
Top 6 models by call count, with total tokens per model.
Verification Health
Stacked bar + tile grid for pass / warn / fail at a glance.
Recent Traces
Eight most recent traces, clickable straight into the inspector.
Every dashboard bot-chat run is tagged with environment: "test" and source: "dashboard". Real API calls are tagged environment: "prod". Flip the environment pill on the Traces page or the Analytics page to split them, combine them, or isolate either one — stats recompute locally so there’s no waiting.
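Because stats recompute locally, the pill amounts to a pure client-side filter. A minimal sketch, assuming the runs are already in memory as TraceDoc rows:

```ts
// The environment pill as a client-side filter over already-fetched runs;
// no server round-trip, so downstream stats recompute instantly.
type EnvPill = "all" | "test" | "prod";

function applyEnvironmentPill(runs: TraceDoc[], pill: EnvPill): TraceDoc[] {
  return pill === "all" ? runs : runs.filter(r => r.environment === pill);
}

// e.g. recompute the KPI tiles for prod-only traffic:
// computeKpis(applyEnvironmentPill(allRuns, "prod"));
```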
Answers
Is every run recorded, or just a sample?
Every run is captured end-to-end — messages, tool calls, guardrail evaluations, verification outcomes, tokens, and latency. The dashboard’s analytics view samples the most recent runs for speed, but the underlying trace store keeps the full record for replay and audit.
Can I separate test traffic from production?
Yes. Dashboard bot-chat runs are tagged with environment: "test" and source: "dashboard"; real API calls are environment: "prod". Both the Traces page and the Analytics page let you flip between All, Test, and Prod so test noise never pollutes your production metrics.
What’s the difference between a trace and an analytics row?
A trace is one complete decision — every input, every tool call, every guardrail, and the final response. Analytics aggregates many traces: distributions, reason codes, stage counts, latency buckets, model usage. You move between them with one click; every analytics bar drills into the traces behind it.
Can I hand a trace to an LLM for root-cause analysis?
Yes. The Copy Trace button produces a plain-text BOT TRACE REPORT with performance, flags, guardrails, resources, evolution rules, and any Mission Control activity. Paste it into any model and ask “why did this fail?” — no schema knowledge required.
How does this tie into Mission Control and Agent Evolution?
Every trace carries the Mission Control whispers and takeovers that shaped it, plus any Agent Evolution rules that were applied. When a reviewer promotes a rule or an operator intervenes live, those signals become searchable fields on the next trace and inputs to the next analytics window.
Can I pipe traces into my own tooling?
Runs are fetched from the company runs API (GET /v1/analytics/companies/{id}/runs) and each trace has a detail endpoint (GET /v1/traces/{id}/replay). Wire those into your SIEM, warehouse, or on-call tools directly.
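A minimal TypeScript sketch of wiring those two endpoints into your own tooling. The endpoint paths come from the answer above; the base URL and auth header are assumptions:

```ts
// Pulling runs and a single replay into a SIEM, warehouse, or on-call tool.
// Endpoint paths are as documented; host and auth scheme are assumptions.
const BASE = "https://api.example.com"; // hypothetical host
const HEADERS = { Authorization: "Bearer <API_KEY>" }; // assumed auth scheme

async function fetchCompanyRuns(companyId: string): Promise<unknown[]> {
  const res = await fetch(`${BASE}/v1/analytics/companies/${companyId}/runs`, {
    headers: HEADERS,
  });
  if (!res.ok) throw new Error(`runs fetch failed: ${res.status}`);
  return res.json();
}

async function fetchTraceReplay(traceId: string): Promise<unknown> {
  const res = await fetch(`${BASE}/v1/traces/${traceId}/replay`, {
    headers: HEADERS,
  });
  if (!res.ok) throw new Error(`replay fetch failed: ${res.status}`);
  return res.json();
}
```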
Ready to inspect?
Traces are on by default — every run is captured the moment you send it. Analytics starts aggregating from the first call, and the Inspector opens to a full replay with one click.