Traces + Analytics

See every decision. Replay any conversation.

Every bot run is captured end-to-end — messages, tool calls, guardrail evaluations, verification outcomes, and latency. Inspect one trace in a split-view replay, or zoom out to workflow-level analytics across thousands of runs in a single screen.

Nothing sampled at capture

Every turn stores the full replay payload — prompts, tool traces, guardrail evaluations, timeline events. The analytics screen samples for speed; the archive keeps the truth.

One pane for health + debug

Traces answer “what happened in that one run?”; Analytics answers “what’s happening across all of them?”. Same filters, same environment split, same bot scope — zero context switching.

Replay + export built in

Open any trace to a full timeline. Export it as a plain-text report your on-call engineer or a language model can triage in seconds.

One pipeline, two surfaces

Run. Capture. Store. Inspect. Aggregate. Replay.

The same data model powers the Trace Inspector and the Analytics dashboard. You never have to wonder whether a chart and a trace disagree — they don’t.

Run → Inspect → Understand

one pipeline · two surfaces
01

Run

bot handles a turn

02

Capture

every field, every call

03

Store

trace doc + replay payload

04

Inspect

search, filter, split-view

05

Aggregate

sampled analytics window

06

Replay

export to any LLM

Traces

Full-fidelity record for every turn.

Routing, verification, tools, evidence, and timing — captured with the same schema for dashboard test runs and production API traffic.

What every trace captures

full-fidelity · no sampling

Routing decisions

  • predicted_stage
  • final_stage
  • selected_action_id
  • selected_action_class
  • router_mode
  • router_version

Verification

  • verification_outcome (pass / warn / fail)
  • verification_reason_codes
  • verification_checks_run
  • verification_failures
  • verification_warnings

Escalation

  • should_escalate
  • escalation_target
  • escalation_reason_codes
  • safety_override_applied

Tool calls

  • tool_call_count
  • tool_name per call
  • success / error
  • latency_ms per tool
  • full output payload

Evidence & retrieval

  • selected_evidence_ids
  • evidence_count
  • retrieval hits
  • KB connection used

Performance

  • latency_ms
  • started_at / completed_at
  • execution_status
  • has_errors
  • errors[]
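Put together, the fields above form one trace document per turn. The sketch below is illustrative only: the field names come from the lists above, but the nesting, grouping keys, and sample values are assumptions, not the product's actual schema.

```python
# A hypothetical trace document assembled from the captured fields listed above.
# Grouping keys ("routing", "verification", ...) and values are illustrative.
trace = {
    "trace_id": "trc_example",
    "routing": {
        "predicted_stage": "investigation",
        "final_stage": "investigation",
        "selected_action_id": "refund.issue",
        "selected_action_class": "tool_action",
        "router_mode": "auto",
        "router_version": "v3",
    },
    "verification": {
        "verification_outcome": "pass",      # pass / warn / fail
        "verification_reason_codes": [],
        "verification_checks_run": 4,
        "verification_failures": [],
        "verification_warnings": [],
    },
    "escalation": {
        "should_escalate": False,
        "escalation_target": None,
        "escalation_reason_codes": [],
        "safety_override_applied": False,
    },
    "tool_calls": [
        # One entry per call: tool_name, success/error, latency, full output.
        {"tool_name": "refunds.lookup", "success": True, "latency_ms": 182,
         "output": {"eligible": True}},
    ],
    "evidence": {"selected_evidence_ids": ["ev_1"], "evidence_count": 1},
    "performance": {"latency_ms": 1248, "execution_status": "completed",
                    "has_errors": False, "errors": []},
}
```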

Trace Inspector — timeline replay



Trace ID

trc_9f82…b14

Bot

support-bot

Stage → Action

investigation → refund.issue

Verification

PASS · no reason codes

Latency

1,248 ms

Tokens

prompt 412 · completion 196 · total 608

Timeline (8 events)

  1. stage.classified

    Stage predicted with component + version — inspect reason codes.

  2. action.selected

    Selected action + class; confidence and verifier inputs saved.

  3. tool.invoked

    Tool name, inputs, success, latency, and full JSON output.

  4. guardrail.evaluated

    Guardrail name, type, action, and whether the condition matched.

  5. guardrail.triggered

    Which guardrail blocked or rewrote — full context preserved.

  6. mcp.called

    MCP server, tool, disposition (approved / denied / auto), cost, latency.

  7. missionControl.augmented

    Operator whispers, pauses, takeovers — recorded in the turn.

  8. response.composed

    Final response text + tokens used (prompt, completion, total).

The Traces workspace

Search across everything

Instant substring search over trace_id, session_id, final_stage, action ID, and bot name — no query language to learn.

Verification + bot + environment filters

Narrow to pass / warn / fail, filter to a single bot, and flip between Test and Prod traffic on one page.

Date-range scoping

From/To pickers scope the list to any window — debug a specific incident or review yesterday’s runs.

Split-pane inspector

Pick a row on the left, see the full replay on the right — summary, verification, escalation, tools, timeline.

Full decision replay

Every trace opens to the exact stage classifier outputs, action selection, verification checks, tool results, and final response.

Copy trace for LLM review

One-click export builds a BOT TRACE REPORT (performance, guardrails, resources, Mission Control augmentations) — paste it into any model for root-cause analysis.
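A minimal sketch of what that export could look like. The "BOT TRACE REPORT" header is from the page; the exact line layout, field names, and helper function are assumptions for illustration.

```python
def format_trace_report(trace: dict) -> str:
    """Render a trace as a plain-text report an engineer or LLM can triage.
    Hypothetical helper; the real export includes performance, guardrails,
    resources, and Mission Control augmentations."""
    lines = [
        "BOT TRACE REPORT",
        f"trace_id: {trace['trace_id']}",
        f"stage -> action: {trace['final_stage']} -> {trace['selected_action_id']}",
        f"verification: {trace['verification_outcome']}",
        f"latency_ms: {trace['latency_ms']}",
    ]
    for call in trace.get("tool_calls", []):
        lines.append(f"tool: {call['tool_name']} ok={call['success']} "
                     f"latency_ms={call['latency_ms']}")
    return "\n".join(lines)
```

The point of a flat text format is that any model can parse it with no schema knowledge, which is what makes the "paste it into any LLM" workflow possible.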

Analytics

Workflow health at a glance.

Verification trends, latency distribution, top stages, action mix, and reason-code buckets — with a bot filter and environment split on the same page.

Ten KPIs on one screen

workflow health

Total Runs

with sampled window size

Pass

verification passed

Warn

verification warnings

Fail

verification failed

Blocked

policy or guardrail block

Avg Latency

mean over runs

Escalated

handed to human

Executed Actions

action was taken

Safety Overrides

policy override fired

Avg Events / Run

pipeline density

Six built-in charts

trends · distributions · reason codes

Verification Trend

Last 40 runs as vertical bars, colored by outcome. Spot regressions the moment they appear.

Strip chart · pass / warn / fail

Verification Distribution

Donut chart of pass vs warn vs fail across the sampled window.

Donut · three segments

Latency Distribution

Bars for <100ms, 100–250ms, 250–500ms, 500ms–1s, and >1s so you can see the shape, not just the mean.

Histogram · 5 buckets
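The five buckets above amount to a simple edge-based histogram. A sketch, assuming bucket boundaries are inclusive on the upper edge — the product may bucket boundaries differently:

```python
from bisect import bisect_right

# Edges in ms matching the five bars: <100, 100-250, 250-500, 500ms-1s, >1s.
EDGES = [100, 250, 500, 1000]
LABELS = ["<100ms", "100-250ms", "250-500ms", "500ms-1s", ">1s"]

def latency_histogram(latencies_ms):
    """Count runs per latency bucket. Hypothetical helper, not product code."""
    counts = dict.fromkeys(LABELS, 0)
    for ms in latencies_ms:
        # bisect_right finds which bucket the value falls into.
        counts[LABELS[bisect_right(EDGES, ms)]] += 1
    return counts
```

Bucketing is what exposes a bimodal distribution (fast cache hits plus slow tool calls) that a single mean would hide.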

Top Stages

Which workflow stages dominate traffic — useful for capacity planning and prompt tuning.

Horizontal bars · top N

Top Actions

The actions your bot actually takes most often, ranked.

Horizontal bars · top N

Reason Codes

Verifier and escalation reason-code buckets — go from “what failed” to “why it failed” in one click.

Bucket list · two rails

Dashboard widgets — glance-level signals

Quick Stats

Last 500 traces

Total traces, plan usage %, avg latency, pass rate, guardrails triggered, inputs blocked.

Request Volume (7d)

7 days

Daily bar chart of traffic across the last seven days, by weekday.

Latency gauge

Last 500 traces

Average, P95, and P99 latency — so a spike can’t hide behind the mean.

Model Usage

Last 500 traces

Top 6 models by call count, with total tokens per model.

Verification Health

Last 500 traces

Stacked bar + tile grid for pass / warn / fail at a glance.

Recent Traces

Live

Eight most recent traces, clickable straight into the inspector.

test · prod · all

Test traffic never pollutes production metrics.

Every dashboard bot-chat run is tagged with environment: "test" and source: "dashboard". Real API calls are tagged prod. Flip the environment pill on the Traces page or the Analytics page to split them, combine them, or isolate either one — stats recompute locally so there’s no waiting.
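The local recompute described above is an in-memory filter over already-fetched traces. A sketch: the `environment` and `verification_outcome` field names follow the page, while the helper itself and the stat shape are illustrative.

```python
def split_stats(traces, environment="all"):
    """Recompute run count and pass rate for the chosen environment pill.
    Illustrative only; the dashboard's actual stat set is richer."""
    if environment != "all":
        traces = [t for t in traces if t["environment"] == environment]
    total = len(traces)
    passed = sum(t["verification_outcome"] == "pass" for t in traces)
    return {"runs": total, "pass_rate": passed / total if total else 0.0}

runs = [
    {"environment": "test", "source": "dashboard", "verification_outcome": "pass"},
    {"environment": "prod", "source": "api", "verification_outcome": "fail"},
    {"environment": "prod", "source": "api", "verification_outcome": "pass"},
]
```

Because the filter runs client-side over the sampled window, flipping between All, Test, and Prod never triggers a round trip.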

What you gain

  • Ship test iterations without distorting production dashboards
  • Compare test vs prod pass rates before you go live
  • Keep a clean audit trail per environment, per bot
  • Deep-link to any trace with ?traceId= — shareable with engineers or LLMs

Answers

What teams usually ask

Is every run recorded, or just a sample?

Every run is captured end-to-end — messages, tool calls, guardrail evaluations, verification outcomes, tokens, and latency. The dashboard’s analytics view samples the most recent runs for speed, but the underlying trace store keeps the full record for replay and audit.

Can I separate test traffic from production?

Yes. Dashboard bot-chat runs are tagged with environment: "test" and source: "dashboard"; real API calls are environment: "prod". Both the Traces page and the Analytics page let you flip between All, Test, and Prod so test noise never pollutes your production metrics.

What’s the difference between a trace and an analytics row?

A trace is one complete decision — every input, every tool call, every guardrail, and the final response. Analytics aggregates many traces: distributions, reason codes, stage counts, latency buckets, model usage. You move between them with one click; every analytics bar drills into the traces behind it.

Can I hand a trace to an LLM for root-cause analysis?

Yes. The Copy Trace button produces a plain-text BOT TRACE REPORT with performance, flags, guardrails, resources, evolution rules, and any Mission Control activity. Paste it into any model and ask “why did this fail?” — no schema knowledge required.

How does this tie into Mission Control and Agent Evolution?

Every trace carries the Mission Control whispers and takeovers that shaped it, plus any Agent Evolution rules that were applied. When a reviewer promotes a rule or an operator intervenes live, those signals become searchable fields on the next trace and inputs to the next analytics window.

Can I pipe traces into my own tooling?

Runs are fetched from the company runs API (GET /v1/analytics/companies/{id}/runs) and each trace has a detail endpoint (GET /v1/traces/{id}/replay). Wire those into your SIEM, warehouse, or on-call tools directly.
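A minimal client sketch for those two endpoints. The paths come from the answer above; the base URL, auth header, and helper names are assumptions you would replace with your deployment's values.

```python
import urllib.request

BASE = "https://api.example.com"  # placeholder host; substitute your deployment

def runs_url(company_id: str) -> str:
    # Company runs listing: GET /v1/analytics/companies/{id}/runs
    return f"{BASE}/v1/analytics/companies/{company_id}/runs"

def replay_url(trace_id: str) -> str:
    # Full replay payload for one trace: GET /v1/traces/{id}/replay
    return f"{BASE}/v1/traces/{trace_id}/replay"

def fetch(url: str, token: str) -> bytes:
    """Bearer-token auth is an assumption; check your API credentials docs."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

From there, forwarding each response body to a SIEM, warehouse loader, or on-call webhook is ordinary plumbing.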

Ready to inspect?

Ship an agent you can actually see into.

Traces are on by default — every run is captured the moment you send it. Analytics starts aggregating from the first call, and the Inspector opens to a full replay with one click.