Core concepts

Four ideas explain everything you see in the dashboard: sessions, steps, diagnosis confidence, and the sensor/actuator split.

Sessions

A session is one run of your agent, from invoke() to its final answer (or crash). The SDK emits lifecycle events (session_start, session_complete, session_error), so every session's status reflects what actually happened to the run:

| Status | Meaning |
| --- | --- |
| Running | The agent is still executing. The session page polls live and new steps appear as they happen. |
| Success | The run completed and no step failed. |
| Failed | A step failed, or the run itself crashed. |
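
As an illustration of how the three statuses could follow from the lifecycle events, here is a minimal sketch. The SessionState container and session_status function are hypothetical names, not part of the SDK, and the sketch assumes a session stays Running until a terminal event (session_complete or session_error) arrives.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    RUNNING = "Running"
    SUCCESS = "Success"
    FAILED = "Failed"


@dataclass
class SessionState:
    # Hypothetical container; the SDK's internal model may differ.
    events: list[str] = field(default_factory=list)        # lifecycle event names, in order
    step_failed: list[bool] = field(default_factory=list)  # one flag per recorded step


def session_status(state: SessionState) -> Status:
    """Derive a status from lifecycle events and per-step failure flags."""
    if "session_error" in state.events:
        # The run itself crashed.
        return Status.FAILED
    if "session_complete" in state.events:
        # The run finished; it only counts as a success if no step failed.
        return Status.FAILED if any(state.step_failed) else Status.SUCCESS
    # Assumption: no terminal event yet means the agent is still executing.
    return Status.RUNNING
```

Whether a failed step should flip a still-running session to Failed right away is a design choice this sketch does not settle; it waits for the terminal event.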

Steps

A step is one tool call inside a session. Each step records the tool name, input, output, latency, the LLM reasoning that led to the call, the token cost of that reasoning, and OTel-compatible trace_id / span_id / parent_span_id so nested agents form a proper trace tree.
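
To make the trace tree concrete, the sketch below models a step as a dataclass and renders nested steps by following parent_span_id. The field names mirror the description above but are illustrative rather than the SDK's actual schema, and render_tree is a hypothetical helper. A step whose parent is not among the recorded steps is treated as a root.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    # Illustrative field names; not the SDK's real schema.
    tool_name: str
    tool_input: dict
    tool_output: Optional[dict]
    latency_ms: float
    reasoning: str            # LLM reasoning that led to the call
    reasoning_tokens: int     # token cost of that reasoning
    trace_id: str
    span_id: str
    parent_span_id: Optional[str] = None


def render_tree(steps: list[Step]) -> list[str]:
    """Return one indented line per step, children nested under their parent span."""
    span_ids = {s.span_id for s in steps}
    children = defaultdict(list)
    for s in steps:
        children[s.parent_span_id].append(s)

    lines: list[str] = []

    def walk(step: Step, depth: int) -> None:
        lines.append("  " * depth + step.tool_name)
        for child in children[step.span_id]:
            walk(child, depth + 1)

    # Roots are steps whose parent is missing or lies outside this step list.
    for s in steps:
        if s.parent_span_id not in span_ids:
            walk(s, 0)
    return lines
```

For example, a step with span_id "a" and two children whose parent_span_id is "a" renders as one top-level line followed by two indented lines.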

Diagnosis & confidence tiers

When a step fails, Vorlo produces a structured diagnosis: a title, the root cause in plain English, and a specific fix. Every diagnosis carries a confidence tier so you always know how much to trust it:

| Tier | What it means |
| --- | --- |
| Verified | The fix was confirmed: a developer clicked "this fixed it", or Vorlo observed the failing tool recover after the fix was served (auto-verification). |
| Likely | Matched by the curated error registry: a known failure shape (auth expiry, rate limits, missing fields, timeouts, …) with a deterministic diagnosis. |
| Guess | A new pattern diagnosed by the LLM with full step context. Useful, but unconfirmed; your feedback promotes or corrects it. |

This is the learning loop: each distinct failure is fingerprinted, and a confirmed fix is served instantly to everyone who hits the same fingerprint next. Vorlo gets more accurate the more it's used.
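
The loop can be sketched as a fingerprint-keyed store. How Vorlo actually computes fingerprints is not described here, so the fingerprint function below (tool name plus error text with digits masked, then hashed) is purely an assumption for illustration, as are the Diagnosis and DiagnosisStore names.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Diagnosis:
    title: str
    root_cause: str
    fix: str
    tier: str  # "Verified", "Likely", or "Guess"


def fingerprint(tool_name: str, error_message: str) -> str:
    """Hypothetical scheme: mask digits so the same failure shape hashes the same."""
    normalized = re.sub(r"\d+", "<n>", error_message.lower())
    return hashlib.sha256(f"{tool_name}|{normalized}".encode()).hexdigest()[:16]


class DiagnosisStore:
    """In-memory stand-in for a shared store of diagnoses keyed by fingerprint."""

    def __init__(self) -> None:
        self._by_fingerprint: dict[str, Diagnosis] = {}

    def lookup(self, fp: str) -> Optional[Diagnosis]:
        return self._by_fingerprint.get(fp)

    def record(self, fp: str, diagnosis: Diagnosis) -> None:
        self._by_fingerprint[fp] = diagnosis

    def confirm(self, fp: str) -> None:
        # Promote after a "this fixed it" click or an observed recovery.
        self._by_fingerprint[fp].tier = "Verified"
```

Under this sketch, once one user confirms a fix, any later failure with the same fingerprint gets the Verified diagnosis from lookup without a new LLM call.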

Sensors vs actuators

Every tool is classified by what it does to the world:

  • Sensor — reads state: get_*, search_*, fetch_*, list_*, query_*, …
  • Actuator — changes state: send_*, create_*, charge_*, delete_*, … Unknown tools default to actuator, the safe assumption.
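
A minimal sketch of this classification, assuming it is driven by name prefix alone. Only the sensor prefixes named above are included (the source list is open-ended), and classify_tool is a hypothetical helper, not an SDK function.

```python
# Sensor prefixes named in the list above; the real list is longer.
SENSOR_PREFIXES = ("get_", "search_", "fetch_", "list_", "query_")


def classify_tool(name: str) -> str:
    """Return "sensor" for known read-only prefixes, otherwise "actuator"."""
    if name.startswith(SENSOR_PREFIXES):
        return "sensor"
    # Known actuator prefixes (send_, create_, charge_, delete_) and unknown
    # names both land here: defaulting to actuator is the safe assumption.
    return "actuator"
```

Because everything that is not a recognized sensor falls through to actuator, a tool with an unfamiliar name such as sync_ledger is treated as state-changing.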

The split matters for debugging (a failed actuator may have half-applied a change) and powers guardrail policies — actuator calls matching a known failure fingerprint can be blocked or require approval before they execute.
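
A guardrail check along these lines might look like the sketch below. The Decision enum, the guard function, and the idea of passing in a precomputed fingerprint for the pending call are all assumptions for illustration; the actual policy format is not described here.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    REQUIRE_APPROVAL = "require_approval"


def guard(tool_kind: str, call_fingerprint: str, policy: dict[str, Decision]) -> Decision:
    """Decide whether a pending tool call may execute.

    policy maps known failure fingerprints to the action to take (hypothetical shape).
    """
    if tool_kind != "actuator":
        # Sensors only read state, so this sketch never gates them.
        return Decision.ALLOW
    # Actuator calls are gated only when they match a known failure fingerprint.
    return policy.get(call_fingerprint, Decision.ALLOW)
```

The check runs before the call executes, which is what lets a known-bad actuator call be stopped before it half-applies a change.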