Core concepts
Four ideas explain everything you see in the dashboard: sessions, steps, diagnosis confidence, and the sensor/actuator split.
Sessions
A session is one run of your agent, from invoke() to its final answer (or a crash). The SDK emits lifecycle events (session_start, session_complete, session_error), so every session has a definite status:
| Status | Meaning |
|---|---|
| Running | The agent is still executing. The session page polls for updates, so new steps appear as they happen. |
| Success | The run completed and no step failed. |
| Failed | A step failed, or the run itself crashed. |
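As a rough illustration of how the three statuses follow from the lifecycle events, here is a minimal sketch. The Event class, its step_failed field, the "step" event kind, and the session_status function are hypothetical names invented for this example and are not part of the SDK. The sketch also assumes a session stays Running until a terminal event arrives, which the table does not spell out.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Event:
    # Hypothetical event shape; only the three lifecycle names come from the docs.
    kind: str                  # "session_start", "step", "session_complete", "session_error"
    step_failed: bool = False  # only meaningful when kind == "step"


def session_status(events: Iterable[Event]) -> str:
    """Derive Running / Success / Failed from a stream of events."""
    finished = False
    crashed = False
    any_step_failed = False
    for ev in events:
        if ev.kind == "session_error":
            crashed = True
            finished = True
        elif ev.kind == "session_complete":
            finished = True
        elif ev.kind == "step" and ev.step_failed:
            any_step_failed = True
    if not finished:
        # Assumption: no terminal event yet means the agent is still executing.
        return "Running"
    return "Failed" if (crashed or any_step_failed) else "Success"
```

For example, a session whose events are session_start, one failed step, and session_complete would be reported as Failed under these assumptions, because a step failed even though the run itself completed.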
Steps
A step is one tool call inside a session. Each step records the tool name, input, output, latency, the LLM reasoning that led to the call, the token cost of that reasoning, and OTel-compatible trace_id / span_id / parent_span_id so nested agents form a proper trace tree.
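To show how span_id and parent_span_id can turn a flat list of steps into a trace tree, here is a small sketch. The Step dataclass below is a hypothetical, trimmed-down record (it omits input, output, latency, reasoning, and token cost), and it assumes top-level steps carry a parent_span_id of None. In a real trace the parent may be an agent span rather than another step.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Step:
    # Hypothetical subset of the fields a step records.
    tool_name: str
    span_id: str
    parent_span_id: Optional[str]
    trace_id: str = "trace-1"


def build_tree(steps: List[Step]) -> Dict[Optional[str], List[Step]]:
    """Group steps by their parent span so children can be looked up by span_id."""
    children: Dict[Optional[str], List[Step]] = defaultdict(list)
    for step in steps:
        children[step.parent_span_id].append(step)
    return children


def render(children: Dict[Optional[str], List[Step]],
           parent: Optional[str] = None, depth: int = 0) -> List[str]:
    """Return indented tool names, depth-first, starting from the given parent span."""
    lines: List[str] = []
    for step in children.get(parent, []):
        lines.append("  " * depth + step.tool_name)
        lines.extend(render(children, step.span_id, depth + 1))
    return lines
```

Calling render(build_tree(steps)) on a nested agent's steps yields one line per tool call, indented by nesting depth.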
Diagnosis & confidence tiers
When a step fails, Vorlo produces a structured diagnosis: a title, the root cause in plain English, and a specific fix. Every diagnosis carries a confidence tier so you always know how much to trust it:
| Tier | What it means |
|---|---|
| Verified | The fix was confirmed — a developer clicked "this fixed it", or Vorlo observed the failing tool recover after the fix was served (auto-verification). |
| Likely | Matched by the curated error registry — a known failure shape (auth expiry, rate limits, missing fields, timeouts, …) with a deterministic diagnosis. |
| Guess | A new pattern diagnosed by the LLM with full step context. Useful, but unconfirmed — your feedback promotes or corrects it. |
This is the learning loop: each distinct failure is fingerprinted, and a confirmed fix is served instantly to everyone who hits the same fingerprint next. Vorlo gets more accurate the more it's used.
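The loop can be pictured with a short sketch. Everything here is an assumption made for illustration: the docs do not say how Vorlo computes a fingerprint, so this version simply hashes the tool name plus the error message with digits masked, and the REGISTRY entries, the diagnose and confirm_fix functions, and the lookup order (Verified, then Likely, then Guess) are all hypothetical.

```python
import hashlib
import re
from typing import Dict, Tuple

# Hypothetical stand-in for the curated error registry: (pattern, fix) pairs.
REGISTRY = [
    (re.compile(r"rate limit"), "Back off and retry with exponential delay."),
    (re.compile(r"timeout|timed out"), "Raise the timeout or retry the call."),
]


def fingerprint(tool_name: str, error_message: str) -> str:
    """Illustrative fingerprint: mask digit runs so similar errors collide."""
    normalized = re.sub(r"\d+", "<n>", error_message.lower()).strip()
    return hashlib.sha256(f"{tool_name}|{normalized}".encode()).hexdigest()[:16]


def diagnose(tool_name: str, error_message: str,
             verified: Dict[str, str]) -> Tuple[str, str]:
    """Return (tier, fix), preferring confirmed fixes over registry matches."""
    fp = fingerprint(tool_name, error_message)
    if fp in verified:
        return ("Verified", verified[fp])
    for pattern, fix in REGISTRY:
        if pattern.search(error_message.lower()):
            return ("Likely", fix)
    # Placeholder: a real system would ask an LLM with full step context here.
    return ("Guess", "Unconfirmed LLM diagnosis.")


def confirm_fix(tool_name: str, error_message: str, fix: str,
                verified: Dict[str, str]) -> None:
    """Record a confirmed fix so later hits on the same fingerprint are Verified."""
    verified[fingerprint(tool_name, error_message)] = fix
```

Once confirm_fix has stored a fix, a later failure with the same fingerprint comes back as Verified without going through the registry or the LLM step, which is the behaviour the paragraph above describes.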
Sensors vs actuators
Every tool is classified by what it does to the world:
- Sensor (reads state): get_*, search_*, fetch_*, list_*, query_*, …
- Actuator (changes state): send_*, create_*, charge_*, delete_*, …
Unknown tools default to actuator, the safe assumption.
The split matters for debugging (a failed actuator may have half-applied a change) and powers guardrail policies: actuator calls matching a known failure fingerprint can be blocked, or held for approval, before they execute.
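A prefix-based classifier with the actuator default, plus a guardrail check built on it, might look like the sketch below. The function names classify and guardrail_decision, the policy strings, and the fingerprint_known flag are hypothetical. The prefix tuple contains only the sensor prefixes listed above, and the real list is longer.

```python
# Only the sensor prefixes named in the docs; the full list is not given.
SENSOR_PREFIXES = ("get_", "search_", "fetch_", "list_", "query_")


def classify(tool_name: str) -> str:
    """Return "sensor" for known read-only prefixes, otherwise "actuator"."""
    # Anything unrecognized is treated as an actuator, the safe assumption.
    return "sensor" if tool_name.startswith(SENSOR_PREFIXES) else "actuator"


def guardrail_decision(tool_name: str, fingerprint_known: bool,
                       policy: str = "require_approval") -> str:
    """Decide what to do with a call before it executes.

    policy is the hypothetical action configured for risky actuator calls:
    "block" or "require_approval".
    """
    if classify(tool_name) == "actuator" and fingerprint_known:
        return policy
    return "allow"
```

Defaulting to actuator means a misclassified read-only tool costs at most an unnecessary approval, whereas defaulting to sensor could let a state-changing call through unchecked.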