Why did my LangChain agent fail?

When a LangChain agent throws, the error you see is rarely the error that matters. ToolException: 403 tells you a call was rejected — not that the API key in your environment expired three hours ago. Here are the five failure modes behind most agent errors, and how to get from the symptom to the real root cause.

1. Authentication and permission errors

The most common agent failure has nothing to do with your prompt. A tool wraps an API — Stripe, a database, an internal service — and that API rejects the call: an expired token, a key missing a scope, a revoked credential. The agent surfaces it as a generic tool error, so it looks like a reasoning problem when it is really a configuration problem.

How to find it: look at the raw error on the failing tool call, not the agent's summary. A 401 or 403 with a body mentioning "invalid key", "insufficient scope", or "expired" points straight at credentials. Check the environment the agent actually runs in, which is often not the one you tested locally.
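
As a rough illustration of that check, the sketch below labels a failed tool call from its raw status code and response body. The function name and the hint phrases are assumptions made up for this example, not part of LangChain or any provider's API, and real error bodies vary by service.

```python
# Hypothetical hint phrases; extend with whatever your providers actually return.
AUTH_HINTS = ("invalid key", "insufficient scope", "expired", "revoked")


def classify_tool_error(status_code, body):
    """Return a coarse label for a failed tool call from its raw HTTP response."""
    text = (body or "").lower()
    if status_code in (401, 403):
        # Collect every credential-related phrase found in the response body.
        matched = [hint for hint in AUTH_HINTS if hint in text]
        if matched:
            return "credentials: " + ", ".join(matched)
        return "credentials: unspecified"
    return "other"


print(classify_tool_error(403, '{"error": "API key expired"}'))
```

A label like this is only a first filter: it tells you to go look at credentials in the agent's runtime environment before you touch the prompt.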

2. Malformed tool inputs

The model decides to call a tool but builds the arguments wrong: a missing required field, a string where a number was expected, an ID in the wrong format. The tool throws a validation error. This is genuinely a reasoning failure — but the fix is usually a clearer tool schema or a better description, not a bigger model.

How to find it: compare the arguments the model produced against what the tool requires. If the model keeps omitting the same field, the field's description is ambiguous, so tighten it.
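
One way to make that comparison mechanical is sketched below. It assumes a hand-written schema of field names, expected types, and required flags; `REFUND_SCHEMA` and `check_tool_args` are hypothetical names for this example, and in a real agent you would more likely read the same information from the tool's own argument schema.

```python
# Hypothetical schema: field name -> (expected type, required?)
REFUND_SCHEMA = {
    "charge_id": (str, True),
    "amount_cents": (int, True),
    "reason": (str, False),
}


def check_tool_args(schema, args):
    """List every way the model-produced args differ from the schema."""
    problems = []
    for name, (expected_type, required) in schema.items():
        if name not in args:
            if required:
                problems.append(f"missing required field: {name}")
            continue
        value = args[name]
        # bool is a subclass of int in Python, so reject it for non-bool fields.
        wrong_type = not isinstance(value, expected_type) or (
            isinstance(value, bool) and expected_type is not bool
        )
        if wrong_type:
            problems.append(
                f"{name}: expected {expected_type.__name__}, got {type(value).__name__}"
            )
    for name in args:
        if name not in schema:
            problems.append(f"unexpected field: {name}")
    return problems


print(check_tool_args(REFUND_SCHEMA, {"charge_id": "ch_123", "amount_cents": "500"}))
```

Running this over several failed calls shows whether the same field keeps coming back wrong, which is the signal that its description needs work.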

3. Rate limits and timeouts

Agents fan out: a single run can make dozens of tool and model calls. Under load you hit provider rate limits (429) or a slow dependency times out. These failures are intermittent, which makes them maddening — the same run succeeds on retry, so the bug "disappears" before you can inspect it.

How to find it: you need the failing run captured the first time. Look for 429, 503, or timeout signatures and the latency of the step that failed. Add backoff and concurrency limits rather than chasing a flaky reproduction.
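
A minimal sketch of backoff plus a concurrency limit, using only the standard library. `TransientError` is a hypothetical stand-in for whatever your client raises on a 429, 503, or timeout, and the limit of 4 concurrent calls and the delay values are arbitrary numbers chosen for illustration.

```python
import random
import threading
import time


class TransientError(Exception):
    """Hypothetical stand-in for a 429, 503, or timeout from a tool or model call."""


# Cap on simultaneous outbound calls; 4 is an arbitrary illustrative value.
_slots = threading.BoundedSemaphore(4)


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying transient failures with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            # Hold a slot only while the call is in flight, not while sleeping.
            with _slots:
                return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the original error.
            # Full jitter: wait a random time up to base_delay * 2^attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
```

The `sleep` parameter exists so the delay can be swapped out in tests. Retrying hides the symptom, so it is still worth logging each failed attempt: that keeps the first failure available for inspection.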

4. Context and state problems

A step fails because of something an earlier step did, or didn't do. A search returned nothing, so the next tool was called with an empty ID. Memory grew past the context window and a key instruction was truncated. The failing step is just where the problem became visible.

How to find it: read the run in order. Trace the bad input back to the step that produced it. The most useful question is rarely "why did step 7 fail" but "what did step 4 return."
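
If each step of a run is recorded with its output, the backward walk can be partly automated. The sketch below assumes a hypothetical record format (a list of dicts with `name` and `output` keys) and flags the nearest earlier step that returned nothing. An empty output is only one kind of upstream cause, so treat the result as a starting point rather than a verdict.

```python
def is_empty(value):
    """True for the 'returned nothing' cases: None, empty string, list, or dict."""
    return value is None or value == "" or value == [] or value == {}


def find_suspect_step(steps, failed_index):
    """Walk backwards from the failing step to the nearest earlier empty output."""
    for step in reversed(steps[:failed_index]):
        if is_empty(step["output"]):
            return step
    return None


# Hypothetical run record, in execution order.
steps = [
    {"name": "plan", "output": "look up customer"},
    {"name": "search_customer", "output": []},
    {"name": "format_query", "output": "id="},
    {"name": "fetch_invoice", "output": None, "error": "ValueError: empty id"},
]

suspect = find_suspect_step(steps, failed_index=3)
print(suspect["name"] if suspect else "no empty upstream output")
```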

5. Silent wrong answers

The worst failures throw no error at all. The agent finishes, returns a confident answer, and the answer is wrong — it called the wrong tool, or stopped early, or hallucinated a value. Nothing crashes, so nothing alerts you.

How to find it: compare the bad run against a known-good run of the same agent, step by step. Where they diverge is where the logic broke.
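
A small sketch of that step-by-step comparison, assuming each run has been reduced to an ordered list of `(tool_name, tool_input)` tuples. That representation and the function name are assumptions made for this example. Exact equality on inputs is deliberately strict and may flag harmless wording differences, so you may want to compare tool names only as a first pass.

```python
from itertools import zip_longest


def first_divergence(good_run, bad_run):
    """Return (index, good_step, bad_step) at the first differing step, else None.

    A run that stopped early shows up as None on the shorter side.
    """
    for index, (good, bad) in enumerate(zip_longest(good_run, bad_run)):
        if good != bad:
            return index, good, bad
    return None


# Hypothetical runs of the same agent.
good = [("search", "acme"), ("get_invoice", "inv_1"), ("summarize", "inv_1")]
bad = [("search", "acme"), ("get_invoice", "inv_1")]

print(first_divergence(good, bad))
```

Here the bad run matches the good one for two steps and then simply ends, which is the "stopped early" case: the divergence is reported at index 2 with nothing on the bad side.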

From symptom to root cause, fast

The thread running through all five: the raw error is a symptom, and the root cause is usually one or more steps upstream, in context you no longer have once the process exits. That is exactly the problem Vorlo solves — it captures every run, translates the raw error into a plain-English root cause and a specific fix, and aligns the failing run against the last successful one so divergence is obvious. You add two lines of code and ask "why did my last run fail?" instead of re-reading stack traces.
