The enthusiasm around LLM agents over the past year has hit an uncomfortable wall: the classic loop pattern —invoke the model, check if it requests a tool, execute it, repeat— works fine in demos but crumbles the moment the task has three steps and a condition. Agents lose context, get stuck in infinite loops on the same tool, can’t be resumed after failure and, when something goes wrong, the operator stares at a flat log wondering why the model decided what it decided. LangGraph, released by the LangChain team in early 2024, tackles that problem by treating the agent as an explicit state graph rather than a conversational black box.
Why “Simple” Agents Break
The hand-rolled ReAct-style agent is always the same template: a loop that calls the LLM with the message history, checks if the response contains a tool call, executes it, appends the result to the history and starts again. It terminates when the model replies without requesting tools. That shape is elegant on paper but has four problems that scale badly. First, there is no natural bound: if the model fixates on the same tool, you spin indefinitely until tokens run out. Second, there is no persistence: a network blip, a timeout or a container restart sends you back to square one. Third, there is no observability: the “step” you’re on is implicit, living only in the length of the message history. Fourth, it’s impossible to unit-test, because all the logic sits inside the same loop body.
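The loop described above, in skeletal form. `call_llm` and `run_tool` are stand-ins for a real model client and tool registry; the point is that all four problems live in this one body:

```python
# Skeletal ReAct-style loop. `call_llm` and `run_tool` are placeholders for a
# real model client and tool registry, not any actual library API.
def run_agent(call_llm, run_tool, user_message):
    history = [{"role": "user", "content": user_message}]
    while True:  # problem 1: no natural bound on iterations
        reply = call_llm(history)      # problem 2: a crash here loses everything
        history.append(reply)
        if "tool_call" not in reply:   # terminates only when no tool is requested
            return reply["content"]
        result = run_tool(reply["tool_call"])
        # problem 3: the current "step" exists only as len(history)
        history.append({"role": "tool", "content": result})
```

Problem 4 is visible by omission: there is no seam at which to test the routing decision separately from the tool execution or the termination check.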
When the flow is “ask the user, classify intent, query a database if needed, respond”, these problems rarely bite. When the flow is “support agent that escalates to a human, multi-step research across several sources, ETL pipeline with intermediate validations”, you feel them within the first week.
The LangGraph Mental Model
LangGraph changes the unit of composition. Instead of a loop chaining calls, you define four things: a state, a set of nodes, a set of edges and the graph itself. The state is a typed dictionary that represents everything the agent knows at a given moment: the conversation, intermediate variables, control flags. Nodes are pure functions that take the state and return a partial update. Edges connect nodes and can be unconditional (after node A, always go to B) or conditional (after node A, if the routing function returns “search” go to B, if it returns “respond” go to C). The graph is the compilation of all that into an executable object.
The practical consequence is that every step of the agent is now an independent function, with a clear input-output contract, testable in isolation and observable from outside. A typical classify-and-respond flow gets modelled by declaring three nodes (classifier, searcher, writer), a conditional edge from the classifier deciding whether to search, a direct edge from searcher to writer and an edge from writer to the terminal node. The compiled graph exposes an invoke method taking the initial state and a stream method emitting events on each transition.
from typing import TypedDict
from langgraph.graph import StateGraph, END

class AgentState(TypedDict):
    # Minimal state schema for this example; real agents carry more fields.
    messages: list
    intent: str

graph = StateGraph(AgentState)
graph.add_node("classify", classify)   # node functions: state -> partial update
graph.add_node("search", search)
graph.add_node("respond", respond)
graph.set_entry_point("classify")
# route_from_classify returns the name of the next node: "search" or "respond"
graph.add_conditional_edges("classify", route_from_classify)
graph.add_edge("search", "respond")
graph.add_edge("respond", END)
app = graph.compile()
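The node and routing functions wired into that graph could, in skeletal form, look like this. The bodies are illustrative: a real classify would ask the LLM for the intent, and a keyword heuristic stands in here.

```python
# Hypothetical node bodies for a classify-search-respond graph. Each node takes
# the full state and returns a partial update that the runtime merges back in.
def classify(state):
    # A real implementation would call the LLM; a keyword check stands in.
    last = state["messages"][-1]
    return {"intent": "search" if "?" in last else "respond"}

def route_from_classify(state):
    # Conditional-edge function: the returned string selects the next node.
    return "search" if state["intent"] == "search" else "respond"

def search(state):
    return {"messages": state["messages"] + ["<search results>"]}

def respond(state):
    return {"messages": state["messages"] + ["<final answer>"]}
```

Because each function has a plain state-in, update-out contract, the routing decision can be unit-tested without running the model or the graph at all.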
The Three Features That Do the Work
Three capabilities make LangGraph worth its learning cost over a bespoke loop or LangChain’s AgentExecutor.
The first is checkpointing. LangGraph can persist the graph state at every transition into SQLite, Postgres or Redis. Compile the graph with a checkpointer, execute with a thread_id, and each step is saved. A later invocation with the same thread_id and null input resumes from the last checkpoint. This enables long-running agents measured in hours or days, chats with real cross-session memory, recovery from process failures and, most importantly, the human-in-the-loop pattern: the graph pauses on an approval node, an operator reviews via an API, updates the state with the decision, and the graph resumes. Without this, agents that take real-world actions —sending emails, making payments, modifying CRM records— are irresponsible in production.
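The mechanics of checkpoint-and-resume can be illustrated without the library. This is a deliberately naive sketch, not LangGraph's actual checkpointer API: a dict stands in for the SQLite/Postgres/Redis store, and a flat list of node functions for a compiled graph.

```python
# Minimal illustration of checkpoint-and-resume. `store` stands in for a
# durable backend; `nodes` is a list of state -> partial-update functions.
store = {}  # thread_id -> (next_step_index, state)

def run_with_checkpoints(nodes, state, thread_id, fail_at=None):
    # Resume from the last checkpoint if this thread has run before.
    start, state = store.get(thread_id, (0, state))
    for i in range(start, len(nodes)):
        if i == fail_at:
            raise RuntimeError("simulated crash")  # progress so far is saved
        state = {**state, **nodes[i](state)}       # apply the node's update
        store[thread_id] = (i + 1, state)          # checkpoint after every step
    return state
```

A crash mid-run leaves the checkpoint in place, so re-invoking with the same thread_id picks up at the failed step instead of step zero. The human-in-the-loop pattern is the same trick with an intentional pause instead of a crash.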
The second is per-step streaming. The stream method emits one event per executed node with the partial state update. UIs can show the user what the agent is doing in real time without waiting for the final result, and observability systems can measure per-node latency instead of only end-to-end latency.
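Per-step streaming amounts to yielding an event after each node instead of returning only the final state. A sketch of the idea (LangGraph's own stream method emits per-node dicts in a similar spirit; the event shape here is invented for illustration):

```python
import time

# Illustrative per-step streaming: yield one event per executed node so a UI
# or tracer can observe progress and per-node latency as the run unfolds.
def stream_steps(nodes, state):
    for name, fn in nodes:
        t0 = time.perf_counter()
        update = fn(state)
        state = {**state, **update}
        yield {"node": name, "update": update,
               "latency_s": time.perf_counter() - t0}
```

A consumer iterates over the generator and renders or records each event; nothing downstream has to wait for the terminal node.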
The third is explicit loops with exit conditions. A node can point back to itself through a conditional edge distinguishing between “continue” and “terminate”. It’s the same ReAct pattern we’ve always had, but with the stopping condition declared in one place rather than buried inside the loop body.
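Declared in graph terms, the self-loop reduces to a routing function that owns the stopping condition, including the hard iteration cap the hand-rolled loop lacked. A sketch with illustrative node and field names:

```python
MAX_STEPS = 5  # the bound the bespoke loop never had

# Routing function for a self-looping "agent" node. The stopping condition
# lives here, declared in one place, not buried inside a loop body.
def route_from_agent(state):
    if state["steps"] >= MAX_STEPS:
        return "finish"        # bail out even if the model keeps asking for tools
    if state.get("tool_call"):
        return "continue"      # conditional edge pointing back to the agent node
    return "finish"
```

Wiring it up is one conditional edge mapping "continue" back to the agent node and "finish" to END, so the exit logic is testable as an ordinary function.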
Quick Comparison with What Was There Before
Against LangChain’s AgentExecutor, LangGraph wins on native persistence, ease of debugging and ability to model non-trivial conditional logic; it loses on learning curve and simplicity for single-turn cases. Against multi-agent frameworks like CrewAI or AutoGen, LangGraph is lower-level: it doesn’t give you a predefined role or conversation model, it gives you primitives to build your own. Integration with LangSmith for observability —seeing each executed node, its inputs and outputs, token usage, latencies, trace replay— is what tips the balance for teams already inside the LangChain ecosystem. LangGraph also composes well with LCEL (LangChain Expression Language), so existing Runnable objects plug in as nodes without changes.
When It Pays Off and When It Doesn’t
LangGraph pays its cost when the flow has non-trivial conditional logic, when the agent’s duration exceeds a minute, when you need human intervention or when production observability is a hard requirement. It doesn’t pay off for a single-turn chatbot, a one-shot tool call or an exploratory prototype where building the graph is more ceremony than product. A useful rule of thumb: if your agent fits in fifteen lines of loop and will never run longer than ten seconds, don’t wrap it in a graph.
My Take
LangGraph is one of those libraries that feels like a detour until you face your second serious production agent, and then it’s hard to go back. The underlying argument isn’t technical but organisational: the hand-rolled ReAct loop forces flow logic, error handling, persistence and observability to live inside the same code body, and that ages badly. A state graph separates those concerns almost by design. The API is still moving —renames and patterns shift between releases— and that’s a real friction, but the conceptual bet is sound and lines up with decades of literature on state machines and BPM. For anyone already inside LangChain, it’s the natural evolution. For anyone starting from scratch, the learning cost is reasonable if the agent is headed for production. For everybody else, a well-written loop is still enough.