
Why We Stopped Using LangChain

After six months in production, we replaced LangChain with 200 lines of TypeScript. Here's what we learned about framework dependency in AI systems.

Tags: architecture, ai-agents, infrastructure

We used LangChain for six months on a production agent system. It started well. The abstractions were convenient. The ecosystem was rich. Then the problems started.

This isn't a hit piece. LangChain solved real problems and accelerated our early development. But the decision to remove it taught us more about agent architecture than the decision to adopt it.

The versioning problem

LangChain moved fast. Too fast for production. We pinned to version 0.1.x. Three months later, the ecosystem had moved to 0.2.x with breaking changes. Community examples, Stack Overflow answers, and documentation all referenced the new version. Our pinned version was already legacy.

This isn't unique to LangChain. It's a property of any rapidly evolving framework. But in infrastructure that runs 24/7, "rapidly evolving" is a synonym for "unstable."

The debugging wall

When an agent call failed, our stack traces were 47 frames deep, most of them inside LangChain's internal callback system, serialization layer, and memory management. Finding the actual failure, whether a malformed tool response, an exceeded token limit, or a rate limit hit, required reading framework source code.

In production, debugging speed is a reliability metric. If your mean-time-to-diagnosis doubles because of framework abstraction, you've traded development speed for operational cost.

The performance ceiling

LangChain's agent executor is general-purpose. It handles any tool, any model, any memory backend. That generality has a cost: serialization overhead, callback invocation, and memory copies that a purpose-built system doesn't need.

Our agent made an average of 8 tool calls per request. Each call went through LangChain's full execution pipeline. When we replaced it with direct function calling and a simple loop, p99 latency dropped from 12 seconds to 4.
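As a sketch, "direct function calling and a simple loop" can be as small as the following. The names here (`callModel`, `toolHandlers`, `runAgent`) are illustrative, not from our codebase, and the model is stubbed out:

```typescript
// Illustrative types: a tool call requested by the model, and the model's reply.
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelReply = { toolCall?: ToolCall; answer?: string };

// Hypothetical tool handlers, routed by name.
const toolHandlers: Record<string, (args: Record<string, unknown>) => string> = {
  getTime: () => "12:00",
};

// The agent loop: call the model, execute any requested tool, feed the
// result back as history, and stop when the model returns a final answer.
function runAgent(
  callModel: (history: string[]) => ModelReply,
  maxSteps = 8,
): string {
  const history: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const reply = callModel(history);
    if (reply.answer !== undefined) return reply.answer;
    if (reply.toolCall) {
      const handler = toolHandlers[reply.toolCall.name];
      if (!handler) throw new Error(`unknown tool: ${reply.toolCall.name}`);
      history.push(handler(reply.toolCall.args));
    }
  }
  throw new Error("max steps exceeded");
}

// Stub model for demonstration: asks for the time once, then answers.
const stubModel = (history: string[]): ModelReply =>
  history.length === 0
    ? { toolCall: { name: "getTime", args: {} } }
    : { answer: `The time is ${history[0]}` };
```

Every step the model takes is visible in one stack frame, which is most of the latency and debugging win.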

What replaced it

Two hundred lines of TypeScript:

  • A while loop that calls the model with function definitions
  • A switch statement that routes tool calls to handlers
  • A retry wrapper with exponential backoff and circuit breaking
  • A structured logger that traces every step

No framework. No abstractions beyond what the problem requires. Every line is debuggable. Every failure mode is explicit.
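A sketch of the retry piece from the list above, with exponential backoff and a minimal circuit breaker. The thresholds and delays are illustrative, not our production values:

```typescript
// Simple failure-count circuit breaker: opens after `threshold`
// consecutive failures, resets on any success.
class CircuitBreaker {
  private failures = 0;
  constructor(private readonly threshold = 3) {}
  get open(): boolean { return this.failures >= this.threshold; }
  recordSuccess(): void { this.failures = 0; }
  recordFailure(): void { this.failures++; }
}

// Retry wrapper: refuses to call while the breaker is open, otherwise
// retries with exponential backoff (baseDelayMs, 2x baseDelayMs, ...).
async function withRetry<T>(
  fn: () => Promise<T>,
  breaker: CircuitBreaker,
  attempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  for (let attempt = 0; attempt < attempts; attempt++) {
    if (breaker.open) throw new Error("circuit open: refusing to call");
    try {
      const result = await fn();
      breaker.recordSuccess();
      return result;
    } catch (err) {
      breaker.recordFailure();
      if (attempt === attempts - 1) throw err;
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
  throw new Error("unreachable");
}
```

Because the breaker and backoff are explicit objects rather than framework configuration, every failure mode shows up directly in logs and stack traces.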

The lesson

The lesson isn't "don't use frameworks." It's "understand your dependency deeply enough to know when it stops helping."

LangChain was the right choice for month one. It was the wrong choice for month six. The mistake wasn't adopting it. It was not having an exit plan.

For any framework dependency in production AI systems, ask yourself: if I needed to remove this in a weekend, could I? If the answer is no, you don't have a dependency. You have a liability.

If you're evaluating your agent framework choices, we help teams build and unbuild the right way.