There's a pattern we keep seeing. A team picks LangChain or CrewAI because it's the first result on Google. They build a proof of concept in a weekend. It works. They demo it. Leadership gets excited.
Then production happens.
The abstraction tax
Every framework imposes a tax. LangChain wraps your LLM call in a chain, which wraps it in an agent, which wraps it in a toolkit. Each layer adds latency, memory overhead, and debugging opacity.
For a simple chatbot, none of this matters. But for a multi-step agent that needs to make 15 tool calls in sequence with sub-second latency requirements, every millisecond of framework overhead compounds.
We measured this on a recent project. A direct OpenAI API call took 340ms. The same call through LangChain's agent executor took 890ms. The 550ms delta wasn't the model. It was serialization, callback hooks, and memory management we weren't using.
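If you want to reproduce this kind of measurement yourself, a minimal timing harness is enough. This is a sketch, not our benchmark code; `fake_model_call` is a hypothetical stand-in you would replace with your real client call (direct or framework-wrapped) to compare the two paths.

```python
import time

def time_call(fn, *args, **kwargs):
    """Return (result, elapsed_ms) for a single call to fn."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms

# Hypothetical stand-in for a model call; swap in your real client,
# once calling the API directly and once through the framework.
def fake_model_call(prompt):
    time.sleep(0.01)  # simulate network latency
    return "ok"

result, ms = time_call(fake_model_call, "hello")
```

Run each variant a few dozen times and compare medians rather than single samples, since network jitter can easily swamp a one-off measurement.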
When to use a framework
Frameworks aren't bad. They're bad when they don't match your constraints.
Use a framework when you're prototyping, when your team is learning, or when the framework's opinions match your architecture. Don't use one when latency is critical, when you need fine-grained control over retries and fallbacks, or when you're deploying to edge infrastructure.
The question isn't "which framework?" It's whether you need one at all.
The direct approach
For most production agent systems, we've converged on a thin wrapper around the model's native API. Function calling. Structured outputs. A retry loop with exponential backoff. A simple state machine for multi-step flows.
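The retry loop and the state machine are each a handful of lines. Here is a minimal sketch of both, with made-up names (`call_with_retries`, `run_flow`); the real version would wrap your model client and carry whatever context your flow needs.

```python
import random
import time

def call_with_retries(fn, *, max_attempts=4, base_delay=0.5):
    """Call fn(), retrying on exception with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the real error
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))

# A simple state machine for multi-step flows: each state name maps to a
# handler that mutates the shared context and returns the next state,
# or None when the flow is done.
def run_flow(handlers, state, context):
    while state is not None:
        state = handlers[state](context)
    return context
```

Each handler would typically make one model or tool call (wrapped in `call_with_retries`) and decide the next step from the result. Because the transition table is an explicit dict, the whole control flow is visible in one place, which is exactly what you want when reading a stack trace at 3 AM.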
Less code than a framework configuration file. Faster. And when something breaks at 3 AM, you can actually read the stack trace.
The best infrastructure is the infrastructure you understand completely. If you can't explain every layer between your business logic and the model API, you have a liability, not a tool.
If your framework is the bottleneck and you're not sure what to replace it with, we can help you figure it out.