
Vector Databases Are the Wrong Abstraction

Vercel open-sourced a knowledge agent using grep instead of embeddings. File systems beat vector databases for most AI agent workflows. Here's why.

architecture · rag · ai-agents

Vercel's engineering team just published something that will annoy a lot of RAG vendors: a production knowledge agent that uses grep, find, and cat inside sandboxed containers instead of vector databases and embedding pipelines.

Their cost dropped from $1.00 to $0.25 per query. Output quality improved.

This isn't a contrarian take for attention. It's an architectural insight that most teams are discovering too late: vector databases solve a problem that most agent workflows don't actually have.

The RAG industrial complex

The standard RAG pipeline has become cargo cult engineering. You chunk your documents, embed them, store them in a vector database, retrieve the top-k similar chunks, and stuff them into context. Every AI infrastructure vendor sells this as the default architecture.
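That retrieval step can be sketched in a few lines. The toy vectors below stand in for a real embedding model; only the chunk-embed-score-top-k shape matters here.

```python
import math

# Toy sketch of the standard RAG retrieval step: every chunk has an
# embedding, the query is embedded the same way, and the top-k most
# cosine-similar chunks get stuffed into context. Vectors are hand-rolled
# stand-ins, not real model output.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

chunks = {
    "pricing-enterprise": [0.9, 0.1, 0.0],
    "pricing-starter":    [0.8, 0.3, 0.1],
    "onboarding-guide":   [0.1, 0.9, 0.2],
}

def top_k(query_vec, k=2):
    # Rank all chunks by similarity to the query, keep the best k.
    scored = sorted(chunks.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [name for name, _ in scored[:k]]

print(top_k([1.0, 0.0, 0.0]))
```

Every moving part in that loop (the chunk boundaries, the embedding model, the value of k) is a tuning knob the next section's debugging problem hides behind.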

But when your agent gives a wrong answer, debugging is nearly impossible. Which chunk did it retrieve? Why did that chunk score 0.82 when the correct one scored 0.79? Is the problem in the chunking boundary, the embedding model, or the similarity threshold? You're debugging a pipeline, not a question.

Why file systems work better

LLMs have been trained on massive codebases. They're exceptionally good at navigating directories, grepping through files, and managing state across complex folder structures. This isn't a hack. It's using the model's strongest capability.

When you give an agent access to a file system and bash tools, three things change.

Results are deterministic and explainable. The agent ran grep -r "pricing" docs/, read docs/plans/enterprise.md, and pulled section three. When the answer is wrong, you know exactly what happened. Fix the file or adjust the search strategy. The debugging loop takes minutes, not hours.
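A minimal sketch of what deterministic retrieval looks like: a grep-style recursive search that records exactly which file and line matched. The docs layout here is invented for illustration, not taken from Vercel's agent.

```python
import pathlib, tempfile

# Grep-style recursive search over markdown files. Every hit names its
# file and line number, so the audit trail is built into the result.
def grep_r(pattern, root):
    hits = []
    for path in sorted(pathlib.Path(root).rglob("*.md")):
        for lineno, line in enumerate(path.read_text().splitlines(), 1):
            if pattern in line:
                hits.append((str(path.relative_to(root)), lineno, line.strip()))
    return hits

# Build a tiny throwaway docs tree to search.
root = tempfile.mkdtemp()
plans = pathlib.Path(root, "plans")
plans.mkdir()
(plans / "enterprise.md").write_text("# Enterprise\n\nPricing starts at $500/mo.\n")
(plans / "starter.md").write_text("# Starter\n\nFree tier available.\n")

hits = grep_r("Pricing", root)
print(hits)
```

When the answer is wrong, the trace is the list of hits: there is no similarity score to second-guess.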

No embedding pipeline to maintain. No chunking strategy to tune. No vector database to scale. No embedding model to evaluate. Add a document, sync, search. Operational surface area drops to near zero.

Context stays structured. Vector retrieval destroys document structure: it returns floating chunks with no awareness of what came before or after them. File-based retrieval preserves headers, sections, and cross-references, the structural cues that help the model reason correctly.

When vectors still win

This isn't a universal truth. Vector databases remain the right tool for genuine semantic similarity search across millions of documents where you don't know what keywords to look for. Product recommendation engines, duplicate detection, research discovery across papers: these are real vector database use cases.

But most enterprise agent workflows aren't doing semantic discovery. They're looking for specific facts inside structured documentation. "What's our enterprise pricing?" is a keyword search, not a semantic one. The vector database adds cost, latency, and opacity without adding value.

The pattern we use

Our knowledge agent architecture follows the file-system-first pattern.

Structured ingestion. Documents are converted to markdown with preserved headers and stored in a directory hierarchy that mirrors the source structure. Customer docs in /customers/{name}/, product specs in /specs/, compliance docs in /compliance/.
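One way to sketch that routing step, with the caveat that the category names and the route_path helper are illustrative rather than our actual ingestion code:

```python
import pathlib

# Hypothetical routing table: each document type maps to a directory
# that mirrors the source structure described above.
ROUTES = {
    "customer": "customers/{name}",
    "spec": "specs",
    "compliance": "compliance",
}

def route_path(doc_type, slug, name=None):
    # Resolve the target directory for a document and append its
    # markdown filename.
    base = ROUTES[doc_type].format(name=name or "")
    return str(pathlib.PurePosixPath(base) / f"{slug}.md")

print(route_path("customer", "renewal-terms", name="acme"))
print(route_path("spec", "auth-flow"))
```

The point of the mirrored hierarchy is that a human and an agent navigate the same tree the same way.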

Tool-based retrieval. The agent has search, read, and list tools that map to file system operations. Search uses a full-text index for speed, with fallback to grep for complex patterns.
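An illustrative tool layer, with assumed names and behavior (the plain substring scan below stands in for the full-text index with grep fallback):

```python
import pathlib, tempfile

# Three agent tools that map directly onto file system operations.
class FsTools:
    def __init__(self, root):
        self.root = pathlib.Path(root)

    def list(self, subdir="."):
        # Directory listing, sorted for deterministic output.
        return sorted(p.name for p in (self.root / subdir).iterdir())

    def read(self, relpath):
        return (self.root / relpath).read_text()

    def search(self, term):
        # A production system would hit a full-text index first; this
        # sketch goes straight to a grep-style scan.
        return [str(p.relative_to(self.root))
                for p in sorted(self.root.rglob("*.md"))
                if term in p.read_text()]

# Demo against a throwaway docs tree.
root = tempfile.mkdtemp()
specs = pathlib.Path(root, "specs")
specs.mkdir()
(specs / "auth-flow.md").write_text("# Auth Flow\n\nTokens expire after 24h.\n")

tools = FsTools(root)
print(tools.search("expire"))
print(tools.read("specs/auth-flow.md").splitlines()[0])
```

Because each tool is a thin wrapper over the file system, a transcript of tool calls is a complete, replayable record of what the agent saw.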

Schema validation on output. Every retrieval result includes a source path, section reference, and confidence indicator. If the agent can't find a definitive answer, it says so. It doesn't hallucinate from a similar-sounding chunk.
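The output contract can be sketched as a small schema where "not found" is a first-class result rather than a guess. Field names here are assumptions for illustration:

```python
from dataclasses import dataclass
from typing import Optional

# Every retrieval result carries its provenance; abstaining is an
# explicit, typed outcome instead of a hallucinated answer.
@dataclass
class RetrievalResult:
    answer: Optional[str]
    source_path: Optional[str]
    section: Optional[str]
    confidence: str  # "high" | "low" | "not_found"

def answer_or_abstain(hits):
    # hits: list of (path, section, text) tuples from retrieval.
    if not hits:
        return RetrievalResult(None, None, None, "not_found")
    path, section, text = hits[0]
    return RetrievalResult(text, path, section, "high")

print(answer_or_abstain([]))
print(answer_or_abstain([("plans/enterprise.md", "Pricing", "Starts at $500/mo.")]))
```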

Cost tracking per query. We measure token usage, tool calls, and wall time for every retrieval. Most queries resolve in 2-3 tool calls. Compared to the embed-retrieve-rerank pipeline, that's a fraction of the compute cost.
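A hypothetical per-query meter is enough to keep the 2-3-calls claim honest in production; the class and its fields are illustrative, not our instrumentation code:

```python
import time

# Per-query cost tracker: counts tool calls and rough token usage,
# and reports wall time for the whole retrieval.
class QueryMeter:
    def __init__(self):
        self.tool_calls = 0
        self.tokens = 0
        self.start = time.monotonic()

    def record(self, tool_name, token_cost):
        self.tool_calls += 1
        self.tokens += token_cost

    def summary(self):
        return {
            "tool_calls": self.tool_calls,
            "tokens": self.tokens,
            "wall_time_s": round(time.monotonic() - self.start, 3),
        }

m = QueryMeter()
m.record("search", 120)
m.record("read", 800)
s = m.summary()
print(s["tool_calls"], s["tokens"])
```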

Stop running before you can walk

The AI infrastructure ecosystem has a maturity problem. Companies are adopting vector databases, embedding pipelines, and reranking models before they've confirmed that their use case needs any of it.

Start with the simplest thing that works: structured files, keyword search, and LLM reasoning. Add complexity only when you can prove the simple approach fails for your specific workload.

Most teams will never need to.

We build agent systems that are simple enough to debug and cheap enough to scale. If your RAG pipeline is more complex than the problem it solves, let's talk about simplifying it.