Skip to main content

Command Palette

Search for a command to run...

Why RAG Fails on Codebases — And How ReAct Agents Fix It

Updated
4 min read
Why RAG Fails on Codebases — And How ReAct Agents Fix It

Rags on Codebases might seem to be a good idea at first, to some extent it is but the underlying trade offs tells a different story. A brief on how RAG works is:

For more details on RAG there is a detailed article on the working on RAGs and its working.

The main aim of the vectorDB and retrieval is to prevent the LLM from hallucinating. So it should not spit out a whole document, in this case its the codebase nor should it miss out on important information.
Lets take a JAVA code example.
We have multiple packages, each package has multiple class and each class has multiple functions.
Each of them are dependent on each other in a intermingled way. Lets take the below example of two classes and four functions.

For storing the above in vectorDB, two of the most common chunking strategy is either function based chunking or class based chunking. Its trivial to understand the chunking strategy based on the names itself.

When you are retrieving the data or a query related to func-1, the vector similarity search would fetch the details of func-1, may be to some extent func-4 but would start missing details on each dependency we jump to. The issue is not with the information but for a complete information you need the complete trail with proper lineage.

In short:
For code, naive chunking + vector search usually fails because:

  • functions depend on other files

  • variable names matter

  • imports define relationships

  • call chains span modules

  • small syntax changes completely change meaning

  • embeddings often lose structural information

There are multiple algorithms that are made to handle bits and pieces of the above issues, example: language parsing algorithms like tree sitter, storing the knowledge graph of the codebase in a graph db etc.
Details of all the above algorithm is out of scope of the current post will create a separate post describing each of those algorithm in details.

The solution that I would like to show you is just adding a ReAct agent on the top of your RAG system. How does a ReAct agent works is shown below:

On the given function dependency, it would give you the information by making a series of vectorDB queries instead of just one.
Loop 1: Find func: 1, It would get the code and details of function 1. Now there it would find name of function 4[in observation stage].
Loop 2: Find func 4, same as above go to func 2.
Loop 3: details of func2. It got all the required details compiles them and the loop breaks.

After the loop breaks the agent compiles the result and sends user the required response.
Without making much changes to the existing architecture, we got the full lineage with all required info for our query on func 1.

RAG systems work well for documents because information is mostly linear. Codebases are different — understanding a single function often requires traversing dependencies, imports, and multi-hop call chains spread across the repository.

By adding an iterative ReAct-style retrieval layer on top of traditional RAG, we can move from isolated semantic search to dependency-aware code understanding, enabling far more accurate and context-complete responses.
And why ReAct succeeds? Because it turns retrieval into exploration.

But then what are the tradeoffs between a ReAct agent and the existing language parsing algorithm vs storing the knowledge graph of the codebases. Which of them will be the best options in what scenario. These are some of the questions we will discuss in the upcoming posts in details.