Comparing Context Retrieval Approaches for AI Code Review

Overview

At Compare The Market, we have an internal AI tool that automatically reviews merge requests (MRs). The goal is to reduce the time developers wait for feedback on their code and to increase MR throughput across the organisation. Developers get an intelligent first-pass review within minutes of opening an MR.

It works well, but we wanted it to work better. The reviewer erred on the side of caution, and we often saw false positives in bug detection because it reviewed code changes in isolation. We wanted to give the reviewer an understanding of how a change sits within the broader system. For example, a deleted function might look like dead code, unless you know it’s called dynamically from another service.

At this point we faced a fundamental architectural decision: How should the agent retrieve context about the codebase?

We had two main options:

GKG (GitLab Knowledge Graph): A code analysis engine that uses Tree-sitter AST parsing (via gitlab-code-parser) to build a structured knowledge graph of code entities and relationships, stored in a Kuzu graph database. This enables precise queries like "find all callers of this function" or "show me the class hierarchy".

RAG (Retrieval-Augmented Generation): A vector similarity search approach that chunks code, creates embeddings, and retrieves semantically similar code snippets.

We chose GKG based on intuition – our hypothesis was that code review requires structural understanding of code relationships, not just semantic similarity. When reviewing a change to a function, you need to know what calls it, what it calls, and how it fits into the broader architecture. RAG excels at finding "similar" code, but similarity isn’t the same as relevance for code review.
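The gap between similarity and relevance can be illustrated with a toy sketch. It is not our reviewer's implementation: the three snippets and the call edge are hypothetical, and a bag-of-words cosine score stands in for learned embeddings.

```python
import math
import re
from collections import Counter

# Hypothetical snippets; a real index would hold chunks of the repository.
SNIPPETS = {
    "validate_input": "def validate_input(payload): return bool(payload)",
    "validate_output": "def validate_output(payload): return bool(payload)",
    "handle_request": "def handle_request(req): return validate_input(req.body)",
}

# Hypothetical call edges (caller -> callees), as a structural index might record them.
CALLS = {"handle_request": ["validate_input"]}


def tokens(code: str) -> Counter:
    """Bag-of-words vector: a crude stand-in for an embedding."""
    return Counter(re.findall(r"[a-z]+", code.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def most_similar(name: str) -> str:
    """Similarity-style retrieval: the snippet that looks most like `name`."""
    query = tokens(SNIPPETS[name])
    others = [n for n in SNIPPETS if n != name]
    return max(others, key=lambda n: cosine(query, tokens(SNIPPETS[n])))


def callers_of(name: str) -> list[str]:
    """Structure-style retrieval: the functions that actually call `name`."""
    return sorted(c for c, callees in CALLS.items() if name in callees)


print(most_similar("validate_input"))  # validate_output
print(callers_of("validate_input"))    # ['handle_request']
```

The look-alike function wins on similarity, while the function a reviewer needs to see (the caller) only surfaces through the call edge.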

This article validates that intuition. Through rigorous evaluation using MLflow on Databricks, we compared four approaches and found that GKG outperforms RAG on the metrics that matter most for code review quality. The data confirms our architectural decision was correct.

1. The 4 approaches

We evaluated four distinct configurations:

2. GKG Integration

What is GKG?

Last year, GitLab introduced a beta version of an MCP (Model Context Protocol) server called the GitLab Knowledge Graph (GKG). The API indexes the repository and builds a structured, queryable representation of the codebase. It maps dependencies onto nodes in a graph, understands function definitions and their usage, traces inheritance hierarchies, and captures cross-references between modules.

The result is a semantic map of your code – not just a list of files, but a web of relationships. Through the tools provided by the MCP server, AI agents can query this graph in real time:

"Where is this function called?"

"What classes inherit from this interface?"

"What would be affected if I changed this method signature?"

Our Sidecar Integration

Because GKG was still in beta and not yet available as a native GitLab CI/CD feature, we built a separate sidecar service – a lightweight Docker container that wraps the official GKG binary and runs alongside our reviewer in the CI pipeline.

The Workflow

Index – When a merge request pipeline kicks off, the sidecar container mounts the project source and indexes the full codebase, building the knowledge graph from scratch.

Serve – Once indexed, it starts the GKG MCP server on a local port, exposing a set of tool calls.

Query – Our AI reviewer connects to the MCP server and uses these tools as part of its review workflow.
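The ordering of these steps implies that the reviewer can only query once the server is listening. A minimal sketch of a readiness check the reviewer side could run before connecting is shown here. The helper `wait_for_port` and the port in the usage note are hypothetical and not part of GKG; the sketch only probes a TCP port and makes no MCP calls.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.2) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: indexing time varies with repository size, so the
    reviewer polls instead of assuming the server is already up.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or refused); wait and retry.
            time.sleep(interval)
    return False
```

A reviewer job might call `wait_for_port("127.0.0.1", 8080, timeout=300)` (port chosen for illustration only) and fail the review step early if it returns False.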

How the Knowledge Graph Works

GKG builds a symbol graph: a structured representation of your codebase where nodes represent code entities (classes, functions, variables) and edges represent relationships (calls, inherits, imports).

The Indexed Project Graph

When GKG indexes a repository, it creates an interactive graph visualisation showing the entire codebase structure. Here’s what the indexed graph might look like for an example project:

Each node type represents a different code entity:

Orange (Directory) – Folder structure of the repository.

Green (File) – Individual source files.

Purple (Definition) – Classes, functions, and methods defined in the code.

Blue (Imported Symbol) – External dependencies and imports.

The edges (lines) show relationships: which files contain which definitions, which functions call other functions, and which modules import which symbols.
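The node types and edge relationships described above can be modelled as a small typed graph. This is a sketch for illustration only and does not reflect GKG's internal schema or its Kuzu storage: the class names, the edge labels, and the example file path are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeKind(Enum):
    DIRECTORY = "Directory"
    FILE = "File"
    DEFINITION = "Definition"
    IMPORTED_SYMBOL = "Imported Symbol"


class EdgeKind(Enum):
    CONTAINS = "contains"
    CALLS = "calls"
    INHERITS = "inherits"
    IMPORTS = "imports"


@dataclass(frozen=True)
class Node:
    kind: NodeKind
    name: str


@dataclass
class SymbolGraph:
    # Each edge is a (source, kind, target) triple.
    edges: set = field(default_factory=set)

    def add_edge(self, source: Node, kind: EdgeKind, target: Node) -> None:
        self.edges.add((source, kind, target))

    def neighbours(self, source: Node, kind: EdgeKind) -> list:
        """Targets reachable from `source` over edges of one kind, sorted by name."""
        found = [t for s, k, t in self.edges if s == source and k == kind]
        return sorted(found, key=lambda n: n.name)


# Hypothetical example project.
src = Node(NodeKind.DIRECTORY, "src")
user_file = Node(NodeKind.FILE, "src/user_service.py")
cls = Node(NodeKind.DEFINITION, "UserService")
method = Node(NodeKind.DEFINITION, "UserService.create")
helper = Node(NodeKind.DEFINITION, "validate_input")
logging_import = Node(NodeKind.IMPORTED_SYMBOL, "logging")

graph = SymbolGraph()
graph.add_edge(src, EdgeKind.CONTAINS, user_file)
for definition in (cls, method, helper):
    graph.add_edge(user_file, EdgeKind.CONTAINS, definition)
graph.add_edge(method, EdgeKind.CALLS, helper)
graph.add_edge(user_file, EdgeKind.IMPORTS, logging_import)

print([n.name for n in graph.neighbours(user_file, EdgeKind.CONTAINS)])
# ['UserService', 'UserService.create', 'validate_input']
```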

Example: Symbol Graph Structure

Consider a simple UserService class. GKG maps it as a graph showing the class, its methods, and all the files that call those methods:

Example: Querying the Graph

When the AI reviewer needs to understand the impact of a change, it queries GKG. For example, if someone modifies validate_input(), the agent asks: "Who calls this function?"
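A rough sketch of what answering that question involves, using an in-memory list of call edges rather than the real MCP tools (whose names and response shapes are not shown here). The file paths, caller names, and the `who_calls` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical call edges: (file, caller, callee).
CALL_EDGES = [
    ("api/handlers.py", "create_user", "validate_input"),
    ("api/handlers.py", "update_user", "validate_input"),
    ("jobs/importer.py", "import_row", "validate_input"),
    ("jobs/importer.py", "import_row", "save_row"),
]


def who_calls(function: str, edges: list) -> dict:
    """Group the direct callers of `function` by the file they live in."""
    by_file = defaultdict(list)
    for file, caller, callee in edges:
        if callee == function:
            by_file[file].append(caller)
    return {file: sorted(callers) for file, callers in sorted(by_file.items())}


print(who_calls("validate_input", CALL_EDGES))
# {'api/handlers.py': ['create_user', 'update_user'], 'jobs/importer.py': ['import_row']}
```

An empty result is informative too: under these assumptions it suggests the function has no recorded direct callers, although dynamic calls may not appear as edges.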

This precise information allows...
