Cursor Passed 5 Million Developers. Here Is What Its Codebase Intelligence Actually Does.

Cursor crossed 5 million monthly active developers in early 2025, up from roughly 500,000 at the start of 2024, which makes it one of the fastest-growing developer tools in recent memory. The growth story has been covered extensively: a $400 million Series B at a $9.9 billion valuation in January 2025, and adoption at companies including Stripe, Shopify, and Samsung. What has been covered less clearly is the technical architecture that distinguishes Cursor from GitHub Copilot, which has roughly 1.8 million paid users, and from tools like JetBrains AI Assistant and Amazon Q. This post is about what Cursor does technically, not commercially.

Codebase Indexing: The Foundation

GitHub Copilot's original architecture was document-context: the model sees the current file, the cursor position, and a small window of surrounding code. It has no structural knowledge of the broader codebase. Cursor's first differentiating feature was codebase indexing — a process that runs on project open and produces a semantic vector index of every file in the repository.

The indexing process combines AST parsing with embedding generation. Code is chunked at the function and class level rather than in arbitrary byte windows, and each chunk is embedded with a proprietary model fine-tuned on code rather than a general-purpose text embedding model. The embeddings are stored locally in a LevelDB instance inside the .cursor directory and indexed with FAISS for approximate nearest-neighbor search.
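
To make function- and class-level chunking concrete, here is a minimal sketch using Python's standard ast module. It is not Cursor's implementation: the Chunk type and chunk_source function are hypothetical names, it handles only Python source, and a real multi-language indexer would more plausibly rely on a parser such as tree-sitter.

```python
import ast
from dataclasses import dataclass


@dataclass
class Chunk:
    """One unit to embed (hypothetical type for this sketch)."""
    name: str
    kind: str        # "function" or "class"
    start_line: int  # 1-indexed, inclusive
    end_line: int    # 1-indexed, inclusive
    text: str


def chunk_source(source: str) -> list[Chunk]:
    """Split Python source into one chunk per top-level function or class."""
    lines = source.splitlines()
    chunks = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        elif isinstance(node, ast.ClassDef):
            kind = "class"
        else:
            continue  # imports, constants, etc. are skipped in this sketch
        text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
        chunks.append(Chunk(node.name, kind, node.lineno, node.end_lineno, text))
    return chunks


EXAMPLE = '''import os

def login(user, password):
    return check(user, password)

class Session:
    def refresh(self):
        pass
'''

for c in chunk_source(EXAMPLE):
    print(c.kind, c.name, c.start_line, c.end_line)
```

Each chunk keeps its line range, so a retrieved chunk can be traced back to its place in the file. Chunking on syntactic boundaries means a function is never split mid-body, which is the point of preferring it over fixed-size windows.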

When you ask Cursor a question or trigger a completion, the query is also embedded, and a semantic search retrieves the 20–40 most relevant code chunks from across the entire codebase. These chunks are injected into the context window before the model sees your request. This is retrieval-augmented generation (RAG) applied to code, and it is what allows Cursor to answer questions like "how does authentication work in this codebase?" from the code itself, where a tool that sees only the current file would have to guess from its training data.
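
The retrieval step can be sketched in a few lines. This is a toy under stated assumptions: embed here is a bag-of-words stand-in for a learned code embedding model, the search is a brute-force cosine ranking rather than an approximate nearest-neighbor index, and the function names and prompt layout are invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Stand-in embedding: a sparse bag-of-words vector (token -> count)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, chunks: list[str], k: int = 2) -> str:
    """Inject the retrieved chunks ahead of the user's request."""
    context = "\n\n".join(top_k(query, chunks, k))
    return f"Relevant code:\n{context}\n\nQuestion: {query}"


CHUNKS = [
    "def verify_token(token): # authentication: check the session token",
    "def render_chart(data): # draw a bar chart",
    "def login(user, password): # authentication entry point",
]

print(build_prompt("how does authentication work", CHUNKS))
```

The model never sees the chart code for this query, because it scores zero against the question. A learned embedding would also match chunks that share meaning rather than exact words, which this toy cannot do.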

The Shadow Workspace

Cursor's most architecturally interesting feature is what it calls the shadow workspace. When you ask Cursor to make a change, it doesn't just stream tokens into your editor. It opens a hidden, in-memory version of your file, applies the proposed change, and runs a set of validation passes before you see the result.

The validation passes include TypeScript type checking (via tsserver, the TypeScript language server), import resolution (verifying that any new imports the model added actually exist in your dependency graph), and a syntax check via tree-sitter. If any of these fail, Cursor's control loop prompts the model to revise its output, typically with 1–2 additional inference passes, before displaying the result to you.
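
The validate-then-revise loop described above can be sketched as follows. This is an illustration, not Cursor's code: it checks Python rather than TypeScript, covers only the syntax and import passes (no type checking), and the generate callback and fake_model are hypothetical stand-ins for a model call.

```python
import ast
import importlib.util
from typing import Callable, Optional


def validate(code: str) -> list[str]:
    """Return a list of problems; an empty list means the code passed."""
    try:
        tree = ast.parse(code)  # syntax check
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg} (line {exc.lineno})"]
    problems = []
    for node in ast.walk(tree):  # import resolution
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules = [node.module]
        else:
            continue
        for module in modules:
            # Only the top-level package is looked up, so nothing is imported.
            if importlib.util.find_spec(module.split(".")[0]) is None:
                problems.append(f"unresolved import: {module}")
    return problems


def generate_validated(
    generate: Callable[[Optional[list[str]]], str], max_revisions: int = 2
) -> tuple[str, list[str]]:
    """Call generate, feeding validation errors back until the output passes
    or the revision budget runs out."""
    code = generate(None)
    problems = validate(code)
    for _ in range(max_revisions):
        if not problems:
            break
        code = generate(problems)
        problems = validate(code)
    return code, problems


def fake_model(feedback: Optional[list[str]]) -> str:
    # Hypothetical model: the first draft imports a module that does not
    # exist; once it receives feedback, it returns a corrected draft.
    if feedback is None:
        return "import not_a_real_module_xyz\nprint(1)"
    return "import json\nprint(json.dumps({}))"


code, problems = generate_validated(fake_model)
print(code)
print(problems)
```

Capping the number of revisions matters: each retry is another inference pass, so an unbounded loop would trade correctness for unpredictable latency.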

This is the mechanism behind Cursor's lower hallucination rate on code completions compared with raw model output. The model still hallucinates; the shadow workspace catches a significant fraction of those hallucinations before they reach your screen. In Cursor's internal benchmarks (disclosed in a blog post in October 2024), shadow workspace validation reduced syntactically invalid outputs by 67% and outputs containing type errors by 44% compared with the same model without validation.

Speculative Edits and Apply Mode

Cursor's "Apply" feature — where you describe a change in natural language and it's applied across multiple files — uses a multi-step pipeline. First, the model generates a diff plan: a structured list of which files need to change and what the change is, in pseudocode. Second, a separate (smaller, faster) model converts each pseudocode plan item into an actual code diff. Third, the shadow workspace validates each diff before it's staged.

The two-model approach matters because planning a change and writing out the resulting code are different tasks. A large context model (Claude 3.5 Sonnet or GPT-4o in Cursor's routing system) handles the planning step, where understanding the full codebase context is critical. A smaller, faster model (typically a fine-tuned version of DeepSeek Coder or Code Llama) handles the mechanical translation from plan to code diff, where speed matters more than deep context understanding. Cursor's latency on multi-file Apply operations is typically 3–8 seconds, which is competitive with typing out the change manually for anything longer than a single function.
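
A rough sketch of the plan-then-write split, using Python's difflib to produce the per-file diffs. Everything here is illustrative: PlanItem, apply_plan, and the write_code callback are hypothetical names, the toy writer stands in for the smaller model, and the validation step is left out.

```python
import difflib
from dataclasses import dataclass
from typing import Callable


@dataclass
class PlanItem:
    """One step of the plan from the planning model (hypothetical shape)."""
    path: str
    instruction: str


def apply_plan(
    files: dict[str, str],
    plan: list[PlanItem],
    write_code: Callable[[str, str], str],
) -> dict[str, str]:
    """Turn each plan item into a unified diff, keyed by file path.

    write_code(original_text, instruction) stands in for the smaller model.
    """
    diffs = {}
    for item in plan:
        original = files[item.path]
        updated = write_code(original, item.instruction)
        diff = difflib.unified_diff(
            original.splitlines(keepends=True),
            updated.splitlines(keepends=True),
            fromfile=f"a/{item.path}",
            tofile=f"b/{item.path}",
        )
        diffs[item.path] = "".join(diff)
    return diffs


def rename_writer(original: str, instruction: str) -> str:
    # Toy "model": handles only instructions of the form "rename OLD to NEW".
    _, old, _, new = instruction.split()
    return original.replace(old, new)


FILES = {
    "auth.py": "def check(user):\n    return user.ok\n",
    "app.py": "from auth import check\ncheck(current_user)\n",
}
PLAN = [
    PlanItem("auth.py", "rename check to verify"),
    PlanItem("app.py", "rename check to verify"),
]

for path, diff in apply_plan(FILES, PLAN, rename_writer).items():
    print(diff)
```

Because the plan is a list of independent items, the writing step could run per file in parallel, which is one plausible reason a pipeline like this stays within a few seconds on multi-file changes.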

Model Routing and Cursor's Relationship with AI Labs

Cursor does not train its own foundation models. It routes requests to Anthropic, OpenAI, and its own fine-tuned variants depending on task type. The routing logic is not public, but based on network traffic analysis published by independent researchers in late 2024, Cursor uses Claude 3.5 Sonnet as the primary model for conversational queries and multi-file reasoning, GPT-4o for Tab completions (where its code training is reportedly a better fit), and a proprietary fine-tuned smaller model for the shadow workspace validation loop.
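
Since the real routing logic is not public, the following is only a guess at its simplest possible shape: a static table keyed by task type, filled in from the mapping described above. The Task names and model labels are illustrative and are not real API model identifiers.

```python
from enum import Enum


class Task(Enum):
    CHAT = "chat"
    MULTI_FILE = "multi_file"
    TAB_COMPLETION = "tab_completion"
    VALIDATION_FIX = "validation_fix"


# Mapping taken from the description above; the strings are labels for
# this sketch, not identifiers accepted by any provider's API.
ROUTES = {
    Task.CHAT: "claude-3.5-sonnet",
    Task.MULTI_FILE: "claude-3.5-sonnet",
    Task.TAB_COMPLETION: "gpt-4o",
    Task.VALIDATION_FIX: "in-house-small-model",
}


def route(task: Task) -> str:
    """Pick a model label for a task, failing loudly on unmapped tasks."""
    try:
        return ROUTES[task]
    except KeyError:
        raise ValueError(f"no route configured for {task}") from None


for task in Task:
    print(task.value, "->", route(task))
```

A production router would presumably also weigh latency, cost, and user settings, none of which this table captures.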

Cursor's business model means it pays per token to these providers. With 5 million active users each making several hundred completions per day, the compute cost is substantial, which helps explain the $400 million raise. The company has indicated it is working on its own training infrastructure, but this is multi-year work. For now, Cursor's differentiation is architectural (the indexing, shadow workspace, and routing layer) rather than a matter of model quality.

Privacy Mode and Enterprise Deployment

Cursor offers a Privacy Mode that disables the collection of your code as training data. In this mode, code is still sent to AI provider APIs for inference, but Cursor claims it does not store request/response pairs. For enterprise customers, a self-hosted option routes all inference through the customer's own API keys and optionally through a private network path.

The enterprise tier ($40/user/month) includes SOC 2 Type II compliance, SSO via Okta and Microsoft Entra, and audit logging. This is the tier Stripe and Shopify use. The compliance certification was completed in Q3 2024; before that, Cursor was not broadly deployable at enterprises with strict data governance requirements.

What Cursor Doesn't Do Well (Yet)

Cursor's codebase indexing has a hard limit of around 100,000 tokens of retrieved context per query. At a rough 10 tokens per line of code, that is on the order of 10,000 lines surfaced per request. For large monorepos with millions of lines, the relevant code may not be in the retrieved chunks, causing the same hallucination problems that simpler tools exhibit. The engineering team has discussed this as a known limitation in several developer forums.

Real-time collaboration is absent. Cursor is a single-user editor. Teams using it work in parallel on separate instances, which creates coordination problems on shared files that a real-time collaboration layer would handle. VS Code (with Live Share) and JetBrains (with Code With Me) both have better answers here. Cursor's product roadmap has mentioned "collaborative features" for 2025–2026 without specifics.

Actionable Takeaways

If you haven't set up codebase indexing: open Cursor settings and verify that "Codebase Indexing" is enabled (it is by default). Check the index status under Cursor > Settings > Features > Codebase Indexing. Large repos may need 5–10 minutes to fully index on first open.
For multi-file changes, use Composer mode (not Chat): Composer triggers the full speculative edit pipeline with shadow workspace validation. Chat mode uses a simpler single-pass approach. The quality difference is significant for refactoring tasks.
In Privacy Mode, your code still leaves your machine: it goes to the Anthropic and OpenAI APIs. If you need truly local inference, Cursor's self-hosted enterprise path supports routing to locally running Ollama instances, but this requires the enterprise tier.
For large monorepos: manually pin relevant files into context using @ mentions rather than relying solely on semantic search. The RAG retrieval is not perfect, and explicitly surfacing the right files improves output quality measurably.
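
To see why pinning helps under a fixed context budget, consider this small sketch. It assumes a crude estimate of about 4 characters per token and a simple greedy fill; the function names are hypothetical and this is not how Cursor necessarily assembles its context.

```python
def estimate_tokens(text: str) -> int:
    """Crude estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def assemble_context(pinned: list[str], retrieved: list[str], budget: int) -> list[str]:
    """Fill the budget with pinned files first, then retrieved chunks in
    ranked order, skipping duplicates and anything that does not fit."""
    selected, used = [], 0
    for text in pinned + retrieved:
        cost = estimate_tokens(text)
        if text in selected or used + cost > budget:
            continue
        selected.append(text)
        used += cost
    return selected


pinned = ["A" * 40]                          # about 10 tokens
retrieved = ["B" * 40, "A" * 40, "C" * 40]   # ranked by the semantic search
print(assemble_context(pinned, retrieved, budget=25))
```

Pinned content is guaranteed a place before anything the semantic search proposes, so a file you know is relevant cannot be crowded out by chunks that merely look similar to the query.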