Technology thesis · Artificial Intelligence

high conviction growth

Retrieval-augmented generation (RAG)

RAG has become the standard architecture for enterprise LLM deployment, enabling models to access private data without retraining.

Position maintained continuously · last reviewed Apr 22, 2026

The thesis

Core thesis

RAG solves the fundamental enterprise LLM problem: models need access to private, current data that was not in their training set. Frameworks such as LlamaIndex and LangChain provide the orchestration infrastructure, and vector databases (Pinecone, Weaviate) store the embeddings. Enterprise RAG is a multi-billion-dollar market. The challenge is that RAG quality depends on chunking strategy, embedding model and retrieval precision, which makes it an engineering problem rather than a research problem.
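To make two of those quality levers concrete, here is a minimal sketch of an overlapping word-window chunker and a precision-at-k check. Everything in it is an illustrative assumption, not something the thesis specifies: the function names `chunk_words` and `precision_at_k`, the default sizes, and the choice of word-based windows are all hypothetical.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of `size` words; adjacent windows share `overlap` words.

    Hypothetical helper for illustration; production systems often chunk by
    tokens, sentences or document structure instead.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k retrieved ids that are labelled relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    return sum(1 for doc_id in top_k if doc_id in relevant) / len(top_k)


if __name__ == "__main__":
    print(chunk_words("a b c d e f g h i j", size=4, overlap=1))
    print(precision_at_k(["a", "b", "c"], {"a", "c"}, k=2))
```

Changing `size` and `overlap` and re-measuring precision on a labelled query set is the kind of iteration loop the engineering framing implies.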

State of the art (2026)

By mid-2026, naive vector RAG is a baseline, not a finished system. The default production pattern is hybrid retrieval (semantic plus keyword) followed by a reranking pass. Agentic RAG, a reasoning loop that decomposes the query, retrieves, critiques and retrieves again, is now standard for hard questions. Microsoft GraphRAG, open-sourced in 2024, earns its cost on cross-document, connect-the-dots queries, and adaptive routing of each query to the cheapest sufficient pipeline is the emerging best practice. Long context has not killed RAG: Claude Opus and Sonnet 4.6, along with Gemini 3 Pro, all ship 1M-token windows, yet recall degrades past roughly 600-700K tokens, so retrieval still grounds enterprise systems. Glean ($7.2B valuation, ~$300M ARR) anchors the productised layer.
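One common way to combine the semantic and keyword result lists in a hybrid retriever is reciprocal rank fusion (RRF). The sketch below is an illustration only: the thesis does not say which fusion method production systems use, and the function name, the document ids and the constant `k = 60` (a conventional default) are assumptions. A reranking pass would then rescore the top of the fused list; it is not shown here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document id so the output
    is deterministic.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    semantic_hits = ["d1", "d2", "d3"]  # hypothetical vector-search results
    keyword_hits = ["d3", "d1", "d4"]   # hypothetical keyword-search results
    print(reciprocal_rank_fusion([semantic_hits, keyword_hits]))
```

RRF uses only rank positions, so it avoids having to normalise similarity scores and keyword scores onto a common scale, which is one reason it is a frequent first choice for hybrid retrieval.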

Signal stack

Evidence stacked leading → lagging

10 signals

talent

research

patent

expert

operational

market

Technology-native KPIs

Metrics that predict trajectory, tracked over time

4 tracked

Enterprise RAG Adoption

Vector Database Market

RAG Accuracy Improvement

Hallucination Reduction

Landscape map

Who builds what — and who depends on whom

69 players · 6 layers

Catalyst calendar

Dated events that will move the position

5 ahead

Technology roadmap

Milestones on the path to maturity

8 milestones

Watchlists

Companies, people and papers — each with a remove-by condition

3 · 20

Companies · 3

People · 20

Decision frameworks

The same call, framed for your desk

Public Equity

PE / VC

Corporate Leader

Thesis changelog

When our view changed, and why

4 updates

Change our mind

3 disconfirming conditions
