Vectorless RAG Patterns for Database Knowledge Systems

RAG (Retrieval-Augmented Generation) is the default pattern for giving AI assistants context, but chunking structured operational documentation into 300-token vectors destroys the sequence of runbooks precisely when you need them most.

Situation

Engineering teams are increasingly feeding their incident response channels and database documentation into vector databases to build automated on-call assistants. The goal is to surface the right mitigation command at 2:13 a.m. when replica lag climbs or autovacuum gets blocked, without manually paging through Git repositories or wiki pages.

The Problem

The default chunked vector search implementation fails catastrophically for procedural database runbooks. It splits documents into arbitrary token pieces, embedding each piece into a vector, and retrieving chunks based on vocabulary similarity.

A PostgreSQL schema migration runbook contains a precheck, the DDL command, a validation query, and a rollback step. Vector chunking breaks this structure apart. Similarity scoring finds the chunk with the best vocabulary match for “migration,” which might return the validation query without the prerequisite rollback instructions. How do we retrieve operational knowledge while preserving the exact order of execution?

Core Concept

Vectorless RAG bypasses embedding models for structured documentation by using section tree retrieval. Instead of slicing text into chunks and measuring cosine similarity, documents are stored as a structured JSON tree keyed by document path. Retrieval happens via path prefixes rather than semantic approximation, guaranteeing that the precheck, command, validation, and rollback remain attached and in sequence.

Section Tree Retrieval Architecture

To build this, store your operational docs as a structured JSON tree in PostgreSQL using JSONB, keeping a vector store only for messy operational memory like Slack exports.

Step 1: Convert one critical runbook into a section tree.

The tree builder parses your Markdown headings into a nested JSON structure where each node has a path (array of heading titles from root to section), a summary, and the section body. No embeddings — just structure.

python scripts/build_doc_tree.py \
  --input docs/postgres/replication-lag.md \
  --doc-id postgres-replication-lag \
  --output build/postgres-replication-lag.json

Confirm with:

jq '.doc_id, .children[0].path, .children[0].summary' build/postgres-replication-lag.json

Step 2: Store the tree in Postgres JSONB with path-aware lookup.

Each row is one document section. The path column is an array (ARRAY['Postgres','Replication','Lag']) so you can query by prefix — “give me all Replication sections” — without scanning the full document body.

CREATE TABLE doc_index (
  doc_id        text    NOT NULL,
  path          text[]  NOT NULL,
  title         text    NOT NULL,
  summary       text    NOT NULL,
  body          text    NOT NULL,
  owner         text,
  last_verified date,
  node          jsonb   NOT NULL,
  PRIMARY KEY (doc_id, path)
);

CREATE INDEX doc_index_path_gin ON doc_index USING gin (path);
CREATE INDEX doc_index_node_gin ON doc_index USING gin (node jsonb_path_ops);

Step 3: Load sections without flattening the procedure.

python scripts/load_doc_tree_pg.py \
  --file build/postgres-replication-lag.json \
  --dsn "$DOC_INDEX_DSN"

Step 4: Route structured questions to tree retrieval first.

At query time, match document class before calling an LLM. Runbooks and schema docs route to the doc_index table. Incident postmortems route to the vector store.

SELECT path, title, summary, body
FROM doc_index
WHERE doc_id = 'postgres-replication-lag'
  AND (
    summary ILIKE '%schema migration%'
    OR body   ILIKE '%replica lag%'
    OR path @> ARRAY['Postgres','Replication']
  )
ORDER BY array_length(path, 1) DESC
LIMIT 5;

Step 5: Keep vector search for messy incident memory.

python scripts/embed_incidents.py \
  --source s3://db-knowledge/incidents/ \
  --collection db_incidents \
  --vector-store qdrant

flowchart TD
    DBA[DBA question] --> Router[retrieval router]
    Router -->|structured runbook| PostgresJSONB[doc_index in Postgres JSONB]
    Router -->|unstructured tickets| Qdrant[Qdrant — incidents collection]
    PostgresJSONB --> TreePath[section path — parent summaries — body]
    Qdrant --> VectorHits[top-k incident snippets]
    TreePath --> LLM[LLM answer composer]
    VectorHits --> LLM
    LLM --> Answer[answer with exact citation]
    Answer --> DBA

The router decision is intentionally boring: classify the document type first, then retrieve. Boring routing wakes you up less often.

In Practice

The documented pattern across operational knowledge systems is to strictly bound retrieval by how database engines execute commands. Derived from how PostgreSQL handles locking, schema changes hold an AccessExclusiveLock that queues all subsequent reads, often manifesting as replication lag or connection exhaustion. When a standard chunked RAG system encounters a query about this lock state, it routinely hallucinates by stitching together a pg_stat_activity query from a minor version upgrade document with a generic pg_cancel_backend snippet. This disjointed context encourages operators to blindly kill processes without verifying the blocker. By migrating to a section tree, the system instead pulls the entire operational branch—returning the specific diagnostic query, the targeted termination command, and the required rollback sequence as an atomic unit.

This structural alignment yields measurable shifts in how retrieval behaves during incidents:

Metric	Chunked vector search	Section tree retrieval
Runbook answer citation	Chunk ID + similarity score	Exact section path
Migration rollback retrieval	Often split across 2–4 chunks	Full prerequisite, command, validation, rollback in one section
Embedding model change	Re-embed runbooks, tickets, postmortems	Re-embed tickets only; tree index unchanged
Incident query behavior	Finds similar language	Follows operational structure first

The architectural split between structured and unstructured data typically looks like this:

Corpus	Best retrieval pattern	Reason
PostgreSQL failover runbook	Section tree	Procedure order and rollback must stay together
Snowflake warehouse guide	Section tree	Sections map to operational decisions
Prior SEV2 postmortems	Vector search	Language and structure vary across incidents
Slack incident channel export	Vector search	Messy, duplicated, high volume
Schema ownership docs	Section tree	Paths and citations matter
Slow query examples	Hybrid	Similar query shape + exact remediation docs

Where It Breaks

Failure mode	Trigger	Fix
Bad tree structure	Markdown headings are inconsistent or PDF parsing invents sections	Normalize docs to Markdown before building the tree; reject trees with missing `path`, `summary`, or `last_verified`
Wrong retrieval route	Query says “incident” but asks for the official rollback procedure	Add explicit document-class rules before any semantic routing
Stale runbook answer	Section exists but has not been tested since PostgreSQL 14	Require `last_verified`; suppress sections older than the last engine upgrade
JSONB table abuse	Teams start dumping every Slack export as a tree	Enforce: high-volume, messy text stays in the vector store
LLM over-summarizes commands	Retrieved section has multiple guarded branches	Return command blocks verbatim; make the model cite the section path, not paraphrase it

What to Do Next

Problem: Chunked vector search destroys the procedural sequence of database runbooks, leading to dangerous out-of-order execution during incidents.
Solution: Implement section tree retrieval using PostgreSQL JSONB to store and query operational documentation by hierarchical paths instead of token embeddings.
Proof: Extracting a full node path guarantees that prerequisites, commands, and rollbacks are returned as cohesive units, respecting the database’s locking behaviors.
Action: Convert one critical PostgreSQL failover runbook into a JSON tree in doc_index, and test 20 questions from recent incidents against both the tree index and the legacy vector store to compare citation accuracy.

Situation

The Problem

Core Concept

Section Tree Retrieval Architecture

In Practice

Where It Breaks

What to Do Next

Rajiv

Related Posts

pgvector Basics: Embeddings Inside PostgreSQL

The Stack for AI-Accelerated Database Operations Is Now Open Source

Stop Writing Ad-Hoc Queries: Build a Skill Backbone for Your DB Engineering Workflows