Vectorless RAG Patterns for Database Knowledge Systems
RAG (Retrieval-Augmented Generation) is the default pattern for giving AI assistants context, but chunking structured operational documentation into 300-token vectors destroys the sequence of runbooks precisely when you need them most.
Situation
Engineering teams are increasingly feeding their incident response channels and database documentation into vector databases to build automated on-call assistants. The goal is to surface the right mitigation command at 2:13 a.m. when replica lag climbs or autovacuum gets blocked, without manually paging through Git repositories or wiki pages.
The Problem
The default chunked vector search implementation fails catastrophically for procedural database runbooks. It splits documents into arbitrary token pieces, embedding each piece into a vector, and retrieving chunks based on vocabulary similarity.
A PostgreSQL schema migration runbook contains a precheck, the DDL command, a validation query, and a rollback step. Vector chunking breaks this structure apart. Similarity scoring finds the chunk with the best vocabulary match for “migration,” which might return the validation query without the prerequisite rollback instructions. How do we retrieve operational knowledge while preserving the exact order of execution?
Core Concept
Vectorless RAG bypasses embedding models for structured documentation by using section tree retrieval. Instead of slicing text into chunks and measuring cosine similarity, documents are stored as a structured JSON tree keyed by document path. Retrieval happens via path prefixes rather than semantic approximation, guaranteeing that the precheck, command, validation, and rollback remain attached and in sequence.
Section Tree Retrieval Architecture
To build this, store your operational docs as a structured JSON tree in PostgreSQL using JSONB, keeping a vector store only for messy operational memory like Slack exports.
Step 1: Convert one critical runbook into a section tree.
The tree builder parses your Markdown headings into a nested JSON structure where each node has a path (array of heading titles from root to section), a summary, and the section body. No embeddings — just structure.
python scripts/build_doc_tree.py \
--input docs/postgres/replication-lag.md \
--doc-id postgres-replication-lag \
--output build/postgres-replication-lag.json
Confirm with:
jq '.doc_id, .children[0].path, .children[0].summary' build/postgres-replication-lag.json
Step 2: Store the tree in Postgres JSONB with path-aware lookup.
Each row is one document section. The path column is an array (ARRAY['Postgres','Replication','Lag']) so you can query by prefix — “give me all Replication sections” — without scanning the full document body.
CREATE TABLE doc_index (
doc_id text NOT NULL,
path text[] NOT NULL,
title text NOT NULL,
summary text NOT NULL,
body text NOT NULL,
owner text,
last_verified date,
node jsonb NOT NULL,
PRIMARY KEY (doc_id, path)
);
CREATE INDEX doc_index_path_gin ON doc_index USING gin (path);
CREATE INDEX doc_index_node_gin ON doc_index USING gin (node jsonb_path_ops);
Step 3: Load sections without flattening the procedure.
python scripts/load_doc_tree_pg.py \
--file build/postgres-replication-lag.json \
--dsn "$DOC_INDEX_DSN"
Step 4: Route structured questions to tree retrieval first.
At query time, match document class before calling an LLM. Runbooks and schema docs route to the doc_index table. Incident postmortems route to the vector store.
SELECT path, title, summary, body
FROM doc_index
WHERE doc_id = 'postgres-replication-lag'
AND (
summary ILIKE '%schema migration%'
OR body ILIKE '%replica lag%'
OR path @> ARRAY['Postgres','Replication']
)
ORDER BY array_length(path, 1) DESC
LIMIT 5;
Step 5: Keep vector search for messy incident memory.
python scripts/embed_incidents.py \
--source s3://db-knowledge/incidents/ \
--collection db_incidents \
--vector-store qdrant
flowchart TD
DBA[DBA question] --> Router[retrieval router]
Router -->|structured runbook| PostgresJSONB[doc_index in Postgres JSONB]
Router -->|unstructured tickets| Qdrant[Qdrant — incidents collection]
PostgresJSONB --> TreePath[section path — parent summaries — body]
Qdrant --> VectorHits[top-k incident snippets]
TreePath --> LLM[LLM answer composer]
VectorHits --> LLM
LLM --> Answer[answer with exact citation]
Answer --> DBA
The router decision is intentionally boring: classify the document type first, then retrieve. Boring routing wakes you up less often.
In Practice
The documented pattern across operational knowledge systems is to strictly bound retrieval by how database engines execute commands. Derived from how PostgreSQL handles locking, schema changes hold an AccessExclusiveLock that queues all subsequent reads, often manifesting as replication lag or connection exhaustion. When a standard chunked RAG system encounters a query about this lock state, it routinely hallucinates by stitching together a pg_stat_activity query from a minor version upgrade document with a generic pg_cancel_backend snippet. This disjointed context encourages operators to blindly kill processes without verifying the blocker. By migrating to a section tree, the system instead pulls the entire operational branch—returning the specific diagnostic query, the targeted termination command, and the required rollback sequence as an atomic unit.
This structural alignment yields measurable shifts in how retrieval behaves during incidents:
| Metric | Chunked vector search | Section tree retrieval |
|---|---|---|
| Runbook answer citation | Chunk ID + similarity score | Exact section path |
| Migration rollback retrieval | Often split across 2–4 chunks | Full prerequisite, command, validation, rollback in one section |
| Embedding model change | Re-embed runbooks, tickets, postmortems | Re-embed tickets only; tree index unchanged |
| Incident query behavior | Finds similar language | Follows operational structure first |
The architectural split between structured and unstructured data typically looks like this:
| Corpus | Best retrieval pattern | Reason |
|---|---|---|
| PostgreSQL failover runbook | Section tree | Procedure order and rollback must stay together |
| Snowflake warehouse guide | Section tree | Sections map to operational decisions |
| Prior SEV2 postmortems | Vector search | Language and structure vary across incidents |
| Slack incident channel export | Vector search | Messy, duplicated, high volume |
| Schema ownership docs | Section tree | Paths and citations matter |
| Slow query examples | Hybrid | Similar query shape + exact remediation docs |
Where It Breaks
| Failure mode | Trigger | Fix |
|---|---|---|
| Bad tree structure | Markdown headings are inconsistent or PDF parsing invents sections | Normalize docs to Markdown before building the tree; reject trees with missing path, summary, or last_verified |
| Wrong retrieval route | Query says “incident” but asks for the official rollback procedure | Add explicit document-class rules before any semantic routing |
| Stale runbook answer | Section exists but has not been tested since PostgreSQL 14 | Require last_verified; suppress sections older than the last engine upgrade |
| JSONB table abuse | Teams start dumping every Slack export as a tree | Enforce: high-volume, messy text stays in the vector store |
| LLM over-summarizes commands | Retrieved section has multiple guarded branches | Return command blocks verbatim; make the model cite the section path, not paraphrase it |
What to Do Next
- Problem: Chunked vector search destroys the procedural sequence of database runbooks, leading to dangerous out-of-order execution during incidents.
- Solution: Implement section tree retrieval using PostgreSQL JSONB to store and query operational documentation by hierarchical paths instead of token embeddings.
- Proof: Extracting a full node path guarantees that prerequisites, commands, and rollbacks are returned as cohesive units, respecting the database’s locking behaviors.
- Action: Convert one critical PostgreSQL failover runbook into a JSON tree in
doc_index, and test 20 questions from recent incidents against both the tree index and the legacy vector store to compare citation accuracy.
Interactive tools for this topic