RAG (Retrieval-Augmented Generation) is the default pattern for giving AI assistants context, but chunking structured operational documentation into 300-token vectors destroys the sequence of runbooks precisely when you need them most.

Situation

Engineering teams are increasingly feeding their incident response channels and database documentation into vector databases to build automated on-call assistants. The goal is to surface the right mitigation command at 2:13 a.m. when replica lag climbs or autovacuum gets blocked, without manually paging through Git repositories or wiki pages.

The Problem

The default chunked vector search implementation fails catastrophically for procedural database runbooks. It splits documents into arbitrary token pieces, embedding each piece into a vector, and retrieving chunks based on vocabulary similarity.

A PostgreSQL schema migration runbook contains a precheck, the DDL command, a validation query, and a rollback step. Vector chunking breaks this structure apart. Similarity scoring finds the chunk with the best vocabulary match for “migration,” which might return the validation query without the prerequisite rollback instructions. How do we retrieve operational knowledge while preserving the exact order of execution?

Core Concept

Vectorless RAG bypasses embedding models for structured documentation by using section tree retrieval. Instead of slicing text into chunks and measuring cosine similarity, documents are stored as a structured JSON tree keyed by document path. Retrieval happens via path prefixes rather than semantic approximation, guaranteeing that the precheck, command, validation, and rollback remain attached and in sequence.

Section Tree Retrieval Architecture

To build this, store your operational docs as a structured JSON tree in PostgreSQL using JSONB, keeping a vector store only for messy operational memory like Slack exports.

Step 1: Convert one critical runbook into a section tree.

The tree builder parses your Markdown headings into a nested JSON structure where each node has a path (array of heading titles from root to section), a summary, and the section body. No embeddings — just structure.

python scripts/build_doc_tree.py \
  --input docs/postgres/replication-lag.md \
  --doc-id postgres-replication-lag \
  --output build/postgres-replication-lag.json

Confirm with:

jq '.doc_id, .children[0].path, .children[0].summary' build/postgres-replication-lag.json

Step 2: Store the tree in Postgres JSONB with path-aware lookup.

Each row is one document section. The path column is an array (ARRAY['Postgres','Replication','Lag']) so you can query by prefix — “give me all Replication sections” — without scanning the full document body.

CREATE TABLE doc_index (
  doc_id        text    NOT NULL,
  path          text[]  NOT NULL,
  title         text    NOT NULL,
  summary       text    NOT NULL,
  body          text    NOT NULL,
  owner         text,
  last_verified date,
  node          jsonb   NOT NULL,
  PRIMARY KEY (doc_id, path)
);

CREATE INDEX doc_index_path_gin ON doc_index USING gin (path);
CREATE INDEX doc_index_node_gin ON doc_index USING gin (node jsonb_path_ops);

Step 3: Load sections without flattening the procedure.

python scripts/load_doc_tree_pg.py \
  --file build/postgres-replication-lag.json \
  --dsn "$DOC_INDEX_DSN"

Step 4: Route structured questions to tree retrieval first.

At query time, match document class before calling an LLM. Runbooks and schema docs route to the doc_index table. Incident postmortems route to the vector store.

SELECT path, title, summary, body
FROM doc_index
WHERE doc_id = 'postgres-replication-lag'
  AND (
    summary ILIKE '%schema migration%'
    OR body   ILIKE '%replica lag%'
    OR path @> ARRAY['Postgres','Replication']
  )
ORDER BY array_length(path, 1) DESC
LIMIT 5;

Step 5: Keep vector search for messy incident memory.

python scripts/embed_incidents.py \
  --source s3://db-knowledge/incidents/ \
  --collection db_incidents \
  --vector-store qdrant
flowchart TD
    DBA[DBA question] --> Router[retrieval router]
    Router -->|structured runbook| PostgresJSONB[doc_index in Postgres JSONB]
    Router -->|unstructured tickets| Qdrant[Qdrant — incidents collection]
    PostgresJSONB --> TreePath[section path — parent summaries — body]
    Qdrant --> VectorHits[top-k incident snippets]
    TreePath --> LLM[LLM answer composer]
    VectorHits --> LLM
    LLM --> Answer[answer with exact citation]
    Answer --> DBA

The router decision is intentionally boring: classify the document type first, then retrieve. Boring routing wakes you up less often.

In Practice

The documented pattern across operational knowledge systems is to strictly bound retrieval by how database engines execute commands. Derived from how PostgreSQL handles locking, schema changes hold an AccessExclusiveLock that queues all subsequent reads, often manifesting as replication lag or connection exhaustion. When a standard chunked RAG system encounters a query about this lock state, it routinely hallucinates by stitching together a pg_stat_activity query from a minor version upgrade document with a generic pg_cancel_backend snippet. This disjointed context encourages operators to blindly kill processes without verifying the blocker. By migrating to a section tree, the system instead pulls the entire operational branch—returning the specific diagnostic query, the targeted termination command, and the required rollback sequence as an atomic unit.

This structural alignment yields measurable shifts in how retrieval behaves during incidents:

MetricChunked vector searchSection tree retrieval
Runbook answer citationChunk ID + similarity scoreExact section path
Migration rollback retrievalOften split across 2–4 chunksFull prerequisite, command, validation, rollback in one section
Embedding model changeRe-embed runbooks, tickets, postmortemsRe-embed tickets only; tree index unchanged
Incident query behaviorFinds similar languageFollows operational structure first

The architectural split between structured and unstructured data typically looks like this:

CorpusBest retrieval patternReason
PostgreSQL failover runbookSection treeProcedure order and rollback must stay together
Snowflake warehouse guideSection treeSections map to operational decisions
Prior SEV2 postmortemsVector searchLanguage and structure vary across incidents
Slack incident channel exportVector searchMessy, duplicated, high volume
Schema ownership docsSection treePaths and citations matter
Slow query examplesHybridSimilar query shape + exact remediation docs

Where It Breaks

Failure modeTriggerFix
Bad tree structureMarkdown headings are inconsistent or PDF parsing invents sectionsNormalize docs to Markdown before building the tree; reject trees with missing path, summary, or last_verified
Wrong retrieval routeQuery says “incident” but asks for the official rollback procedureAdd explicit document-class rules before any semantic routing
Stale runbook answerSection exists but has not been tested since PostgreSQL 14Require last_verified; suppress sections older than the last engine upgrade
JSONB table abuseTeams start dumping every Slack export as a treeEnforce: high-volume, messy text stays in the vector store
LLM over-summarizes commandsRetrieved section has multiple guarded branchesReturn command blocks verbatim; make the model cite the section path, not paraphrase it

What to Do Next

  • Problem: Chunked vector search destroys the procedural sequence of database runbooks, leading to dangerous out-of-order execution during incidents.
  • Solution: Implement section tree retrieval using PostgreSQL JSONB to store and query operational documentation by hierarchical paths instead of token embeddings.
  • Proof: Extracting a full node path guarantees that prerequisites, commands, and rollbacks are returned as cohesive units, respecting the database’s locking behaviors.
  • Action: Convert one critical PostgreSQL failover runbook into a JSON tree in doc_index, and test 20 questions from recent incidents against both the tree index and the legacy vector store to compare citation accuracy.