A PostgreSQL covering index is not a performance fix by itself; it is a bet that the query, the index payload, and the visibility map will stay aligned under real production churn.

Situation

The default move is still an ordinary B-tree index on the predicate column: CREATE INDEX ON users(email). The better move, when the read path is stable, is a covering index using PostgreSQL 11’s INCLUDE clause, which stores projected columns in the index payload so an index-only scan can answer the query without visiting the heap when visibility permits it.

ApproachWhat it optimizesWhat it still pays for
Ordinary B-tree indexFinds matching tuple IDs quicklyHeap reads for projected columns and Multi-Version Concurrency Control (MVCC) visibility
Covering index with INCLUDEKeeps predicate and selected columns in one indexLarger index, write overhead, visibility map dependency
Covering index plus vacuum disciplineAvoids heap access for stable pagesOperational ownership of autovacuum and long transactions

The Problem

PostgreSQL indexes do not store complete row visibility. They can point to candidate rows, but MVCC visibility is determined from heap state unless PostgreSQL can trust the visibility map. The official PostgreSQL documentation is explicit: index-only scans only win when the needed columns are available from the index and a significant fraction of heap pages have their all-visible bits set in the visibility map.

Failure pointWhat breaksWhy it matters
Projection misses the indexSELECT username, status uses idx_users_email(email) and still reads the heapThe index finds rows, but the table still serves the selected columns
Visibility map is stalePlan says Index Only Scan, but reports Heap Fetches: 12000The scan is only “index-only” for pages marked all-visible
Autovacuum threshold is too looseDefault autovacuum_vacuum_scale_factor = 0.2 can mean roughly 40M changed tuples on a 200M-row table before vacuum triggersLarge tables can accumulate heap pages that are not all-visible for too long
Included column churnUpdating status or username touches an indexed columnPostgreSQL must maintain the index entry, and HOT updates are less likely
Staging lies politelyFreshly loaded and manually vacuumed test data shows zero heap fetchesProduction write churn, old snapshots, and delayed vacuum change the execution profile

The core question is not “did we add an index?” It is: can PostgreSQL answer this production query from the index while proving that the referenced heap pages are visible to the current snapshot?

Design the Index Around the Read Path and the Visibility Map

The right architecture is a measured covering-index loop: identify the hot read path, build the narrowest covering index, verify heap avoidance with EXPLAIN (ANALYZE, BUFFERS), and tune vacuum behavior for that table instead of celebrating the DDL.

flowchart TD
    Query[hot read query — predicate and projection] --> Cover[covering B-tree index — key and included columns]
    Cover --> VM[visibility map — all visible bit]
    VM -->|bit set| Return[index tuple returned]
    VM -->|bit clear| Heap[heap visit for MVCC check]
    Heap --> Return
    Vacuum[VACUUM and autovacuum] --> VM
    Writes[INSERT UPDATE DELETE on page] --> VM
  1. Start from pg_stat_statements, not intuition. Pick one query by total time and call count, then write down its WHERE, ORDER BY, and SELECT columns.
    Verification: the candidate query has a stable fingerprint and enough calls to matter.

  2. Put search columns in the key and projected columns in INCLUDE. For the lookup path below, email is the key; username and status are payload.

    CREATE INDEX CONCURRENTLY idx_users_email_covering
    ON users(email)
    INCLUDE (username, status);
    

    Verification: CREATE INDEX CONCURRENTLY finishes without blocking ordinary reads and writes, and the index size is acceptable via pg_relation_size.

  3. Run the real query with execution metrics.

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT username, status
    FROM users
    WHERE email = 'dev@example.com';
    

    Verification: look for Index Only Scan, low shared buffer reads, and Heap Fetches: 0 or a number small enough to survive peak traffic.

  4. Check visibility health, not just plan shape. PostgreSQL’s visibility map stores all-visible and all-frozen state per heap page, and its bits are set by vacuum and cleared by data-modifying operations.
    Verification: if heap fetches remain high after the index is used, inspect last_autovacuum, n_dead_tup, long-running transactions, and table-level autovacuum settings.

  5. Bound the write cost. Included columns are not search keys, but they still live in the index. A wide text, jsonb, or frequently updated status column can turn a read optimization into write amplification.
    Verification: compare pg_stat_user_indexes.idx_scan, write latency, WAL volume, HOT update ratio, and index size before and after rollout.

In Practice

I am not going to invent a 2:14 AM incident with a heroic graph. The documented production pattern is enough, and the public PostgreSQL material gives a concrete measurement boundary.

PostgreSQL 11 added covering indexes with INCLUDE, documented in the project release notes and in the current index-only scan documentation. The documentation says the scan is physically possible when the index type supports it and the query’s referenced columns are available from the index. B-tree indexes satisfy the access-method requirement. The same documentation adds the operational catch: because visibility data is not stored in index entries, PostgreSQL checks the visibility map before skipping the heap.

That behavior explains why a plan can contain Index Only Scan and still do heap work. The plan node describes the access strategy; Heap Fetches tells you how often the executor had to visit heap pages anyway. If heap fetches are high, the covering index may still reduce work, but it has not removed the table from the read path.

A useful public comparison comes from Dalibo’s PostgreSQL 11 workshop, which uses a 10M-row table with columns a, b, and c. With a unique index on (a, b), selecting only a, b can use an index-only scan with Heap Fetches: 0. Selecting a, b, c from the same predicate cannot be answered by that index, so PostgreSQL uses an index scan and reads the table to get c. After adding a covering index on (a, b) INCLUDE (c), the same a, b, c query returns to an index-only scan with Heap Fetches: 0.

Public PostgreSQL 11 workshop measurementPlan shapeHeap fetch signalExecution time
Existing unique index on (a, b), query selects a, bIndex Only Scan012.628 ms
Existing unique index on (a, b), query selects a, b, cIndex ScanHeap access is inherent16.034 ms
Covering unique index on (a, b) INCLUDE (c), query selects a, b, cIndex Only Scan014.263 ms

The more interesting part is not the small read-time delta in that example. It is the storage and write tradeoff. Dalibo reports 214 MB for the unique (a, b) index and 387 MB for a separate (a, b, c) index, or 602 MB if both are kept. Replacing that pair with one unique covering index on (a, b) INCLUDE (c) is reported at 386 MB. The same workshop then inserts 100k rows: maintaining one covering index reports 502.594 ms; maintaining the two-index design reports 843.147 ms.

That is the design tradeoff senior engineers should care about. The covering index did not make writes free. It reduced a two-index design into one index while preserving uniqueness semantics on (a, b). If your alternative is no extra index, writes still pay. If your alternative is two overlapping indexes, a covering index may be the cheaper structure.

The deeper production gotcha is autovacuum math. PostgreSQL documents autovacuum_vacuum_threshold = 50 and autovacuum_vacuum_scale_factor = 0.2 defaults. On small tables, that is fine. On a 200M-row relation, scale-factor-driven vacuum can wait for a very large number of changed tuples unless table storage parameters override it. That delay matters because visibility map bits are conservative: if PostgreSQL cannot prove a page is all-visible, it visits the heap.

There is also a schema-design trap. Adding INCLUDE (username, status) is reasonable for a hot lookup endpoint. Adding ten payload columns because “index-only scans are fast” is not engineering; it is moving the table into another structure with worse write economics. PostgreSQL will reject oversized index tuples, and before that hard failure, you pay with memory pressure, cache churn, WAL, and slower updates.

The useful mental model is simple: a covering index is a read-path contract. Autovacuum, transaction age, and update patterns are the parties that can break it.

Where It Breaks

Failure modeTriggerFix
Index Only Scan still shows large Heap FetchesPages are not marked all-visible after recent INSERT, UPDATE, or DELETE activityTune table-level autovacuum and remove long-running transactions holding old snapshots
Covering index bloats quicklyINCLUDE contains wide text, jsonb, or low-value projected columnsKeep payload columns narrow and tied to one hot query family
Write latency rises after rolloutIncluded columns are frequently updated, preventing cheap heap-only behaviorDrop volatile payload columns or split read model from write-heavy table
Planner ignores the new indexQuery selects extra columns, uses mismatched predicates, or statistics are staleRe-run ANALYZE, verify exact projection, and compare with EXPLAIN (ANALYZE, BUFFERS)
Staging benchmark overstates gainsTest data was bulk-loaded, vacuumed, and mostly staticReplay production write mix or test after churn before trusting heap-fetch counts
RDS maintenance lags during peak write loadAutovacuum workers and cost limits cannot keep up with dead tuplesUse per-table autovacuum settings and monitor pg_stat_user_tables

What to Do Next

  • Problem: Ordinary indexes still force heap access when the query projects columns outside the index or when MVCC visibility cannot be proven from the visibility map.
  • Solution: Build narrow covering indexes only for high-call-count read paths, then treat autovacuum health as part of the index design.
  • Proof: The validation signal is not the presence of Index Only Scan; it is low Heap Fetches, stable buffer reads, acceptable index size, preserved HOT update ratio, and no write regression.
  • Action: This week, take the top query from pg_stat_statements, add one candidate covering index in staging, and compare EXPLAIN (ANALYZE, BUFFERS), pg_relation_size, write latency, WAL volume, and HOT update ratio before and after real write churn.

A fast PostgreSQL query is rarely the result of one clever index; it is the result of making the storage engine’s promises line up with the workload it is actually running.