PostgreSQL Autovacuum: What Every Engineer Should Know
Autovacuum is not a background nicety. It is the process that keeps PostgreSQL’s MVCC machinery from accumulating dead tuples until the table is unreadable, and the process that prevents transaction ID wraparound — a condition where PostgreSQL freezes all writes and forces an emergency vacuum on the entire cluster. Treating autovacuum as optional, throttling it too hard on OLTP servers, or simply not knowing what its thresholds mean is one of the most common ways production PostgreSQL clusters degrade over months before anyone notices.
Situation
PostgreSQL uses multi-version concurrency control (MVCC). When a row is updated or deleted, PostgreSQL does not overwrite it in place — it marks the old row version as dead and writes a new version. The dead row versions (dead tuples) accumulate on disk and remain visible to old transactions that might still need them. This is what makes non-blocking reads possible: readers never block writers, and writers never block readers.
But dead tuples cost disk space, and they slow down sequential scans because the storage engine has to skip over them. At the extreme end, transaction IDs are 32-bit integers — after about 2 billion transactions, PostgreSQL will wrap around and enter a state where it cannot guarantee which data is old and which is new. To prevent corruption, PostgreSQL will refuse all writes and force a full-cluster VACUUM FREEZE.
Autovacuum is the background daemon that reclaims dead tuples and advances the freeze horizon before either of these problems becomes a crisis.
The Problem
The default autovacuum thresholds are designed for small-to-medium tables. The trigger condition is:
autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × n_live_tup
With autovacuum_vacuum_scale_factor = 0.2 (the default), autovacuum triggers a VACUUM when 20% of the live row count has accumulated as dead tuples. On a table with 1,000 rows, this fires after 200 dead tuples — reasonable. On a table with 50 million rows, it fires after 10 million dead tuples have accumulated. That is a lot of bloat before the cleanup runs.
High-write tables — event logs, audit trails, queues, sessions — accumulate dead tuples faster than autovacuum can clear them at the default settings. The table grows. Indexes bloat. Query plans drift toward sequential scans. The system appears slow without an obvious cause, and the only way to recover is an explicit VACUUM or, worse, a VACUUM FULL (which rewrites the entire table and requires an exclusive lock).
The core question: how do you tune autovacuum before table bloat becomes a production incident?
How Autovacuum Threshold and Cost Throttling Work
Autovacuum has two independently important levers: when it runs and how fast it runs.
When it runs is controlled by the threshold formula above. For large, high-write tables, you almost always need to override autovacuum_vacuum_scale_factor at the table level rather than globally:
ALTER TABLE events SET (
autovacuum_vacuum_scale_factor = 0.01,
autovacuum_vacuum_threshold = 1000
);
This tells autovacuum to trigger after 1% of rows become dead (plus a baseline of 1,000 dead tuples), rather than 20%. For a 50 million row table, that fires after 500,000 dead tuples instead of 10 million.
How fast it runs is controlled by autovacuum_vacuum_cost_delay (default: 2ms in PG13+, 20ms in older versions). This is a per-page throttle: after vacuuming autovacuum_vacuum_cost_limit worth of pages, autovacuum sleeps for autovacuum_vacuum_cost_delay milliseconds. The intent is to prevent autovacuum from overwhelming I/O on a shared server. The side effect is that on OLTP servers with continuous high write throughput, autovacuum can be so throttled that it never catches up.
You can observe the current autovacuum state per-table in pg_stat_user_tables:
SELECT
relname,
n_live_tup,
n_dead_tup,
last_autovacuum,
last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
A table with a high n_dead_tup relative to n_live_tup and a stale last_autovacuum timestamp is a table where autovacuum is not keeping up.
autovacuum_max_workers (default: 3) controls how many autovacuum processes can run simultaneously. On clusters with many high-write tables, this can become the binding constraint — all workers are busy on large tables and smaller tables go unvacuumed.
In Practice
PostgreSQL’s autovacuum documentation (postgresql.org/docs/current/routine-vacuuming.html) documents the wraparound risk directly: when a table’s relfrozenxid age approaches autovacuum_freeze_max_age (default: 200 million transactions), PostgreSQL will force an anti-wraparound vacuum that ignores the normal cost throttling. This means a heavily throttled autovacuum configuration will eventually be overridden by the system — but not before the forced vacuum causes a visible I/O spike.
The pg_stat_user_tables view is the documented interface for observing autovacuum behavior per table. The columns n_dead_tup, last_autovacuum, last_autoanalyze, and autovacuum_count give the observable signal for whether thresholds are tuned correctly.
The documented pattern from PostgreSQL’s VACUUM documentation is that per-table storage parameters (autovacuum_vacuum_scale_factor, autovacuum_vacuum_cost_delay) override the server-level postgresql.conf settings — this is the correct mechanism for table-level tuning without changing global behavior.
Where It Breaks
| Scenario | What breaks | Why |
|---|---|---|
Autovacuum disabled explicitly (autovacuum = off) | Dead tuples accumulate unbounded; XID wraparound will eventually force a full-cluster emergency vacuum | The only thing preventing unbounded table bloat is operator-run VACUUM; one missed cycle compounds |
| Cost delay set too high on OLTP servers | Autovacuum runs slower than dead tuples accumulate; table bloat grows continuously | Each worker sleeps too long between pages; on high-write tables the math never closes |
| XID wraparound forces anti-wraparound vacuum | All autovacuum workers redirect to the aging table, ignoring cost limits; other tables go unvacuumed | Anti-wraparound vacuum is not throttled — it will consume I/O to protect data integrity |
What to Do Next
- Problem: On large, high-write tables the default 20% scale factor lets millions of dead tuples accumulate before autovacuum triggers, causing progressive table and index bloat.
- Solution: Override
autovacuum_vacuum_scale_factorat the table level (set to 0.01–0.05 for tables over 1M rows) and reduceautovacuum_vacuum_cost_delayon servers where autovacuum is falling behind. - Proof: Query
pg_stat_user_tablesand confirmn_dead_tupon your high-write tables stays below 1–2% ofn_live_tupover a 24-hour window. - Action: This week, run
SELECT relname, n_dead_tup, n_live_tup, last_autovacuum FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 20;and identify which tables have not been vacuumed recently or have high dead tuple ratios — those are the candidates for per-table threshold tuning.