Replication Lag Explained
Replication lag is not one number — it is three. Write lag, flush lag, and replay lag measure different things, fail in different ways, and require different interventions. Monitoring only total lag means you cannot tell whether the standby is slow to receive, slow to confirm, or slow to apply.
Situation
PostgreSQL’s pg_stat_replication view exposes three lag components for each connected standby: write_lag, flush_lag, and replay_lag. Most monitoring systems expose only the largest — typically replay_lag — and alert on it as a single number. That number is correct but incomplete.
Replication lag is the delay between a change being committed on the primary and being available on the standby. But “available” means different things depending on what you are protecting against.
The Problem
An alert fires: replication lag on the standby has reached 45 seconds. The on-call engineer does not know: is the primary sending WAL slowly? Is the standby receiving but not flushing? Is the standby flushing but not replaying? Each has a different root cause and a different fix. Without understanding the three components, you cannot triage the alert correctly.
What do the three lag components actually measure, and which one is relevant to your RPO?
The Three Components
PostgreSQL reports each lag as the time elapsed between the primary flushing recent WAL locally and receiving confirmation that the standby has completed a given stage:
Write lag: time until the standby confirms it has written the WAL to its operating system, but not yet flushed it to durable storage. This mostly reflects network latency and standby receive throughput.
Flush lag: time until the standby confirms it has flushed the WAL to disk. The gap between flush lag and write lag reflects the standby’s I/O performance for WAL writes.
Replay lag: time until the standby confirms it has replayed the WAL, making the change visible to queries on the standby. Replay can fall behind under high write volume or when long-running queries on the standby conflict with recovery and force replay to wait.
-- On the primary: all three lag components per standby
SELECT application_name,
       write_lag,
       flush_lag,
       replay_lag,
       state,
       sync_state
FROM pg_stat_replication
ORDER BY replay_lag DESC NULLS LAST;

-- On the standby: time since the last replayed transaction
-- (this keeps growing while the primary is idle, so it can overstate lag)
SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;
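For RPO purposes, replay_lag is the conservative number to watch: it is an upper bound on how much committed data is not yet visible on the standby. Strictly, WAL that the standby has already flushed is replayed before promotion completes, so flush_lag is closer to what would actually be lost if the primary fails right now, while replay_lag also indicates how long promotion has to wait for replay to catch up.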
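Because all three columns are measured from the same starting point, subtracting neighbouring values gives a rough per-stage breakdown. The sketch below illustrates that arithmetic, assuming the three values have already been read from pg_stat_replication and converted to timedelta objects. StageDelays, stage_delays, and dominant_stage are hypothetical names for illustration, not part of PostgreSQL or any client library.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class StageDelays:
    """Approximate time spent in each stage (hypothetical helper type)."""
    receive: timedelta  # primary flush -> standby has written the WAL
    flush: timedelta    # standby written -> standby flushed
    replay: timedelta   # standby flushed -> standby replayed


def stage_delays(write_lag: timedelta, flush_lag: timedelta,
                 replay_lag: timedelta) -> StageDelays:
    """Split cumulative lag values into approximate per-stage delays.

    Differences are clamped at zero (an assumption of this sketch), since
    the columns are sampled independently and may briefly be out of order.
    """
    zero = timedelta(0)
    return StageDelays(
        receive=max(write_lag, zero),
        flush=max(flush_lag - write_lag, zero),
        replay=max(replay_lag - flush_lag, zero),
    )


def dominant_stage(delays: StageDelays) -> str:
    """Return the name of the stage contributing the most delay."""
    return max(("receive", "flush", "replay"),
               key=lambda name: getattr(delays, name))


# Example: 2 ms to receive, 3 ms more to flush, the rest spent in replay.
example = stage_delays(timedelta(milliseconds=2),
                       timedelta(milliseconds=5),
                       timedelta(seconds=45))
print(dominant_stage(example))  # replay
```

The breakdown is only an approximation, but it is usually enough to tell a network problem from a replay problem at a glance.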
In Practice
In physical streaming replication, write_lag and flush_lag are typically small (milliseconds in a well-connected environment) and replay_lag is the dominant component. Replay lag grows when the standby is I/O constrained while applying changes to data pages, when long-running read queries on the standby conflict with recovery (hot standby conflicts), or when the primary generates WAL faster than the standby can replay it.
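With a synchronous standby configured, synchronous_commit = remote_apply makes the primary wait until the standby has replayed the commit before acknowledging it, so every commit pays the standby’s replay time as latency. synchronous_commit = remote_write waits only until the standby has written the WAL to its operating system: commit latency is lower, but the guarantee is weaker, because the change survives a crash of the standby’s PostgreSQL process but not a crash of its operating system.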
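The mapping between synchronous_commit levels and the lag stages can be written down as a small lookup. This is a minimal sketch, assuming a synchronous standby is configured through synchronous_standby_names. WAITS_FOR and commit_wait_component are illustrative names, not a PostgreSQL API.

```python
from typing import Optional

# Which standby stage the primary waits for before acknowledging a commit.
# Assumes synchronous_standby_names is non-empty; without a synchronous
# standby, "on", "remote_write" and "remote_apply" wait only for the
# local WAL flush.
WAITS_FOR = {
    "off": None,                   # does not wait even for the local flush
    "local": None,                 # waits for the local flush only
    "remote_write": "write_lag",   # standby has written WAL to its OS
    "on": "flush_lag",             # standby has flushed WAL to disk
    "remote_apply": "replay_lag",  # standby has replayed the commit
}


def commit_wait_component(level: str) -> Optional[str]:
    """Return the lag column that adds to commit latency at this level."""
    if level not in WAITS_FOR:
        raise ValueError(f"unknown synchronous_commit level: {level!r}")
    return WAITS_FOR[level]


print(commit_wait_component("remote_apply"))  # replay_lag
```

Read this way, raising the synchronous_commit level moves the commit path one stage further down the standby's pipeline, and the corresponding lag column shows what each commit will pay.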
Where It Breaks
| Lag component growing | Root cause | Fix |
|---|---|---|
| Write lag | Network congestion or bandwidth saturation | Investigate network path; consider WAL compression |
| Flush lag | Standby I/O pressure (disk writes slow) | Upgrade standby storage; separate WAL to faster device |
| Replay lag | Long-running queries on standby causing hot standby conflicts | Lower max_standby_streaming_delay; cancel conflicting queries |
| All three | Primary generating WAL faster than standby can process | Vertical scale of standby; reduce primary write throughput |
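The table above can be turned into a simple lookup for runbooks or alert annotations. The sketch below assumes the inputs are per-stage delays (each lag minus the one before it) rather than the raw cumulative columns, and that a single threshold separates "normal" from "slow"; both are simplifying assumptions. TRIAGE_ROWS and triage are hypothetical names, and the fallback row for mixed cases is not from the table.

```python
from datetime import timedelta
from typing import Optional, Tuple

# (root cause, fix) per table row, keyed by the set of slow stages.
TRIAGE_ROWS = {
    frozenset({"write"}): (
        "Network congestion or bandwidth saturation",
        "Investigate network path; consider WAL compression"),
    frozenset({"flush"}): (
        "Standby I/O pressure (disk writes slow)",
        "Upgrade standby storage; separate WAL to faster device"),
    frozenset({"replay"}): (
        "Long-running queries on standby causing hot standby conflicts",
        "max_standby_streaming_delay; cancel conflicting queries"),
    frozenset({"write", "flush", "replay"}): (
        "Primary generating WAL faster than standby can process",
        "Vertical scale of standby; reduce primary write throughput"),
}

# Fallback for combinations the table does not cover (assumption).
MIXED = ("More than one stage is slow", "Investigate each slow stage separately")


def triage(write_stage: timedelta, flush_stage: timedelta,
           replay_stage: timedelta,
           threshold: timedelta) -> Optional[Tuple[str, str]]:
    """Return (root cause, fix) for the slow stages, or None if all are fine."""
    stages = (("write", write_stage), ("flush", flush_stage),
              ("replay", replay_stage))
    slow = frozenset(name for name, delay in stages if delay > threshold)
    if not slow:
        return None
    return TRIAGE_ROWS.get(slow, MIXED)


cause, fix = triage(timedelta(milliseconds=2), timedelta(milliseconds=3),
                    timedelta(seconds=40), threshold=timedelta(seconds=1))
print(cause)  # Long-running queries on standby causing hot standby conflicts
```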
What to Do Next
- Problem: Monitoring a single lag number does not distinguish between a network problem, a standby I/O problem, and a replay conflict — three very different operational responses.
- Solution: Monitor all three components separately; alert on replay_lag > RPO_threshold for durability; alert on flush_lag > write_lag * 5 to detect standby I/O problems specifically.
- Proof: After adding per-component monitoring, lag spikes will clearly show which component is growing, cutting triage time from minutes to seconds.
- Action: Run the pg_stat_replication query above right now on your primary and capture the three lag values as your baseline. If you have never looked at them before, you likely do not know which component your standby’s lag comes from.
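The two alert rules from the Solution item can be sketched as a small function. This assumes the lag values arrive as timedelta objects, or None when PostgreSQL reports NULL (for example on an idle system). lag_alerts, rpo_threshold, and io_ratio are illustrative names, and skipping the ratio rule when write_lag is zero is an assumption of this sketch rather than part of the rule itself.

```python
from datetime import timedelta
from typing import List, Optional


def lag_alerts(write_lag: Optional[timedelta],
               flush_lag: Optional[timedelta],
               replay_lag: Optional[timedelta],
               rpo_threshold: timedelta,
               io_ratio: float = 5.0) -> List[str]:
    """Evaluate the two per-component alert rules; return fired alerts."""
    alerts: List[str] = []

    # Rule 1: replay_lag > RPO_threshold (durability).
    if replay_lag is not None and replay_lag > rpo_threshold:
        alerts.append("replay_lag exceeds RPO threshold")

    # Rule 2: flush_lag > write_lag * 5 (standby I/O).
    # Skipped when write_lag is missing or zero, so that any non-zero
    # flush_lag does not trigger the rule (assumption).
    if (write_lag is not None and flush_lag is not None
            and write_lag > timedelta(0)
            and flush_lag > write_lag * io_ratio):
        alerts.append(
            f"flush_lag is more than {io_ratio:g}x write_lag: "
            "suspect standby I/O")

    return alerts


fired = lag_alerts(write_lag=timedelta(milliseconds=2),
                   flush_lag=timedelta(milliseconds=50),
                   replay_lag=timedelta(seconds=45),
                   rpo_threshold=timedelta(seconds=30))
print(len(fired))  # 2
```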