PostgreSQL 18 Replication Upgrade Opportunities
PostgreSQL 18 ships with replication changes that are improvements in normal operation and surprises in the first week after upgrade. Parallel logical apply, the pg_createsubscriber --all utility, and better conflict logging each change the operational model for replication in ways that require preparation — not because they are dangerous, but because they surface behavior that was previously invisible. Planning the upgrade without understanding these changes means discovering them at 2 AM.
Note: This post was originally written during the PostgreSQL 18 beta 1 period. It has been updated to confirm behavior against the final release (September 25, 2025). The pg_createsubscriber --all behavior described here reflects the GA release; the conflict_resolution parameter discussed under Option 3 should be checked against the CREATE SUBSCRIPTION reference for your exact version before you rely on it.
Leadership Summary
Upgrading to PostgreSQL 18 introduces critical changes to logical replication that alter default concurrency and conflict visibility. While these represent architectural improvements, they will break applications that assume sequential logical apply and will trigger alerts for previously silent replication conflicts. Engineering leaders must ensure teams audit their current logical replication topology, explicitly test parallel apply ordering assumptions, and tune monitoring to handle the new structured conflict logging before upgrading production environments.
Situation
Teams on PostgreSQL 14, 15, or 16 are increasingly evaluating an upgrade to PostgreSQL 18. The database engine improvements — parallel query enhancements, improved statistics, and JSON improvements — are the typical headline justifications. Replication is often assessed as “nothing major changed” until someone runs the upgrade in staging and discovers that the conflict logging they had silenced for years is now surfacing in a new format that breaks their monitoring.
The three replication areas that actually change in PostgreSQL 18 and require deliberate assessment:
Parallel logical apply (available since PostgreSQL 16, where max_parallel_apply_workers_per_subscription already defaults to 2; PostgreSQL 18 makes streaming = parallel the default for newly created subscriptions): logical replication can now apply transactions concurrently across multiple apply workers when the publisher commits parallel transactions. This improves throughput significantly for write-heavy publishers but means that the apply order across concurrent transactions is no longer sequential, which breaks applications that assume apply order matches commit order.
pg_createsubscriber --all: the pg_createsubscriber utility, introduced in PostgreSQL 17, converts a physical streaming standby into a logical replication subscriber; PostgreSQL 18 adds the --all flag so that every database is covered in a single operation. Teams with physical standbys used for read scaling can now convert them to logical subscribers without tearing down and rebuilding the standby. This is an opportunity for teams that want subscriber-level table filtering or cross-version replication.
Improved conflict logging: PostgreSQL 18 surfaces logical replication conflicts with more detail in the server log, including the specific row values involved. Previously, conflicts were logged at a level that was easy to suppress; now they appear as ERROR level with structured detail. If you had suppressed replication conflict alerts because the volume was too noisy, PostgreSQL 18 will make them reappear prominently.
The Problem
The current approach to PostgreSQL major version upgrades often treats replication as a transparent layer that will simply resume functioning once the engine is upgraded. However, this approach breaks when upgrading to PostgreSQL 18 because the default concurrency model for logical replication fundamentally shifts.
When a team upgrades a logical subscriber to PostgreSQL 18 without preparation, any subscription that uses streaming = parallel (the default for subscriptions created on PostgreSQL 18) can start using up to max_parallel_apply_workers_per_subscription = 2 parallel apply workers. If the downstream application relies on strict sequential ordering of independent transactions, for example when building derived state or feeding an event-driven architecture, the sudden parallel apply can cause subtle data anomalies. Concurrently, the new verbose conflict logging can trigger large volumes of ERROR level alerts for conflicts that were previously ignored, overwhelming observability pipelines.
How can engineering teams proactively identify and manage these replication changes before they cause data anomalies and alert fatigue in production?
Upgrade Readiness Framework
To navigate these changes, teams should follow a structured diagnostic and remediation process.
Symptoms and Signals
| Signal | Where to see it | What it means |
|---|---|---|
| Current replication lag baseline | pg_stat_replication.replay_lag | Establish before upgrade to detect regression |
| Existing logical subscriptions | pg_subscription on subscribers | Will be affected by parallel apply default |
| Replication conflict errors in current logs | postgresql.log grep for conflict in logical replication | These will become more visible in PG18 |
| Physical standbys that could become logical | Infrastructure inventory | pg_createsubscriber --all conversion opportunity |
| Current max_wal_senders and max_replication_slots values | SHOW max_wal_senders; SHOW max_replication_slots; | Parallel apply adds additional worker connections |
First Five Checks
- Identify current replication type and topology — establish what you have before planning what changes:
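-- Check physical standbys (streaming replication; run on the primary)
-- pg_last_xact_replay_timestamp() is NULL on a primary, so report the gap in bytes instead
SELECT client_addr, application_name, state, sent_lsn, replay_lsn,
pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_gap_bytes
FROM pg_stat_replication;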
-- Check logical subscriptions (run on subscriber)
SELECT subname, subenabled, subconninfo, subpublications
FROM pg_subscription;
-- Check logical publishers (run on publisher)
SELECT pubname, puballtables, pubinsert, pubupdate, pubdelete
FROM pg_publication;
This establishes your current topology. Physical standbys and logical subscribers are upgraded differently — physical standbys follow the primary’s upgrade path, logical subscribers can remain on older versions while the publisher upgrades to PG18, which is one of the benefits of logical replication.
- Measure current replication lag baseline — capture before upgrade to detect regressions:
-- On publisher: physical replication lag
SELECT
application_name,
client_addr,
state,
write_lag,
flush_lag,
replay_lag
FROM pg_stat_replication
ORDER BY replay_lag DESC NULLS LAST;
-- On subscriber: time-based lag for logical replication
SELECT
subname,
received_lsn,
last_msg_send_time,
last_msg_receipt_time,
latest_end_time
FROM pg_stat_subscription;
Record these baseline values. After the upgrade, the same queries run against the upgraded instance should show stable or improved lag. If lag increases after upgrade, parallel apply worker count or worker connection limits may need tuning.
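To make the before/after comparison mechanical, the recorded lag values can be fed to a small shell helper. This is a minimal sketch rather than anything PostgreSQL ships: the function names lag_to_seconds and lag_regressed and the 1.5x tolerance factor are illustrative assumptions, and it only handles lag values in the HH:MM:SS[.fraction] form that psql prints for intervals shorter than a day.

```shell
#!/bin/sh
# Hypothetical helpers for comparing replay_lag values captured before and after the upgrade.

# Convert an interval like 00:01:01.5 (HH:MM:SS[.fraction]) to seconds.
lag_to_seconds() {
  echo "$1" | awk -F: '{ printf "%.3f\n", $1 * 3600 + $2 * 60 + $3 }'
}

# Exit status 0 when current lag exceeds baseline * factor, otherwise 1.
# $1 = baseline interval, $2 = current interval, $3 = tolerance factor (assumed default 1.5)
lag_regressed() {
  base=$(lag_to_seconds "$1")
  cur=$(lag_to_seconds "$2")
  awk -v b="$base" -v c="$cur" -v f="${3:-1.5}" \
    'BEGIN { exit !((c + 0) > (b + 0) * (f + 0)) }'
}

# Example: baseline of 200 ms, post-upgrade value of 900 ms
if lag_regressed 00:00:00.200 00:00:00.900; then
  echo "replay lag regressed beyond tolerance"
fi
```

A fixed multiplier is a crude threshold; teams with noisy lag may prefer to compare percentiles collected over a longer window.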
- Check for existing logical replication subscriptions — these require the most careful upgrade planning:
-- On subscriber: full subscription inventory
SELECT
s.subname,
s.subenabled,
r.srrelid::regclass AS tablename,
r.srsubstate
FROM pg_subscription s
JOIN pg_subscription_rel r ON r.srsubid = s.oid
ORDER BY s.subname, r.srsubstate;
-- Check current parallel apply setting (PostgreSQL 16+)
SHOW max_parallel_apply_workers_per_subscription;
If your subscribers are on PostgreSQL 16 or 17, max_parallel_apply_workers_per_subscription may already be set. If subscribers are on PostgreSQL 14 or 15, this parameter does not exist yet — it becomes relevant when the subscriber is upgraded to 18.
- Audit current conflict handling — understand what conflicts are already happening silently:
# Search the current PostgreSQL log for existing replication conflicts
grep -c 'conflict in logical replication' /var/log/postgresql/postgresql.log
# Get the distinct conflict types
grep 'conflict in logical replication' /var/log/postgresql/postgresql.log | \
grep -oP 'conflict on \w+' | sort | uniq -c | sort -rn
If you find zero conflicts in the log, either your replication is clean or conflicts are being logged at a level you are not capturing. After upgrading to PostgreSQL 18, conflict errors will be more prominently logged. Knowing the baseline before upgrade means you can distinguish “this is a new problem” from “this was always happening.”
- Check max_wal_senders and max_replication_slots headroom, because parallel apply uses additional worker slots:
SHOW max_wal_senders;
SHOW max_replication_slots;
-- Current usage
SELECT count(*) AS active_wal_senders FROM pg_stat_replication;
SELECT count(*) AS active_slots FROM pg_replication_slots WHERE active;
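Parallel apply workers are background workers on the subscriber, drawn from the max_logical_replication_workers pool (and therefore from max_worker_processes); they receive changes from the subscription's leader apply worker rather than opening their own walsender connections. Each subscription's leader apply worker holds one walsender connection and one replication slot on the publisher, and table synchronization workers temporarily use more. If you have 5 logical subscriptions with max_parallel_apply_workers_per_subscription = 2, budget at least 5 * (1 + 2) = 15 logical replication workers on the subscriber, and size max_wal_senders on the publisher for one connection per subscription plus synchronization workers and physical standbys.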
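The 5 * (1 + 2) arithmetic above generalizes to any subscription count. The sketch below computes the worker budget and compares it with a configured limit; the function names are illustrative, and the limit of 4 in the example is just a sample value to compare against (check the actual setting on your subscriber, for instance with SHOW max_logical_replication_workers).

```shell
#!/bin/sh
# Worker budget sketch: each subscription needs one leader apply worker
# plus up to N parallel apply workers.

# $1 = number of subscriptions, $2 = parallel apply workers per subscription
logical_workers_needed() {
  echo $(( $1 * (1 + $2) ))
}

# Exit status 0 when the configured limit covers the need.
# $1 = workers needed, $2 = configured limit
has_worker_headroom() {
  [ "$1" -le "$2" ]
}

needed=$(logical_workers_needed 5 2)
echo "workers needed: $needed"
if ! has_worker_headroom "$needed" 4; then
  echo "limit of 4 is too low for $needed workers"
fi
```

The budget deliberately ignores table synchronization workers, which are only active during initial copy; add them separately if you plan to resynchronize tables after the upgrade.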
Decision Tree
flowchart TD
A[Planning PG18 upgrade] --> B{Using logical replication?}
B -->|yes| C{Parallel apply already enabled?}
C -->|yes — PG16 or 17| D[Test apply ordering assumptions in staging]
C -->|no — PG14 or 15| E[Set max_parallel_apply to 0 initially after upgrade]
E --> F[Enable incrementally after validation]
B -->|no — physical only| G{Physical standbys present?}
G -->|yes| H{Convert any to logical?}
H -->|yes| I[Test pg_createsubscriber in staging first]
H -->|no| J[Physical replication — minimal changes in PG18]
D --> K{Conflict log volume change after upgrade?}
K -->|yes — more conflicts visible| L[Review and resolve — do not suppress]
K -->|no| M[Validate lag baseline matches pre-upgrade]
Remediation Options
Option 1 — Staged parallel apply enablement
After upgrading the subscriber to PostgreSQL 18, start with parallel apply disabled, validate behavior, then enable incrementally:
-- Disable parallel apply immediately after upgrade.
-- max_parallel_apply_workers_per_subscription is a server-level setting,
-- not a subscription option, so change it with ALTER SYSTEM and reload.
ALTER SYSTEM SET max_parallel_apply_workers_per_subscription = 0;
SELECT pg_reload_conf();
-- (To opt a single subscription out instead: ALTER SUBSCRIPTION my_subscription SET (streaming = off);)
-- Verify subscriber is applying correctly with zero parallel workers
SELECT subname, received_lsn, latest_end_lsn, latest_end_time
FROM pg_stat_subscription;
-- After 48 hours of stable operation, enable with 1 worker
ALTER SYSTEM SET max_parallel_apply_workers_per_subscription = 1;
SELECT pg_reload_conf();
-- If stable for another 48 hours, increase to default
ALTER SYSTEM SET max_parallel_apply_workers_per_subscription = 2;
SELECT pg_reload_conf();
The risk of parallel apply is not data corruption — PostgreSQL ensures causally-related transactions are applied in order. The risk is application code that assumes a specific apply order between causally-independent transactions and uses that assumption to build derived state.
Option 2 — Convert physical standby with pg_createsubscriber
PostgreSQL 18 includes pg_createsubscriber with an --all flag that converts an existing physical standby to a logical subscriber in one operation:
# Stop the standby (required — it cannot be running during conversion)
pg_ctl stop -D /var/lib/postgresql/standby_data
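# Convert to logical subscriber
# (run as postgres user, connecting to publisher)
# With --all, subscription, publication, and slot names are generated automatically,
# so --all is not combined with --database, --publication, --replication-slot, or --subscription
pg_createsubscriber \
--pgdata=/var/lib/postgresql/standby_data \
--publisher-server="host=publisher port=5432 dbname=mydb" \
--all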
# Start the converted subscriber
pg_ctl start -D /var/lib/postgresql/standby_data
# Verify subscription is running
psql -c "SELECT subname, subenabled FROM pg_subscription;"
The --all flag creates one subscription per database on the standby (template databases are excluded), each fed by a publication equivalent to FOR ALL TABLES in that database. Per the PostgreSQL 18 beta documentation, the standby must be on the same major version as the publisher for the conversion to succeed.
This is an opportunity if you have read replicas that are underutilized as physical standbys and would benefit from logical replication’s filtering and cross-version upgrade flexibility.
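Because the conversion is hard to undo, a rehearsal with the tool's --dry-run option is a reasonable first step in staging. The wrapper below is a sketch: the function name and the PG_CREATESUBSCRIBER override variable are hypothetical conveniences (the override only exists so the command can be swapped out for testing), while --dry-run and --verbose are standard pg_createsubscriber options.

```shell
#!/bin/sh
# Dry-run wrapper: asks pg_createsubscriber to report what it would do
# without converting the target data directory.

# $1 = standby data directory, $2 = publisher connection string
createsubscriber_dry_run() {
  "${PG_CREATESUBSCRIBER:-pg_createsubscriber}" \
    --pgdata="$1" \
    --publisher-server="$2" \
    --all \
    --dry-run \
    --verbose
}

# Usage (standby stopped, run as the postgres user):
#   createsubscriber_dry_run /var/lib/postgresql/standby_data \
#     "host=publisher port=5432 dbname=mydb"
```

If the dry run reports missing prerequisites on the publisher or the standby, fix those before scheduling the real conversion.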
Option 3 — Conflict monitoring setup for PG18 log format
PostgreSQL 18 logs replication conflicts with structured detail. Update any log parsing or alerting to match the new format:
# New PG18 conflict log format includes row values:
# ERROR: conflict detected on relation "public.orders": conflict=insert_exists
# Key (id)=(12345); existing local tuple (12345, 'pending', ...);
# remote tuple (12345, 'shipped', ...); ...
# Update log monitoring to capture conflict type
grep -E 'conflict=(insert_exists|update_missing|delete_missing)' \
/var/log/postgresql/postgresql.log | \
awk '{print $NF}' | sort | uniq -c
# Set up a per-conflict-type count alert in your monitoring tool
# Alert threshold: > 10 conflicts per hour of any type
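Pre-release material described a conflict_resolution parameter for subscriptions, with values apply_remote, keep_local, and skip, to control automatic conflict resolution. That parameter does not appear among the subscription options in the PostgreSQL 18 CREATE SUBSCRIPTION reference, so verify that your server accepts it before planning around it. Without it, conflicts that stop the apply worker still require manual intervention, such as ALTER SUBSCRIPTION ... SKIP or correcting the conflicting row on the subscriber.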
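The per-conflict-type threshold alert mentioned in the comments above could be sketched as a filter over log text. The function name is illustrative, the threshold of 10 comes from the comment above, and restricting the input to the last hour is left to the caller (for example through log rotation or a timestamp filter).

```shell
#!/bin/sh
# Print each conflict type whose count on stdin exceeds the threshold.
# $1 = threshold
conflicts_over_threshold() {
  grep -oE 'conflict=[a-z_]+' \
    | sort | uniq -c \
    | awk -v t="$1" '($1 + 0) > (t + 0) { sub(/^conflict=/, "", $2); print $2 }'
}

# Usage, assuming the log path used elsewhere in this post and a file that
# covers roughly the last hour:
#   conflicts_over_threshold 10 < /var/log/postgresql/postgresql.log
```

Matching on the generic conflict=<type> pattern, rather than a fixed list of types, means conflict types not listed in the grep above are still counted.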
Rollback Plan
- Parallel apply: disable immediately with ALTER SYSTEM SET max_parallel_apply_workers_per_subscription = 0 followed by SELECT pg_reload_conf(). No data loss. Reversible at any time.
- pg_createsubscriber conversion: not directly reversible. Once converted to a logical subscriber, restoring to a physical standby requires rebuilding the standby from the primary with pg_basebackup. Keep a snapshot of the standby data directory before conversion.
- PostgreSQL 18 upgrade: major version downgrades require restoring from a pre-upgrade backup. The upgrade itself does not change replication topology; the changes are in behavior. Pre-upgrade backup is the only rollback path.
- Conflict resolution parameter: subscription options changed with ALTER SUBSCRIPTION ... SET take effect without a restart. Confirm that your server accepts conflict_resolution before building a rollback step around ALTER SUBSCRIPTION ... SET (conflict_resolution = 'skip').
Automation Opportunity
A pre-upgrade validation script that runs the five checks automatically and flags risks:
#!/bin/bash
# PostgreSQL 18 replication upgrade readiness check
PSQL="psql -tAc"
echo "=== Replication Upgrade Readiness Check ==="
# Check 1: Replication topology
echo "--- Logical subscriptions:"
$PSQL "SELECT count(*) FROM pg_subscription WHERE subenabled;"
# Check 2: Current lag
echo "--- Max replay lag (physical):"
$PSQL "SELECT max(replay_lag) FROM pg_stat_replication;"
# Check 3: Parallel apply headroom
MAX_WS=$($PSQL "SHOW max_wal_senders;")
ACTIVE_WS=$($PSQL "SELECT count(*) FROM pg_stat_replication;")
SUB_COUNT=$($PSQL "SELECT count(*) FROM pg_subscription;")
NEEDED_WS=$((ACTIVE_WS + SUB_COUNT * 3)) # conservative: 3 workers per sub
echo "--- max_wal_senders: $MAX_WS, current active: $ACTIVE_WS, needed with parallel: $NEEDED_WS"
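# Check 4: Existing conflict count
# grep -c already prints 0 when nothing matches (and exits non-zero),
# so do not append a second "0" with || echo
LOG=/var/log/postgresql/postgresql.log
echo "--- Conflict count in current log file:"
if [ -r "$LOG" ]; then grep -c 'conflict in logical replication' "$LOG" || true; else echo "log not readable: $LOG"; fi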
echo "=== Done ==="
Run this against production before the upgrade window and again 24 hours after the upgrade to confirm stable behavior.
In Practice
The documented pattern is that PostgreSQL 18 fundamentally alters logical replication concurrency. The PostgreSQL Global Development Group’s beta release notes describe parallel logical apply as controlled by max_parallel_apply_workers_per_subscription, with a default of 2 workers. The parallel apply documentation explicitly notes that causally-related transactions — transactions where one depends on the other’s committed state — are always applied in order, but independent concurrent transactions may be applied in a different order than they were committed on the publisher.
The pg_createsubscriber utility was introduced in PostgreSQL 17 and is extended in PostgreSQL 18 with the --all flag. The documented behavior is that it stops WAL recovery on the standby, promotes it to standalone, creates the necessary publication on the publisher, and sets up the logical subscription — all in one operation. The beta documentation notes that the standby must have been a synchronous or asynchronous physical standby that was fully caught up at the time of conversion.
Tradeoff Matrix
Three distinct upgrade paths. Each is appropriate for a different team posture — the wrong choice for your application topology creates the failure modes in the table below.
| Upgrade path | Sequential apply guarantee | Ops complexity | Standby topology change | When to choose |
|---|---|---|---|---|
| Disable parallel apply: set max_parallel_apply_workers_per_subscription = 0 after upgrade | Preserved fully | Low | None | Any application with causal ordering assumptions; start here for every upgrade |
| Enable parallel apply incrementally — 0 → 1 → 2 workers over 96 hours | Relaxed for causally-independent txns only | Medium — requires apply-order audit | None | Event-driven consumers that tolerate out-of-order independent writes; high-write publishers |
| Convert standby to logical: run pg_createsubscriber --all | N/A (logical replication model) | High: topology change, irreversible without rebuild | Physical standby becomes logical subscriber | Teams needing table-level filtering, cross-version replication, or subscriber-level write access |
Choosing parallel apply without an ordering audit is the highest-risk option — it silently changes the consistency model of your subscriber for any application that reads derived state across independent tables.
Where It Breaks
| Failure mode | Trigger | Fix |
|---|---|---|
| Application reads stale data from subscriber | Parallel apply changes apply order for independent transactions | Audit application for causal ordering assumptions; add explicit ordering via sequence or timestamp |
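| Worker or connection limit exceeded after enabling parallel apply | Subscriptions × (1 + parallel workers) exceeds max_logical_replication_workers on the subscriber, or subscriptions plus standbys exceed max_wal_senders on the publisher | Raise the relevant limit before enabling parallel apply |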
| Conflict log volume overwhelms monitoring | PG18 surfaces previously-silent conflicts at ERROR level | Triage and resolve conflicts; do not suppress — they represent real data divergence |
| pg_createsubscriber fails mid-conversion | Standby still active or primary unreachable during conversion | Stop standby completely before running; verify publisher connectivity |
| Conflicts skipped globally (for example through a skip-style resolution setting, where one is available) | All conflicts silently skipped; subscriber diverges permanently | Do not skip by default; investigate and fix the root cause of each conflict type |
What to Do Next
- Problem: PostgreSQL 18 enables parallel logical apply by default and surfaces replication conflicts at a higher log level — both are improvements that can cause operational surprises if not prepared for before the upgrade.
- Solution: Set max_parallel_apply_workers_per_subscription = 0 immediately after upgrading logical replication subscribers, validate behavior, then enable incrementally after confirming application ordering assumptions hold.
- Proof: After upgrade, replication lag should match or improve versus the pre-upgrade baseline, and pg_stat_subscription.received_lsn should advance continuously.
- Action: Run the five pre-upgrade checks against your production database this week. Record baseline lag values and conflict log counts so you have a comparison point for post-upgrade validation.
Checklist
- Identify replication topology — physical standbys, logical subscribers, or both
- Record baseline replication lag from pg_stat_replication and pg_stat_subscription
- Check current max_wal_senders: calculate headroom with parallel apply workers added
- Count existing replication conflicts in current logs: establish baseline before upgrade
- Check for logical subscriptions on PostgreSQL 14 or 15 — plan subscriber upgrade path
- Test upgrade procedure in staging with production data volume — including parallel apply enabled
- After upgrade: immediately set max_parallel_apply_workers_per_subscription = 0 on all subscribers
- Run for 48 hours at zero parallel workers: confirm lag is stable and no new conflicts
- Enable parallel apply with 1 worker — monitor for 48 hours
- Increase to default 2 workers — monitor lag and conflict log for another 48 hours