Azure Database for PostgreSQL: Flexible Server vs Hyperscale (Citus) Architecture Decision
The default Azure PostgreSQL offering handles most OLTP workloads well, but teams that hit connection limits, multi-tenant scale, or distributed query requirements often discover they chose the wrong architecture only after the schema is in production.
Situation
Azure offers two managed PostgreSQL architectures: Flexible Server (the current default and successor to Single Server) and Hyperscale, which runs the Citus extension for distributed PostgreSQL. Both are managed services on Azure with similar operational interfaces. The architectural difference is not a sizing question — it is a data distribution question. Most teams never need Citus. The teams that do need it typically discover the need late, after their schema is built around single-node PostgreSQL assumptions.
Azure retired PostgreSQL Single Server in March 2025, which makes Flexible Server the standard entry point for new deployments and migrations.
The Problem
Azure Flexible Server is a single-primary managed PostgreSQL instance with read replicas, high availability via standby promotion, and built-in PgBouncer connection pooling. It scales vertically and handles standard PostgreSQL workloads. The failure mode is predictable: beyond a certain write throughput threshold and connection count, a single PostgreSQL primary saturates regardless of how large the VM SKU is.
Citus distributes table rows across worker nodes using a shard key. This enables horizontal write scaling and parallel query execution across shards — but it requires designing the schema and query patterns around the distribution key from the start. Application queries that do not include the distribution key cannot be routed to a single shard and must fan out across all workers, which is expensive.
The core question: does the workload require horizontal scaling of writes and data volume, or does it require operational simplicity with vertical scaling?
Flexible Server vs Hyperscale (Citus)
flowchart TD
A[PostgreSQL workload on Azure] --> B{Multi-tenant or single-tenant?}
B -->|single tenant — standard OLTP| C[Flexible Server]
B -->|multi-tenant at scale or distributed analytics| D{Can schema be distributed on tenant ID?}
D -->|yes — queries filter by tenant| E[Citus — sharded by tenant]
D -->|no — cross-tenant joins required| F[Flexible Server — accept vertical limits]
C --> G[Scale vertically — HA standby — PgBouncer]
E --> H[Coordinator node — worker shards — distributed queries]
Azure Flexible Server
Flexible Server provides a single primary PostgreSQL instance with:
- Zone-redundant high availability (primary + synchronous standby in a secondary AZ)
- Built-in PgBouncer for connection pooling (configurable pool sizes per database)
- Read replicas for read offload (asynchronous replication)
- Automatic minor version patching and maintenance windows
- Private endpoint and VNet integration
The HA model uses a standby in a secondary availability zone with synchronous replication. Azure documents typical failover in 60–120 seconds with automatic DNS cutover (Flexible Server HA docs). The built-in PgBouncer connection pooler is enabled separately from the HA feature and must be explicitly configured — applications that connect directly to the PostgreSQL port bypass PgBouncer.
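Because the pooler only helps when the application actually connects through it, it can be useful to make the port choice explicit in code. The following is a minimal sketch, not an official Azure helper: the function name `build_conninfo`, the `pooled` flag, and the host name are hypothetical, and it assumes the pooler listens on 6432 and PostgreSQL on 5432.

```python
def build_conninfo(host: str, dbname: str, user: str, *, pooled: bool = True) -> str:
    """Return a libpq-style connection string for a Flexible Server instance.

    Assumption: PgBouncer listens on 6432 and PostgreSQL itself on 5432.
    The password is deliberately left out; supply it via PGPASSWORD or a
    secret store rather than embedding it in the string.
    """
    port = 6432 if pooled else 5432
    return f"host={host} port={port} dbname={dbname} user={user} sslmode=require"


# Hypothetical server name, used only for illustration.
app_conninfo = build_conninfo("example.postgres.database.azure.com", "appdb", "app_user")
admin_conninfo = build_conninfo(
    "example.postgres.database.azure.com", "appdb", "admin_user", pooled=False
)
```

Keeping a separate, explicitly unpooled string for migrations or admin sessions makes it harder for application traffic to end up on the direct port by accident.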
Connection pooling is the most commonly misconfigured element. Azure Flexible Server supports a maximum of 5,000 backend connections for the largest SKU (D64s v3), but each PostgreSQL backend process consumes memory. The practical limit before performance degrades is substantially lower. PgBouncer on Flexible Server runs in transaction-pooling mode by default, which releases the backend connection between transactions — enabling more clients than physical backends.
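To see why transaction pooling lets more clients connect than there are backends, it helps to count how many transactions are open at the same instant. The sketch below is a toy model, not PgBouncer's actual scheduling: the function name `peak_backends` and the timing data are made up for illustration.

```python
def peak_backends(transactions):
    """Return the largest number of transactions open at the same time.

    In transaction pooling, a client needs a backend only while a
    transaction is open, so this peak is the number of backends required.
    `transactions` is a list of (start, end) timestamps.
    """
    events = []
    for start, end in transactions:
        events.append((start, 1))   # backend acquired
        events.append((end, -1))    # backend released
    # At equal timestamps, process the release (-1) before the acquire (+1).
    events.sort(key=lambda e: (e[0], e[1]))
    in_use = peak = 0
    for _, delta in events:
        in_use += delta
        peak = max(peak, in_use)
    return peak


# Six connected clients, each running one short transaction (start_ms, end_ms).
transactions = [(0, 5), (3, 8), (10, 12), (11, 15), (20, 22), (21, 23)]
clients = len(transactions)
backends_needed = peak_backends(transactions)
```

In this made-up trace, six clients stay connected but never more than two transactions overlap, so two backends would be enough. Real workloads with long-running or idle-in-transaction sessions erode that ratio quickly.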
Hyperscale (Citus)
Citus distributes a PostgreSQL database across a coordinator node and multiple worker nodes. The coordinator routes queries to shards based on the distribution column. A table distributed on tenant_id routes queries that filter on tenant_id to the single worker holding that tenant’s shards. Queries without a tenant_id filter fan out to all workers.
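The routing behavior described above can be sketched as a toy model. This is not how Citus is implemented: Citus uses its own hash function and assigns hash ranges to shards, whereas this sketch uses `zlib.crc32` and a modulo as stand-ins. The shard count, worker names, and function names are all hypothetical.

```python
import zlib

SHARD_COUNT = 8                                  # hypothetical shard count
WORKERS = [f"worker-{i}" for i in range(4)]      # hypothetical worker names


def shard_for(tenant_id: int) -> int:
    # Stand-in hash: maps a tenant to one shard deterministically.
    return zlib.crc32(str(tenant_id).encode()) % SHARD_COUNT


def worker_for(shard: int) -> str:
    # Toy placement: shards are spread across workers round-robin.
    return WORKERS[shard % len(WORKERS)]


def route(tenant_id=None):
    """Return the workers a query has to touch.

    With a tenant_id filter the query goes to exactly one worker;
    without one it fans out to every worker that holds a shard.
    """
    if tenant_id is None:
        return sorted({worker_for(s) for s in range(SHARD_COUNT)})
    return [worker_for(shard_for(tenant_id))]


scoped = route(tenant_id=42)    # one worker
fan_out = route()               # all workers
```

The point of the model is the asymmetry: a tenant-scoped query costs one round trip to one worker, while an unscoped query costs one per worker plus the merge on the coordinator.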
The operational consequence: Citus is most efficient for multi-tenant SaaS workloads where each tenant’s data is isolated and queries are tenant-scoped. It is less effective for workloads with heavy cross-tenant analytics or complex joins between distributed tables that are not co-located on the distribution column. Joins against reference tables are comparatively cheap, because reference tables are replicated to every worker.
Azure-managed Citus (now branded as part of Azure Cosmos DB for PostgreSQL) provides managed coordinator and worker nodes, a built-in shard rebalancer, and built-in high availability per node.
In Practice
Azure Flexible Server’s PgBouncer documentation explicitly states that PREPARE, DEALLOCATE, LISTEN, NOTIFY, LOAD, and advisory locks are not compatible with transaction-pooling mode (PgBouncer compatibility). Applications that rely on session-level prepared statements through PgBouncer in transaction mode can hit errors, because the statement is prepared on one backend and a later transaction may run on another. This is a PgBouncer constraint, not an Azure-specific one, but it is frequently missed by teams migrating from AWS RDS or on-premises PostgreSQL, where pooling was handled client-side at the application layer instead.
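The prepared-statement failure is easier to reason about with a small simulation. The sketch below is a toy model under simplifying assumptions: the `Backend` and `TransactionPool` classes are hypothetical, and strict round-robin assignment stands in for PgBouncer handing out whichever backend happens to be free.

```python
class Backend:
    def __init__(self, name):
        self.name = name
        self.prepared = set()   # prepared statements are per-backend session state


class TransactionPool:
    """Toy pooler: every transaction gets the next backend, round-robin."""

    def __init__(self, backends):
        self.backends = backends
        self.next = 0

    def run(self, action, statement_name):
        backend = self.backends[self.next % len(self.backends)]
        self.next += 1          # backend is released when the transaction ends
        if action == "PREPARE":
            backend.prepared.add(statement_name)
            return f"PREPARE ok on {backend.name}"
        if statement_name not in backend.prepared:
            return f'ERROR: prepared statement "{statement_name}" does not exist'
        return f"EXECUTE ok on {backend.name}"


pool = TransactionPool([Backend("b1"), Backend("b2")])
first = pool.run("PREPARE", "get_order")    # lands on b1
second = pool.run("EXECUTE", "get_order")   # lands on b2, which never saw the PREPARE
```

In a real deployment the failure is intermittent rather than deterministic, since the client sometimes gets the same backend back, which is part of why it tends to surface only under load.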
Citus’s documented design requires that the distribution column be present in the primary key and all unique constraints of the distributed table. A table distributed on tenant_id must include tenant_id in its primary key (e.g., PRIMARY KEY (tenant_id, id)). This is documented as a hard requirement — the coordinator cannot enforce uniqueness across shards without the distribution column in the constraint (Citus distribution docs). Applications migrated from single-node PostgreSQL typically have auto-increment primary keys without a tenant prefix, requiring a schema migration before Citus distribution is feasible.
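As a rough illustration of that schema change, the sketch below generates the SQL a migration might run for one table. It is a sketch under stated assumptions, not a complete migration: the helper name `citus_migration_sql` is hypothetical, the existing primary key is assumed to carry PostgreSQL's default name `<table>_pkey`, and foreign keys that reference the old key are not handled. `create_distributed_table` is the Citus function that distributes a table on a column.

```python
def citus_migration_sql(table: str, dist_col: str, id_col: str = "id") -> list[str]:
    """Return SQL statements that prepare `table` for distribution on `dist_col`.

    Assumptions: identifiers come from trusted code (no quoting is done),
    and the current primary key constraint is named `<table>_pkey`.
    """
    return [
        # Drop the single-column key that omits the distribution column.
        f"ALTER TABLE {table} DROP CONSTRAINT {table}_pkey;",
        # Recreate it as a composite key led by the distribution column.
        f"ALTER TABLE {table} ADD PRIMARY KEY ({dist_col}, {id_col});",
        # Distribute the table across workers on that column.
        f"SELECT create_distributed_table('{table}', '{dist_col}');",
    ]


statements = citus_migration_sql("orders", "tenant_id")
```

On a populated table, rebuilding the primary key takes a lock and rewrites an index, so this step would normally be rehearsed against a copy of production data first.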
Where It Breaks
| Scenario | What breaks | Why |
|---|---|---|
| Flexible Server — prepared statements with PgBouncer in transaction mode | ERROR: prepared statement does not exist | Transaction pooling releases the backend connection between transactions; prepared statements are session state and do not follow the client to the next backend |
| Flexible Server — application connects to PostgreSQL port, bypasses PgBouncer | Connection saturation under load | PgBouncer only intercepts connections on port 6432; direct PostgreSQL port (5432) bypasses pooling |
| Citus — cross-tenant queries on distributed tables | Fan-out to all workers, high latency | No shard routing possible without distribution column in WHERE clause |
| Citus — unique constraints without distribution column | Cannot enforce constraint across shards | Coordinator cannot run a distributed uniqueness check efficiently |
| Flexible Server — HA failover to standby | 60–120s DNS propagation delay during failover | Applications not using connection retry logic see errors during the HA switchover window |
| Citus — uneven tenant distribution (hotspot) | One worker shard saturated while others idle | All rows for a large tenant land on one shard; distribution column alone does not balance load |
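For the failover row in the table above, the usual mitigation is connection retry with backoff. The following is a minimal sketch, assuming a hypothetical `connect` callable supplied by the application. It catches `OSError` as a stand-in; a real driver raises its own exception type, which should be caught instead.

```python
import time


def connect_with_retry(connect, attempts=10, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call connect() until it succeeds, backing off exponentially between tries.

    With the defaults, the waits add up to about two and a half minutes,
    which is longer than a 60 to 120 second failover window.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except OSError:                 # stand-in for the driver's connection error
            if attempt == attempts:
                raise                   # out of attempts: surface the failure
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

The `sleep` parameter exists so the function can be exercised in tests without real waiting. Adding random jitter to each delay is a common refinement when many application instances reconnect at once.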
What to Do Next
- Problem: Choosing between Flexible Server and Citus after the schema is designed and populated is expensive — Citus requires a distribution-column-aware schema that cannot be retrofitted easily.
- Solution: Use Flexible Server as the default; evaluate Citus only when the workload is multi-tenant with tenant-scoped queries, write throughput exceeds what a single large SKU can sustain, or data volume per tenant is large enough to benefit from distributed storage.
- Proof: Benchmark your top write-intensive operations on the largest available Flexible Server SKU under expected peak load; if the primary CPU or WAL write throughput saturates, that is the signal that horizontal distribution is worth the schema redesign cost.
- Action: If you are building on Flexible Server, enable and configure PgBouncer this week, connect your application through port 6432, and verify prepared statement behavior — this is the most common production misconfiguration on Azure PostgreSQL.