Most applications exhaust their database long before the database is under load. The failure is not query pressure — it is connection pressure. Every new connection to PostgreSQL forks a backend process. Every new connection to MySQL spawns a thread. Without a pool capping that number, a traffic spike generates hundreds of OS-level resources in seconds, and the database runs out of capacity to accept connections before it runs out of capacity to execute queries.

Situation

Backend engineers know connection pools exist. Most frameworks configure one by default — SQLAlchemy, HikariCP, ActiveRecord, and similar libraries all ship with pool settings. The problem is that those library-level pools live inside a single application process. Scale to five app pods and you have five independent pools, each with their own ten connections: fifty total connections to the database. Scale to fifty pods and you have five hundred. Add a deployment rollout that starts new pods before draining old ones and the math gets worse fast.

This matters because databases have hard limits. PostgreSQL’s max_connections defaults to 100. MySQL’s defaults to 151. Those limits are not arbitrary — they map to real resource consumption per connection.

The Problem

PostgreSQL’s connection model, documented in the PostgreSQL Server Programming documentation, forks a new backend process for each client connection. Each backend process carries its own memory space — typically 5–10 MB per connection depending on work_mem settings and query state. One hundred connections means one hundred processes. At five hundred connections you are consuming several gigabytes of RAM just in process overhead before a single row is read.

MySQL uses a thread-per-connection model rather than processes, which reduces per-connection overhead, but the problem is structurally identical: threads consume stack space, file descriptors, and scheduler overhead. At high connection counts both systems degrade.

The acute failure mode is a connection storm: an app deployment or autoscale event brings up many new pods simultaneously, each opening their full pool. The database hits max_connections, new connection attempts queue or return errors, and the application starts logging “too many connections” at the moment it most needs to be available — during a traffic spike or recovery event. The database itself is not overloaded. It simply cannot accept new clients.

What is the right way to decouple application instance count from database connection count?

How Connection Poolers Work

A connection pooler sits between application processes and the database. Applications connect to the pooler, which maintains a fixed, smaller set of long-lived connections to the actual database. The application sees a normal database endpoint; the database sees a bounded number of backend processes regardless of how many application pods are running.

The two dominant tools are PgBouncer for PostgreSQL and ProxySQL for MySQL.

PgBouncer operates in three modes, documented in the PgBouncer documentation:

ModeHow it worksWhat breaks
Session modeOne server connection per client session; held for the life of the client connectionMinimal breakage; connection count reduction only happens if clients disconnect promptly
Transaction modeServer connection returned to pool after each transaction completesLISTEN/NOTIFY, advisory locks, prepared statements, and SET LOCAL state do not survive across transactions
Statement modeServer connection returned after each statementBreaks transactions; use only for simple read-only workloads

Transaction mode delivers the most aggressive multiplexing — a pooler with 20 server-side connections can service hundreds of application clients that are between transactions — but it breaks any feature that assumes state persists across transactions. PostgreSQL’s LISTEN/NOTIFY mechanism relies on a persistent server connection; in transaction mode the pooler may reassign that connection to another client between events. Advisory locks held at session scope are lost the moment the transaction commits. Applications using SET LOCAL to configure session parameters will find those settings gone after each transaction boundary.

ProxySQL applies the same multiplexing principle to MySQL, with additional query routing capabilities (read-write splitting, rule-based routing) that make it common in MySQL environments with replicas. Its connection pool size is configured independently of the application-side connection settings.

The practical deployment pattern is to configure application connection pools small (3–5 connections per pod) so the pooler remains the single point of configuration, and set the pooler’s server-side pool to a number the database can sustain — typically 20–50% of max_connections, leaving headroom for administrative connections and monitoring.

In Practice

The PostgreSQL project documents the process-per-connection model explicitly, and the PgBouncer FAQ describes the transaction mode tradeoffs in detail, noting that applications must be verified compatible before enabling it.

The Heroku Postgres team published guidance on PgBouncer in transaction mode specifically because Heroku’s platform runs many small dynos each with their own application process — exactly the multi-pod scaling problem described above. Their tooling, pgbouncer-heroku, emerged from the documented operational reality that a modest Heroku app on ten dynos could exhaust a standard PostgreSQL max_connections without any pooler in place.

The documented pattern from the PgBouncer project itself is: use session mode as a starting point when application compatibility is uncertain, verify that no LISTEN/NOTIFY or advisory lock usage exists, then migrate to transaction mode for maximum multiplexing.

Where It Breaks

ScenarioWhat breaksWhy
Transaction mode with LISTEN/NOTIFYNotifications are never received or delivered to the wrong clientThe pooler reassigns server connections between events; the persistent channel the listener expects does not exist
Pool exhaustion under burstsNew client connections are queued or rejected by the pooler itselfThe pooler’s server-side pool is also bounded; if all server connections are busy, clients wait or time out
Health check connections consuming pool slotsLiveness probes open a connection and close it repeatedly, consuming pool capacityHealth checks should connect to the pooler’s stats port or use a single persistent probe connection rather than opening fresh database connections

What to Do Next

  • Problem: Without a standalone pooler, application pod count directly drives database connection count — a deployment event can exhaust max_connections before the database processes a single query.
  • Solution: Deploy PgBouncer (PostgreSQL) or ProxySQL (MySQL) as a sidecar or dedicated service; configure application pools to 3–5 connections per pod; set the pooler’s server pool to a fraction of max_connections.
  • Proof: After deploying the pooler, run SELECT count(*) FROM pg_stat_activity during a load test — the number should stay flat as application replicas scale, rather than increasing proportionally.
  • Action: This week, check your current connection count and compare it to your max_connections setting; if you are above 60% of the limit without a pooler, that is the gap to close first:
-- Connection count by state
SELECT count(*), state
FROM pg_stat_activity
GROUP BY state;

-- Show the configured limit
SHOW max_connections;