Per-App Postgres on Kubernetes Changes the Failure Boundary

Per-application PostgreSQL does not make databases easier to operate; it makes the failure boundary smaller and the operating contract larger. The trade is worth considering only when the platform can prove that every declared database can fail over, rotate credentials, archive WAL, restore into a clean namespace, and survive Kubernetes maintenance without relying on tribal memory.

Situation

The long-standing platform default is a shared managed PostgreSQL cluster hosting many application databases. It is efficient, familiar, and often the right answer. It also couples teams through change windows, noisy neighbors, backup policy, major-version lifecycle, and shared operational risk.

The newer pattern is one PostgreSQL cluster per application, declared in Git and reconciled by a Kubernetes operator such as CloudNativePG. That changes what the platform owns. The platform is no longer only offering “a database”; it is offering a repeatable database lifecycle.

Default model	Alternative model	What changes
One shared managed PostgreSQL cluster, many databases	One CloudNativePG cluster per application	Failure moves from shared infrastructure to per-service blast radius
Central database administrator controls change windows	GitOps declares database intent per service	Review moves into pull requests, admission policy, and runbooks
Backups and upgrades handled at the shared cluster level	Backups and upgrades handled per cluster	More isolation, more fleet operations
Credentials and connectivity are centrally managed	Secrets are synchronized into each namespace	Rotation becomes an end-to-end workflow, not a secret-store update
Database operations are concentrated in a few large systems	Database operations are repeated across many smaller systems	Templates, policy, alerts, and restore drills become the product

CloudNativePG makes this viable because PostgreSQL becomes a Kubernetes custom resource. Argo CD can reconcile the database intent from Git. External Secrets Operator can pull credentials from Azure Key Vault or another external store into Kubernetes Secrets. Kustomize overlays can keep environment differences explicit.

That is a strong architecture. It is not managed-database simplicity with YAML in front of it.

The Problem

The operator can create the cluster. That is the least interesting part.

The production question is whether the database survives the ordinary failures: node drains, bad migrations, storage latency, broken WAL archiving, stale credentials, object-store access errors, version drift, and emergency changes made while GitOps is still reconciling the old state.

Failure point	What breaks	Why it matters
Shared cluster migrations	One application’s migration can saturate I/O, bloat catalogs, or hold locks visible to unrelated tenants	Per-database isolation inside one PostgreSQL instance is not operational isolation
GitOps self-healing	Argo CD reverts manual emergency changes to the Git-declared state when `selfHeal: true` is enabled	Incident response needs a documented reconciliation pause; with self-heal enabled, Argo CD re-syncs drifted resources after a self-heal timeout that defaults to 5 seconds (Argo CD docs)
Backup configuration	WAL archives exist, but the physical base backup is missing, stale, or unrecoverable	CloudNativePG’s docs warn that a WAL archive alone is not a restore strategy (CloudNativePG backup docs)
Kubernetes storage	PostgreSQL restarts cleanly, but the StorageClass has poor latency, weak snapshot behavior, or unsafe reclaim defaults	A database operator cannot paper over unreliable persistent volume semantics
Secret rotation	External Secrets updates a Kubernetes Secret, but PostgreSQL roles and application connection pools keep using old credentials	Secret synchronization is not end-to-end credential rotation
Version drift	A manifest copied from an older CloudNativePG example keeps working until the operator lifecycle changes	Starting with CloudNativePG 1.26, backup and recovery capabilities are moving toward CNPG-I plugins, so backup templates need version review (CloudNativePG backup docs)

The right question is not “can Kubernetes run PostgreSQL?” It can. The better question is: what operational boundary are you buying, and what repeated work are you accepting for every application database?

Architecture Problem

The shared database model and the per-application database model solve different coordination problems. In the shared model, operational consistency is achieved at the cost of coupling. In the per-application model, coupling is removed at the cost of operational repetition.

The architectural problem is not technical feasibility. Kubernetes can schedule PostgreSQL pods. CloudNativePG can declare a cluster as a custom resource. Argo CD can reconcile it from Git. External Secrets Operator can synchronize credentials into namespaces. These mechanisms are documented and widely deployed.

The actual architectural problem is: which operational concerns can be automated once at the platform layer, and which must be repeated per database — and is the platform mature enough to absorb the repetition safely?

The failure mode of the shared model is coupling: one application’s migration, bloat, or connection saturation affects every tenant of the cluster. The failure mode of the per-application model is multiplication: every new database adds backup monitoring, restore verification, credential rotation, upgrade planning, and failover testing. If these are not templated, tested, and owned by platform tooling, the per-application model exchanges shared risk for invisible risk.

Design Options

Three options are in common use, and each distributes risk and work differently.

Option	Description	Coupling risk	Multiplication risk	Recommended for
Shared managed cluster	One cloud-managed PostgreSQL cluster hosts many application databases; DBA team or cloud provider owns operations	High — shared change windows, noisy neighbors, shared version lifecycle	Low — operations are centralized	Teams early in database operational maturity; stable workloads without strict isolation requirements
Per-app PostgreSQL, manual management	Each application gets a dedicated cloud-managed database instance; teams manage their own backups, creds, and versions	Low — isolated failure boundary	High — no shared templates, policy, or tooling	Teams that need isolation but cannot invest in a Kubernetes-native platform
Per-app PostgreSQL via operator (CloudNativePG + GitOps)	Kubernetes operator reconciles PostgreSQL clusters from Git; external secrets, backups, monitoring, and failover are declared resources	Low — each application cluster is independent	Medium — operator and templates absorb repetition, but restore drills and upgrade testing must still run per cluster	Teams with mature Kubernetes platform capability and willingness to own the database lifecycle

Option A, the shared managed cluster, should remain the default until coupling failure modes are actively limiting teams. The argument for per-app databases should be made from incident reports and blocking dependencies, not from a preference for patterns.

Option B, per-app PostgreSQL under manual management, increases operational isolation without a shared template layer. Teams that choose this option often discover that they have recreated the shared-cluster problem in a distributed form: many databases with inconsistent backup policies, no shared restore testing, and no centralized visibility into credential expiry or disk saturation.

Option C, per-app PostgreSQL via an operator, is the strongest option when the platform investment has been made. CloudNativePG provides a consistent operator lifecycle, standardized service semantics, and Prometheus integration. GitOps provides audit history, review gates, and reconciliation. External Secrets Operator provides automated credential synchronization. The platform team owns the templates, admission policy, and restore drill cadence. Application teams declare their database intent and trust the platform to handle the lifecycle correctly.

Tradeoff Matrix

Dimension	Shared managed cluster	Per-app managed instances	Per-app operator (CloudNativePG)
Failure blast radius	Shared across all tenants	Per application	Per application
Noisy neighbor risk	High	None	Low (clusters still share Kubernetes nodes and storage backends unless isolated)
Operational repetition	Low	High	Medium — templates absorb most repetition
Backup and restore	Centralized, consistent	Per-team, inconsistent without tooling	Per-cluster, consistent if platform owns templates
Credential rotation	Central secret store	Per-instance manual or scripted	External Secrets + per-cluster runbook
Version upgrades	Scheduled at cluster level	Per-instance, team-owned	Per-cluster, GitOps-managed
GitOps compatibility	External to database	External to database	Native — cluster is a Kubernetes custom resource
Restore drill burden	One drill for shared cluster	One drill per instance	One drill per cluster, driven by a shared template
Platform investment	Low	Low	High — operator lifecycle, policy, monitoring, templates

Core Concept: Per-App PostgreSQL as a Declared Failure Boundary

A per-application PostgreSQL cluster works when the platform treats the database manifest as an operating contract, not a deployment snippet.

flowchart TD
    Dev[developer commit] --> Git[Git repository — apps and databases]
    Git --> Argo[Argo CD — reconcile desired state]
    Argo --> App[application namespace]
    Argo --> CNPGCluster[CloudNativePG Cluster resource]
    KeyVault[external secret store] --> ESO[External Secrets Operator]
    ESO --> K8sSecret[Kubernetes Secret]
    K8sSecret --> App
    K8sSecret --> CNPGCluster
    CNPG[CloudNativePG operator] --> Primary[PostgreSQL primary]
    CNPG --> ReplicaA[PostgreSQL replica]
    CNPG --> ReplicaB[PostgreSQL replica]
    App --> RWService[cluster rw service]
    RWService --> Primary
    Primary --> WAL[WAL archive in object storage]
    ReplicaA --> WAL
    ReplicaB --> WAL
    Backup[scheduled base backup] --> ObjectStore[object storage recovery boundary]

CloudNativePG creates service endpoints for each cluster: rw points to the current primary, ro points to replicas when available, and r can point to any instance. The rw service is essential and cannot be disabled because CloudNativePG relies on it for PostgreSQL replication behavior (CloudNativePG service docs). Application write traffic should use the generated *-rw service unless there is a deliberately tested routing layer in front of it.

A production-grade manifest should look less like a tutorial and more like a contract:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: linkding-db-prod
  labels:
    app.kubernetes.io/name: linkding
    platform.example.com/owner: bookmarks
    platform.example.com/tier: production
spec:
  instances: 3
  imageName: ghcr.io/cloudnative-pg/postgresql:16.4

  storage:
    size: 100Gi
    storageClass: premium-rwo

  resources:
    requests:
      cpu: "500m"
      memory: 2Gi
    limits:
      memory: 4Gi

  monitoring:
    enablePodMonitor: true

  bootstrap:
    initdb:
      database: linkding
      owner: linkding
      secret:
        name: linkding-db-owner

  backup:
    barmanObjectStore:
      destinationPath: https://example.blob.core.windows.net/postgres/linkding
      azureCredentials:
        storageAccount:
          name: linkding-backup-creds
          key: storage-account
        storageSasToken:
          name: linkding-backup-creds
          key: sas-token
      wal:
        compression: gzip
      data:
        compression: gzip
    retentionPolicy: 14d
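
The manifest above declares where backups are stored, but not when a physical base backup is taken. A minimal sketch of a schedule follows, assuming the in-tree `barmanObjectStore` configuration shown above; the resource name and the 02:00 schedule are illustrative assumptions, not part of the original reference deployment.

```yaml
# Sketch only: the name and schedule are illustrative assumptions.
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: linkding-db-prod-daily   # hypothetical name
  namespace: linkding-prod
spec:
  # CloudNativePG uses a six-field cron expression; the first field is seconds.
  schedule: "0 0 2 * * *"
  # Take one backup when the schedule is created instead of waiting for the first run.
  immediate: true
  # Tie the lifetime of generated Backup objects to this ScheduledBackup.
  backupOwnerReference: self
  cluster:
    name: linkding-db-prod
```

Without a recent base backup, the WAL archive and the 14-day retention policy have nothing to anchor a restore to.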

The contract is not complete until it has tests.

Split day-0 infrastructure from day-2 database intent.

Install CloudNativePG, External Secrets Operator, Argo CD, monitoring CRDs, admission policy, namespaces, and storage classes through Terraform or another cluster-admin workflow. Application repositories should declare database intent, not own operator installation.

Verification:

kubectl auth can-i create clusters.postgresql.cnpg.io -n linkding-prod   # expect: yes
kubectl auth can-i update deployment/cloudnative-pg -n cnpg-system       # expect: no
kubectl auth can-i patch storageclass/premium-rwo                        # expect: no

The expected shape is narrow: application delivery can create its own Cluster resource in its namespace, but cannot modify the operator deployment, cluster-wide secret stores, or storage classes.

Make policy enforce the minimum contract.

For production clusters, reject manifests that omit ownership labels, resource requests, monitoring, backup configuration, explicit storage class, or a three-instance topology.

A CI or admission rule should fail a manifest like this:

spec:
  instances: 1
  storage:
    size: 5Gi

The exact policy engine is less important than the invariant. Kyverno, OPA Gatekeeper, Conftest, or a custom CI check can all work. The point is to stop “temporary” database YAML from becoming production state.
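
As one possible shape for that invariant, here is a hedged Kyverno sketch. The policy and rule names are hypothetical, the label keys are the ones used in the manifest earlier, and the exact fields (for example `validationFailureAction`) should be checked against the Kyverno version in use.

```yaml
# Sketch only: policy and rule names are hypothetical.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: cnpg-production-minimum-contract
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: require-production-database-contract
      match:
        any:
          - resources:
              kinds:
                - postgresql.cnpg.io/v1/Cluster
              selector:
                matchLabels:
                  platform.example.com/tier: production
      validate:
        message: >-
          Production PostgreSQL clusters need an owner label, at least three
          instances, an explicit storage class, resource requests, monitoring,
          and a backup destination.
        pattern:
          metadata:
            labels:
              platform.example.com/owner: "?*"   # any non-empty value
          spec:
            instances: ">=3"
            storage:
              storageClass: "?*"
            resources:
              requests:
                cpu: "?*"
                memory: "?*"
            monitoring:
              enablePodMonitor: true
            backup:
              barmanObjectStore:
                destinationPath: "?*"
```

This sketch only matches clusters that carry the production tier label, so a manifest that drops the label escapes it. Matching on namespace, or adding a companion rule that requires the tier label, would close that gap.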

Route applications through the CloudNativePG read-write service.

Do not hardcode pod names. Do not point applications at ordinal 0. Do not teach application teams that the first pod is the primary. In a failover, the application needs the service abstraction to follow the writable instance.
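
A minimal sketch of what that looks like in application configuration, assuming a hypothetical ConfigMap and hypothetical `DB_*` keys (the application's real variable names will differ). Only the `<cluster-name>-rw` service name comes from CloudNativePG.

```yaml
# Sketch only: the ConfigMap name and the DB_* keys are hypothetical.
apiVersion: v1
kind: ConfigMap
metadata:
  name: linkding-db-connection
  namespace: linkding-prod
data:
  # <cluster-name>-rw resolves to the current primary, including after a failover.
  DB_HOST: linkding-db-prod-rw.linkding-prod.svc
  DB_PORT: "5432"
  DB_NAME: linkding
```

Credentials stay in the synchronized Secret; only the stable service hostname belongs in plain configuration.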

Verification:

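# Record the current primary before the test.
kubectl -n linkding-prod get cluster linkding-db-prod \
  -o jsonpath='{.status.currentPrimary}{"\n"}'

# Delete the primary pod to force the operator to react.
kubectl -n linkding-prod delete pod "$(kubectl -n linkding-prod get cluster linkding-db-prod \
  -o jsonpath='{.status.currentPrimary}')"

# Wait for the cluster to report Ready again. If this returns immediately,
# the condition may not have flipped yet; re-check after a few seconds.
kubectl -n linkding-prod wait cluster/linkding-db-prod \
  --for=condition=Ready \
  --timeout=300s

# Confirm which instance is primary after the disruption.
kubectl -n linkding-prod get cluster linkding-db-prod \
  -o jsonpath='{.status.currentPrimary}{"\n"}'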

Then verify the application can still write through the same hostname:

create table if not exists platform_failover_probe (
  id bigserial primary key,
  observed_at timestamptz not null default now()
);

insert into platform_failover_probe default values;
select count(*) from platform_failover_probe;

A changed primary is not enough. The application write must succeed without changing connection strings.

Prove recovery before calling the platform production-ready.

CloudNativePG can archive WAL to object storage and recover from physical backups. For Barman object-store backups, current CloudNativePG docs say the operator sets archive_timeout to 5min by default, giving a deterministic time-based RPO boundary for low-write workloads (CloudNativePG object-store backup docs). That boundary is meaningful only after restore has been tested.

Verification:

kubectl -n linkding-prod apply -f - <<'YAML'
apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: linkding-manual-restore-drill
spec:
  cluster:
    name: linkding-db-prod
YAML

kubectl -n linkding-prod get backup linkding-manual-restore-drill

A restore drill should create a new namespace, restore from object storage, run application migrations against the restored database, and record observed RTO and RPO. The output should be boring enough to put in a runbook:

Drill field	Recorded value
Backup identifier	Exact backup object or CloudNativePG backup name
Restore namespace	Isolated namespace name
Restore start time	Timestamp
Application migration result	Pass or fail
Observed RTO	Measured duration
Observed RPO	Last committed test row recovered
Operator version	CloudNativePG version
PostgreSQL image	Exact image tag
StorageClass	Exact class
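
The restore step of such a drill might be declared roughly as follows. This is a sketch under assumptions: the namespace and cluster name are hypothetical, the object-store settings mirror the production manifest earlier, and the `linkding-backup-creds` Secret is assumed to have been synchronized into the drill namespace.

```yaml
# Sketch only: namespace, cluster name, and sizing are illustrative assumptions.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: linkding-db-restore-drill
  namespace: linkding-restore-drill
spec:
  instances: 1   # a drill measures recovery, not HA
  imageName: ghcr.io/cloudnative-pg/postgresql:16.4
  storage:
    size: 100Gi
    storageClass: premium-rwo
  bootstrap:
    recovery:
      source: linkding-db-prod
      # Optional point-in-time target; omit to replay all archived WAL.
      # recoveryTarget:
      #   targetTime: "<timestamp just after the last test row>"
  externalClusters:
    - name: linkding-db-prod
      barmanObjectStore:
        destinationPath: https://example.blob.core.windows.net/postgres/linkding
        serverName: linkding-db-prod
        azureCredentials:
          storageAccount:
            name: linkding-backup-creds
            key: storage-account
          storageSasToken:
            name: linkding-backup-creds
            key: sas-token
```

The drill cluster deliberately has no `backup` section, so it cannot write into the production archive path while it is being verified.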

Make GitOps incident-aware.

Automated pruning and self-healing are useful until an incident commander needs to patch a live object. Argo CD automated sync does not prune by default; pruning and self-healing are explicit settings (Argo CD docs). Database resources need operational rules around those settings.
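
For reference, those settings live under `spec.syncPolicy.automated` on the Argo CD Application. The sketch below assumes a hypothetical repository URL, path, and project; only the `syncPolicy` fields are the point.

```yaml
# Sketch only: repoURL, path, and project are hypothetical.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: linkding-db-prod
  namespace: argocd
spec:
  project: databases
  source:
    repoURL: https://git.example.com/platform/databases.git
    targetRevision: main
    path: overlays/prod/linkding
  destination:
    server: https://kubernetes.default.svc
    namespace: linkding-prod
  syncPolicy:
    automated:
      prune: true      # resources removed from Git are deleted from the cluster
      selfHeal: true   # manual drift is reverted; pause this during incidents
```

Where pruning a database would be unacceptable, Argo CD also supports a per-resource `argocd.argoproj.io/sync-options: Prune=false` annotation on the Cluster object.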

Verification:

argocd app set linkding-db-prod --sync-policy none

kubectl -n linkding-prod annotate cluster linkding-db-prod \
  incident.example.com/reconciliation-paused="$(date -u +%Y-%m-%dT%H:%M:%SZ)"

# Apply the emergency change, then commit the final desired state back to Git.

argocd app set linkding-db-prod --sync-policy automated --self-heal --auto-prune
argocd app sync linkding-db-prod

The runbook should say who can pause reconciliation, how the change is recorded, and how drift is reconciled afterward.

Monitor the database fleet, not just one cluster.

CloudNativePG provides predefined metrics and Prometheus integration. A PodMonitor for a cluster can be created by setting .spec.monitoring.enablePodMonitor: true, and CloudNativePG publishes Grafana dashboard material for the operator and clusters (CloudNativePG monitoring docs, Grafana dashboard).

Per-application databases multiply alert surfaces. That is acceptable only if ownership is encoded.

Minimum alert classes:

Alert class	Why it matters
Replication lag	Failover safety depends on replicas being current enough for the workload
Failed WAL archiving	PITR depends on the archive, not only the running pods
Backup age	A configured backup policy can still fail silently
Disk saturation	PostgreSQL availability usually fails gradually before it fails completely
Failover events	The application may need connection-pool and retry validation after promotion
Certificate or secret expiry	A synchronized Secret does not prove clients are using it correctly
External Secrets sync errors	The Kubernetes Secret can drift from the external source
Object-store errors	Restore readiness depends on credentials, network path, and storage availability
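
Three of these alert classes can be sketched as a PrometheusRule. This assumes the Prometheus Operator is installed; the rule names, namespace, and thresholds are placeholders, and the metric names should be verified against what the operator version in use actually exports before relying on them.

```yaml
# Sketch only: thresholds are placeholders, and metric names must be checked
# against the metrics exposed by the installed CloudNativePG version.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cnpg-fleet-minimum-alerts
  namespace: monitoring
spec:
  groups:
    - name: cnpg-fleet
      rules:
        - alert: PostgresReplicationLagHigh
          expr: cnpg_pg_replication_lag > 300
          for: 5m
          labels:
            severity: warning
        - alert: PostgresWalArchivingFailing
          # The most recent archive failure is newer than the most recent success.
          expr: (cnpg_pg_stat_archiver_last_failed_time - cnpg_pg_stat_archiver_last_archived_time) > 1
          for: 5m
          labels:
            severity: critical
        - alert: PostgresBackupTooOld
          # No successful base backup in the last two days.
          expr: time() - cnpg_collector_last_available_backup_timestamp > 86400 * 2
          for: 15m
          labels:
            severity: critical
```

Because the rules are written once against fleet-wide metrics, routing by namespace or owner label is what keeps each alert attached to a team.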

In Practice

The documented pattern is not “Kubernetes makes databases easy.” The documented pattern is “Kubernetes gives the operator a control plane, and the operator still depends on PostgreSQL, storage, object storage, secrets, and reconciliation semantics behaving correctly.”

The strongest public warning is GitLab’s January 31, 2017 database outage. It was not a Kubernetes incident, and it should not be misrepresented as one. Its relevance is narrower and more useful: GitLab’s public postmortem shows how PostgreSQL HA, replication, snapshots, dumps, and restore procedures can all look plausible until the one day they are needed together.

GitLab reported accidental removal of data from the primary database, replication already propagating the damage, missing pg_dump backups caused by a PostgreSQL client version mismatch, backup failure notifications that were not reaching operators, and a restore path bottlenecked by slow disk transfer from a staging snapshot (GitLab postmortem). The public incident summary also noted that a six-hour-old backup was used and database changes in that window were lost (GitLab incident update).

The lesson for CloudNativePG is not that Kubernetes would have prevented the incident. It would not automatically do that. The lesson is that database resilience is a chain:

flowchart TD
    Write[application write] --> WAL[WAL generated]
    WAL --> Archive[WAL archived]
    Data[database files] --> BaseBackup[physical base backup]
    Archive --> Restore[restore procedure]
    BaseBackup --> Restore
    Restore --> AppCheck[application migration and read write check]
    AppCheck --> Evidence[recorded RTO and RPO]

If any link is assumed rather than tested, the platform is carrying hidden risk.

Evidence type	Public mechanism	Production implication
GitLab public postmortem	Backup jobs failed because the wrong PostgreSQL client version was used, and failure notifications were not reaching operators (GitLab postmortem)	Backup configuration must be verified by restore tests and alert delivery, not only scheduled jobs
GitLab restore behavior	Restore was constrained by the available snapshot and storage transfer path (GitLab postmortem)	RTO depends on data size, object-store throughput, volume performance, and the restore procedure
CloudNativePG service behavior	CloudNativePG documents `rw`, `ro`, and `r` services, with `rw` pointing to the primary and being non-disableable (service docs)	Application failover depends on using the service, not pod identity
CloudNativePG backup behavior	CloudNativePG documents WAL archiving, physical base backups, PITR, and warns that WAL alone cannot restore a cluster (backup docs)	Backup success is not restore readiness
CloudNativePG object-store behavior	CloudNativePG documents a default `archive_timeout` of `5min` for Barman object-store WAL archiving (object-store backup docs)	Low-write workloads still need explicit RPO measurement and restore validation
Argo CD reconciliation	Argo CD documents automated prune, self-heal, sync semantics, and rollback limits under automated sync (auto-sync docs)	Database emergency operations need a GitOps pause and resume procedure
External Secrets refresh	External Secrets Operator documents `CreatedOnce`, `Periodic`, and `OnChange` refresh policies; `Periodic` updates the Kubernetes Secret on `refreshInterval` (ExternalSecret API docs)	Secret rotation must include application reload and PostgreSQL role behavior
Kubernetes disruption behavior	Kubernetes distinguishes voluntary and involuntary disruptions and notes that not all voluntary disruptions are constrained by PodDisruptionBudgets (Kubernetes docs)	Node drain, pod deletion, node loss, and storage failure are separate tests

I have not run this exact Linkding-style reference deployment at production scale personally. The documented mechanics are still enough to draw the boundary: a three-instance PostgreSQL cluster can fail over correctly at the Kubernetes object level while the user-visible service still fails because the application pinned stale connections, the volume layer stalled, External Secrets rotated a value no process reloaded, WAL archiving failed unnoticed, or Argo CD reverted an emergency patch.

That is why the proof must be operational, not visual. A green Argo CD dashboard proves convergence. It does not prove recoverability. A promoted replica proves one HA path. It does not prove connection-pool behavior, restore speed, backup freshness, or data-loss bounds.

Where It Breaks

Failure mode	Trigger	Fix
Correlated downtime across replicas	Kubernetes schedules PostgreSQL instances onto nodes sharing the same failure domain	Require topology spread constraints, node affinity, and anti-affinity across zones or node pools
False confidence from HA	Primary pod deletion succeeds, but storage-zone failure or object-store outage was never tested	Run separate drills for pod deletion, node drain, node loss, storage latency, and restore from object storage
Backup drift across CloudNativePG versions	Templates depend on older `barmanObjectStore` examples while the operator lifecycle moves toward CNPG-I plugins from 1.26 onward	Pin operator versions, maintain upgrade notes, and test backup plus restore for every operator upgrade
GitOps conflicts with emergency repair	`selfHeal: true` reapplies Git state after manual database-related Kubernetes changes	Document Argo CD suspension, require incident annotations, and reconcile the final state back into Git
Secret rotation only updates Kubernetes	External Secrets updates the Secret, but PostgreSQL connections remain open with old credentials	Use explicit rotation runbooks: create new role secret, restart or reload clients, verify new logins, then revoke the old role
Write traffic hits the wrong endpoint	Application points its write path at `r`, which appears to work whenever it happens to reach the primary, or at `ro`, which rejects writes	Standardize environment variables and policy checks so write paths use only `*-rw`
Cost expands quietly	Every service gets PostgreSQL pods, persistent volumes, backups, metrics, and alerts	Define tiers: production HA, staging reduced HA, ephemeral development, and explicit cost labels
Inconsistent fleet operations	One-off manifests diverge across teams	Generate manifests from reviewed templates and enforce policy with Kyverno, OPA Gatekeeper, or CI checks
Restore exceeds incident budget	PITR exists in theory, but base backup size, object-store throughput, and migration replay time were never measured	Record RTO and RPO during scheduled restore drills, then publish them with the service SLO
Kubernetes maintenance causes failover churn	Node drains evict database pods without a maintenance strategy	Use PodDisruptionBudgets, maintenance windows, topology constraints, and CloudNativePG-aware drain procedures
Backup alerts are too shallow	The backup job exits successfully, but restore would fail because credentials, object paths, or versions drifted	Alert on backup age and WAL archive failures, then run scheduled restore verification into a clean namespace
Application retry behavior is untested	PostgreSQL primary changes while clients hold old sessions	Test failover through the real application path, including connection pool settings and transaction retry behavior
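
For the first row of the table, CloudNativePG exposes scheduling controls on the Cluster itself. A minimal fragment, assuming nodes carry the standard zone label and that at least three zones are available:

```yaml
# Sketch only: fragment of a Cluster spec, not a complete manifest.
spec:
  instances: 3
  affinity:
    enablePodAntiAffinity: true
    # Spread instances across zones rather than only across nodes.
    topologyKey: topology.kubernetes.io/zone
    # "required" refuses to co-locate instances; "preferred" is best effort.
    podAntiAffinityType: required
```

With `required`, an instance stays Pending if no eligible zone is free, which is usually the safer failure than silently stacking replicas in one zone.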

What to Do Next

Problem: Per-application PostgreSQL reduces blast radius, but multiplies operational surfaces across storage, backup, monitoring, secrets, upgrades, GitOps, and cost.
Solution: Build a database platform contract around CloudNativePG manifests, admission policy, restore drills, and incident-aware reconciliation.
Proof: A valid proof creates a cluster from Git, writes test data, kills the primary, confirms application writes through *-rw, rotates credentials, restores from object storage into a clean namespace, and records observed RTO and RPO.
Action: This week, add CI or admission checks for instances >= 3, backup configuration, monitoring enabled, resource requests, owner labels, explicit storage class, and no plaintext Secret manifests.

A per-application database is not a smaller managed service. It is a sharper failure boundary. Use it when the platform is prepared to test the edge.

Rajiv
