Cloud SQL for PostgreSQL handles most managed database workloads on GCP correctly, but teams that hit analytical query performance ceilings or need HTAP capabilities discover they should have evaluated AlloyDB before the schema was in production.

Situation

Google offers two managed PostgreSQL services on GCP: Cloud SQL and AlloyDB. Cloud SQL is the established managed PostgreSQL (and MySQL, SQL Server) offering with straightforward HA, backups, and read replicas. AlloyDB is a Google-developed PostgreSQL-compatible database that separates compute from storage using a distributed storage layer, adds an adaptive adaptive columnar cache, and supports read pool instances that can run both OLTP and analytical queries against the same data.

AlloyDB became generally available in May 2023. Most GCP teams deploying PostgreSQL choose Cloud SQL as the default path and only encounter AlloyDB when they are researching options or hitting specific performance limits.

The Problem

Cloud SQL for PostgreSQL is a managed PostgreSQL instance with HA standby and read replicas. It scales vertically. The limiting pattern: as analytical query volume grows alongside OLTP traffic, the primary instance saturates on CPU, and read replicas lag under heavy read load — because they are executing the same row-scan-based queries that the primary executes. Adding read replicas distributes read connections but not the per-query execution cost.

AlloyDB’s design addresses a different bottleneck. For OLAP-style queries (aggregations, wide scans, joins across large tables), AlloyDB’s columnar cache stores frequently accessed columns in a compressed columnar format in memory, separate from the row-store. The query engine uses the columnar representation when it is faster, without requiring the application to target a separate analytical store. This is what Google means by HTAP — both OLTP and analytical queries run against the same PostgreSQL-compatible interface, with the storage engine selecting the execution path.

The core question: does the workload contain a meaningful volume of analytical queries running against live OLTP data, and is Cloud SQL’s execution performance the actual bottleneck?

AlloyDB vs Cloud SQL Architecture

flowchart TD
    A[PostgreSQL workload on GCP] --> B{Workload shape?}
    B -->|standard OLTP — transactional reads and writes| C[Cloud SQL — managed single-primary]
    B -->|mixed OLTP and analytical queries on same data| D{Is Cloud SQL CPU the bottleneck?}
    D -->|no — query volume is moderate| C
    D -->|yes — analytical queries saturating primary or replicas| E[AlloyDB — columnar cache — HTAP]
    C --> F[HA standby — read replicas — automatic backups]
    E --> G[Primary — read pool instances — columnar cache — distributed storage]

Cloud SQL for PostgreSQL

Cloud SQL provides a managed PostgreSQL instance with:

  • High availability via a synchronous standby in a secondary zone; Google documents zonal failover typically completing in under 60 seconds with automatic IP cutover (Cloud SQL HA)
  • Read replicas in the same or different regions (asynchronous replication)
  • Automatic backups and point-in-time recovery up to the retention window
  • Private IP, VPC peering, and Cloud SQL Auth Proxy for secure connectivity
  • Maintenance windows with configurable timing

Cross-region disaster recovery with Cloud SQL uses cross-region read replicas. Google documents these as asynchronous, meaning a regional failure can result in data loss equal to replication lag at the moment of failure. Replica promotion is a manual operation (Cloud SQL DR).

AlloyDB for PostgreSQL

AlloyDB separates PostgreSQL compute from storage:

  • The primary instance handles writes; the storage layer is distributed across Google’s infrastructure, replicating synchronously across zones within the region
  • Read pool instances share the same storage layer as the primary — there is no replication lag for reads because read pool instances read directly from the shared distributed storage
  • The adaptive columnar cache stores frequently accessed column data in memory on read pool instances and the primary; the query engine selects columnar or row-store execution per query
  • Google documents AlloyDB storage as synchronously replicated within the region; the storage tier handles I/O and durability independently of compute

AlloyDB is PostgreSQL-compatible at the protocol level. Standard PostgreSQL drivers, pgAdmin, and most tools that connect to PostgreSQL connect to AlloyDB without modification. Extensions that depend on specific storage internals may behave differently.

In Practice

Google’s AlloyDB documentation describes the columnar cache as an adaptive structure — the database populates it based on query patterns without requiring explicit configuration (AlloyDB columnar engine). The engine analyzes which columns are accessed frequently by scan-heavy queries and promotes them into the columnar representation. This is distinct from creating a materialized view or a separate analytical table: the data source is the same live table; the storage representation changes based on access patterns.

The documented design consequence is that AlloyDB read pool instances can satisfy analytical queries from the columnar cache without adding lag from replication — because they read from the same distributed storage layer as the primary rather than applying a WAL stream. Cloud SQL read replicas apply WAL asynchronously; under heavy write load, replication lag can grow, making replica reads stale for time-sensitive analytics.

Migration from Cloud SQL to AlloyDB uses the Database Migration Service. Google documents that DMS supports online migrations from Cloud SQL for PostgreSQL to AlloyDB with minimal downtime using logical replication (DMS AlloyDB migration). Schema-level PostgreSQL extensions used in Cloud SQL that are not supported in AlloyDB require application changes before migration. The AlloyDB documentation lists supported extensions; notably, some PostGIS and pg_partman functionality may require version verification.

AlloyDB costs more than Cloud SQL at equivalent compute sizes. Google’s pricing for AlloyDB reflects the separate storage layer billing model — storage is billed per GB regardless of instance size, and read pool instances add compute cost beyond the primary. For workloads where Cloud SQL’s row-store execution is adequate, AlloyDB’s additional cost produces no measurable benefit.

Where It Breaks

ScenarioWhat breaksWhy
AlloyDB — columnar cache cold on startupAnalytical queries revert to row-store performance until cache warmsCache is populated from query patterns; a restarted instance has no cached columns initially
AlloyDB — extension dependency not supportedMigration blocked or application behavior changesAlloyDB does not support all PostgreSQL extensions available in Cloud SQL; verify before migrating
Cloud SQL cross-region replica — regional failoverManual promotion, potential data loss equal to replication lagCross-region replicas are asynchronous; no automatic promotion to primary
AlloyDB — write-heavy workload with no analytical queriesCost increase with no performance benefitThe columnar cache and read pool architecture only benefit mixed or analytical workloads
Cloud SQL — analytical query on primary during peak OLTPCPU saturation affects write latencyRow-store execution for wide scans competes with OLTP for CPU; no separate execution path
AlloyDB — connection to read pool for write operationsWrite rejectedRead pool instances are read-only; writes must target the primary endpoint

What to Do Next

  • Problem: Cloud SQL’s row-store execution handles OLTP well but has no separate code path for analytical queries, meaning mixed workloads compete for the same CPU on primary and replicas.
  • Solution: Evaluate AlloyDB when analytical queries represent a meaningful share of query volume, Cloud SQL CPU is the bottleneck during analytical load, and the workload runs in a single GCP region (AlloyDB does not currently support cross-region reads with the shared storage model).
  • Proof: Run EXPLAIN ANALYZE on the three slowest analytical queries in Cloud SQL and measure CPU time; if the bottleneck is scan and aggregation (not I/O or lock contention), AlloyDB’s columnar cache addresses the actual bottleneck.
  • Action: Before committing to AlloyDB, verify that all PostgreSQL extensions in use are supported by AlloyDB and budget for the cost differential; if the workload is exclusively transactional with no wide-scan analytics, Cloud SQL remains the correct choice.