GCP Database Cost Review: Cloud SQL, Spanner, Bigtable, Memorystore, and BigQuery
Database cost failures rarely start with a bad price sheet; they start when every workload gets treated like the same workload with a different product name.
Situation
Most GCP database estates grow through local decisions. A team needs PostgreSQL semantics, so it provisions Cloud SQL. Another needs global consistency, so it evaluates Spanner. An ingestion path needs low-latency keyed writes, so Bigtable appears. Session state, locks, queues, and leaderboards find their way into Memorystore. Analytics lands in BigQuery because SQL over large data is operationally easier than running another warehouse.
Each choice is defensible in isolation. The failure appears later, when finance reviews spend by SKU while engineering reasons by service. Those views do not line up. A Cloud SQL bill might be driven by provisioned HA capacity, storage growth, backups, and read replicas. A BigQuery bill might be driven by accidental full-table scans. A Bigtable bill might be mostly idle nodes kept online for peak traffic. A Memorystore bill might be memory reserved for data that should have expired. A Spanner bill might be the cost of buying global correctness for a workload that only needed regional isolation.
The review has to start one layer above pricing. It has to ask what shape of state each workload actually owns.
The Problem
The common anti-pattern is service-first cost review: list every database, sort by monthly spend, and ask owners to reduce it. That usually produces local optimizations: smaller instances, fewer replicas, cheaper storage, shorter retention, lower query frequency. Some of those help. Many transfer risk into latency, recovery, correctness, or operator toil.
The more dangerous version is product substitution without workload analysis. Moving Cloud SQL to Spanner may replace vertical scaling pressure with distributed transaction cost. Moving BigQuery workloads into Bigtable may avoid scan charges but create operational read-path complexity. Moving hot reads into Memorystore may reduce database load while introducing cache stampede risk and silent memory bloat.
The core question is not “which GCP database is cheapest?” The core question is: what workload contract are we paying for, and is the system using that contract enough to justify its cost?
Cost Control Is a Workload Placement Architecture
flowchart TD
A[Billing export — daily cost facts] --> B[Workload taxonomy — latency and shape]
B --> C[Cloud SQL — relational steady state]
B --> D[Spanner — global transactional state]
B --> E[Bigtable — wide row access]
B --> F[Memorystore — hot ephemeral state]
B --> G[BigQuery — analytical scans]
C --> H[Guardrails — sizing and retention]
D --> H
E --> H
F --> H
G --> H
H --> I[Review loop — schema and access patterns]
I --> A
Cloud SQL should be reviewed as managed relational capacity. The right questions are boring and important: is HA required for this environment, are read replicas serving production reads, are backups and point-in-time recovery aligned with the recovery objective, and is vertical scaling masking missing indexes or connection misuse? Cloud SQL cost is usually easiest to control when ownership is tight: one application boundary, explicit lifecycle, clear retention, measured connection pools, and query plans reviewed before scaling.
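One way to make "measured connection pools" concrete is to add up the connections every application tier could open at peak and compare that total with the instance's `max_connections` setting. The sketch below is a minimal illustration: the service names, replica counts, pool sizes, the limit of 200, and the reserve of 10 are all hypothetical values, not recommendations.

```python
from typing import NamedTuple


class AppPool(NamedTuple):
    service: str     # hypothetical application name
    replicas: int    # number of running application instances
    pool_size: int   # maximum connections each instance may open


def peak_connections(pools):
    """Worst-case connection count if every replica fills its pool."""
    return sum(p.replicas * p.pool_size for p in pools)


def connection_headroom(pools, max_connections, reserved=10):
    """Connections left over after peak demand and an operator reserve."""
    return max_connections - reserved - peak_connections(pools)


# Hypothetical estate sharing one Cloud SQL instance.
pools = [
    AppPool("api", replicas=12, pool_size=20),
    AppPool("worker", replicas=6, pool_size=10),
]

headroom = connection_headroom(pools, max_connections=200)
if headroom < 0:
    print(f"pools can demand {-headroom} more connections than the instance allows")
```

A negative result suggests connection pressure that a larger machine type would hide rather than fix, so pool sizes and pooling strategy are worth reviewing before scaling.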
Spanner should be reviewed as a correctness and distribution purchase. Its value is strongest when the workload needs horizontal scale, relational access, strong consistency, and multi-region behavior together. If the application does not need those properties, Spanner can become an expensive substitute for schema discipline. If it does need them, the review should focus on schema design, key distribution, transaction shape, and placement configuration rather than treating node cost as the only lever.
Bigtable should be reviewed as a high-throughput keyed access system. It rewards predictable row-key design and punishes accidental hot spotting. Cost review is therefore inseparable from access review: row-key distribution, cluster sizing, storage type (SSD or HDD), replication, retention, and whether large analytical scans have leaked into an operational store.
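Because Bigtable keeps rows sorted by row key, keys that share a leading prefix land next to each other. A rough way to see the hot-spotting risk is to measure how many writes in a short window share the same prefix. The sketch below uses prefix counting as a simplified proxy for contiguous key ranges; the key formats, the device count, and the reversed-id scheme are hypothetical choices made for illustration.

```python
from collections import Counter


def busiest_prefix_share(row_keys, prefix_len):
    """Share of keys that fall under the single most common leading prefix."""
    counts = Counter(key[:prefix_len] for key in row_keys)
    return max(counts.values()) / len(row_keys)


TIMESTAMP = "20240101T120000"  # hypothetical: one second of traffic
DEVICE_IDS = range(1000)       # hypothetical: 1,000 devices writing once each

# Timestamp-first keys: every write in the window sorts next to the others.
timestamp_first = [f"{TIMESTAMP}#{n:04d}" for n in DEVICE_IDS]

# Device-first keys, with the zero-padded id reversed so the
# fastest-changing digit leads and writes spread across the key space.
device_first = [f"{n:04d}"[::-1] + f"#{TIMESTAMP}" for n in DEVICE_IDS]

hot_share = busiest_prefix_share(timestamp_first, prefix_len=2)
spread_share = busiest_prefix_share(device_first, prefix_len=2)
print(f"timestamp-first: {hot_share:.0%} of writes share one prefix")
print(f"device-first:    {spread_share:.0%} of writes share one prefix")
```

A high share for the busiest prefix means added nodes may not add usable throughput, which is why key design belongs in the cost review.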
Memorystore should be reviewed as reserved memory for volatile performance. The key question is whether the data is truly hot, bounded, and disposable. If the answer is no, Redis becomes a memory-priced database with weaker durability assumptions than the application may realize. Expiration policy, max key cardinality, value size, and cache-miss behavior matter more than a generic “cache hit rate” dashboard.
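The "hot, bounded, and disposable" test can be written down as a memory budget per keyspace. The sketch below is a minimal estimate under stated assumptions: the per-key overhead of 100 bytes, the 80% headroom threshold, and the keyspace names and sizes are all hypothetical, and real overhead depends on data types and encoding.

```python
from dataclasses import dataclass
from typing import Optional

PER_KEY_OVERHEAD_BYTES = 100  # assumption: rough bookkeeping cost per key


@dataclass
class CacheKeyspace:
    name: str
    max_keys: int                # cardinality budget agreed with the owner
    avg_value_bytes: int
    ttl_seconds: Optional[int]   # None means keys never expire


def estimated_bytes(ks):
    """Upper-bound memory estimate for one keyspace at full cardinality."""
    return ks.max_keys * (ks.avg_value_bytes + PER_KEY_OVERHEAD_BYTES)


def review(keyspaces, capacity_bytes, headroom=0.8):
    """Return findings for missing TTLs and for exceeding the memory budget."""
    findings = []
    for ks in keyspaces:
        if ks.ttl_seconds is None:
            findings.append(f"{ks.name}: no TTL, so the data is not bounded by expiry")
    total = sum(estimated_bytes(ks) for ks in keyspaces)
    if total > capacity_bytes * headroom:
        findings.append(f"estimated {total} bytes exceeds {headroom:.0%} of capacity")
    return findings


# Hypothetical keyspaces on a 1 GiB instance.
keyspaces = [
    CacheKeyspace("sessions", max_keys=1_000_000, avg_value_bytes=900, ttl_seconds=3600),
    CacheKeyspace("leaderboard", max_keys=50_000, avg_value_bytes=100, ttl_seconds=None),
]
findings = review(keyspaces, capacity_bytes=2**30)
for finding in findings:
    print(finding)
```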
BigQuery should be reviewed as analytical execution over stored data. It is not just a database line item; it is a query behavior line item. Partitioning, clustering, materialized views, table expiration, reservations, query limits, and user-level attribution are cost controls. Google’s own BigQuery guidance emphasizes estimating and controlling query costs, including limiting bytes processed and analyzing billing data in BigQuery itself (Google Cloud BigQuery cost practices).
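Since on-demand BigQuery cost tracks bytes processed, the value of a partition filter can be estimated with simple arithmetic. The sketch below compares a full scan with a seven-day scan of a hypothetical date-partitioned table and adds a budget check in the spirit of a per-query bytes cap. The partition size, the partition count, and the rate of 6.25 USD per TiB are placeholder assumptions; check the current price sheet for your region and edition.

```python
TIB = 2**40
GIB = 2**30

ASSUMED_USD_PER_TIB = 6.25  # placeholder rate, not an authoritative price


def scanned_bytes(bytes_per_partition, partitions_read):
    """Bytes processed when a query touches a number of equal partitions."""
    return bytes_per_partition * partitions_read


def on_demand_cost_usd(bytes_processed, usd_per_tib=ASSUMED_USD_PER_TIB):
    """Estimated on-demand cost for a given number of bytes processed."""
    return bytes_processed / TIB * usd_per_tib


def within_scan_budget(bytes_processed, max_bytes):
    """True if the query stays under a per-query scan budget."""
    return bytes_processed <= max_bytes


# Hypothetical table: 365 daily partitions of 64 GiB each.
full_scan = scanned_bytes(64 * GIB, partitions_read=365)
last_week = scanned_bytes(64 * GIB, partitions_read=7)

print(f"full scan: {on_demand_cost_usd(full_scan):.2f} USD")
print(f"last week: {on_demand_cost_usd(last_week):.2f} USD")
print("full scan within 1 TiB budget:", within_scan_budget(full_scan, 1 * TIB))
```

The same query shape run daily by many users multiplies the difference, which is why attribution by user and query class matters as much as the per-query number.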
In Practice
Context: The documented pattern across Google’s data systems is specialization, not a universal database. The Spanner paper describes a globally distributed database built for externally consistent transactions across datacenters (Spanner OSDI 2012). The Bigtable paper describes a sparse, distributed, persistent sorted map for large-scale structured data (Bigtable OSDI 2006). Dremel, the system behind BigQuery’s analytical model, was designed for interactive analysis over web-scale datasets (Dremel VLDB 2010). These are different contracts.
Action: Treat every database review as a contract test. For each workload, write down the required latency, consistency, access pattern, retention period, recovery target, regionality, and failure behavior. Then map it to the cheapest service configuration that still satisfies those constraints. Cloud SQL gets query-plan and instance-rightsizing review. Spanner gets transaction and key-design review. Bigtable gets row-key and hot-spot review. Memorystore gets TTL and memory-bound review. BigQuery gets scan, partition, and attribution review.
Result: The outcome is cost explainability, not a guaranteed lower bill from one setting change. A Spanner line item can be defended because the system needs global transactions. A BigQuery spike can be traced to a query class or user group. A Bigtable increase can be tied to replication, node count, or access skew. A Memorystore increase can be tied to retained keys, larger values, or missing expiration. This turns cost review from negotiation into engineering evidence.
Learning: The durable pattern is that cost follows shape. Transactional cost follows isolation, availability, and write coordination. Wide-column cost follows node count, replication, and key distribution. Cache cost follows memory residency. Analytical cost follows scanned data and slot consumption. A mature architecture does not ask one database to be cheaper at doing the wrong job; it routes state to the service whose failure model matches the business contract.
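The contract test described under Action can be sketched as a small record plus a placement rule. This is a deliberately simplified illustration: the field names, the four access-pattern labels, and the decision rules are assumptions that condense the service descriptions in this article, and a real review would also weigh latency, retention, and recovery targets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadContract:
    name: str
    access_pattern: str       # "relational", "keyed", "cache", or "analytical"
    strong_consistency: bool
    multi_region: bool
    horizontal_scale: bool
    disposable: bool          # safe to lose and rebuild from another source


def place(contract):
    """Suggest the service whose contract matches the workload's stated needs."""
    if contract.access_pattern == "analytical":
        return "BigQuery"
    if contract.access_pattern == "cache":
        # Memory-resident state is only a cache if losing it is acceptable.
        return "Memorystore" if contract.disposable else "review: non-disposable cache"
    if contract.access_pattern == "keyed":
        return "Bigtable"
    if contract.access_pattern == "relational":
        needs_spanner = (
            contract.strong_consistency
            and contract.multi_region
            and contract.horizontal_scale
        )
        return "Spanner" if needs_spanner else "Cloud SQL"
    raise ValueError(f"unknown access pattern: {contract.access_pattern}")


def is_justified(contract, current_service):
    """True if the service in use matches what the contract calls for."""
    return place(contract) == current_service


# Hypothetical workload currently running on Spanner.
orders = WorkloadContract(
    name="orders",
    access_pattern="relational",
    strong_consistency=True,
    multi_region=False,
    horizontal_scale=False,
    disposable=False,
)
print(orders.name, "on Spanner justified:", is_justified(orders, "Spanner"))
print("contract suggests:", place(orders))
```

A mismatch is a prompt for an architecture conversation, not an automatic migration order; the contract itself may be what needs correcting.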
Where It Breaks
| Service | Cost failure mode | Why it happens | Review lever |
|---|---|---|---|
| Cloud SQL | Oversized always-on instances | Scaling used to compensate for missing indexes, excess connections, or unclear environment lifecycle | Query plans, connection pooling, rightsizing, retention, HA scope |
| Spanner | Paying for global correctness without needing it | Workload needs relational scale but not multi-region consistency or distributed transactions | Regionality review, transaction boundaries, schema and key design |
| Bigtable | Idle or skewed cluster capacity | Nodes are sized for peak, hot keys reduce effective throughput, replication multiplies nodes and storage | Row-key distribution, autoscaling policy, replication review, TTL |
| Memorystore | Memory becomes permanent storage | Keys lack TTLs, values grow, cache miss paths are unsafe, eviction policy is unclear | TTL contracts, key cardinality budgets, miss testing, value-size limits |
| BigQuery | Unbounded analytical scans | Users query raw wide tables, partitions are ignored, exploratory workloads lack limits | Partition filters, clustering, materialized views, reservations, query quotas |
What to Do Next
- Problem: Database spend is being reviewed after the architecture has already encoded access patterns, retention, and correctness requirements.
- Solution: Build a workload placement matrix before changing SKUs: latency, consistency, read shape, write shape, retention, recovery, regionality, and failure tolerance.
- Proof: Use billing export, query logs, database metrics, schema review, and documented system behavior from Cloud SQL, Spanner, Bigtable, Memorystore, and BigQuery to tie cost to workload shape.
- Action: For the next review cycle, pick the top five database cost centers and write one contract per workload. If the contract does not justify the service configuration, change the architecture before shaving capacity.
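Picking the top five cost centers can start from billing rows grouped by service and workload rather than by SKU alone. The sketch below is a minimal illustration over in-memory rows: the row shape, the workload names, and the amounts are hypothetical sample data, and a real billing export has a richer schema that would need mapping to workloads, for example through labels.

```python
from collections import defaultdict


def top_cost_centers(rows, n=5):
    """Sum cost per (service, workload) and return the n largest totals."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["service"], row["workload"])] += row["cost"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


# Hypothetical monthly billing rows, one per SKU line.
rows = [
    {"service": "Cloud SQL", "workload": "orders-db", "cost": 1200.0},
    {"service": "Cloud SQL", "workload": "orders-db", "cost": 300.0},
    {"service": "BigQuery", "workload": "adhoc-analytics", "cost": 9500.0},
    {"service": "Bigtable", "workload": "event-ingest", "cost": 4000.0},
    {"service": "Memorystore", "workload": "session-cache", "cost": 800.0},
    {"service": "Spanner", "workload": "ledger", "cost": 6000.0},
    {"service": "Cloud SQL", "workload": "staging-db", "cost": 700.0},
]

ranked = top_cost_centers(rows)
for (service, workload), cost in ranked:
    print(f"{service:12} {workload:16} {cost:10.2f}")
```

Each entry in the ranked list is a candidate for one written contract in the next review cycle.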