Azure SQL vs Cosmos DB: The Partition Key Decision

The wrong database choice usually announces itself late: not during schema design, but when one tenant, customer, region, or workflow becomes hot enough to make every clean abstraction look expensive.

Situation

Teams often frame Azure SQL versus Cosmos DB as a database-model decision: relational tables against JSON documents, joins against denormalization, SQL transactions against globally distributed NoSQL. That framing is useful, but incomplete.

The harder question is operational. Azure SQL asks you to model consistency, indexing, and query shape around a relational engine. Cosmos DB asks you to model distribution first. The partition key is not a tuning knob in Cosmos DB. It is the boundary that determines where data lives, how requests are routed, how throughput is consumed, and which transactions are cheap.

That difference matters because modern applications rarely fail evenly. A SaaS control plane might have thousands of quiet tenants and three enormous ones. A commerce system might have normal catalog traffic until one product launch concentrates writes. A telemetry platform might look horizontally scalable until every device in one fleet reports at the same minute.

The database choice is not “SQL or NoSQL.” It is whether your dominant operational invariant is relational integrity or distributed access locality.

The Problem

Azure SQL lets teams postpone some physical-design decisions. You can normalize first, add indexes later, tune queries, introduce read replicas, split hot tables, or shard after the access patterns prove themselves. Those moves are not free, but the engine gives you a strong relational baseline: constraints, joins, transactions, secondary indexes, and mature query planning.

Cosmos DB moves the critical design decision earlier. A poor partition key can create hot partitions, expensive cross-partition queries, awkward transactions, and data models that cannot evolve without migration. A good partition key aligns with the request path: one logical operation touches one partition, consumes predictable request units, and avoids coordination.

The trap is that the application model often suggests the wrong key. tenantId feels natural for SaaS. userId feels natural for personalization. orderId feels natural for commerce. Each can be right, but only if it matches the workload’s heat distribution and transaction boundary.

If the system needs relational integrity across many entities, Azure SQL absorbs that complexity better. If the system needs low-latency, high-scale access to independently partitionable records, Cosmos DB can be simpler operationally. The question is: which boundary will hurt more when the system is under load — relational coordination or partition imbalance?

Partition Around the Operational Invariant

A practical architecture starts by naming the unit of contention. That unit is not always the entity name in the domain model. It is the smallest boundary inside which the system needs fast reads, fast writes, and strong correctness.

flowchart TD
    A[Workload shape — read and write paths] --> B[Correctness boundary — what must commit together]
    A --> C[Heat boundary — where traffic concentrates]
    B --> D{Primary invariant}
    C --> D
    D -->|relational integrity| E[Azure SQL — constraints joins transactions]
    D -->|access locality| F[Cosmos DB — partition key document model]
    F --> G[Choose key — high cardinality even heat]
    F --> H[Model requests — single partition first]
    E --> I[Model schema — normalized core indexed paths]
    E --> J[Scale plan — replicas pools sharding later]

Use Azure SQL when the write path depends on relationships that must be enforced together: account balances, entitlement state, order lifecycle transitions, billing ledgers, or admin workflows where ad hoc queryability matters. The cost is that scale-out usually requires deliberate architecture: read replicas, elastic pools, caching, queue-backed writes, or sharding.

Use Cosmos DB when the application can make one partition the natural home for most operations. The ideal partition key has high cardinality, even request distribution, and semantic alignment with the transaction boundary. The cost is that mistakes are structural. If every request hits one key, the system is partitioned in name only. If every query fans out across partitions, the document model has not removed coordination; it has moved it into the request path.

The decision is clearest when written as a failure-mode table before implementation:

Workload signal	Azure SQL bias	Cosmos DB bias
Multi-entity transactions are common	Strong	Weak
Queries change frequently	Strong	Weak
Access pattern is stable and key-addressable	Moderate	Strong
Traffic is globally distributed	Moderate	Strong
Hot tenants or hot users dominate traffic	Needs sharding plan	Needs synthetic key or redesign
Data must be joined many ways	Strong	Weak
Request latency depends on single-record lookups	Moderate	Strong

In Practice

Context. The documented Cosmos DB pattern is that partitioning is part of the logical data model, not merely infrastructure. Microsoft guidance emphasizes choosing a partition key that spreads request unit consumption and storage while supporting the application’s common queries and transactions. The documented system behavior is that items with the same logical partition key can be handled together more efficiently than operations that span many logical partitions.

Action. For a SaaS workload, do not automatically choose tenantId. First classify tenants by expected size, write rate, and query shape. If most operations are tenant-scoped and tenants are evenly sized, tenantId may be correct. If a few tenants dominate traffic, a synthetic key such as tenantId—bucketId may distribute heat, but it also changes query and transaction semantics. That tradeoff must be explicit, not discovered during an incident.

For an order system, do not automatically choose orderId either. It gives excellent point reads for a single order, but weak locality for customer history queries unless those queries are served by a separate projection. A common documented pattern in distributed systems is command-side and query-side separation: keep the write model optimized for correctness and maintain read models optimized for access paths.

Result. The result is not one universal database answer. It is a split architecture that often looks boring on purpose. Azure SQL owns relational control-plane state where constraints and cross-entity workflows matter. Cosmos DB owns high-volume, key-addressable documents where the partition key matches the dominant request path. Events or change feeds move data into projections when the read shape differs from the write shape.

This is not polyglot persistence for fashion. It is an operational boundary. The system avoids forcing Azure SQL to behave like an infinitely distributed document store and avoids forcing Cosmos DB to behave like a relational engine with arbitrary joins.

Learning. The partition key decision should happen after workload modeling, not after framework selection. The useful design artifact is a request matrix: operation, read keys, write keys, consistency requirement, expected cardinality, expected hot spots, and fallback behavior during partial failure. If that matrix shows many operations crossing partition boundaries, Cosmos DB is warning you early. If it shows many normalized entities changing together, Azure SQL is probably the simpler core.

Where It Breaks

Choice	Failure mode	Mitigation
Azure SQL for everything	Hot tables, lock contention, expensive scale-up, read pressure	Index deliberately, separate read paths, use queues, plan sharding before emergency
Cosmos DB for relational workflows	Cross-partition queries, duplicated state, weak ad hoc reporting, difficult migrations	Keep relational core in SQL, use Cosmos for projections or bounded aggregates
`tenantId` partition key	One large tenant becomes a hot partition	Use synthetic partitioning, isolate large tenants, or route premium tenants to dedicated containers
`userId` partition key	Shared workflows require fan-out across many users	Add workflow-centric projections or choose a higher-level aggregate key
`orderId` partition key	Customer and support queries become cross-partition scans	Maintain customer-order read models keyed by customer
Synthetic partition key	Better distribution but harder transactions and reads	Make bucket logic deterministic and visible in the domain model
Dual stores	Consistency lag and operational complexity	Define source of truth, idempotent events, replay process, and reconciliation checks

What to Do Next

Problem: The database decision is being made from data shape alone. Add workload shape: request paths, write contention, query volatility, transaction boundaries, tenant skew, and failure behavior.
Solution: Choose Azure SQL when relational correctness is the primary invariant. Choose Cosmos DB when access locality and horizontal distribution are the primary invariant. Use both only when the boundary is explicit.
Proof: Build a request matrix before implementation. For every critical operation, identify whether it is single-row, single-aggregate, single-partition, cross-partition, or cross-entity. The painful cells usually reveal the right database.
Action: Decide the partition key before writing production code. Then test the ugly cases: largest tenant, hottest key, cross-partition query, backfill, replay, support lookup, and schema migration. A partition key that survives those tests is architecture. A partition key chosen from the entity diagram is a guess.

Situation

The Problem

Partition Around the Operational Invariant

In Practice

Where It Breaks

What to Do Next

Rajiv

Related Posts

The Staff Engineer's System Design Review: Questions That Expose Real Risk

Designing for Peak Traffic Without Designing for Permanent Waste

Building a Commerce Platform Data Plane: OLTP, Search, Cache, Queue, Warehouse