Azure Reference Architecture: Front Door, App Service, SQL, Cache, and Service Bus
A cloud application usually fails at the boundaries first: the global edge, the web tier, the database connection pool, the cache invalidation path, and the asynchronous backlog nobody watched until users were already waiting.
Situation
A common Azure production stack looks deceptively simple. Azure Front Door terminates global traffic. Azure App Service runs the application. Azure SQL Database stores transactional state. Azure Cache for Redis absorbs hot reads and coordination pressure. Azure Service Bus decouples slow work from request latency.
On a reference diagram, that stack reads like a clean web architecture. Requests come in through the edge, application instances scale horizontally, the database remains managed, cache keeps latency low, and messages handle deferred processing. The managed services remove server maintenance, but they do not remove distributed systems behavior.
The operational shift is that the application team no longer owns machines. It owns failure boundaries. Front Door can route to an unhealthy origin if health probes are weak. App Service can scale out faster than the database can absorb connections. SQL can throttle before the web tier notices. Redis can become a correctness dependency instead of a performance aid. Service Bus can preserve work while hiding a downstream outage behind a growing queue.
The Problem
The failure mode is not that any one Azure service is unreliable. The failure mode is believing the services compose into reliability automatically.
A synchronous request path couples Front Door, App Service, SQL, and Redis into a single user-visible transaction. If one component slows down, the others begin amplifying the problem. App instances retry database calls. Retries consume more connection slots. Cache misses stampede into SQL. Publishers keep enqueuing work on Service Bus that workers cannot drain. Health probes remain green because the process still returns HTTP 200 on a shallow endpoint.
The design question is therefore not, “Which Azure services should be on the diagram?” The question is: where does the architecture absorb failure without making the user, database, or operators pay for it?
The Reference Architecture
The practical answer is to treat the stack as five control points: edge admission, request execution, state protection, read pressure relief, and asynchronous load shedding.
flowchart TD
U[user request] --> F[Azure Front Door — global entry]
F --> WAF[WAF policy — edge filtering]
WAF --> APP[App Service — stateless web tier]
APP --> CACHE[Azure Cache for Redis — hot read path]
APP --> SQL[Azure SQL Database — transactional system of record]
APP --> BUS[Azure Service Bus — deferred work]
BUS --> WORKER[App Service worker — queue consumer]
WORKER --> SQL
WORKER --> CACHE
MON[observability — traces metrics logs] --> F
MON --> APP
MON --> SQL
MON --> CACHE
MON --> BUS
Azure Front Door should be the global admission layer, not just a vanity endpoint. It owns TLS, WAF policy, routing, and origin failover. Its health probes should exercise enough of the application’s critical dependencies to keep traffic away from broken origins, yet stay cheap enough that the probes themselves do not become a synthetic load generator.
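One way to keep a probe meaningful but cheap is to cache the readiness verdict for a few seconds, so that frequent probe traffic does not multiply dependency checks. The sketch below is a minimal illustration: the class name, the check callables, and the five-second interval are assumptions, not part of any Azure SDK.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ReadinessProbe:
    """Runs cheap dependency checks and caches the verdict briefly.

    Check names, callables, and the cache interval are illustrative
    assumptions chosen for this sketch.
    """

    def __init__(self, checks: Dict[str, Callable[[], bool]],
                 cache_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._checks = checks
        self._cache_seconds = cache_seconds
        self._clock = clock
        self._cached: Optional[Tuple[float, int, Dict[str, bool]]] = None

    def status(self) -> Tuple[int, Dict[str, bool]]:
        """Return (http_status, per-check results) for the probe endpoint."""
        now = self._clock()
        if self._cached is not None and now - self._cached[0] < self._cache_seconds:
            # Serve the recent verdict instead of re-running every check.
            return self._cached[1], self._cached[2]

        results: Dict[str, bool] = {}
        for name, check in self._checks.items():
            try:
                results[name] = bool(check())
            except Exception:
                # A check that raises counts as a failed dependency.
                results[name] = False

        code = 200 if all(results.values()) else 503
        self._cached = (now, code, results)
        return code, results
```

A web handler for the probe path would return the status code from `status()`. Each check should carry its own short timeout, for example a `SELECT 1` against SQL with a one-second limit, so a slow dependency cannot stall the probe itself.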
App Service should stay stateless. Instances can scale out, restart, or move without requiring local session recovery. Any per-user state belongs in signed tokens, SQL, or a deliberately bounded cache entry. Deployment slots should be used for controlled rollouts, but slot swaps are not a replacement for backward-compatible schema and message contracts.
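The point about message contracts can be made concrete with a versioned handler table, so that code in either slot can read messages written by the other during a swap. This is a minimal sketch: the envelope fields (`version`, `payload`), the handler names, and the default currency are all hypothetical.

```python
import json
from typing import Any, Callable, Dict


def handle_order_v1(payload: Dict[str, Any]) -> Dict[str, Any]:
    # Old shape: "id" only; the currency default is an assumed legacy value.
    return {"order_id": payload["id"], "currency": "USD"}


def handle_order_v2(payload: Dict[str, Any]) -> Dict[str, Any]:
    # New shape: explicit "order_id" and "currency" fields.
    return {"order_id": payload["order_id"], "currency": payload["currency"]}


HANDLERS: Dict[int, Callable[[Dict[str, Any]], Dict[str, Any]]] = {
    1: handle_order_v1,
    2: handle_order_v2,
}


def dispatch(raw: str) -> Dict[str, Any]:
    """Route a JSON message to the handler for its declared version."""
    envelope = json.loads(raw)
    # Messages written before versioning existed are treated as version 1.
    version = envelope.get("version", 1)
    handler = HANDLERS.get(version)
    if handler is None:
        raise ValueError(f"unsupported message version: {version}")
    return handler(envelope["payload"])
```

Keeping the old handler deployed until no version 1 messages remain in the queue is what makes the swap reversible.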
Azure SQL Database should remain the source of truth. The application should protect it with connection limits, query timeouts, bounded retries, and circuit breakers. Retry policies must use jitter and must distinguish transient failures from sustained overload. A retry that makes sense for a single request can become an outage multiplier when thousands of requests across many instances execute it together.
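The retry and circuit-breaker policy described here can be sketched in a few lines. This is an illustration under stated assumptions, not a production client: `TransientError` is a hypothetical marker the data layer would raise for retryable failures, and the attempt counts, delays, and thresholds are placeholder values.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a brief timeout)."""


class CircuitOpenError(Exception):
    """Raised when the breaker refuses a call during its cooldown."""


def backoff_delay(attempt: int, base: float = 0.1, cap: float = 2.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Full jitter: a random delay between 0 and min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retry(operation: Callable[[], T], max_attempts: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry only transient failures, and only a bounded number of times."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # Attempts spent: surface the failure instead of piling on.
            sleep(backoff_delay(attempt))
    raise AssertionError("unreachable")


class CircuitBreaker:
    """Opens after consecutive failures and rejects calls until a cooldown ends."""

    def __init__(self, failure_threshold: int = 5, cooldown_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._threshold = failure_threshold
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, operation: Callable[[], T]) -> T:
        if self._opened_at is not None and \
                self._clock() - self._opened_at < self._cooldown:
            raise CircuitOpenError("dependency marked unhealthy")
        # Closed, or cooldown elapsed: let the call through as a trial.
        try:
            result = operation()
        except Exception:
            self._failures += 1
            if self._failures >= self._threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        self._opened_at = None
        return result
```

Wrapping the two as `breaker.call(lambda: call_with_retry(run_query))`, where `run_query` is whatever function executes the SQL, means the retries handle brief blips while the breaker reacts to sustained overload by failing fast.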
Azure Cache for Redis should reduce read pressure, not own correctness by accident. Cache entries need explicit TTLs, versioning where appropriate, and a safe miss path. If the cache is unavailable, the application should either degrade intentionally or shed nonessential features. It should not stampede SQL with every cache miss at once.
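A safe miss path usually combines jittered TTLs with request coalescing, so that one caller reloads an expired key while the others wait for its result. The sketch below uses an in-process dictionary as a stand-in for Redis; that substitution, the class name, and the TTL values are assumptions made for illustration.

```python
import random
import threading
import time
from typing import Any, Callable, Dict, Tuple


class CoalescingCache:
    """Cache-aside with jittered TTLs and one loader call per missing key.

    An in-process dict stands in for Redis here; it only coalesces
    callers inside a single process.
    """

    def __init__(self, ttl_seconds: float = 60.0, jitter_fraction: float = 0.1,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._jitter = jitter_fraction
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}
        self._key_locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()

    def _fresh(self, key: str) -> Tuple[bool, Any]:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return True, entry[1]
        return False, None

    def get(self, key: str, loader: Callable[[], Any]) -> Any:
        hit, value = self._fresh(key)
        if hit:
            return value
        with self._guard:
            lock = self._key_locks.setdefault(key, threading.Lock())
        with lock:
            # Another caller may have filled the entry while we waited.
            hit, value = self._fresh(key)
            if hit:
                return value
            value = loader()
            # Spread expiries so popular keys do not all expire together.
            spread = self._ttl * self._jitter
            expires = self._clock() + self._ttl + random.uniform(-spread, spread)
            self._entries[key] = (expires, value)
            return value
```

Coalescing across App Service instances would need a shared mechanism such as a short-lived lock key in the cache itself. The per-key locks above are also never pruned, which a real implementation would have to address.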
Azure Service Bus should absorb work that does not need to complete inside the user request. It gives the architecture a buffer, but the buffer must be observable. Queue depth, message age, dead-letter count, handler failure rate, and drain time are production signals, not dashboard decoration.
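Those signals become actionable once they are compared against explicit thresholds. A minimal sketch follows, assuming the snapshot values are collected from whatever metrics source is in use; the field names and the threshold defaults are illustrative, not recommended values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class QueueSnapshot:
    """Point-in-time queue metrics; field names are assumptions for this sketch."""
    active_messages: int
    dead_letter_messages: int
    oldest_message_age_seconds: float
    arrivals_per_second: float
    completions_per_second: float


def estimated_drain_seconds(s: QueueSnapshot) -> Optional[float]:
    """Time to empty the queue at current rates; None if it is not draining."""
    if s.active_messages == 0:
        return 0.0
    net_rate = s.completions_per_second - s.arrivals_per_second
    if net_rate <= 0:
        return None
    return s.active_messages / net_rate


def backlog_alerts(s: QueueSnapshot, max_age_seconds: float = 300.0,
                   max_drain_seconds: float = 900.0,
                   max_dead_letters: int = 0) -> List[str]:
    """Return the names of the backlog signals that breach their thresholds."""
    alerts: List[str] = []
    if s.oldest_message_age_seconds > max_age_seconds:
        alerts.append("message_age")
    if s.dead_letter_messages > max_dead_letters:
        alerts.append("dead_letters")
    drain = estimated_drain_seconds(s)
    if drain is None:
        alerts.append("not_draining")
    elif drain > max_drain_seconds:
        alerts.append("drain_time")
    return alerts
```

Message age and drain time tend to be more useful than raw depth, because a deep queue that empties in two minutes is healthy while a shallow one that never drains is not.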
In Practice
Context: Microsoft’s Azure Architecture Center documents this shape as a common web application pattern: a global entry service, an application hosting tier, managed data stores, caching, messaging, and centralized monitoring. Azure Well-Architected guidance repeatedly separates reliability concerns into redundancy, health modeling, retry behavior, and operational observability.
Action: The documented pattern is to make the web tier stateless, put durable state in a managed database, use cache for performance-sensitive reads, and move long-running work onto a queue. In Azure terms, that usually means App Service instances behind Front Door, Azure SQL for transactional data, Azure Cache for Redis for hot data, and Service Bus for asynchronous workflows.
Result: The architecture gains independent scaling axes. Front Door can manage global routing and edge protection. App Service can scale request handlers. SQL can be sized and tuned around transactional load. Redis can absorb repeated reads. Service Bus can preserve work during downstream slowness.
The result is not automatic resilience. It is separability. Each layer can now have its own timeout, quota, alert, and recovery mechanism.
Learning: The pattern works when every boundary has an explicit contract. Front Door needs a real origin health model. App Service needs bounded concurrency and dependency timeouts. SQL needs query discipline and connection governance. Redis needs a cache consistency strategy. Service Bus needs poison message handling and backlog SLOs.
A documented reference architecture is a starting point. The production architecture is the reference design plus the failure policies.
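As one example of such a failure policy, the App Service contract named above (bounded concurrency plus dependency timeouts) might look like the following inside an async request handler. This is a sketch under assumptions: the class name, the limit, and the timeout are hypothetical, and rejecting excess work immediately is one design choice among several (queuing briefly is another).

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class Overloaded(Exception):
    """Raised when the handler is already at its concurrency limit."""


class BoundedExecutor:
    """Caps in-flight work and applies a timeout to each dependency call."""

    def __init__(self, max_concurrent: int, timeout_seconds: float) -> None:
        self._max = max_concurrent
        self._timeout = timeout_seconds
        self._in_flight = 0

    async def run(self, work: Callable[[], Awaitable[T]]) -> T:
        # No await between the check and the increment, so this is safe
        # on a single event loop.
        if self._in_flight >= self._max:
            raise Overloaded("concurrency limit reached; shedding load")
        self._in_flight += 1
        try:
            # Raises asyncio.TimeoutError if the dependency is too slow.
            return await asyncio.wait_for(work(), timeout=self._timeout)
        finally:
            self._in_flight -= 1
```

A handler would typically translate `Overloaded` into an HTTP 429 or 503 and a timeout into a 504, so that callers see a fast, explicit failure rather than a hung request.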
Where It Breaks
| Failure mode | Why it happens | Architectural response |
|---|---|---|
| Healthy process, broken dependency | Health endpoint only checks the web process | Add dependency-aware readiness with cheap critical checks |
| Retry storm | App instances retry the same overloaded dependency | Use bounded retries, jitter, circuit breakers, and retry budgets |
| SQL connection exhaustion | Scale-out creates more concurrent database clients | Cap pool sizes, tune queries, and limit request concurrency |
| Cache stampede | Popular key expires and all instances miss together | Use TTL jitter, request coalescing, and stale-while-revalidate where safe |
| Queue hides outage | Service Bus accepts messages faster than workers drain them | Alert on message age, queue depth, dead letters, and drain time |
| Poison messages block progress | One malformed job repeatedly fails | Use max delivery counts, dead-letter queues, and replay tooling |
| Slot swap breaks contracts | New code assumes new schema or message format | Use expand-contract migrations and versioned message handlers |
| Edge failover is too late | Front Door probes do not match user-visible failure | Probe critical paths and tune origin failover thresholds |
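The poison-message row is easy to reason about with a small local simulation. Service Bus applies a maximum delivery count and dead-lettering on the broker side; the sketch below only imitates that behavior with an in-memory queue, and every name in it is hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Message:
    body: str
    delivery_count: int = 0


def drain(queue: Deque[Message], handler: Callable[[str], None],
          max_delivery_count: int = 3) -> List[Message]:
    """Process until the queue is empty; return the dead-lettered messages."""
    dead_letters: List[Message] = []
    while queue:
        message = queue.popleft()
        message.delivery_count += 1
        try:
            handler(message.body)
        except Exception:
            if message.delivery_count >= max_delivery_count:
                # Stop retrying and set the message aside for inspection or replay.
                dead_letters.append(message)
            else:
                # Put it back so it is redelivered after the other messages.
                queue.append(message)
    return dead_letters
```

Without the delivery cap, the malformed message would cycle forever and consume handler capacity. With it, healthy messages complete and the failure becomes a countable dead-letter signal.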
What to Do Next
Problem: The main risk in this architecture is hidden coupling. The diagram says the services are separate, but runtime behavior can still bind them into one failure domain.
Solution: Put explicit policies at every boundary: admission control at Front Door, concurrency limits in App Service, timeouts around SQL, cache degradation rules for Redis, and backlog controls for Service Bus.
Proof: Test the failure modes directly. Disable Redis in a staging environment. Force SQL throttling. Slow the queue consumer. Return failed readiness from one origin. Confirm that alerts fire before users become the monitoring system.
Action: Build the first production checklist around five questions: what gets rejected at the edge, what times out in the app, what protects SQL, what happens when cache is missing, and how long Service Bus can fall behind before the business notices.