Caches, Queues, and Databases: When to Use Each

A cache is not a database. A queue is not a cache. The three make different guarantees about durability, ordering, and access patterns, and using the wrong one for the job produces failures that are hard to diagnose: the system behaves correctly under normal load and breaks only under stress or during a restart.

Situation

Most production systems use all three: a relational database (PostgreSQL, MySQL) as the system of record, a cache (Redis, Memcached) for hot read paths, and a queue (Kafka, SQS, RabbitMQ) for asynchronous processing. Engineers frequently reach for a cache when they should use a queue, or use a database where a queue would serve better.

The confusion is understandable: Redis can act as both a cache and a queue, PostgreSQL can serve as a queue with SKIP LOCKED, and a log-based queue that replays events can be read back in a way that looks like a cache. But the operational guarantees differ, and those differences matter at failure time.

The Problem

A system uses Redis as a work queue: tasks are pushed to a list, and workers pop and process them. Under normal load, it works. When Redis restarts or a worker crashes, tasks are lost, for two reasons. First, Redis’s default persistence (periodic RDB snapshots) does not guarantee durability across restarts, so writes made since the last snapshot can disappear. Second, “pop” removes the item from the list before the worker confirms that it processed the item successfully. The engineers used a cache-style store for a job that required queue semantics.

What are the actual guarantees each structure provides, and when does each one break?

The Decision Framework

Use a cache when: you need to accelerate reads of data that already exists in a durable store, and the cost of a cache miss is a slower read (not a lost operation). Caches are explicitly lossy by design — eviction, expiry, and cold restarts all produce misses. The system must work (slower) without the cache.
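The "must work (slower) without the cache" rule can be sketched as a cache-aside read. This is a minimal in-memory illustration, not a client for any real cache: TTLCache, cache_aside_read, and load_from_store are hypothetical names introduced here, and the dictionary stands in for Redis or Memcached.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for a cache: entries expire and can vanish at any time."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def flush(self) -> None:
        """Simulate a cold restart: every entry is gone."""
        self._entries.clear()


def cache_aside_read(cache: TTLCache, key: str,
                     load_from_store: Callable[[str], str]) -> str:
    """Return the cached value, falling back to the durable store on a miss."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_store(key)  # the durable store remains the source of truth
    cache.set(key, value)
    return value
```

Flushing the cache or letting an entry expire changes only how often load_from_store is called, never the value returned. If losing the cached copy would lose the data itself, the data does not belong in a cache.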

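Use a queue when: you need work items to survive producer and consumer failures, to be processed at least once, and to be consumed in order or at a controlled rate. A durable queue keeps each message until a consumer acknowledges it, so a message that is consumed but not acknowledged is redelivered. Because redelivery can produce duplicates, exactly-once processing in practice usually means at-least-once delivery combined with idempotent consumers. This is fundamentally different from a cache’s eviction behavior, where a dropped item is simply gone.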
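The acknowledgment rule can be illustrated with a toy in-memory queue. AckQueue and its methods are hypothetical names for this sketch, and requeue_unacked stands in for whatever a real broker does when a consumer times out or disconnects. It shows only the bookkeeping, not durability.

```python
import itertools
from collections import deque
from typing import Deque, Dict, Optional, Tuple


class AckQueue:
    """Toy at-least-once queue: a message is forgotten only after it is acked."""

    def __init__(self) -> None:
        self._ready: Deque[Tuple[int, str]] = deque()
        self._in_flight: Dict[int, str] = {}
        self._ids = itertools.count(1)

    def publish(self, body: str) -> int:
        message_id = next(self._ids)
        self._ready.append((message_id, body))
        return message_id

    def receive(self) -> Optional[Tuple[int, str]]:
        """Hand out the oldest ready message and track it as in flight."""
        if not self._ready:
            return None
        message_id, body = self._ready.popleft()
        self._in_flight[message_id] = body
        return message_id, body

    def ack(self, message_id: int) -> None:
        """The consumer confirms success; only now is the message dropped."""
        self._in_flight.pop(message_id, None)

    def requeue_unacked(self) -> int:
        """Return every unacknowledged message to the front of the queue."""
        count = len(self._in_flight)
        # Re-insert highest id first so the lowest id ends up at the front.
        for message_id in sorted(self._in_flight, reverse=True):
            self._ready.appendleft((message_id, self._in_flight[message_id]))
        self._in_flight.clear()
        return count
```

A worker that calls receive and then crashes leaves its message in flight, and requeue_unacked makes that message available again. The same message can therefore reach a handler twice, which is why handlers in an at-least-once system should be idempotent.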

Use a database when: you need durable, queryable state with transactional consistency. Databases provide ACID guarantees, support complex queries, and allow multiple processes to read and write shared state correctly.

Cache:    READ-HEAVY, TOLERATE MISS, LOSSY OK
Queue:    WRITE-ONCE, CONSUME-ONCE, DURABILITY REQUIRED
Database: SHARED MUTABLE STATE, QUERYABLE, ACID REQUIRED
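The summary above can be written as a small decision helper. This is only one possible encoding of the framework: the Workload fields, the recommend function, and the order in which the checks run are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    loss_is_acceptable: bool             # a miss only means a slower read elsewhere
    is_work_item: bool                   # written once, must be consumed by a worker
    needs_queries_or_transactions: bool  # shared mutable state, queries, ACID


def recommend(workload: Workload) -> str:
    """Map a workload description to 'cache', 'queue', or 'database'."""
    if workload.needs_queries_or_transactions:
        return "database"
    if workload.is_work_item:
        return "queue"
    if workload.loss_is_acceptable:
        return "cache"
    # Durable data that is neither a work item nor safely lossy
    # belongs in the system of record.
    return "database"
```

For example, recommend(Workload(loss_is_acceptable=False, is_work_item=True, needs_queries_or_transactions=False)) returns "queue", which is the case the Redis list in The Problem failed to handle.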

In Practice

PostgreSQL supports queue-like patterns with SELECT ... FOR UPDATE SKIP LOCKED:

-- Dequeue pattern using PostgreSQL as a job queue
BEGIN;
SELECT id, payload FROM job_queue
WHERE status = 'pending'
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- After processing:
UPDATE job_queue SET status = 'done' WHERE id = $1;
COMMIT;

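This gives ACID guarantees for job dequeue. The selected row stays locked for the life of the transaction, so if a worker crashes, the transaction rolls back, the lock is released, and the job becomes visible to the next worker. The trade-off is that the transaction stays open while the job is processed, so the pattern fits short jobs best. PostgreSQL is a workable job queue for low-to-moderate throughput (on the order of thousands of jobs/sec, depending on hardware and job size). Kafka or SQS is more appropriate for high-throughput, high-fan-out, or replay-required patterns.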

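Redis used as a queue requires AOF persistence (appendonly yes) and careful handling of the window between RPOP and worker failure: an item popped by a worker that then crashes no longer exists anywhere. Without both, messages are lost on crash. Even with AOF, the default appendfsync everysec setting can lose roughly the last second of writes. Redis Streams (XADD, XREADGROUP, XACK) provide consumer-group semantics with acknowledgment, which is closer to a proper queue, but they still lack the transactional guarantees of a relational database.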
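One common way to close the gap between popping a job and finishing it on plain Redis lists is to move the job atomically to a second "processing" list with LMOVE (Redis 6.2 or later) instead of popping it. The original design does not do this, so the sketch below is a swapped-in technique. It assumes a redis-py style client created with decode_responses=True, whose lpush, lmove, and lrem methods are the only calls used. The key names and function names are hypothetical.

```python
from typing import Any, Callable, Optional

PENDING = "jobs:pending"        # hypothetical key names
PROCESSING = "jobs:processing"


def enqueue(client: Any, payload: str) -> None:
    """Add a job at the left end, so the oldest job sits at the right end."""
    client.lpush(PENDING, payload)


def process_one(client: Any, handler: Callable[[str], None]) -> Optional[str]:
    """Move the oldest job to the processing list, run it, then acknowledge it."""
    payload = client.lmove(PENDING, PROCESSING, "RIGHT", "LEFT")
    if payload is None:
        return None  # nothing to do
    # If the handler raises or the process dies here, the payload is still
    # stored in PROCESSING rather than lost.
    handler(payload)
    client.lrem(PROCESSING, 1, payload)  # acknowledge: remove one matching copy
    return payload


def recover_stalled(client: Any) -> int:
    """Push every job left in the processing list back onto the pending list."""
    moved = 0
    while client.lmove(PROCESSING, PENDING, "LEFT", "RIGHT") is not None:
        moved += 1
    return moved
```

This sketch simplifies in two ways. recover_stalled moves everything, including jobs that a live worker is still processing, so a real deployment would need a per-worker processing list or a timeout. It also protects only against worker failure, and persistence settings still decide what survives a Redis restart.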

Where It Breaks

Anti-pattern	Failure mode	Correct tool
Cache used as queue (Redis list + RPOP)	Items lost on crash or before worker acks	Proper queue (Kafka, SQS) or PostgreSQL with SKIP LOCKED
Database used as message bus for high throughput	Lock contention and table bloat under load	Dedicated queue
Queue used as state store	No queryability; ordering not preserved for concurrent consumers	Database
Cache without TTL on mutable data	Stale reads served indefinitely; no invalidation	Add TTL; or use cache-aside with explicit invalidation

What to Do Next

Problem: Using a cache for work items or a database for high-throughput messaging produces failure modes that only appear under load or during restarts.
Solution: Apply the framework: durable work items require a queue; hot read acceleration requires a cache; shared mutable state with queries requires a database.
Proof: After switching from Redis list to PostgreSQL SKIP LOCKED or a proper queue, job loss during worker restarts disappears from your error monitoring.
Action: Audit your current Redis usage today — identify any Redis list or set being used as a work queue, and verify that AOF persistence is enabled and that worker failures cannot lose items.

Rajiv
