Chapter 19: Caching System
Metadata Card
| Attribute | Content |
|---|---|
| Difficulty | (Intermediate) |
| Prerequisites | Chapter 14 (In-Memory Database), understanding of read-heavy workloads |
| Keywords | Cache, CDN, Redis, Memcached, cache aside, read through, write through, cache invalidation, TTL, eviction policy |
| Code | Python, Redis commands |
Your Progress
"Setting up a quick-access table at the warehouse entrance — caching solves the problem of reading hot data fast."
Why Cache?
- Latency: Memory is 1000x faster than disk
- Throughput: Databases have limits; caches absorb traffic
- Cost: Caching reduces database load, delaying expensive scaling
- Hot data: 80% of reads hit 20% of data (Pareto principle)
Cache Strategies
| Strategy | How It Works | Pros | Cons |
|---|---|---|---|
| Cache-Aside | App checks cache → misses → reads DB → writes cache | Simple, app controls | Cache inconsistency |
| Read-Through | Cache layer reads DB on miss | Transparent to app | Complex cache setup |
| Write-Through | Write to cache first, cache writes to DB | Strong consistency | Higher write latency |
| Write-Behind | Write to cache, async flush to DB | Very fast writes | Risk of data loss |
| Write-Around | Write directly to DB, invalidate cache | Avoids cache pollution | More cache misses |
Why Cache Invalidation is Hard
"Computer science has only two hard problems: cache invalidation and naming things." (Phil Karlton)
Stale data problem: If you update in DB but not in cache, users see old data. Solutions:
- TTL (Time To Live): Accept eventual consistency, data expires after TTL
- Explicit invalidation: Delete/update cache entry when DB changes
- Version numbers: Store version with cached data, reject stale versions
Eviction Policies
| Policy | Description | Best For |
|---|---|---|
| LRU | Evict least recently used | General-purpose |
| LFU | Evict least frequently used | Stable popularity |
| FIFO | Evict oldest entry | Cache with bounded lifetime |
| TTL | Evict expired entries | Fixed validity period |
| Random | Evict random entry | Minimal overhead |
Redis
Most popular caching system. Features:
- In-memory key-value store with data structures (strings, hashes, lists, sets, sorted sets)
- Persistence: RDB (snapshots) and AOF (append-only file)
- Replication: Leader-follower
- Sentinel: High availability (automatic failover)
- Cluster: Auto-sharding across nodes
import redis
r = redis.Redis(host='localhost', port=6379, decode_responses=True)
# Basic cache operations
r.set('user:1001', '{"name":"Alice","level":42}', ex=3600) # TTL=1 hour
user = r.get('user:1001')
# Cache-aside pattern
def get_user(user_id):
cache_key = f'user:{user_id}'
user = r.get(cache_key)
if user is None:
user = db.query(f"SELECT * FROM users WHERE id = {user_id}")
r.setex(cache_key, 3600, json.dumps(user))
return json.loads(user)Memcached
Simple, fast, distributed memory cache. Key-value only (no data structures). Distributed via consistent hashing on the client side.
Cache Topologies
| Topology | Description | When to Use |
|---|---|---|
| Local cache | Cache in application process (e.g., Guava, Caffeine) | Single-node, ultra-low latency |
| Distributed cache | External cache cluster (Redis, Memcached) | Multi-node, shared state |
| Multi-tier | Local + distributed combination | Extreme scale |
Cache Stampede (Thundering Herd)
When many requests miss cache simultaneously and all hit the DB. Solutions:
- Mutex: Only one request loads into cache, others wait
- Probabilistic early expiration: Randomize TTL expiry
- Backup cache: Layer 2 cache with longer TTL
Hot Keys
When a single key receives disproportionate traffic. Solutions:
- Replication: Distribute hot key reads across replicas
- Local caching: Read replicas have local caches
- Sharding by request: Distribute hot key load
Traveler's Notes
Caching is the most cost-effective performance optimization. Adding a cache layer can reduce database load by 90%+ with minimal effort. But cache invalidation is genuinely hard — you'll encounter stale data, cache stampedes, and hot keys. Start with simple TTL-based caching and add invalidation logic as needed. And remember: a cache is only as useful as its hit rate.