Caching Strategies: Read Path
Session 3.8 · ~5 min read
Why Cache the Read Path
Most applications are read-heavy. A typical e-commerce product page is read 5,000 times for every update. A user profile is read hundreds of times between edits. If every read hits the database, you are doing redundant work. The data has not changed, but the database parses the query, searches the index, fetches the row, and serializes the result every single time.
A cache stores the result of a previous computation so it can be reused. For read-heavy workloads, caching reduces database load, cuts response time from milliseconds to microseconds, and lets you serve more traffic without scaling the database.
There are two primary patterns for caching on the read path: cache-aside and read-through. They differ in who is responsible for populating the cache.
Cache-Aside (Lazy Loading)
In cache-aside, the application manages the cache directly. The cache sits beside the database, not in front of it. The application checks the cache first. On a hit, it returns the cached data. On a miss, it queries the database, writes the result to the cache, and returns it.
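The check-miss-populate flow can be sketched in a few lines. This is a minimal illustration, assuming an in-process dict stands in for a real cache such as Redis; `fetch_user_from_db`, the key format, and the 300-second TTL are all made up for the example.

```python
import time

cache = {}  # key -> (value, expires_at); stand-in for Redis/Memcached
TTL_SECONDS = 300

def fetch_user_from_db(user_id):
    # Placeholder for a real database query.
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    key = f"user:{user_id}"
    entry = cache.get(key)
    if entry is not None:
        value, expires_at = entry
        if time.time() < expires_at:
            return value          # cache hit: no database work
        del cache[key]            # entry expired; fall through to a miss
    value = fetch_user_from_db(user_id)  # cache miss: query the database
    cache[key] = (value, time.time() + TTL_SECONDS)  # populate for next time
    return value
```

Note that all three steps (check, load, populate) live in application code; the cache itself is a dumb key-value store.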
Cache-aside puts the application in control. The application decides what to cache, when to cache it, and when to evict it. The cache and database are independent systems with no awareness of each other.
Cache-aside has several strengths. The cache only contains data that has actually been requested, so you do not waste memory caching things nobody reads. The application has full control over cache keys, TTLs, and invalidation logic. And because the cache and database are independent, a cache failure does not bring down the application. Reads fall back to the database, which is slower but functional.
The weakness is the cold start problem. When the cache is empty (after a restart or deployment), every request is a cache miss and all traffic hits the database at once. The same failure mode occurs per key: when a popular entry expires, every in-flight request for it misses and queries the database simultaneously. This is the thundering herd problem.
Read-Through
In read-through, the cache itself is responsible for loading data from the database. The application only talks to the cache. If the cache does not have the data, it fetches it from the database, stores it, and returns it. The application never queries the database directly for cached entities.
Read-through delegates data loading to the cache layer. The application treats the cache as the only data source. Cache misses are handled internally by the cache provider.
Read-through simplifies the application code. The application does not contain any cache-miss logic or database fallback code. The cache provider handles everything. This is particularly useful when multiple services need the same caching behavior. Instead of implementing cache-miss handling in each service, the cache provider handles it once.
Read-through also avoids one specific problem with cache-aside: duplicate database queries on concurrent misses. In cache-aside, if 100 requests arrive for the same uncached key simultaneously, all 100 may query the database before any of them writes to the cache. Read-through implementations typically use internal locking: the first request triggers a database fetch, and subsequent requests for the same key wait for the result instead of making redundant database calls.
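The per-key locking described above is sometimes called single-flight. A minimal sketch, assuming a loader callback supplied by the caller; the `ReadThroughCache` class and `load_from_db` are illustrative names, and real providers implement this machinery internally.

```python
import threading

class ReadThroughCache:
    """Read-through cache with per-key locking: N concurrent misses
    for the same key result in exactly one loader call."""

    def __init__(self, loader):
        self._loader = loader     # callback that loads a key from the DB
        self._data = {}
        self._locks = {}
        self._meta_lock = threading.Lock()

    def _key_lock(self, key):
        with self._meta_lock:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key):
        if key in self._data:
            return self._data[key]            # fast path: cache hit
        with self._key_lock(key):
            if key not in self._data:         # re-check after acquiring lock
                self._data[key] = self._loader(key)
            return self._data[key]            # other waiters reuse the result

calls = []
def load_from_db(key):
    calls.append(key)                         # count DB hits for illustration
    return f"row-for-{key}"

cache = ReadThroughCache(load_from_db)
```

The double-check inside the lock is the essential part: threads that queued behind the first miss find the value already loaded and return it without touching the database.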
Refresh-Ahead
Refresh-ahead is a proactive strategy that reloads cached data before it expires. The cache tracks access patterns and pre-fetches entries that are likely to be requested again. If a cached item with a 300-second TTL is accessed at the 250-second mark, the cache proactively refreshes it in the background so it never actually expires.
This eliminates the latency spike that occurs when a popular cache entry expires and the next request must wait for a database query. The downside is that it can waste resources refreshing data that nobody will request again.
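The refresh-on-late-access behavior can be sketched as follows. This is a simplified illustration, assuming a single refresh threshold at 80% of the TTL; the names `get`, `load_from_db`, and `REFRESH_AT` are made up, and a production implementation would also deduplicate concurrent refreshes.

```python
import threading
import time

TTL = 300.0        # seconds an entry stays valid
REFRESH_AT = 0.8   # refresh if accessed in the last 20% of the TTL

cache = {}  # key -> (value, loaded_at)

def load_from_db(key):
    return f"fresh-{key}"  # placeholder for a real database query

def get(key):
    now = time.time()
    entry = cache.get(key)
    if entry is None or now - entry[1] >= TTL:
        value = load_from_db(key)      # hard miss: load synchronously
        cache[key] = (value, now)
        return value
    value, loaded_at = entry
    if now - loaded_at >= REFRESH_AT * TTL:
        # Entry is near expiry and still being read: reload it in the
        # background so no caller ever waits on the refresh.
        def _refresh():
            cache[key] = (load_from_db(key), time.time())
        threading.Thread(target=_refresh, daemon=True).start()
    return value  # serve the current value immediately
```

The caller always gets an answer at cache speed; the database query happens off the request path for any entry that is accessed late in its TTL window.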
Strategy Comparison
| Dimension | Cache-Aside | Read-Through | Refresh-Ahead |
|---|---|---|---|
| Who loads the cache | Application code | Cache provider | Cache provider (proactive) |
| Cache miss latency | Full DB query time | Full DB query time (first request) | Near zero (pre-fetched) |
| Thundering herd risk | High (N concurrent misses = N DB queries) | Low (internal locking) | Very low (entries rarely expire) |
| Application complexity | Higher (miss logic in app) | Lower (cache handles misses) | Lowest (transparent) |
| Wasted cache memory | Low (only requested data cached) | Low (only requested data cached) | Higher (pre-fetches may go unused) |
| Cache provider requirements | Any (Redis, Memcached) | Must support data loader callbacks | Must support TTL tracking + async refresh |
| Staleness control | TTL or explicit invalidation | TTL or explicit invalidation | Minimal staleness (continuous refresh) |
| Best for | General purpose, read-heavy, simple setups | Multi-service architectures, consistent patterns | Hot keys with strict latency requirements |
TTL Strategy
Time-to-live (TTL) determines how long a cached entry remains valid. Set it too short, and you get frequent cache misses that negate the benefit. Set it too long, and users see stale data.
The right TTL depends on two things: how often the underlying data changes and how much staleness your users can tolerate. A product catalog that updates twice a day can tolerate a 5-minute TTL. A stock price that changes every second needs a TTL under 1 second, or you should not cache it at all.
A practical approach: start with the ratio of reads to writes.
- 10,000:1 read/write ratio (product pages): TTL of 5-15 minutes. The data rarely changes, and even stale data is acceptable for a few minutes.
- 100:1 read/write ratio (user profiles): TTL of 1-5 minutes with explicit invalidation on write.
- 10:1 read/write ratio (inventory counts): TTL of 10-30 seconds, or skip caching and use database read replicas instead.
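The trade-off behind these numbers can be put in rough figures. A back-of-envelope sketch, assuming steady read traffic so that each TTL window for a key incurs exactly one miss (the reload); `ttl_tradeoff` is a made-up helper, not a library function.

```python
def ttl_tradeoff(reads_per_sec, ttl_seconds):
    """Estimate per-key hit ratio and worst-case staleness for a TTL.

    Assumes reads arrive steadily: each TTL window of T seconds sees
    r * T reads, of which exactly one (the first) is a miss.
    """
    reads_per_window = reads_per_sec * ttl_seconds
    if reads_per_window <= 0:
        hit_ratio = 0.0
    else:
        hit_ratio = (reads_per_window - 1) / reads_per_window
    return {"hit_ratio": hit_ratio, "worst_case_staleness_s": ttl_seconds}

# A product page read 10 times/sec with a 300 s TTL: hit ratio is about
# 0.9997, and a price change can be invisible for up to 300 seconds.
print(ttl_tradeoff(10, 300))
```

The asymmetry is why long TTLs pay off for hot keys: hit ratio approaches 100% quickly as `reads_per_window` grows, while the cost (worst-case staleness) grows only linearly with the TTL.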
Cache Hit Ratio
The cache hit ratio measures what percentage of requests are served from the cache. Production systems typically target 90-99%. Below roughly 80%, the cache is usually not providing enough benefit to justify its operational cost; above 95%, you are in excellent shape.
Hit ratio depends on the working set size relative to cache capacity, the access pattern (power-law distributions cache well, uniform distributions do not), and TTL settings. Monitor this metric continuously. A sudden drop in hit ratio usually indicates either a change in traffic patterns or a cache configuration problem.
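Tracking the metric is simple counter arithmetic. A minimal sketch with a hypothetical `HitRatioTracker` class; in practice you would read the counters the cache server already keeps (for example, Redis's `INFO stats` output exposes `keyspace_hits` and `keyspace_misses`).

```python
class HitRatioTracker:
    """Count hits and misses and report the ratio for monitoring."""

    def __init__(self):
        self.hits = 0
        self.misses = 0

    def record(self, hit):
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    def ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

stats = HitRatioTracker()
for hit in [True] * 95 + [False] * 5:   # simulate 95 hits, 5 misses
    stats.record(hit)
print(f"hit ratio: {stats.ratio():.2%}")  # prints "hit ratio: 95.00%"
```

Export this ratio to your monitoring system and alert on sudden drops, not just on an absolute threshold: a fall from 98% to 90% quintuples the load on the database even though 90% still looks healthy in isolation.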
Systems Thinking Lens
Caching introduces a balancing loop: as database load increases, you add caching, which reduces database load. But caching also introduces a new reinforcing loop: as the cache absorbs more traffic, the application becomes dependent on it. A cache failure that was tolerable at 50% hit ratio becomes catastrophic at 99% hit ratio, because the database has been sized for 1% of the traffic, not 100%.
The leverage point is not the cache itself but the invalidation strategy. A cache that serves stale data is worse than no cache, because it creates bugs that are intermittent and hard to reproduce. Design the invalidation path with the same rigor as the caching path.
Further Reading
- AWS Whitepapers, Database Caching Strategies Using Redis. Official AWS guide covering cache-aside, read-through, and write-through patterns with Redis.
- CodeAhoy, Caching Strategies and How to Choose the Right One. Practical comparison of all major caching strategies with decision criteria.
- NCache, Read-Through, Write-Through, Write-Behind Caching. Detailed explanation of cache provider-managed patterns with code examples.
- System Overflow, Cache-Aside Pattern. Deep dive into cache-aside implementation including thundering herd mitigation.
Assignment
You are building an e-commerce product page that serves 10,000 reads per second and receives approximately 2 writes per second (price changes, description edits). The product catalog has 50,000 active products. Average product data size is 2 KB.
- Which read caching strategy do you choose: cache-aside, read-through, or refresh-ahead? Justify your choice based on the read/write ratio and workload characteristics.
- What TTL do you set, and why? Consider the write frequency, acceptable staleness, and the thundering herd risk when a popular product's cache entry expires.
- Calculate the cache memory needed if 80% of traffic goes to 20% of products (power-law distribution). How much RAM does your Redis instance need?
- A flash sale starts and one product suddenly gets 50,000 reads per second. Its cache entry expires at the worst possible moment. What happens? How do you prevent it?