Social Media Feed
Session 7.2 · ~5 min read
The Problem: Personalized Timelines at Scale
When a user opens Twitter (now X), they see a feed of tweets from people they follow, ranked and ordered. This looks simple. It is not. Twitter handled roughly 300,000 timeline read requests per second while ingesting around 4,600 new tweets per second on average (12,000 at peak). The system must assemble a personalized timeline for each of those 300K requests in under 200 milliseconds.
The core design question is deceptively simple: when User A posts a tweet, how does it appear in User B's feed? There are two fundamental strategies, and Twitter has tried both.
Strategy 1: Fan-Out on Read (Pull Model)
When User B opens their timeline, the system queries all accounts B follows, fetches their recent tweets, merges them, ranks them, and returns the result. Every timeline request triggers a fresh computation.
This approach is simple and storage-efficient. You store each tweet once. But the read path is expensive. If User B follows 500 accounts, that is 500 lookups plus a merge-sort on every single request. At 300K requests per second, this does not scale.
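The pull model can be sketched as a k-way merge over the followed accounts' tweet streams. This is a minimal in-memory illustration: the dict-based stores and names (`follows`, `tweets_by_author`) are hypothetical stand-ins for the real services, not Twitter's actual API.

```python
from heapq import merge

# Hypothetical in-memory stand-ins for the follow graph and tweet store.
follows = {"user_b": ["alice", "carol", "dave"]}
tweets_by_author = {
    "alice": [(103, "alice", "third")],   # (tweet_id, author, text)
    "carol": [(101, "carol", "first")],
    "dave":  [(102, "dave", "second")],
}

def timeline_pull(user, limit=20):
    """Fan-out on read: at request time, query every followed account,
    then k-way merge their tweets newest-first (descending tweet ID)."""
    streams = [sorted(tweets_by_author.get(f, []), reverse=True)
               for f in follows[user]]
    merged = merge(*streams, reverse=True)  # lazy k-way merge
    return [t for _, t in zip(range(limit), merged)]

print(timeline_pull("user_b"))
# → [(103, 'alice', 'third'), (102, 'dave', 'second'), (101, 'carol', 'first')]
```

Note that the per-request cost is one lookup per followed account plus the merge, which is exactly why this path collapses at 300K requests per second.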
Strategy 2: Fan-Out on Write (Push Model)
When User A posts a tweet, the system immediately writes a copy of that tweet's ID into the timeline cache of every follower. User B's timeline is pre-computed and sitting in memory (Redis) before B even opens the app.
This is what Twitter actually adopted. Each user's home timeline is stored as a Redis list of tweet IDs, replicated across 3 machines in a datacenter. The read path becomes a single Redis lookup. Fast. Predictable. Easy to cache.
The cost is write amplification. User A posts one tweet. If A has 5,000 followers, that single tweet generates 5,000 timeline writes. Twitter reported that this amplification inflated their 4,600 tweets/sec into 345,000 timeline writes per second.
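The write amplification is visible in even a toy version of the push model. Here a plain dict of lists stands in for the per-user Redis lists; the names are illustrative, and `insert(0, ...)` plays the role of a Redis `LPUSH`.

```python
from collections import defaultdict

followers = {"user_a": ["b", "c", "d"]}  # hypothetical follower graph
timelines = defaultdict(list)            # stands in for per-user Redis lists

def post_tweet(author, tweet_id):
    """Fan-out on write: one post becomes N timeline writes, one per
    follower. The return value is the write amplification factor."""
    for follower in followers.get(author, []):
        timelines[follower].insert(0, tweet_id)  # LPUSH equivalent
    return len(followers.get(author, []))

writes = post_tweet("user_a", 42)
print(writes)  # 3 followers → 3 timeline writes for one tweet
```

Swap in 5,000 followers and the single call generates 5,000 writes, which is the amplification the paragraph above describes.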
The Celebrity Problem
Fan-out on write works beautifully until a celebrity posts. A user with 10 million followers generates 10 million timeline writes for a single tweet. At Twitter's scale, multiple celebrities tweet every second. The fan-out queue backs up. Timelines go stale. Users complain their feed is delayed.
This is not a theoretical problem. Twitter engineers confirmed that tweets from high-follower accounts could take more than 5 seconds to propagate to all followers. For breaking news from a journalist with millions of followers, that delay is unacceptable.
| Strategy | Read Cost | Write Cost | Latency | Storage | Best For |
|---|---|---|---|---|---|
| Fan-out on Read | High (N queries per request) | Low (1 write) | High read latency | Low | Users with many followers |
| Fan-out on Write | Low (1 cache read) | High (N writes per tweet) | Low read latency | High | Users with few followers |
| Hybrid | Medium | Medium | Low read latency | Medium | Production systems at scale |
Key insight: Fan-out on write trades storage for latency. Fan-out on read trades latency for storage. The celebrity problem forces you to do both.
The Hybrid Approach
Twitter's actual production system uses a hybrid. Regular users (say, under 10,000 followers) use fan-out on write. Their tweets are pushed to follower timelines immediately. High-follower accounts are flagged. Their tweets are not fanned out. Instead, when User B requests their timeline, the system reads B's pre-computed timeline from Redis, then fetches recent tweets from any high-follower accounts B follows, merges them in, and returns the result.
The threshold is not fixed. Twitter uses a dynamic cutoff based on follower count and tweet frequency. An account with 5 million followers who tweets once a day is treated differently than one with 500,000 followers who tweets 50 times a day. The fan-out cost depends on both.
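The hybrid read path is the pull model grafted onto the push model's pre-computed timeline. This sketch uses a fixed cutoff and hypothetical in-memory stores (`precomputed`, `celebrity_tweets`) purely for illustration; as noted above, the real threshold is dynamic.

```python
from heapq import merge

CELEBRITY_THRESHOLD = 10_000  # hypothetical fixed cutoff, for illustration

precomputed = {"user_b": [(200, "friend"), (150, "friend2")]}  # Redis sketch
celebrity_tweets = {"megastar": [(180, "megastar")]}           # not fanned out
follows_celebs = {"user_b": ["megastar"]}  # celebrity accounts B follows

def hybrid_timeline(user, limit=20):
    """Read the pre-computed (fanned-out) timeline, then merge in recent
    tweets from followed celebrity accounts at read time."""
    base = precomputed.get(user, [])  # already sorted newest-first
    celeb_streams = [sorted(celebrity_tweets.get(c, []), reverse=True)
                     for c in follows_celebs.get(user, [])]
    merged = merge(base, *celeb_streams, reverse=True)
    return [t for _, t in zip(range(limit), merged)]

print(hybrid_timeline("user_b"))
# → [(200, 'friend'), (180, 'megastar'), (150, 'friend2')]
```

The celebrity tweet is stored once and merged on demand, so a 10-million-follower account costs one write instead of 10 million.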
Cursor-Based Pagination
Timelines are paginated. The naive approach is offset-based: "give me tweets 20-40." This breaks when new tweets arrive between page requests: each new tweet at the top shifts every offset by one, so the tweet that was at position 20 is at position 21 by the time the user scrolls, and items get duplicated or skipped.
Cursor-based pagination solves this. Instead of "give me page 2," the client says "give me 20 tweets older than tweet ID 1892347." The cursor is a pointer to the last item seen. New tweets arriving at the top of the timeline do not shift the cursor position. The user sees a consistent stream regardless of how fast new content arrives.
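A minimal cursor sketch over descending tweet IDs (the data and function names here are illustrative):

```python
# The full timeline as descending tweet IDs: 1000 down to 901 (sketch data).
TIMELINE = list(range(1000, 900, -1))

def page(cursor=None, limit=20):
    """Cursor-based pagination: return tweets strictly older than `cursor`
    plus the next cursor. New tweets prepended to TIMELINE cannot shift
    the result, because the cursor is an ID, not a position."""
    items = [t for t in TIMELINE if cursor is None or t < cursor][:limit]
    next_cursor = items[-1] if items else None
    return items, next_cursor

first, cur = page()          # IDs 1000..981, cursor = 981
second, _ = page(cursor=cur) # IDs 980..961: no overlap, no gaps
print(first[-1], second[0])  # → 981 980
```

Even if ten new tweets with IDs above 1000 land between the two calls, the second page is unchanged, which is exactly the consistency offset pagination cannot provide.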
Storage Architecture
The timeline ordering lives in Redis for fast reads, but the tweets themselves are stored in a persistent datastore. Twitter uses a MySQL cluster (sharded by user ID) for tweet storage. The Redis timeline contains only tweet IDs, not full tweet objects. When the client requests a timeline, the system reads the list of IDs from Redis, then hydrates them by fetching the full tweet data from the persistent store (with another cache layer in front).
This separation matters. Redis holds the ordering and personalization. The persistent store holds the source of truth. If a Redis node fails, timelines can be recomputed from the tweet store. Slow, but recoverable.
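The two-tier read can be sketched like this, with dicts standing in for Redis and the MySQL tweet store (all names here are illustrative):

```python
# Sketch of the two-tier read: the cache holds ordered IDs, the
# persistent store holds full tweet objects.
redis_timeline = {"user_b": [103, 101, 102]}  # ordering lives here
tweet_store = {                               # source of truth lives here
    101: {"author": "carol", "text": "first"},
    102: {"author": "dave", "text": "second"},
    103: {"author": "alice", "text": "third"},
}

def hydrate_timeline(user):
    """Read IDs from the cache, then 'hydrate' them into full objects
    from the source-of-truth store, preserving the cached order."""
    ids = redis_timeline.get(user, [])
    return [dict(id=i, **tweet_store[i]) for i in ids if i in tweet_store]

print([t["id"] for t in hydrate_timeline("user_b")])  # → [103, 101, 102]
```

Skipping IDs missing from the store also shows why recovery works: if the cached ordering is lost, it can be rebuilt from the store, but a stale cached ID never breaks the read path.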
Further Reading
- The Architecture Twitter Uses to Deal with 150M Active Users, High Scalability
- Real-time Tweet Delivery Architecture at Twitter, HdM Stuttgart CS Blog
- Twitter's Tough Architectural Decision, Denny Sam
- System Design Primer: Twitter, Donne Martin
Assignment
A user with 10 million followers posts a tweet. If you used pure fan-out on write, that single tweet would generate 10 million timeline writes.
- Calculate how long this fan-out would take if each write takes 1ms and you have 100 fan-out workers running in parallel.
- Design a hybrid system. Define your threshold for "celebrity" accounts. What data do you use to set this threshold?
- Draw the sequence diagram for a timeline request where User B follows 3 regular users and 2 celebrities. Show every system interaction.
- Your Redis cluster holding timelines loses a node. 5% of users have lost their pre-computed timelines. How do you recover? What is the user experience during recovery?