Course → Module 7: Real-World Case Studies I

Why URL Shorteners Matter

A URL shortener converts a long URL like https://example.com/articles/2026/04/very-long-slug-that-nobody-wants-to-type into something like https://short.ly/a3Bx9k. That is the entire product. The simplicity is deceptive.

Bitly processes over 10 billion clicks per month and creates roughly 256 million short links monthly. The read-to-write ratio is roughly 40:1 at Bitly's scale, and can reach 100:1 or higher for popular link shorteners. This asymmetry is the defining characteristic of the system and drives every architectural decision.

Key insight: The simplest system design problem teaches the most fundamental lesson: reads dominate writes in almost every system. A URL shortener makes this ratio visible and unavoidable.

Capacity Estimation

Before drawing any diagrams, pin down the numbers. Assume a service handling 100 million new URLs per month, with a 100:1 read-to-write ratio. Store URLs for 5 years.

| Metric | Calculation | Result |
| --- | --- | --- |
| New URLs / month | Given | 100M |
| New URLs / second | 100M / (30 × 86,400) | ~40 writes/sec |
| Redirects / second | 40 × 100 | ~4,000 reads/sec |
| Total URLs over 5 years | 100M × 60 months | 6 billion |
| Total storage (avg 500 bytes/URL) | 6B × 500 bytes | ~3 TB |
| Short key length needed | 62^7 ≈ 3.5 trillion | 7 chars (base62) |

A 7-character base62 key gives 3.5 trillion possible combinations. For 6 billion URLs, that is a collision probability so low you can ignore it with a proper generation strategy. But "so low you can ignore it" only holds if your generation strategy is actually proper. That is where the real design starts.
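These figures are simple arithmetic, so they are easy to sanity-check. A short script using the same assumptions (100M new URLs/month, 100:1 reads, 5-year retention, 500 bytes/URL) reproduces them:

```python
# Back-of-the-envelope capacity estimation for the table above.
new_urls_per_month = 100_000_000
read_write_ratio = 100
retention_months = 5 * 12
avg_bytes_per_url = 500

writes_per_sec = new_urls_per_month / (30 * 86_400)   # ~40
reads_per_sec = writes_per_sec * read_write_ratio      # ~4,000
total_urls = new_urls_per_month * retention_months     # 6 billion
storage_tb = total_urls * avg_bytes_per_url / 1e12     # ~3 TB
keyspace = 62 ** 7                                     # ~3.5 trillion keys

print(f"{writes_per_sec:.0f} writes/sec, {reads_per_sec:.0f} reads/sec")
print(f"{total_urls / 1e9:.0f}B URLs, {storage_tb:.0f} TB, "
      f"keyspace {keyspace / 1e12:.1f} trillion")
```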

Short Key Generation: Three Approaches

There are three common strategies, each with distinct tradeoffs.

1. Random generation with collision check

Generate a random 7-character string. Check the database. If it already exists, generate another. Simple and stateless. The problem: as the keyspace fills, collision rate increases. At 6 billion keys out of 3.5 trillion, the probability of a collision on any single attempt is about 0.17%. That means roughly 1 in 600 writes needs a retry. Manageable, but not free.
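A minimal sketch of this strategy, with an in-memory set standing in for the database's uniqueness check (a real system would use a conditional insert instead):

```python
import secrets
import string

ALPHABET = string.digits + string.ascii_letters  # base62: 0-9, a-z, A-Z

def random_key(existing: set, length: int = 7) -> str:
    """Generate a random base62 key, retrying on collision.

    `existing` stands in for the database; at 6B keys in a 3.5T
    keyspace, roughly 1 in 600 calls takes the retry branch.
    """
    while True:
        key = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if key not in existing:
            existing.add(key)
            return key
```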

2. Hash-based (MD5/SHA256 truncation)

Hash the long URL. Take the first 7 characters of the base62-encoded hash. Deterministic: the same input always produces the same output. The problem: different URLs can produce the same 7-character prefix. You still need a collision check, and you have lost the randomness that distributes keys evenly.
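A sketch of the hash approach using SHA-256; the `salt` parameter is an illustrative stand-in for the "append something and re-hash" retry taken on a prefix collision:

```python
import hashlib

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62(n: int) -> str:
    """Encode a non-negative integer in base62."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, r = divmod(n, 62)
        out.append(ALPHABET[r])
    return "".join(reversed(out))

def hash_key(long_url: str, salt: str = "") -> str:
    """First 7 base62 chars of the URL's SHA-256 digest.

    Deterministic: the same URL always yields the same key.
    On a collision, the caller retries with a different salt.
    """
    digest = hashlib.sha256((long_url + salt).encode()).digest()
    return base62(int.from_bytes(digest, "big"))[:7]
```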

3. Counter-based with base62 encoding

Use a global auto-incrementing counter. Encode the counter value in base62. Counter 1 becomes 0000001. Counter 62 becomes 0000010. No collisions by definition. The problem: sequential keys are predictable. Anyone can guess the next URL. And you need a distributed counter that does not duplicate values across servers.
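Base62 encoding of a counter is a few lines. This sketch zero-pads to 7 characters, so counter 1 maps to 0000001 and counter 62 to 0000010, matching the examples above:

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode62(n: int, width: int = 7) -> str:
    """Base62-encode a counter value, zero-padded to a fixed width."""
    out = ""
    while n:
        n, r = divmod(n, 62)
        out = ALPHABET[r] + out
    return out.rjust(width, ALPHABET[0])
```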

The counter approach is the most reliable at scale. Predictability is solved by adding a simple shuffle or XOR step before encoding. The distributed counter problem is solved by pre-allocating ranges: Server A gets IDs 1-1,000,000. Server B gets 1,000,001-2,000,000. No coordination needed until a range is exhausted.
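The range pre-allocation idea can be sketched as follows. `RangeAllocator` stands in for whatever coordination store actually hands out ranges (a SQL row updated with an atomic increment, ZooKeeper, etc.); the class and method names are illustrative:

```python
import threading

class RangeAllocator:
    """Hands out non-overlapping ID ranges to app servers."""
    def __init__(self, range_size: int = 1_000_000):
        self.range_size = range_size
        self.next_start = 1
        self.lock = threading.Lock()

    def allocate(self) -> tuple[int, int]:
        with self.lock:
            start = self.next_start
            self.next_start += self.range_size
        return start, start + self.range_size - 1

class KeyServer:
    """App server that draws IDs from its pre-allocated range,
    contacting the allocator only when the range is exhausted."""
    def __init__(self, allocator: RangeAllocator):
        self.allocator = allocator
        self.lo, self.hi = allocator.allocate()

    def next_id(self) -> int:
        if self.lo > self.hi:                      # range exhausted
            self.lo, self.hi = self.allocator.allocate()
        id_, self.lo = self.lo, self.lo + 1
        return id_
```

Server A draws 1-1,000,000, Server B draws 1,000,001-2,000,000, and neither ever produces a duplicate.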

```mermaid
flowchart TD
    A[Generate Short Key] --> B{Which Strategy?}
    B -->|Random| C[Generate random 7-char string]
    C --> D{Exists in DB?}
    D -->|Yes| C
    D -->|No| E[Store mapping]
    B -->|Hash| F[Hash long URL]
    F --> G[Take first 7 chars base62]
    G --> H{Exists in DB?}
    H -->|Yes| I[Append counter, re-hash]
    H -->|No| E
    B -->|Counter| J[Get next ID from counter service]
    J --> K[Base62 encode]
    K --> E
```

High-Level Design

The system has two flows: URL creation and URL redirection. Both go through a load balancer to a stateless application layer, which talks to a cache and a database.

```mermaid
flowchart LR
    subgraph "Write Path"
        C1[Client] -->|POST /shorten| LB1[Load Balancer]
        LB1 --> APP1[App Server]
        APP1 --> KG[Key Generator]
        APP1 --> DB[(Database)]
    end
    subgraph "Read Path"
        C2[Client] -->|GET /a3Bx9k| LB2[Load Balancer]
        LB2 --> APP2[App Server]
        APP2 --> CACHE[(Cache)]
        CACHE -->|miss| DB2[(Database)]
        APP2 -->|301/302| C2
    end
```

The database is a key-value store. The key is the short code. The value is the original URL plus metadata (creation date, expiration, user ID, click count). DynamoDB, Cassandra, or even a sharded MySQL table all work here. The access pattern is simple: point lookups by key. No range queries. No joins. This is where key-value stores shine.
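As an illustration, a stored record might look like the following. The field names are assumptions for the sketch, not a prescribed schema:

```python
# Illustrative record shape for the key-value store.
# The short code is the lookup key; everything else is metadata.
record = {
    "short_key": "a3Bx9k",
    "long_url": "https://example.com/articles/2026/04/very-long-slug",
    "created_at": "2026-04-01T12:00:00Z",
    "expires_at": "2031-04-01T12:00:00Z",  # 5-year retention
    "user_id": "u_12345",
    "click_count": 0,
}
```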

Caching: The 80/20 Rule in Action

At a 100:1 read-to-write ratio, caching is not optional. It is the primary scaling mechanism. A small percentage of shortened URLs receive a disproportionate share of traffic. A viral tweet with a Bitly link might get millions of clicks in hours, while most links are clicked fewer than 10 times total.

Place a Redis or Memcached layer between the application servers and the database. Use an LRU (Least Recently Used) eviction policy. With even a modest cache, you can absorb 80-90% of read traffic without hitting the database.

Cache sizing math: if 20% of URLs account for 80% of traffic, and we have 6 billion total URLs, we need to cache roughly 1.2 billion entries. At 500 bytes each, that is 600 GB. Large, but well within what a Redis cluster can handle.
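The sizing arithmetic, written out (the 20%/80% split is the assumption stated above, not a measured figure):

```python
total_urls = 6_000_000_000
hot_fraction = 0.20        # assume 20% of URLs draw 80% of traffic
bytes_per_entry = 500

cache_entries = int(total_urls * hot_fraction)   # 1.2 billion entries
cache_gb = cache_entries * bytes_per_entry / 1e9 # 600 GB

print(f"{cache_entries / 1e9:.1f}B entries, {cache_gb:.0f} GB")
```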

The 301 vs. 302 Decision

When a user clicks a short URL, the server responds with an HTTP redirect. The choice between 301 (Moved Permanently) and 302 (Found / Temporary Redirect) has real consequences.

A 301 tells the browser to cache the redirect. Next time the user clicks the same short URL, the browser redirects directly without contacting your server. This reduces server load but means you lose visibility into click analytics. You cannot count how many times the link was clicked because repeat visits never reach your servers.

A 302 tells the browser this redirect is temporary. Every click hits your server. You get accurate analytics, but at the cost of higher server load.

Bitly uses 301 for performance. Analytics are supplemented by JavaScript tracking on the destination page. Most URL shorteners with analytics features use 302 to ensure every click is counted.
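A toy redirect handler makes the tradeoff concrete. This sketch uses Python's built-in `http.server`; the URL mapping and the `TRACK_CLICKS` flag are illustrative, and a real service would resolve the key via cache and database:

```python
from http.server import BaseHTTPRequestHandler

URLS = {"/a3Bx9k": "https://example.com/articles/2026/04/long-slug"}

TRACK_CLICKS = True  # choose 302 when analytics matter

def redirect_status(track_clicks: bool) -> int:
    """302 keeps every click on our servers (accurate analytics);
    301 lets browsers cache the hop (lower load, lost repeat visits)."""
    return 302 if track_clicks else 301

class Redirector(BaseHTTPRequestHandler):
    def do_GET(self):
        target = URLS.get(self.path)
        if target is None:
            self.send_error(404)
            return
        self.send_response(redirect_status(TRACK_CLICKS))
        self.send_header("Location", target)
        self.end_headers()
```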

Read vs. Write Traffic Distribution

For every URL created, the system handles dozens to hundreds of redirect requests. This skew is why caching and read optimization dominate the architecture.

Create and Redirect Flows

The two core operations in detail:

```mermaid
sequenceDiagram
    participant C as Client
    participant LB as Load Balancer
    participant App as App Server
    participant KG as Key Generator
    participant DB as Database
    participant Cache as Cache (Redis)
    Note over C,Cache: URL Creation Flow
    C->>LB: POST /api/shorten {long_url}
    LB->>App: Forward request
    App->>KG: Request next short key
    KG-->>App: "a3Bx9k"
    App->>DB: INSERT (a3Bx9k, long_url, metadata)
    DB-->>App: OK
    App->>Cache: SET a3Bx9k = long_url
    App-->>C: 201 {short_url: "https://short.ly/a3Bx9k"}
    Note over C,Cache: URL Redirect Flow
    C->>LB: GET /a3Bx9k
    LB->>App: Forward request
    App->>Cache: GET a3Bx9k
    Cache-->>App: long_url (cache hit)
    App-->>C: 302 Redirect to long_url
```

On a cache miss during redirect, the app server falls through to the database, fetches the mapping, populates the cache, and then redirects. The next request for the same key hits the cache.

Assignment

Design a URL shortener that handles 500 million new URLs per month and stores them for 10 years.

  1. Redo the capacity estimation table with these new numbers. How many characters do you need in your short key? How much total storage?
  2. Choose a key generation strategy. Justify why it handles your scale without collisions.
  3. One of your short URLs goes viral and receives 1 million hits per day. Walk through exactly what happens at each layer (load balancer, app server, cache, database). Where is the bottleneck?
  4. Draw a complete high-level design diagram showing all components, including the cache layer, database replicas, and the key generation service.