Ticketing System
Session 8.2 · ~5 min read
The Problem
A popular artist announces a concert. 50,000 seats. Within seconds of tickets going on sale, 500,000 users hit the "Buy" button simultaneously. The system must ensure that no seat is sold twice, that the experience feels fair, and that payment processing does not corrupt inventory state. Failure at any point means overselling, angry customers, and refund chaos.
Ticketing is fundamentally different from e-commerce. In e-commerce, if one item sells out, there are usually substitutes. In ticketing, every seat is unique. Section 103, Row F, Seat 12 either belongs to one person or it does not. There is no partial fulfillment.
Key insight: Ticketing is a fairness problem disguised as a scaling problem. The hardest part is not handling 500K requests per second. It is ensuring that the person who clicked first actually gets the seat, and that no seat is promised to two people at once.
High-Level Architecture
```mermaid
graph TD
    U[User] --> Q[Virtual Waiting Queue]
    Q --> API[Booking API]
    API --> SL[Seat Lock Service<br>Redis]
    API --> INV[Inventory Service<br>In-Memory Cache]
    API --> PAY[Payment Service]
    PAY --> DB[(Primary Database)]
    SL --> DB
    INV --> DB
    DB --> NOTIFY[Notification Service]
    NOTIFY --> U
```
The architecture separates concerns into distinct services. The virtual waiting queue controls admission rate. The seat lock service prevents double-booking. The inventory service tracks availability in memory for fast reads. The payment service handles the financial transaction. The notification service confirms or rejects.
Seat Locking: Optimistic vs. Pessimistic
When a user selects a seat, the system must temporarily reserve it while they complete payment. Two strategies exist for this lock.
| Strategy | How It Works | Best For | Risk |
|---|---|---|---|
| Pessimistic Locking | Lock the seat row in the database immediately when selected. No other transaction can read or modify it until released. | Low concurrency, strong consistency requirements | Lock contention under high load; database becomes bottleneck |
| Optimistic Locking | Allow multiple users to select the same seat. At commit time, check a version number. If it changed, the commit fails and the user must retry. | High read volume, lower write contention | Users see "available" seats that are already taken; poor UX during flash sales |
| Distributed Lock (Redis) | Use Redis SET with NX (set-if-not-exists) and TTL. A Lua script atomically checks availability and sets the lock in one operation. | Flash sales, extreme concurrency | Requires TTL tuning; a lock can expire mid-payment, handing the seat to another user while the first user's charge is in flight |
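To make the optimistic-locking row concrete, here is a minimal sketch using SQLite: the commit is a conditional UPDATE that only succeeds if the version read earlier is still current. The table and column names are illustrative, not a prescribed schema.

```python
import sqlite3

# In-memory database with one seat row; schema is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seats (id TEXT PRIMARY KEY, owner TEXT, version INTEGER)")
conn.execute("INSERT INTO seats VALUES ('103-F-12', NULL, 0)")

def try_book(conn, seat_id, user, expected_version):
    """Optimistic commit: succeeds only if the version we read is still current."""
    cur = conn.execute(
        "UPDATE seats SET owner = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (user, seat_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows updated means someone else committed first

# Both users read version 0 and believe the seat is available.
first = try_book(conn, "103-F-12", "alice", 0)   # succeeds, bumps version to 1
second = try_book(conn, "103-F-12", "bob", 0)    # fails: version is now 1
```

The second user's failed commit is exactly the "seat looked available but was already taken" experience that makes this strategy painful during flash sales.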
For flash sale scenarios, the Redis-based distributed lock is the standard choice. The lock is set with a TTL (typically 10 minutes). If the user completes payment within that window, the lock converts to a confirmed booking. If the TTL expires, the seat releases back to inventory automatically.
The critical detail is atomicity. Checking "is this seat available?" and "lock it for me" must happen in a single, uninterruptible operation. Without atomicity, two users can both see the seat as available, both attempt to lock it, and one ends up with a phantom reservation. Redis Lua scripts solve this by executing the check-and-set as one atomic unit on the server side.
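The check-and-set can be sketched as follows, simulated in memory rather than against a real Redis instance; in production this would be `SET key value NX EX ttl` or an equivalent Lua script, with Redis's single-threaded execution providing the atomicity. Class names, seat IDs, and the 600-second TTL are illustrative.

```python
import time

class SeatLockStore:
    """In-memory stand-in for Redis SET ... NX EX: one atomic check-and-set per seat."""

    def __init__(self):
        self._locks = {}  # seat_id -> (owner, expires_at)

    def acquire(self, seat_id, user_id, ttl_seconds=600, now=None):
        now = time.time() if now is None else now
        entry = self._locks.get(seat_id)
        if entry is not None and entry[1] > now:
            return False  # lock is held and has not expired
        # Check and set happen as one step; Redis guarantees this server-side.
        self._locks[seat_id] = (user_id, now + ttl_seconds)
        return True

store = SeatLockStore()
got_a = store.acquire("103-F-12", "alice", ttl_seconds=600, now=0)    # first click wins
got_b = store.acquire("103-F-12", "bob", ttl_seconds=600, now=0.2)    # 200ms later: denied
got_c = store.acquire("103-F-12", "carol", ttl_seconds=600, now=601)  # after TTL: seat reopens
```

Because the check and the set are a single operation, there is no window in which two users can both observe "available" and both succeed.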
Flash Sale Queue Architecture
When 500,000 users click "Buy" simultaneously, letting all of them hit the booking API directly would overwhelm every downstream service. The solution is a virtual waiting queue that controls admission.
The queue serves two purposes. First, it acts as a buffer, absorbing the initial traffic spike without passing it downstream. Second, it enforces fairness by processing users in the order they arrived. The admission controller releases users in batches sized to match the booking API's throughput capacity.
Users in the queue see their position and an estimated wait time. This transparency is important. People tolerate waiting when they can see progress. They do not tolerate a spinning wheel with no information.
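The admission flow above can be sketched as a FIFO queue with batch release. This is a minimal in-memory sketch; the batch size and class names are illustrative, and a real deployment would back the queue with durable shared state rather than a process-local deque.

```python
from collections import deque

class VirtualQueue:
    """FIFO waiting room: admits users in arrival order, in fixed-size batches."""

    def __init__(self, batch_size):
        self.batch_size = batch_size  # sized to the booking API's throughput
        self.waiting = deque()

    def join(self, user_id):
        self.waiting.append(user_id)
        return len(self.waiting)  # queue position, shown to the user

    def admit_batch(self):
        # Release the next batch downstream; everyone else keeps waiting.
        count = min(self.batch_size, len(self.waiting))
        return [self.waiting.popleft() for _ in range(count)]

q = VirtualQueue(batch_size=2)
for user in ["u1", "u2", "u3", "u4", "u5"]:
    q.join(user)
first_batch = q.admit_batch()   # arrival order preserved: u1 and u2 go first
remaining = len(q.waiting)      # three users still waiting
```

The position returned by `join` is what powers the "you are number N in line" display; the batch size is the knob that keeps downstream load at a level the booking API can absorb.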
In-Memory Inventory
The primary database is the source of truth for seat ownership. But querying it for every "show me available seats" request is too slow under flash-sale load. The solution is an in-memory inventory cache, typically Redis, that mirrors seat availability.
When a seat is locked, both the Redis lock and the inventory cache are updated. When a lock expires, the cache is updated to show the seat as available again. The database is updated only on confirmed booking, not on lock acquisition. This reduces write pressure on the database to only successful transactions.
The trade-off is eventual consistency. The cache may briefly show a seat as available when it has just been locked, or show it as locked when the lock just expired. For flash sales, this is acceptable. The lock service is the true gatekeeper, not the inventory display.
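The cache/database split described above can be sketched as follows, with both stores as plain dictionaries; names and states are illustrative. The key property is that every state change touches the cache, but only a confirmed booking touches the database.

```python
class Inventory:
    """Sketch of the cache/DB split: cache sees every transition, DB only confirmed sales."""

    def __init__(self, seat_ids):
        self.cache = {s: "AVAILABLE" for s in seat_ids}  # fast, eventually consistent view
        self.db = {}  # source of truth: written only on confirmed bookings

    def on_lock(self, seat_id):
        self.cache[seat_id] = "LOCKED"       # cache only; no database write

    def on_lock_expired(self, seat_id):
        self.cache[seat_id] = "AVAILABLE"    # seat returns to the pool

    def on_confirmed(self, seat_id, user_id):
        self.db[seat_id] = user_id           # the single database write per sale
        self.cache[seat_id] = "SOLD"

inv = Inventory(["103-F-12", "103-F-13"])
inv.on_lock("103-F-12")
inv.on_lock_expired("103-F-12")        # payment never completed; no DB write happened
inv.on_lock("103-F-13")
inv.on_confirmed("103-F-13", "alice")  # the only transition that reaches the DB
```

Notice that the abandoned lock on the first seat generated zero database writes, which is exactly how write pressure stays proportional to successful sales rather than to traffic.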
Payment Idempotency
Network failures during payment create a dangerous scenario. The user's payment goes through, but the confirmation response is lost. The user clicks "Pay" again. Without idempotency, they get charged twice.
Idempotency means that processing the same request multiple times produces the same result as processing it once. The standard implementation assigns a unique idempotency key to each booking attempt. The payment service stores this key with the transaction. If the same key arrives again, the service returns the previous result instead of processing a new charge.
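A minimal sketch of the key-based implementation, with the result store as an in-memory dictionary (a real service would persist it); class and field names are illustrative.

```python
class PaymentService:
    """Sketch of idempotent charging: one key, one charge, replayed responses after that."""

    def __init__(self):
        self.results = {}   # idempotency_key -> stored result of the first attempt
        self.charges = 0    # how many real charges actually happened

    def charge(self, idempotency_key, user_id, amount):
        if idempotency_key in self.results:
            return self.results[idempotency_key]  # replay the result; do not charge again
        self.charges += 1  # only the first request with this key reaches the processor
        result = {"status": "charged", "user": user_id, "amount": amount}
        self.results[idempotency_key] = result
        return result

pay = PaymentService()
r1 = pay.charge("booking-42", "alice", 120)
r2 = pay.charge("booking-42", "alice", 120)  # retry after the lost confirmation
```

The retry returns the identical result, and the charge counter shows the user's card was hit exactly once.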
Traffic Pattern: The Spike
Ticketing traffic does not follow normal web patterns. It is dominated by extreme spikes at the moment tickets go on sale, followed by rapid decay.
This spike pattern drives every architectural decision. The virtual queue exists because of this spike. The in-memory inventory exists because the database cannot handle this spike. The Redis lock exists because traditional database locks cannot handle this spike. Every component is shaped by the fact that 90% of all traffic arrives in the first 60 seconds.
Failure Modes
Three failure scenarios require explicit handling:
- Lock expires before payment completes. The seat releases back to inventory and another user grabs it. The first user's payment must be refunded. The system needs a reconciliation process that detects "payment succeeded but seat was lost" and triggers automatic refunds.
- Payment service is down. The seat is locked but payment cannot be processed. The lock TTL acts as a safety valve. After 10 minutes, the seat returns to the pool. The user is told to retry.
- Double submission. The user clicks "Pay" twice. The idempotency key prevents double charging, but the system must also prevent creating two booking records for the same seat.
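The reconciliation process for the first failure mode can be sketched as a periodic sweep that compares successful payments against confirmed bookings; the data shapes here are illustrative, assuming each payment records the seat it paid for and each booking records the payment key that won the seat.

```python
def reconcile(payments, bookings):
    """Find 'payment succeeded but seat was lost' cases and queue refunds.

    payments: idempotency_key -> seat_id, for every successful charge.
    bookings: seat_id -> idempotency_key, for every confirmed booking.
    """
    refunds = []
    for key, seat_id in payments.items():
        if bookings.get(seat_id) != key:  # seat went to someone else, or to nobody
            refunds.append(key)
    return refunds

payments = {"pay-1": "103-F-12", "pay-2": "103-F-13"}
bookings = {"103-F-12": "pay-1"}           # seat 13's lock expired before confirmation
to_refund = reconcile(payments, bookings)  # pay-2 is charged but seatless: refund it
```

Run on a schedule, this sweep turns the "lock expired but payment landed" race into an automatic refund rather than a support ticket.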
Further Reading
- Design a Ticket Booking Site Like Ticketmaster (Hello Interview). Complete system design walkthrough with HLD and deep dives.
- Design Ticketmaster: A Comprehensive Guide (System Design School). Covers seat locking, queue architecture, and payment flow.
- Ticketmaster System Design: Step-by-Step Guide (System Design Handbook). Detailed treatment of distributed locking and inventory management.
- Ticketmaster's System Design (Educative Blog). Estimation, API design, and database schema for ticketing.
Assignment
A popular concert has 50,000 seats. 500,000 users hit the "Buy" button within the first second of tickets going on sale. Design the admission and booking flow.
- How do you prevent all 500K requests from reaching the booking API simultaneously? Describe the queue mechanism and batch sizing strategy.
- A user selects Section 103, Row F, Seat 12. Another user selects the same seat 200ms later. Walk through exactly what happens at the Redis lock level for both users.
- The first user's payment takes 8 minutes. The lock TTL is 10 minutes. What happens if payment takes 12 minutes instead? How does the system handle the resulting state?
- Why is optimistic locking a poor choice for flash sale seat reservation? What specific failure mode makes it unsuitable?