Message Queues vs. Pub/Sub
Session 5.5 · ~5 min read
Two Patterns for Asynchronous Communication
When services need to communicate without waiting for each other, they pass messages. But "pass messages" hides a critical design choice: should the message go to one recipient, or to many? Message queues and publish/subscribe (pub/sub) answer this differently, and choosing the wrong one creates architectural debt that compounds over time.
A queue says "someone handle this." Pub/sub says "everyone who cares, here is what happened." The first is a work assignment. The second is a broadcast.
Message Queues: Point-to-Point
A message queue holds messages until a consumer picks them up. Each message is delivered to exactly one consumer. Once consumed, the message is removed from the queue (or marked as processed). If multiple consumers are listening, they compete for messages. This is called the competing consumers pattern.
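This model is ideal for work distribution. You have 10,000 image resize jobs. Five workers pull from the same queue, and each image is handed to one worker at a time. No duplicated effort in the normal case, no missed work. If a worker crashes mid-processing, the message is redelivered: in SQS it becomes visible again once its visibility timeout expires, and in RabbitMQ an unacknowledged message is requeued when the consumer's channel or connection closes. Another worker then picks it up. Because of this redelivery, most queues give you at-least-once delivery rather than exactly-once, so handlers should be idempotent.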
In the diagram above, the producer sends a message to the queue. Only one consumer receives it. The dashed lines indicate that Consumers B and C exist but do not receive this particular message. They will get the next ones.
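The competing consumers pattern can be sketched with Python's standard library alone. This is a minimal in-process illustration, not a real broker: `run_competing_consumers` and the worker names are hypothetical, and `queue.Queue` stands in for SQS or RabbitMQ.

```python
import queue
import threading


def run_competing_consumers(num_jobs: int, num_workers: int) -> dict:
    """Let num_workers threads compete for num_jobs items on one shared queue."""
    jobs = queue.Queue()
    for job_id in range(num_jobs):
        jobs.put(job_id)

    # Each worker records the job ids it handled (stand-in for real work).
    handled = {f"worker-{i}": [] for i in range(num_workers)}

    def worker(name: str) -> None:
        while True:
            try:
                job_id = jobs.get_nowait()  # each job is handed to one worker only
            except queue.Empty:
                return  # queue drained, worker exits
            handled[name].append(job_id)
            jobs.task_done()  # acknowledge the job

    threads = [threading.Thread(target=worker, args=(name,)) for name in handled]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled


result = run_competing_consumers(num_jobs=100, num_workers=5)
all_jobs = sorted(job for jobs in result.values() for job in jobs)
```

How the jobs split across workers varies from run to run, but every job id appears exactly once in `all_jobs`, which is the property the pattern is meant to provide.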
Key properties of message queues:
- One message, one consumer. No fan-out.
- Durable, at-least-once delivery. Messages persist until acknowledged, so a message may be delivered more than once but is not silently dropped.
- Ordering. FIFO queues (like SQS FIFO) preserve message order. Standard SQS queues offer best-effort ordering.
- Backpressure. If consumers slow down, the queue grows. You can monitor queue depth as a scaling signal.
Common implementations: Amazon SQS, RabbitMQ, ActiveMQ, Azure Service Bus queues.
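The "persist until acknowledged" property, together with redelivery after a timeout, can be illustrated with a toy queue. This is a simplified sketch loosely modelled on the visibility-timeout idea; `VisibilityQueue` and its methods are hypothetical names, and the clock is passed in explicitly so the behaviour is deterministic.

```python
class VisibilityQueue:
    """Toy queue: a received message is hidden, not deleted, until it is acked."""

    def __init__(self, visibility_timeout: float):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [body, time at which it is visible]
        self._next_id = 0

    def send(self, body: str) -> None:
        self._messages[self._next_id] = [body, 0.0]
        self._next_id += 1

    def receive(self, now: float):
        """Return (id, body) of the first visible message, or None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout  # hide while in flight
                return msg_id, entry[0]
        return None

    def ack(self, msg_id: int) -> None:
        self._messages.pop(msg_id, None)  # only an ack removes the message

    def depth(self) -> int:
        return len(self._messages)


q = VisibilityQueue(visibility_timeout=30.0)
q.send("resize cat.png")

first = q.receive(now=0.0)         # worker A takes the job, then crashes
hidden = q.receive(now=10.0)       # still in flight: nothing to receive
redelivered = q.receive(now=31.0)  # timeout expired: worker B gets it
q.ack(redelivered[0])              # worker B finishes and acknowledges
```

The same message is delivered twice here, which is why consumers of such a queue are usually written to be idempotent.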
Pub/Sub: Fan-Out
In pub/sub, a publisher sends a message to a topic. Every subscriber to that topic receives a copy. The publisher does not know (or care) how many subscribers exist. This decouples the sender from receivers entirely.
Consider an order placement event. The inventory service needs to reserve stock. The email service needs to send a confirmation. The analytics service needs to record the sale. The shipping service needs to prepare a label. With a queue, you would need four separate queues or a single consumer that routes to all four. With pub/sub, you publish once to an "order.placed" topic. All four services receive the event independently.
Every subscriber gets its own copy of the message. Adding a fifth subscriber (say, a fraud detection service) requires zero changes to the publisher. This is the power of fan-out: extensibility without coordination.
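The fan-out behaviour can be sketched as a small in-memory broker. This is an illustration under simplifying assumptions (synchronous delivery, no persistence); `InMemoryBroker` and the service names are hypothetical and do not correspond to any real SDK.

```python
from collections import defaultdict
from typing import Callable


class InMemoryBroker:
    """Toy pub/sub broker: every subscriber to a topic gets its own copy."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)  # topic name -> list of handlers

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver a copy to each subscriber; return how many received it."""
        handlers = self._subscribers[topic]
        for handler in handlers:
            handler(dict(message))  # copy, so subscribers cannot affect each other
        return len(handlers)


broker = InMemoryBroker()
received = defaultdict(list)  # service name -> order ids it has seen


def make_handler(service: str) -> Callable[[dict], None]:
    return lambda msg: received[service].append(msg["order_id"])


for service in ("inventory", "email", "analytics", "shipping"):
    broker.subscribe("order.placed", make_handler(service))

first_count = broker.publish("order.placed", {"order_id": 1})

# Adding a fifth subscriber needs no change to the publishing code.
broker.subscribe("order.placed", make_handler("fraud"))
second_count = broker.publish("order.placed", {"order_id": 2})
```

The publisher calls `publish` the same way both times. Only the subscription list changed, and the late subscriber sees only events published after it joined.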
Key properties of pub/sub:
- One message, many consumers. Fan-out by default.
- Fire and forget (for the publisher). The topic handles distribution.
- No inherent ordering guarantee across subscribers. Each subscriber processes at its own pace.
- Message durability varies. SNS does not store messages for later retrieval. It retries failed deliveries for a limited period, but a subscriber that stays unreachable beyond that misses the message (unless the subscription is backed by a queue).
Common implementations: Amazon SNS, Google Cloud Pub/Sub, Redis Pub/Sub, Kafka topics (with consumer groups for hybrid behavior).
The SNS + SQS Pattern
In AWS, a common architecture combines both. SNS provides fan-out. SQS provides durability and competing consumers. A producer publishes an event to an SNS topic. Each downstream service subscribes via its own SQS queue. This gives you fan-out (pub/sub) with durable, at-least-once delivery (queue) per subscriber.
This hybrid pattern is widely used in production AWS architectures. Each queue can scale its consumers independently, retry failed messages, and use dead-letter queues for poison messages.
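The shape of this hybrid can be sketched in memory: a topic that copies each event into one queue per subscriber, where each queue retries and dead-letters on its own. This is not AWS code; `FanoutTopic`, `SubscriberQueue`, and `max_receives` are hypothetical names chosen for the illustration.

```python
from collections import deque


class SubscriberQueue:
    """Per-subscriber queue with retries and a dead-letter list."""

    def __init__(self, name: str, max_receives: int = 3):
        self.name = name
        self.max_receives = max_receives
        self.messages = deque()  # entries are (body, failed_attempts)
        self.dead_letters = []

    def drain(self, handler) -> int:
        """Process everything queued; return the number of successes."""
        processed = 0
        while self.messages:
            body, attempts = self.messages.popleft()
            try:
                handler(body)
                processed += 1
            except Exception:
                if attempts + 1 >= self.max_receives:
                    self.dead_letters.append(body)  # poison message set aside
                else:
                    self.messages.append((body, attempts + 1))  # retry later
        return processed


class FanoutTopic:
    """Topic that places a copy of each event on every subscribed queue."""

    def __init__(self) -> None:
        self.queues = []

    def subscribe(self, q: SubscriberQueue) -> None:
        self.queues.append(q)

    def publish(self, body: dict) -> None:
        for q in self.queues:
            q.messages.append((dict(body), 0))


topic = FanoutTopic()
email_q = SubscriberQueue("email")
analytics_q = SubscriberQueue("analytics")
topic.subscribe(email_q)
topic.subscribe(analytics_q)

for order_id in (1, 2, 3):
    topic.publish({"order_id": order_id})


def email_handler(msg: dict) -> None:
    if msg["order_id"] == 2:
        raise ValueError("malformed order")  # simulated poison message


email_sent = email_q.drain(email_handler)

# Analytics was "down" during all of the above; its queue kept the events.
analytics_seen = []
analytics_q.drain(analytics_seen.append)
```

The email subscriber's failure ends up in its own dead-letter list and has no effect on analytics, which still receives all three events once it starts consuming.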
Comparison Across Dimensions
| Dimension | Message Queue | Pub/Sub |
|---|---|---|
| Delivery model | Point-to-point (one consumer per message) | Fan-out (all subscribers get a copy) |
| Consumer count | Competing consumers share the load | Each subscriber is independent |
| Message persistence | Stored until consumed and acknowledged | Varies: SNS does not persist, Kafka persists |
| Ordering | FIFO available (SQS FIFO, RabbitMQ) | Per-partition ordering (Kafka), none (standard SNS topics) |
| Replay | Not possible (message deleted after ack) | Possible in Kafka (offset-based), not in SNS |
| Backpressure | Queue depth grows, natural backpressure signal | Subscribers must keep up or buffer independently |
| Coupling | Producer knows the queue, loosely coupled | Producer knows only the topic, fully decoupled |
| Use case | Task distribution, job processing, serial workflows | Event notification, broadcasting, multi-service fanout |
| Dead-letter handling | Built-in DLQ support (SQS, RabbitMQ) | Requires per-subscriber queue (SNS+SQS pattern) |
| Scaling consumers | Add more competing consumers to one queue | Each subscriber scales independently |
When to Use Which
The decision is usually straightforward once you ask one question: does this message need to reach one handler or many?
Use a queue when:
- Each unit of work should be handled by a single worker (payment processing, image resizing, PDF generation). Pair this with idempotent handlers, since most queues redeliver a message after a failure.
- You need backpressure and rate control (queue depth as a scaling metric).
- Order matters (FIFO processing of sequential events).
Use pub/sub when:
- Multiple independent services need to react to the same event (order placed, user registered).
- You want to add new subscribers without modifying the publisher.
- Events are informational rather than actionable tasks (audit logs, analytics events).
Use both (SNS+SQS or Kafka consumer groups) when:
- You need fan-out AND guaranteed delivery per subscriber.
- Different subscribers process at different speeds.
- You want dead-letter queues per subscriber for independent failure handling.
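The three lists above reduce to two questions, which can be written down as a tiny helper. This is only a restatement of the guidance in code form; `choose_pattern` and its parameter names are hypothetical.

```python
def choose_pattern(many_handlers: bool, needs_durable_delivery: bool) -> str:
    """Map the two key questions onto a messaging pattern."""
    if not many_handlers:
        return "queue"  # one handler per message: point-to-point
    if needs_durable_delivery:
        return "pub/sub + queue per subscriber"  # e.g. SNS+SQS
    return "pub/sub"  # plain fan-out


image_resizing = choose_pattern(many_handlers=False, needs_durable_delivery=True)
order_placed = choose_pattern(many_handlers=True, needs_durable_delivery=True)
cache_invalidation = choose_pattern(many_handlers=True, needs_durable_delivery=False)
```

Real decisions also weigh ordering, replay, and operational cost, so treat this as a starting point rather than a rule.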
Systems Thinking Lens
Queues create a balancing loop. As work arrives faster than consumers process it, queue depth grows. Queue depth triggers autoscaling (more consumers), which drains the queue. The system self-regulates. Pub/sub creates a reinforcing loop of extensibility: each new event type attracts more subscribers, which increases the value of the event system, which encourages publishing more events. Left unchecked, this becomes an "event soup" where hundreds of event types flow through the system with no clear ownership. The systems thinker sets boundaries: event catalogs, ownership per topic, and explicit contracts between publishers and subscribers.
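The balancing loop can be made concrete with a small discrete-time simulation. The scaling rule here (one consumer per `target_backlog_per_consumer` queued messages) and all the numbers are assumptions chosen for illustration, not a description of any particular autoscaler.

```python
import math


def simulate_autoscaling(arrivals, per_consumer_rate, target_backlog_per_consumer,
                         min_consumers=1, max_consumers=50):
    """Return (queue_depth, consumers) after each tick."""
    depth = 0
    consumers = min_consumers
    history = []
    for arriving in arrivals:
        depth += arriving
        depth -= min(depth, consumers * per_consumer_rate)  # work done this tick
        # Queue depth is the scaling signal for the next tick.
        desired = math.ceil(depth / target_backlog_per_consumer)
        consumers = max(min_consumers, min(max_consumers, desired))
        history.append((depth, consumers))
    return history


# A burst of 100 messages per tick for 3 ticks, then silence.
history = simulate_autoscaling(
    arrivals=[100, 100, 100, 0, 0, 0, 0, 0],
    per_consumer_rate=10,
    target_backlog_per_consumer=20,
)
peak_consumers = max(consumers for _, consumers in history)
```

With these inputs the consumer count climbs to 9 during the burst and falls back to 1 once the queue empties, which is the self-regulating behaviour described above.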
Further Reading
- AWS Documentation, Amazon SQS Developer Guide. Official reference for SQS queue types, visibility timeout, dead-letter queues, and FIFO ordering.
- Ably, Apache Kafka vs RabbitMQ vs AWS SNS/SQS. Comprehensive comparison of messaging systems with architecture diagrams and use-case recommendations.
- Gregor Hohpe and Bobby Woolf, Enterprise Integration Patterns (Addison-Wesley, 2003). The canonical reference for messaging patterns including point-to-point channels, publish-subscribe channels, and message routing.
- Encore, Message Queues vs Pub/Sub. Practical comparison with code examples and decision framework.
Assignment
For each scenario below, decide whether you would use a message queue, pub/sub, or a combination. Name a specific technology (SQS, SNS, RabbitMQ, Kafka) and explain your reasoning in 2-3 sentences.
- (a) Send a welcome email after user signup. Only one email service should send the email. Duplicate emails are unacceptable.
- (b) Notify 5 services when an order is placed. Inventory, email, analytics, shipping, and fraud detection all need the event. They process at different speeds.
- (c) Process credit card payments one at a time. Payments must be processed in the exact order they were submitted. Each payment is handled by one worker.
Bonus: For scenario (b), draw the architecture. Would you use SNS alone, SQS alone, or SNS+SQS? What happens if the analytics service goes down for 2 hours?