
Two Patterns for Asynchronous Communication

When services need to communicate without waiting for each other, they pass messages. But "pass messages" hides a critical design choice: should the message go to one recipient, or to many? Message queues and publish/subscribe (pub/sub) answer this differently, and choosing the wrong one creates architectural debt that compounds over time.

A queue says "someone handle this." Pub/sub says "everyone who cares, here is what happened." The first is a work assignment. The second is a broadcast.

Message Queues: Point-to-Point

A message queue holds messages until a consumer picks them up. Each message is delivered to exactly one consumer. Once consumed, the message is removed from the queue (or marked as processed). If multiple consumers are listening, they compete for messages. This is called the competing consumers pattern.

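This model is ideal for work distribution. You have 10,000 image resize jobs. Five workers pull from the same queue. In the normal case each image is resized by exactly one worker, and no job is missed. If a worker crashes mid-processing, the message is not lost. In SQS it becomes visible again once its visibility timeout expires. In RabbitMQ the broker requeues an unacknowledged message when the consumer's connection closes, or when the consumer negatively acknowledges it. Another worker then picks it up. Because a message can be redelivered this way, delivery is at-least-once rather than exactly-once, so handlers should be idempotent.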
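The redelivery behavior can be sketched with a small in-memory model. This is an illustration under stated assumptions, not how SQS is implemented: `VisibilityQueue` and its methods are hypothetical names, and the clock is passed in as a plain number so the example is deterministic.

```python
import itertools


class VisibilityQueue:
    """In-memory queue with an SQS-style visibility timeout (hypothetical, for illustration)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}          # message id -> body
        self._invisible_until = {}   # message id -> time it becomes visible again
        self._ids = itertools.count(1)

    def send(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = body
        self._invisible_until[msg_id] = 0.0
        return msg_id

    def receive(self, now):
        """Return (id, body) of the first visible message and hide it for the timeout."""
        for msg_id, body in self._messages.items():
            if self._invisible_until[msg_id] <= now:
                self._invisible_until[msg_id] = now + self.visibility_timeout
                return msg_id, body
        return None

    def delete(self, msg_id):
        """Acknowledge a message: it is removed and never redelivered."""
        self._messages.pop(msg_id, None)
        self._invisible_until.pop(msg_id, None)


work_queue = VisibilityQueue(visibility_timeout=30)
work_queue.send("resize image-001.png")

first = work_queue.receive(now=0)     # worker A takes the job, then crashes without deleting it
hidden = work_queue.receive(now=10)   # still inside the timeout, so nothing is visible
retry = work_queue.receive(now=31)    # timeout elapsed, worker B receives the same job
work_queue.delete(retry[0])           # worker B finishes and acknowledges
empty = work_queue.receive(now=100)   # nothing left to deliver
```

Note that worker B receives the same message worker A already started on, which is why the handler has to tolerate a job being attempted twice.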

graph LR
    P[Producer] -->|sends message| Q[Queue]
    Q -->|delivers to ONE| C1[Consumer A]
    Q -.->|or| C2[Consumer B]
    Q -.->|or| C3[Consumer C]
    style Q fill:#222221,stroke:#c8a882,color:#ede9e3
    style P fill:#191918,stroke:#6b8f71,color:#ede9e3
    style C1 fill:#191918,stroke:#c8a882,color:#ede9e3
    style C2 fill:#191918,stroke:#8a8478,color:#8a8478
    style C3 fill:#191918,stroke:#8a8478,color:#8a8478

In the diagram above, the producer sends a message to the queue. Only one consumer receives it. The dashed lines indicate that Consumers B and C exist but do not receive this particular message. They remain available to pick up subsequent messages.
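A minimal sketch of competing consumers, using Python's standard `queue.Queue` and threads as a stand-in for a real broker. The function name `run_workers` and the job names are assumptions made for this example.

```python
import queue
import threading


def run_workers(jobs, worker_count):
    """Let worker threads compete for jobs; return {job: [names of workers that handled it]}."""
    work = queue.Queue()
    for job in jobs:
        work.put(job)

    handled_by = {}
    lock = threading.Lock()

    def worker(name):
        while True:
            try:
                job = work.get_nowait()   # each get hands the job to exactly one thread
            except queue.Empty:
                return                    # all jobs were enqueued up front, so empty means done
            with lock:
                handled_by.setdefault(job, []).append(name)

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return handled_by


jobs = [f"image-{n:05d}.png" for n in range(10_000)]
handled_by = run_workers(jobs, worker_count=5)
```

Every job ends up in `handled_by` with a single worker name. Which worker handled which job varies from run to run, and that is the point: the producer never assigns work to a specific consumer.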

Key properties of message queues:

Common implementations: Amazon SQS, RabbitMQ, ActiveMQ, Azure Service Bus queues.

Pub/Sub: Fan-Out

In pub/sub, a publisher sends a message to a topic. Every subscriber to that topic receives a copy. The publisher does not know (or care) how many subscribers exist. This decouples the sender from receivers entirely.

Consider an order placement event. The inventory service needs to reserve stock. The email service needs to send a confirmation. The analytics service needs to record the sale. The shipping service needs to prepare a label. With a queue, you would need four separate queues or a single consumer that routes to all four. With pub/sub, you publish once to an "order.placed" topic. All four services receive the event independently.

graph LR
    P[Publisher] -->|publishes event| T[Topic: order.placed]
    T -->|copy| S1[Inventory Service]
    T -->|copy| S2[Email Service]
    T -->|copy| S3[Analytics Service]
    T -->|copy| S4[Shipping Service]
    style T fill:#222221,stroke:#c8a882,color:#ede9e3
    style P fill:#191918,stroke:#6b8f71,color:#ede9e3
    style S1 fill:#191918,stroke:#c8a882,color:#ede9e3
    style S2 fill:#191918,stroke:#c8a882,color:#ede9e3
    style S3 fill:#191918,stroke:#c8a882,color:#ede9e3
    style S4 fill:#191918,stroke:#c8a882,color:#ede9e3

Every subscriber gets its own copy of the message. Adding a fifth subscriber (say, a fraud detection service) requires zero changes to the publisher. This is the power of fan-out: extensibility without coordination.
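Fan-out can be sketched with an in-process broker. This is a simplification: `Broker` and `make_handler` are hypothetical names, and delivery here is a synchronous function call, whereas a real pub/sub system delivers over the network and asynchronously.

```python
from collections import defaultdict


class Broker:
    """Minimal in-process pub/sub broker (hypothetical, for illustration)."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # topic name -> list of handler callables

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        """Give every subscriber its own copy of the event; return how many were notified."""
        handlers = self._subscribers[topic]
        for handler in handlers:
            handler(dict(event))
        return len(handlers)


broker = Broker()
inbox = defaultdict(list)   # service name -> events that service received


def make_handler(service):
    def handler(event):
        inbox[service].append(event)
    return handler


for service in ("inventory", "email", "analytics", "shipping"):
    broker.subscribe("order.placed", make_handler(service))

delivered_before = broker.publish("order.placed", {"order_id": 1})

# Adding a fifth subscriber touches only the subscription list, not the publishing code.
broker.subscribe("order.placed", make_handler("fraud"))
delivered_after = broker.publish("order.placed", {"order_id": 2})
```

The two `publish` calls are identical. The publisher learns nothing about the new fraud subscriber, which also receives only events published after it subscribed.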

Key properties of pub/sub:

Common implementations: Amazon SNS, Google Cloud Pub/Sub, Redis Pub/Sub, Kafka topics (with consumer groups for hybrid behavior).

The SNS + SQS Pattern

In AWS, a common architecture combines both. SNS provides fan-out. SQS provides durability and competing consumers. An event is published to an SNS topic. Each downstream service subscribes via its own SQS queue. This gives you fan-out (pub/sub) with durable, at-least-once delivery (queue) per subscriber: if a service is down, its messages wait in its queue instead of being dropped.

graph LR
    P[Publisher] --> SNS[SNS Topic]
    SNS --> Q1[SQS: Inventory]
    SNS --> Q2[SQS: Email]
    SNS --> Q3[SQS: Analytics]
    Q1 --> C1[Inventory Worker]
    Q2 --> C2[Email Worker]
    Q3 --> C3[Analytics Worker]
    style SNS fill:#222221,stroke:#c8a882,color:#ede9e3
    style Q1 fill:#191918,stroke:#6b8f71,color:#ede9e3
    style Q2 fill:#191918,stroke:#6b8f71,color:#ede9e3
    style Q3 fill:#191918,stroke:#6b8f71,color:#ede9e3

This hybrid pattern is widely used in production AWS architectures. Each queue can scale its consumers independently, retry failed messages, and use dead-letter queues for poison messages.
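The shape of the hybrid can be sketched in memory. This is not the AWS API: `FanOutTopic`, `SubscriberQueue`, and `max_receives` are hypothetical names chosen for the example, and the retry loop is a simplified stand-in for a redrive policy.

```python
from collections import deque


class SubscriberQueue:
    """Per-subscriber queue with retries and a dead-letter list (hypothetical)."""

    def __init__(self, name, max_receives=3):
        self.name = name
        self.max_receives = max_receives
        self.pending = deque()      # entries are [event, number of times received so far]
        self.dead_letters = []

    def enqueue(self, event):
        self.pending.append([event, 0])

    def drain(self, handler):
        """Process pending events; park an event in dead_letters after max_receives failures."""
        processed = []
        while self.pending:
            event, receives = self.pending.popleft()
            receives += 1
            try:
                handler(event)
                processed.append(event)
            except Exception:
                if receives >= self.max_receives:
                    self.dead_letters.append(event)          # poison message
                else:
                    self.pending.append([event, receives])   # retry later
        return processed


class FanOutTopic:
    """Topic that copies each event into every subscribed queue (hypothetical)."""

    def __init__(self):
        self.queues = []

    def subscribe(self, subscriber_queue):
        self.queues.append(subscriber_queue)

    def publish(self, event):
        for subscriber_queue in self.queues:
            subscriber_queue.enqueue(dict(event))


def send_confirmation(event):
    if not event["email"]:
        raise ValueError("no email address on order")


topic = FanOutTopic()
inventory_queue = SubscriberQueue("inventory")
email_queue = SubscriberQueue("email")
analytics_queue = SubscriberQueue("analytics")
for q in (inventory_queue, email_queue, analytics_queue):
    topic.subscribe(q)

topic.publish({"order_id": 1, "email": "a@example.com"})
topic.publish({"order_id": 2, "email": None})

inventory_done = inventory_queue.drain(lambda event: None)   # accepts every event
email_done = email_queue.drain(send_confirmation)            # order 2 fails three times
# The analytics worker is down and never drains; its events simply wait in its queue.
```

The failure in the email path does not affect inventory, and the analytics backlog sits untouched until that worker returns. This isolation per subscriber is the main reason for putting a queue behind each subscription.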

Comparison Across Dimensions

| Dimension | Message Queue | Pub/Sub |
| --- | --- | --- |
| Delivery model | Point-to-point (one consumer per message) | Fan-out (all subscribers get a copy) |
| Consumer count | Competing consumers share the load | Each subscriber is independent |
| Message persistence | Stored until consumed and acknowledged | Varies: SNS does not persist, Kafka persists |
| Ordering | FIFO available (SQS FIFO, RabbitMQ) | Per-partition ordering (Kafka), none (SNS) |
| Replay | Not possible (message deleted after ack) | Possible in Kafka (offset-based), not in SNS |
| Backpressure | Queue depth grows, natural backpressure signal | Subscribers must keep up or buffer independently |
| Coupling | Producer knows the queue, loosely coupled | Producer knows only the topic, fully decoupled |
| Use case | Task distribution, job processing, serial workflows | Event notification, broadcasting, multi-service fanout |
| Dead-letter handling | Built-in DLQ support (SQS, RabbitMQ) | Requires per-subscriber queue (SNS+SQS pattern) |
| Scaling consumers | Add more competing consumers to one queue | Each subscriber scales independently |
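The Replay row is easier to see with a toy model. The sketch below imitates the idea behind offset-based reads on a retained log; `EventLog` is a hypothetical class, not the Kafka client API.

```python
class EventLog:
    """Append-only log read by offset (hypothetical, for illustration)."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1   # offset assigned to the new record

    def read(self, offset):
        """Return every record from the given offset onward; reading deletes nothing."""
        return self._records[offset:]


log = EventLog()
for order_id in (101, 102, 103):
    log.append({"order_id": order_id})

offset = 0                           # each consumer tracks its own position in the log
first_pass = log.read(offset)
offset += len(first_pass)
nothing_new = log.read(offset)       # caught up: no new records

replayed = log.read(0)               # rewinding the offset replays the full history
```

In a queue, acknowledging a message deletes it, so there is nothing to rewind to. In a log, consuming only moves the reader's offset, so a new or repaired subscriber can reprocess history for as long as the records are retained.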

When to Use Which

The decision is usually straightforward once you ask one question: does this message need to reach one handler or many?

Use a queue when:

Use pub/sub when:

Use both (SNS+SQS or Kafka consumer groups) when:

Systems Thinking Lens

Queues create a balancing loop. As work arrives faster than consumers process it, queue depth grows. Queue depth triggers autoscaling (more consumers), which drains the queue. The system self-regulates. Pub/sub creates a reinforcing loop of extensibility: each new event type attracts more subscribers, which increases the value of the event system, which encourages publishing more events. Left unchecked, this becomes an "event soup" where hundreds of event types flow through the system with no clear ownership. The systems thinker sets boundaries: event catalogs, ownership per topic, and explicit contracts between publishers and subscribers.
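The balancing loop can be sketched as a small discrete-time simulation. All numbers and names here (`per_consumer_rate`, `scale_up_depth`, one new consumer per tick) are assumptions picked for the example, not values from any real autoscaler, and scale-down is left out to keep it short.

```python
def simulate(arrivals_per_tick, ticks, per_consumer_rate=10,
             scale_up_depth=50, max_consumers=20):
    """Return a list of (queue_depth, consumer_count) pairs, one per tick."""
    depth = 0
    consumers = 1
    history = []
    for _ in range(ticks):
        depth += arrivals_per_tick                        # new work arrives
        depth -= min(depth, consumers * per_consumer_rate)  # consumers drain what they can
        if depth > scale_up_depth and consumers < max_consumers:
            consumers += 1                                # depth signal triggers scale-up
        history.append((depth, consumers))
    return history


history = simulate(arrivals_per_tick=35, ticks=40)
peak_depth = max(depth for depth, _ in history)
final_depth, final_consumers = history[-1]
```

With these inputs the backlog climbs while a single consumer falls behind, the depth signal adds consumers one at a time, and the queue drains back to zero once capacity exceeds the arrival rate. The loop corrects itself without the producer knowing anything about it.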

Further Reading

Assignment

For each scenario below, decide whether you would use a message queue, pub/sub, or a combination. Name a specific technology (SQS, SNS, RabbitMQ, Kafka) and explain your reasoning in 2-3 sentences.

  1. Send a welcome email after user signup. Only one email service should send the email. Duplicate emails are unacceptable.
  2. Notify 5 services when an order is placed. Inventory, email, analytics, shipping, and fraud detection all need the event. They process at different speeds.
  3. Process credit card payments one at a time. Payments must be processed in the exact order they were submitted. Each payment is handled by one worker.

Bonus: For scenario 2, draw the architecture. Would you use SNS alone, SQS alone, or SNS+SQS? What happens if the analytics service goes down for 2 hours?