Content-Based Messaging & Priority Queues
Session 9.7 · ~5 min read
How Messages Find Their Destination
Session 5.5 introduced message queues and pub/sub as communication patterns. Session 5.6 covered Kafka's internal architecture. This session goes deeper into routing: how do messages reach the right consumer when you have dozens of message types, varying priorities, and messages that sometimes cannot be processed at all?
Two routing paradigms dominate: topic-based routing, where producers publish to named channels and consumers subscribe to those channels, and content-based routing, where the message's content determines its destination. Each has tradeoffs in complexity, flexibility, and performance. On top of routing, priority queues ensure that urgent messages are processed before routine ones, and dead letter queues handle the messages that no consumer can process.
Topic-Based vs Content-Based Routing
In topic-based routing, the producer decides the destination by publishing to a specific topic. A payment service publishes to payments.completed. An order service subscribes to that topic. The routing logic is static: if you subscribe to the topic, you get the message. Kafka, Amazon SNS, and Google Pub/Sub all default to this model.
In content-based routing, the messaging system inspects message attributes or payload and routes based on rules. A message with region "EU" and an amount above 10000 might route to a fraud detection queue, while a message with region "US" and an amount below 100 routes directly to fulfillment. The producer does not need to know the downstream topology. The routing logic lives in the broker or a routing layer.
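A routing layer like this is essentially an ordered list of predicates. The following sketch shows the idea; the rule format, queue names, and default destination are illustrative, not any specific broker's API:

```python
# Minimal content-based router: each rule pairs a predicate with a
# destination queue; the first matching rule wins. Rules, queue names,
# and the fallback are hypothetical examples.

RULES = [
    (lambda m: m["region"] == "EU" and m["amount"] > 10000, "fraud-detection"),
    (lambda m: m["region"] == "US" and m["amount"] < 100, "fulfillment"),
]

DEFAULT_QUEUE = "manual-review"  # where unmatched messages land

def route(message: dict) -> str:
    """Return the destination queue for a message based on its content."""
    for predicate, queue in RULES:
        if predicate(message):
            return queue
    return DEFAULT_QUEUE
```

Note that the producer calls a single `route` entry point and never sees the queue names; adding a consumer means appending a rule, which is exactly the flexibility-for-complexity trade described in the table below.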
| Dimension | Topic-Based Routing | Content-Based Routing |
|---|---|---|
| Routing decision | Producer chooses the topic | Broker evaluates message content against rules |
| Producer coupling | Producer must know topic names | Producer publishes to a single endpoint |
| Flexibility | Adding a new consumer requires a new topic or subscription | Adding a new consumer requires a new rule |
| Performance | Fast: simple lookup by topic name | Slower: broker must evaluate filter expressions per message |
| Complexity | Low at the broker, higher at the producer | Higher at the broker, lower at the producer |
| Debugging | Easy: check which topics a service subscribes to | Harder: must trace filter rules to understand routing |
| Examples | Kafka topics, RabbitMQ exchanges (direct, fanout) | RabbitMQ headers exchange, AWS SNS message filtering, Azure Service Bus filters |
In practice, most systems use topic-based routing as the primary mechanism and add content-based filtering for specific use cases. Kafka, for example, does not support content-based routing natively. You implement it by either creating fine-grained topics (one per event type and region) or by having consumers filter messages after receiving them.
Kafka Topic Design Patterns
Topic design in Kafka is a structural decision that affects parallelism, ordering, and consumer isolation. Three common patterns:
Single topic, multiple event types. All events for a domain go to one topic (e.g., orders contains order.created, order.paid, order.shipped). Consumers filter by event type. This preserves ordering within a partition key (order ID) and keeps the topic count low. The downside: consumers receive messages they do not care about and must discard them.
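The per-key ordering in the single-topic pattern follows from how Kafka assigns keyed messages to partitions: same key, same partition, and each partition is consumed in order. A simplified sketch of that assignment (Kafka's actual default partitioner hashes the key with murmur2; crc32 stands in here to keep the sketch stdlib-only):

```python
import zlib

NUM_PARTITIONS = 6  # illustrative partition count

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Deterministic key-to-partition mapping: hash the key, take the
    remainder. Same key always maps to the same partition."""
    return zlib.crc32(key.encode()) % num_partitions

# All events keyed by one order ID land in the same partition, so
# order.created / order.paid / order.shipped for that order stay ordered.
events = ["order.created", "order.paid", "order.shipped"]
partitions = {partition_for("order-1042") for _ in events}  # single-element set
```

This is also why the topic-per-event-type pattern loses cross-event ordering: the same order ID hashed into three separate topics gives three independent partition sequences with no ordering guarantee between them.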
Topic per event type. Each event gets its own topic (orders.created, orders.paid). Consumers subscribe only to relevant topics. This is cleaner for consumer logic but creates many topics and loses cross-event ordering guarantees unless you coordinate partition keys across topics.
Bucket priority pattern. For priority handling in Kafka, create separate topics for each priority level (alerts.p0, alerts.p1, alerts.p2). High-priority consumers poll P0 first and only move to P1 when P0 is empty. This is the standard workaround for Kafka's lack of native priority queues.
Priority Queue Implementations
Some messages are more urgent than others. A cardiac arrest alert must be processed before a medication reminder. A fraud detection flag must be handled before a marketing email. Priority queues ensure that processing order reflects business importance, not arrival order.
```mermaid
flowchart LR
    R{Router} -->|P0: Critical| Q0[Queue P0<br/>Cardiac Arrest]
    R -->|P1: High| Q1[Queue P1<br/>Abnormal Vitals]
    R -->|P2: Normal| Q2[Queue P2<br/>Medication Reminder]
    Q0 --> C[Consumer Pool]
    Q1 --> C
    Q2 --> C
    C --> H[Handler<br/>Service]
    style Q0 fill:#6b3a3a,stroke:#c47a5a,color:#ede9e3
    style Q1 fill:#4a3a2a,stroke:#c8a882,color:#ede9e3
    style Q2 fill:#2a3a2a,stroke:#6b8f71,color:#ede9e3
```
Implementation approaches vary by broker:
RabbitMQ supports native priority queues. You declare a queue with x-max-priority: 10 and set a priority field on each message. The broker delivers higher-priority messages first. Simple, but the priority evaluation adds latency under high load.
Kafka has no built-in priority. The bucket pattern described above is the standard approach. Consumers use a weighted polling strategy: poll P0 on every cycle, P1 every second cycle, and P2 every fourth cycle. Under load, P0 always gets processed first because it is checked on every iteration.
Amazon SQS does not support priority natively. You create separate queues per priority level and implement the polling logic in your consumer application, similar to the Kafka approach.
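The weighted polling loop described for Kafka and SQS can be sketched broker-agnostically. The queue objects, bucket names, and weights below are illustrative stand-ins for real broker clients:

```python
from collections import deque

# Illustrative weights: P0 checked every cycle, P1 every 2nd cycle,
# P2 every 4th cycle. Real consumers would wrap Kafka/SQS poll calls.
POLL_WEIGHTS = {"p0": 1, "p1": 2, "p2": 4}

def poll_cycle(queues: dict[str, deque], cycle: int) -> list[str]:
    """One iteration of the weighted polling loop (cycles count from 1):
    take at most one message from each bucket whose weight divides the
    current cycle number. P0 is eligible on every iteration."""
    processed = []
    for name, every in POLL_WEIGHTS.items():
        if cycle % every == 0 and queues[name]:
            processed.append(queues[name].popleft())
    return processed
```

Because `p0` has weight 1, a backlog in P0 is drained on every single iteration, while P1 and P2 only get attention on their scheduled cycles; this is the starvation-avoidance compromise (unlike strict "drain P0 first," low priorities still make progress).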
Dead Letter Queues
A dead letter queue (DLQ) is a holding area for messages that a consumer has tried and failed to process. After a configurable number of retry attempts (typically 3 to 5), the message is moved to the DLQ instead of being retried indefinitely or dropped.
DLQs exist because some failures are not transient. A malformed message will never parse correctly no matter how many times you retry. A message referencing a deleted user will always fail validation. Without a DLQ, these messages block the queue (head-of-line blocking) or get silently dropped. Neither outcome is acceptable.
A dead letter queue is where messages go when the system admits it cannot process them. The best systems check this queue before anything else.
Effective DLQ management requires three things. First, every message in the DLQ must retain its original headers, payload, and metadata plus the error reason and retry count. Second, the DLQ must be monitored with alerts. A growing DLQ is a symptom, not a destination. Third, there must be a reprocessing path: fix the bug, then replay the DLQ messages back into the main queue.
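A minimal sketch of the retry-then-dead-letter flow, preserving the original message plus failure metadata as described above (the handler interface, retry limit, and message shape are illustrative):

```python
MAX_RETRIES = 3  # illustrative; the text suggests 3 to 5 is typical

def consume(message: dict, handler, dlq: list) -> bool:
    """Attempt the handler up to MAX_RETRIES times. On exhaustion, move
    the message to the DLQ, keeping the original fields and recording
    the error reason and retry count. Returns True on success."""
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            handler(message)
            return True
        except Exception as exc:
            last_error = str(exc)  # remember why the final attempt failed
    dlq.append({
        **message,                  # original headers + payload retained
        "error_reason": last_error,
        "retry_count": MAX_RETRIES,
    })
    return False
```

A reprocessing path is then simple: after deploying a fix, iterate over the DLQ, strip the failure metadata, and feed each message back through `consume` against the main queue's handler.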
Putting It Together: Hospital Alert System
Consider a hospital monitoring system. Bedside sensors emit events: heart rate, blood pressure, oxygen saturation, medication schedules. These events have vastly different urgency levels.
```mermaid
flowchart LR
    S1[Bedside<br/>Sensor] --> I[Ingestion<br/>Service]
    S2[Bedside<br/>Sensor] --> I
    I --> CL{Content<br/>Classifier}
    CL -->|cardiac_arrest| P0[P0 Queue]
    CL -->|abnormal_vitals| P1[P1 Queue]
    CL -->|med_reminder| P2[P2 Queue]
    P0 --> D[Dispatch<br/>Service]
    P1 --> D
    P2 --> D
    D --> N[Nurse Station<br/>+ Pager]
    P0 -.->|failed after 1 retry| DLQ[Dead Letter<br/>Queue]
    P1 -.->|failed after 3 retries| DLQ
    P2 -.->|failed after 5 retries| DLQ
    DLQ --> MON[DLQ Monitor<br/>+ Alert]
```
The content classifier examines the event payload. A heart rate of zero or ventricular fibrillation pattern triggers P0 classification. Blood pressure outside safe ranges triggers P1. A scheduled medication reminder goes to P2. The dispatch service always drains P0 before checking P1, and P1 before P2. For P0, the retry limit is 1 (if it fails, alert a human immediately via DLQ monitor). For P2, five retries are acceptable because a medication reminder delayed by 30 seconds is not life-threatening.
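The classifier and per-priority retry limits described above could be sketched as follows. The event type names and the heart-rate check are examples taken from this scenario, not clinical guidance:

```python
# Illustrative classification rules and per-priority retry limits;
# event names and thresholds are examples from the scenario above.
RETRY_LIMITS = {"P0": 1, "P1": 3, "P2": 5}

def classify(event: dict) -> str:
    """Map a sensor event to a priority level based on its content."""
    if event["type"] in ("cardiac_arrest", "ventricular_fibrillation") \
            or event.get("heart_rate") == 0:
        return "P0"
    if event["type"] == "abnormal_vitals":
        return "P1"
    return "P2"  # med_reminder and other routine events

def retry_limit(event: dict) -> int:
    """Retry budget depends on priority: fail fast for P0 so the DLQ
    monitor alerts a human immediately; P2 can tolerate more retries."""
    return RETRY_LIMITS[classify(event)]
```

Note the inversion: the most urgent messages get the *fewest* retries, because for a P0 event the fastest route to a human (via the monitored DLQ) beats another automated attempt.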
Further Reading
- Confluent, Implementing Message Prioritization in Apache Kafka. Detailed walkthrough of the bucket priority pattern with consumer implementation.
- Confluent, Apache Kafka Dead Letter Queue: A Comprehensive Guide. DLQ configuration, retry policies, and reprocessing strategies.
- Jack Vanlightly, RabbitMQ vs Kafka Part 2: Messaging Patterns. Comparison of routing patterns across the two most popular message brokers.
- OneUptime, How to Implement Dead Letter Queue Patterns for Failed Message Handling. Modern DLQ implementation patterns with retry and replay strategies.
Assignment
Design a hospital alert system with three priority levels:
- P0: Cardiac arrest, ventricular fibrillation, respiratory failure
- P1: Abnormal vitals (blood pressure, oxygen saturation outside safe range)
- P2: Medication reminders, routine check-in prompts
- Routing design. Will you use topic-based or content-based routing? Justify your choice. Draw the message flow from sensor to nurse station.
- Priority guarantee. Describe exactly how your consumer ensures P0 is always processed first, even when P1 and P2 queues have thousands of pending messages. Write pseudocode for the consumer polling loop.
- DLQ policy. Define retry limits for each priority level. A failed P0 message has different implications than a failed P2. What happens when a P0 message lands in the DLQ?
- Load test scenario. During a mass casualty event, 200 P0 alerts arrive in 10 seconds. Your consumer pool has 5 instances. Calculate processing time per message if each P0 handler takes 200ms. Will all P0 alerts be acknowledged within 10 seconds? If not, what scaling strategy do you propose?