Synchronous vs. Event-Driven Architecture
Session 5.7 · ~5 min read
Two Ways to Wire Services Together
Every distributed system faces the same question: when Service A needs something from Service B, should A wait for the answer or fire a message and move on? This is the fundamental split between synchronous and event-driven architecture. Neither is universally better. Each creates a different failure profile, a different debugging experience, and a different coupling structure.
Synchronous: Request-Response
In synchronous communication, the caller sends a request and blocks until it receives a response. HTTP REST calls, gRPC, and GraphQL queries are all synchronous by default. The caller knows immediately whether the operation succeeded or failed.
This simplicity has a cost. If Service B is slow, Service A is slow. If Service B is down, Service A fails. Chain five synchronous calls together (A calls B calls C calls D calls E) and your latency is the sum of all five, your availability is the product of all five. Five services at 99.9% availability each give you 99.5% end-to-end availability. That is 4.4 hours of downtime per year instead of 52 minutes.
If any service fails, the whole chain fails
Synchronous architectures fail together. Event-driven architectures fail independently. This is the core tradeoff. Tight coupling gives you simplicity and immediate feedback. Loose coupling gives you resilience and independent scaling.
Event-Driven: Fire and React
In event-driven architecture, services communicate by publishing and consuming events. The Order Service does not call the Inventory Service directly. It publishes an "order.created" event. The Inventory Service subscribes to that event and reacts. The Order Service does not wait, does not know, and does not care what the Inventory Service does with the event.
This decoupling changes the failure profile. If the Inventory Service goes down for 10 minutes, events accumulate in the message broker. When the service comes back, it processes the backlog. The Order Service never noticed the outage. No cascading failures. No retry storms.
The cost is complexity. With synchronous calls, you can trace a request through a single HTTP call chain. With events, the flow is scattered across multiple services reacting to multiple event types. Debugging "why did this order not ship?" requires reconstructing the event timeline across services.
| Dimension | Synchronous | Event-Driven |
|---|---|---|
| Coupling | Tight: caller depends on callee's availability and interface | Loose: publisher and subscriber share only the event schema |
| Latency | Sum of all downstream calls | Immediate response to client; processing happens asynchronously |
| Failure handling | Cascading: one failure breaks the chain | Isolated: failed service catches up from backlog |
| Traceability | Simple: follow the call chain | Hard: reconstruct event flows across services |
| Consistency | Immediate (within a transaction or call) | Eventual: state converges over time |
| Scaling | Scale every service in the call chain equally | Scale each service independently based on its own load |
| Testing | Straightforward: mock downstream calls | Requires event replay, integration testing across services |
| Data ownership | Often shared databases or direct queries across services | Each service owns its data, reacts to events to stay in sync |
The Saga Pattern
In a monolith, you wrap a multi-step operation in a database transaction. Deduct inventory, charge payment, send confirmation. If any step fails, the whole transaction rolls back. In microservices, there is no single database. Each service owns its data. You cannot use a distributed transaction (two-phase commit) at scale because it blocks all participants and does not tolerate network partitions well.
The saga pattern replaces a single transaction with a sequence of local transactions, each in its own service. If a step fails, the saga executes compensating transactions to undo the previous steps. There are two ways to coordinate a saga: choreography and orchestration.
Choreography: Decentralized Coordination
In choreography, each service listens for events and decides what to do next. There is no central coordinator. The Order Service publishes "order.created." The Inventory Service hears it, reserves stock, and publishes "inventory.reserved." The Payment Service hears that, charges the card, and publishes "payment.completed." The Notification Service hears that and sends the confirmation email.
If payment fails, the Payment Service publishes "payment.failed." The Inventory Service hears it and releases the reserved stock (compensating transaction).
Choreography is simple for short sagas (2-3 steps). It becomes unmanageable for longer flows because the logic is spread across every participating service. Adding a step means modifying multiple services. Debugging requires tracing events across all participants.
Orchestration: Central Coordinator
In orchestration, a dedicated saga orchestrator controls the flow. It tells each service what to do and waits for the result. If a step fails, the orchestrator triggers compensating transactions in reverse order.
The orchestrator is the single place where you can see the entire saga flow. It handles retries, timeouts, and compensation in one location. The downside is that the orchestrator becomes a single point of failure and a potential bottleneck. It also creates coupling: the orchestrator must know about every service in the flow.
The Transactional Outbox Pattern
Event-driven systems have a dual-write problem. When the Order Service creates an order, it must (1) write to its database and (2) publish an event. If the database write succeeds but the event publish fails, downstream services never learn about the order. If the event publishes but the database write fails, downstream services act on an order that does not exist.
The outbox pattern solves this. Instead of publishing the event directly, the service writes the event to an "outbox" table in the same database transaction as the business data. A separate process (a poller or a CDC connector like Debezium) reads the outbox table and publishes the events to the message broker. Because the business data and the outbox entry are in the same transaction, they succeed or fail together.
Debezium is the most widely used open-source CDC tool for the outbox pattern. It reads the database's transaction log (PostgreSQL WAL, MySQL binlog) and streams changes to Kafka topics. No polling, no missed events, no dual-write risk.
Systems Thinking Lens
Synchronous architectures have a tight feedback loop: you call, you get a response, you know immediately. This is a reinforcing loop for developer productivity in the short term. But it creates a balancing loop at scale: more services in the chain means more latency and more failure modes, which pushes teams toward event-driven patterns.
Event-driven architectures introduce delay in the feedback loop. Events are processed asynchronously, so the cause (order created) and the effect (email sent) are separated in time. This delay makes debugging harder but makes the system more resilient to individual component failures. The saga pattern is a conscious design of compensating feedback loops: every forward action has a defined reverse action, creating a system that can self-correct when failures occur.
Further Reading
- Chris Richardson, Saga Pattern (microservices.io). The canonical reference for saga design in microservices, with choreography and orchestration examples.
- Microsoft Azure, Saga Design Pattern (Azure Architecture Center). Detailed guidance with decision flowcharts and implementation considerations.
- Debezium, Reliable Microservices Data Exchange with the Outbox Pattern. Step-by-step implementation guide using Debezium CDC with PostgreSQL and Kafka.
- Chris Richardson, Transactional Outbox Pattern (microservices.io). Pattern definition with polling and CDC-based implementations.
- ByteByteGo, Saga Pattern Demystified: Orchestration vs Choreography. Visual comparison with practical decision criteria.
Assignment
Design a saga for the following order flow. Specify whether you would use choreography or orchestration, and justify your choice.
Flow:
- Step 1: Deduct inventory for the ordered items.
- Step 2: Charge the customer's credit card.
- Step 3: Send a confirmation email.
Questions:
- What happens if Step 2 (payment) fails after Step 1 (inventory deduction) has already succeeded? Write the compensating transaction.
- What happens if Step 3 (email) fails after Steps 1 and 2 succeed? Should you compensate Steps 1 and 2? Why or why not?
- How would you use the outbox pattern to ensure the "order.created" event is reliably published after the order is saved to the database?
- Draw a sequence diagram (choreography or orchestration) for the happy path and the failure path from question (a).