Session 2.2: Load Balancing: ALB vs. NLB

What Load Balancers Do

A load balancer sits between clients and a pool of servers. It receives incoming requests and forwards each one to a server that can handle it. The goals are straightforward: distribute traffic evenly, detect and route around failed servers, and allow the backend to scale without clients needing to know about it.
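
The three goals above can be sketched in a few lines. This is a toy round-robin balancer, not how ALB or NLB is implemented; the class name `RoundRobinBalancer` and its methods are hypothetical, and real balancers learn server health from periodic health checks rather than manual calls.

```python
class RoundRobinBalancer:
    """Toy balancer: rotates through servers and skips unhealthy ones."""

    def __init__(self, servers):
        self.servers = list(servers)
        # Assumption: every server starts healthy until a check says otherwise.
        self.healthy = set(self.servers)
        self._next = 0

    def mark_down(self, server):
        """Record a failed health check."""
        self.healthy.discard(server)

    def mark_up(self, server):
        """Record a passing health check for a known server."""
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        """Return the next healthy server in rotation."""
        for _ in range(len(self.servers)):
            candidate = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
first_four = [lb.pick() for _ in range(4)]
lb.mark_down("app-2")  # clients never learn about this; traffic just routes around it
after_failure = [lb.pick() for _ in range(3)]
```

Clients only ever talk to `lb`, so servers can be added, removed, or marked down without any client-side change.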

But not all load balancers work the same way. The two major categories, defined by where they operate in the network stack, are Layer 4 (transport) and Layer 7 (application). In AWS terminology, these map to the Network Load Balancer (NLB) and the Application Load Balancer (ALB).

The OSI Model Context

To understand the difference, you need to know two layers of the OSI model:

Layer 4 (Transport): Deals with TCP and UDP. At this layer, the load balancer sees source IP, destination IP, source port, and destination port. It does not inspect the content of the request.
Layer 7 (Application): Deals with HTTP, HTTPS, WebSocket, and other application protocols. At this layer, the load balancer can read HTTP headers, URL paths, cookies, and request bodies.

The layer at which a load balancer operates determines what it can see and what routing decisions it can make.

graph TB
    Client[Client] --> L7["Layer 7 (ALB)<br/>Reads HTTP headers, paths, cookies"]
    Client --> L4["Layer 4 (NLB)<br/>Reads TCP/UDP: IP + port only"]
    L7 -->|"/api/*"| Backend1[API Servers]
    L7 -->|"/static/*"| Backend2[Static Servers]
    L4 -->|"TCP forward"| Backend3[Server Pool]
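
To make the contrast concrete, here is a minimal sketch of the information each layer has to work with. The `Layer4View` and `Layer7View` types and the `parse_http_head` function are illustrative names, not part of any AWS API, and the parser assumes a simple, well-formed, unencrypted HTTP/1.1 request head.

```python
from dataclasses import dataclass


@dataclass
class Layer4View:
    """Everything a transport-layer balancer can base a decision on."""
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


@dataclass
class Layer7View:
    """Extra fields available once the HTTP request has been parsed."""
    method: str
    path: str
    host: str
    headers: dict


def parse_http_head(raw: bytes) -> Layer7View:
    """Parse the request line and headers of a well-formed HTTP/1.1 request."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    request_line, *header_lines = head.split("\r\n")
    method, target, _version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    path = target.split("?", 1)[0]  # drop the query string
    return Layer7View(method, path, headers.get("host", ""), headers)


raw_request = (
    b"GET /api/users?id=7 HTTP/1.1\r\n"
    b"Host: api.example.com\r\n"
    b"Cookie: sid=abc\r\n"
    b"\r\n"
)
l4 = Layer4View("tcp", "203.0.113.7", 51844, "10.0.0.10", 80)
l7 = parse_http_head(raw_request)
```

A Layer 4 balancer decides using only the fields in `l4`. A Layer 7 balancer additionally has `l7.path`, `l7.host`, and `l7.headers`, which is what makes path-based and host-based routing possible.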

Application Load Balancer (ALB)

ALB operates at Layer 7. It understands HTTP and HTTPS, can inspect request content, and makes routing decisions based on URL paths, hostnames, HTTP headers, query strings, and source IP.

ALB is the default choice for web applications. It supports:

Path-based routing: Send /api/* to one target group and /images/* to another.
Host-based routing: Route api.example.com to backend services and www.example.com to frontend servers.
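SSL/TLS termination: The ALB decrypts HTTPS traffic, inspects the request, and can forward plain HTTP to backend servers. This offloads CPU-intensive TLS processing from your application servers. If you need encryption on the backend hop as well, the ALB can instead open a new HTTPS connection to the targets.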
WebSocket support: ALB natively handles WebSocket upgrades.
Authentication integration: ALB can authenticate users via OpenID Connect providers (Cognito, Okta, etc.) before requests reach your backend.
Sticky sessions: Route a user's requests to the same target using cookies.
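
Path-based and host-based routing from the list above can be sketched as an ordered rule table. This is a simplified model, not the ALB API: the `Rule` type, the `route` function, and the target group names are hypothetical, and `fnmatchcase` only approximates ALB wildcard matching. The sketch assumes that rules are evaluated in priority order, lowest number first, with a default action when nothing matches.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    priority: int
    target_group: str
    host_pattern: str = "*"   # "*" matches any host
    path_pattern: str = "*"   # "*" matches any path


def route(rules, host, path, default):
    """Return the target group of the first matching rule, else the default."""
    for rule in sorted(rules, key=lambda r: r.priority):
        # Hostnames are compared case-insensitively; paths are case-sensitive.
        host_ok = fnmatchcase(host.lower(), rule.host_pattern.lower())
        path_ok = fnmatchcase(path, rule.path_pattern)
        if host_ok and path_ok:
            return rule.target_group
    return default


rules = [
    Rule(priority=10, target_group="api-servers", path_pattern="/api/*"),
    Rule(priority=20, target_group="image-servers", path_pattern="/images/*"),
    Rule(priority=30, target_group="backend-services", host_pattern="api.example.com"),
]

api_target = route(rules, "www.example.com", "/api/v1/users", default="frontend")
image_target = route(rules, "www.example.com", "/images/logo.png", default="frontend")
host_target = route(rules, "API.example.com", "/health", default="frontend")
fallback_target = route(rules, "www.example.com", "/", default="frontend")
```

Because the first match wins, rule order matters: a broad pattern with a low priority number can shadow a more specific rule that comes after it.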

The tradeoff is latency. Because the ALB reads and parses the full HTTP request before making a routing decision, it adds processing time. For most web applications, this overhead is negligible (single-digit milliseconds). For ultra-low-latency workloads, it matters.

Network Load Balancer (NLB)

NLB operates at Layer 4. It routes TCP and UDP traffic based on IP addresses and port numbers without inspecting the application-layer content. It is designed for extreme throughput and ultra-low latency.
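
Routing on addresses and ports alone usually means hashing the connection's identifying fields and using the result to pick a target. AWS describes NLB target selection as a flow hash over the protocol, source and destination IP addresses, source and destination ports, and the TCP sequence number, but the actual algorithm is not public. The sketch below is a stand-in under that assumption: it hashes only the 5-tuple with SHA-256, and `pick_target` is a hypothetical name.

```python
import hashlib


def pick_target(protocol, src_ip, src_port, dst_ip, dst_port, targets):
    """Map a connection's 5-tuple to one target, deterministically."""
    if not targets:
        raise ValueError("no targets registered")
    flow = f"{protocol}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(flow).digest()
    # Interpret the first 8 bytes as an integer and reduce it to a list index.
    index = int.from_bytes(digest[:8], "big") % len(targets)
    return targets[index]


targets = ["10.0.1.11", "10.0.1.12", "10.0.1.13"]
first = pick_target("tcp", "203.0.113.7", 51844, "10.0.0.10", 443, targets)
again = pick_target("tcp", "203.0.113.7", 51844, "10.0.0.10", 443, targets)
```

Nothing in the function looks at the payload. Every packet of the same connection hashes to the same target, which is why this approach is fast and also why it cannot tell `/api/*` from `/static/*`.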

NLB is built for raw performance. It supports:

High throughput: Millions of requests per second at sub-millisecond latency.
Static IP addresses: Each NLB gets one static IP per Availability Zone. This is critical for clients that need to whitelist specific IPs (firewalls, DNS records, partner integrations).
Source IP preservation: The original client IP is visible to backend servers without needing X-Forwarded-For headers.
TLS passthrough: NLB can forward encrypted traffic directly to backend servers without decrypting it. Useful when end-to-end encryption is required.
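AWS PrivateLink support: Of the two, only NLB can sit directly behind a VPC endpoint service for private connectivity. (Gateway Load Balancers can as well, but they serve a different purpose, and an ALB can be exposed this way only by placing an NLB in front of it.)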
Non-HTTP protocols: Any TCP or UDP workload. Game servers, IoT brokers, custom binary protocols.
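
The source IP item above changes how a backend finds the client address. Behind an NLB that preserves the source IP, the connection's peer address is the client. Behind an ALB, the peer address is the load balancer, and the client address arrives in the X-Forwarded-For header. The helper below is a minimal sketch: `client_ip` and `trusted_proxies` are hypothetical names, header names are assumed to be lowercased already, and it assumes each trusted proxy appends the address it saw to the end of the header.

```python
def client_ip(headers, peer_ip, trusted_proxies=1):
    """Best-effort client address for a backend behind a load balancer.

    headers: request headers with lowercased names.
    peer_ip: address of the TCP peer (the client itself behind an NLB
             that preserves source IP, the load balancer behind an ALB).
    trusted_proxies: how many proxies in front of this server append to
                     X-Forwarded-For (assumption: 1, a single ALB).
    """
    forwarded = headers.get("x-forwarded-for", "")
    hops = [hop.strip() for hop in forwarded.split(",") if hop.strip()]
    if trusted_proxies < 1 or len(hops) < trusted_proxies:
        return peer_ip  # no usable header: the peer is the client
    # Entries to the left of the trusted hops are client-supplied and spoofable.
    return hops[-trusted_proxies]


behind_alb = client_ip({"x-forwarded-for": "203.0.113.7"}, peer_ip="10.0.1.5")
spoof_attempt = client_ip({"x-forwarded-for": "1.2.3.4, 203.0.113.7"}, peer_ip="10.0.1.5")
behind_nlb = client_ip({}, peer_ip="198.51.100.9")
```

With an NLB the application needs none of this header handling, which is one reason source IP preservation matters for protocols that have no headers at all.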

The tradeoff is intelligence. NLB cannot route based on URL paths, HTTP headers, or cookies. It sees packets, not requests. If you need content-aware routing, NLB cannot do it alone.

Comparison Table

Dimension	ALB (Layer 7)	NLB (Layer 4)
OSI Layer	Layer 7 (Application)	Layer 4 (Transport)
Protocols	HTTP, HTTPS, WebSocket, gRPC	TCP, UDP, TLS
Routing intelligence	Path, host, header, query string, source IP	IP address and port only
SSL/TLS handling	Terminates (optionally re-encrypts to targets)	Passthrough or terminate
Static IP	No (DNS-based, IP can change)	Yes (one per AZ)
Latency	Low (single-digit ms)	Ultra-low (sub-ms)
Throughput	High	Extreme (millions of requests/sec)
Source IP preservation	Via X-Forwarded-For header	Native (client IP visible directly)
Sticky sessions	Cookie-based	Source IP-based
PrivateLink	Not directly (needs an NLB in front)	Supported
Cost model	Per hour + LCU (capacity units)	Per hour + NLCU (capacity units)
Best for	Web apps, REST APIs, microservices	IoT, gaming, real-time, non-HTTP protocols

When to Use Which

The decision usually comes down to what the load balancer needs to understand about the traffic.

If your load balancer needs to inspect HTTP content to make routing decisions (path-based routing, host-based routing, header inspection), use ALB. If it just needs to forward packets as fast as possible, use NLB.

A common production pattern combines both: an NLB as the external entry point (for static IPs and PrivateLink) that forwards to an ALB for content-based routing. This gives you the best of both layers, at the cost of additional hops and complexity.
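
The decision logic in this section can be written down as a small function. It is a sketch of the rules of thumb above, not an official selection guide; the `Requirements` fields and `choose_load_balancer` are hypothetical names, and real decisions also weigh cost, limits, and existing infrastructure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    needs_content_routing: bool = False  # path, host, or header rules
    needs_static_ip: bool = False        # clients allowlist fixed IPs
    needs_privatelink: bool = False      # exposed as a VPC endpoint service
    non_http_protocol: bool = False      # raw TCP/UDP, custom binary protocols


def choose_load_balancer(req: Requirements) -> str:
    """Apply this section's rules of thumb to a set of requirements."""
    if req.non_http_protocol:
        return "NLB"  # a Layer 7 balancer cannot parse the traffic
    needs_l4_edge = req.needs_static_ip or req.needs_privatelink
    if req.needs_content_routing:
        # The combined pattern: NLB at the edge, ALB behind it for routing.
        return "NLB -> ALB" if needs_l4_edge else "ALB"
    # HTTP traffic without routing rules: ALB unless a Layer 4 feature is required.
    return "NLB" if needs_l4_edge else "ALB"


web_app = choose_load_balancer(Requirements(needs_content_routing=True))
game_server = choose_load_balancer(Requirements(non_http_protocol=True, needs_static_ip=True))
partner_api = choose_load_balancer(Requirements(needs_content_routing=True, needs_static_ip=True))
```

The order of the checks carries the reasoning: protocol first, because it rules out ALB entirely, then content routing, then the Layer 4 features that only NLB provides.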

Systems Thinking Lens

The choice of load balancer is not a local decision. It propagates through the system. Choosing ALB means your backend servers do not need to handle TLS termination, saving CPU. But it also means you depend on ALB's connection limits and request processing latency. Choosing NLB means your servers see the real client IP natively, but your application code must handle routing logic that ALB would have done for you.

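Every capability you move into the load balancer is work you take off your application, but it is also processing time and a dependency you add at the load balancer. Every capability you keep in the application keeps the load balancer simple and fast, but your servers must do that work themselves. This is a tradeoff, not a best practice.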

Assignment

For each scenario below, choose the appropriate load balancer type (ALB or NLB) and explain your reasoning:

A REST API where /api/v1/* routes to the backend service cluster and /static/* routes to a CDN origin server. The team wants SSL offloading so backend servers do not handle TLS.
An IoT platform that accepts 1 million persistent TCP connections from embedded sensors. Each sensor sends a 64-byte payload every 30 seconds. Partners need to whitelist specific IP addresses in their firewalls.
A microservices application with 12 services. The team wants the load balancer to handle SSL termination for all HTTPS traffic and authenticate users via an OIDC provider before requests reach any backend service.

For each answer, identify what would go wrong if you chose the other load balancer type instead.
