Session 0.8: Causal Loop Diagrams


What Is a Causal Loop Diagram?

A Causal Loop Diagram (CLD) is a visual map of the causal relationships in a system. The notation comes from the system dynamics tradition, and John Sterman's Business Dynamics (2000) is a standard reference for it. A CLD consists of two elements: nodes (variables) and arrows (causal links). Each node represents something that can increase or decrease. Each arrow shows that a change in one variable influences another.

CLDs do not show exact quantities. They show structure. The point is to make the feedback relationships in a system visible so you can reason about behavior before you start measuring or modeling.

If you have ever drawn a box-and-arrow sketch on a whiteboard to explain why a system behaves a certain way, you were building an informal CLD. The formal version adds two things: polarity on each arrow, and loop identification.

Key concept: A CLD answers the question "what influences what?" It does not answer "by how much?" That distinction matters. CLDs are for understanding structure, not for producing numbers.

Nodes and Arrows

A node in a CLD is a variable, something that can go up or down. "Request volume" is a valid node. "The database" is not, because it is a thing, not a quantity. Good node names are measurable or at least directional: response time, cache hit rate, user satisfaction, technical debt, team size.

An arrow from A to B means "a change in A causes a change in B." The arrow does not mean A is the only cause of B. It means A is a contributing cause worth including in this diagram.

Every arrow carries a polarity mark: + or -.

Polarity: Same Direction and Opposite Direction

A + (positive) link means the two variables move in the same direction. If A increases, B increases (all else being equal). If A decreases, B decreases. The "+" does not mean "good." It means "same direction."

A - (negative) link means the two variables move in opposite directions. If A increases, B decreases (all else being equal). If A decreases, B increases. The "-" does not mean "bad." It means "opposite direction."

Examples from software systems:

Request volume (+) → Database load: More requests mean more database load. Same direction.
Cache hit rate (-) → Database load: Higher cache hit rate means less database load. Opposite direction.
Response time (-) → User satisfaction: Higher response time means lower satisfaction. Opposite direction.
User satisfaction (+) → Request volume: More satisfied users come back more often. Same direction.
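
The four example links can be written down as data. The sketch below is one possible encoding, not a standard format: the `LINKS` table and the `effect_direction` helper are illustrative names, and polarity is stored as +1 or -1 so that "same direction" and "opposite direction" become a multiplication.

```python
# Hypothetical encoding: (cause, effect) -> +1 for a "+" link, -1 for a "-" link.
LINKS = {
    ("request volume", "database load"): +1,
    ("cache hit rate", "database load"): -1,
    ("response time", "user satisfaction"): -1,
    ("user satisfaction", "request volume"): +1,
}


def effect_direction(cause, effect, change):
    """Direction the effect moves (+1 up, -1 down) when the cause moves
    in direction `change`, all else being equal."""
    return LINKS[(cause, effect)] * change


# A rising cache hit rate pushes database load down.
print(effect_direction("cache hit rate", "database load", +1))  # -1
# Falling request volume also pushes database load down.
print(effect_direction("request volume", "database load", -1))  # -1
```

Note that the result says only which way the effect moves, not how far. That matches what a CLD can and cannot tell you.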

Identifying Loops

A loop exists when you can trace the arrows from a variable back to itself. To classify the loop, count the number of negative (-) links in the path.

Even number of minus signs (including zero): Reinforcing loop (R). The change amplifies itself. Growth breeds more growth. Decline breeds more decline.
Odd number of minus signs: Balancing loop (B). The change counteracts itself. The system pushes back toward equilibrium.

This counting rule works because two negatives cancel out. If A going up causes B to go down (-), and B going down causes A to go up (-), the net effect is: A going up causes A to go up. That is reinforcing.
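
The counting rule is small enough to sketch in code. The function names below are illustrative. Treating "+" as +1 and "-" as -1 and multiplying around the loop gives the same answer as counting minus signs, which is the "two negatives cancel" argument in arithmetic form.

```python
from math import prod


def loop_sign(polarities):
    """Multiply link polarities around a closed loop: +1 or -1."""
    for p in polarities:
        if p not in ("+", "-"):
            raise ValueError(f"unknown polarity: {p!r}")
    return prod(1 if p == "+" else -1 for p in polarities)


def classify_loop(polarities):
    """Return 'R' for a reinforcing loop, 'B' for a balancing loop."""
    return "R" if loop_sign(polarities) > 0 else "B"


print(classify_loop(["-", "-"]))            # R (two negatives cancel)
print(classify_loop(["+", "+", "-", "+"]))  # B (one negative)
```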

Loop naming convention: Label each loop with R1, R2 (reinforcing) or B1, B2 (balancing) and give it a short descriptive name. For example: "R1: Growth engine" or "B1: Performance degradation." Naming loops makes them easier to discuss with your team.

Delays

Not all causal effects are instant. When a change in A takes time to affect B, we mark the arrow with a delay symbol (two parallel lines: ||). Delays matter because, inside a balancing loop, they can cause oscillation and overshoot: the correction keeps responding to conditions that have already changed.

Consider auto-scaling. When CPU load increases, the auto-scaler provisions new instances. But the instances take 2 to 5 minutes to spin up, pass health checks, and start serving traffic. During that delay, the system is under-provisioned. By the time new capacity arrives, the load spike may have passed, leaving you over-provisioned. The delay in the balancing loop causes the system to oscillate around the target instead of settling smoothly.

Delays also explain why organizations overshoot when hiring. The effect of a new hire on team output is delayed by months of onboarding. In the meantime, leadership sees the team is still behind and approves more hires. By the time the first wave is productive, too many people are on the team.
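
A toy simulation can make the overshoot visible. This is a deliberately simplified sketch, not a model of any real auto-scaler: the function name, the `gain` of 0.3, the target of 100 capacity units, and the three-step delay are all assumptions chosen for illustration. At each step the controller orders a fraction of the gap it currently sees, and the order only takes effect `delay_steps` later.

```python
from collections import deque


def simulate_scaling(delay_steps, gain=0.3, target=100.0, start=50.0, steps=40):
    """Balancing loop: capacity is adjusted toward `target`, but each
    adjustment arrives only after `delay_steps` time steps."""
    capacity = start
    pipeline = deque([0.0] * delay_steps)  # adjustments ordered but not yet arrived
    history = []
    for _ in range(steps):
        order = gain * (target - capacity)  # decision ignores what is in the pipeline
        pipeline.append(order)
        capacity += pipeline.popleft()      # the oldest order takes effect now
        history.append(capacity)
    return history


immediate = simulate_scaling(delay_steps=0)
delayed = simulate_scaling(delay_steps=3)
print("peak without delay:", round(max(immediate), 1))
print("peak with delay:   ", round(max(delayed), 1))
```

Without the delay, capacity climbs toward the target and never passes it. With the delay, the controller keeps ordering capacity for a gap that earlier orders are about to close, so capacity rises past the target before it starts correcting back. The same structure describes the hiring example: the decision ignores what is already in the pipeline.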

CLD Notation Reference

Symbol	Name	Meaning
Variable name	Node	A quantity that can increase or decrease
→ (+)	Positive link	Same direction: A up, B up; A down, B down
→ (-)	Negative link	Opposite direction: A up, B down; A down, B up
|| on arrow	Delay	Effect takes significant time to manifest
R	Reinforcing loop	Even number of negative links. Amplifies change.
B	Balancing loop	Odd number of negative links. Resists change.

Three Approaches to Building a CLD

There is no single correct way to construct a CLD. Three approaches work well in practice, and you will likely use all of them at different times.

1. Jigsaw Puzzle

Start by listing every variable you think matters. Write them all down without worrying about connections. Then systematically ask: "Does changing this variable affect that one?" Connect the pairs. This approach is thorough but slow. It works well for group exercises where different stakeholders each contribute variables from their domain.
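
To see why this approach is thorough but slow, it helps to count the questions. The sketch below generates one question per ordered pair of variables, using the five variables from this session's assignment as sample input. The helper name is illustrative.

```python
from itertools import permutations

variables = [
    "request volume",
    "cache hit rate",
    "database load",
    "response time",
    "user satisfaction",
]


def pairwise_questions(variables):
    """One question per ordered pair, since A -> B and B -> A are different links."""
    return [
        f"Does changing {a} affect {b}?"
        for a, b in permutations(variables, 2)
    ]


questions = pairwise_questions(variables)
print(len(questions))  # 20
print(questions[0])    # Does changing request volume affect cache hit rate?
```

Five variables already produce 20 questions, and the count grows as n × (n - 1). That is manageable in a workshop with a handful of variables and tedious well before the 15-node range.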

2. Mental Model

Start from your understanding of how the system works. Draw the loops you believe exist, then validate them against data or the experience of others. This is fast but biased. You will tend to draw the loops you already know and miss the ones you do not. Combine this approach with peer review to catch blind spots.

3. Start With One

Pick a single variable that concerns you, such as response time or deployment frequency. Ask: "What does this variable affect?" Draw those arrows. Then ask: "What affects each of those?" Keep expanding outward until you find loops. This approach is focused and efficient. It naturally centers the diagram on the problem you care about.
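
The outward expansion resembles a breadth-first search. The sketch below is a simplification under stated assumptions: the `AFFECTS` table stands in for the answers you would collect by asking "what does this variable affect?", its contents are hypothetical, and the search follows effects only (it does not model the reverse question, "what affects this?").

```python
from collections import deque

# Hypothetical answers to "what does this variable affect?"
AFFECTS = {
    "response time": ["user satisfaction"],
    "user satisfaction": ["request volume"],
    "request volume": ["database load"],
    "database load": ["response time"],
}


def expand_from(seed, affects):
    """Expand outward from `seed`. Returns the variables in discovery order
    and whether some chain of effects leads back to the seed."""
    seen = [seed]
    queue = deque([seed])
    closes_loop = False
    while queue:
        current = queue.popleft()
        for effect in affects.get(current, []):
            if effect == seed:
                closes_loop = True      # a path has returned to where we started
            elif effect not in seen:
                seen.append(effect)
                queue.append(effect)
    return seen, closes_loop


order, found_loop = expand_from("response time", AFFECTS)
print(order)       # ['response time', 'user satisfaction', 'request volume', 'database load']
print(found_loop)  # True
```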

Example: Caching System CLD

Consider a system where users make requests, and a cache sits between the application and the database. Here is the causal structure:

graph TD
    RV["Request Volume"] -->|"(+)"| CU["Cache Usage"]
    CU -->|"(+)"| CHR["Cache Hit Rate"]
    CHR -->|"(-)"| DBL["Database Load"]
    DBL -->|"(+)"| RT["Response Time"]
    RT -->|"(-)"| US["User Satisfaction"]
    US -->|"(+)"| RV
    CHR -->|"(-)"| RT
    RV -->|"(+)"| DBL

Trace the two main loops in this diagram:

R1 (Growth loop): Request Volume (+) → Cache Usage (+) → Cache Hit Rate (-) → Response Time (-) → User Satisfaction (+) → Request Volume. Count the negatives: two. Even number. Reinforcing. More requests exercise the cache, a higher hit rate lowers response time, and happier users generate more requests.
B1 (Overload loop): Request Volume (+) → Database Load (+) → Response Time (-) → User Satisfaction (+) → Request Volume. Count the negatives: one. Odd number. Balancing. More requests increase database load, which slows responses, which reduces satisfaction, which reduces request volume.

The system's behavior depends on which loop dominates. If the cache hit rate is high, R1 dominates and the system grows healthily. If the cache is cold or poorly configured, B1 dominates and the system degrades under load.
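
Tracing loops by hand gets error-prone as diagrams grow, so it can help to check the trace mechanically. The sketch below encodes the eight links of the diagram above and enumerates every closed path with a small depth-first search. The function names and the search itself are illustrative, not a standard CLD tool.

```python
EDGES = {
    ("Request Volume", "Cache Usage"): "+",
    ("Cache Usage", "Cache Hit Rate"): "+",
    ("Cache Hit Rate", "Database Load"): "-",
    ("Database Load", "Response Time"): "+",
    ("Response Time", "User Satisfaction"): "-",
    ("User Satisfaction", "Request Volume"): "+",
    ("Cache Hit Rate", "Response Time"): "-",
    ("Request Volume", "Database Load"): "+",
}


def find_loops(edges):
    """Enumerate each simple loop once, starting it at its alphabetically first node."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    order = {name: i for i, name in enumerate(sorted({n for e in edges for n in e}))}
    loops = []

    def walk(start, node, path):
        for nxt in graph.get(node, []):
            if nxt == start:
                loops.append(path[:])  # closed the loop
            elif order[nxt] > order[start] and nxt not in path:
                path.append(nxt)
                walk(start, nxt, path)
                path.pop()

    for start in order:
        walk(start, start, [start])
    return loops


def classify(loop, edges):
    """Count negative links around the loop, including the closing link."""
    pairs = zip(loop, loop[1:] + loop[:1])
    negatives = sum(edges[pair] == "-" for pair in pairs)
    return "R" if negatives % 2 == 0 else "B"


for loop in find_loops(EDGES):
    print(classify(loop, EDGES), " -> ".join(loop))
```

On this diagram the search reports three closed paths: R1 and B1 as traced above, plus a second reinforcing path that runs from Cache Hit Rate through Database Load to Response Time (two negatives). That path points the same way as R1, which is consistent with the claim that a high hit rate tips the system toward healthy growth.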

Common Mistakes

When you first start drawing CLDs, watch for these common errors:

Using things instead of variables. "Database" is not a valid node. "Database load" or "database query latency" is. Nodes must be quantities that can change direction.
Confusing polarity with value judgments. A positive link is not a good thing. A negative link is not a bad thing. They describe directional relationships, not outcomes.
Drawing too many variables. As a rule of thumb, a useful CLD has roughly 5 to 15 nodes. Beyond that, it becomes hard to read. Focus on the variables most relevant to the behavior you are trying to understand.
Forgetting delays. If an effect takes hours, days, or weeks to manifest, mark it. Delays change system behavior dramatically.

Assignment

Draw a CLD for a caching system with these five variables: request volume, cache hit rate, database load, response time, and user satisfaction.

Place all five variables as nodes.
Draw arrows between them with + or - polarity. Justify each polarity choice in one sentence.
Identify at least one reinforcing loop and one balancing loop. Label them R1 and B1.
Add at least one delay mark (||) where you think a causal effect is not instantaneous. Explain why that delay matters for system behavior.
