The Human Gates
Session 2.4 · ~5 min read
A quality gate is a point in your pipeline where production stops until a human reviews and approves the output. Not "AI checks AI." Not "it's probably fine." A human with domain knowledge looks at the work and decides whether it passes or fails. If it fails, it goes back to the previous stage. If it passes, it moves forward.
Quality gates are expensive in time. They are non-negotiable in quality. The question is not whether to have them. The question is where to place them and what criteria to apply.
The Minimum Viable Gate Structure
Every content pipeline needs at least three human gates. Fewer than three means you are publishing content that has not been adequately reviewed. More than three is fine, but three is the floor.
```mermaid
flowchart TD
    A["Specification"] --> G1["GATE 1: Spec Review<br/>Is the plan right?"]
    G1 -->|"Pass"| B["AI Generation"]
    G1 -->|"Fail"| A
    B --> G2["GATE 2: Output Review<br/>Does output meet spec?"]
    G2 -->|"Pass"| C["Editing + Formatting"]
    G2 -->|"Fail"| B
    C --> G3["GATE 3: Pre-Publish Review<br/>Ready for audience?"]
    G3 -->|"Pass"| D["Publish"]
    G3 -->|"Fail"| C
    style G1 fill:#2a2a28,stroke:#c8a882,color:#ede9e3
    style G2 fill:#2a2a28,stroke:#c8a882,color:#ede9e3
    style G3 fill:#2a2a28,stroke:#c8a882,color:#ede9e3
```
A quality gate is where production stops until a human says "go." No automation, no "AI checks AI," no "it's probably fine."
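The gate structure above can be sketched as a simple control loop: each stage's output is held at a human gate, and a failure sends the work back to the stage that produced it. This is a minimal illustration, not a prescribed tool; the stage and gate names mirror the diagram, and `do_stage` / `human_review` are hypothetical callables you would supply.

```python
# Sketch of the three-gate pipeline: work only advances when a human
# reviewer passes it, and a failed gate reruns the same stage.
PIPELINE = [
    ("Specification", "GATE 1: Spec Review"),
    ("AI Generation", "GATE 2: Output Review"),
    ("Editing + Formatting", "GATE 3: Pre-Publish Review"),
]

def run_pipeline(do_stage, human_review):
    """do_stage(name, work) produces that stage's output;
    human_review(gate, work) returns True (pass) or False (fail,
    redo the same stage). Returns the work that cleared Gate 3."""
    work = None
    for stage, gate in PIPELINE:
        while True:
            work = do_stage(stage, work)
            if human_review(gate, work):
                break  # gate passed: move to the next stage
    return work  # cleared all three gates; ready to publish
```

The point of the loop: a gate failure never skips forward. The only exits are "pass" or "redo."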
Gate 1: Specification Review
Before any AI generation begins, a human reviews the specification. This gate prevents the most expensive failure: generating content from a bad plan.
| Check | Question | Fail Condition |
|---|---|---|
| Audience | Is the target audience clearly defined? | Vague or missing audience definition |
| Purpose | Why does this content exist? | "Because we need content" is not a purpose |
| Sources | Are research inputs sufficient? | No primary sources, no expert input |
| Structure | Is the outline specific and logical? | Generic structure, no clear argument flow |
| Voice | Are voice constraints documented? | No voice specification |
| Constraints | Are forbidden patterns listed? | No negative constraints |
Gate 1 catches problems that would be expensive to fix later. A specification missing voice constraints produces output that does not sound like you. A specification with no research inputs produces output that contains no original information. Catching these at Gate 1 costs minutes. Catching them after generation costs hours.
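A Gate 1 checklist is mechanical enough to pre-screen before the human review. The sketch below assumes the specification lives in a plain dict; the field names mirror the table above and are an illustration, not a schema.

```python
# Minimal Gate 1 pre-screen: flag missing or empty specification fields
# before a human reviews the plan. Field names are illustrative.
REQUIRED_SPEC_FIELDS = [
    "audience", "purpose", "sources", "structure", "voice", "constraints",
]

def gate1_spec_review(spec: dict) -> list[str]:
    """Return a list of failure reasons; an empty list means pass."""
    failures = []
    for field in REQUIRED_SPEC_FIELDS:
        if not spec.get(field):  # missing or empty is a fail condition
            failures.append(f"{field}: missing or empty")
    # "Because we need content" is not a purpose (see the table above).
    if spec.get("purpose", "").strip().lower() == "because we need content":
        failures.append("purpose: 'because we need content' is not a purpose")
    return failures
```

A script like this does not replace the human reviewer; it only guarantees the reviewer never wastes time on a spec with obvious holes.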
Gate 2: Output Review
After AI generates content, a human reviews the output against the specification. This is not a "does this look okay?" check. It is a systematic comparison of output against defined criteria.
| Check | Question | Fail Condition |
|---|---|---|
| Format compliance | Does the output match the specified structure? | Wrong number of sections, missing elements |
| Content coverage | Are all specified topics covered? | Missing subtopics, added unrequested content |
| Factual accuracy | Are claims verifiable? | Unsourced claims, hallucinated data |
| Voice compliance | Does the voice match the specification? | AI voice markers present (hedging, filler, false enthusiasm) |
| Forbidden patterns | Are forbidden patterns absent? | Any forbidden pattern present |
| Originality | Does the content contain the specified original elements? | Generic content with no unique perspective |
Gate 2 is where the 15 forensic markers from Module 1 become operational tools. Scan the output for hedging, filler, false enthusiasm, hollow metaphors, and the other markers. If the marker density exceeds your threshold (a reasonable starting point is 5 markers per 1,000 words), the output fails and goes back to generation with adjusted prompts.
Gate 3: Pre-Publish Review
The final gate asks one question: would you attach your name to this? Not "is it good enough." Not "will it rank." Would you show this to your most respected colleague and feel confident about it?
Gate 3 checks what the other gates do not: overall impression, coherence, and whether the piece achieves its purpose as a whole. Individual sections might pass Gate 2 while the overall piece lacks flow or coherence. Gate 3 is the human reading the complete work as a reader would experience it.
Designing Gate Criteria
Gate criteria must be specific enough that someone other than you could apply them. "Is it good?" is not a gate criterion. "Does every factual claim include a citation?" is a gate criterion. The test: could you hand the criteria to a competent colleague and get the same pass/fail result?
```mermaid
flowchart TD
    A["Write gate criteria"] --> B{"Could a colleague
apply these criteria
consistently?"}
    B -->|"Yes"| C["Criteria are
specific enough"]
    B -->|"No"| D["Revise: make
criteria binary
and verifiable"]
    D --> A
```
Good gate criteria are binary (pass or fail, no "kind of"), specific (checking a defined attribute), and verifiable (another person can confirm the result). Building these criteria takes time upfront. It saves far more time downstream, because every piece of content then goes through the same consistent review process.
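Binary, specific, and verifiable criteria map naturally onto named predicates: any reviewer who runs them gets the same pass/fail result. A sketch, with hypothetical example criteria operating on a summary dict of the artifact:

```python
# Gate criteria as (name, predicate) pairs: each check is binary and
# can be re-run by any reviewer to confirm the result.
def run_gate(criteria, artifact):
    """Apply every criterion; return (passed, names of failed checks)."""
    failed = [name for name, check in criteria if not check(artifact)]
    return (len(failed) == 0, failed)

# Hypothetical Gate 2 criteria; replace with your own specification.
GATE2_CRITERIA = [
    ("has_all_sections", lambda a: a["sections"] == a["spec_sections"]),
    ("claims_cited", lambda a: a["uncited_claims"] == 0),
    ("no_forbidden_patterns", lambda a: not a["forbidden_hits"]),
]
```

Reporting *which* checks failed, rather than a bare fail, tells the previous stage exactly what to fix on the next pass.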
Further Reading
- Quality Gate (Wikipedia)
- Prompt Engineering Overview (Anthropic Documentation)
- Quality Management System (Wikipedia)
- Creating Helpful, Reliable, People-First Content (Google Search Central)
Assignment
- Define 3 quality gates for your content pipeline.
- For each gate, specify:
- Where in the pipeline it occurs
- What gets checked (list specific criteria)
- What determines pass vs. fail (binary, verifiable conditions)
- What happens when something fails (regenerate? revise? discard?)
- Write the criteria as if you are training someone else to run your gates. Could a competent colleague apply your criteria and reach the same pass/fail decisions you would?
- Test your Gate 2 criteria on a piece of AI-generated content. Does it pass or fail? Do the criteria catch the right problems?