Chapter 9: Stop Rules + Pitfalls: When to Upgrade, Bail, or Go Manual

January 31, 2026 · 3 min read

Series: LLM Development Guide

Chapter 9 of 15

Previous: Chapter 8: Security & Sensitive Data: Sanitize, Don’t Paste Secrets

Next: Chapter 10: Measuring Success: Solo + Team Metrics Without Fake Precision

What you’ll be able to do

You’ll be able to avoid the two most common failure modes:

  • Spending hours fighting the model.
  • Shipping output you can’t review.

You’ll do it with explicit stop rules, upgrade triggers, and a short recovery checklist.

TL;DR

  • If the change is under a minute manually, do it manually.
  • If you can’t review the output competently, don’t ship it.
  • If you’re on your third attempt for the same logical unit, upgrade or re-scope.
  • Add verification steps to plans and prompts so “done” is testable.

Stop rules

These are pragmatic defaults. Tune them to your environment.

Stop rule 1: tiny changes

If it is a tiny change (one line, one rename, one version bump), do it manually.

LLM overhead is real:

  • You still have to explain the task.
  • You still have to review the output.
  • You still have to verify the result.
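
For scale, a tiny change is often one command plus a commit, which is faster than writing a prompt. A minimal sketch, assuming GNU sed and a Node-style project; the file name and version numbers are placeholders:

# One-line version bump done by hand: no prompt, no review round-trip.
sed -i 's/"version": "1.4.2"/"version": "1.4.3"/' package.json
git commit -am "bump version to 1.4.3"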

Stop rule 2: you can’t review it

Never commit code you could not explain in a review.

If you don’t understand the domain:

  • break the work into smaller pieces you can understand, or
  • involve a reviewer who does.

Stop rule 3: you’re fighting output quality

The 10-minute rule:

  • If you’ve spent about 10 minutes fighting the output, stop.
  • Upgrade the model tier, or shrink the scope to a smaller logical unit.

Stop rule 4: high-risk code needs extra caution

Be cautious with:

  • Authentication and authorization.
  • Cryptography.
  • Payment flows.
  • Input validation.

You can still use LLMs, but the bar for review and verification is higher.
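
One lightweight gate is to flag diffs that touch sensitive areas before review starts. A minimal sketch, assuming the risky code lives under paths containing auth, crypto, payment, or validation; tune the pattern to your repository layout:

# List files changed relative to main; warn if any touch high-risk paths.
git diff --name-only main...HEAD | grep -iE 'auth|crypto|payment|validation' \
  && echo "High-risk paths touched: raise the review bar."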

Top pitfalls

These show up repeatedly.

  • Trusting output without review.
  • Skipping planning.
  • Not providing reference implementations.
  • Letting sessions run too long.
  • Scope creep mid-session.
  • Vague prompts.
  • Not capturing decisions.
  • No verification step (a fix is sketched below).

A simple rule:

  • If you wouldn’t merge a junior developer’s PR without review, don’t merge LLM output without review.
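
The last pitfall has a mechanical fix: end every plan and prompt with the commands that define "done". A minimal sketch, assuming a Go project and a plan file at work-notes/plan.md; both are placeholders, so swap in your own build and test commands:

cat >> work-notes/plan.md <<'MD'
## Verification
- go build ./...
- go test ./...
- gofmt -l . (should print nothing)
MD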

Recovery checklist

When things go wrong:

  • Stop iterating on bad output.
  • Decide what kind of problem it is:
    • a prompt problem,
    • a model capability problem,
    • a task that is a poor fit for an LLM.
  • Simplify:
    • smaller logical unit,
    • more references,
    • clearer constraints.
  • Start a fresh session if context has drifted.
  • Manual fallback is a valid outcome.
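
Whichever exit you take, capture the decision so the same failure does not repeat; this also covers the "not capturing decisions" pitfall above. A minimal sketch of a log-entry template, assuming a work-notes/recovery-log.md file (the path is an assumption):

cat >> work-notes/recovery-log.md <<'MD'
## <date>: <task>
- Problem type: prompt / model capability / poor fit
- Action: <smaller unit, more references, fresh session, or manual>
- Outcome: <what actually happened>
MD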

Verification

Create a one-page stop-rules file so you can apply this consistently across tasks:

mkdir -p work-notes

cat > work-notes/stop-rules.md <<'MD'
# Stop Rules (Personal Defaults)

## Manual first
- If a change takes <= 1 minute manually, do it manually.

## Upgrade triggers
- Third attempt on same logical unit.
- Repeated misunderstandings.
- Output ignores constraints.

## Bail triggers
- I cannot review this competently.
- Task requires live debugging with runtime state.
- Sensitive data would be required to reproduce.

## Required gates
- Verification commands exist in plan.
- Verification commands exist in prompt.
- Work notes updated before continuing.
MD
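
A quick sanity check that the file landed and has all four sections:

grep -c '^## ' work-notes/stop-rules.md   # expect 4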

Expected result:

  • You have a written policy you can apply without debating every time.

Continue -> Chapter 10: Measuring Success: Solo + Team Metrics Without Fake Precision

Authors
DevOps Architect · Applied AI Engineer
I’ve spent 20 years building systems across embedded firmware, security platforms, fintech, and enterprise architecture. Today I focus on production AI systems in Go: multi-agent orchestration, MCP server ecosystems, and the DevOps platforms that keep them running. I care about systems that work under pressure: observable, recoverable, and built to last.