The naive way to use a model is one prompt at a time: you ask, you read the reply, you adjust the prompt, you ask again. You are the loop. A better structure hands the loop to the system. You give it a goal, a way to score its own attempts, and a limit on how many tries it gets before a person looks. The pattern is everywhere now under the name AI loop, and the clearest writeup we have read breaks it into three parts: an objective, a metric, and a boundary.
An AI loop is three things: an objective worth reaching, a metric the work is scored against, and a boundary that decides when to stop and hand back to a person.
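The three parts map onto very little code. What follows is a minimal sketch, not the implementation from the writeup: `Loop`, `run_loop`, `toy_generate`, and `toy_metric` are hypothetical names, and the toy generator stands in for a real model call. It shows the shape of a loop that scores its own attempts and stops either at the target or at the limit.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Loop:
    objective: str                   # the goal, in one sentence
    metric: Callable[[str], float]   # scores a draft; higher is better
    target: float                    # score that counts as "reached"
    max_attempts: int                # boundary: tries before a person looks


def run_loop(loop: Loop, generate: Callable[[str, Optional[str]], str]) -> dict:
    """Generate, score, and revise until the target or the attempt limit is hit."""
    best_draft, best_score = None, float("-inf")
    attempt = 0
    for attempt in range(1, loop.max_attempts + 1):
        # The generator sees the objective and the best draft so far.
        draft = generate(loop.objective, best_draft)
        score = loop.metric(draft)
        if score > best_score:
            best_draft, best_score = draft, score
        if best_score >= loop.target:
            break
    # Either way the result goes back to a person, with the score attached.
    return {
        "draft": best_draft,
        "score": best_score,
        "attempts": attempt,
        "reached_target": best_score >= loop.target,
    }


# Toy stand-ins (assumptions for illustration only, not a real model or metric).
def toy_generate(objective: str, previous: Optional[str]) -> str:
    return "draft" if previous is None else previous + " more"


def toy_metric(draft: str) -> float:
    return min(1.0, len(draft.split()) / 5)


result = run_loop(
    Loop(objective="reach five words", metric=toy_metric, target=1.0, max_attempts=10),
    toy_generate,
)
print(result)
```

The `reached_target` flag is the useful part: when the loop stops at the limit instead of the target, the person picking it up can see that at a glance.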
We did not set out to use this. We kept reaching for it, because building a site this size one prompt at a time would have taken forever. Three of the loops we ran:
- The narration script. Objective: a two-minute script that holds attention and follows our voice rules. Metric: score each draft against those rules, and against whether each line earns the next. Boundary: write several, grade them all, hand back the top two. The system produced a batch, scored each, and returned the strongest two with their grades, and we chose from there.
- The essay reframe. When a piece was not landing, we generated several different reframings and scored each with a panel of independent judges on clarity and voice. The judges were the metric, and the ranked shortlist was the boundary, so only the few that cleared the bar reached us.
- The voice rules. A loop that learns, not one that only runs: every time we caught a sentence that read wrong, the correction became a written rule, so the standard the next draft was scored against was stricter than the one before. None of it was automated, but the metric kept sharpening, so the work did too.
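The first two loops share a shape: produce a batch, score every draft, and return only the top few with their grades. Here is a hedged sketch of that shape, assuming each judge is a function that returns a score between 0 and 1. The `shortlist` function and both toy judges are hypothetical stand-ins, not the judges we used.

```python
from statistics import mean
from typing import Callable, List, Tuple

Judge = Callable[[str], float]  # assumed: returns a score between 0 and 1


def shortlist(drafts: List[str], judges: List[Judge], keep: int = 2) -> List[Tuple[str, float]]:
    """Score every draft with every judge, average the scores, return the top `keep`."""
    graded = [(draft, mean(judge(draft) for judge in judges)) for draft in drafts]
    graded.sort(key=lambda pair: pair[1], reverse=True)
    return graded[:keep]


# Toy judges, for illustration only.
def brevity_judge(draft: str) -> float:
    return 1.0 if len(draft.split()) <= 8 else 0.5


def plain_words_judge(draft: str) -> float:
    return 0.0 if "synergy" in draft.lower() else 1.0


drafts = [
    "We built the site in loops.",
    "Our synergy unlocked the site.",
    "We built the site by running many small loops, each scored, one after another.",
]

top_two = shortlist(drafts, [brevity_judge, plain_words_judge], keep=2)
for draft, grade in top_two:
    print(f"{grade:.2f}  {draft}")
```

The function returns grades alongside drafts because the boundary here is a person: whoever chooses between the top two should see why they ranked where they did.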
One difference is worth flagging. The loops written up elsewhere are built to run on their own, scoring and revising without you until they hit the limit. Most of ours were built to stop early and hand the decision to a person. That is the same loop with the dial turned toward judgment, which is the argument of "Taste Is a Muscle, Not a Gift" and "How We Used Preview Labs to Design This Site": the machine is good at generating and scoring, and the choosing is the part we kept.
This is the discipline the Evals part of The Builder's Stack teaches, where the quality bar is the metric and the regression gate is the boundary. It belongs to our core framework, Shape · Ship · Track.
If you want to build your own, start where we did: write the objective in one sentence, write the metric as something you could actually score a draft against, and decide the boundary, including whether that boundary is a person.
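One way to make "something you could actually score a draft against" concrete is to write each rule as a small check and score a draft by the fraction of checks it passes. The sketch below assumes that approach. The rule names and checks are invented for illustration and are not our actual voice rules, and `score`, `failed`, and `add_rule` are hypothetical helpers.

```python
from typing import Callable, Dict, List

Rule = Callable[[str], bool]  # returns True when the draft passes the rule

# Illustrative rules only; a real set would come from your own corrections.
voice_rules: Dict[str, Rule] = {
    "no exclamation marks": lambda d: "!" not in d,
    "sentences of 25 words or fewer": lambda d: all(
        len(sentence.split()) <= 25 for sentence in d.split(".")
    ),
}


def score(draft: str, rules: Dict[str, Rule]) -> float:
    """Return the fraction of rules the draft passes (1.0 when there are no rules)."""
    if not rules:
        return 1.0
    return sum(1 for check in rules.values() if check(draft)) / len(rules)


def failed(draft: str, rules: Dict[str, Rule]) -> List[str]:
    """Name the rules the draft breaks, so the grade comes with reasons."""
    return [name for name, check in rules.items() if not check(draft)]


def add_rule(rules: Dict[str, Rule], name: str, check: Rule) -> Dict[str, Rule]:
    """Return a stricter rule set; each caught mistake becomes a written rule."""
    return {**rules, name: check}


draft = "We leverage loops to ship faster."
before = score(draft, voice_rules)

# A person catches a word that reads wrong and turns the correction into a rule.
stricter = add_rule(voice_rules, "avoid 'leverage'", lambda d: "leverage" not in d.lower())
after = score(draft, stricter)

print(before, after, failed(draft, stricter))
```

The same draft scores lower after the new rule is added. Nothing retrains and nothing is automated; the standard just gets written down, so the next draft is scored against a stricter one.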
Sources
- The three-part loop (objective, metric, boundary) and the "loops that learn" distinction come from "Learning to Write Your First AI Loop," simple.ai: https://simple.ai/p/learning-to-write-your-first-ai-loop
- The eval discipline this mirrors is the Evals part of The Builder's Stack; the framework is Shape · Ship · Track.