Why We Told the AI to Disagree With Us

The most useful instruction we gave the assistant building this site was not a task at all. It was a standing permission to disagree. Over and over, in different words, the same direction: vet this thought and push back if it does not hold; give the right advice regardless of what I just said I wanted; if something is unclear or you see a better path, raise it before you act.

That instruction matters because the default of a helpful assistant runs the other way. Left alone, it tends to take the request as given and produce a competent version of exactly what was asked, which is wonderful until the request itself is the weak part. When you are making real decisions, a collaborator who only ever agrees is not safe, because the agreement tells you nothing. You cannot tell the difference between a good idea and a bad one if the response to both is yes.

A useful AI collaborator is one you instruct to argue against the plan, because its default is to agree, and a default yes hides the objection you needed to hear.

The value showed up exactly where the instruction did. When we leaned toward deleting the References section of the site, the honest answer did not just execute it; it laid out the case for keeping it and made us decide on the merits rather than the impulse. When we proposed a new term for the core framework, the next step was not to apply it but to pressure-test the name against existing uses for collisions, so we changed it knowing what we were choosing. In both, the useful part was the objection, not the compliance. The agreement would have felt better and taught us nothing.

Setting this up is a matter of asking for the right output, not hoping for a mood. Ask for the counterargument by name: what is wrong with this plan, where does it break, what would a skeptic say. Ask for the objections alongside the proposal, so you see both at once instead of a clean recommendation with its doubts hidden. Tell it to ask a question when the request is ambiguous rather than guess and proceed. The instruction is to argue and to ask, and the result is that the decision stays yours, made against the strongest version of the other side rather than a polite silence.

Where this goes next

Keeping a person at the point of decision, with the machine arguing rather than deciding, is the supervision principle our essay The Human Factors builds on, and it runs through the human-factors part of The Builder's Stack. The same instinct, never trusting a single confident output, is the subject of How We Caught AI Mistakes With a Second Agent.

Sources

The supervision principle, the person stays the decider, is covered in our essay The Human Factors.
The companion habit of verifying rather than trusting output is in How We Caught AI Mistakes With a Second Agent.