
Every Approval Prompt Was Making You a Worse Operator

May 28, 2026. Updated for May 2026.

Yes. Yes. Yes. Yes.

Every approval prompt in Claude Code was slowly turning you into a worse operator.

At first the prompts feel responsible. Approve this command? Approve this file edit? Approve this test run? Of course you want to see what the agent is about to do. Oversight is the whole point.

Then you build with agents for a week. And somewhere in that week, you stop evaluating the prompts. You just click yes. Yes. Yes. Yes. The dialog appears, your thumb hits approve before your eyes finish reading, and the agent rolls on. The ritual still happens. The judgment behind it is gone.

That's not a discipline problem. It's what the design trains you to do.

The Ring Doorbell Problem

A Ring doorbell that goes off every fifteen minutes stops being a security system.

The first day, you check every alert. By the third, it's the delivery guy, the neighbor's dog, a car passing, a leaf in the wind, another false alarm. By the second week, you've muted the part of your brain that was supposed to be watching. So when something actually worth seeing crosses the camera, you scroll past it with the rest.

Approval prompts fail the same way. A safety mechanism that fires constantly doesn't make you more careful. It trains you to ignore the one moment when care actually matters. The signal you needed is buried under a thousand identical interruptions you taught yourself to dismiss.

The cruelty of it is that the prompt feels like safety the entire time it's eroding it. You're clicking approve, so you must be in control. You're not. You're pattern-matching your way past a dialog box.

Fake Control

The problem was never too much autonomy. The problem was fake control.

A prompt you don't read isn't a safety check. It's a speed bump that you, the human, have been quietly trained to drive straight over. It costs you a half-second of friction and buys you zero protection, because the part that was supposed to protect you (your attention) checked out three hundred clicks ago.

Real control is the opposite of constant. It's scarce on purpose. You want the system silent for the routine stuff and loud for the one command that could delete something you can't get back. A safety layer that can't tell the difference between "run the tests" and "force-push to main" isn't a safety layer. It's a metronome you've learned to tune out.

This is why the volume of prompts matters more than their existence. Ask a person to make three hundred trivial yes-or-no calls a day and the three-hundred-and-first, the one that mattered, gets the same reflex as the rest.

Classify the Risk, Not Every Action

The fix that shipped with Auto Mode, announced at Anthropic's Code with Claude conference, is the obvious one in hindsight. Instead of asking you to approve everything, the agent classifies the risk first.

The logic is simple. Is this command destructive, hard to reverse, or wide in blast radius? Could this be prompt injection from something the agent just read in a file or a web page? If the answer to both is no, the agent keeps moving. If the answer to either is yes, it stops and flags it.

That maps to how careful people actually work. You don't deliberate over reading a file or running a test, because both are reversible and contained. You do stop and think before dropping a database table, force-pushing over someone's work, or acting on an instruction that appeared inside a document you were only supposed to summarize. The weight of the decision should match the weight of the action.

Reversible and local: let it run. Irreversible, shared, or suspicious: surface it loudly, while your attention is still intact. Same number of genuinely important decisions, a fraction of the noise around them.
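To make the triage concrete, here is a minimal sketch of the idea in Python. It is not how Auto Mode is implemented; the pattern lists, the `classify` function, and the `Verdict` type are all hypothetical names invented for illustration, and a real classifier would need far more than a handful of regular expressions.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only (assumption): commands that are destructive,
# hard to reverse, or that touch shared state.
HIGH_RISK_PATTERNS = [
    r"\bgit\s+push\b.*(--force\b|\s-f\b)",  # force-push over shared history
    r"\bgit\s+reset\s+--hard\b",            # discards local work
    r"\brm\s+-\w*r\w*\b",                   # recursive delete
    r"\bdrop\s+(table|database)\b",         # destructive SQL
]

# Illustrative phrases (assumption) that hint an instruction was smuggled in
# through a file or web page the agent was only supposed to read.
INJECTION_HINTS = [
    r"ignore (all |any )?(previous|prior) instructions",
    r"disregard (the )?(above|previous)",
]


@dataclass
class Verdict:
    action: str  # "run" or "flag"
    reason: str


def classify(command: str, untrusted_context: str = "") -> Verdict:
    """Decide whether a command runs silently or is surfaced to the human."""
    lowered = command.lower()
    for pattern in HIGH_RISK_PATTERNS:
        if re.search(pattern, lowered):
            return Verdict("flag", f"matches high-risk pattern: {pattern}")
    context = untrusted_context.lower()
    for pattern in INJECTION_HINTS:
        if re.search(pattern, context):
            return Verdict("flag", "possible prompt injection in content the agent read")
    return Verdict("run", "no high-risk pattern matched")


if __name__ == "__main__":
    for cmd in ["pytest -q", "git push --force origin main", "cat notes.txt"]:
        print(cmd, "->", classify(cmd).action)
```

The shape matters more than the patterns: the default verdict is "run", and the human only hears about the short list of things that would be expensive to get wrong.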

How to Actually Build Trust

Risk classification means the default now does the triage for you, but the principle generalizes to how you set up any agent.

Spend your attention where it's irreversible. Local edits, test runs, reads, anything you can undo with a keystroke or a git checkout: let those flow. Reserve your judgment for the operations that touch shared state or can't be walked back. That's where a human glance still earns its keep.

Write the trust down where the agent reads it. Repository etiquette, which commands are safe, which paths are off limits, the gotchas specific to your project: that belongs in a lightweight CLAUDE.md or AGENTS.md, not in three hundred individual approvals. Encode the boundary once and the agent respects it every session. Keeping that file thin and pointed is its own discipline.

Make the irreversible things hard to do by accident. Branch protection, sandboxes, scoped credentials, a dry-run flag. Guardrails that live in the environment don't suffer alarm fatigue, because they don't depend on a tired human reading a dialog at 6pm.
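As one example of a guardrail that lives in the environment rather than in a dialog, here is a rough sketch of a delete helper that is scoped to one directory and defaults to a dry run. The function name `delete_tree` and its parameters are hypothetical, made up for this post; the point is only that the safe behavior is the default and the destructive one has to be asked for explicitly.

```python
import shutil
from pathlib import Path


def delete_tree(target: Path, allowed_root: Path, dry_run: bool = True) -> str:
    """Delete `target` only if it sits strictly inside `allowed_root`.

    Hypothetical helper for illustration. It defaults to a dry run, so an
    accidental call reports what it would do instead of doing it.
    """
    target = target.resolve()
    allowed_root = allowed_root.resolve()

    # Scope check: refuse the root itself and anything outside it.
    if target == allowed_root or allowed_root not in target.parents:
        raise PermissionError(f"{target} is outside {allowed_root}")

    if dry_run:
        return f"would delete {target}"

    shutil.rmtree(target)
    return f"deleted {target}"
```

Calling `delete_tree(Path("build"), Path("."))` only reports what would happen; the caller has to pass `dry_run=False` to actually remove anything, and a path outside the allowed root raises no matter what.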

The throughline is that trust isn't a click. It's a setup. You decide once, deliberately, what deserves a human in the loop, and you let the rest run.

Attention Is the Budget

The reason this matters more every month is that the number of agents you run is going up, not down. The day you're running four agents in parallel, the approve-everything model is already dead. At roughly three hundred prompts per agent, that's twelve hundred prompts a day across four threads, and you cannot read all of them and still have judgment left for the one that counts.

Agents are built to loop until the goal is met, adapting as they go. (That loop is the whole point of an agent.) Standing between the agent and every step of its own loop defeats the thing you hired it for, and it doesn't even keep you safe, because you stopped reading the prompts in week one.

Your attention is the scarce resource, not the agent's compute. Spend it on the handful of moments that are genuinely irreversible, automate the boundary for everything else, and stop clicking yes to questions you're no longer reading.