Build Ops Tools That Are Safe by Construction
When a tool can change production, safety shouldn't live in the operator's caution — it should live in the shape of the tool: read-only by default, the dangerous verbs quarantined and clearly named, and confirmation that scales with the blast radius.
- Security
- Operations
- Tooling
- Automation
Every operator has a story about the command that did more than they meant. The usual takeaway is “be more careful,” which is useless advice — careful people have bad days, and a tool that’s only safe when you’re sharp isn’t safe. I’d rather the safety live in the tool, not in my attention span. When I build tooling that can change real infrastructure, I design it so the dangerous things are hard to do by accident and the safe things are the path of least resistance. Here’s the shape of that.
Separate the verbs that can hurt you
The first move is structural: put the operations that change things in clearly-named, separate units from the ones that only read. When the read-only inventory commands live in one place and the create/modify/delete commands live in another — named so the dangerous ones announce themselves — a few good things happen. You can hand someone the read-only tool with no anxiety. You can audit the mutating surface by looking at one small, obvious place. And nobody fat-fingers a destructive verb while meaning to list something, because the destructive verbs aren’t sitting in the same drawer as the safe ones.
Make the blast radius legible from the command’s name. “List” and “delete” should not live in the same easily-confused namespace.
This is just least privilege expressed as ergonomics: the common, safe case is frictionless and always available; the rare, dangerous case is somewhere you have to go on purpose. The split is also a documentation win — a new person can see, at a glance, exactly which commands are the ones to respect.
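Here’s a minimal sketch of that split in Python. Everything in it is an illustrative assumption rather than a real tool: the `read_only` and `mutating` decorators, the `danger_` naming prefix, and the in-memory `INVENTORY` that stands in for actual infrastructure. The point it shows is that the read-only entry point has no route to the mutating verbs.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for real infrastructure state.
INVENTORY: Dict[str, str] = {"web-1": "running", "web-2": "stopped"}

# Two separate registries: the "drawers" never mix.
READ_COMMANDS: Dict[str, Callable[..., object]] = {}
MUTATING_COMMANDS: Dict[str, Callable[..., object]] = {}


def read_only(fn: Callable[..., object]) -> Callable[..., object]:
    """Register a command that only inspects state."""
    READ_COMMANDS[fn.__name__] = fn
    return fn


def mutating(fn: Callable[..., object]) -> Callable[..., object]:
    """Register a command that changes state; the name must announce it."""
    if not fn.__name__.startswith("danger_"):
        raise ValueError(f"mutating command {fn.__name__!r} must start with 'danger_'")
    MUTATING_COMMANDS[fn.__name__] = fn
    return fn


@read_only
def list_hosts() -> List[str]:
    return sorted(INVENTORY)


@mutating
def danger_delete_host(name: str) -> None:
    del INVENTORY[name]


def run_inspect(command: str, *args: str) -> object:
    """Entry point for the read-only tool: it can only see READ_COMMANDS."""
    if command not in READ_COMMANDS:
        raise KeyError(f"{command!r} is not a read-only command")
    return READ_COMMANDS[command](*args)
```

Auditing the mutating surface then means reading one dictionary, `MUTATING_COMMANDS`, and the read-only entry point can be handed out on its own.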
Default to read-only
The safe state should be the default state. A tool that does nothing destructive unless you explicitly ask it to is a tool you can explore, demo, and hand off without a warning label. Reading is free; changing requires intent. I want “I just wanted to look” to be literally impossible to turn into “I changed something,” and the way you get there is by making the inspecting path the one that costs nothing and the mutating path the one that requires a deliberate, different invocation.
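As a rough sketch of what “changing requires intent” can look like, here is a command that only reports what it would do unless it is invoked with a separate flag. The tool name `hosttool`, the `--apply` flag, and the restart-counter `state` dictionary are all hypothetical, chosen only for illustration.

```python
import argparse
from typing import Dict, List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="hosttool")  # hypothetical tool name
    parser.add_argument("target")
    # The mutating path is opt-in; omitting the flag leaves the tool read-only.
    parser.add_argument(
        "--apply",
        action="store_true",
        help="actually make the change; without this flag nothing is modified",
    )
    return parser


def restart(state: Dict[str, int], argv: List[str]) -> str:
    """Report the intended restart, and perform it only when --apply is given."""
    args = build_parser().parse_args(argv)
    if not args.apply:
        return f"would restart {args.target} (read-only; pass --apply to change it)"
    state[args.target] = state.get(args.target, 0) + 1
    return f"restarted {args.target}"
```

Running it with just a target name costs nothing and changes nothing; the change happens only under the deliberate, different invocation.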
Confirmation should scale with the blast radius
Not every dangerous action deserves the same friction, and getting this calibration right is what keeps the gates from being ignored. A confirmation that’s too heavy for a trivial action trains people to bypass it; one that’s too light for a catastrophic action is decoration. So I scale the gate to the consequences:
- A single, scoped change — confirm by typing the exact name of the target. That forces you to look at what specifically you’re about to affect, not just mash “y.” Typing the real name is a tiny act of attention at exactly the right moment.
- A batch, or something broad — show the full plan of everything that will be touched, then require an explicit confirmation after the operator has seen the list. The plan-then-confirm is itself a dry run: you see the blast radius before you authorize it.
- Anything irreversible — say so loudly, in the prompt, in plain words. “This cannot be undone” belongs on the screen, not in the docs.
The principle is that the friction should be proportional to how much you’d regret a mistake. Cheap-to-undo gets a light touch; impossible-to-undo gets a wall.
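The three tiers above can be sketched as a single gate. This is one possible shape under stated assumptions, not a definitive implementation: the `confirm` function, its parameters, and the `apply N` phrase for batches are invented for the example, and `ask` and `show` are injectable so the gate can be exercised without a terminal.

```python
from typing import Callable, List


def confirm(
    targets: List[str],
    irreversible: bool,
    ask: Callable[[str], str] = input,
    show: Callable[[str], None] = print,
) -> bool:
    """Return True only if the operator gave the confirmation this action needs."""
    if not targets:
        return False  # nothing to authorize

    if irreversible:
        # Said loudly, in the prompt, in plain words.
        show("WARNING: This cannot be undone.")

    if len(targets) == 1:
        # Single scoped change: the operator types the exact target name.
        answer = ask(f"Type the name of the target to confirm ({targets[0]}): ")
        return answer.strip() == targets[0]

    # Batch: show the full plan first, then require an explicit confirmation.
    show(f"Plan: {len(targets)} targets will be changed:")
    for name in targets:
        show(f"  - {name}")
    expected = f"apply {len(targets)}"  # hypothetical confirmation phrase
    answer = ask(f"Type '{expected}' to proceed: ")
    return answer.strip() == expected
```

A bare “y” never passes either branch, so the operator has to read either the target name or the size of the plan before anything happens.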
A tool that can’t tell if a human is watching must assume the worst
Here’s the rule that matters most once a tool gets used by automation: when there’s no human at the keyboard — a script, a pipeline, a scheduled job — the tool must refuse to do the dangerous thing unless it’s been given an explicit, unambiguous “yes, I mean it” flag. The interactive confirmation can’t protect a non-interactive caller, so the absence of a human has to fail closed, not silently proceed. Otherwise the safety gate you built for operators evaporates the moment the tool runs unattended, which is exactly when a mistake has no one to catch it.
And the discipline that goes with it, aimed at me as much as anyone: don’t reach for the bypass flag to skip the prompt because it’s convenient. The confirmation gate is a feature, not an obstacle. The override exists for the narrow case where a human has genuinely, specifically approved this exact action — using it to shut up a prompt you find annoying is how you delete the safety you spent effort building. If you’re typing the override out of impatience, stop; the impatience is the thing the gate is there to catch.
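A minimal sketch of the fail-closed rule, with some assumptions spelled out: “a human is watching” is approximated here by checking that both stdin and stdout are terminals, `approved_flag` stands for whatever explicit override the tool exposes (a hypothetical `--yes-i-mean-it`, say), and `RefusedError` and `authorize` are names made up for the example.

```python
import sys
from typing import Callable, Optional


class RefusedError(RuntimeError):
    """Raised when a dangerous action has not been authorized."""


def authorize(
    action: str,
    approved_flag: bool,
    is_interactive: Optional[bool] = None,
    ask: Callable[[str], str] = input,
) -> None:
    """Return normally if the action may proceed; raise RefusedError otherwise."""
    if is_interactive is None:
        # Assumption: a TTY on both ends means a human is at the keyboard.
        is_interactive = sys.stdin.isatty() and sys.stdout.isatty()

    if approved_flag:
        return  # explicit "yes, I mean it" from the caller

    if not is_interactive:
        # Nobody can answer a prompt, so fail closed instead of proceeding.
        raise RefusedError(
            f"refusing to {action}: no terminal attached and no explicit approval given"
        )

    if ask(f"Type '{action}' to confirm: ").strip() != action:
        raise RefusedError("confirmation did not match")
```

In a pipeline the call raises unless the override was passed deliberately, so the tool fails the job rather than quietly doing the dangerous thing.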
Safety in the shape, not the operator
Step back and the throughline is that none of this depends on the operator being careful. The read-only default, the quarantined dangerous verbs, the proportional-to-blast-radius confirmation, the fail-closed behavior when unattended — each one moves a piece of safety out of “remember to be careful” and into “the tool won’t let you do this casually.” That’s the goal: a tool where the careless path is the safe path, and hurting yourself takes deliberate effort.
It’s the same instinct as building automation with a panic button and treating security as architecture rather than decoration — safety you design in beats safety you remind people about. If you’re building tooling that can touch production and want to pressure-test the guardrails, I’m easy to reach.