Systems Thinking

Findings Without Owners Are Just Complaints

A list of everything wrong with a system is noise — a useful assessment ranks each problem by real risk, pairs it with a one-line fix and a named owner, and sequences the work so the safest changes go first.

June 7, 2026

Engineering Practice
Operations
Reliability

I do a fair amount of “go look at this system and tell us what’s wrong with it” work, and I’ve learned that the deliverable people think they want — a list of everything that’s wrong — is almost useless. A flat list of twenty problems doesn’t get anything fixed. It overwhelms, it doesn’t say where to start, and it quietly implies every item is equally urgent, which is never true. The value isn’t in finding the problems. It’s in turning them into a plan someone can actually execute.

A flat list is a pile, not a plan

Twenty findings with no structure is a pile. The reader has to do all the hard work themselves: which of these will actually hurt us, which is cosmetic, what do we do first, who even owns this one? If the assessment doesn’t answer those questions, it hands the thinking back to the person who asked for help, and the thinking was the part they needed help with in the first place.

Finding the problem is the easy half. The deliverable is the order of operations, not the inventory.

So every finding in something I hand over carries three things the bare problem doesn’t: how much it actually matters, what specifically to do about it, and who’s going to do it.

Severity has to mean something specific

“High / medium / low” is useless if it’s a vibe. I tie the tiers to concrete, defensible meanings so two people would sort the same finding the same way:

Real, measurable risk under load — this will bite you in production, on a bad day, in a way you can measure. These are the ones that justify interrupting other work.
Hygiene / readiness — not actively on fire, but it’s a gap you want closed before you lean on the system harder. Cleanup, missing safety nets, defaults nobody chose.
Quality of life — real, worth doing, won’t hurt you if it waits. The honest bottom tier that keeps the top tier credible.

The point of a defined severity scale is that it forces you to decide why something is urgent, not just assert that it is. A finding you can’t justify placing in a tier is a finding you don’t understand well enough to report yet.
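One way to make the tiers concrete is to write them down as a type and a rule. This is a minimal sketch, not a standard scale: the names `Severity` and `classify`, and the two yes/no questions it takes, are illustrative assumptions standing in for whatever criteria a team agrees on.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Three tiers; a lower number means more urgent, so tiers sort naturally."""
    MEASURABLE_RISK = 1   # will hurt in production, measurably, under load
    HYGIENE = 2           # a gap to close before leaning on the system harder
    QUALITY_OF_LIFE = 3   # worth doing, safe to defer


def classify(measurable_under_load: bool, gap_before_scaling: bool) -> Severity:
    """Place a finding in a tier by answering two concrete questions.

    Both parameters are hypothetical: they stand in for the written
    criteria that let two people sort the same finding the same way.
    """
    if measurable_under_load:
        return Severity.MEASURABLE_RISK
    if gap_before_scaling:
        return Severity.HYGIENE
    return Severity.QUALITY_OF_LIFE
```

The useful property is that a tier is the output of answers, not an input. If neither question can be answered for a finding, that is the signal it isn’t understood well enough to report yet.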

Each finding needs a fix and an owner, or it’s a complaint

This is the line that separates an assessment from a gripe. “The GC settings are suboptimal” is a complaint. “Swap the garbage collector to the modern default and drop the manual tuning, rolling one node at a time with a snapshot ready — owned by the DB team” is a finding. Same observation; only one of them is actionable.

The fix is one concrete line. Not “investigate” — the actual change, specific enough that the owner knows what “done” looks like.
The owner is a real party. And critically, the owner might not be you. A lot of what an honest assessment surfaces belongs to a different team — the application owners, the platform team, whoever controls the thing. Saying so plainly (“this one isn’t ours to fix, here’s who needs to”) is part of the value, not a dodge. An assessment that pretends everything is the reader’s to fix is as useless as one that blames everything on someone else.

If a finding has no concrete fix and no owner, it’s not ready to ship. It’s a feeling.
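The “ready to ship” test can be sketched as a small check. Everything here is an illustrative assumption rather than a fixed method: the `Finding` record, the `problems` function, and especially the `VAGUE_VERBS` list, which is a crude stand-in for judging whether a fix is a concrete change.

```python
from dataclasses import dataclass

# Hypothetical list of openers that signal a non-fix; adjust to taste.
VAGUE_VERBS = ("investigate", "look into", "review", "consider")


@dataclass(frozen=True)
class Finding:
    observation: str
    fix: str = ""     # one concrete line describing the actual change
    owner: str = ""   # a real team or person, not necessarily the reader


def problems(finding: Finding) -> list[str]:
    """Return the reasons a finding is not ready to ship (empty if ready)."""
    issues = []
    fix = finding.fix.strip()
    if not fix:
        issues.append("no fix")
    elif fix.lower().startswith(VAGUE_VERBS):
        issues.append("fix is not a concrete change")
    if not finding.owner.strip():
        issues.append("no owner")
    return issues
```

Run against the example above, `Finding("The GC settings are suboptimal")` comes back with both “no fix” and “no owner”, while the version that names the collector swap and the DB team comes back clean.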

Sequence by risk and reversibility, safest first

Once everything is ranked and owned, the last piece is order — and the order isn’t just “highest severity first.” I sequence by a blend of impact and reversibility, front-loading the changes that are both valuable and safe to make:

Start with the low-risk, no-restart, easily-reverted changes. They build confidence and often deliver real improvement immediately, before you touch anything scary.
Then the changes that need a restart or a staged rollout, one step at a time, with a known rollback point before each.
Leave for last the structural changes that need cross-team buy-in. Flag them early so those conversations can start moving in parallel, but don’t block the safe wins on them.

And keep a rollback target throughout — a snapshot or a known-good state you can return to — so every step has an undo. The sequence is itself a risk-management tool: you’re deliberately spending the cheap, safe improvements first to earn trust for the expensive, risky ones.
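The ordering above can be sketched as a sort key. This is one possible reading, under stated assumptions: `Change`, `Phase`, `phase_of`, and `sequence` are hypothetical names, severity is an integer where 1 is the top tier, and the rule that every change must name a rollback target is my strict interpretation of “every step has an undo.”

```python
from dataclasses import dataclass
from enum import IntEnum


class Phase(IntEnum):
    SAFE_NOW = 1          # no restart, easily reverted
    CAREFUL_ROLLOUT = 2   # needs a restart or a staged rollout
    NEEDS_BUY_IN = 3      # cross-team or structural


@dataclass(frozen=True)
class Change:
    name: str
    severity: int          # 1 = highest risk tier, 3 = lowest
    needs_restart: bool
    easily_reverted: bool
    cross_team: bool
    rollback: str = ""     # snapshot or known-good state to return to


def phase_of(change: Change) -> Phase:
    """Bucket a change by how safe it is to make, not by how much it matters."""
    if change.cross_team:
        return Phase.NEEDS_BUY_IN
    if change.needs_restart or not change.easily_reverted:
        return Phase.CAREFUL_ROLLOUT
    return Phase.SAFE_NOW


def sequence(changes: list[Change]) -> list[Change]:
    """Order safest-first; within a phase, highest severity first."""
    missing = [c.name for c in changes if not c.rollback.strip()]
    if missing:
        raise ValueError(f"no rollback target for: {', '.join(missing)}")
    return sorted(changes, key=lambda c: (phase_of(c), c.severity))
```

Sorting on the pair `(phase, severity)` is what keeps a severe but risky change from jumping ahead of the safe wins: severity only breaks ties inside a phase.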

The assessment is a starting line, not a finish line

The last honest note: a findings document is the beginning of the work, not the end of it. I keep an explicit “not done yet” section: what hasn’t been applied, what’s waiting on another team, and what needs monitoring stood up first so improvements can actually be measured. That last one matters more than it sounds. Applying fixes before you can measure anything means you’re changing a system you can’t see, and “it feels better” is not a result. Stand up the observability first, then apply the fixes and measure each one against it.
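To make “measure first” concrete, here is a minimal sketch of a before-and-after comparison. It assumes the metric is a list of latency-style samples where lower is better, and it uses the 95th percentile as the yardstick; both are illustrative choices, as are the names `p95` and `relative_change`.

```python
from statistics import quantiles


def p95(samples: list[float]) -> float:
    """95th percentile: quantiles(n=20) returns 19 cut points, the last is p95."""
    return quantiles(samples, n=20)[-1]


def relative_change(baseline: list[float], after: list[float]) -> float:
    """Relative change in p95; a negative value means the metric went down."""
    if len(baseline) < 2 or len(after) < 2:
        # No baseline means no result: refuse to report an improvement.
        raise ValueError("need measurements on both sides of the change")
    before = p95(baseline)
    if before == 0:
        raise ValueError("baseline p95 is zero; relative change is undefined")
    return (p95(after) - before) / before
```

The refusal to return a number without a baseline is the point of the sketch: without samples taken before the change, there is nothing to compare the fix against.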

Done this way, an assessment stops being a list of grievances and becomes a plan: ranked by real risk, each item with a concrete fix and a named owner, sequenced safest-first, with a rollback at every step and a clear line between what you own and what you don’t. That’s the same turn-analysis-into-action discipline behind a good technical audit, and it’s why security is architecture, not decoration — the recommendation only matters if someone can act on it. If you’ve got a system that needs that kind of look, I’m easy to reach.