Engineering Practice

Make the Error Visible Where People Are Looking

A CI job failed with a useless generic message while the real diagnostics sat in a log file the harness never read. The fix wasn't better error messages — it was teaching the tool to surface the cause where people actually look. A reporter is only as good as the log it reads.

March 10, 2026

Engineering Practice
CI/CD
Observability
Tooling

A validation job in CI kept failing with a generic “initialization failed” and nothing else — no root cause, no hint, just a dead end at the bottom of the job log. The actual diagnostics existed the whole time; they were written to a different log file that the test harness simply never looked at. The fix wasn’t writing better errors — the good errors already existed. The fix was teaching the harness to read the right log and surface it where the developer was already looking: the job output. That’s a deceptively deep lesson about tooling: a tool that hides the real error is worse than one that crashes loudly, and an error reporter is only ever as good as the log it chooses to read.

The diagnosis existed; it just wasn’t where anyone looked

The frustrating shape of this bug: nothing was missing. The underlying system produced perfectly good error messages explaining exactly what was wrong. They went into its run-directory log. But the harness that summarized failures for the CI job only knew to check a different log path — one that, for this case, contained just a startup banner and no errors. So the harness found nothing useful to show, shrugged, and printed a generic failure. The information and the person who needed it were one directory apart, and that gap cost real debugging time on every failure.

A great error message in a log nobody reads is the same as no error message at all.

Surface the cause where attention already is

Developers look at the job log. That’s the surface. If the root cause lives anywhere else — a file on a build agent that evaporates when the job ends, a separate service’s logs, a path you have to know to go fetch — then for practical purposes it doesn’t exist, because under deadline pressure nobody goes spelunking for it. The whole job of a failure reporter is to move the diagnosis to the surface where people are already looking. The fix here was exactly that: teach the harness to pull from the log that actually holds the errors and print it in the job output. Same information, relocated to where attention is — and suddenly every future failure self-explains.
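As a rough illustration of that relocation, here is a minimal Python sketch. It assumes the errors are plain text lines carrying a marker such as ERROR or FATAL; the marker strings and the names `error_lines` and `report_failure` are hypothetical and not taken from the actual harness.

```python
import sys
from pathlib import Path

# Assumed markers; adjust to whatever the underlying system actually writes.
ERROR_MARKERS = ("ERROR", "FATAL")


def error_lines(log_path, markers=ERROR_MARKERS, limit=20):
    """Return up to `limit` of the most recent lines containing an error marker."""
    path = Path(log_path)
    if not path.is_file():
        return []
    lines = path.read_text(errors="replace").splitlines()
    hits = [line for line in lines if any(m in line for m in markers)]
    return hits[-limit:]


def report_failure(log_path, out=sys.stdout):
    """Print the cause into the job output, or say plainly that none was found."""
    hits = error_lines(log_path)
    if hits:
        print(f"Failure cause (from {log_path}):", file=out)
        for line in hits:
            print(f"  {line}", file=out)
    else:
        # Name the file that was read so a wrong path is easy to spot.
        print(f"No error lines found in {log_path}; is this the right log?", file=out)
    return hits
```

Printing the path that was read, even when nothing is found, is a deliberate choice in this sketch: it makes a reporter that is pointed at the wrong file visible in the job log.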

A reporter is only as good as the log it reads

The bug was instructive because the summary logic was fine: it knew how to spot and extract error markers. It was pointed at the wrong source. That’s the part worth generalizing: an error-surfacing tool has two jobs, and we obsess over one and forget the other. We polish how it formats and highlights errors, and neglect which inputs it reads. A beautiful summarizer reading a log that holds only a startup banner produces a confident, useless “no errors found.” When you build (or trust) a reporter, the first question isn’t “does it format errors well?” but “is it actually looking at every place errors can appear?” Coverage of sources matters more than sophistication of formatting.
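One way to make source coverage explicit is to have the reporter scan a list of candidate logs and record which ones it could not read. The Python sketch below is illustrative only: the function names, the marker strings, and the "inconclusive" verdict are assumptions, not the real harness's behavior.

```python
from pathlib import Path


def scan_sources(paths, markers=("ERROR", "FATAL")):
    """Scan every candidate log.

    Returns (errors, unread): `errors` maps each path to its matching lines,
    and `unread` lists sources that were missing or empty.
    """
    errors, unread = {}, []
    for path in map(Path, paths):
        if not path.is_file() or path.stat().st_size == 0:
            unread.append(str(path))
            continue
        lines = path.read_text(errors="replace").splitlines()
        hits = [line for line in lines if any(m in line for m in markers)]
        if hits:
            errors[str(path)] = hits
    return errors, unread


def summarize(paths):
    """Claim "no errors found" only when every source was actually read."""
    errors, unread = scan_sources(paths)
    if errors:
        return "errors found"
    if unread:
        return "inconclusive"
    return "no errors found"
```

A missing-or-empty check alone would not catch a log that contains only a startup banner; in this sketch, that case is covered by listing every location where errors can be written, including the run-directory log.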

A skipped check is worse than a failing one

There was a second trap hiding underneath. Digging into the history showed that an earlier run that looked green had actually skipped the validation entirely, so the check had effectively never passed; it had just never run. That’s more dangerous than a red build, because a red build tells the truth and a silently skipped check manufactures false confidence. “Green” can mean “passed” or it can mean “didn’t actually run,” and those are opposite states wearing the same color. Any check worth having needs to make “I didn’t run” loudly distinct from “I ran and passed.” An unconditionally green check is decoration, and decoration is worse than nothing because it stops people from looking.
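A small Python sketch of one way to keep those states apart: give "skipped" its own status and let the job go green only when every required check actually ran and passed. The `Status` enum, `run_check`, and `exit_code` are hypothetical names for illustration, not part of any particular CI system.

```python
from enum import Enum


class Status(Enum):
    PASSED = "passed"
    FAILED = "failed"
    SKIPPED = "skipped"


def run_check(check, should_run=True):
    """Run a zero-argument check that returns True or False; record skips explicitly."""
    if not should_run:
        return Status.SKIPPED
    return Status.PASSED if check() else Status.FAILED


def exit_code(results, required):
    """Return 0 only if every required check ran and passed.

    `results` maps check names to Status values. A required check that was
    skipped, or never recorded at all, counts against the job.
    """
    for name in required:
        if results.get(name) is not Status.PASSED:
            return 1
    return 0
```

Under this scheme, a run whose required validation was skipped exits non-zero, so it can no longer share a color with a run that passed.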

Treat error-surfacing as a feature, not a freebie

The throughline: getting the diagnosis in front of the right person is a designed capability, not something you get automatically because the underlying error exists. Make tools read every log where failures can hide, print the cause where people already look, and make a skipped check impossible to mistake for a passing one. It’s the tooling-side companion to two other habits: logging why something failed, not just that it did, and remembering that a clean exit isn’t proof of success. The cause is usually already written down somewhere; the work is making sure it lands in front of the person who can act on it. If you’ve fixed a tool that was hiding its own best diagnostics, I’d like to hear about it.