Pakkit.net

A Shared Name Is Not a Shared File

Before you consolidate a helper that's been copied into a dozen projects, prove the copies are actually identical — a shared filename is not evidence of shared content, and the copies have usually already drifted.

  • Systems Thinking
  • Engineering Practice
  • Refactoring
  • Source Control

A set of related projects had the same helper files copied into each of them. Same names, sitting at the same spot in every tree, doing the same job. The obvious cleanup writes itself: pull them into one shared module, delete the copies, done. So I started by gathering all the duplicates by filename to see what I was working with — and that’s where the easy story fell apart. A shared filename turned out to be almost no evidence at all that the files were the same. The copies had been drifting for a long time, quietly, and the name was the last thing still holding them together.

Duplication’s real cost is silent drift

Copy a file into ten projects and you haven’t made ten copies of one thing — you’ve made ten things that happen to start identical. The moment someone fixes a bug in one and hand-copies it to nine others (or forgets to), they diverge. Do that across months and a dozen contributors, and “the shared helper” becomes a dozen subtly different helpers wearing a shared name.

The expensive part of duplication isn’t the wasted bytes. It’s that the divergence is invisible. Nothing tells you the copies stopped matching; you find out when a fix that worked over here mysteriously doesn’t over there.

A shared name is not a shared file

So before moving anything, I did the boring, decisive thing: a content diff of every duplicated name. The result was a spectrum:

  • A couple of files were byte-identical across every copy. Genuinely shared.
  • One had five copies in three different versions: most identical, one off by a comment, one lightly refactored, but all with the same stable public interface. Safely consolidatable.
  • And the cautionary one: five files sharing a single name that were completely different from each other. A per-project grab-bag that had collected unrelated odds and ends under one convenient filename.

A shared filename is not a shared file. Only a content diff can tell you that. Treat a matching name as a question, never an answer.

If I’d trusted the names and merged on basename alone, I’d have picked one of those five grab-bags as “the” version and silently broken whatever the other four were doing.
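
The kind of content diff described above can be sketched in a few lines. This is a minimal illustration rather than the tooling actually used: it assumes Python, assumes each project is a directory on disk, and compares SHA-256 digests of the raw bytes. The function names (`versions_by_name`, `report`) are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def versions_by_name(project_roots):
    """Map each basename to {digest: [paths]} across all project roots."""
    groups = defaultdict(lambda: defaultdict(list))
    for root in project_roots:
        for path in sorted(Path(root).rglob("*")):
            if path.is_file():
                groups[path.name][digest(path)].append(path)
    return groups


def report(groups):
    """Yield (name, copy_count, version_count) for names seen more than once."""
    for name, by_digest in sorted(groups.items()):
        copies = sum(len(paths) for paths in by_digest.values())
        if copies > 1:
            yield name, copies, len(by_digest)
```

A name with several copies and a version count of 1 is byte-identical everywhere. Any higher count is a prompt to open the files and read them, not a verdict on its own.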

Consolidate only what has one true definition

That diff produced a clean rule. A file belongs in the shared module only if it can have a single canonical definition — byte-identical copies, or trivially different ones whose public interface is stable enough that one version serves everyone. Those are real shared code.

The grab-bag files that merely collide on a name are not shared code, and trying to force them into one definition would break callers that depend on their particular local version. Those stay where they are (or get renamed to stop pretending). Sharing is earned by sameness, not granted by a matching filename.
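
As a rough illustration of that rule, here is a sketch that sorts a set of same-named copies into three buckets. It rests on a loud assumption: that the helpers are Python source, which is purely for the sake of example, so that the standard `ast` module can stand in for "public interface". The names `public_interface` and `classify` are hypothetical, and a real check would need whatever notion of interface the actual language has.

```python
import ast


def public_interface(source: str):
    """Return sorted (name, parameter names) pairs for public top-level functions.

    Only regular positional parameters are compared; defaults, keyword-only
    parameters, classes, and constants are ignored in this sketch.
    """
    tree = ast.parse(source)
    signatures = []
    for node in tree.body:
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and not node.name.startswith("_"):
            params = tuple(a.arg for a in node.args.args)
            signatures.append((node.name, params))
    return sorted(signatures)


def classify(sources):
    """Classify the contents (as strings) of files that share one name."""
    if len(set(sources)) == 1:
        return "identical"        # one definition already
    interfaces = {tuple(public_interface(s)) for s in sources}
    if len(interfaces) == 1:
        return "same-interface"   # candidate for consolidation, needs review
    return "divergent"            # a name collision, not shared code
```

A "same-interface" result only says the copies expose the same function names and parameters. Whether one version really serves every caller is still a human judgment.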

Solve it at the layer the constraint actually lives on

The thing that kept the duplication in place was a real constraint — the runtime loads everything flat from one directory and resolves references by bare filename, with no notion of a shared include path. It’s tempting to conclude “so the files have to be duplicated.” But that’s the wrong layer. The runtime is perfectly happy to execute one shared copy that’s been placed into that flat directory; it doesn’t care how the copy got there.

Which means the duplication was never a runtime requirement. It was a source-control and packaging artifact. So the fix belongs in packaging — one canonical module, plus a build step that flattens its files into each project’s layout at assembly time — not in fighting the engine or patching how it loads. When something feels forced, check whether you’re solving it at the layer that actually imposes the constraint, or just the layer you happened to be standing on.
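
A flattening step of that kind can be very small. The sketch below assumes Python, a canonical module directory, a project directory, and a separate output directory that sits outside both. The `assemble` function and its parameters are hypothetical names, not the actual build.

```python
import shutil
from pathlib import Path


def assemble(shared_dir, project_dir, out_dir):
    """Copy shared files, then project files, into one flat output directory.

    Every file lands at out_dir/<basename>, mirroring a runtime that resolves
    references by bare filename. Any duplicate basename aborts the assembly.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    placed = {}  # basename -> source path that claimed it
    for source_dir in (Path(shared_dir), Path(project_dir)):
        for path in sorted(source_dir.rglob("*")):
            if not path.is_file():
                continue
            if path.name in placed:
                raise FileExistsError(
                    f"{path} and {placed[path.name]} both flatten to {path.name}"
                )
            shutil.copy2(path, out / path.name)
            placed[path.name] = path
    return sorted(placed)
```

The runtime sees exactly what it saw before, a flat directory of files. The only thing that changed is that the shared ones now come from a single place.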

Guard the flat namespace on the way in

Collapsing many copies into one shared definition trades a drift problem for a collision problem, so the build has to defend the new invariant. With everything resolving by bare name in one directory, two rules get enforced mechanically:

  • Two shared files can’t claim the same name — one global namespace, no silent winner.
  • A local file that shadows a shared one is the dangerous case. If a project still carries its own copy that differs from the canonical version, the assembly fails loudly rather than quietly preferring one. A byte-identical leftover only warns.

The point is that the safety can’t be a doc nobody reads; it has to be a check that fails the build, because the whole problem started with drift that nothing was watching for.
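
Those two rules are mechanical enough to sketch. The version below is an assumption-laden illustration in Python, not the real check: `check_namespace` is a hypothetical name, it takes explicit lists of shared and local file paths, and it compares contents by SHA-256 digest.

```python
import hashlib
from pathlib import Path


def _digest(path):
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def check_namespace(shared_files, local_files):
    """Return (errors, warnings) for a flat namespace resolved by bare name."""
    errors, warnings = [], []
    shared_by_name = {}
    # Rule 1: two shared files may not claim the same bare name.
    for path in map(Path, shared_files):
        if path.name in shared_by_name:
            errors.append(f"shared name clash: {shared_by_name[path.name]} vs {path}")
        else:
            shared_by_name[path.name] = path
    # Rule 2: a local file that shadows a shared one fails if it differs,
    # and only warns if it is a byte-identical leftover.
    for path in map(Path, local_files):
        canonical = shared_by_name.get(path.name)
        if canonical is None:
            continue  # purely local file, nothing to shadow
        if _digest(path) == _digest(canonical):
            warnings.append(f"redundant local copy: {path}")
        else:
            errors.append(f"local file shadows shared one with different content: {path}")
    return errors, warnings
```

A build script would print the warnings and exit non-zero whenever `errors` is non-empty, which is what turns the rule into something that cannot be skipped.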

Single source of truth is a discipline, not a folder

Moving the files into one repo is the easy 10%. The other 90% is treating that repo like what it now is: a shared library. One owner. Stable public interfaces. Version bumps that each consumer adopts deliberately, after its own checks pass — not a change that silently rewrites everyone’s behavior the instant it merges. A shared module without that discipline is just centralized drift, which can be worse than the scattered kind because it fails everyone at once.

The habit that survives all of this is small and unglamorous: when two things share a name, diff them before you believe they’re the same. Names are a hope; the bytes are the truth. If you’ve untangled a pile of “identical” copies that turned out not to be, I’d like to hear how it went.