Automatic Updates That Can't Reboot Aren't Done
Default unattended upgrades on a Linux server quietly do less than people assume — they skip non-security updates, never reboot, and can deadlock for months on a prompt no one is there to answer.
- Automation
- Linux
- Operations
- Reliability
“It does automatic updates” is one of those reassuring sentences that hides a lot. On a stock Ubuntu or Debian server, the automatic updater is real and it’s enabled out of the box — but what it actually does by default is narrower than the comfort it provides, and its most dangerous failure mode is one where it looks like it’s working while it silently hasn’t run in months. If you lean on unattended upgrades, it’s worth knowing exactly where the line is.
“Automatic updates” means “automatic security updates”
The first surprise is scope. The default configuration pulls only from the release pocket and the security pocket: the packages the release shipped with, plus security fixes. Ordinary bug-fix updates from the -updates pocket are not installed automatically unless you add that origin to the allow-list yourself. So a box you believed was fully current is really “current on security patches, frozen on everything else.”
That’s a defensible default — security fixes are the ones you most want hands-off — but it’s a default you should choose, not inherit by accident. If you actually want the box to track all updates, you have to say so. And there’s a second wrinkle: updates roll out in phases to a percentage of machines at a time, so “it didn’t install yet” can be correct rather than broken.
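To see which side of that line a given box is on, here is a minimal Python sketch that reads the allow-list and reports whether the -updates origin is active. It assumes Ubuntu's layout, where the list is an `Unattended-Upgrade::Allowed-Origins` block (typically in /etc/apt/apt.conf.d/50unattended-upgrades) and disabled entries are commented out with `//`. Debian's stock file uses an `Origins-Pattern` block instead, so treat the block name as an assumption to adjust. The function names and the sample text are mine, for illustration only.

```python
import re

# Sample config text in the shape Ubuntu ships (assumed, for illustration).
SAMPLE = '''
Unattended-Upgrade::Allowed-Origins {
        "${distro_id}:${distro_codename}";
        "${distro_id}:${distro_codename}-security";
//      "${distro_id}:${distro_codename}-updates";
};
'''


def allowed_origins(conf_text):
    """Return the active (uncommented) entries of the Allowed-Origins block."""
    origins = []
    in_block = False
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not in_block:
            # Look for the line that opens the block.
            if line.startswith("Unattended-Upgrade::Allowed-Origins") and line.endswith("{"):
                in_block = True
            continue
        if line.startswith("};"):
            break  # end of the block
        if not line or line.startswith("//"):
            continue  # blank line or commented-out origin
        match = re.match(r'"([^"]*)"', line)
        if match:
            origins.append(match.group(1))
    return origins


def tracks_updates_pocket(origins):
    """True if any active origin ends in -updates."""
    return any(origin.endswith("-updates") for origin in origins)
```

Called on the sample, `allowed_origins(SAMPLE)` returns the release and security origins only, and `tracks_updates_pocket` on that result is false. On a real machine you would pass in the contents of the config file rather than the sample string.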
It installs the kernel and then keeps running the old one
Here’s the gap that turns into a real security problem. By default the updater
never reboots on its own. A kernel update installs cleanly, the package is on
disk, everything reports success — and the machine keeps running the old kernel
until a human reboots it. The only outward sign is a reboot-required flag file
sitting there, which nothing acts on unless you’re looking.
A patched-but-not-rebooted kernel is the worst of both worlds: you did the work and you don’t have the protection.
So an honest patching story has two halves: install the update, and then actually
cut over to it. If your automation does the first and not the second, your dashboard
says “patched” while the running system says otherwise. Either enable automatic
reboots in a maintenance window or wire the reboot-required flag into something
that pages a human. Picking “no automatic reboot” is fine; not noticing the flag
is not.
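If you go the "page a human" route, the check itself can be tiny. The sketch below is one way to do it in Python, assuming the flag lives at /var/run/reboot-required (the conventional location on Debian and Ubuntu; verify it on your system). The seven-day threshold and the function names are my own placeholders, not anything the updater defines.

```python
import time
from pathlib import Path
from typing import Optional

# Conventional flag location on Debian/Ubuntu; treat it as an assumption.
REBOOT_FLAG = Path("/var/run/reboot-required")


def reboot_pending(flag: Path = REBOOT_FLAG) -> bool:
    """True if the reboot-required flag file exists."""
    return flag.exists()


def flag_age_seconds(flag: Path = REBOOT_FLAG, now: Optional[float] = None) -> Optional[float]:
    """Seconds since the flag was last written, or None if there is no flag."""
    try:
        mtime = flag.stat().st_mtime
    except FileNotFoundError:
        return None
    return (time.time() if now is None else now) - mtime


def should_page(age_seconds: Optional[float], max_age_seconds: float = 7 * 24 * 3600) -> bool:
    """Page once a reboot has been pending longer than the allowed window."""
    return age_seconds is not None and age_seconds > max_age_seconds
```

Run from whatever monitoring you already have, `should_page(flag_age_seconds())` turns "a file nobody is looking at" into an alert with a deadline. How the page gets delivered is deliberately left out.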
The deadlock: a prompt with no one to answer it
This is the one that cost the most time, and it’s the most generalizable lesson in the whole topic. On a headless server, the updater installs packages and then runs a post-step that decides which services need restarting after the upgrade. By default, that step is interactive — it wants to ask “which services should I restart?”
On a server, there’s no one at the keyboard to answer. So the prompt blocks. Forever. And because that step runs inside the upgrade, it’s still holding the package-manager lock the entire time. Every subsequent daily run starts, finds the lock held, and exits with a terse “lock already taken” — which reads like a transient collision, not a process that has been frozen for months.
I’ve seen a long-running box in exactly this state: one upgrade started, hit the interactive restart prompt, and sat there wedged for the better part of half a year. The package install had finished; the post-step never returned; nothing automatic had succeeded since. The deceptive part is how healthy it looks from the outside — the timer is “active,” the daily job keeps firing, and the only complaint in the logs is about the lock. You have to walk the process tree to find the actual culprit several layers down, stuck on a question.
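Walking the process tree by hand gets tedious, so here is a rough Python sketch of the same walk. It reads /proc/<pid>/stat on Linux, builds a parent-to-children map, and prints everything underneath a chosen process so the stuck leaf is visible. The function names are hypothetical, and matching on the substring "unattended" in the usage note is an assumption about how the updater shows up in your process list.

```python
import os


def read_proc_table(proc_root="/proc"):
    """Map pid -> (ppid, command name) using /proc/<pid>/stat (Linux only)."""
    table = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                stat = fh.read()
        except OSError:
            continue  # the process exited while we were scanning
        # The command name sits in parentheses and may itself contain spaces
        # or parentheses, so slice from the first "(" to the last ")".
        name = stat[stat.index("(") + 1 : stat.rindex(")")]
        fields = stat[stat.rindex(")") + 2 :].split()
        table[int(entry)] = (int(fields[1]), name)  # fields[0] is state, fields[1] is ppid
    return table


def find_by_name(table, needle):
    """Return the pids whose command name contains the given substring."""
    return sorted(pid for pid, (_ppid, name) in table.items() if needle in name)


def subtree(table, root_pid):
    """Return [(depth, pid, name)] for root_pid and its descendants, depth-first."""
    children = {}
    for pid, (ppid, _name) in table.items():
        children.setdefault(ppid, []).append(pid)
    out = []
    stack = [(0, root_pid)]
    while stack:
        depth, pid = stack.pop()
        if pid not in table:
            continue
        out.append((depth, pid, table[pid][1]))
        # Push in reverse so lower pids are visited first.
        for child in sorted(children.get(pid, []), reverse=True):
            stack.append((depth + 1, child))
    return out
```

A typical use would be to call `read_proc_table()`, pick a root with `find_by_name(table, "unattended")`, and print each `(depth, pid, name)` from `subtree` with indentation. The deepest entry is usually the one waiting on input.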
The fix is to make the silent parts loud
Recovering one box is easy once you’ve named it: kill the wedged process tree to release the lock, reboot to cut over to the patched kernel and clear the pending restarts, and move on. But the recurrence prevention is the real fix, and it’s two moves:
- Make the restart step non-interactive. Set the post-upgrade service restart to automatic (or to list-only) so it can never block on a prompt no one will answer. An interactive default on a headless box is a latent deadlock, full stop.
- Alarm on the symptoms that hide. Alert on the upgrade job sitting in “running” for too long, and on the age of the last successful update. The lock message lies to you; the timestamp of the last real success doesn’t.
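For the first move, it helps to be able to audit a fleet for boxes still on the interactive default. The Python sketch below assumes the restart step is needrestart, which the text above does not name, and that its mode is set with a `$nrconf{restart}` line (values 'i' for interactive, 'a' for automatic, 'l' for list-only) in /etc/needrestart/needrestart.conf or a file under /etc/needrestart/conf.d/. If your post-step is something else, the idea carries over but the pattern will not. The function names are mine.

```python
import re

# Matches an active (uncommented) line such as:  $nrconf{restart} = 'a';
_RESTART_RE = re.compile(
    r"""^\s*\$nrconf\{restart\}\s*=\s*['"]([ail])['"]\s*;""",
    re.MULTILINE,
)


def restart_mode(conf_text, default="i"):
    """Return the effective restart mode; the last active setting wins.

    Falls back to 'i' (interactive) when no active setting is found,
    on the assumption that interactive is the built-in default.
    """
    matches = _RESTART_RE.findall(conf_text)
    return matches[-1] if matches else default


def can_block_on_prompt(mode):
    """Only the interactive mode can wait on a human."""
    return mode == "i"
```

Feed it the concatenated text of the config files in the order they are loaded. Any host where `can_block_on_prompt(restart_mode(text))` is true is a candidate for the deadlock described above.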
That second bullet is the durable principle, and it’s bigger than apt: any unattended job should be monitored on “when did it last actually succeed,” not on “is the scheduler still ticking.” A scheduler that fires faithfully into a wall looks identical to one that’s working, right up until you check what it accomplished.
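In code, that principle is almost embarrassingly small. Here is a minimal Python sketch of the two checks: how old the last real success is, and whether the current run has been going too long. Where the timestamps come from is up to you (for example, the modification time of a stamp file your job touches only on success). The three-day and two-hour thresholds and the function names are placeholders I picked for illustration.

```python
from datetime import datetime, timedelta, timezone


def staleness(last_success, now, max_age=timedelta(days=3)):
    """Return (age, is_stale) for the last successful run."""
    age = now - last_success
    return age, age > max_age


def running_too_long(started, now, max_runtime=timedelta(hours=2)):
    """True if a run that began at `started` has exceeded its time budget."""
    return now - started > max_runtime


def alerts(last_success, run_started, now):
    """Collect human-readable alerts; run_started is None when nothing is running."""
    found = []
    age, stale = staleness(last_success, now)
    if stale:
        found.append(f"last successful update was {age.days} days ago")
    if run_started is not None and running_too_long(run_started, now):
        found.append("upgrade job has been running past its time budget")
    return found
```

Neither check looks at the scheduler or at the lock message. They only ask what actually finished, which is why a job wedged for months would trip both.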
The general lesson: interactive defaults are landmines in automation
Strip away the Linux specifics and what’s left is a rule I’d put on any automated system: a process that runs without a human cannot contain a step that waits for one. An interactive prompt in an unattended pipeline isn’t a small papercut — it’s a deadlock that holds a lock and starves everything behind it, while presenting as fine. The same instinct shows up in why automation needs a panic button: the scary failures aren’t the loud ones, they’re the ones that look like success.
Auto-updates are genuinely good. Just hold them to the same bar as any other automation you trust unattended: know its real scope, make sure it can finish the job it started, and measure it by what it actually completes, not by whether the timer is still blinking. If you’ve found your own creatively stuck background job, tell me about it; the homelab notes are full of systems that looked healthy and weren’t.