This seems intractable.
Malware scanners want to run at as low a level as possible so they can catch stuff.
Fault-recovery mechanisms want to run at as low a level as possible so there are very few things that can cause a BSOD.
It seems like the only possible solution is “just never make any mistakes”.
Like, either don’t have any vulnerabilities that a user-space scanner can’t catch, or don’t ever ship a bad update to a kernel-mode scanner.
Another solution is to accept that mistakes happen and do a phased rollout of updates. Heck, Windows Updates are known to be enough of a crapshoot that every place I’ve worked at over the past decade or so has had a plan for updating systems in batches. The fact that CrowdStrike just YOLO’d their updates out to everyone at once (on a Friday, no less) shows a mindset that didn’t accept that bad stuff can happen.
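For anyone who hasn’t seen one, here is a rough sketch of what "updating in batches" can look like. This is not how CrowdStrike or Windows Update actually does it. The stage sizes, the failure threshold, and the `apply_update` / `is_healthy` callbacks are all made-up placeholders for whatever your own tooling provides.

```python
import hashlib

# Hypothetical stage sizes: cumulative fraction of the fleet updated after each stage.
STAGES = [0.01, 0.10, 0.50, 1.00]


def bucket(host_id: str) -> float:
    """Map a host ID to a stable value in [0, 1] so its batch is the same on every run."""
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def phased_rollout(hosts, apply_update, is_healthy, max_failure_rate=0.02):
    """Update hosts in growing batches and halt if a batch looks unhealthy.

    apply_update and is_healthy are caller-supplied callbacks (assumptions here).
    Returns (updated_hosts, halted).
    """
    ordered = sorted(hosts, key=bucket)
    updated = []
    for fraction in STAGES:
        # Always include at least one host so tiny fleets still get a canary.
        target = max(1, int(len(ordered) * fraction))
        batch = ordered[len(updated):target]
        if not batch:
            continue
        for host in batch:
            apply_update(host)
        updated.extend(batch)
        # Check only the hosts that were just updated before widening the rollout.
        failures = sum(1 for host in batch if not is_healthy(host))
        if failures / len(batch) > max_failure_rate:
            return updated, True
    return updated, False
```

With a fleet of 1,000 machines and an update that breaks every host it touches, this stops after the first batch of 10 instead of taking out all 1,000. A real system would also wait between stages so that slow-burning failures have time to show up.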
An ounce of actual QA and QC work would go a long way, but Microsoft fired their entire QA department years ago, and told engineers that they’re responsible for QA’ing all of their own work. That’s a terrible policy, but it saves them money, so they like it.