Security update kills computers worldwide

Holy hell. Half the computers in the world¹ can be bricked by a single automated update from a security service? And for some reason the update can't be automatically rolled back?

MQAGA. Make QA Great Again.

¹OK, not half. But a lot.

31 thoughts on “Security update kills computers worldwide”

  1. Citizen Lehew

    Smells like a cover story to me.

    I suppose "cyberattack targeting security service knocks out half the computers in the world" might cause a panic.

    1. dausuul

      Never attribute to malice what can be adequately explained by stupidity.

      And if you think this can't be adequately explained by stupidity, you have not worked in enterprise IT.

      1. bbleh

        Lol indeed.

        "Wait, what? That's supposed to be a colon instead of a comma? That's really dumb! Who uses colons for that?"

        ... or maybe ...

        "Well, yeah, we were supposed to do another sim run after those last few bug fixes, but Jerry took off for his long weekend, and the VP was really leaning on us to get the update out the door, and we didn't think we had enough time without him, and besides there were just a few little fixes -- we had no idea something like this could happen ..."

    2. different_name

      I hope you're not serious. Stop and think about how many people would have to be in on that one, and either realize you're being silly or you should be acting as if literally millions of highly paid people are out to get you.

      Anyway, this hit my employer. Our parent company lost quite a bit of time cleaning up; we're a lot less Windows-dependent, so it was pretty quick for us.

      The real scandal is this software is utter crap. We had worse problems with the Linux version of it a couple years ago.

      But there's no escape. There is a set of interrogatories about endpoint security in SOC 2 (also in other compliance schemes) that can only be satisfied if you install one of a handful of things like this. And they all have problems, because they're effectively malware themselves - they hook all the same mechanisms and try to stop other things from doing the same. And just like malware, they break a lot.

      And CrowdStrike in particular can be kind of cowboy, as these companies go. They still have a fairly old-school infosec attitude - vaguely edgy, arrogant, crappy support. They're also fairly dominant, so it doesn't surprise me they blew up the world.

      When I get depressed about it, I remind myself it could be worse - I could be doing this in a hospital.

  2. Goosedat

    TAP had a story about 'middleman' software mucking up the economy earlier this month. "The reason things feel like they're so broke is that they are."

    The common thread here is an economy of middlemen, a group of linkers, connectors, and bridgers that offer little in value (or in these cases actively detract from it) and much in opportunity for skimming and causing prices to rise. This has in a real sense become the U.S. economy in microcosm, and in many ways it speaks to public frustration with it.

    https://prospect.org/economy/2024-07-01-hello-from-the-middleman-economy/

  3. rather_be_fishing

    I've been in QA for most of the last 3 decades. I've lost count of the times when testing was cut short due to 'budget constraints' or reduced because development ran over schedule. Everyone in the software QA space knows exactly how this could have happened.

    Oh, and guess who will get blamed? (yup, QA.)

    1. Scott_F

      As a software engineer, I treasure my QA team. But, yeah, you all are stuck at the end of the chain and are thus most likely to be lopped off. Management gives lip service to quality but communicates clearly with their actions that testing is a luxury if it leads to sacrificing speed.

  4. Pittsburgh Mike

    It is pretty amazing that this happened. Even though I've been in IT for decades, I'm surprised that:

    1 -- so many companies allow automatic updates, even of kernel modules, into live production systems

    2 -- If a kernel extension kills off a kernel several times in a row, the OS doesn't disable that extension.

    3 -- People aren't running production systems in VMs, where repairing something like this, which requires deleting a file from a system that's not booting, can be scripted and applied to multiple machines at once from a remote location. Instead, apparently some customers need to visit each server individually to repair it.

    IT folks should be testing systems in test cells, and then rolling out updates to a small percentage of production servers before doing a full rollout.

    1. realrobmac

      Where I work we were recently required to install CrowdStrike on all of our computers by our parent company. Many on the team were skeptical, though I can't say any of us predicted anything like this.

      But my point is, these kinds of decisions generally come from very high up in an organization.

      1. kahner

        "these kinds of decisions generally come from very high up in an organization"

        Yup, and these are usually the people least likely to understand the implications of the decisions.

    2. Batchman

      Re #2: I read that one piece of advice from Microsoft was to reboot your computer 15 times and it would make the blue screen go away. So maybe the OS will "disable" the extension after "several" iterations where "several" = 15?

      1. dausuul

        Servers in data centers is what Linux is for. And Linux servers play nicely with a Mac for local development.

  5. csherbak

    I'm more surprised so much stuff runs MS Windows.

    We had a number of machines hit worldwide, but I think it was maybe a couple hours to recover. Apparently the big problem is you couldn't automate the recovery (oops!) but had to boot into safe mode and remove/replace the sys file by hand. (Not a Windows guy so not sure how much can/can't be automated. But having to use safe mode to fix seems Really Bad If Only Annoying.)

    We have some Web/UI stuff (and our MS Infrastructure - Outlook, Teams, etc.) on Windows but nothing keeping jobs/trans from running.

    1. pjcamp1905

      Yeah. I've been using El Nino for years. House is dirty? I'm blaming it on El Nino. Late for work? That's El Nino.

  6. pjcamp1905

    QA? We don need no stinkin QA! That shit costs money and it doesn't move any product.

    Doesn't anyone do backups anymore? Restore from the last known good.

  7. Special Newb

    Liability is the only way to fix the issue. If you don't do a certain amount of QA, etc. you have to pay.

    Also this is what Y2K would have been like I imagine just with no ready fix.

  8. sdean7855

    Time back, a volunteer association I did work with was discussing backups. Me, in computers since the '80s, I wanted a physical backup disc I could put in my safety deposit box at the bank. They wanted Carbonite cloud backups, so easy. We parted company over this and other things.