I still maintain anyone who didn't follow any change control and auto-updates everything direct into prod is an idiot begging for exactly what happened Friday. I don't understand how, after so many supply chain attacks, everyone forgot that too.
I work for a Fortune 10 company that is constantly the target of radical ideologues, state actors, and cyber criminals. Normally you would have a test/dev/stage level of deployment for most things, and even for most anti-malware stuff there is some level of testing. CrowdStrike is different, though, and is trusted to protect against 0-day attacks; for that, you don't pass Go: you take their updates as they ship and trust they aren't going to fuck you. They have made billions being the go-to provider for a huge chunk of the world in that regard. It appears they let a broken kernel-level .sys file, part of a Falcon Sensor update, out into the wild, and it caused BSODs on every version of Windows it touched. It didn't break Azure or AWS itself, but rather all the VMs.
Easy to fix: All you had to do was manually boot into Safe/Recovery mode and delete the file. Let me say that again.
All you had to do
is
MANUALLY
Log in
to every Virtual Machine on every private host and in every public/private cloud where you had VMs/Guests.
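For the record, the per-host cleanup itself really was that small. Here is a minimal sketch of it in Python, assuming a Safe Mode or recovery session where the affected OS volume shows up as C:; the C-00000291*.sys channel-file pattern is the one from the publicly circulated remediation guidance, and the path/function names here are just illustrative:

```python
# Minimal sketch of the per-host cleanup, assuming a Safe Mode / recovery
# session where the affected OS volume is visible as C:. Adjust DRIVERS_DIR
# if the disk is mounted elsewhere (e.g. attached to a rescue VM as D:).
from pathlib import Path

DRIVERS_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
PATTERN = "C-00000291*.sys"  # faulty channel file named in the public guidance

def remove_bad_channel_files(drivers_dir: Path = DRIVERS_DIR, dry_run: bool = True) -> list[Path]:
    """List (and, if dry_run is False, delete) the offending channel files."""
    if not drivers_dir.is_dir():
        print(f"No CrowdStrike drivers directory at {drivers_dir}")
        return []
    hits = sorted(drivers_dir.glob(PATTERN))
    for f in hits:
        print(("Would delete: " if dry_run else "Deleting: ") + str(f))
        if not dry_run:
            f.unlink()
    return hits

if __name__ == "__main__":
    remove_bad_channel_files()  # dry run by default; flip dry_run once you're sure
```

The delete was never the hard part; getting an interactive session on every single box was.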
In a world in which people don't have to pay for physical hardware, the VM estate sprawl and labor shrinkage that has taken place in the last 15 years is AMAZING: fewer people are managing more machines than ever before.
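Just scoping that estate is its own chore. A rough sketch of the first step for one cloud, assuming boto3 is installed and AWS credentials/region are already configured (every other cloud and on-prem hypervisor needs its own equivalent pass, and the function name here is mine, not an AWS one):

```python
# Rough scoping sketch: count the Windows EC2 instances someone would have to
# touch by hand, per region. This is inventory only, not remediation.
import boto3

def list_windows_instances(region: str = "us-east-1") -> list[dict]:
    ec2 = boto3.client("ec2", region_name=region)
    paginator = ec2.get_paginator("describe_instances")
    found = []
    for page in paginator.paginate(
        Filters=[{"Name": "platform", "Values": ["windows"]}]
    ):
        for reservation in page["Reservations"]:
            for inst in reservation["Instances"]:
                name = next(
                    (t["Value"] for t in inst.get("Tags", []) if t["Key"] == "Name"), ""
                )
                found.append(
                    {"id": inst["InstanceId"], "name": name, "state": inst["State"]["Name"]}
                )
    return found

if __name__ == "__main__":
    fleet = list_windows_instances()
    print(f"{len(fleet)} Windows instances to check by hand, in one region of one cloud")
```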
It's an interesting moment and a wake up call for many folks.
You're not wrong; in a perfect world, everyone should follow textbook ITIL design. I used to be a Datacenter Mgr at a mid-sized company, and we had perfect ITIL test/dev/stage environments for every app or suite; Change Management was rigorously followed, and things were done by the book unless we were in a Major Incident. 0-day attacks weren't quite a thing back then, but as I moved to larger companies, the rigorous T/D/S segmentation broke down as the environments got larger, the estates sprawled out, and everything became subject to business cycles. When you have a couple thousand servers, it's one thing. When you have 35,000+ servers and 100,000 laptops (and more as the size and footprint of the enterprise grows), quantity takes on a quality all its own. IRM/CyberDefense isn't my area, but they determined that speed of response was more important than extensive QA testing/prevention, particularly given how well CrowdStrike was trusted.
Turns out our infosec engineers had auto in prod on.
They may want to rethink that architecture after Friday.
Will any of them get let go over it? My job was unaffected by it, but external dependencies that did have these issues did cause us some problems.
This is America; people don't get held accountable, they get promoted! They just justified quadrupling their budgets!
Dude. No one is accountable these days.
Another option exists. I'm not sure if it's in CrowdStrike, as it's been a hot minute since I was working through their UI and options, but...
Yea, you can set host groups to n-1 on auto and then push updates, I'm pretty sure.
In most modern EDR/NGAV tools I have worked with, there are options for "auto-update, but delay X days before applying it," which would have bought you time to hit the "DO NOT UPDATE" button if a bad patch got out and bombed others. This is a middle ground for those who don't have the resources to run their own full dev/test/prod motion and figure they'll let the rest of the world take the hit.
Giving ANY tool immediate access, even your trusted stop-the-zero-days tool, means taking on risk. Now the question becomes: if you hadn't had absolutely lightning-fast coverage for zero-days, how many system hours (or how much data) would you have lost, versus what you experienced Friday?
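None of this is any vendor's actual API, but the n-1 host group plus "delay X days" idea is simple enough to sketch. A hypothetical illustration of ring-based gating, with every name and number invented for the example:

```python
# Hypothetical illustration of staged content rollout: each host group (ring)
# trails the newest release by a version offset (n-1, n-2, ...) and a soak
# delay, so a bad release can be pulled before it reaches the broad rings.
# All names and numbers are invented for illustration; not any vendor's API.
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class Release:
    version: int
    published: datetime

@dataclass
class Ring:
    name: str
    version_lag: int   # 0 = newest (N), 1 = N-1, ...
    soak: timedelta    # minimum age of a release before this ring will take it

def release_for_ring(ring: Ring, releases: list[Release], now: datetime) -> Optional[Release]:
    """Pick the newest release this ring is allowed to run right now."""
    latest = max(r.version for r in releases)
    eligible = [
        r for r in releases
        if latest - r.version >= ring.version_lag   # honor the n-1 / n-2 lag
        and now - r.published >= ring.soak          # honor the soak delay
    ]
    return max(eligible, key=lambda r: r.version, default=None)

if __name__ == "__main__":
    now = datetime.now()
    releases = [
        Release(41, now - timedelta(days=10)),
        Release(42, now - timedelta(hours=1)),  # the fresh, broken one
    ]
    rings = [
        Ring("canary", version_lag=0, soak=timedelta(0)),
        Ring("servers", version_lag=1, soak=timedelta(days=1)),
        Ring("laptops", version_lag=1, soak=timedelta(days=3)),
    ]
    for ring in rings:
        pick = release_for_ring(ring, releases, now)
        print(ring.name, "->", pick.version if pick else "hold")
```

The point is just that somebody other than all of prod takes the first hit, and there is a window in which a "DO NOT UPDATE" decision can actually land.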
I just got a text from a sys admin I worked with 15 years ago asking if I remembered some local account admin password from a particular server, because CrowdStrike nuked it. Boy do I have questions.
I am not on the frontlines, but the CyberDefense war is real, yo. Nobody wants to be Maersk, Colonial Pipeline, or the victim of a SolarWinds-style extended attack.