We are aware that many users worldwide are encountering the "blue screen of death" (BSOD) on their computers and servers. This issue is causing significant disruptions across multiple industries. The problem has been traced to a recent update for the CrowdStrike endpoint protection software.
If you are experiencing this issue and are stuck in a Blue Screen of Death or Recovery loop, please follow these steps:
1. Boot your computer into Safe Mode.
2. Navigate to the folder "c:\windows\system32\drivers\crowdstrike".
3. Rename the "crowdstrike" folder to something else. A common convention is to add a "_" to the name, for example "_crowdstrike" or "crowdstrike_old".
4. Reboot your PC.
This procedure should resolve the Blue Screen of Death error and allow you to boot normally.
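For administrators who have to repeat the rename on many machines, steps 2 and 3 can be expressed as a small script. The following is a minimal Python sketch, not an official CrowdStrike tool. The function name `rename_driver_folder` and the `_old` suffix are illustrative choices, and the sketch assumes Python is available in the environment you boot into, which is often not the case in Safe Mode.

```python
from pathlib import Path

# Path taken from step 2 above; adjust if Windows is installed elsewhere.
DEFAULT_FOLDER = Path(r"C:\Windows\System32\drivers\crowdstrike")


def rename_driver_folder(folder: Path = DEFAULT_FOLDER, suffix: str = "_old") -> Path:
    """Rename `folder` by appending `suffix` and return the new path.

    Hypothetical helper for illustration only. Raises FileNotFoundError if
    the folder is missing and FileExistsError if the target name is taken,
    so an earlier rename is never overwritten.
    """
    if not folder.is_dir():
        raise FileNotFoundError(f"Folder not found: {folder}")
    target = folder.with_name(folder.name + suffix)
    if target.exists():
        raise FileExistsError(f"Target already exists: {target}")
    folder.rename(target)
    return target
```

Calling `rename_driver_folder()` with no arguments from an administrator session would rename the default folder to "crowdstrike_old". Passing a different `folder` makes it possible to try the logic on a scratch directory first.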
Overview
In a rare event, on July 19, 2024, CrowdStrike issued an update to its Falcon endpoint detection and response product that had unintended consequences for PCs running Windows. The faulty update disrupted several sectors worldwide, including airlines, media, and emergency services in parts of Europe, and impaired their operational capabilities.
What Happened?
The fault lay in an update to the Falcon sensor, specifically in the csagent.sys driver, which triggered a PAGE_FAULT_IN_NONPAGED_AREA stop code. This stop code is raised when the system references a non-existent or unallocated memory address. The reference crashes the system, and when the crash repeats on every startup, the machine ends up in a boot loop.
Technical Breakdown
Driver Update Bug: After the update, a bug in the csagent.sys driver affected how the driver handled memory.
Memory Management: The driver incorrectly managed paged and non-paged memory areas, causing page fault errors (BSOD) and system crashes.
System Impact: Affected systems entered a BSOD boot loop, rendering them unbootable.
Immediate Impact
Commercial Flights: Numerous commercial flights were canceled after the faulty update took airline and airport systems offline. The outages disrupted flight schedules, leading to significant delays and financial losses for airlines and passengers, and caused widespread confusion and inconvenience at airports.
Media Outlets: Major media platforms, such as Sky News, experienced downtime. The interruption affected news broadcasts and online updates, leaving the public without timely news and information. The outage may also have affected the media companies' credibility and advertising revenue.
Emergency Services: The bug had a critical impact on some 911 call centers, potentially delaying emergency response. The affected systems hindered operators' ability to manage and dispatch emergency services efficiently, posing a risk to public safety.
Resolution and Workarounds
CrowdStrike acknowledged the flaw and offered a multi-step workaround for the BSOD issue. It involves booting into Safe Mode, deleting or renaming the problematic files, and modifying registry settings. Make sure to run these steps with administrative privileges.
To boot into Safe Mode:
1. Restart your computer and enter the Advanced Repair Options menu.
2. Navigate to "Startup Settings" and select "Restart."
3. Once the system reboots, press F4 to boot into Safe Mode.
Preventive Measures
These best practices help users and organizations alike reduce the risk of such issues in the future:
Controlled Updates: Test updates in a lab environment before pushing them to production.
Back-up Systems: Regularly back up the most important systems and data.
Staged Rollouts: Deploy updates in stages to limit the impact of serious issues.
Monitoring and Alerts: Use a monitoring system to track workloads, resource usage, user sessions, IPv4 utilization, and similar metrics.
Vendor Communication: Maintain open communication with vendors to stay informed about new updates and patches.
Incident Response Plan: Build and maintain a comprehensive incident response plan to manage such scenarios.
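Rolling out updates in parts, as recommended above, can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not a description of how CrowdStrike or any other vendor deploys updates. The `staged_rollout` function, the default stage fractions, and the `deploy` and `is_healthy` callbacks are all hypothetical names chosen for the example.

```python
from typing import Callable, Sequence


def staged_rollout(
    hosts: Sequence[str],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    stage_fractions: Sequence[float] = (0.01, 0.10, 0.50, 1.0),
) -> list[str]:
    """Deploy to growing fractions of `hosts`, halting after an unhealthy stage.

    Returns the hosts that received the update. The callbacks and the
    default fractions are illustrative assumptions.
    """
    updated: list[str] = []
    for fraction in stage_fractions:
        # Cumulative number of hosts that should be updated after this stage
        # (at least one, so small fleets still get a first test host).
        target = max(1, round(len(hosts) * fraction))
        batch = hosts[len(updated):target]
        for host in batch:
            deploy(host)
            updated.append(host)
        # Gate: check the new batch before widening the rollout.
        if not all(is_healthy(host) for host in batch):
            break
    return updated
```

With 100 hosts and the default fractions, a failure detected in the second stage stops the rollout after 10 machines instead of 100. The health check is deliberately left abstract. In practice it might confirm that a machine rebooted and reported back within a time limit.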
Conclusion
The CrowdStrike incident highlights the need for regular testing, backup, and monitoring practices. These controls reduce the damage from potential future incidents and help maintain normal business operations.