
Recapping the CrowdStrike service outage: a wake-up call for risk management

On 19 July 2024, a major disruption occurred in the cybersecurity world when CrowdStrike, one of the most trusted platforms in the industry, experienced a significant outage. The incident affected millions of Windows systems globally (Microsoft estimated about 8.5 million devices) and revealed critical vulnerabilities in the digital infrastructure many businesses rely on for protection. The event forced organizations and IT leaders to confront the realities of cyber risk in an interconnected world and sparked discussions on how to better mitigate these risks in the future.

In late September 2024, Adam Meyers, a senior executive at CrowdStrike, appeared before a US congressional committee to answer questions about the faulty software update that disabled millions of devices. Meyers said the firm was "deeply sorry" for the outage that affected millions of people and is "determined to prevent it from happening again". Should we simply take that on trust, or is there something we can do on our side?

In this article, we take a closer look at the details of the CrowdStrike outage, its repercussions, and, most importantly, the lessons learned. These include renewed discussions around access to the Windows kernel, the importance of SaaS backup solutions, and the growing demand for multi-cloud strategies.


What Happened: The July 2024 CrowdStrike Outage

The outage in July was the result of a critical malfunction within CrowdStrike’s Falcon platform, a widely used cloud-native endpoint security solution that provides protection against malware, ransomware, and other advanced cyber threats. The failure caused widespread blue screen of death (BSOD) errors and crashes on Windows machines, leaving users unable to access their devices or crucial data for hours, and in some cases longer, and disrupting operations on a global scale.

The root cause was a faulty content configuration update for the Falcon sensor that propagated across the user base as part of a routine release. According to CrowdStrike’s published root cause analysis, the update (Channel File 291) caused the sensor, which runs inside the Windows kernel, to read memory out of bounds, and the resulting crash triggered the BSOD on millions of devices. CrowdStrike did not detect the flaw before release: its validation and testing procedures did not catch the problem, and the update was delivered to the whole user base at once rather than in stages, allowing the bug to slip through unnoticed. Once deployed, the faulty update caused widespread system failures, particularly affecting businesses that depend on CrowdStrike for continuous threat monitoring and real-time protection.
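One common safeguard against a faulty update reaching an entire user base at once is a staged (canary) rollout with a health gate between stages. The sketch below is a minimal illustration of that idea, not a description of CrowdStrike's release process: the `staged_rollout` function, the `apply_update` callback, the ring layout, and the 1% failure threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class RolloutResult:
    completed: bool                 # True if every ring received the update
    hosts_updated: int              # hosts touched before the rollout stopped
    halted_at_ring: Optional[int]   # index of the ring that failed the gate


def staged_rollout(
    rings: Sequence[Sequence[str]],
    apply_update: Callable[[str], bool],
    max_failure_rate: float = 0.01,
) -> RolloutResult:
    """Push an update ring by ring, halting when a ring looks unhealthy.

    rings: host groups ordered from smallest (canary) to largest.
    apply_update: hypothetical callback that updates one host and returns
        True if the host is still healthy afterwards.
    """
    updated = 0
    for index, ring in enumerate(rings):
        failures = 0
        for host in ring:
            if not apply_update(host):
                failures += 1
            updated += 1
        # Health gate: stop before the next (larger) ring is touched.
        if ring and failures / len(ring) > max_failure_rate:
            return RolloutResult(False, updated, index)
    return RolloutResult(True, updated, None)


# Two canary hosts first, then a larger ring of 100 hosts.
rings = [["canary-1", "canary-2"], [f"host-{i}" for i in range(100)]]

bad = staged_rollout(rings, apply_update=lambda host: False)
good = staged_rollout(rings, apply_update=lambda host: True)
print(bad)
print(good)
```

With an update that breaks every host, the rollout stops after the two canary hosts and the remaining 100 are never touched. With a healthy update, all 102 hosts are updated.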

The consequences were severe. The ripple effects extended beyond individual users to the broader ecosystem of businesses that rely on uninterrupted uptime for mission-critical operations. Major industries, including healthcare, finance, and manufacturing, saw critical systems go offline, and many organizations scrambled to restore functionality, which underscored the importance of robust risk management and resilient incident response plans.

CrowdStrike responded swiftly, reverting the faulty update and working around the clock to help customers recover. The company explained that the error stemmed from a defect in its own update rather than from a cyberattack. Because many affected machines could no longer boot normally, recovery often required manual remediation on each device, so service was restored progressively rather than all at once. More recently, Meyers said the company would continue to act on and share "lessons learned" from the incident to make sure it would not happen again.

Despite the rapid resolution, the incident raised important questions about the reliability of cybersecurity tools and the vulnerabilities inherent in relying on a single provider.

Microsoft’s Response: revisiting kernel access for security providers

One of the most intriguing developments in the wake of the CrowdStrike outage has been Microsoft’s consideration of a dramatic shift in its cybersecurity strategy: blocking third-party security vendors from accessing the Windows kernel. The Windows kernel, which is essentially the core of the operating system, has traditionally been accessible to trusted cybersecurity providers like CrowdStrike to enhance security. However, this access also poses risks when something goes wrong, as was demonstrated by the July incident.

Microsoft’s contemplation of limiting kernel access is not new. In fact, the company attempted a similar move in 2006, ahead of the release of Windows Vista. At that time, Microsoft sought to restrict kernel access through its "Kernel Patch Protection" feature (also known as PatchGuard), designed to prevent unauthorized modifications to the operating system core. However, this effort was met with significant resistance from both cybersecurity vendors and regulators, who argued that restricting access would weaken the overall security landscape by limiting the effectiveness of third-party tools.

Fast forward to 2024, and the idea of restricting kernel access has resurfaced. This time, however, there seems to be more willingness from the cybersecurity industry to engage in productive dialogue. Microsoft has been actively collaborating with its security partners, including CrowdStrike, to explore ways to balance system stability with the need for effective third-party security solutions. Interestingly, CrowdStrike has voiced support for these discussions, signaling a shift in the industry’s stance. The focus now appears to be on finding a solution that prevents disruptions like the July outage while maintaining the high level of security that businesses expect from their tools.

This renewed debate around kernel access reflects a broader shift in the way organizations approach cybersecurity. As more businesses rely on cloud-based solutions and third-party vendors to protect their systems, the risks associated with vendor dependencies have become more pronounced. In this context, Microsoft’s efforts to reevaluate kernel access may pave the way for more resilient cybersecurity architectures in the future.

Lessons Learned: the importance of multi-cloud strategies

One of the key lessons from the CrowdStrike outage is the importance of diversification, particularly when it comes to cybersecurity and cloud services. Many businesses affected by the outage relied heavily on a single provider for both cybersecurity and cloud-based services. This concentration of risk can leave businesses vulnerable when that provider experiences a failure.

A growing number of organizations are now embracing multi-cloud strategies, in which workloads and services are spread across more than one cloud provider, to mitigate this risk. This approach enhances resilience and provides greater flexibility in how businesses manage their operations and respond to disruptions. For IT leaders, adopting a multi-cloud strategy means thinking beyond disaster recovery alone. It involves ensuring that critical applications and services can continue to run smoothly, even in the face of vendor outages or failures.
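In practice, keeping a service running through a vendor outage often comes down to trying a second provider when the first one fails. The sketch below illustrates that failover pattern in its simplest form. It is an assumption-laden example rather than a reference design: `call_with_failover`, the provider names `cloud-a` and `cloud-b`, and the two stand-in functions are hypothetical, and a real deployment would also need data replication, health checks, and timeouts.

```python
from typing import Callable, List, Sequence, Tuple


class AllProvidersFailed(Exception):
    """Raised when no provider could serve the request."""


def call_with_failover(
    providers: Sequence[Tuple[str, Callable[[], str]]],
) -> Tuple[str, str]:
    """Try each provider in priority order and return the first success.

    providers: (name, call) pairs; each call is a hypothetical function
        that returns a response or raises an exception on failure.
    Returns the name of the provider that answered and its response.
    """
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call()
        except Exception as exc:  # any failure moves on to the next provider
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def primary() -> str:
    # Stand-in for a provider that is currently down.
    raise ConnectionError("provider unavailable")


def secondary() -> str:
    # Stand-in for a healthy provider.
    return "ok from secondary"


served_by, response = call_with_failover(
    [("cloud-a", primary), ("cloud-b", secondary)]
)
print(served_by, response)
```

Here the request is served by `cloud-b` because `cloud-a` raises an error. If every provider fails, the function raises `AllProvidersFailed` with the collected error messages instead of returning silently.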

SaaS Backup and Recovery Solutions

The CrowdStrike outage also highlighted the increasing importance of SaaS (Software as a Service) backup and recovery solutions. With so many businesses relying on SaaS tools for day-to-day operations, any interruption can have devastating consequences. The outage raised an essential question: What happens when the SaaS provider itself goes down?

SaaS backup is now considered critical for business continuity. The market for these solutions is growing rapidly as organizations recognize the need for greater control over their data. IT teams must implement comprehensive backup strategies that cover not just traditional infrastructure but also SaaS environments. This includes backing up data stored in platforms like Microsoft 365, Google Workspace, and Salesforce to ensure that important business data remains accessible, even in the face of service interruptions.
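A backup strategy that covers SaaS platforms is only useful if the backups are recent. As a rough illustration, the sketch below flags platforms whose most recent backup is older than an agreed limit, or missing altogether. The `stale_backups` function, the timestamps, and the 24-hour limit are hypothetical values for the example, and it assumes the timestamps come from whatever backup tool is in use.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def stale_backups(
    last_backup: Dict[str, Optional[datetime]],
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return the platforms whose latest backup is missing or too old.

    last_backup: platform name mapped to the time of its most recent
        successful backup, or None if no backup exists.
    max_age: the oldest acceptable backup age.
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        name
        for name, taken_at in last_backup.items()
        if taken_at is None or now - taken_at > max_age
    )


# Illustrative timestamps only; a fixed "now" keeps the example repeatable.
now = datetime(2024, 7, 19, 12, 0, tzinfo=timezone.utc)
last_backup = {
    "Microsoft 365": datetime(2024, 7, 19, 6, 0, tzinfo=timezone.utc),
    "Google Workspace": datetime(2024, 7, 17, 12, 0, tzinfo=timezone.utc),
    "Salesforce": None,
}

overdue = stale_backups(last_backup, max_age=timedelta(hours=24), now=now)
print(overdue)
```

In this example Google Workspace (last backed up 48 hours earlier) and Salesforce (never backed up) are reported, while Microsoft 365 (6 hours old) passes the check.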


Backup as a Service (BaaS), the next frontier in data protection

Another trend gaining momentum following the CrowdStrike incident is the rise of Backup as a Service (BaaS). BaaS is a cloud-based offering where vendors provide data protection solutions as a fully managed service. For organizations looking to enhance their resilience without managing their own backup infrastructure, BaaS is an appealing option.

With BaaS, providers package all necessary backup and recovery capabilities, allowing businesses to offload the complexities of data protection. This includes automating backups and ensuring compliance with regulatory requirements. The popularity of BaaS is growing, particularly as companies seek to streamline operations and focus on their core business activities without having to manage backup hardware or software.

For enterprise clients, BaaS offers scalability and reliability. As the volume of enterprise data stored in SaaS environments continues to grow, so does the need for robust and scalable data protection solutions. BaaS providers are stepping up to meet this demand, offering solutions that cater to the increasingly complex needs of modern businesses.

Preparing for the future of cybersecurity and risk management

The CrowdStrike outage has served as a stark reminder of the vulnerabilities that exist in even the most advanced cybersecurity tools. However, it has also sparked important conversations about how organizations can better protect themselves in the digital age. From Microsoft’s renewed focus on kernel access to the rise of multi-cloud strategies and the growing demand for SaaS backup solutions, the landscape of cybersecurity is evolving.

For businesses, the lessons from this incident are clear: diversify your risk, invest in comprehensive backup and recovery solutions, and stay vigilant in managing vendor relationships. In a world where data is the lifeblood of business operations, protecting that data—and ensuring access to it even in the face of disruption—has never been more critical.

The future of cybersecurity will be shaped by these lessons, as organizations work to build more resilient systems, adopt multi-cloud strategies, and embrace the latest innovations in backup technology. However, navigating the complexities of multi-cloud environments, SaaS data protection, and cybersecurity risk management requires deep expertise, and this is where cloud strategy consulting becomes essential. By partnering with experienced consultants, organizations can design and implement tailored cloud strategies that balance performance, security, and scalability. This expertise can help businesses evaluate cloud providers, plan disaster recovery scenarios, and create robust backup solutions that are aligned with both regulatory requirements and business objectives.

