The Weakest Link: Single Points of Failure and Toward a National Cybersecurity
The MOC
By Nicholas Weising
August 27, 2024
In 2023, Russia-affiliated groups launched a global cyberattack that compromised the Department of Energy and succeeded in harvesting sensitive information. This is only one recent example in the waves of cyberattacks targeting American corporations and government agencies. Although the government is increasingly attentive to cybersecurity vulnerabilities in government, industry, and wider American society, vulnerabilities remain. The cyber workforce is of particular concern: the number of cybersecurity vacancies in the job market creates national security liabilities which are not easily resolved. The Department of Defense established the Office of the Assistant Secretary of Defense for Cyber Policy in March 2024 with the goal of cutting the cyber workforce vacancy rate by 25 percent by 2025. Even if our adversaries never launched another cyberattack, the United States’ information technology systems are poorly designed, and building a robust cyber infrastructure with no single points of failure (SPOFs) is a national priority.
SPOFs have plagued telecommunications for decades. Systems that require reliability must minimize SPOFs, because when such a node fails, the entire process or network stops working. The windowless AT&T Long Lines building at 33 Thomas Street in Manhattan was once a major SPOF for the U.S. telecommunications network. It was built to be the nation’s longest-distance telephone switching hub, home to vast machines that connected long-distance telephone calls transmitted to and from the building over copper wires. Nearly all communications on the East Coast connected through 33 Thomas Street, and it was also the termination point for the then-primary transatlantic cable connecting the American telecommunications network to Europe.
On September 17th, 1991, a combination of equipment failure and human error caused the building to go dark. Millions of calls were dropped, transatlantic communication ceased, and the air traffic control system for half the continental United States shut down. As a result of this failure, deliberate moves were made to reduce the concentration of telecommunications infrastructure in a single building controlled by one company, eliminating the SPOF.
Though we now enjoy wireless communications, SPOFs still exist. The worldwide IT outage in mid-July 2024 was caused by a single point of failure. A buggy update to antivirus software pushed by cybersecurity firm CrowdStrike, a major partner to Microsoft, rendered computers on the Windows operating system unusable for several hours. As in the 33 Thomas outage, airports were crowded with stranded travelers, healthcare providers delayed scheduled procedures, and 911 call service was disrupted in some areas. Unlike 33 Thomas, however, this was a worldwide outage, not merely a regional one. The interconnectedness, convenience, and ubiquity of modern wireless communication come at a cost: vulnerability to sweeping outages during events such as this. The impacts of the CrowdStrike outage were worse than any Russian cyberattack yet unleashed, but it was caused by human error and inept system design, not malice.
CrowdStrike quickly fixed the update and normality was re-established in mere hours. However, imagine the potential damage and disruption of a cyberattack that strikes a SPOF and cannot be fixed in several hours. If IT systems were down over a prolonged period, the country would be incredibly vulnerable to mass data harvesting from organizations affiliated with rival countries or even a conventional attack from non-state actors. The consequences would be catastrophic. Even a small but coordinated cyberattack targeting a third-party managed-service provider, which small and medium-sized businesses disproportionately use, would create economic losses of $80 billion and tens of thousands of lost jobs. A larger attack which targets critical infrastructure involving several days of power grid disruption could cost more than $190 billion.
The CrowdStrike outage revealed a SPOF not within a single IT or telecom network, but within the cybersecurity sector at large. The world has become entirely reliant on a handful of cybersecurity firms like CrowdStrike. Fifteen companies hold 62 percent of the worldwide cybersecurity market. That one company falling to human error can grind the world to a halt for hours highlights the true brittleness of global technology infrastructure. If there were greater diversity in operating systems and greater competition between cybersecurity firms in the market, a large-scale cyberattack would not be able to cause the same scope of damage.
The market will not naturally build the redundancies required to eliminate SPOFs. Current cybersecurity companies are not incentivized to prepare for human-error events or cyberattacks. Building systems to be resilient against hacking is very difficult and costly, and it requires constant review and upgrading. It also reduces earnings in the short term, which is especially perilous for executives who manage publicly traded companies. The only way to eliminate SPOFs is to create redundancies such that if one of the nodes fails, the system can continue unabated.
Different systems will require different types of added redundancy to address outstanding SPOFs, but adding nodes is unavoidable if the problem is to be solved. In the case of 33 Thomas Street, that meant redirecting cables and constructing new switching hubs. For our cybersecurity sector, it would involve expanding the number of operating systems and cybersecurity firms. Another way to create more nodes would be to have robust internal IT teams with increased cybersecurity capabilities. However, firms that exist in concentrated markets want to keep their market power and thus acquire startup firms while they are still growing, effectively ‘nipping them in the bud’. The concentrated cybersecurity market is less innovative and productive than the country needs it to be as non-state actors’ hacking ability improves. Furthermore, outsourcing is consistently cheaper than training and retaining internal IT teams.
Both solutions to the cybersecurity problem (increasing the number of cybersecurity suppliers or bringing cybersecurity teams in-house) involve increasing the number of cybersecurity professionals. There is already strong labor market demand for cybersecurity professionals. Building a cybersecurity sector with more nodes would demand even more than the status quo. To fix the economic SPOF that the CrowdStrike outage identified, the U.S. government should oversee vigorous cybersecurity training. The Biden administration has begun this effort, but it demands increased attention via legislation from Congress. An ideal legislative package would appropriate funds to reduce costs of attendance for cybersecurity education at community colleges and other training facilities. It would also provide tax breaks to companies seeking to bring cybersecurity acumen in-house and grants to startups in the cybersecurity sector.
Single points of failure highlight the tension between efficiency and security. The CrowdStrike outage has shown the institutional failure that happens when incentives favor growth at all costs with no regard for organizational memory or stability. A strengthened cyber workforce and a friendly market environment are needed to eliminate SPOFs and improve national security.
Nicholas Weising, Program Associate
The views expressed in this piece are the sole opinions of the author and do not necessarily reflect those of the Center for Maritime Strategy or other institutions listed.