Telecom Risk and Security Part 2 – The Carrier Hotel SuperNode

February 1996. A half-ton bomb was planted in a small truck near South Quay Station, close to the recently renovated commercial district of Canary Wharf. The bomb detonated around 1900 hours, bringing down a six-story building and severely shaking Canary Wharf Tower and other buildings around the Docklands area. The area, home to much of the telecommunications interconnection capacity connecting the UK and Europe to the rest of the world, was severely damaged and all surrounding activity disrupted.

Today the Docklands area continues to support many important, high density communications interconnection points, including Telehouse Europe, the London Internet Exchange (LINX), and the London Network Access Point (LONAP) – in addition to individual nodes and facilities operated by European and other international telecommunications carriers.

This includes companies operating submarine fiber optic cable systems. These densely interconnected areas are referred to as telecommunications “SuperNodes,” or, if the interconnections are concentrated in a single building, “carrier hotels.”

A Global Issue

The US National Security Telecommunications Advisory Committee (NSTAC) defines a carrier hotel (or SuperNode) as “conditioned floor space operated by a commercial landlord for the purpose of hosting multiple service providers.” The best-known SuperNodes are 60 Hudson in New York City, the NAP of the Americas in Miami, One Wilshire in Los Angeles, and the Westin Building in Seattle.

Carrier hotels emerged in the late 1990s following the Telecommunications Act of 1996, which required US incumbent carriers to provide interconnection or collocation space for the new competitive carrier industry. The problem for the carriers, and the opportunity for commercial building owners, was that carrier facilities were running out of available space.

The commercial landlords were able to provide building space, partially due to low occupancy in city center areas near large carrier central offices (such as Bunker Hill in Los Angeles) during the late 1990s, and competitive carriers were able to build out their interconnection infrastructure with little or no interference by the incumbent carriers.

Carrier hotels can also be considered “scale free,” with the only real limitation on growth being the physical space available within a property, as well as electricity and cooling for electronics and switching equipment. This may not even be a large problem, as much of the carrier hotel interconnection volume is done through “passive” cross connects. Cross connects are fiber optic to fiber optic splices which do not require local electronics, and thus are not directly vulnerable to cooling and power issues.

What is the Impact of Losing a Carrier Hotel or SuperNode?

Could another attack similar to the 1996 Docklands incident potentially have the impact of severing interconnection capacity between communications carriers, Internet service providers, and news or information resources?

The extent of disruption would depend on the amount of switching and multiplexing equipment and physical interconnection capacity each company locates within the Telehouse facility, or the immediate area.

This is a source of much debate. In the US, nearly all facility-based (own their own cable) carriers and large virtual carriers have numerous interconnection sites located throughout the country. The loss of a single node or interconnection facility would not significantly disrupt national or international communications.

The Federal Communications Commission provides guidelines for facility-based carriers through the Network Reliability and Interoperability Council (NRIC) which advises “carriers place a high priority on service reliability by building networks with alternative routes, backup facilities, and other assurance capabilities.”

The danger at the SuperNode or carrier hotel does not necessarily lie with the incumbent or long distance facility-based carriers. It is more an issue with:

  • International carriers with only one or two physical landing points in North America (or Europe)
  • Local exchange carriers with limited interconnection capacity outside of the carrier hotel
  • Internet service providers operating in a smaller geography (Tier 3 access networks)
  • Hosting companies and content delivery providers with single or limited Internet access
  • Local fiber providers with limited diversity within a city center

This is actually quite alarming. When you start to consider the outsourcing industry, including cloud computing, entertainment, and the number of companies who do not have strong disaster recovery plans – including geographic diversity within their applications and communications access – the potential for disruption is high.

Most of the SuperNodes provide interconnections for more than 200 facility-based carriers, networks, content providers, cloud service providers, and other hosting or business outsourcing companies. Given that we live in a very global economy, losing the interconnection capacity of even one SuperNode could render a large percentage of global financial, logistics, business-to-business, disaster response, and government communications inoperable for hours or days while restoral plans are either implemented or conceived.

Companies with hosted applications and data center presence either in or near the failure point could be isolated or destroyed. Hosted companies that are “single-threaded,” with one carrier connection that uses the carrier hotel as its main interconnection point, would be shut down.

The bottom line: companies without a strong restoral, backup, and disaster recovery plan and a physically diverse network will suffer a catastrophic failure of their systems, with the length of the outage entirely dependent on the facility's ability to recover from an outage or failure.
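The “single-threaded” exposure described above can be checked with very little tooling. The following is a minimal sketch, assuming a hypothetical inventory that maps each company to the facilities where it has an upstream interconnection. The company names and the `isolated_by` helper are illustrative only and do not describe any real network.

```python
from typing import Dict, List, Set

# Hypothetical inventory (illustrative only): company -> facilities where it
# has an upstream interconnection.
INTERCONNECTS: Dict[str, List[str]] = {
    "single-threaded-host": ["One Wilshire"],
    "regional-isp": ["One Wilshire", "Westin Building"],
    "national-carrier": ["One Wilshire", "60 Hudson", "NAP of the Americas"],
}


def isolated_by(failed: Set[str], interconnects: Dict[str, List[str]]) -> List[str]:
    """Return the companies whose every interconnection sits in a failed facility."""
    return sorted(
        company
        for company, sites in interconnects.items()
        # A company is cut off when none of its sites survive the failure.
        if set(sites) <= failed
    )


if __name__ == "__main__":
    print(isolated_by({"One Wilshire"}, INTERCONNECTS))
    print(isolated_by({"One Wilshire", "Westin Building"}, INTERCONNECTS))
```

Running the same check against a set of several facilities gives a rough view of the multi-site scenario, although a real assessment would also need to account for shared power, conduit, and backhaul.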

If more than one SuperNode is disrupted, such as all facilities on the US West Coast, both Internet links (which carry the majority of international communications today) and dedicated capacity will be lost, causing significant damage and disruption to both US and international communications.

What Can Cause a Major Failure?

There are many factors to consider, both human and natural, when looking at global communications infrastructure. Just in the past 5 years we’ve seen significant submarine cable disruptions caused by everything from undersea earthquakes and cable cuts to strong waves hitting cable landing facilities on the coasts. Carrier hotels are primarily located in large coastal cities, due to the proximity of the submarine cables supporting international communications, and the fact that most North American and European terrestrial cable routes tend to interconnect at major coastal cities.

Coastal cities are vulnerable to:

  • Earthquakes
  • Typhoon and Hurricane wind/storm swells
  • Tsunami
  • Tropical rain and flooding

Human factors are also a concern, with potential problems such as:

  • Civil disorder
  • Terrorist attack
  • Vandalism
  • Employees (disgruntled, human error, etc)

If you look at the streets adjacent to buildings such as One Wilshire, you can see the evidence of dozens of carrier tags trying to mark and protect their conduit routes running through the streets, and entering the carrier hotel facility at One Wilshire. Few of the manholes around the area are locked, and few if any local building security officers or police officers will ever challenge a company setting up a couple of traffic cones and entering the manhole.

The potential for human disruption, just by having access below the street level near a building such as One Wilshire or 60 Hudson could be extreme. From below ground potential terrorists have access to power substations, water lines, and hundreds of conduits supporting the entire metro area – including the carrier hotel. A well placed explosive below grade in downtown Los Angeles could potentially disrupt the communications of more than 450 network and Internet-connected companies operating within One Wilshire or immediately adjacent buildings.

Many of the carrier hotels do not have battery backup or even redundant power, as the “meet-me-rooms” fell under the “scale free” rapid growth in the late 1990s and 2000s, when those rooms had little or no management, administrative control, or regulation. This is gradually being brought under control in the largest facilities, and most smaller facilities such as the NAP of the Americas in Miami are very well controlled.

This was proven possible during the 1996 attack in London, and could occur again at one or more carrier hotel facilities located in the United States and other countries. It is a real problem, and one that is not lost on governments around the world.

What We Can Do, and Are Doing, to Protect Our Communications Assets

The key to all applications and communications security is diversity and redundancy. Very few submarine cables are being built today without at least a diverse loop, or a restoral agreement with a competitive cable company. If there is a single location or cable disrupted across the oceans, and restoral capacity is planned, the problem can be managed.

For North American carriers and Internet Service Providers, having a network with multiple “peering” points in different geographic locations will minimize disruption, and most regional and global networks are built that way. In fact, most large Internet networks require interconnections in multiple locations before they will consider “peering” relationships. That is of course for both traffic management and disaster planning.

This would mean an Internet Service Provider would best plan their network not only for physical high capacity interconnections in multiple carrier hotels, but also for peering or disaster peering plans at public peering points, such as PAIX, Any2, Equinix, and Telehouse in the US, or other major Internet Exchange Points (IXPs) in London, Amsterdam, Frankfurt, and cities across Asia.

For those carriers and ISPs planning long distance interconnections, care must be taken to ensure route diversity. In some cases, multiple carriers will purchase capacity on a wholesaler fiber provider’s infrastructure (such as Level 3 Communications, XO, and Time Warner), with the possibility several different network providers will buy capacity on their long distance route using the same cable system.

In many cases, such as the cable landing stations dotting Long Island in New York, the actual cables connecting those facilities to the carrier hotels and their own cable capacity management facilities follow a single route. The risk is that a single backhoe, terrorist, or vandal could potentially cause serious international communications damage by simply cutting a trench across the roadway, or jumping into a manhole and cutting cable.

“Vandals are to blame for the massive phone and Internet outage in Silicon Valley on Thursday, an AT&T representative has confirmed.” (CNET News, 9 Apr 2009)

An incident in early 2009 near San Jose (California) where an individual performed a similar act of vandalism caused significant disruption across a large area in Northern California. The above story confirms the danger present when critical infrastructure is not adequately protected, and a single person can enter a manhole with the potential of such widespread impact.

Physical cable and route diversity guarantees should be part of every disaster recovery and route planning negotiation.
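One way to make a route diversity guarantee concrete is to ask each provider for the physical segments a circuit rides on, then look for overlap. The sketch below assumes such records are available. The circuit names, segment labels, and the `shared_segments` helper are all hypothetical and serve only to illustrate how two circuits bought from different carriers can still share a cable system.

```python
from itertools import combinations
from typing import Dict, Set, Tuple

# Hypothetical circuit records (illustrative only): circuit name -> the
# physical segments (conduits, cable systems, landing stations) it rides on.
CIRCUITS: Dict[str, Set[str]] = {
    "carrier-a-primary": {"conduit-1", "cable-system-x", "landing-station-1"},
    "carrier-b-backup": {"conduit-7", "cable-system-x", "landing-station-2"},
    "carrier-c-backup": {"conduit-9", "cable-system-y", "landing-station-3"},
}


def shared_segments(circuits: Dict[str, Set[str]]) -> Dict[Tuple[str, str], Set[str]]:
    """Map each pair of circuits to the physical segments they have in common."""
    overlaps: Dict[Tuple[str, str], Set[str]] = {}
    for name_a, name_b in combinations(sorted(circuits), 2):
        common = circuits[name_a] & circuits[name_b]
        if common:
            # Any shared segment is a single point of failure for the pair.
            overlaps[(name_a, name_b)] = common
    return overlaps


if __name__ == "__main__":
    for pair, common in shared_segments(CIRCUITS).items():
        print(pair, "share", sorted(common))
```

In this made-up data the primary and one backup come from different carriers yet ride the same cable system, which is exactly the situation a diversity clause is meant to rule out.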

Those companies outsourcing their mission and company-critical data and applications must look at geographic diversity, with the ability to dynamically restart applications within industry and customer-acceptable recovery point and recovery time objectives. Cloud computing technology is getting closer to providing this for the future, but is not quite ready to offer service level objectives.
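Recovery point and recovery time objectives are easier to negotiate when they are tested against a timeline. As a rough sketch, assuming the timestamps of the last good replica, the failure, and the service restoration are known from a drill or a real incident, the check below compares the measured data-loss window and downtime against the agreed objectives. The function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta


def meets_objectives(
    last_replica: datetime,
    failure: datetime,
    service_restored: datetime,
    rpo: timedelta,
    rto: timedelta,
) -> bool:
    """Return True if a drill or incident stayed within both objectives."""
    # Recovery point: how much data written before the failure was never replicated.
    data_loss_window = failure - last_replica
    # Recovery time: how long the application was unavailable.
    downtime = service_restored - failure
    return data_loss_window <= rpo and downtime <= rto


if __name__ == "__main__":
    ok = meets_objectives(
        last_replica=datetime(2010, 1, 1, 12, 0),
        failure=datetime(2010, 1, 1, 12, 10),
        service_restored=datetime(2010, 1, 1, 13, 0),
        rpo=timedelta(minutes=15),
        rto=timedelta(hours=1),
    )
    print("objectives met:", ok)
```

The dates in the usage example are arbitrary placeholders, not figures from any real outage.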

The US Government Weighs In

The NSTAC believes the government should work with private industry to develop both operational best practices, as well as a solid, coordinated, threat warning system to assist carrier hotel, data center, and SuperNode operators to ensure the best level of security for national and global infrastructure.

Police departments should have some level of visibility into carrier hotels and SuperNodes, data centers, and telecommunications company central offices. Not because we want “big-brother” looking into our business, but because we want law enforcement to understand the nature of our telecom business, and what could potentially happen if human beings are able to damage local infrastructure (which includes emergency responder infrastructure).

The NSTAC recommends individuals employed at carrier hotels and critical infrastructure facilities go through an initial security check. This may be in part because the national authorities probably have their own communications running through SuperNodes, and have recognized there is a reasonable chance US government and military communications could also be damaged or disrupted in the event of a facility failure or loss.

The FCC and NSTAC also recognize the burden of responsibility ultimately falls on the individual networks and customers. Our economy and communications infrastructure depend on each company having good disaster recovery and diversity plans. Individual users must ensure we get service level agreements with a clause ensuring physical route diversity in backup and DR site interconnections.

ISPs need to multi-home their networks. Not just at a single interconnection point, carrier hotel, or IXP – but in separate facilities, preferably in separate geographies.
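The three levels of diversity named here (more than one upstream, more than one facility, more than one geography) can be checked separately. This is a minimal sketch under the assumption that an ISP keeps a simple list of its upstream connections. The `Upstream` record and `diversity_report` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Upstream:
    """One upstream or peering connection (hypothetical record layout)."""
    provider: str
    facility: str
    metro: str


def diversity_report(upstreams: List[Upstream]) -> Dict[str, bool]:
    """Report whether the connections span 2+ providers, facilities, and metros."""
    providers = {u.provider for u in upstreams}
    facilities = {u.facility for u in upstreams}
    metros = {u.metro for u in upstreams}
    return {
        "multi_homed": len(providers) >= 2,
        "facility_diverse": len(facilities) >= 2,
        "geo_diverse": len(metros) >= 2,
    }


if __name__ == "__main__":
    # Two providers, but both in the same building: multi-homed on paper only.
    links = [
        Upstream("provider-a", "One Wilshire", "Los Angeles"),
        Upstream("provider-b", "One Wilshire", "Los Angeles"),
    ]
    print(diversity_report(links))
```

The example is deliberately the weak case: an ISP with two providers in one carrier hotel passes the first test and fails the other two.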

The government is working with representatives from the telecom, vendor (electronic switching equipment, etc), applications, and business communities, as well as government agencies, on a continuing basis to ensure US policy is kept current, and the threat and risk to our current infrastructure is understood. The President’s National Security Telecommunications Advisory Committee (NSTAC) is now part of the US Department of Homeland Security, and coordinates much of the discussion.

As users, we need to take action as well. We can do any or all of the following to ensure not only our security in global communications, but also at our businesses and home:

  • Ask your hosting provider if they have a disaster recovery plan – Get proof
  • Ask your network provider if they are multi-homed and multi-homed in multiple geographies – Get proof
  • Ask your provider if their physical diversity is using physically separate fiber routes
  • Ask your hosting provider if they have good coordination with law enforcement for local security – Get proof
  • Ask your international VPN (virtual private network) provider if their cable system has a restoral plan, or if you have geographic fail-over on a separate cable – Get proof

In short, the burden is ultimately on the end user to ensure their business or activity survives a major disaster. We must drive our vendors, and should seriously consider strongly supporting greater regulation and oversight of our critical infrastructure facilities to ensure we do not lose a resource that could potentially contribute to a global economic and communications catastrophe.

What are your concerns? Do you believe we are OK in our current telecom environment? Should we do more? Your comments are welcome.

John Savageau. Long Beach

Other articles in this series:

  • Risk and Security in the Telecommunications Industry Series – Part 1