Every year the internet suffers numerous disruptions and outages, and 2021 was no exception. Kentik’s Doug Madory recaps the top 10. The world’s network engineers deserve a load of #HugOps for 2021.
Every year the internet experiences numerous disruptions and outages, and 2021 was certainly no exception. This year we documented outages, including multiple government-directed shutdowns, as well as what might be the internet’s biggest outage in history. In this post, I run through 10 of the top outages that we covered in 2021. Needless to say, the world’s network engineers deserve a load of #HugOps in 2021.
Back in February, my friend John Kristoff kicked off a lively discussion on the NANOG listserv by asking subscribers to rank their top three most famous internet outages. The subsequent exchange of internet war stories inspired the creation of the lead-off panel at NANOG 83 entitled “Famous Internet Outages.” Kentik Co-founder and CEO Avi Freedman and I were two of the panel’s four speakers.
The following is our list of the top 10 internet outages of 2021, in rough chronological order.
The first major outage on our list occurred in January. That was when the government of Uganda cut the country’s internet services in the days around a national presidential election. This outage took place almost 10 years to the day after the internet shutdown in Egypt during the Arab Spring.
Egypt’s shutdown was a watershed moment for the internet community. It signaled the beginning of the era of the large-scale, government-directed internet shutdown that we presently find ourselves in.
Uganda was almost completely offline for five days around the day of the vote, which resulted in the re-election of President Museveni and extended his 35-year rule over the East African country. Sadly, the disruption came as little surprise to observers, who anticipated and tried to prevent the internet blockage.
Peter Micek of Access Now and I co-wrote a blog post stressing that the campaign to combat shutdowns is far from over and needs more support. It’s especially worrisome that, given the president’s re-election, other embattled authoritarian regimes may conclude that shutdowns work.
In the middle of the first Covid-19 pandemic winter, the Verizon Fios service suffered a major outage. According to Kentik data, Verizon experienced a 12% drop in traffic volume nationally while the service was down.
The outage was initially attributed to a fiber optic cut in New York City, but it was later clarified that the fiber cut was a separate issue. The outage lasted over an hour and disrupted the midday activities of thousands of remote workers and online students along the East Coast.
Only weeks after the shutdown in Uganda came a military coup in Southeast Asia. On February 1, the Myanmar military seized control of the country through a coup d’état and ordered a shutdown of most of the country’s internet services for several hours.
Over the following months, disruptions to Myanmar’s internet ran the gamut: a total shutdown of all services, nightly internet blockages, an extended shutdown of mobile internet services, and even a leaked BGP hijack of Twitter.
In an effort to understand and document the complex situation in Myanmar, we joined forces with CAIDA’s Internet Outage Detection and Analysis (IODA) group, the Tor Project’s Open Observatory of Network Interference (OONI) team and Censored Planet from the University of Michigan, to combine our technologies and produce a comprehensive analysis of the internet blockages that took place in Myanmar this spring.
This collaboration culminated in a paper published in August in SIGCOMM’s 2021 Workshop on Free and Open Communications on the Internet.
On March 10, Russia’s agency for regulating the country’s communications (Roskomnadzor) attempted to slow traffic to Twitter, but it inadvertently disrupted much of the country’s mobile internet service instead.
In a mistake that IT professionals around the world can relate to, the agency was stung by a bad substring match from a poorly formed regular expression. Intending to block Twitter’s link shortener t.co, Russia instead blocked traffic to every domain whose name contains the string “t.co,” including microsoft.com and reddit.com.
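This failure mode is easy to reproduce. The sketch below is illustrative only: the rule that was actually deployed has not been published, so both patterns and the function names `naive_block` and `anchored_block` are my own assumptions. It contrasts an unanchored pattern, which fires on any hostname that merely contains something resembling “t.co,” with one that matches only t.co and its subdomains.

```python
import re

# Hypothetical patterns for illustration; not the rule Roskomnadzor used.
# Unanchored, and the unescaped dot matches any character.
NAIVE = re.compile(r"t.co")
# Matches only "t.co" itself or a subdomain such as "www.t.co".
ANCHORED = re.compile(r"(?:[a-z0-9-]+\.)*t\.co\.?")


def naive_block(hostname: str) -> bool:
    """Block if the pattern appears anywhere inside the hostname."""
    return NAIVE.search(hostname.lower()) is not None


def anchored_block(hostname: str) -> bool:
    """Block only if the whole hostname is t.co or one of its subdomains."""
    return ANCHORED.fullmatch(hostname.lower()) is not None


if __name__ == "__main__":
    for host in ["t.co", "www.t.co", "microsoft.com", "reddit.com", "example.org"]:
        print(f"{host:15} naive={naive_block(host)!s:5} anchored={anchored_block(host)}")
```

The naive version flags microsoft.com and reddit.com because both names happen to contain “t.co” across the boundary between the second-level label and “.com.” Anchoring the match to the full hostname avoids the collateral damage.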
In early June, content delivery network Fastly experienced a major outage because of a faulty configuration push causing thousands of high-profile websites to become unreachable. According to Kentik data, Fastly saw a 75% drop in traffic volume during the outage.
Fastly’s downed services began returning to normal within 50 minutes, which led me to take a glass-half-full perspective on the incident. Despite our best efforts, outages will happen and we should take comfort in the speed at which Fastly was able to restore their services following the disruption.
“There is no error-free internet, it doesn’t exist,” I told The New York Times. I was also invited on a live BBC World News broadcast to discuss the outage — mostly to define CDN for the audience. :-)
Syria experienced multiple hours-long national internet blackouts in May and June as part of an ongoing effort to combat cheating on student exams. Continuing a practice that began in the summer of 2016, the government of Syria ordered internet service cut for 4.5 hours on multiple days while high school final exams were administered.
The outages began at 1:00 UTC (4am local) and lasted until 5:30 UTC (8:30am local) during which time the exams were physically distributed around the country. A similar practice has happened in Iraq since 2015. However, there were no exam-related national outages in Iraq this year. Perhaps this should give us hope that common sense might prevail in Syria with respect to these shutdowns.
On Sunday, July 11, Cubans in cities across the island nation spilled out into the streets in unprecedented numbers to protest an authoritarian government that has been in power since 1959. As these protests grew, the country’s internet went completely offline. At first, it was just down for 30 minutes; then we observed sporadic and limited outages over the next several hours. On the following day, the Open Observatory documented blockages of social media that continued throughout the week.
While shutdowns are new to Cuba in 2021, July’s outage wasn’t the first. There was a mobile service outage in January that appeared to be government-directed following additional protests motivated by the 27N movement, as well as a complete shutdown for several hours in February.
Despite the 2013 activation of the ALBA-1 submarine cable, internet access in Cuba has historically been very limited, due to both the effects of the U.S. embargo and Cuba’s own domestic policy of restricting access to the internet.
Following the protests and outages, there was a push by some U.S. politicians to find a way to provide internet service to the Cuban people via balloon or other unorthodox means. Citing the success of Google’s shuttered Project Loon to extend balloon-based mobile internet service to Puerto Rico and Kenya, they proposed using the same technology to provide an alternative source of mobile internet service in Cuba.
Of course, those Project Loon achievements were accomplished in cooperation with local providers, not in contention with them, as would be the case in Cuba. Additionally, there are numerous unresolved challenges to what I termed a “Hollywood scenario” in the Washington Post.
On October 4, the world’s largest social media platform suffered a global outage of all of its services for nearly six hours, during which time Facebook and its subsidiaries, including WhatsApp, Instagram and Oculus, were unavailable. While Facebook and Instagram had suffered a brief outage on April 8, the big outage for the social media giant in 2021 was clearly October 4.
With a claimed 3.5 billion users of its combined services, Facebook’s downtime of at least five and a half hours (330 minutes) comes to roughly 1.2 trillion person-minutes of service unavailability, a so-called “1.2 tera-lapse,” or the largest communications outage in history.
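The person-minutes figure is just users multiplied by minutes of downtime. As a quick back-of-the-envelope check, the sketch below redoes that arithmetic using the numbers quoted above; the function name `person_minutes` is my own, not a standard metric API. At exactly five and a half hours the product is about 1.16 trillion, which rounds to 1.2 trillion, and at a full six hours it would be about 1.26 trillion.

```python
def person_minutes(users: int, hours_down: float) -> float:
    """Total person-minutes of unavailability: users x minutes of downtime."""
    return users * hours_down * 60


CLAIMED_USERS = 3_500_000_000  # claimed users across Facebook's combined services

low = person_minutes(CLAIMED_USERS, 5.5)   # "at least five and a half hours"
high = person_minutes(CLAIMED_USERS, 6.0)  # upper bound for "nearly six hours"

print(f"5.5 hours: {low / 1e12:.3f} trillion person-minutes")
print(f"6.0 hours: {high / 1e12:.3f} trillion person-minutes")

# Downtime at which the total reaches 1.2 trillion person-minutes
minutes_needed = 1.2e12 / CLAIMED_USERS
print(f"1.2 trillion is reached after about {minutes_needed:.0f} minutes")
```

Any downtime beyond roughly 5 hours and 43 minutes pushes the total past 1.2 trillion, so the headline figure holds for an outage of nearly six hours.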
According to Facebook’s official explanation, it was a routine maintenance job that took down the entire platform by issuing a command to “assess the availability of the global backbone capacity which unintentionally took down all the connections in our backbone network, effectively disconnecting Facebook data centers globally.”
There were numerous ancillary effects from the loss of the Facebook platform. After initially welcoming the hordes of Facebook refugees, Twitter began to feel the crunch as one of the few remaining social media options. Signal reported millions of new signups in the wake of the temporary loss of WhatsApp. And users accustomed to authenticating via Facebook for various apps or websites were unable to access them.
Lastly, the outage highlighted the confusion that average users have in understanding the difference between their internet service and the apps that run on it. During the outage, Downdetector reported spikes in complaints about nearly every company in the internet ecosystem when people could not reach the Facebook platform.
Beginning on Monday, November 8, and continuing into the next day, Comcast suffered several large outages affecting thousands of customers across the country. The outages started on the West Coast and moved east, with Comcast users losing service for anywhere from a few minutes to hours.
Most large outages begin with a single precipitating event that takes down services until they are later restored. What makes these outages unusual is that different parts of the Comcast network went down at different times, sometimes hours apart.
The graphic above shows two drops in traffic to Comcast’s AS33651, the first beginning at 22:00 UTC on November 8 and the second just before 6:00 UTC the following day. Meanwhile, AS33491 experienced a total outage beginning just after 13:00 UTC on November 9.
As of this writing, Comcast has yet to publish an explanation for the outages.
On October 25, we saw another military coup d’état take down a country’s internet, this time in the North African country of Sudan. While Sudan remained connected to the global internet, this shutdown targeted mobile internet services, rendering them inoperable for roughly 24 days. Before restoring services on November 18, Sudanese military forces violently cracked down on anti-coup protesters in the streets of Khartoum.
Sudan is no stranger to shutdowns. The country has sadly experienced numerous blackouts in recent years, starting as far back as an incident in September 2013, as well as blackouts this summer to prevent student cheating on exams.
This year’s outages that didn’t make our top-10 list included September’s incident in New Zealand caused by a faulty DDoS attack mitigation and the slow restoration of internet service in New Orleans as the region recovered from Hurricane Ida.
February’s freeze that knocked out electricity for millions of customers in Texas also led to power outages impacting 4.7 million people in northern Mexico. Without electricity, internet services were also knocked out, as we observed in Kentik data.
This summer saw the departure of U.S. military forces from Afghanistan after nearly 20 years of conflict. In a blog post, I posed the question of how the departure would impact the Afghan domestic internet, which had grown dramatically in the past decade.
While we have yet to see any major internet disruptions in Afghanistan, it’s fascinating to observe the military withdrawal from a BGP perspective in the global routing table. The decline of prefixes originated by AS5800, the primary ASN used by the Department of Defense in Afghanistan, mirrors the troop drawdown. Below is a timeline of originations from AS5800 this year.
Another ASN that stopped announcing DoD prefixes in 2021 was the mysterious AS8003. That was the obscure ASN that appeared out of nowhere in the final minutes of the previous administration and started announcing more IPv4 space than any ASN in history, leading to headlines in the Washington Post and Associated Press.
That’s our list of the top 10 internet outages of 2021. Would any of them make your all-time, top-three most famous outages? Did we leave any out? Let us know.