Kentik recently hosted a virtual panel with network leaders from Dropbox, Equinix, Netflix and Zoom and discussed how they are scaling to accommodate the unprecedented growth in network traffic during COVID-19. In this post, we highlight takeaways from the event.
This week, Kentik hosted a virtual panel with network leaders from Dropbox, Equinix, Netflix and Zoom to discuss how they are scaling their infrastructure and services to accommodate the unprecedented growth in network traffic during COVID-19. Avi Freedman, Kentik co-founder and CEO, moderated the discussion. He was joined by:
- Alex Guerrero, Senior Manager of SaaS Operations at Zoom
- Bill Long, SVP of Core Product Management at Equinix
- Dave Temkin, VP of Network and Systems Infrastructure at Netflix
- Dzmitry Markovich, Senior Director of Engineering at Dropbox
Architectures and Recent Traffic Growth
These leaders are in charge of running networking and engineering functions for their organizations. All of them have observed significant changes in network usage and traffic patterns due to the changes in the global workforce. As a result, many of them needed to execute a six-month infrastructure growth plan, or even a 2020 growth plan, in a matter of just weeks.
The panelists discussed how their organizations run their networks and data centers. Zoom and Dropbox, for example, leverage public cloud infrastructure and automation tools so their services can quickly burst in size for additional capacity. Their investment in automation is essential for handling the dramatic growth in the demand for their services.
Equinix connects the world’s leading businesses, including Kentik, to customers, employees, and partners inside some of the most interconnected data centers. It has a large global footprint of data centers and exchanges, with all major cloud providers present. According to Bill Long, Equinix has recently seen traffic growth ranging from 10% to more than 40%.
Netflix and Dropbox both operate edge and content delivery networks designed to handle the increase in traffic volume. According to Dave Temkin, Netflix operates all of its content delivery on its own global infrastructure. According to Dzmitry Markovich, Dropbox operates its own backbone network to process multiple terabytes of data per second. He also mentioned that Dropbox uses Kentik to understand its traffic flows and peering.
Cloud Bursting for Fast Growth
There were several common patterns among the organizations on the panel. While the burst in traffic and usage required more infrastructure and capacity, most of the organizations noted having plenty of headroom to grow. The panelists from Dropbox, Zoom and Netflix noted how bursting to the cloud supports additional short-term capacity requirements.
The flexibility of public cloud comes at the price of consumption-based, variable pricing. It’s harder to predict where the costs will go, but if your business is aligned with the consumption of these resources, it’s easier to absorb the costs. Public cloud allows resources to be provisioned based only on what is being used, versus the typical over-provisioning we see across infrastructures. To save costs, organizations that use a lot of public cloud will reserve instances in advance, while those that are bursting often use “spot” or on-demand instances. It’s important to note that cloud providers can also have capacity constraints, as we saw this past week on Azure.
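To make the reserved versus on-demand trade-off concrete, here is a minimal sketch of the arithmetic. Everything in it is an assumption for illustration: the function name `daily_cost`, the demand profile, and the per-instance-hour prices (in cents) are hypothetical and do not come from the panel or from any cloud provider's price list.

```python
def daily_cost(hourly_demand, reserved_count, reserved_rate, on_demand_rate):
    """Cost of covering demand with a reserved baseline plus on-demand burst.

    hourly_demand: instance counts needed in each hour of the day.
    Reserved instances are billed every hour whether used or not;
    on-demand instances are billed only for the hours they run.
    Rates are in cents per instance-hour (hypothetical values).
    """
    reserved_cost = reserved_count * reserved_rate * len(hourly_demand)
    burst_instance_hours = sum(
        max(0, need - reserved_count) for need in hourly_demand
    )
    return reserved_cost + burst_instance_hours * on_demand_rate


# Hypothetical day: 10 instances for 20 hours, then a 4-hour burst to 30.
demand = [10] * 20 + [30] * 4

# Hypothetical prices: 6 cents reserved, 10 cents on-demand per instance-hour.
for reserved in (0, 10, 30):
    cents = daily_cost(demand, reserved, reserved_rate=6, on_demand_rate=10)
    print(f"reserved={reserved:2d} -> {cents / 100:.2f} per day")
```

Under these made-up numbers, reserving only the steady baseline and bursting on demand is cheaper than either reserving for the peak or running everything on demand, which is the pattern the panelists described.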
New Projects to Expand Infrastructure
All the panelists created projects to expand their traditional infrastructures because of the traffic changes. Zoom’s Alex Guerrero mentioned that his team is establishing new peering and connectivity to serve the locations where the bulk of the traffic originates. Alex mentioned Zoom uses the Kentik platform for the analytics that drive those decisions. The Zoom team also uses many of the Equinix cloud exchanges to connect across providers. Netflix’s Dave Temkin mentioned needing to find new partners from which to source the servers used to move content caches closer to the eyeballs. Getting closer to the end user is key for all the panelists who deliver content, collaboration technologies, or infrastructure.
Bill Long said Equinix is in a position to handle the increased capacity, having spent the previous two to three years upgrading from 10 Gbps to 100 Gbps links in its core infrastructure as part of a normal technology refresh cycle. When remote work and shelter-in-place orders began, Equinix had significant resources and capacity to handle the additional traffic. Bill said they are also seeing many more connections and peering arrangements in their facilities, which help traffic flow more smoothly for collaboration and unified communications, along with moves towards VPN services.
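The headroom argument can be illustrated with a small calculation. This is only a sketch under stated assumptions: the helper `utilization_after_growth` and the 6 Gbps starting load are hypothetical, while the 10 Gbps and 100 Gbps link sizes and the 10% to 40% growth range are the figures mentioned on the panel.

```python
def utilization_after_growth(traffic_gbps, growth_pct, capacity_gbps):
    """Return link utilization (1.0 means full) after traffic grows by growth_pct percent."""
    grown_traffic = traffic_gbps * (1 + growth_pct / 100)
    return grown_traffic / capacity_gbps


# Hypothetical link carrying 6 Gbps at peak before the surge.
baseline_gbps = 6

for capacity in (10, 100):
    for growth in (10, 40):
        u = utilization_after_growth(baseline_gbps, growth, capacity)
        print(f"{capacity} Gbps link, +{growth}% traffic: {u:.1%} utilized")
```

With this assumed baseline, a 40% surge would push a 10 Gbps link to roughly 84% utilization, while the same traffic on a 100 Gbps link would stay under 10%.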
Remote Work-Life Balance
Most of the panelists said they manage distributed, remote teams who are accustomed to working that way. The discussion covered supporting employees who are now working with their families in the background, as well as finding new ways for teams to socialize. The panelists noted that these are common challenges that their organizations are managing quite well. They also noted the importance of work-life balance, even for the network teams who are working harder than ever to keep us all connected.
You can watch the replay of the virtual panel on-demand here.
We will also be hosting more of these panels on various topics as we all change the way we work. The first step to scaling your infrastructure is being able to measure it.