I spent the last few months of 2022 sharing my experience transitioning networks to the cloud, with a focus on spotting and managing some of the associated costs that aren’t always part of the “sticker price” of digital transformation.
I originally envisioned this data gravity content as part of the Managing Hidden Costs series, but the more planning I did, the more I realized it was an important topic all its own. Why data gravity? With data sets reaching record scale, and organizations pressed from multiple angles to distribute their data across availability zones, it is more important than ever for network operators to understand how data gravity is impacting not only their bottom line but other critical factors like user experience, development velocity, and engineering decisions.
In this and the following articles of this series, we will:
I have Dave McCrory to thank for the conceptual framework of data gravity, an idea he introduced in 2010. Two concepts I will introduce later, service energy and escape velocity, were likewise coined by Dave in the context of data in cloud systems.
To quote him directly:
“Consider data as if it were a planet or other object with sufficient mass. As Data accumulates (builds mass), there is a greater likelihood that additional services and applications will be attracted to this data. This is the same effect Gravity has on objects around a planet.”
As data accumulates in a specific location, its gravity increases. As Dave notes in the quote above, this draws apps and services that rely on this data closer. But why? Physics. As the gravity of the data grows, associated services and apps will rely more on low latency and high throughput and continue their acceleration toward the data to compensate for the increase in data mass.
One of the best examples of data gravity in the cloud is the cloud providers themselves. As Amazon, Microsoft, Google, and others started offering cloud data services, their data centers had to grow dramatically to accommodate the increased data. The more apps and services used these cloud services, the more bound enterprises became to the provider’s data centers. This increased gravity pulled in apps and services from a wider area, though eventually at the expense of cost and performance for users.
The cloud providers understood this problem and started to build data centers across the country and internationally to redistribute their data mass, enabling high throughput and low latency for a greater geographical range of customers. In the early days, this could involve days or weeks of downtime (unthinkable to today’s cloud consumers), as data had to be physically copied and moved, a sensitive and time-consuming endeavor.
For network operators and teams building hybrid cloud systems, managing this data redistribution within and between private and public clouds is easier said than done. As the mass (scale) of the data grows beyond a certain point, moving it becomes prohibitive in both time and cost. A single migration may be accounted for as a one-off event, but data needs to move freely to be valuable.
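To put rough numbers on "time and cost prohibitive," here is a minimal back-of-the-envelope sketch. The function names, the 80% link utilization, and the per-GB price are my own illustrative assumptions, not figures from any provider; plug in your own link speed and current pricing.

```python
def transfer_days(dataset_tb: float, link_gbps: float, utilization: float = 0.8) -> float:
    """Estimate days needed to move a data set over a network link.

    dataset_tb  -- data set size in decimal terabytes (1 TB = 1e12 bytes)
    link_gbps   -- nominal link speed in gigabits per second
    utilization -- fraction of the link actually usable (assumed, not measured)
    """
    if dataset_tb < 0 or link_gbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("size must be >= 0, link speed > 0, utilization in (0, 1]")
    bits = dataset_tb * 1e12 * 8                      # terabytes -> bits
    seconds = bits / (link_gbps * 1e9 * utilization)  # bits / effective bits per second
    return seconds / 86_400                           # seconds -> days


def egress_cost_usd(dataset_tb: float, price_per_gb: float) -> float:
    """Estimate the one-time egress charge for moving the data set.

    price_per_gb is a hypothetical flat rate; real pricing is usually tiered.
    """
    if dataset_tb < 0 or price_per_gb < 0:
        raise ValueError("size and price must be non-negative")
    return dataset_tb * 1000 * price_per_gb           # terabytes -> gigabytes


if __name__ == "__main__":
    # 1 PB over a 10 Gbps link at 80% utilization, at a hypothetical $0.05/GB.
    print(f"{transfer_days(1000, 10):.1f} days")
    print(f"${egress_cost_usd(1000, 0.05):,.0f}")
```

Under these assumptions, a single petabyte takes more than eleven days of sustained transfer, which is why a move that looks like a one-off line item can still stall everything that depends on the data.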
Additionally, services that update, validate, prune, report, and automate will build structures around the data. These services are important components to consider when planning application stack infrastructure. In effect, these data services become a network of their own, running on top of “the network.”
To better understand how and when to use these abstractions to help your data achieve escape velocity (or where high gravity is beneficial), let’s look at data gravity and its relationship to cost, performance, and reliability in your cloud networks.
The closer your apps and services are to your data, the lower your egress and transit costs. At face value, this suggests that a high-gravity, “monolithic” approach to data storage is the way to go. Unfortunately, most enterprises handling massive data must do so across various zones and countries for reliability, security, and performance.
So, in a vacuum, high data gravity equals low costs. Still, with operations and customer use spread geographically, costs can start to rise without multi-zonal presence and additional data infrastructure.
Higher data gravity for distributed networks at scale will ultimately equal higher costs. The data needs to be moved in from the edge, moved around for analytics, and otherwise exposed to egress/transit costs.
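The trade-off above can be sketched as a simple monthly cost comparison between keeping everything in one home region and keeping a replica in every region. This is a toy model under stated assumptions: the `Region` type, the function names, and every price and volume below are hypothetical, and it ignores tiered pricing, request fees, and compute.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    reads_gb: float  # data read per month by apps running in this region


def centralized_cost(regions: list[Region], home: str, transfer_price: float) -> float:
    """All data lives in `home`; every read from another region pays transfer."""
    remote_reads = sum(r.reads_gb for r in regions if r.name != home)
    return remote_reads * transfer_price


def replicated_cost(regions: list[Region], home: str, transfer_price: float,
                    dataset_gb: float, churn_gb: float, storage_price: float) -> float:
    """Each non-home region keeps a full replica, so reads stay local.

    Costs are the monthly sync traffic (changed data shipped to every replica)
    plus the extra storage for each replica.
    """
    replicas = sum(1 for r in regions if r.name != home)
    sync = churn_gb * replicas * transfer_price
    storage = dataset_gb * replicas * storage_price
    return sync + storage


if __name__ == "__main__":
    # Hypothetical footprint: three regions, with the data homed in "us-east".
    regions = [Region("us-east", 5_000), Region("eu-west", 20_000), Region("ap-south", 10_000)]
    print(centralized_cost(regions, "us-east", transfer_price=0.02))
    print(replicated_cost(regions, "us-east", transfer_price=0.02,
                          dataset_gb=10_000, churn_gb=2_000, storage_price=0.023))
```

Which option wins depends entirely on the ratio of remote reads to data churn and replica size, so the useful part of the exercise is measuring those three numbers for your own workloads.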
As pointed out earlier, the closer services and apps can be to the data, the lower the default latency. So, high data gravity for a system with a geographically close footprint will result in improved performance.
But, as with cost considerations, performance makes its own demands against data gravity. To deliver optimal performance, efficient data engineering and networking infrastructure are critical when generating or interacting with massive data sets. If apps and services cannot be any closer to the data, abstractions must be engineered that bring the data closer.
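Distance sets a hard floor on latency, and a small sketch makes it concrete. It assumes light travels through fiber at roughly 200,000 km/s (about two-thirds of its speed in a vacuum) and ignores routing detours, queuing, and processing time, so real round trips will be slower. The function names are my own.

```python
FIBER_KM_PER_S = 200_000  # approximate speed of light in optical fiber


def min_rtt_ms(path_km: float) -> float:
    """Lower bound on round-trip time over a fiber path of the given length."""
    if path_km < 0:
        raise ValueError("path length must be non-negative")
    return 2 * path_km / FIBER_KM_PER_S * 1000  # there and back, seconds -> ms


def request_floor_ms(path_km: float, round_trips: int) -> float:
    """Lower bound for a request that needs several sequential round trips,
    for example a page that issues one query after another."""
    if round_trips < 0:
        raise ValueError("round_trips must be non-negative")
    return min_rtt_ms(path_km) * round_trips


if __name__ == "__main__":
    # An app 4,000 km from its data, making 25 sequential queries per request.
    print(f"{min_rtt_ms(4000):.0f} ms per round trip")
    print(f"{request_floor_ms(4000, 25):.0f} ms per request, at best")
```

At 4,000 km the floor is about 40 ms per round trip, so a chatty request with 25 sequential queries cannot finish in under a second no matter how fast the servers are. That is the pressure that pushes apps toward the data, or caches and replicas toward the apps.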
For me, reliability is the crux of the issue with data gravity. Even if an organization is operating in a relatively tight geographical range, allowing data gravity to grow too high in a given cloud zone is a recipe for unintended downtime and broken SLAs. What if your provider’s data center experiences an outage? What if your provider has run out of capacity for your (massive) data analysis processes?
Reliability, though, is not just about availability but about the correctness and persistence of data. Let’s take a look at a few services needed to make a large chunk of data useful:
Many of the items above can be skipped entirely when crafting the application stack. However, when data volumes get big, most organizations must incorporate one or more of the above concepts. The more valuable the data store becomes, the more likely the services above are required to enable the growth of the data set.
Most organizations that house large data structures will build many teams around data usage. Each team will take on an aspect of adding data, cleaning up data, reporting on data, taking action on the data stored, and finally purging/pruning/archiving data. In cloud networking, this means taking on more cloud-native approaches to these concepts, forcing teams to rethink/rearchitect/rebuild their stack.
This process is what is often called digital transformation. Digital transformations do not require big data, but big data usually requires digital transformations.
Network architects and operators handling massive data sets in the cloud must be aware of the effects of that data accumulating in one region (or even with one provider). If left alone, these high-gravity data sets become impossible to move, difficult to use, and vulnerable to outages and security threats while weighing down the balance sheet.
Network components and abstractions are required to counter this and “free” data. But this introduces some challenges of its own, which we will explore in part II of this series.