As 2020 ends, networking infrastructure has become even more critical for connecting the people, applications, economy, and distributed workforce that make the world go.
At the same time, networks and IT infrastructure overall are becoming more diverse, dynamic, and interdependent. The internet is now the critical glue that connects traditional and cloud infrastructure. And the distributed workforce and online-focused lives we’re living have driven growing adoption of SASE, CDNs, and other methods of delivering service to the edge.
The last five years have seen a major move to observability from the systems and application side. There are a number of definitions, but observability in the DevOps world has been about using diverse telemetry (generally metrics, logs, and traces) to know the internal states of systems over time, and about answering the unbounded questions needed to run modern applications.
Since Kentik’s launch, we’ve been at ground zero for a parallel move towards network observability, and we’re excited to keep partnering to move the industry forward.
Observability for the network looks at different telemetry, with a networking spin, but is based on the same principles: answering the questions you need to run the network infrastructure that drives the digital world.
How do we define network observability?
The goal is to answer any question about your network infrastructure, quickly and easily:
… and to have support from your observability stack to get those answers quickly and flexibly, and both proactively and interactively.
The goal is to free up ops team time to architect, build, and develop for increased orchestration, automation, uptime, and performance!
Our most successful customers, whose observability journeys are worth learning from, have invested in three key areas: telemetry, data platform, and action.
I’ll talk more about each of these areas this month in subsequent blog posts.
In order to see and reason about the network, it’s critical to gather telemetry:
Without a complete picture of the state and activity of all your networks, you lack the ability to ask the questions and take the actions needed to ensure great traffic delivery.
To take telemetry and support asking questions, knowing about issues, and driving the actions needed to run infrastructure, there are common patterns and requirements for underlying data platforms:
Depending on the scale of the architecture, planning, engineering, and operations teams, it can also be important that the underlying data platforms are:
For network observability, the goal of asking questions is to understand and take action. The network teams we work with say they want to be able to:
How is network observability different from the hundreds of existing network monitoring and management tools and platforms that have been around for many years?
Historic tools have been standalone, closed systems, generally on-prem and single-node or few-node, without modern open data architectures.
With limited enrichment, granularity, and retention, they’ve also generally focused on the kind of rollups and pre-defined queries whose limitations have driven the move towards observability. Often vendor-specific, they generally don’t understand cloud or orchestration at all, or at most view them as separate kinds of networks.
These systems have also been geared toward deep network experts. As infrastructure layers converge and ops teams need both infrastructure and application visibility, these older, more closed and limited systems have not found a place in greenfield observability and monitoring stacks.
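To make the point about rollups and pre-defined queries concrete, here is a minimal sketch in Python. The flow records, prefixes, and function names are all hypothetical, invented for illustration; this is not how Kentik or any particular tool stores data. It shows that a per-interface rollup answers only the question it was built for, while keeping raw records and enriching them with a prefix lookup lets you ask a question nobody planned for.

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network

# Hypothetical raw flow records: (interface, destination IP, bytes).
FLOWS = [
    ("eth0", "203.0.113.10", 500),
    ("eth0", "198.51.100.7", 1200),
    ("eth1", "203.0.113.99", 300),
    ("eth1", "198.51.100.8", 100),
]

# Hypothetical routed prefixes used to enrich each flow.
PREFIXES = [ip_network("203.0.113.0/24"), ip_network("198.51.100.0/24")]


def rollup_by_interface(flows):
    """Pre-defined rollup: total bytes per interface, nothing else."""
    totals = defaultdict(int)
    for interface, _dst, nbytes in flows:
        totals[interface] += nbytes
    return dict(totals)


def bytes_by_prefix(flows, prefixes):
    """Ad hoc question: total bytes per destination prefix.

    Needs the raw records, because the interface rollup has already
    discarded the destination address.
    """
    totals = defaultdict(int)
    for _interface, dst, nbytes in flows:
        addr = ip_address(dst)
        matches = [p for p in prefixes if addr in p]
        if matches:
            # Longest-prefix match: prefer the most specific prefix.
            best = max(matches, key=lambda p: p.prefixlen)
            totals[str(best)] += nbytes
    return dict(totals)


if __name__ == "__main__":
    print(rollup_by_interface(FLOWS))
    print(bytes_by_prefix(FLOWS, PREFIXES))
```

Once only the output of `rollup_by_interface` is retained, the per-prefix question can no longer be answered. That is the limitation granular, enriched, long-retention telemetry is meant to remove.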
DevOps observability platforms have been a driver over the last few years in unifying a wide set of telemetry: traditional APM instrumentation with traditional logging, as well as metrics and the more recent waves of innovation in distributed tracing. Many of the platforms (though not all) can also handle, in part or in whole, the kind of cardinality seen in network data.
But viewed through a “can I ask these questions about the network?” lens, there are still some gaps in how easily the leading DevOps platforms ingest network telemetry.
More critically, there are gaps in their understanding of network primitives like prefix, path, underlay, and overlay, and in their support for the kinds of workflows that network professionals engage in to plan, build, operate, debug, scale, and automate their infrastructures.
This all makes sense — even network observability platforms like Kentik that take application-layer data as telemetry don’t have the kind of workflows that developers and app operations teams need to ask questions requiring deep application context.
My view is — better together!
At Kentik, we’re super excited about helping bridge the DevOps/NetOps gap.
Watch this blog over the next month for a series of announcements about how we’ll be feeding unified, enriched network telemetry to a wide range of observability platforms, and some exciting work to drive network-focused views in leading DevOps and App Observability platforms — and the reverse, in Kentik.
Networkers need the same observability principles, tooling, and platforms that those up the stack have been building towards, but with a network-savvy bent.
The legacy network tools aren’t architected for modern infrastructure, and the more modern DevOps-focused platforms still lack network savvy, especially around what happens when packets leave eth0.
Network teams practicing observability in architecture and action are already driving better performance, reliability, security, remediation, and growth. As a passionate network, data, and ops nerd, I’m beyond excited about what these emerging practices mean for the industry over the next decade and beyond.
It’s possible to get there, whether building yourself, working with a vendor, or both. At Kentik, we’re here as a resource wherever you are in your observability journey.