What do summer blockbusters have to do with network operations? As utilization explodes and legacy tools stagnate, keeping a network secure and performant can feel like a struggle against evil forces. In this post we look at network operations as a hero’s journey, complete with the traditional three acts that shape most gripping tales. Can networks be rescued from the dangers and drudgery of archaic tools? Bring popcorn…
A Summer Blockbuster in Three Acts
I recently had the chance to present at A10 Connect, a user conference for A10 Networks. I thought it would be fun to frame my presentation in three acts like a typical summer blockbuster. In act one, you have the “fundamental problem” facing the hero, typically an external problem like a zombie apocalypse or broken romance. In act two, you have the “deeper challenge,” which is more internal, such as self-doubt due to a traumatic history, unreasonable demands from a supposed ally, or a betrayal from inside the hero’s circle of trust. The third act is where new resources are typically revealed to help the hero gain resolution.
Act 1: The Network Traffic Visibility Problem
Networks are delivery systems, like FedEx. What would happen if FedEx didn’t have any package tracking? In a word, chaos. Sadly, most large data networks operate in a similar vacuum of visibility. As anyone in operations knows, despite decades of continual advances in networking technology, it’s still a daily struggle to answer basic questions:
- Is it the network?
- What happened after deploying that new app?
- Are we under attack or did we shoot ourselves in the foot?
- How do we efficiently plan and invest in the network?
- How do we start to automate?
Why is it still such a challenge to get actionable information about our networks? Because “package tracking” in a large network is a big data problem, and traditional network management tools weren’t built for that volume of data. As a point of comparison, FedEx and UPS together ship about 20 million packages per day, with an average delivery time of about one day. A large network can easily “ship” nearly 300 billion “packages” (aka traffic flows) per day with an average delivery time of 10 milliseconds. Tracking all those flows is like trying to drink from a fire hose: huge volume at high velocity.
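To put the fire hose in per-second terms, here is a quick back-of-the-envelope calculation. It uses only the round figures quoted above (roughly 20 million packages, 300 billion flows, one day versus 10 milliseconds), so treat the results as orders of magnitude rather than measurements.

```python
# Back-of-the-envelope comparison using the round figures quoted in the text.
SECONDS_PER_DAY = 24 * 60 * 60            # 86,400
MS_PER_DAY = SECONDS_PER_DAY * 1000       # 86,400,000

packages_per_day = 20_000_000             # FedEx + UPS combined
flows_per_day = 300_000_000_000           # one large network
package_delivery_ms = MS_PER_DAY          # "about one day"
flow_delivery_ms = 10                     # "10 milliseconds"

packages_per_second = packages_per_day / SECONDS_PER_DAY
flows_per_second = flows_per_day / SECONDS_PER_DAY
volume_ratio = flows_per_day / packages_per_day
speed_ratio = package_delivery_ms / flow_delivery_ms

print(f"Packages per second: {packages_per_second:,.0f}")
print(f"Flows per second:    {flows_per_second:,.0f}")
print(f"Volume ratio:        {volume_ratio:,.0f}x")
print(f"Delivery-time ratio: {speed_ratio:,.0f}x")
```

On those figures, the network handles about 3.5 million flows every second, which is roughly 15,000 times the shippers' combined volume, with each "delivery" finishing millions of times faster.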
The Red Herring: More Tools & Screens
In traditional network management, the answer to this data overflow is to pile on more tools, resulting in more screens to look at when you need to figure out what’s going on. But study after study shows that the more you have to swivel between network management tools, the less you are able to achieve positive network management outcomes. A representative paper from Enterprise Management Associates (EMA), The End of Point Solutions, concludes that network teams with one to three tools are able to detect problems before end users do 71% of the time. For teams with 11+ tools, that figure drops to only 48% of the time. Similarly, only 6% of teams with one to three tools experience days with multiple network outages, whereas 34% of teams with 11+ tools experience multiple-outage days. If the goal of a NetOps team is to ensure the delivery of high quality service, those are pretty telling numbers. Your network team is the hero in this movie, and your fundamental challenge is now clear.
Act 2: The Deeper Challenge
At this point in the movie, you can’t have all doom and gloom, so there is a ray of light. The good news is that most networks are already generating huge volumes of valuable data that can be used to answer many critical questions. Of course, that in turn brings up the deeper challenge: how on earth can you actually use that massive set of data? Old ways of thinking clearly aren’t going to get you anywhere, so a revelation is needed — an opening of the mind to new possibilities. In the case of network data, that means realizing that to get value from that data you need to consolidate it into a unified solution with the following capabilities:
- Data unification: the ability to fuse varied types of data (traffic flow records, BGP, performance metrics, geolocation, custom tags, etc.) into a single consistent format that enables records to be rapidly stored and accessed.
- Deeply granular retention: keep unsummarized details for months, enabling ad hoc answers to unanticipated questions.
- Drillable visibility: unlimited flexibility in grouping, filtering, and pivoting the data.
- Network-specific interface: controls and displays that present actionable insights for network operators who aren’t programmers or data analysts.
- Fast (<10 sec) queries: answers in real operational time frames, so users won’t ignore the data and go back to bad old habits.
- Anomaly detection: notify users of any specified set of traffic conditions, enabling rapid response.
- API access: integration via open APIs to allow access to stored data by third-party systems for operational and business functions.
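To make a few of these capabilities concrete, here is a toy sketch of data unification, drillable grouping and filtering, and a simple alert. Everything in it is an assumption made for illustration: the `EnrichedFlow` schema, the helper names, and the sample records are hypothetical, and a fixed byte threshold stands in for real anomaly detection. It is not how any particular product works, and it ignores the scale problem entirely.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EnrichedFlow:
    """One traffic flow fused with routing, geo, and tag context (hypothetical schema)."""
    src_ip: str
    dst_ip: str
    protocol: str
    byte_count: int
    dst_asn: int        # would come from BGP data
    dst_country: str    # would come from geolocation data
    tag: str            # custom business label


def total_bytes_by(flows, dimension, where=lambda flow: True):
    """Sum bytes grouped by any field name, after applying an optional filter."""
    totals = defaultdict(int)
    for flow in flows:
        if where(flow):
            totals[getattr(flow, dimension)] += flow.byte_count
    return dict(totals)


def over_threshold(totals, threshold_bytes):
    """Return the groups whose byte total exceeds a fixed threshold."""
    return {key: value for key, value in totals.items() if value > threshold_bytes}


# Sample records using documentation-reserved addresses and AS numbers.
flows = [
    EnrichedFlow("10.0.0.1", "203.0.113.5", "tcp", 1_200, 64500, "US", "web"),
    EnrichedFlow("10.0.0.2", "203.0.113.5", "udp", 9_000, 64500, "US", "dns"),
    EnrichedFlow("10.0.0.3", "198.51.100.7", "tcp", 400, 64501, "DE", "web"),
    EnrichedFlow("10.0.0.4", "198.51.100.9", "udp", 7_500, 64501, "DE", "dns"),
]

# Pivot the same records two different ways, then flag the heavy groups.
by_country = total_bytes_by(flows, "dst_country")
udp_by_asn = total_bytes_by(flows, "dst_asn", where=lambda f: f.protocol == "udp")
alerts = over_threshold(udp_by_asn, threshold_bytes=8_000)

print(by_country)
print(udp_by_asn)
print(alerts)
```

Because every record carries the same fields, the same few lines can answer "bytes by country" and "UDP bytes by destination AS" without a separate tool for each question. The hard part, as the rest of this act explains, is doing that across billions of unsummarized records in seconds.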
Of course, just opening one’s mind to the dream isn’t the same as having the solution. How do you get your hands on a platform offering the capabilities described above? You could try to construct it yourself, for example by building it with open source tools. But you’ll soon find that path leads to daunting challenges. You’ll need a lot of specialized people on your team, including network experts, systems engineers who can build distributed systems, low-level programmers who know how to deal with network protocols, and site reliability engineers (SREs) to build and maintain the infrastructure and software implementation. Even for organizations that have all those resources, it may not be worthwhile to devote them to this particular issue. At this point in the movie, you, as the hero, are truly facing a deeper challenge.
Act 3: Big Data SaaS to the Rescue
Fortunately for network teams, there’s an accelerating trend toward cloud services as the solution for all sorts of thorny problems. In the network traffic visibility realm, Kentik Detect is that solution. Kentik offers an easy-to-use big data SaaS that’s purpose-built to deliver real-time network traffic intelligence. Kentik Detect gives you detail at scale, super-fast query response, anomaly detection, open APIs, and a UI built by and for network operators. With Kentik Detect, network teams now have the analytical superpowers they’ve always needed.
Our public cloud ingests and stores hundreds of billions of flow records in an average day and makes all of that incoming data queryable within three seconds of receipt. Even more impressive, ad hoc multi-dimensional queries across multiple billions of rows return answers in less than two seconds (95th percentile). That defines “fast and powerful.” Instead of swiveling between several tools to troubleshoot a network performance issue, your teams can now quickly detect and address a huge range of network-related issues from a single solution.
Kentik Detect doesn’t require a costume to transform you into a networking superhero (though we’ve been known to give away some pretty snazzy t-shirts and socks). And your network stakeholders need not suspend disbelief to experience Kentik’s blockbuster story (spoiler alert: it has a happy ending). Learn more by digging into our product, seeing what our customers think, or reading a white paper on the Kentik Data Engine. Better yet, dive right in by requesting a demo or starting a free trial today.