Big Data for NetFlow Analysis
Cisco Live 2016 Interview Covers Why, How, and What’s Next
Cisco Live 2016 gave us a chance to connect with scores of visitors to our booth — both old friends and new — as well as the opportunity to meet with BrightTalk for some video-recorded discussions on hot topics in network operations. This post focuses on the first of those videos, in which Kentik’s Jim Frey, VP Strategic Alliances, talks about the complexity of today’s networks and how Big Data NetFlow analysis helps operators achieve timely insight into their traffic. You can read an overview below of what Jim had to say. For the full video check our BrightTalk channel.
The founders of Kentik have been watching for many years now to see the effect that Big Data is having on network management and analytics. It’s really fascinating, I think, that network management is one of the last areas within digital and IT operations that has taken advantage of Big Data technologies.
One of the things that’s really different about Kentik is that we’ve decided to build a Big Data architecture at the core of a network monitoring solution. That brings some real advantages, because Big Data is great not only at handling large volumes of data, but also at letting you navigate through and explore that data very quickly.
You have to build the right architecture to do that. You have to package it in a way that makes it effective and efficient. But once you’ve figured that out — and that’s one of the things we’ve done at Kentik — it becomes a very powerful solution for understanding the state of your network and then being able to drill in, drill down, drill left, drill right, and pivot your analysis. However you want to, you can change your view to get to the bottom of any sort of interesting or worrisome situation that you see.
Our Big Data implementation also gives us a basis for doing automated evaluation of the metrics, so we can start doing things like automated alerting. DDoS detection is a big use case for us, and Big Data enables us to be both definitive and clear when you’re trying to answer a question about whether your network is performing the way you expect it to perform.
Big Data analytics really helps with a couple of key things in particular. One is clear visibility into current activity on the network, so you can see exactly who’s talking to whom and driving what kind of activity, traffic, and volume. The other is seeing trends in activity: you can drill down on those trends to recognize what’s normal and what’s not. When you see something that you’re not sure about, you want to be able to get down into it and understand what’s underneath the surface as quickly as possible.
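To make "who’s talking to whom" concrete, here is a minimal Python sketch of a top-talkers summary. It is only an illustration, not how Kentik's backend works: the flow records, addresses, and function names are all made up for the example, and real NetFlow records carry many more fields.

```python
from collections import Counter

# Hypothetical flow records: (source IP, destination IP, bytes transferred).
# These values are invented for illustration only.
flows = [
    ("10.0.0.1", "192.0.2.10", 1200),
    ("10.0.0.2", "192.0.2.10", 300),
    ("10.0.0.1", "192.0.2.10", 800),
    ("10.0.0.3", "198.51.100.7", 5000),
    ("10.0.0.2", "198.51.100.7", 700),
]


def top_talkers(records, n=3):
    """Sum bytes per (source, destination) pair and return the n largest."""
    totals = Counter()
    for src, dst, nbytes in records:
        totals[(src, dst)] += nbytes
    return totals.most_common(n)


def bytes_by_source(records):
    """Pivot the same records into a per-source view of total bytes."""
    totals = Counter()
    for src, _dst, nbytes in records:
        totals[src] += nbytes
    return dict(totals)


if __name__ == "__main__":
    for (src, dst), nbytes in top_talkers(flows, n=2):
        print(f"{src} -> {dst}: {nbytes} bytes")
    print(bytes_by_source(flows))
```

The two functions group the same records by different keys, which is a small-scale version of pivoting an analysis from one view to another.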
Organizations that are interested in Big Data as an architecture for network management have to think about a few things. Number one, not all Big Data architectures are specifically designed for the network monitoring use case. You can take other Big Data tools and try to adapt them, but it’s a fair bit of work to get to the level of functionality that most folks expect out of the non-Big Data tools that are out there. You don’t want to give up the goodness of those tools just to have Big Data.
So what you really need to look for is solutions that take advantage of all the great things that Big Data can do for you but that have been adapted and optimized specifically for your network management and security management use cases. You need a solution that does the heavy lifting for you, and gives you all of the benefits without having to build it all yourself.
A lot of Big Data systems do a really good job of harvesting all this data, and storing it, and letting you get to it. But then when you want to run a report, it can take tens of minutes, or in the worst cases even hours, to get the data back out that you want. It’s hard to change and shift and ask new questions, because it means starting from scratch each time. So one of the big challenges with solutions that are built around Big Data architectures is to provide that ease of access, that quickness of access, with unrestricted ability to reach all of the data that’s available to you.
That’s one of the things we’ve really focused on at Kentik. We handle very large volumes of data in the backend, with billions of records coming into our SaaS each day. But the other thing is being able to get data back out as quickly as possible. The vast majority — 95% of the queries run against our backend by customers today — return results in less than two seconds. So that means that pretty much right away you’re going to know what’s going on, or you can find out.
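A figure like "95% of queries in under two seconds" is simply the share of queries that finish within a time limit. The sketch below shows that arithmetic in Python; the latency values and the function name are hypothetical and are not Kentik measurements.

```python
def fraction_within(latencies_s, limit_s=2.0):
    """Return the fraction of query latencies strictly below limit_s seconds."""
    if not latencies_s:
        raise ValueError("need at least one latency sample")
    fast = sum(1 for t in latencies_s if t < limit_s)
    return fast / len(latencies_s)


# Hypothetical query latencies in seconds (invented for illustration).
sample_latencies = [
    0.3, 0.5, 0.4, 1.2, 0.8, 0.6, 1.9, 0.7, 0.2, 1.1,
    0.9, 0.4, 0.3, 1.5, 0.6, 0.8, 1.0, 0.5, 0.7, 6.4,
]

if __name__ == "__main__":
    share = fraction_within(sample_latencies)
    print(f"{share:.0%} of queries returned in under 2 seconds")
```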
Some of the other trends that I think are really interesting in this space, and that Kentik is watching, are things like SDN. SDN changes the way that networks behave. So we’re going to be monitoring what happens, and we can help tell you, for instance, how the behavior of the network changes after a change has been made through some sort of software-defined policy enforcement. So we’re keeping a close eye on that whole SDN and automated configuration management space.
Another thing that we’re watching is the internet of things, or the “internet of everything,” as Cisco likes to call it. So we watch all the trends, and we try to figure out how we are going to be positioned for what’s coming next. And there’s always something new, right?
So the Big Data architecture that we built is designed to be very flexible and to have a very long life, because we use open APIs that let you connect it to your other systems. It can connect very simply and easily with whatever sorts of other data sources you have coming in or outputs that you want to put in place. And that really adds to its longevity.
We also have plans to continue to expand this platform, and grow it, and provide the SaaS service from multiple geographies. So that’s going to help us keep up with the geographic growth that’s a natural part of all sorts of network systems and businesses.
We also intend to continue adding more functionality. Some of the things that we’re looking at now are to add deeper and richer and more-specific security-type investigations, alerts, and anomaly detection. We’re in the process of enhancing our ability to automatically recognize departures from normal, so we can understand the anomalies when they come up.
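One simple way to picture "recognizing departures from normal" is to compare each new traffic sample against a baseline built from the samples just before it. The Python sketch below does that with a trailing-window mean and standard deviation. It is a generic textbook approach under assumed parameters, not a description of Kentik's detection method, and the traffic values, window size, and threshold are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indexes of samples far from the trailing-window baseline.

    A sample is flagged when it differs from the mean of the previous
    `window` samples by more than `threshold` standard deviations.
    """
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: treat any change at all as a departure.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies


# Hypothetical traffic samples in Mbps, with one obvious spike at index 7.
traffic_mbps = [100, 102, 98, 101, 99, 100, 103, 400, 101, 99]

if __name__ == "__main__":
    print(find_anomalies(traffic_mbps))
```

Note that once the spike enters the trailing window it inflates the baseline for the next few samples, which is one reason production systems tend to use more robust baselines than this.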
We’re also enhancing our ability to directly integrate with external systems to trigger responses to problems. You can automatically hand off data to another system for action, or even set up an automated action if you think it’s warranted. We have a number of customers who are already doing that, using our system to monitor and recognize problems, and then automatically taking corrective action with other related systems.
The absolute explosion in the number of connected devices means that there’s going to be a huge increase in the total traffic that’s using the Internet, the sources that need to be tracked, and the behavior that will need to be understood and characterized as normal versus abnormal. There are going to be some really interesting challenges coming along with that. And we’re going to be in a really good position to help, because of the scalability that we already have within our architecture.
Want to learn more about the industry’s only purpose-built Big Data SaaS for network traffic analysis? See for yourself what you can do with Kentik by signing up for a free trial. Need a more basic overview of NetFlow analysis? Check out NetFlow analysis in the Kentipedia.