NetFlow offers a great way to preserve highly useful traffic analysis and troubleshooting details without needing to perform full packet capture. In this post, we look at how NetFlow monitoring solutions quickly evolved into commercial product offerings and discuss how cloud and big data improve NetFlow analysis.
NetFlow is a protocol that was originally developed by Cisco to help network operators gain a better understanding of their network traffic conditions. Once NetFlow is enabled on a router or other network device, it tracks unidirectional statistics for each unique IP traffic flow, without storing any of the payload data carried in that session. By tracking only the metadata about the flows, NetFlow offers a way to preserve highly useful traffic analysis and troubleshooting details without needing to perform full packet capture — the latter of which can be very expensive and yield few incremental benefits.
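To make the idea of per-flow, metadata-only tracking concrete, here is a minimal sketch in Python. It assumes a simplified 5-tuple flow key (real exporters typically key on additional fields such as ToS and input interface), and the names `FlowKey`, `FlowStats` and `FlowCache` are hypothetical, chosen only for illustration. Note that the two directions of a session appear as two separate flows, and that no payload is ever stored.

```python
from dataclasses import dataclass
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Simplified flow identity (an assumption: a plain 5-tuple)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int


@dataclass
class FlowStats:
    """Counters kept per flow; only metadata, never payload."""
    packets: int = 0
    octets: int = 0
    first_seen: float = 0.0
    last_seen: float = 0.0


class FlowCache:
    """Hypothetical in-memory flow table, keyed by FlowKey."""

    def __init__(self):
        self.flows = {}

    def observe(self, key, length, timestamp):
        # Create the entry on the first packet of a flow, then update counters.
        stats = self.flows.get(key)
        if stats is None:
            stats = FlowStats(first_seen=timestamp)
            self.flows[key] = stats
        stats.packets += 1
        stats.octets += length
        stats.last_seen = timestamp


cache = FlowCache()
forward = FlowKey("10.0.0.1", "10.0.0.2", 51000, 443, 6)
reverse = FlowKey("10.0.0.2", "10.0.0.1", 443, 51000, 6)
cache.observe(forward, 1500, 1.0)
cache.observe(forward, 400, 1.5)
cache.observe(reverse, 60, 1.6)  # the reply direction is tracked as its own flow
```

After these three packets the cache holds two flows: the forward flow with 2 packets and 1900 octets, and the reverse flow with 1 packet and 60 octets.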
NetFlow monitoring solutions quickly evolved into commercial product offerings built from three main components:
- NetFlow exporter: A NetFlow-enabled router, switch, probe or host software agent that tracks key statistics and other information about IP packet flows and generates flow records that are encapsulated in UDP and sent to a flow collector.
- NetFlow collector: An application that receives flow record packets from one or more flow exporters, ingests and pre-processes the data they carry, and stores the resulting flow records.
- NetFlow analyzer: An analysis application that provides tabular, graphical and other tools and visualizations to enable network operators and engineers to analyze flow data for various use cases, including network performance monitoring, troubleshooting, identifying security threats and capacity planning.
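The exporter-to-collector handoff described above can be sketched as follows. This example assumes the exporter sends NetFlow version 5 datagrams (the post does not say which version is in use), whose fixed layout is a 24-byte header followed by 48-byte records. The function names `parse_v5` and `listen`, and the UDP port 2055, are illustrative assumptions rather than part of any particular product.

```python
import socket
import struct

# NetFlow v5 layout (network byte order): 24-byte header, 48-byte records.
V5_HEADER = struct.Struct("!HHIIIIBBH")
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")


def parse_v5(datagram):
    """Decode one NetFlow v5 export datagram into a list of dicts."""
    version, count, *_ = V5_HEADER.unpack_from(datagram, 0)
    if version != 5:
        raise ValueError(f"unsupported NetFlow version: {version}")
    records = []
    for i in range(count):
        offset = V5_HEADER.size + i * V5_RECORD.size
        f = V5_RECORD.unpack_from(datagram, offset)
        records.append({
            "src_ip": socket.inet_ntoa(f[0]),
            "dst_ip": socket.inet_ntoa(f[1]),
            "packets": f[5],
            "octets": f[6],
            "src_port": f[9],
            "dst_port": f[10],
            "protocol": f[13],
        })
    return records


def listen(host="0.0.0.0", port=2055):
    """Hypothetical collector loop: receive datagrams and yield flow records."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    while True:
        datagram, _exporter = sock.recvfrom(65535)
        yield from parse_v5(datagram)


# Build a sample datagram the way an exporter would, then decode it.
sample = V5_HEADER.pack(5, 1, 0, 0, 0, 0, 0, 0, 0) + V5_RECORD.pack(
    socket.inet_aton("10.0.0.1"), socket.inet_aton("10.0.0.2"),
    socket.inet_aton("0.0.0.0"), 1, 2, 10, 1500, 0, 0,
    51000, 443, 0, 0x18, 6, 0, 0, 0, 24, 24, 0,
)
decoded = parse_v5(sample)
```

A real collector would also need to handle template-based formats such as NetFlow v9 and IPFIX, which this sketch leaves out.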
Cisco started by providing NetFlow exporter functions in its various network products running Cisco’s IOS software. Cisco has since developed a vertical ecosystem of NetFlow partners, who have mainly focused on developing NetFlow collector and analysis applications to fill various network monitoring functions.
In addition to Cisco, other networking equipment vendors have developed NetFlow-like or compatible protocols, such as J-Flow from Juniper Networks or sFlow from InMon, to create exporter interoperability with third-party collector and analysis application vendors that also support NetFlow, creating a horizontal ecosystem across networking vendors.
The IETF also created a standard flow protocol format called IPFIX. It builds on Cisco's NetFlow but serves as an open, industry-driven standard, so that flow protocols can be enhanced consistently for the entire networking industry instead of being evolved unilaterally by Cisco.
NetFlow collector and analysis applications represent two key capabilities of NetFlow network monitoring products that are typically implemented on the same server. This is appropriate when the volume of flow data being generated by exporters is relatively low and localized. In cases where flow data generation is high or where sources are geographically dispersed, the collector function can be run on separate and geographically distributed servers (such as rackmount server appliances). In these cases, collectors then synchronize their data to a centralized analyzer server.
Products that support NetFlow components can be classified as follows (with example vendor products listed in each category):
NetFlow exporter support in a device:
- Cisco 10000 and 7200 routers
- Cisco Catalyst switches
- Juniper MX and PTX series routers (via IPFIX)
- Alcatel-Lucent routers
- Huawei routers
- Enterasys switches
- Flowmon probes
- Linux devices
- VMware servers
Stand-alone NetFlow collector:
- SevOne NetFlow Collector
- NetFlow Optimizer (NetFlow Logic)
Stand-alone NetFlow analyzer:
- SolarWinds NetFlow Traffic Analyzer (NTA)
- PRTG Network Monitor
- ManageEngine NetFlow Analyzer
Bundled NetFlow collector and analyzer:
- Arbor Networks Peakflow
- Plixer Scrutinizer
Open source NetFlow network monitoring:
Network monitoring products that focus on machine and probe data:
- Sumo Logic
- Cisco Tetration
Open source network monitoring tools can seem like a good option, but they are very difficult to scale horizontally, and most treat IP addresses as nothing more than text, so they cannot aggregate traffic by prefix or make use of routing data.
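To illustrate why treating addresses as text is limiting, the sketch below uses Python's standard `ipaddress` module to roll traffic up by prefix. The function name `aggregate_by_prefix`, the /24 prefix length and the sample flows are assumptions made for the example.

```python
import ipaddress
from collections import Counter


def aggregate_by_prefix(flows, prefix_len=24):
    """Sum octets per source prefix; flows is an iterable of (src_ip, octets)."""
    totals = Counter()
    for src_ip, octets in flows:
        # strict=False masks off the host bits, giving the covering network.
        network = ipaddress.ip_network(f"{src_ip}/{prefix_len}", strict=False)
        totals[network] += octets
    return totals


flows = [("192.0.2.10", 1000), ("192.0.2.200", 500), ("198.51.100.7", 300)]
totals = aggregate_by_prefix(flows)

# As text, "10.0.0.9" sorts after "10.0.0.10"; as addresses the order is correct.
text_order_wrong = "10.0.0.9" > "10.0.0.10"
addr_order_right = ipaddress.ip_address("10.0.0.9") < ipaddress.ip_address("10.0.0.10")
```

Here the two hosts in 192.0.2.0/24 collapse into one entry with 1500 octets, something a plain string comparison cannot express.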
Network incident monitoring vendors like Splunk collect a lot of machine and probe data. Many vendors in this product category are seeing the value of integrating NetFlow. However, these platforms are designed primarily to deal with unstructured data like logs. Highly structured data like NetFlow often contains fields with formats that require translation or correlation with other data sources to provide value to the end user.
Pushing NetFlow Limits
With DDoS attacks on the rise, NetFlow has been increasingly used to identify these threats. NetFlow is most effective for DDoS troubleshooting when sufficient flow record detail is available and can be compared with other data points such as performance metrics, routing and location. Unfortunately, until recently even state-of-the-art NetFlow analysis tools have struggled to troubleshoot effectively, because they reduce the data they retain. The volume of NetFlow data can be overwhelming: on large networks it can reach millions of flows per second per collector.
Since most NetFlow collectors and analysis tools are based on scale-up software architectures hosted on single servers or appliances, they have extremely limited storage, compute and memory capacity. As a result, it is common practice to roll up the details into a series of summary reports and to discard the raw flow records. The problem with this approach is that most of the detail needed for operationally useful troubleshooting is lost. This is particularly true when attempting to perform dynamic baselining, which requires scanning massive amounts of NetFlow data to understand what is normal, then looking back days, weeks or months in order to assess whether current conditions are the result of a DDoS attack or an anomaly.
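As a rough illustration of baselining, the sketch below compares a current traffic measurement against a history of earlier measurements. It assumes a deliberately simple rule (flag anything more than `k` standard deviations above the historical mean); this is not a description of how any particular product builds its baselines, and the function name and sample values are hypothetical.

```python
import statistics


def is_anomalous(history, current, k=3.0):
    """Return True if current exceeds the historical mean by more than k sigma."""
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history)
    return current > mean + k * stdev


# Hypothetical per-interval traffic volumes for one destination (e.g. Mbps).
history = [100, 110, 90, 105, 95]

spike = is_anomalous(history, 500)   # far above the usual range
normal = is_anomalous(history, 115)  # within normal variation
```

The quality of such a check depends on how much history is available, which is why discarding raw records in favor of summaries makes baselining harder.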
How Cloud and Big Data Improve NetFlow Analysis
Cloud-scale computing and big data techniques have opened up a great opportunity to improve both the cost and functionality of NetFlow analysis and troubleshooting use cases. These techniques include:
- Big data storage allows for the storage of huge volumes of augmented raw flow records instead of needing to roll up the data to predefined aggregates that severely restrict analytical options.
- Cloud-based SaaS options save network managers the CapEx and OpEx costs of dedicated, on-premises appliances.
- Scale-out NetFlow analysis can deliver faster response times to operational analysis queries on larger data sets than traditional appliances.
The key to solving the DDoS protection accuracy issue is big data. By using a scale-out system with far more compute and memory resources, a big data approach to DDoS protection can continuously scan network-wide data on a multi-dimensional basis without constraints.
Cloud-scale big data systems make it possible to implement a far more intelligent approach to the problem, since they are able to:
- Track and baseline millions of IP addresses across network-wide traffic, rather than being restricted to device-level traffic baselining.
- Monitor for anomalous traffic using multiple data dimensions such as the source geography of the traffic, destination IPs, and common attack ports. This allows for greater flexibility and precision in setting detection policies.
- Apply learning algorithms to automate the upkeep of detection policies to include all relevant destination IPs.
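The benefit of keeping raw flow records can be sketched with a small query helper that groups traffic by whatever dimensions the operator picks at analysis time, rather than by aggregates fixed in advance. The `top_by` function, the record fields and the placeholder country codes are all assumptions made for illustration.

```python
from collections import defaultdict


def top_by(records, dimensions, metric="octets"):
    """Total a metric per combination of dimensions, largest first."""
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[d] for d in dimensions)
        totals[key] += record[metric]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical enriched flow records ("AA"/"BB" are placeholder country codes).
records = [
    {"src_country": "AA", "dst_ip": "203.0.113.5", "dst_port": 53, "octets": 5000},
    {"src_country": "BB", "dst_ip": "203.0.113.5", "dst_port": 53, "octets": 4000},
    {"src_country": "AA", "dst_ip": "203.0.113.9", "dst_port": 443, "octets": 1200},
]

by_target = top_by(records, ("dst_ip", "dst_port"))
by_source = top_by(records, ("src_country",))
```

The same records answer both "which destination and port is receiving the most traffic" and "which source geography is sending the most", which is not possible once the data has been reduced to a single predefined roll-up.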
Kentik Detect has the functional breadth to capture all the necessary network telemetry in a big data repository and to isolate even the most obscure DDoS attacks and network events, whether as they happen or as predicted for the future. Network visibility using NetFlow is key to managing your network and ensuring the best possible security measures. To understand more about NetFlow, see this Kentipedia article and blog post. To see how Kentik Detect can help your organization monitor and adjust to network capacity patterns and stop DDoS threats, read this blog, request a demo or sign up for a free trial today.