NetFlow is a network protocol developed by Cisco Systems for recording statistical, infrastructure, routing, and other information about IP traffic flows traversing a NetFlow-enabled router or switch. A NetFlow collector is one of three typical functional components used for NetFlow analysis:

  • NetFlow Exporter: a NetFlow-enabled router, switch, probe or host software agent that tracks key statistics and other information about IP packet flows and generates flow records that are encapsulated in UDP and sent to a flow collector.
  • NetFlow Collector: an application responsible for receiving flow record packets, then ingesting, pre-processing and storing the flow records from one or more flow exporters (a minimal sketch of the receiving side follows this list).
  • NetFlow Analyzer: a software application that provides tabular, graphical and other visualization tools that enable network operators and engineers to analyze flow data for various use cases, including network performance monitoring, troubleshooting, and capacity planning.
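
To make the exporter-to-collector handoff concrete, here is a minimal sketch of the receiving side in Python: a UDP socket that accepts export packets from any exporter.  The port number is an assumption; 2055 is commonly used for NetFlow export, but the port is configurable on the exporter.

```python
import socket

# 2055 is a commonly used NetFlow export port, but this is an
# assumption; exporters can be configured to send to any port.
LISTEN_ADDR = ("0.0.0.0", 2055)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(LISTEN_ADDR)

while True:
    # Each datagram is one export packet: a header followed by flow records.
    packet, (exporter_ip, exporter_port) = sock.recvfrom(65535)
    print(f"received {len(packet)}-byte export packet from {exporter_ip}")
```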

A NetFlow Collector’s main functions include:

  • Ingesting UDP datagrams containing flow records from one or more NetFlow-enabled devices
  • Unpacking binary flow data into text/numeric formats (see the parsing sketch after this list)
  • Performing data volume reduction through selective filtering and aggregation
  • Storing the resulting data in flat files or a SQL database
  • Synchronizing flow data to the NetFlow Analyzer application running on a separate computing resource
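
As an illustration of the ingest and unpacking steps, the sketch below parses NetFlow v5 export packets, chosen because v5 uses a fixed, publicly documented record layout (v9 and IPFIX are template-based and require more machinery).  Filtering, aggregation and error handling are omitted; field names follow Cisco's v5 packet documentation.

```python
import socket
import struct

V5_HEADER = struct.Struct("!HHIIIIBBH")                 # 24-byte v5 packet header
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")   # 48-byte v5 flow record

def parse_v5(packet):
    """Unpack one NetFlow v5 export packet into a list of flow dicts."""
    (version, count, sys_uptime, unix_secs, unix_nsecs,
     flow_sequence, engine_type, engine_id, sampling) = V5_HEADER.unpack_from(packet, 0)
    if version != 5:
        return []  # not v5; v9/IPFIX use template-based formats instead
    flows = []
    for i in range(count):
        fields = V5_RECORD.unpack_from(packet, V5_HEADER.size + i * V5_RECORD.size)
        (src, dst, nexthop, in_if, out_if, pkts, octets, first, last,
         sport, dport, _pad1, tcp_flags, proto, tos,
         src_as, dst_as, src_mask, dst_mask, _pad2) = fields
        flows.append({
            "src": socket.inet_ntoa(src), "dst": socket.inet_ntoa(dst),
            "sport": sport, "dport": dport, "proto": proto,
            "packets": pkts, "bytes": octets,
        })
    return flows
```

The resulting dictionaries would typically feed the filtering and aggregation stage, for example summing bytes and packets per source/destination pair over a time window before anything is written to storage.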

The NetFlow Collector and Analyzer are two functions of a NetFlow analysis system or product.  Some products implement both functions on the same server, which is appropriate when the volume of flow data generated by exporters is relatively low and localized.  Where flow data volume is high or sources are geographically dispersed, the collector function can instead run on separate, geographically distributed servers (such as rackmount server appliances), which then synchronize their data to a centralized analyzer server.
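
In either deployment, the collector persists pre-processed flows locally for the analyzer to query or pull.  As a minimal sketch of that storage step, the example below writes aggregated flow summaries to SQLite; both the backend and the schema are illustrative assumptions, since real collectors use whatever flat-file or database layout their analyzer expects.

```python
import sqlite3

# Illustrative schema only; real products define their own storage layouts.
conn = sqlite3.connect("flows.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS flows (
        window_start INTEGER,   -- Unix timestamp of the aggregation window
        src TEXT, dst TEXT, proto INTEGER,
        packets INTEGER, bytes INTEGER
    )
""")

def store_flows(window_start, flows):
    """Persist one aggregation window of flow summaries."""
    conn.executemany(
        "INSERT INTO flows VALUES (?, ?, ?, ?, ?, ?)",
        [(window_start, f["src"], f["dst"], f["proto"],
          f["packets"], f["bytes"]) for f in flows],
    )
    conn.commit()
```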

Historically, the most common way to run NetFlow collectors was on a physical, rackmounted Intel-based server running a Linux OS variant.  More recently, flow collectors have been deployed on virtual machines.  Unfortunately, in either case, compute and storage constraints severely limit the amount of detailed data that can be retained and analyzed.

Most recently, a unified, cloud-scale approach to NetFlow collector and analyzer architecture has emerged.  In this architecture, a horizontally scalable big data system replaces physical or virtual collector and analyzer appliances.  Big data systems allow for dramatically higher ingest volumes, longer data retention, deeper analytics and more powerful anomaly detection.  To learn more about big data NetFlow analysis, visit the Kentik Detect overview page.