Network Performance Monitoring Outcomes

Network Performance Monitoring (NPM) refers to the process of measuring, diagnosing and optimizing the service quality of a network as experienced by users. NPM requires multiple types of measurement or monitoring data on which engineers can perform diagnoses and analyses, such as:

  • Bandwidth: Measures the raw versus available maximum rate at which information can be transferred through various points of the network, or along a network path.
  • Throughput: Measures how much information is being or has been transferred.
  • Latency: Measures network delays from the perspective of clients, servers and applications.
  • Errors: Measures raw numbers and percentages of errors such as bit errors, TCP retransmissions, and out-of-order packets.
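
As a rough illustration of how these measurements relate, the sketch below derives throughput, link utilization, and error percentages from counters sampled over one interval. The IntervalCounters type, its field names, and the sample values are hypothetical and chosen only for the example; they are not taken from any particular NPM product.

```python
from dataclasses import dataclass


@dataclass
class IntervalCounters:
    """Hypothetical counters sampled over one measurement interval."""
    seconds: float            # length of the interval
    bytes_transferred: int    # bytes observed during the interval
    link_capacity_bps: float  # raw link bandwidth, in bits per second
    packets_sent: int
    retransmits: int
    out_of_order: int


def throughput_bps(c: IntervalCounters) -> float:
    """Throughput: how much was actually transferred, in bits per second."""
    return c.bytes_transferred * 8 / c.seconds


def utilization_pct(c: IntervalCounters) -> float:
    """Share of the raw bandwidth consumed by the observed throughput."""
    return 100.0 * throughput_bps(c) / c.link_capacity_bps


def error_pct(count: int, total: int) -> float:
    """Errors as a percentage of total packets (0.0 when nothing was sent)."""
    return 100.0 * count / total if total else 0.0


# Illustrative values: 125 MB moved in 10 seconds over a 1 Gbps link.
sample = IntervalCounters(
    seconds=10,
    bytes_transferred=125_000_000,
    link_capacity_bps=1e9,
    packets_sent=100_000,
    retransmits=250,
    out_of_order=50,
)

print(f"throughput:  {throughput_bps(sample) / 1e6:.1f} Mbps")
print(f"utilization: {utilization_pct(sample):.1f}%")
print(f"retransmits: {error_pct(sample.retransmits, sample.packets_sent):.2f}%")
print(f"out of order: {error_pct(sample.out_of_order, sample.packets_sent):.2f}%")
```

With these sample values, throughput works out to 100 Mbps, or 10% of the raw bandwidth, with a 0.25% retransmission rate. Latency is not shown because it has to be measured per transaction rather than derived from interval counters.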

Common Use Cases

Application performance optimization – Monitor and troubleshoot performance issues for networked and distributed applications:

  • Monitor HTTP and database calls for three-tier networked applications. Understand whether application performance issues are related to network factors. Resolve performance issues.
  • Evaluate complex network API communications for highly distributed applications:
    – recognize and diagnose emergent performance issues;
    – measure relative performance of API partners for vendor selection.
  • Guide decisions on distributed application architecture, such as when to locally cache network API calls. 

Datacenter traffic management – Monitor intra- and inter-datacenter performance issues. Isolate and troubleshoot infrastructure root causes.

Internet traffic management – Make efficient routing decisions by monitoring performance across hops (first, second, and third) and destination ASNs and geographies. Quickly and cost-effectively bypass network roadblocks by serving traffic from alternate PoPs or via alternate first-hop ASNs.
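
One simple way to picture the PoP decision is as a comparison of latency samples per candidate PoP toward a given destination. The sketch below picks the PoP with the lowest median latency. The function name, the PoP labels, and the sample values are hypothetical, and a real routing decision would also weigh cost, capacity, and loss.

```python
from statistics import median


def best_pop(samples_ms):
    """Return the PoP with the lowest median latency, or None if no data.

    samples_ms maps a PoP name to a list of latency samples in milliseconds
    toward one destination (for example, one destination ASN).
    """
    # Ignore PoPs that have no samples for this destination.
    medians = {pop: median(vals) for pop, vals in samples_ms.items() if vals}
    if not medians:
        return None
    return min(medians, key=medians.get)


# Illustrative samples; the median keeps a single outlier (95 ms) from
# dominating the comparison.
latency_to_destination = {
    "pop-a": [40, 42, 95],
    "pop-b": [30, 31, 33],
    "pop-c": [],
}

print(best_pop(latency_to_destination))  # pop-b
```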

Cloud networking – Monitor relative quality of IaaS and other cloud providers to guide network connectivity architecture, vendor selection, and contract negotiation.

Network change and new deployment validation – Provide instant visibility for network changes and new deployments when building or changing applications, servers, network elements, circuits, or peering/transit.

Moving Past Traditional NPM Approaches

NPM solutions have traditionally utilized an appliance deployment model.  An appliance-based PCAP probe with one or more interfaces connects to router or switch SPAN ports or to an intervening packet broker device (such as those offered by Gigamon or Ixia).  The appliance records all packets passing across the interface into memory and then into longer-term storage.  In virtualized datacenters, virtual probes may be used, but they are also dependent on network links in one form or another.

Physical and virtual appliances are costly from a hardware and (in the case of commercial solutions) software licensing point of view.  As a result, in most cases, it is only fiscally feasible to deploy PCAP probes to a few selected points in the network.  In addition, the appliance deployment model was developed based on pre-cloud assumptions of centralized datacenters holding relatively monolithic application instances.  As cloud and distributed application models have proliferated, the appliance model for packet capture has become less feasible, because in many cloud hosting environments, there is no way to deploy even a virtual appliance.

A cloud-friendly and highly scalable SaaS model for network performance monitoring splits the monitoring function from the storage and analysis functions.  Monitoring is accomplished with the deployment of lightweight monitoring software agents that export PCAP-based statistics gathered on servers and open source proxy servers such as HAProxy and NGINX.  Exported statistics are sent to a SaaS repository that scales horizontally to store unsummarized data and provides Big Data-based analytics for alerting, diagnostics and other use cases.  While host-based performance metric export doesn’t provide the full granularity of raw PCAP, it provides a highly scalable and cost-effective method for ubiquitously gathering, retaining and analyzing key performance data, and thus complements PCAP.
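
The split between monitoring and storage can be sketched as an agent that reduces observed packets to compact per-flow statistics and serializes them for export. The example below is a minimal illustration only: the packet dictionaries, field names, and JSON record layout are assumptions made for the example, not the format used by any specific agent or SaaS repository, and the actual capture and network transport steps are omitted.

```python
import json
from collections import defaultdict


def summarize(packets):
    """Collapse per-packet observations into per-flow statistics.

    Each packet is a dict with hypothetical keys: src, dst, dport, length,
    and an optional retransmit flag.
    """
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0, "retransmits": 0})
    for p in packets:
        stats = flows[(p["src"], p["dst"], p["dport"])]
        stats["packets"] += 1
        stats["bytes"] += p["length"]
        stats["retransmits"] += 1 if p.get("retransmit") else 0
    # Emit one flat record per flow, in a stable order.
    return [
        {"src": src, "dst": dst, "dport": dport, **stats}
        for (src, dst, dport), stats in sorted(flows.items())
    ]


# Illustrative observations, as a host agent might see them.
observed = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "dport": 443, "length": 1500},
    {"src": "10.0.0.1", "dst": "10.0.0.2", "dport": 443, "length": 1500,
     "retransmit": True},
    {"src": "10.0.0.3", "dst": "10.0.0.2", "dport": 80, "length": 400},
]

records = summarize(observed)
payload = json.dumps(records)  # what an agent might hand to its exporter
print(payload)
```

Three packets become two flow records here. The payload size grows with the number of flows rather than the number of packets, which is what makes this kind of export cheaper to ship and retain than raw PCAP, at the cost of per-packet detail.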

Kentik offers the industry’s only Big Data-based, SaaS NPM solution that integrates nProbe host agent performance metrics and billions of NetFlow, sFlow, IPFIX, and BGP records matched with geolocation data.  Visit the Kentik NPM solution page to get an overview, read the blog post to see how it works, or start a free trial.