
Cloud-Scale Visibility Tools: The Right Stuff

Stephen Collins, Principal Analyst, ACG Research


Summary

Digital transformation is not for the faint of heart. In this post, ACG analyst Stephen Collins discusses why it’s critical for ITOps, NetOps, SecOps and DevOps teams to make sure they have the right stuff and are properly equipped for the network visibility challenges they face.


This series of guest posts has concentrated on the numerous challenges facing enterprise IT managers as businesses embrace digital transformation and migrate IT applications from private data centers into the cloud. The recurring theme has been the critical need for new tools and technologies for gaining visibility into cloud-scale applications, infrastructure and networks. In this final post, I would like to expand on that theme.

The scope of cloud-scale visibility is daunting and technically demanding. Monitoring needs to span multiple domains: the private enterprise data center and WAN; fixed and mobile service provider networks; the public Internet; and hybrid multi-cloud infrastructure. Full-stack visibility is essential: it must cover application software, computing infrastructure, the virtual network layers and the physical underlay networks beneath them.

Network and computing infrastructure are increasingly software-driven, allowing for extensive, full-stack software instrumentation that provides the monitoring metrics used to generate KPIs. Software probes and agents that can be easily installed and spun up on demand are displacing costly hardware probes that need to be physically deployed by on-site technicians. Active monitoring techniques now play a key role in tracking the performance of cloud-based applications accessed via the Internet. These include synthetic monitoring, which simulates user application traffic flows to proactively detect problems before they impact a large number of users.
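To make the idea of synthetic monitoring concrete, here is a minimal Python sketch. It assumes a simulated user transaction can be expressed as a callable that returns true on success; the names `run_probe`, `should_alert` and the thresholds are hypothetical choices for illustration, not part of any particular product.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ProbeResult:
    ok: bool            # did the simulated transaction succeed?
    latency_ms: float   # wall-clock time the attempt took


def run_probe(transaction: Callable[[], bool], attempts: int = 5) -> List[ProbeResult]:
    """Run a simulated user transaction several times and time each attempt."""
    results = []
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            ok = bool(transaction())
        except Exception:
            ok = False  # any exception counts as a failed transaction
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        results.append(ProbeResult(ok=ok, latency_ms=elapsed_ms))
    return results


def should_alert(results: List[ProbeResult],
                 max_failure_ratio: float = 0.2,
                 max_latency_ms: float = 500.0) -> bool:
    """Alert if too many attempts failed or the worst latency exceeds the budget.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    if not results:
        return True  # no data at all is treated as a problem
    failures = sum(1 for r in results if not r.ok)
    worst_latency = max(r.latency_ms for r in results)
    return failures / len(results) > max_failure_ratio or worst_latency > max_latency_ms
```

In practice the `transaction` callable might wrap an HTTP request or a scripted login against the monitored application. Keeping it injectable means the same probe logic can run from many vantage points on a schedule.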

Performance metrics and other types of monitoring data can be collected in real time using streaming telemetry carried over protocols such as gRPC. At the network layer, streaming telemetry is displacing SNMP polling and CLI screen scraping for gaining visibility into state information. Now that support for NetFlow, sFlow and IPFIX is commonplace in routers and switches, flow metadata is a readily available source of telemetry for real-time visibility into network traffic flows across all monitoring domains.
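As a rough illustration of what flow metadata makes possible, the Python sketch below ranks "top talkers" by bytes sent. The `FlowRecord` fields are a simplified, assumed subset of what NetFlow, sFlow or IPFIX exporters typically report, and the sample addresses come from documentation ranges rather than real traffic.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class FlowRecord:
    """Simplified flow metadata; real exports carry many more fields."""
    src_ip: str
    dst_ip: str
    dst_port: int
    protocol: str
    byte_count: int


def top_talkers(flows: Iterable[FlowRecord], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n source addresses that sent the most bytes."""
    totals: Counter = Counter()
    for flow in flows:
        totals[flow.src_ip] += flow.byte_count
    return totals.most_common(n)


# Made-up sample records for illustration only.
SAMPLE_FLOWS = [
    FlowRecord("10.0.0.1", "192.0.2.10", 443, "tcp", 1200),
    FlowRecord("10.0.0.2", "192.0.2.10", 443, "tcp", 300),
    FlowRecord("10.0.0.1", "198.51.100.7", 53, "udp", 80),
    FlowRecord("10.0.0.3", "192.0.2.10", 443, "tcp", 5000),
]

print(top_talkers(SAMPLE_FLOWS, n=2))
# -> [('10.0.0.3', 5000), ('10.0.0.1', 1280)]
```

The same grouping could just as easily key on destination port or protocol, which is what makes flow records useful across monitoring domains.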

Network data is big data. The collection of massive amounts of streaming telemetry requires a high-speed data pipeline for ingesting data in real time and distributing it to the appropriate monitoring and analytics tools. Highly scalable Kafka clusters that utilize a publish/subscribe model are a commonly deployed pipeline solution, supplying telemetry data to multiple consumer analytics engines and tools.
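The publish/subscribe model can be sketched in a few lines of Python. This is a toy, in-memory stand-in and not a Kafka client: the `TopicLog` class and its methods are hypothetical. It only illustrates the core idea that producers append to a shared log while each consumer group reads at its own pace by tracking its own offset.

```python
from collections import defaultdict
from typing import Any, Dict, List


class TopicLog:
    """Toy append-only log with an independent read offset per consumer group."""

    def __init__(self) -> None:
        self._messages: List[Any] = []
        self._offsets: Dict[str, int] = defaultdict(int)

    def publish(self, message: Any) -> int:
        """Append a message and return its offset in the log."""
        self._messages.append(message)
        return len(self._messages) - 1

    def poll(self, group: str, max_messages: int = 100) -> List[Any]:
        """Return the next unread messages for this group and advance its offset."""
        start = self._offsets[group]
        batch = self._messages[start:start + max_messages]
        self._offsets[group] = start + len(batch)
        return batch


telemetry = TopicLog()
for sample in ({"if": "eth0", "bps": 900}, {"if": "eth1", "bps": 120}, {"if": "eth0", "bps": 950}):
    telemetry.publish(sample)

# Two analytics tools consume the same stream without interfering with each other.
alerting_batch = telemetry.poll("alerting", max_messages=2)
storage_batch = telemetry.poll("storage")
```

Because each group keeps its own position, a slow batch consumer does not hold back a real-time alerting engine reading the same telemetry.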

Streaming analytics engines consume and process data for generating operational insights in real time. Column-oriented databases ingest data to support near real-time multi-dimensional analytics for correlating a wide range of time series data types. Machine learning engines analyze huge data sets to discover correlations and trends that might be impossible for operators to discern using traditional monitoring techniques. Hadoop-based data lakes support offline batch processing on massive amounts of data for gaining business intelligence insights.
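One common building block of streaming analytics is windowed aggregation. The Python sketch below groups time series samples into fixed, non-overlapping (tumbling) windows and sums a metric per dimension. The tuple layout, the 60-second window and the function name are assumptions made for illustration; production engines add concerns such as late-arriving data and state eviction that are omitted here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (timestamp in seconds, dimension value such as an interface name, metric value)
Sample = Tuple[float, str, float]


def tumbling_window_sums(samples: Iterable[Sample],
                         window_seconds: int = 60) -> Dict[Tuple[int, str], float]:
    """Sum the metric per (window start, dimension) over fixed-size windows."""
    sums: Dict[Tuple[int, str], float] = defaultdict(float)
    for timestamp, dimension, value in samples:
        # Align each sample to the start of the window that contains it.
        window_start = int(timestamp // window_seconds) * window_seconds
        sums[(window_start, dimension)] += value
    return dict(sums)


samples = [
    (0, "eth0", 10.0),
    (30, "eth0", 5.0),
    (59.9, "eth1", 1.0),
    (61, "eth0", 7.0),
]
print(tumbling_window_sums(samples))
# -> {(0, 'eth0'): 15.0, (0, 'eth1'): 1.0, (60, 'eth0'): 7.0}
```

Adding more dimensions to the grouping key (for example site, device and application) is the essence of the multi-dimensional analytics described above.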

While open source Big Data software is freely available, many enterprise IT organizations can’t sustain the investment needed to develop Big Data monitoring and analytics tools in-house. Others would simply prefer to rely on the vendor community to supply fully supported, productized solutions based on open source.

Big Data was born in the cloud and Big Data analytics is well-suited for cloud-based deployments. SaaS-based Big Data analytics solutions are also an attractive option for organizations seeking a productized solution with low upfront costs, no on-site installation required and minimal ongoing maintenance.

I conclude by referencing a quote often attributed to astronaut John Glenn, someone who unquestionably had “the right stuff.” Nobody is asking IT managers to do something as outrageously risky as “sitting on top of 2 million parts, all built by the lowest bidder on a government contract.” But digital transformation is not for the faint of heart, so it’s critical that ITOps, NetOps, SecOps and DevOps teams make sure they have the right stuff and are properly equipped for the challenges they face.
