Identifying Idle Paths in a Data Center Leaf-Spine Fabric

Phil Gervasi, Director of Tech Evangelism

June 3, 2025 | Data Center Networking, Network Infrastructure


Summary

In leaf-spine data center networks, traffic often becomes imbalanced, leaving some uplinks idle and resulting in wasted bandwidth. Kentik helps engineers identify underutilized paths, diagnose the causes, and take corrective action using enriched telemetry, visual topology maps, and intelligent alerts, turning hidden inefficiencies into actionable insights.

A well-designed leaf-spine topology distributes traffic across many links, but in real life, those links rarely share the traffic load equally. Hot spots form on a few spines while others sit almost idle, leaving expensive bandwidth unused and forcing premature upgrades. The imbalance also risks performance problems when a sudden surge causes congestion on links that are already oversubscribed. Kentik’s network intelligence platform gives network engineers a data-driven way to find those idle paths, understand why they’re idle, and use the answer to inform a new routing policy or a different oversubscription ratio.

I made a short video discussing this common data center networking challenge:

In a perfect leaf-spine network, traffic is spread evenly across all available links. But in reality, that's rarely the case. So this means there are usually idle paths ultimately connected to very expensive switches, even when you break it down by cost per port. With Kentik's network intelligence platform, engineers can easily identify these idle paths and then fix them before they become major problems.

Kentik starts by ingesting detailed telemetry like flow records, SNMP, streaming telemetry, and metadata about your organization, and builds a real-time visual map of the traffic running in your fabric. In Data Explorer, you can zero in on your busiest interfaces as well as your least utilized ones. You can also compare different time windows to spot persistent imbalances, not just momentary blips. Those imbalances could be because of ECMP hashing problems, maybe an elephant flow being pinned incorrectly, or maybe it's just poor oversubscription planning. Either way, with Kentik Network Intelligence, now you know.

Manual filtering is great when we need to answer a quick question or do some troubleshooting. That does happen. But what's even better is to set alerts based on those queries, or on whatever metrics, variables, and thresholds are important to you, and then integrate them into your ticketing and messaging systems. For example, you can create an alert for data center uplinks that stay under one percent utilization for more than five minutes, or whatever time period makes sense for you. Then when it happens, Kentik notifies you. From there, you can test a new ECMP policy, adjust routing, rebalance VLANs, or do whatever best eliminates those idle paths and makes the most of your infrastructure.

Kentik makes hidden data center inefficiencies like idle paths visible and then turns that visibility into action that you can use to optimize your data center fabric. Visit kentik.com/datacenter to learn more.

… but read on to learn more about how Kentik can help. (And you can find more quick overviews like this in our Kentik Bytes video series.)

Mapping the fabric

Kentik starts by ingesting high-resolution flow records (NetFlow, sFlow, IPFIX), SNMP counters, BGP tables, and, in many organizations, gNMI-streamed interface stats from both leaf and spine switches. During ingest, the Kentik Data Engine enriches this data with information such as the role of each leaf or spine in the architecture, which is taken from LLDP and user-defined labels.

A visual representation of the fabric is generated in the Kentik Map as a tiered graph, with leaves at the bottom and spines at the top. Every 25 Gbps or 100 Gbps uplink is shown as an edge. In the graphic below, you can see a representation of our Dallas data center, featuring ToR switches at the bottom and spine and super-spine switches above. Hovering over an element lets you dig deeper into the device details, the topology, or the specific connections.

Spine leaf example of Dallas data center

This topology context matters because it allows any query to focus on “traffic that traverses the spine tier” without needing to guess interface names. We can also use this to drill into specific device metrics to get a quick glance of average interface utilization, CPU and memory usage, and so on.
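To make the idea of role-based filtering concrete, here is a minimal sketch in Python. It is not Kentik's implementation or API: the `ROLES` table, the flow record fields, and both function names are hypothetical, and stand in for whatever labels come from LLDP or user-defined tags. The point is only that once each record carries a role, a query can select a tier without matching interface names.

```python
# Hypothetical role labels, standing in for LLDP-derived or user-defined tags.
ROLES = {
    "leaf-01": "leaf",
    "leaf-02": "leaf",
    "spine-01": "spine",
    "spine-02": "spine",
}

def enrich(flow, roles):
    """Return a copy of the flow record with the exporting device's role attached."""
    return {**flow, "role": roles.get(flow["device"], "unknown")}

def flows_on_tier(flows, roles, tier):
    """Keep only the flow records exported by devices in the given tier."""
    enriched = (enrich(f, roles) for f in flows)
    return [f for f in enriched if f["role"] == tier]

# Hypothetical flow records; field names are assumptions for illustration.
flows = [
    {"device": "leaf-01", "in_if": "Ethernet1", "bytes": 1200},
    {"device": "spine-01", "in_if": "Ethernet7", "bytes": 900},
    {"device": "spine-02", "in_if": "Ethernet3", "bytes": 50},
]

for record in flows_on_tier(flows, ROLES, "spine"):
    print(record["device"], record["bytes"])
```

Devices missing from the label table fall into an "unknown" role rather than raising an error, so unlabeled hardware shows up as a gap to fix instead of silently disappearing from results.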

A typical leaf-spine data center may use an oversubscription ratio of 3:1, 4:1, or even 5:1, which means there is variation link-by-link that also changes over time. If flows are pinned incorrectly or if the oversubscription ratio is too high, links sit idle, wasting resources in terms of cost-per-port, as well as incorrectly sized switching hardware, among other issues.
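For reference, a leaf's oversubscription ratio is its total server-facing capacity divided by its total spine-facing capacity. The short Python sketch below works through that arithmetic; the port counts are hypothetical and chosen only to produce a 3:1 example.

```python
from fractions import Fraction

def oversubscription_ratio(downlink_gbps, uplink_gbps):
    """Total server-facing capacity divided by total spine-facing capacity."""
    total_down = sum(downlink_gbps)
    total_up = sum(uplink_gbps)
    if total_up == 0:
        raise ValueError("leaf has no uplink capacity")
    return Fraction(total_down, total_up)

# Hypothetical leaf: 48 x 25 Gbps server ports and 4 x 100 Gbps uplinks.
ratio = oversubscription_ratio([25] * 48, [100] * 4)
print(f"{ratio.numerator}:{ratio.denominator}")  # prints 3:1
```

Note that this is a design-time figure. If one of the four uplinks carries almost no traffic, the leaf is effectively running on three, and the ratio actually in force is closer to 4:1.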

In this next image, notice that we can see what a leaf switch is connected to, which then forms the basis of our more in-depth queries in Kentik’s Data Explorer.

Spine leaf data center example with details

Identifying load imbalances

Data Explorer enables us to identify capacity on our links and filter for specific devices, timeframes, or traffic types as needed. In the image below, we have the result of a relatively straightforward query of our San Jose data center. We’ve filtered by device, site, source, and destination interface name, as well as average traffic, capacity, and other relevant factors. At even a quick glance, we can see which interfaces in this timeframe are averaging the most traffic and which are not seeing much at all.

Sankey view of data center showing interfaces with the most traffic

A single snapshot doesn’t tell us whether the imbalance is transient. Kentik stores telemetry so that we can compare traffic over previous time periods. That way, engineers can discover that the same small set of ECMP paths has been hogging traffic for weeks, a classic sign of elephant flow pinning or of a hashing bias toward specific spine IDs.
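Elephant flow pinning follows directly from how ECMP works: the switch hashes header fields of each flow and sends every packet of that flow down the same uplink, so one very large flow can dominate a single path. The Python simulation below is a simplified sketch of that behavior. The SHA-256 hash is only a stand-in (real switches use vendor-specific hardware hash functions), and all addresses, ports, and byte counts are made up.

```python
import hashlib

def ecmp_pick(five_tuple, n_paths):
    """Pick an uplink index by hashing the flow's 5-tuple (illustrative hash only)."""
    key = "|".join(str(field) for field in five_tuple).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths

def bytes_per_path(flows, n_paths):
    """Sum bytes per uplink; every packet of a flow follows the same path."""
    load = {path: 0 for path in range(n_paths)}
    for five_tuple, nbytes in flows:
        load[ecmp_pick(five_tuple, n_paths)] += nbytes
    return load

# Hypothetical traffic mix: 200 small flows of 1 MB each and one 5 GB elephant flow.
mice = [
    ((f"10.0.0.{i}", "10.0.1.1", 40000 + i, 443, "tcp"), 1_000_000)
    for i in range(200)
]
elephant = (("10.0.0.250", "10.0.1.9", 50000, 2049, "tcp"), 5_000_000_000)

load = bytes_per_path(mice + [elephant], n_paths=4)
for path, nbytes in sorted(load.items()):
    print(f"uplink {path}: {nbytes / 1e9:.2f} GB")
```

Whichever uplink the elephant flow hashes to carries at least 5 GB, while the 200 small flows together add only 0.2 GB across all four uplinks. The flow count per path can look balanced even though the byte count is not.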

In the following screenshot, notice that we are examining the same query, but over the previous three weeks.

Lastly, we can produce a comparison summary, shown in the following image.

In our example, we observe that there is very little to no deviation in the traffic volume on our links over the last three weeks, resulting in some links that do most of the work and others that sit almost completely idle.
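One simple way to put a number on "some links do most of the work" is the ratio of the busiest link's traffic to the mean across all links, checked in every time window rather than just one. The Python sketch below illustrates that idea. The metric, the 1.5 threshold, and the weekly figures are assumptions for illustration, not something Kentik prescribes.

```python
from statistics import mean

def imbalance(link_gbps):
    """Ratio of the busiest link to the mean; 1.0 means perfectly even."""
    avg = mean(link_gbps.values())
    return max(link_gbps.values()) / avg if avg > 0 else 0.0

def is_persistent(windows, threshold=1.5):
    """True only if every window exceeds the threshold, not just one snapshot."""
    return bool(windows) and all(imbalance(w) >= threshold for w in windows)

# Hypothetical average Gbps per spine uplink over three consecutive weeks.
weeks = [
    {"spine1": 60, "spine2": 58, "spine3": 1, "spine4": 1},
    {"spine1": 62, "spine2": 57, "spine3": 1, "spine4": 2},
    {"spine1": 59, "spine2": 60, "spine3": 2, "spine4": 1},
]

print([round(imbalance(w), 2) for w in weeks])
print("persistent imbalance:", is_persistent(weeks))
```

Requiring every window to exceed the threshold is what separates a structural problem, such as a hashing bias, from a momentary burst that happened to land in one snapshot.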

Alerting on idle links

Idle links remain hidden until someone actively searches for them. If there isn’t a noticeable performance degradation or failure, underutilized links will often go unnoticed. Kentik’s Alerting function lets teams convert the queries created in the Kentik Map and in Data Explorer into relevant and meaningful alerts. For example, a simple way to start could be creating an alert that fires when any spine uplink in a specified group stays under 1% utilization compared to its capacity.

Alerts can certainly get more complex by using baselines, multiple triggers and filtering options, and so on, but regardless of how complex or straightforward the alert is, the core benefit is that engineers know when links are sitting idle automatically, without having to search for them and without having to wait until there’s a problem.
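The simple rule described above, an uplink staying under 1% of its capacity for a sustained period, can be sketched in a few lines of Python. This shows the logic only and is not Kentik's alerting API: the function names, the one-minute sample interval, and the five-sample window are all assumptions.

```python
def utilization_pct(bps, capacity_bps):
    """Traffic rate as a percentage of link capacity."""
    return 100.0 * bps / capacity_bps

def idle_alert(samples_bps, capacity_bps, threshold_pct=1.0, min_samples=5):
    """True if the most recent min_samples samples are all under the threshold."""
    if len(samples_bps) < min_samples:
        return False  # not enough history to call the link idle
    recent = samples_bps[-min_samples:]
    return all(utilization_pct(s, capacity_bps) < threshold_pct for s in recent)

# Hypothetical 100 Gbps uplink with one sample per minute.
CAPACITY = 100e9
quiet = [0.2e9, 0.1e9, 0.3e9, 0.2e9, 0.1e9]   # always under 1 Gbps (1%)
bursty = [0.2e9, 0.1e9, 4.0e9, 0.2e9, 0.1e9]  # one sample at 4%

print(idle_alert(quiet, CAPACITY))   # True
print(idle_alert(bursty, CAPACITY))  # False
```

Requiring every sample in the window to be under the threshold keeps the rule from firing on a link that is merely quiet between bursts.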

From insight to action

So, Kentik doesn’t just detect idle paths; it turns that insight into a workflow. With Kentik network intelligence, engineers can:

Discover imbalances with automated policies.
Diagnose root causes via the Kentik Map and Data Explorer.
Validate fixes with synthetic traffic and real-time flow feedback.

Whether the fix involves a hashing tweak, a routing policy change, or redrawing VLAN boundaries, Kentik shortens the journey from “we suspect waste” to “we prove and eliminate waste.”

A leaf-spine fabric’s beauty is its built-in path diversity, but its curse is the ease with which subtle hashing quirks can undermine that diversity. Kentik shines by making per-path utilization visible, searchable, and meaningful. With flow records, topology enrichment, and AI-assisted analysis in one place, network operators reclaim idle bandwidth, delay expensive upgrades, and deliver the balanced throughput the fabric was built to provide without guesswork or packet captures.
