Modern networking relies on the public internet, which heavily uses flow-based load balancing to optimize network traffic. However, the most common network tracing tool known to engineers, traceroute, can’t accurately map load-balanced topologies. Paris traceroute was developed to solve the problem of inferring a load-balanced topology, especially over the public internet, and help engineers troubleshoot network activity over complex networks we don’t own or manage.
How we consume applications today requires a clear understanding of the paths application traffic takes over our networks, both locally and over the internet. Traceroute has always been an indispensable tool for tracing network paths and ultimately for troubleshooting network problems, so despite it being a technology almost four decades old, it’s more important today than ever.
However, classic traceroute has limitations that hinder its usefulness in modern networking. Our reliance on the public internet, distributed resources, deterministic and non-deterministic routing, various forms of load balancing, and many network-adjacent devices means engineers need a better tool to trace packets from a source to its destination.
Paris traceroute solves some of the limitations of classic traceroute and its variants and helps engineers address concerns about path anomalies and false paths, especially those caused by load balancing and equal-cost multipath (ECMP) routing. Ultimately, these advancements are critical when troubleshooting network activity over complex networks we don’t own or manage.
Origin of traceroute
Traceroute was first introduced in 1987 by Van Jacobson, an American computer scientist and prolific contributor to the development of the internet and several commonly used diagnostic tools. Jacobson and Steve Deering, another computer scientist credited with many early innovations in networking, developed a way to manipulate an IPv4 packet header’s TTL field so that a hop-by-hop path could be traced between a source and destination.
Though some consider this a sort of hack because, in essence, traceroute “tricks” devices in the path into divulging information about themselves, it nonetheless quickly became a valuable tool in the hands of engineers using almost any operating system. Over time, it has also been built into network devices, security devices, and compute platforms. Ultimately, IP lacks basic telemetry features, which is why traceroute was developed.
The traceroute function sends out a series of network packets toward a destination, incrementing the time-to-live (TTL) value with each set. The TTL is a field in the IP header specifying the maximum number of hops a packet is allowed before it’s discarded. Along the way, each router decreases the TTL by one, and when it hits zero, the router sends back an ICMP “time exceeded” message, revealing its identity.
This means that if we set the TTL to one for the first hop, we get an immediate response from our next-hop device. We then send a follow-up packet with a TTL of two to discover the second hop in the path. Then, a TTL value of three, four, and so on until we reach our destination and see a complete trace of the path in terms of hops by device, often a router of some type.
By sending out a series of probes with increasing TTL values and listening for ICMP responses, the source computer builds the path by noting the source address of each ICMP time exceeded message it receives and the time taken for the round trip. So, by starting with a TTL of 1 and incrementing it with each new set of packets, traceroute builds up a list of routers on the path and the round-trip time to each. This continues until it reaches the destination or reaches its limit, usually 30 hops.
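The TTL loop described above can be sketched without touching a real network. The following is a minimal simulation, not a real traceroute implementation: the `PATH` list, the addresses in it, and the `send_probe` helper are all made up for illustration, and the "network" is just a Python list.

```python
from typing import List, Tuple

# Hypothetical simulated path; the last entry stands in for the destination.
PATH = ["10.0.0.1", "10.0.1.1", "10.0.2.1", "192.0.2.10"]


def send_probe(ttl: int, path: List[str]) -> Tuple[str, bool]:
    """Simulate one probe. Each hop decrements the TTL, and the hop where
    it reaches zero answers. Returns (responder, reached_destination)."""
    for index, hop in enumerate(path):
        ttl -= 1
        if index == len(path) - 1:
            return hop, True   # the destination itself answers
        if ttl == 0:
            return hop, False  # a router answers with ICMP time exceeded
    raise ValueError("path must contain at least one hop")


def traceroute(path: List[str], max_hops: int = 30) -> List[Tuple[int, str]]:
    """Send probes with TTL 1, 2, 3, ... until the destination answers
    or the hop limit is reached."""
    hops = []
    for ttl in range(1, max_hops + 1):
        responder, done = send_probe(ttl, path)
        hops.append((ttl, responder))
        if done:
            break
    return hops


for ttl, address in traceroute(PATH):
    print(ttl, address)
```

A real tool would send actual packets and time each reply, but the control flow (raise the TTL by one, record who answered, stop at the destination or the hop limit) is the same idea.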
Classic traceroute on most Unix-like systems sends UDP probes by default, while Windows tracert sends ICMP echo requests, the same messages used by ping. Many implementations can also use TCP. For UDP traceroute packets, the destination port is used as the sequence number, so it changes from probe to probe throughout the trace.
The output of a classic traceroute command shows the list of hops along the path, the IP addresses (or domain names if resolvable) of those routers, and the round-trip times for each hop.
It looks something like this:
There are several immediate use cases for classic traceroute, including:
- Network path verification: Ensuring packets are taking the expected route through the network.
- Troubleshooting network issues: Identifying routers or network segments experiencing issues such as dropping packets.
- Latency measurement: Understanding the time packets take to travel to a destination and back.
Verifying a network path means simply using traceroute to ensure packets take the expected route. If there’s a problem, we can use classic traceroute to troubleshoot network issues, such as identifying routers or other devices in the network path that could be the source of the problem. And since we’re able to determine the RTT from each hop, we have a good understanding of latency in terms of the entire path from source to destination and between each hop.
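As a rough illustration of reading latency between hops, the sketch below subtracts each hop's RTT from the next one. The function name and the sample values are invented for this example. In practice the differences can be noisy, or even negative, because each RTT also includes the time a router takes to generate its ICMP reply.

```python
from typing import List


def per_hop_deltas(rtts_ms: List[float]) -> List[float]:
    """Given the RTT to each successive hop, return the extra latency
    that each hop adds compared with the previous one."""
    deltas = []
    previous = 0.0
    for rtt in rtts_ms:
        deltas.append(round(rtt - previous, 3))
        previous = rtt
    return deltas


# Hypothetical RTTs (ms) to hops 1 through 4.
print(per_hop_deltas([1.2, 5.7, 5.9, 20.4]))
```

In this made-up trace, most of the end-to-end latency appears between the third and fourth hops.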
Limitations of classic traceroute
Classic traceroute does have its limitations, however.
In networks where there are multiple possible paths or load balancing in use, subsequent packets may report different devices in the path from source to destination, which can lead to confusing output at best and an inaccurate output at worst. Nodes or entire links in the path could be missing from the output of classic traceroute, and there is no mechanism to identify hops that could contain multiple interfaces, as is the case with load balancing.
This is important to consider because load balancing and multipath networking are very common in more extensive networks, especially the global internet. For instance, OSPF and other IGPs use a form of dynamic multipathing to ensure packet delivery within internal networks. On the public internet, DNS load balancing and elastic load balancing are used to dynamically adjust the flow of traffic over the internet and among public cloud providers.
Also, some routers or firewalls may be configured to ignore the traceroute packets or the ICMP time-exceeded messages, resulting in "* * *" entries in the output and an incomplete path, leaving holes in an output that adversely affect an engineer’s ability to understand the entire path.
We can end up with anomalies or incorrect outputs such as loops (often where there aren’t any), cycles, and diamonds.
A loop is when the same node appears multiple times in an output. Normally, a router doesn’t forward traffic back to itself on the same interface. However, misconfigured routing could create a scenario in which a router sends a packet to its next hop while that next-hop router has the originating router configured (manually or dynamically) as its own next hop, so the packet bounces between the two.
However, more often in a classic traceroute output, a loop in an otherwise successful trace is likely an output anomaly in that a loop doesn’t actually exist. If the destination is reachable, the loop must then be some sort of artifact in the observed time-exceeded responses.
In this case, a loop is likely caused by load balancing when there are multiple paths with different lengths. Notice the image below where a load balancer forwards the traceroute probes with TTL 7 and 8 to router A and the probes with TTL 9 to router B, producing two different results from the same source and destination.
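To make the false loop concrete, here is a small simulation. It assumes a simplified topology of our own (not the exact one from the figure): a load balancer `L` with a short path through `A` and a longer path through `B` and `C`, both of which rejoin at `E`. All node names and the `trace` helper are illustrative.

```python
from typing import Callable, List

# Two hypothetical paths of different lengths behind load balancer L.
SHORT = ["L", "A", "E", "Dst"]
LONG = ["L", "B", "C", "E", "Dst"]


def responder(path: List[str], ttl: int) -> str:
    """Return the node that answers a probe sent with this TTL on this path."""
    return path[min(ttl, len(path)) - 1]


def trace(path_for_ttl: Callable[[int], List[str]], max_hops: int = 10) -> List[str]:
    """Run a trace where the path used by each probe is chosen by path_for_ttl."""
    seen = []
    for ttl in range(1, max_hops + 1):
        hop = responder(path_for_ttl(ttl), ttl)
        seen.append(hop)
        if hop == "Dst":
            break
    return seen


# Classic style: early probes happen to take the short path, later ones the long path.
classic = trace(lambda ttl: SHORT if ttl <= 3 else LONG)
# Paris style: every probe belongs to one flow, so every probe takes the same path.
paris = trace(lambda ttl: SHORT)

print("classic:", classic)  # E shows up at two consecutive hops
print("paris:  ", paris)
```

The classic-style trace reports `E` twice in a row even though no packet ever looped; the repeated hop is purely a side effect of mixing probes from two paths of different lengths.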
Loops in the traceroute output can also be caused by misconfigured routing or faulty devices, such as a router that receives a probe whose TTL has reached 0 and, instead of dropping the packet, forwards it to the next hop. Assuming it is functioning correctly, the next router drops the packet and generates an ICMP time exceeded message. The following probe, sent with a TTL one higher, expires at that same router, so the router answers twice in a row, which appears as a loop in the traceroute output.
Yet another cause of apparent loops is address rewriting. Address rewriting is most commonly found in network address translation, or NAT. The purpose of NAT is to change the IP address in a packet’s source and/or destination fields, which can lead to an anomalous traceroute output.
An anomalous loop in a classic traceroute output is a recurring problem, especially as organizations rely more heavily on the internet and public cloud, where load balancing is commonly deployed.
The term “cycle” is sometimes used interchangeably with loop, but there is a subtle difference. In a loop, the same node appears at consecutive hops in the output. A cycle, which is less common, is a repetition in which the same node reappears later in the trace with one or more other nodes (usually routers) in between.
A traceroute diamond is often an anomalous output that occurs when there are multiple traceroute probes per hop. This is primarily caused by load balancing and can result in false links leading to an incorrect trace between source and destination. Diamonds occur in a significant number of classic traceroute results in large networks such as the internet itself.
However, keep in mind that diamonds are an artifact of load balancing and, therefore, a genuine part of the topology. Diamonds in a traceroute output do not always indicate false links; instead, they may be representative of the actual topology.
In the image below on the left, a load balancer (L) can send each individual probe of a single trace via a different path to the destination (G). The result is a diamond-shaped trace and an inaccurate view of the actual network path.
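The way per-probe path selection produces false links can be sketched as follows. This is a toy model under our own assumptions: two equal-length paths behind `L`, three probes per hop, and a simple round-robin choice of path per probe. The node names other than `L` and `G` are invented for the example.

```python
from itertools import product

# Two hypothetical equal-length paths between load balancer L and destination G.
PATHS = [["L", "A", "C", "G"], ["L", "B", "D", "G"]]

# Links that really exist in this toy topology.
true_links = {(p[i], p[i + 1]) for p in PATHS for i in range(len(p) - 1)}


def observed_per_hop(probes_per_hop: int = 3) -> list:
    """Send several probes per TTL; each probe may take a different path."""
    seen = []
    counter = 0
    for ttl in range(1, 5):
        responders = set()
        for _ in range(probes_per_hop):
            path = PATHS[counter % len(PATHS)]  # round-robin path choice per probe
            counter += 1
            responders.add(path[ttl - 1])
        seen.append(responders)
    return seen


hops = observed_per_hop()
# A naive reading links every responder at hop h to every responder at hop h + 1.
inferred = {link for a, b in zip(hops, hops[1:]) for link in product(a, b)}
false_links = inferred - true_links

print("responders per hop:", [sorted(h) for h in hops])
print("false links:", sorted(false_links))
```

The diamond itself (two responders at hops 2 and 3) is real in this topology, but the naive reading also infers links such as `A` to `D` that do not exist.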
Paris traceroute is an important development of the traceroute tool because it addresses flow-based (as opposed to packet-based) load balancing, which causes many of the inaccuracies in classic traceroute results. It ensures that all packets in a session take a consistent path, providing a more accurate view of the network path. Considering our reliance on the public internet and how common flow-based load balancing is, this method for topology inference is a critical advantage over classic traceroute.
Introduced by Brice Augustin, Timur Friedman, and Renata Teixeira at the 2007 workshop on End-to-End Monitoring Techniques and Services in Munich, Germany, Paris traceroute uses an “adaptive, stochastic probing algorithm, called the Multipath detection algorithm, to report all paths towards a destination.”
In their research, Augustin, Friedman, and Teixeira found that classic traceroute often produces inaccurate and incomplete measured network paths. This means engineers relying on classic traceroute lack an adequate tool for network troubleshooting and for understanding where latency is occurring in a multi-hop path with intermediate routers.
Additionally, they found that many modern routers deployed in production perform per-packet, per-flow, and per-destination load balancing, none of which are effectively measured with classic traceroute.
Their work with Paris traceroute aimed to solve several of these problems, especially the difficulty of tracing flow-based load-balanced paths, and enable engineers to see more clearly how traffic traverses modern networks.
How Paris traceroute works
Maintaining session information
Paris traceroute changes the probing strategy and improves topology inference by increasing the number of probes and controlling the various identifiers in packet headers.
First, the key to ensuring that all packets in a traceroute session are treated as part of the same flow is keeping certain header fields (commonly referred to as the 5-tuple) constant. For UDP packets, these are the IP src/dst, IP protocol, and UDP src/dst port fields. For ICMP packets, they are the IP src/dst, IP protocol, and ICMP type, code, and checksum. Classic traceroute changes some of these fields as the session progresses, causing load balancers to treat the packets as different flows. Paris traceroute rectifies this design flaw and uses other header fields to encode the state required to process the ICMP responses received from routers.
Load balancers are typically designed to use destination port numbers to identify all the incoming packets in a flow and forward them down the same path. However, classic traceroute uses different port numbers for each probe. On the other hand, Paris traceroute maintains the same port number so that an inline load balancer will send all the probes down the same path.
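The effect of the flow identifier can be sketched with a toy per-flow load balancer. Everything here is an assumption for illustration: real devices use their own hash functions and field selections, the `Flow` type and `pick_path` function are hypothetical, and the addresses come from documentation ranges. The only real-world constant is 33434, the conventional starting destination port for UDP traceroute.

```python
import zlib
from typing import NamedTuple


class Flow(NamedTuple):
    """The 5-tuple a per-flow load balancer might hash (illustrative)."""
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


def pick_path(flow: Flow, n_paths: int) -> int:
    """Toy load balancer: hash the 5-tuple and pick one of n_paths."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % n_paths


UDP = 17
BASE_PORT = 33434  # conventional first destination port for UDP traceroute

# Classic style: the destination port doubles as a sequence number,
# so every probe presents a different 5-tuple to the load balancer.
classic_paths = [
    pick_path(Flow("198.51.100.7", "203.0.113.9", UDP, 40000, BASE_PORT + n), 4)
    for n in range(12)
]

# Paris style: the whole 5-tuple is held constant for the run.
paris_paths = [
    pick_path(Flow("198.51.100.7", "203.0.113.9", UDP, 40000, BASE_PORT), 4)
    for _ in range(12)
]

print("classic:", classic_paths)
print("paris:  ", paris_paths)
```

Because the Paris-style probes always present the same 5-tuple, the hash (whatever it is) always yields the same path, while the classic-style probes are free to land on different ones.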
In the image below, notice the fields that are used for flow-based load balancing shaded in gray. These fields must be kept constant throughout the flow, but the set is different depending on whether it’s an ICMP traceroute or a UDP traceroute.
Second, Paris traceroute will also send probes with different flow identifiers (typically different destination port numbers) to differentiate between flows. This helps identify if there are multiple potential next-hop interfaces that would cause incorrect or confusing output in the trace.
Like classic traceroute, Paris traceroute incrementally increases the TTL value to discover each hop. However, it offers several modes that keep the relevant packet header fields consistent, so all packets in a trace belong to the same flow. So, like classic traceroute, it systematically discovers and records the path taken by packets through the network, but it can also account for load-balanced paths.
Notice in the image below that a network may contain multiple paths due to load balancing, dynamic routing changes, or even faulty network devices. In this image, if we want to trace the path from source (Src) to destination (Dst), we have to contend with a load balancer (L) and the potential for multiple forwarding paths at A and B. On a network with a global scale, the complexity would be significantly greater.
This becomes even more apparent when different path selections result in different nodes in the output and a different number of nodes, affecting the TTLs generated by the source.
In the image below, notice that depending on the path probes take, the TTL value and the subsequent time-exceeded responses will be affected. This will result in anomalies and generally inaccurate results in the traceroute output.
Note that there is a variation of the Paris traceroute algorithm, the Multipath Detection Algorithm, or MDA, which sends from six up to 96 probes to account for possible load balancing. By sending more probes and varying their flow identifiers (in UDP packets), the MDA aims to trace multiple paths more accurately, including diamonds caused by load balancing. However, the MDA is not commonly used, whereas Paris traceroute has become the industry standard.
Paris traceroute solves the problem of flow-based load balancing by using several modes.
First, in UDP mode, Paris traceroute keeps the source and destination port fields constant for all probes in a single run. This is different from traditional traceroute, which varies these values from probe to probe, especially the destination port, which it repurposes as a sequence number.
Second, TCP mode is similar to UDP mode in that the source and destination port fields are kept constant across probes. Due to the nature of how TCP operates, it’s important to maintain flow consistency, which Paris traceroute accomplishes by controlling the TCP sequence and acknowledgment numbers.
By keeping these fields constant (particularly the port numbers), Paris traceroute keeps all probes under a single flow identifier and, therefore, on the same network path, even when a flow-based load balancer is involved.
Third, ICMP mode uses the ICMP sequence number field for its sequence number. However, this would cause the ICMP checksum to change, resulting in a changing flow identifier. To counteract this, Paris traceroute also manipulates the identifier field in the ICMP echo request message to keep the checksum constant throughout the session, which is conceptually different from UDP and TCP modes.
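The checksum trick in ICMP mode can be illustrated with the standard Internet checksum (a one's complement sum of 16-bit words). The sketch below is our own illustration of the idea, not code from Paris traceroute: the helper names and the `TARGET` value are arbitrary, and the echo request carries no payload to keep the arithmetic easy to follow.

```python
import struct
from typing import Iterable


def ones_complement_sum(words: Iterable[int]) -> int:
    """Add 16-bit words with end-around carry."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total


def icmp_checksum(data: bytes) -> int:
    """Internet checksum over data (padded to an even length)."""
    if len(data) % 2:
        data += b"\x00"
    words = struct.unpack("!%dH" % (len(data) // 2), data)
    return ~ones_complement_sum(words) & 0xFFFF


def echo_request(identifier: int, sequence: int) -> bytes:
    """Build an ICMP echo request header (type 8, code 0) with no payload."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = icmp_checksum(header)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence)


def identifier_for(sequence: int, target_sum: int) -> int:
    """Choose the identifier so identifier + sequence (one's complement)
    always equals target_sum, which keeps the checksum constant."""
    return ones_complement_sum([target_sum, ~sequence & 0xFFFF])


TARGET = 0x1234  # arbitrary value chosen for this example
checksums = set()
for sequence in range(1, 31):
    packet = echo_request(identifier_for(sequence, TARGET), sequence)
    checksums.add(struct.unpack("!H", packet[2:4])[0])

print([hex(c) for c in checksums])
```

Every probe carries a different sequence number, yet the checksum field, which a load balancer may include in its flow hash, is identical across all thirty packets.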
There are several limitations of Paris traceroute:
- Per-packet load balancing distributes packets individually, so there isn’t an easily discernible flow to track.
- Dynamic routing can affect the path a flow takes if there is any type of change in the routing table and next-hop destinations on a hop-by-hop basis.
- NAT devices and other middleboxes will cause Paris traceroute to infer incorrect results.
- Paris traceroute isn’t able to accommodate faulty network devices and code bugs that can cause anomalous outputs in a trace.
- It’s more complex than classic traceroute and might require more understanding to interpret the results.
- Like any active measurement tool, it generates additional traffic, which might be a consideration in bandwidth-sensitive networks.
- A single Paris traceroute run keeps all of its probes in one flow, so on its own it follows only one of the load-balanced paths and hides diamonds, which is an important limitation to consider.
Understanding the output
The output of Paris traceroute is similar to classic traceroute in that it includes route information, or in other words, the path(s) taken by the packets, including hop-by-hop information. A significant difference (improvement) is that it also detects and reports on multiple paths when they exist due to routing policies or load balancing.
The specific information reported when executing a Paris traceroute to a destination includes:
- List of hops: Each line in the output represents a layer three hop along the path from the source to the destination.
- IP addresses: For each hop, the IP address of the intermediary device (router, L3 switch, firewall, etc.) is displayed.
- Round-trip time (RTT): The output usually shows three round-trip times for each hop, representing the time it takes for a packet to go from the source to that hop and back. RTTs are measured in milliseconds.
- Possible path variations: Since Paris traceroute is designed to detect different paths taken by packets due to load balancing, the output can display multiple paths or varying IP addresses for the same hop number across traceroute executions.
- Destination reach: The final line in the output shows the destination IP address and the round-trip times to it.
An example output can look like this:
The image above shows each individual hop by IP address and the RTT for each probe sent for that specific next hop. We use these times to determine latency between hops and between source and destination.
Verifying network paths over the public internet
Load balancing is very common on the public internet, so Paris traceroute (among several other traceroute versions) has become a primary method for tracing a path from source to destination, even on a global scale.
In the image below from the Kentik Portal, we are tracing a path between a source and a destination, in this case, between two instances in AWS and a destination target. You can see that by using Paris traceroute, we can trace the multiple paths between a source and destination over the public internet.
If we drill down into each node, we can also see the gathered metrics for packet loss, network latency, and jitter for each individual hop in the path.
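As a rough sketch of how per-hop metrics like these could be derived from raw probe results, the function below takes the RTTs for one hop, with `None` marking a lost probe. The definitions used here (loss as the share of unanswered probes, latency as the mean RTT, jitter as the mean absolute difference between consecutive RTTs) are common conventions and our own assumption, not a description of how the Kentik Portal computes them.

```python
from statistics import mean
from typing import Dict, List, Optional


def hop_metrics(rtts_ms: List[Optional[float]]) -> Dict[str, Optional[float]]:
    """Summarize one hop's probes. None in rtts_ms means the probe got no reply."""
    if not rtts_ms:
        raise ValueError("at least one probe result is required")
    answered = [rtt for rtt in rtts_ms if rtt is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(answered)) / len(rtts_ms)
    latency = mean(answered) if answered else None
    # Jitter as the mean absolute difference between consecutive replies.
    diffs = [abs(b - a) for a, b in zip(answered, answered[1:])]
    jitter = mean(diffs) if diffs else None
    return {"loss_pct": loss_pct, "latency_ms": latency, "jitter_ms": jitter}


# Hypothetical probe results for one hop: four probes, one lost.
print(hop_metrics([10.0, 12.0, None, 11.0]))
```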
The graphics above are built on the raw data underlying every trace and every probe in that trace, which you can see in the image below.
From its origins in the 1980s to the versatile tool it is today, traceroute remains vital for network engineers and system administrators interested in tracing the path applications take over a network. Indeed, the way we consume applications today, including our complete reliance on the public internet, means that to troubleshoot delivery and performance problems successfully, we need an accurate understanding of the paths our applications take now more than ever.
Especially in an environment where flow-based load balancing is the norm, Paris traceroute has become the de facto path-tracing solution for inferring network topology on networks we don’t own or manage. Often replacing classic traceroute entirely, Paris traceroute assists engineers in addressing concerns about anomalies, false paths, and the limitations of classic traceroute in load-balanced networks.