SD-WAN: Software-Defined Networking Defined and Explained

A search for “What is SD-WAN” returns plenty of articles and videos explaining the features of a given vendor’s SD-WAN solution, but few explain in detail how SD-WANs work. How do they handle ARPs, DNS queries and connection requests to domains like Salesforce, Netflix or Facebook? How do they handle redundancy and convergence when connections go down? What about prioritization and troubleshooting insight? This article touches on all of this.

What is SD-WAN?

Software-defined wide area network, or SD-WAN, is a new way of implementing WAN connections in which the edge device makes fewer decisions locally than a traditional edge router does. When a connection request comes into the edge, the edge forwards the request to a controller. The controller decides whether or not to allow the connection and then sends instructions back down to the edge.

The traditional physical WAN connectivity is still there but has been renamed the underlay. In contrast, the logical understanding of the topology and its connectivity is made up of VPNs and is called the overlay. To rephrase that, the underlay is how everything is physically connected up to form a network. It’s the overlay that determines the logical topology: full mesh, partial mesh, hub and spoke or point to point.
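To make the overlay topologies concrete, here is a minimal Python sketch that lists the VPN tunnels each topology would need between a set of sites. The site names and the `overlay_tunnels` function are hypothetical and exist only for illustration; real SD-WAN products build their overlays in vendor-specific ways.

```python
from itertools import combinations

def overlay_tunnels(sites, topology, hub=None):
    """Return the set of site pairs that need a VPN tunnel (illustrative only)."""
    if topology == "full_mesh":
        # Every site has a direct tunnel to every other site.
        return {frozenset(pair) for pair in combinations(sites, 2)}
    if topology == "hub_and_spoke":
        # Every spoke has a single tunnel to the hub.
        return {frozenset((hub, site)) for site in sites if site != hub}
    raise ValueError(f"unsupported topology: {topology}")

sites = ["hq", "branch1", "branch2", "branch3"]  # hypothetical site names
mesh = overlay_tunnels(sites, "full_mesh")
star = overlay_tunnels(sites, "hub_and_spoke", hub="hq")
print(len(mesh), len(star))
```

The underlay is the same in both cases. Only the number of logical tunnels changes: a full mesh of n sites needs n(n-1)/2 tunnels, while hub and spoke needs n-1.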

What is SD-WAN?

Two planes and some SD-WAN terms

As stated above, connection requests are sent by the edge up to the controller to check policy. This is done over a secure connection called the control plane. Once the edge hears back from the controller, a connection is established. The actual end-user connections traverse the SD-WAN fabric over something called the data plane.

To summarize what has been covered to this point:

  • Edge: In traditional WANs, these devices are called edge routers and provide connectivity to the WAN fabric. SD-WAN edge devices differ from traditional routers because they send connection requests up to the controller and make only a few connection decisions locally.
  • Traditional WANs: Edge routers connect to the public internet or to private WAN connections such as MPLS. Forwarding decisions are typically programmed locally into each router to choose the path. Configuration is typically applied device by device, alongside routing protocols like OSPF and BGP.
  • SD-WAN: The programming and choices are provided dynamically by the control plane. Configuration is applied at the control plane and handed off to the local edge device for execution.
  • Controller: The controller is usually located in the cloud. It communicates with the edge devices over a secure connection via the distributed control plane. At the controller, NetOps configures traffic prioritization policy. This can be based on characteristics such as the domain where the traffic is headed. The controller is the central point of configuration for all edge devices. Redundant, fault-tolerant controllers can be configured to help ensure “five nines” of availability.
  • Underlay: The traditional physical infrastructure of the WAN.
  • Overlay: A series of VPNs making up the logical topology that is used to move traffic in the configurable direction between end systems.
  • Control plane: A secure connection between the edge devices and the functions provided by the controller.
  • Data plane: The fabric that carries the network connections.

The above terms are just the beginning when entering the world of SD-WAN. We try to stay generic here, but each vendor makes significant contributions to this list of acronyms.
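The “five nines” target mentioned for redundant controllers can be put into numbers. The sketch below is plain arithmetic in Python. The 99.9% single-controller figure is an assumption chosen for illustration, and the formula assumes controllers fail independently, which real deployments may not satisfy.

```python
MINUTES_PER_YEAR = 365 * 24 * 60

def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime per year for a given availability (0.0 to 1.0)."""
    return (1 - availability) * MINUTES_PER_YEAR

def combined_availability(single: float, count: int) -> float:
    """Availability of `count` redundant controllers, assuming independent failures."""
    return 1 - (1 - single) ** count

five_nines = 0.99999
print(round(downtime_minutes_per_year(five_nines), 2))  # minutes of downtime allowed per year

single = 0.999  # assumed availability of one controller (illustrative)
pair = combined_availability(single, 2)
print(single >= five_nines, pair >= five_nines)
```

Five nines allows a little over five minutes of downtime per year. Under these assumptions a single 99.9% controller misses the target, while a redundant pair meets it.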

How SD-WAN works: Making connections

Let’s say User A would like to make a connection to salesforce.com. First, the user’s device may need to send an ARP request for the MAC address of its DNS server. The edge receives the ARP and resolves it as specified by the policies set in the controller.

This means the edge may resolve it locally, send it to a specified DNS or send it up to the controller for resolution. This process is vendor-dependent.

Once User A has the MAC address of the DNS server, a request is sent to the DNS server to resolve salesforce.com (for example) to an IP address. Again, the process is vendor-dependent, but in most cases the edge will send this request up to the controller. The controller evaluates the request for salesforce.com and compares it to the configured policy list.

Many policies are based on the destination domain, often loosely called the top level domain (TLD), although strictly speaking the TLD is only the final label, such as .com. Here are a few examples:

  • If the request is for netflix.com, deny it.
  • If the request is for facebook.com, allow it, but limit bandwidth to 250 kbit/s.
  • If the request is for salesforce.com, allow it with highest priority.
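The three example rules above can be sketched as a small lookup table. This is a minimal illustration in Python, not any vendor's policy engine: the `Policy` class, the `lookup_policy` function, and the default "allow" rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Policy:
    action: str                        # "allow" or "deny"
    priority: str = "normal"
    limit_kbps: Optional[int] = None   # None means no bandwidth cap

# Hypothetical policy table mirroring the three examples above.
POLICIES = {
    "netflix.com": Policy(action="deny"),
    "facebook.com": Policy(action="allow", limit_kbps=250),
    "salesforce.com": Policy(action="allow", priority="highest"),
}
DEFAULT_POLICY = Policy(action="allow")  # assumed default; the article does not specify one

def lookup_policy(hostname: str) -> Policy:
    """Match the hostname, or any of its parent domains, against the policy table."""
    labels = hostname.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in POLICIES:
            return POLICIES[candidate]
    return DEFAULT_POLICY

print(lookup_policy("www.netflix.com").action)
print(lookup_policy("login.salesforce.com").priority)
```

Matching on whole labels rather than on a raw string suffix means a name such as notfacebook.com does not accidentally inherit the facebook.com rule.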

These instructions are then sent back down to the edge for enforcement, and the edge replies to User A’s DNS query for salesforce.com with the proper IP address. Keep in mind that if the end user’s web browser is configured to use DoH (DNS over HTTPS), this could cause problems for some SD-WAN solutions, because the browser receives the IP address from a different DNS resolver instead of from the controller.

User A then sends a SYN to the salesforce.com IP address to initiate the TCP handshake that is required to make an HTTPS connection. The edge provisions for the connection as instructed by the controller.
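The article does not say how an edge enforces a bandwidth limit such as the 250 kbit/s example. One common technique is a token bucket, sketched below in Python. The `TokenBucket` class and the burst size are assumptions for illustration, and time is passed in explicitly so the example behaves the same on every run.

```python
class TokenBucket:
    """Token-bucket rate limiter (illustrative; not a vendor implementation)."""

    def __init__(self, rate_bits_per_s: float, burst_bits: float):
        self.rate = rate_bits_per_s
        self.capacity = burst_bits
        self.tokens = burst_bits   # start with a full bucket
        self.last = 0.0            # timestamp of the previous call, in seconds

    def allow(self, packet_bits: int, now: float) -> bool:
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False   # over the limit: the caller would drop or queue the packet

# 250 kbit/s limit; the 24,000-bit burst (two 1500-byte packets) is an assumption.
bucket = TokenBucket(rate_bits_per_s=250_000, burst_bits=24_000)
packet = 12_000  # one 1500-byte packet, in bits
results = [bucket.allow(packet, now=0.0) for _ in range(3)]
print(results)
```

The first two back-to-back packets fit within the burst and the third is refused. After enough time has passed for the bucket to refill, traffic is allowed again.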

Every now and again something happens like congestion, packet loss, latency or maybe a severed WAN connection. Then what happens? What will the SD-WAN environment do about it? The answer to this is generally a significant vendor differentiator.

How do SD-WANs ensure connectivity

Some vendors build technology into the edge devices that will routinely ping high-priority destinations (like salesforce.com) to measure things like latency, packet loss and jitter. Think of it as a type of synthetic monitor. When the edge detects that a problem exists on a connection to a high-priority domain, it may balance the flows or packets to the target over multiple links. This is possible because most SD-WAN implementations support something called active/active, where there are no secondary links, only additional load-carrying connections.
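The probing and active/active behavior described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the probe values, link names, and thresholds are invented, jitter is computed as the mean absolute difference between consecutive round-trip times, and flows are spread over healthy links with a simple hash. Vendors use their own measurements and algorithms.

```python
from statistics import mean
import zlib

def link_stats(rtts_ms):
    """Summarize probe results for one link; None marks a lost probe."""
    answered = [r for r in rtts_ms if r is not None]
    loss = 1 - len(answered) / len(rtts_ms)
    latency = mean(answered) if answered else float("inf")
    # Jitter here is the mean absolute difference between consecutive RTTs.
    diffs = [abs(b - a) for a, b in zip(answered, answered[1:])]
    jitter = mean(diffs) if diffs else 0.0
    return {"latency_ms": latency, "loss": loss, "jitter_ms": jitter}

def healthy_links(probes, max_latency_ms=150, max_loss=0.02, max_jitter_ms=30):
    """Return the links whose probe statistics stay within the (assumed) thresholds."""
    ok = []
    for name, rtts in probes.items():
        s = link_stats(rtts)
        if (s["latency_ms"] <= max_latency_ms and s["loss"] <= max_loss
                and s["jitter_ms"] <= max_jitter_ms):
            ok.append(name)
    return sorted(ok)

def pick_link(flow_id: str, links):
    """Active/active: hash each flow onto one of the healthy links."""
    return links[zlib.crc32(flow_id.encode()) % len(links)]

probes = {  # hypothetical round-trip times in milliseconds
    "mpls":     [20, 22, 21, 20, 23],
    "internet": [35, 40, 38, 36, 41],
    "lte":      [80, None, 140, None, 95],   # two of five probes lost
}
links = healthy_links(probes)
print(links)
print(pick_link("10.0.0.5:51000->salesforce.com:443", links))
```

Hashing per flow keeps all packets of one connection on the same link, which avoids reordering. Per-packet balancing, which the text also mentions, would need a different selection function.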

SD-WAN benefits

Beyond saving money by eliminating expensive leased lines such as MPLS and utilizing VPNs over the internet, here are some additional SD-WAN benefits:

  • Multiple redundant links, ideally active/active across two or more links
    • Links can be private WAN lines, public internet lines or even LTE
    • Redundant edge: devices can be deployed in pairs to provide fault tolerance at local sites
  • Fast connection failover. Some vendors tout under one second.
  • Traffic shaping, where connections to specified domains receive a predetermined amount of bandwidth and priority. Other domains can be blocked or assigned fixed bandwidth limits.
  • WAN optimization. This is vendor-dependent, but might be available on the edge device and can reduce bandwidth consumption using compression techniques.
  • Firewall integration. Some vendors include this on the edge devices.
  • Packet loss compensation or packet duplication. Packets carrying content like voice are replicated, and the copies are used on the other end of a connection if and when packet loss occurs.
  • Redundant controllers. In an effort to maintain five nines of availability, multiple controllers can be deployed to ensure fault tolerance.
  • Advanced router features like OSPF, EBGP, etc.
  • Support for NetFlow, IPFIX, SNMP, Syslog, etc. (Check Kentipedia soon for more on this topic.)

Many of the above benefits are vendor-dependent and should be tested under load to confirm that they work in real-world conditions.
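As an illustration of the packet duplication benefit listed above, the following Python sketch sends a copy of every packet over two links and has the receiver keep the first copy of each sequence number. The function names, link names, and payloads are hypothetical, and real products do this at the packet level with their own sequencing headers.

```python
def send_duplicated(packets, links):
    """Sender side: put a copy of every packet on every link."""
    return {link: list(packets) for link in links}

def receive(streams):
    """Receiver side: keep the first copy of each sequence number, drop the rest."""
    seen = set()
    delivered = []
    for link_packets in streams.values():
        for seq, payload in link_packets:
            if seq not in seen:
                seen.add(seq)
                delivered.append((seq, payload))
    return sorted(delivered)  # restore sequence order

packets = [(1, "v1"), (2, "v2"), (3, "v3"), (4, "v4")]  # (sequence number, payload)
streams = send_duplicated(packets, ["link_a", "link_b"])

# Simulate loss: link_a drops packet 2, link_b drops packet 3.
streams["link_a"] = [p for p in streams["link_a"] if p[0] != 2]
streams["link_b"] = [p for p in streams["link_b"] if p[0] != 3]

recovered = receive(streams)
print(recovered == packets)
```

Each link lost a packet, yet the receiver still reconstructs the full stream. The cost is that the protected traffic consumes bandwidth on every link that carries a copy.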

Ultimately, the only SD-WAN features that matter are the ones that best support the business-critical applications. Spending extra money on things like sub-second convergence, mesh topologies, multitenancy, and multicast is not worthwhile unless they improve the user’s experience.

Measuring SD-WAN performance: Trust but verify

At the end of the day, every SD-WAN vendor will tout how necessary their features are and how great their performance is. One of the areas that is sometimes overlooked during the sale is performance monitoring. After the SD-WAN deployment, most NetOps teams want insight into how the SD-WAN is performing.

Companies like Cisco, Silver Peak/HPE and VMware export IPFIX, allowing network observability companies to provide performance insights into the SD-WAN fabric. Flow data is correlated with telemetry that is exported from the SD-WAN management platform. For example, the Kentik architecture provides the capability to ingest vendor-specific fields, which are very important in the SD-WAN space (e.g., Viptela: VPN Identifier, and Silver Peak: Application, Business Intent Overlay).

For example, the Silver Peak-Kentik integration adds application name and business intent overlay (BIO) dimensions to the interface traffic metadata.

Sankey showing the relationship between overlay networks and applications

The above screen capture from Kentik shows which subnets are speaking with each other. The user can also see the relationship between overlay networks and applications. These combined data sets allow NetOps to discover what applications are running between sites, to the internet, and to the data center. They can be used to better understand service providers, link utilization, and traffic patterns. These details help NetOps fine-tune policies at the SD-WAN controller.
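The kind of aggregation behind an overlay-to-application view can be sketched as follows. This is a minimal Python illustration with invented flow records. The field names, overlay names, and the `sankey_edges` function are hypothetical and do not represent Kentik's data model or any vendor's IPFIX schema.

```python
from collections import Counter

# Hypothetical flow records enriched with overlay and application fields.
flows = [
    {"src_site": "branch1", "overlay": "RealTime",     "application": "voip",       "bytes": 1200},
    {"src_site": "branch2", "overlay": "RealTime",     "application": "voip",       "bytes": 800},
    {"src_site": "branch1", "overlay": "CriticalApps", "application": "salesforce", "bytes": 5000},
    {"src_site": "branch2", "overlay": "Default",      "application": "web",        "bytes": 3000},
    {"src_site": "branch1", "overlay": "Default",      "application": "web",        "bytes": 1000},
]

def sankey_edges(records, left, right):
    """Sum bytes for each (left, right) pair, largest first."""
    totals = Counter()
    for record in records:
        totals[(record[left], record[right])] += record["bytes"]
    return totals.most_common()

edges = sankey_edges(flows, "overlay", "application")
for (overlay, app), total in edges:
    print(f"{overlay} -> {app}: {total} bytes")
```

Calling the same function with `"src_site"` and `"overlay"` gives the site-to-overlay stage, so a multi-stage diagram can be assembled one pair of dimensions at a time.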

Kentik didn’t leave any SD-WAN vendors out. We provide overlay, underlay and application traffic visibility for Silver Peak, Cisco, VMware and all other major open SD-WAN solutions.

Conclusion

SD-WAN is a powerful new network architecture and technology that helps on multiple fronts. SD-WAN can improve performance, increase security and lower costs all at the same time. However, like any networking technology, SD-WAN delivers more benefit when it is properly managed. Auditing application traffic policies and understanding both the traffic paths taken and link utilization are critical maintenance functions. Having the ability to troubleshoot, plan capacity, and optimize costs is also important. Network observability solutions like Kentik can help you perform the operational oversight that you need to make SD-WAN successful.

If further clarification is needed on any of the SD-WAN terms used in this post, please reach out to the Kentik team.

Updated: November 19, 2021