Introducing BGP Monitoring from Kentik

Anil Murty

Summary

Kentik’s BGP monitoring capabilities help you root-cause routing issues with BGP event tracking, hijack and route leak detection, RPKI status checks, and reachability and AS path tracking.


Understanding the Border Gateway Protocol (BGP)

Designed at the dawn of the commercial internet, the Border Gateway Protocol (BGP) is a policy-based routing protocol that has long been an established part of the internet infrastructure. BGP (and BGP routing) was created to enable the internet to scale and accommodate a growing number of autonomous systems (ASes) that needed to exchange routing information with each other. As the backbone of the global internet routing system, BGP is responsible for directing traffic between ASes, which are essentially networks operated by different organizations such as internet service providers (ISPs), data centers, and large enterprises.

BGP is crucial because it maintains the stability and reliability of the internet by ensuring that traffic is routed efficiently across various networks. As the internet has evolved, so too have the complexity and demands placed on BGP, making its monitoring and management increasingly essential. This has given rise to the need for BGP monitoring, a process that helps network operators detect and troubleshoot issues in their routing infrastructure. By understanding and analyzing BGP data, operators can optimize network performance, minimize downtime, and maintain the overall health of their networks.

What is BGP monitoring?

BGP monitoring is the practice of observing BGP routing data to detect and troubleshoot issues in a network’s routing infrastructure. Because BGP is how autonomous systems exchange routing information with each other, it plays a critical role in ensuring that data is routed efficiently and reliably across the internet. BGP route monitoring has historically been of interest primarily to ISPs and hosting service providers whose revenue depends on delivering traffic.

Additionally, the threat landscape has evolved, with bad actors exploiting BGP vulnerabilities to carry out attacks such as route hijacking and leaks. These incidents can lead to traffic being redirected to malicious networks, causing significant disruptions to the affected organizations. BGP monitoring helps network operators detect and remediate such incidents and, in turn, protect their networks and ensure the security of their data. BGP hijack detection is one of the essential aspects of monitoring BGP today.

As we saw with Facebook’s historic outage, monitoring BGP proactively has become equally important for digital enterprises and web businesses. That’s because their user experience and revenue streams depend on reliable, high-performance internet traffic delivery. To help our customers manage this critical element of network performance, Kentik now includes BGP performance monitoring as part of the Kentik Network Observability Platform.

Path Visualization is part of the BGP Monitor test results. You can see AS paths as they are now and as they were at any earlier point in time.

Why choose Kentik as your BGP monitoring solution?

While free and commercial solutions for monitoring BGP have existed for several years, there was something missing that compelled many of our customers to nudge us towards building our own. Part of this had to do with our approach to network observability. Our customers give us great reviews for user experience and for our approach that enables users to answer any question about their network. They wanted BGP monitoring to be a part of the solution.

The other big reason to use Kentik’s BGP monitoring solution is that it addresses many of the limitations of the current “best of breed” alternatives with features including:

1. Large number of data sources

Most solutions on the market today are solely reliant on publicly available BGP monitors. While these are excellent sources of data, Kentik is uniquely positioned to take this to the next level by leveraging our rich BGP data sets that include both public and private (anonymized) sources of BGP data.

2. Immediate data retrieval

Given the size of the datasets that a BGP monitoring solution needs to work with, some solutions have a delay (up to several hours, sometimes) before they present you with data. Our design goal has been to make this near-instantaneous.

3. Instant alerts

What good is a BGP monitoring solution if it alerts you after the whole world has found out about the issue, for example, on Twitter? Kentik alerts you nearly instantaneously.

4. Clean user experience

Collecting and presenting the data is one thing, but doing it in a way that makes it a pleasure to use is a whole different story. Kentik already has information about customer networks, ASes, and prefixes, which reduces the time needed to start surfacing BGP data.

5. Multitude of APIs and integrations

It’s 2022: you want the data you need, where you need it.

6. Benefits of a single pane of glass

While synthetic monitoring is a crucial part of managing services in production, correlating test failures to internet routing issues caused by BGP changes completes the picture.

The limitations of competing solutions, coupled with our customers’ need for true network observability, were the key reasons we set out to create a next-generation BGP monitoring solution as part of the Kentik Network Observability Cloud. While Kentik BGP Monitor is already more feature-rich than many existing solutions, we can’t wait for you to start using it so we can continue to build on what we have.

BGP monitoring use cases and features

Kentik’s BGP Monitor tool addresses the most common use cases around monitoring BGP state as well as root-causing routing issues when they occur. These include:

Event tracking

See route announcements and withdrawals over time and filter the data by day, hour, AS, prefix and announcement type. This is a crucial part of the day-to-day observation of BGP routing infrastructure and policies.

BGP Monitor “Events” tab showing BGP announcements for Facebook’s /23 and /24 prefixes, before, during and after the historic outage.
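To make the filtering idea concrete, here is a minimal sketch of slicing a stream of BGP events by type, prefix, origin AS, and time window. The `BgpEvent` record and `filter_events` helper are hypothetical names invented for this illustration; they are not Kentik's data model or API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class BgpEvent:
    """Hypothetical, simplified record of one BGP update seen by a collector."""
    timestamp: datetime
    kind: str                 # "announce" or "withdraw"
    prefix: str               # e.g. "203.0.113.0/24"
    origin_as: Optional[int]  # None for withdrawals, which carry no AS path


def filter_events(
    events: Iterable[BgpEvent],
    *,
    kind: Optional[str] = None,
    prefix: Optional[str] = None,
    origin_as: Optional[int] = None,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> List[BgpEvent]:
    """Return events matching every filter that was supplied (None = no filter)."""
    selected = []
    for ev in events:
        if kind is not None and ev.kind != kind:
            continue
        if prefix is not None and ev.prefix != prefix:
            continue
        if origin_as is not None and ev.origin_as != origin_as:
            continue
        if start is not None and ev.timestamp < start:
            continue
        if end is not None and ev.timestamp >= end:  # end is exclusive
            continue
        selected.append(ev)
    return selected
```

For example, `filter_events(events, kind="withdraw", start=t0, end=t1)` would return only the withdrawals seen in the half-open window from `t0` up to, but not including, `t1`.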

BGP hijack detection

Malicious exploits of BGP’s vulnerabilities can cause routes between the internet’s tens of thousands of Autonomous Systems (ASes) to change, resulting in disruptions to application and service delivery. Being able to alert as soon as these happen is one of the primary use cases of BGP monitoring.
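One common way to spot a possible hijack is to compare the origin AS of each announcement against the origins you expect for your prefixes. The sketch below assumes you maintain such an allow-list yourself; the function name and data shapes are hypothetical, and the prefixes and AS numbers come from the ranges reserved for documentation.

```python
import ipaddress
from typing import Dict, Iterable, List, Set, Tuple


def unexpected_origins(
    monitored: Dict[str, Set[int]],
    announcements: Iterable[Tuple[str, int]],
) -> List[Tuple[str, int, str]]:
    """Flag announcements that cover a monitored prefix from an unexpected origin.

    monitored maps each prefix you own to the set of origin ASNs you expect.
    announcements is an iterable of (prefix, origin_as) pairs.
    Returns (announced_prefix, origin_as, monitored_prefix) for each suspect.
    """
    nets = {ipaddress.ip_network(p): asns for p, asns in monitored.items()}
    alerts = []
    for prefix, origin in announcements:
        announced = ipaddress.ip_network(prefix)
        for net, allowed in nets.items():
            # subnet_of() raises TypeError across IP versions, so check first.
            if announced.version != net.version:
                continue
            # Equal or more-specific announcements both attract our traffic.
            if announced.subnet_of(net) and origin not in allowed:
                alerts.append((prefix, origin, str(net)))
    return alerts


alerts = unexpected_origins(
    {"203.0.113.0/24": {64500}},
    [("203.0.113.0/24", 64500), ("203.0.113.0/25", 64501)],
)
```

Here the /25 announced by AS 64501 is flagged because it falls inside the monitored /24 and comes from an origin that is not on the allow-list. An alert like this is only a signal to investigate: it cannot tell a malicious hijack apart from an honest misconfiguration.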

Route leak detection

Route leaks are similar to the malicious hijacking of BGP routes, but they are caused by inadvertent misconfiguration (for example, human error).

RPKI status check

Resource Public Key Infrastructure (RPKI) is a best practice for securing BGP route announcements, but the improper configuration of ROAs can cause reachability issues. Knowing when these occur and getting alerted is a crucial part of monitoring BGP.

The BGP Monitor Test in Kentik Synthetics enables you to detect and get alerted on both unexpected origins and invalid RPKI status. Both of these conditions can be set up to notify you on your chosen channels (Slack, email, PagerDuty, etc.).
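To show what an RPKI status means in practice, here is a minimal sketch of route origin validation in the style of RFC 6811. A route is valid if some covering ROA matches its origin AS and allows its prefix length. It is invalid if ROAs cover it but none match. It is not-found if no ROA covers it at all. The `Roa` tuple and `rpki_status` function are hypothetical names for illustration, and a real validator would work from cryptographically verified ROA data.

```python
import ipaddress
from typing import Iterable, NamedTuple


class Roa(NamedTuple):
    """Simplified Route Origin Authorization: who may originate what."""
    prefix: str
    max_length: int  # longest prefix length the AS is authorized to announce
    asn: int


def rpki_status(prefix: str, origin_as: int, roas: Iterable[Roa]) -> str:
    """Classify an announcement as "valid", "invalid", or "not-found"."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue  # this ROA does not cover the announced prefix
        covered = True
        if roa.asn == origin_as and route.prefixlen <= roa.max_length:
            return "valid"
    return "invalid" if covered else "not-found"


roas = [Roa("203.0.113.0/24", 24, 64500)]
status_ok = rpki_status("203.0.113.0/24", 64500, roas)
status_too_specific = rpki_status("203.0.113.0/25", 64500, roas)
```

The second call illustrates the misconfiguration risk mentioned above: the legitimate origin announces a /25, but the ROA's maximum length is 24, so the route is invalid and may be dropped by networks that enforce origin validation.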

Reachability tracking

We help you track changes in the reachability of your prefixes from hundreds of vantage points all over the internet and will alert you when any of them become unreachable. You need to be sure that traffic from your ASes can make its way to your customers and the service providers you depend on.

Time-series chart showing visibility of prefixes from hundreds of BGP vantage points. Filters show the visibility per prefix by origin AS.
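As a rough illustration of how a visibility metric could be derived, the sketch below computes, for each prefix, the share of vantage points that currently hold a route to it and flags prefixes that fall below a threshold. The function name, data shapes, and 90% default threshold are assumptions made for this example, not a description of how Kentik computes the chart.

```python
from typing import Dict, Set, Tuple


def visibility_report(
    seen_by: Dict[str, Set[str]],
    vantage_points: Set[str],
    threshold: float = 0.9,
) -> Dict[str, Tuple[float, bool]]:
    """Map each prefix to (share of vantage points seeing it, below_threshold)."""
    if not vantage_points:
        raise ValueError("at least one vantage point is required")
    report = {}
    for prefix, peers in seen_by.items():
        # Ignore peers that are not part of the vantage point set being measured.
        share = len(peers & vantage_points) / len(vantage_points)
        report[prefix] = (share, share < threshold)
    return report


vps = {"vp1", "vp2", "vp3", "vp4"}
report = visibility_report(
    {"203.0.113.0/24": {"vp1", "vp2", "vp3", "vp4"},
     "198.51.100.0/24": {"vp1", "vp2"}},
    vps,
)
```

In this example the first prefix is seen everywhere, while the second is visible from only half of the vantage points and would be flagged.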

AS path change tracking

Frequent changes in the path that BGP route announcements take between ASes can be a sign of instability. Monitoring for these changes and getting alerted as soon as they occur is a key part of ensuring service reliability.

Time-series chart showing average number of changes in AS path over time.
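A simple way to quantify path instability is to count how often each vantage point's AS path to a prefix differs from the path it reported previously. The sketch below assumes observations arrive sorted by time and are already grouped into time buckets; the function name and tuple layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

AsPath = Tuple[int, ...]


def count_path_changes(
    observations: Iterable[Tuple[str, str, AsPath]],
) -> Dict[str, int]:
    """Count AS path changes per time bucket.

    observations yields (time_bucket, vantage_point, as_path) in time order.
    A change is counted when a vantage point reports a path that differs
    from the last path it reported. The first report is only a baseline.
    """
    last_path: Dict[str, AsPath] = {}
    changes: Dict[str, int] = defaultdict(int)
    for bucket, vantage_point, path in observations:
        previous = last_path.get(vantage_point)
        if previous is not None and previous != path:
            changes[bucket] += 1
        last_path[vantage_point] = path
    return dict(changes)


observations = [
    ("10:00", "vp1", (64500, 64510)),
    ("10:00", "vp2", (64501, 64510)),
    ("11:00", "vp1", (64502, 64510)),  # vp1 switched upstream
    ("11:00", "vp2", (64501, 64510)),  # unchanged
    ("12:00", "vp1", (64500, 64510)),  # vp1 switched back
    ("12:00", "vp2", (64503, 64510)),  # vp2 switched upstream
]
changes = count_path_changes(observations)
```

A sustained rise in these counts is the kind of signal worth alerting on, whereas a single isolated change is often routine.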

AS path visualization

Fast troubleshooting of issues requires being able to visualize data to find trouble spots quickly. We give you a 10,000-foot view of changes in BGP routes over time — an indispensable tool!

AS path visualization showing a hop-by-hop view of routes that is “scrubbable” across time.

Convenient notifications

Last but not least, all of the above metrics can be set up to alert within the product and can be tied to the most common notification channels including:

  • Slack
  • Microsoft Teams
  • JSON
  • Opsgenie
  • PagerDuty
  • ServiceNow
  • Splunk
  • Syslog
  • VictorOps
  • xMatters

Additional advantages of proactive BGP monitoring

Proactively monitoring BGP is essential to verify route changes during regular network operations. For instance, when altering a service provider relationship, it is crucial to ensure that your routes are accurately advertised and reachable post-change.

Potential BGP issues

With many entities and devices involved in BGP networks, there are numerous points where issues can arise. BGP’s known weaknesses in authentication and verification of routing claims can lead to problems. Here are some common challenges businesses face:

BGP route misconfigurations

Advertising routes that cannot carry traffic is called “blackholing”. If you advertise a part of the IP space owned by someone else, and your advertisement is more specific than the owner’s, internet data intended for that space will be directed to your border router. This will effectively disconnect the blackholed address space from the rest of the internet.
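The reason a more specific advertisement wins is longest-prefix matching: when several routes cover a destination, routers forward along the one with the longest prefix. The sketch below demonstrates this with Python's standard `ipaddress` module; the `best_route` helper and the route labels are hypothetical.

```python
import ipaddress
from typing import Iterable, Optional, Tuple


def best_route(
    destination: str,
    routes: Iterable[Tuple[str, str]],
) -> Optional[str]:
    """Return the label of the longest-prefix match for destination, if any.

    routes is an iterable of (prefix, label) pairs.
    """
    addr = ipaddress.ip_address(destination)
    best_len = -1
    best_label = None
    for prefix, label in routes:
        net = ipaddress.ip_network(prefix)
        # Membership is False across IP versions, so mixed tables are safe.
        if addr in net and net.prefixlen > best_len:
            best_len = net.prefixlen
            best_label = label
    return best_label


routes = [
    ("203.0.113.0/24", "legitimate owner"),
    ("203.0.113.0/25", "misconfigured network"),  # more specific
]
inside_25 = best_route("203.0.113.10", routes)
outside_25 = best_route("203.0.113.200", routes)
```

Traffic to `203.0.113.10` follows the /25 to the misconfigured network, while `203.0.113.200`, which lies outside the /25, still reaches the owner through the /24.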

BGP route hijacking

Route hijacking involves using another network’s valid prefix as your own, potentially causing severe network disruptions. Most route hijacking on the internet results from unintentional misconfigurations. While malicious intent is possible, a simple configuration typo is often the cause.

BGP route flapping

Route flapping happens when a router initially advertises a destination network through one route, then quickly switches to another or alternates between “available” and “unavailable” status. This forces other routers to recalculate routes, consuming processing power and potentially affecting service.
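A basic flap detector can count how many times a route changes state within a sliding time window. The sketch below is one simple approach under assumed defaults (more than four transitions in five minutes). The function name, thresholds, and event format are illustrative, and this is not the route flap damping algorithm that routers implement.

```python
from collections import deque
from typing import Deque, Iterable, Tuple


def is_flapping(
    events: Iterable[Tuple[float, str]],
    window_seconds: float = 300.0,
    max_transitions: int = 4,
) -> bool:
    """Return True if the route changes state too often within any window.

    events yields (timestamp_seconds, state) in time order, where state is
    "available" or "unavailable". Repeated identical states are ignored.
    """
    transitions: Deque[float] = deque()
    previous_state = None
    for timestamp, state in events:
        if previous_state is not None and state != previous_state:
            transitions.append(timestamp)
            # Drop transitions that have aged out of the sliding window.
            while timestamp - transitions[0] > window_seconds:
                transitions.popleft()
            if len(transitions) > max_transitions:
                return True
        previous_state = state
    return False


unstable = [(0, "available"), (10, "unavailable"), (20, "available"),
            (30, "unavailable"), (40, "available"), (50, "unavailable")]
stable = [(0, "available"), (1000, "unavailable"), (2000, "available")]
flapping = is_flapping(unstable)
not_flapping = is_flapping(stable)
```

The first sequence changes state five times in 50 seconds and is reported as flapping. The second changes state twice, but the changes are far enough apart that they never share a window.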

Infrastructure failures

Hardware and software errors, configuration mistakes, and communication link failures (e.g., unreliable connections) can result in route flapping and other issues. Reachability information may be repeatedly advertised and withdrawn. A frequent failure scenario is when a router interface experiences a hardware problem, causing the router to alternate between “up” and “down” announcements.

BGP and DDoS attacks

BGP hijacking can facilitate DDoS attacks, in which an attacker impersonates a legitimate network by using another network’s valid prefix as their own. If successful, traffic may be redirected to the attacker’s network, effectively denying service to the user.

Learn more about Kentik’s BGP monitoring tools

Visit our BGP route monitoring solutions page to learn more about how Kentik’s BGP monitoring features can help you visualize, optimize, and secure BGP routing for your networks.

Conclusion

This blog post is just a preview of some of the features of Kentik’s new BGP monitoring solution. We’d love to hear from you about other use cases we can solve for. Please reach out (here or via your account team) today if you’d like to set up a conversation with our product and engineering team.

Sign up for a free trial today to start using BGP monitoring capabilities in Kentik.
