Exploring the Latest RPKI ROV Adoption Numbers
In this blog post, BGP experts Doug Madory of Kentik and Job Snijders of Fastly update their RPKI ROV analysis from last year while discussing its impact on internet routing security.
Last year, we published two pieces of analysis that assessed where we were with RPKI ROV adoption (RPKI is Resource Public Key Infrastructure, ROV is Route Origin Validation). This routing security technology continues to be the best defense against accidental BGP hijacks and origination leaks. For it to do its job (rejecting RPKI-invalid routes), two steps must be taken: ROAs must be created, and ASes must reject routes that aren’t consistent with the ROAs.
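The second step, deciding whether a route is consistent with the ROAs, can be sketched in a few lines. This is a minimal illustration in the spirit of the origin validation procedure described in RFC 6811, not any particular router's implementation. The Roa class, the validate_origin function, and the documentation prefixes and AS numbers below are all assumptions made up for this example.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Roa:
    """Hypothetical, simplified ROA record (one prefix per ROA)."""
    prefix: str      # authorized prefix, e.g. "203.0.113.0/24"
    max_length: int  # longest prefix length the ROA permits
    asn: int         # AS authorized to originate the prefix


def validate_origin(route_prefix: str, origin_asn: int, roas: Iterable[Roa]) -> str:
    """Classify a route as 'valid', 'invalid', or 'not-found'."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Only a ROA of the same address family that covers the route counts.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # A covering ROA matches if the origin AS agrees and the route
        # is no more specific than the ROA's maximum length.
        if roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    # Covered by at least one ROA but matched by none: invalid.
    return "invalid" if covered else "not-found"


# Hypothetical data: AS64500 may originate 203.0.113.0/24, nothing longer.
roas = [Roa("203.0.113.0/24", 24, 64500)]

right_origin = validate_origin("203.0.113.0/24", 64500, roas)
wrong_origin = validate_origin("203.0.113.0/24", 64501, roas)
too_specific = validate_origin("203.0.113.0/25", 64500, roas)
no_roa = validate_origin("192.0.2.0/24", 64500, roas)
```

In this sketch, only the first route is valid. The wrong-origin and too-specific routes are invalid and are the ones an ROV-enforcing AS would reject, while the route without any covering ROA is not-found and is typically still accepted.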
In the first piece, we looked at the state of ROA creation through the lens of traffic volume using Kentik’s aggregate NetFlow. What we found was that although only a minority of BGP routes had valid ROAs, the majority of traffic sent on the internet was destined for those routes. We had made more progress than most people realized!
In the second, we looked at the propagation of RPKI-invalid routes and found that they experienced a dramatic drop in circulation compared to valid or unknown routes. When the “blast radius” of problematic routes is reduced, so is the potential for connectivity disruption as a result of an origination leak.
How have things changed from last year? Let’s take a look at the numbers.
Measuring ROA creation with NetFlow
As mentioned above, our NetFlow analysis from the spring of 2022 on the state of ROA creation showed that despite only a third of BGP routes having valid ROAs (34.89% for IPv4 and 34.28% for IPv6), we were seeing a majority (56.4%) of internet traffic (bits/sec based on aggregate NetFlow) going to RPKI-valid routes — a far more optimistic picture.
Since our analysis last year, the share of BGP routes with valid ROAs has climbed continuously and, at the time of this writing, stands at 43.17% for IPv4 and 45.17% for IPv6. These new figures represent increases of 8.28 and 10.89 percentage points, respectively. Below is NIST’s chart of BGP routes over time that are evaluated as RPKI-valid (green), RPKI-invalid (red), and RPKI-not-found (yellow) for those BGP routes without a ROA.
So in the past year, the share of BGP routes evaluated as RPKI-valid grew, in relative terms, by roughly a quarter for IPv4 and a third for IPv6! Given this increase, we would expect to see a change in the share of traffic to RPKI-valid BGP routes, so let’s take a look.
As expected, the percentage of internet traffic to RPKI-valid BGP routes increased, from 56.4% last year to 62.5%! As was the case with our analysis from last year, these numbers are driven by major RPKI deployments at both large content providers (Amazon, Google, Cloudflare, Akamai, etc.) and large access networks (Comcast, Spectrum, etc.). These networks are responsible for the lion’s share of traffic exchanged on the internet, which has become only more concentrated in recent years.
If we assume steady growth in the share of BGP routes with ROAs, those routes should become the majority about a year from now (May 2024). Mark your calendars!
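The arithmetic behind these figures is easy to check. The sketch below uses only the percentages quoted in this post, and it assumes the two measurements were taken one year apart and that growth continues linearly. Both are simplifying assumptions, and the function names are made up for illustration.

```python
# Share of BGP routes evaluated as RPKI-valid, in percent (from this post).
IPV4_LAST_YEAR, IPV4_NOW = 34.89, 43.17
IPV6_LAST_YEAR, IPV6_NOW = 34.28, 45.17


def growth(before: float, after: float) -> tuple:
    """Return (percentage-point gain, relative gain in percent)."""
    points = after - before
    relative = points / before * 100.0
    return points, relative


def years_to_majority(before: float, after: float, years_between: float = 1.0) -> float:
    """Linear extrapolation: years until the share crosses 50%."""
    rate_per_year = (after - before) / years_between
    return (50.0 - after) / rate_per_year


v4_points, v4_relative = growth(IPV4_LAST_YEAR, IPV4_NOW)
v6_points, v6_relative = growth(IPV6_LAST_YEAR, IPV6_NOW)
v4_years = years_to_majority(IPV4_LAST_YEAR, IPV4_NOW)
v6_years = years_to_majority(IPV6_LAST_YEAR, IPV6_NOW)

print(f"IPv4: +{v4_points:.2f} points, +{v4_relative:.1f}% relative, "
      f"majority in ~{v4_years:.2f} years")
print(f"IPv6: +{v6_points:.2f} points, +{v6_relative:.1f}% relative, "
      f"majority in ~{v6_years:.2f} years")
```

Under these assumptions, IPv4 crosses the 50% mark in a little under a year and IPv6 in under half a year, which is consistent with the May 2024 estimate being set by the slower of the two.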
Reduction of propagation of RPKI-invalid routes due to RPKI
The second part of last year’s analysis sought to better understand the rejection of RPKI-invalid routes. Multiple groups have attempted to enumerate which ASes reject RPKI-invalid routes, which is a tricky endeavor. Instead of trying to determine precisely which ASes reject invalids, we chose to measure how route propagation differs between RPKI-invalid routes and routes of other types.
Our conclusion was that “the evaluation of a route as invalid reduces its propagation by anywhere between one-half to two-thirds.” This really hasn’t changed much in the past year, in part because the impact was driven by tier-1 backbone providers such as Arelion, Lumen, NTT, and Cogent rejecting invalids. Due to the immense scale of these backbone providers, they end up shielding much of the internet from RPKI-invalid routes.
We see the impact of RPKI on the propagation of invalid routes every day. Kentik’s BGP visualization captures the drop in reachability (see upper stacked plot below) as propagation drops when a route becomes invalid and starts getting rejected. This recently occurred during yet another instance of an all-too-frequent prepending typo, like this one from April:
It also occurred during the recent mysterious daily BGP hijacks of Iranian networks by AS41689 of Iran, many of which are RPKI-invalid (and are therefore getting rejected). Note that ROAs cover 97% of Iranian IPv4 space:
If an AS doesn’t reject RPKI-invalid routes, but its transit providers do, it is almost as if it does, too. Unless, of course, the invalid routes are arriving over a peering connection, circumventing transit.
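A toy model makes this shielding effect concrete. The sketch below is a deliberately simplified breadth-first walk over a made-up topology: it ignores real BGP policy (route preference, valley-free export rules) and only tracks which ASes end up accepting an RPKI-invalid route. The AS names, the exports mapping, and the function name are all hypothetical.

```python
from collections import deque


def ases_accepting_invalid(origin, exports, rov_enforcers):
    """Return the set of ASes that accept an RPKI-invalid route from origin.

    exports maps each AS to the neighbors it announces routes to.
    ASes in rov_enforcers drop the route and never pass it on.
    """
    accepted = set()
    seen = {origin}
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        for neighbor in exports.get(current, ()):
            if neighbor in seen:
                continue
            seen.add(neighbor)
            if neighbor in rov_enforcers:
                continue  # rejected here, so nothing downstream learns it this way
            accepted.add(neighbor)
            queue.append(neighbor)
    return accepted


# Hypothetical topology: neither stub validates. stub_a hears routes only
# from its transit provider; stub_b also has a peering path to the leaker.
exports = {
    "leaker": ["transit", "peer_net"],
    "transit": ["stub_a", "stub_b"],
    "peer_net": ["stub_b"],
}

with_rov = ases_accepting_invalid("leaker", exports, rov_enforcers={"transit"})
without_rov = ases_accepting_invalid("leaker", exports, rov_enforcers=set())
```

With the transit provider rejecting invalids, stub_a never sees the route even though it does no validation itself, while stub_b still receives it over the peering path.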
This brings us to Cloudflare’s recent analysis from December. Their report observed a “very low effective coverage of just 6.5% over the measured ASes, corresponding to 261 million end users currently safe from (malicious and accidental) route leaks.”
Let’s contrast their observation with our conclusion that RPKI ROV dramatically reduces the propagation of RPKI-invalid routes. These assertions might seem contradictory, but they are not. It is certainly the case that, numerically, very few ASes reject RPKI-invalids (Cloudflare’s observation), and, at the same time, RPKI-invalids experience a severe propagation penalty (our observation).
Cloudflare refers to our observation as “limit(ing) the blast radius” through “indirect validation”:
In other words, the methodology used focuses on ROV adoption by end-user networks (e.g., ISPs) and isn’t meant to reflect the eventual effect of indirect validation from (perhaps validating) upper-tier transit networks. While indirect validation may limit the "blast radius" of (malicious or accidental) route leaks, it still leaves non-validating ASes vulnerable to leaks coming from their peers.
Unless a leak originates from a highly peered network like Cloudflare’s, problematic routes will need to traverse large transit providers to have a widespread impact. That is why having backbone providers rejecting RPKI-invalid routes is highly beneficial for the health of the global internet.
There are a few unheralded successes every day due to RPKI. Take this example from January. Pakistani incumbent PTCL began leaking 188.8.131.52/24, which was a more-specific of a route announced by Orange’s domestic network in France (AS3215), 184.108.40.206/17. This was probably an internal route that was accidentally leaked onto the internet. Still, since Orange had a ROA for this route, most backbone carriers automatically rejected the bogus Pakistani route.
In Kentik’s BGP visualization below, we reported that only 25% of our BGP sources observed the problematic more-specific route, which without RPKI would have been globally propagated (i.e., 100% of BGP sources). When our analytics picked this up, we contacted Orange, who alerted PTCL of the leak, and they stopped announcing it.
In this case, had one additional backbone carrier been rejecting RPKI-invalids, the propagation percentage would have been as low as 1% (just peers of PTCL that do not reject invalids). The lower the percentage, the lower the potential for disruption.
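The propagation percentage used here is simply the share of BGP vantage points that observed a route. As a rough sketch, with invented collector names and counts chosen only to reproduce the 25% and 1% figures above:

```python
def propagation_pct(sources_seeing_route: set, all_sources: set) -> float:
    """Percentage of BGP sources that observed a given route."""
    if not all_sources:
        raise ValueError("need at least one BGP source")
    # Count only observers that belong to the known set of sources.
    seen = len(sources_seeing_route & all_sources)
    return 100.0 * seen / len(all_sources)


# Hypothetical population of 400 BGP sources.
all_sources = {f"collector-{i}" for i in range(400)}

# A quarter of them see the leaked more-specific route.
observed = propagation_pct({f"collector-{i}" for i in range(100)}, all_sources)

# If one more backbone carrier filtered, only a handful would remain.
with_extra_filtering = propagation_pct({f"collector-{i}" for i in range(4)}, all_sources)
```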
To be sure, RPKI ROV does not alone solve the security issues facing BGP. A determined adversary can still forge an AS path to create a route that would be evaluated as RPKI-valid, as was the case in last year’s Celer Bridge attack.
When discussing the impact of RPKI ROV, we intentionally point to its benefits against accidental hijacks due to typos and leaks, such as the PTCL example above or the leak of a Russian hijack against Twitter last year.
Yet to be fielded, BGPSEC is the technology intended to prevent the impersonation of ASes through forged AS paths. However, since BGPSEC’s protection will only extend through ASes that are BGPSEC-aware, its benefits are limited to a subset of the internet. Despite this limitation, it is important to understand why BGPSEC will still offer protection in a partial deployment scenario.
Returning to the Celer Bridge incident, let’s look at the problematic AS path of that BGP hijack:
… 1299 209243 14618
If Amazon (AS14618) and Arelion (AS1299) were BGPSEC-aware, Arelion would have ignored the above announcement because the signed route would have been preferred. The BGPSEC-verified announcements governing the same IP space passed directly between Amazon and Arelion would have trumped the unverified announcement from the phony Amazon. The hijack route would not have been selected, and the attack would have been prevented without any other AS needing to be BGPSEC-aware.
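The selection logic described above can be sketched as a local policy that ranks cryptographically verified announcements ahead of unverified ones. This is an illustration only: BGPSEC is not deployed, the Announcement class and select_route function are invented for this example, the prefix is a documentation placeholder rather than the prefix involved in the incident, and the sketch only compares announcements for the same prefix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Announcement:
    """Hypothetical view of a route as received by one AS."""
    prefix: str
    as_path: tuple       # leftmost = neighbor, rightmost = origin
    bgpsec_valid: bool   # True if every hop's signature verified


def select_route(candidates):
    """Prefer BGPSEC-verified announcements; break ties on shorter AS path."""
    return min(candidates, key=lambda a: (not a.bgpsec_valid, len(a.as_path)))


PREFIX = "203.0.113.0/24"  # placeholder prefix for illustration

# As seen by Arelion (AS1299): a signed route straight from Amazon, and the
# unsigned route from AS209243 that claims AS14618 as its origin.
direct = Announcement(PREFIX, (14618,), bgpsec_valid=True)
forged = Announcement(PREFIX, (209243, 14618), bgpsec_valid=False)

chosen = select_route([forged, direct])

# Verification outranks path length: a longer signed path (via made-up
# AS64500 and AS64501) still beats the shorter unsigned one.
long_signed = Announcement(PREFIX, (64500, 64501, 14618), bgpsec_valid=True)
chosen_long = select_route([forged, long_signed])
```

Because the verification state is compared before path length, the forged announcement loses regardless of how short its AS path appears.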
As with RPKI ROV, adoption by major cloud providers and network service providers alone can severely limit the efficacy of AS impersonations by greatly restricting the propagation of those harmful routes. Partial deployment is still worthwhile because it immediately protects the deployer.
Getting there will take time. The progress described above on the adoption of RPKI ROV took many years, but if we are to ever secure BGP, we must keep marching forward.