Telemetry Now  |  Season 2 - Episode 58  |  September 25, 2025

Tracking the Red Sea Cable Cuts with Kentik’s Cloud Latency Map

Doug Madory joins us to unpack the recent Red Sea submarine cable cuts and how Kentik’s Cloud Latency Map revealed the global impact in real time, offering critical insight into cloud performance, interconnectivity, and internet resilience.

Transcript

In today's episode of Telemetry Now, we'll be talking about one of our favorite subjects: submarine telecommunications cables.

We'll take a macro view of how these cables connect public cloud providers and, of course, service providers across vast distances around the world. We'll also look at some new ways to get visibility into this activity, such as traffic metrics and indications of trouble like cable breaks.

Joining us again is Doug Madory, director of Internet analysis at Kentik. My name is Philip Gervasi, and this is Telemetry Now.

So what happens when a handful of undersea cables are severed and suddenly the entire Internet, or at least a big chunk of it, slows down across countries, regions, maybe entire continents?

So with me today is Doug Madory, no stranger to the podcast. In fact, he's been a guest host and a guest many times now. So, Doug, really happy to have you back. It's been too long.

Today we're going to be talking about undersea cables, and specifically about your analysis of the Red Sea cable cuts in the context of a Kentik tool, the cloud latency map, and how it helped you. So, Doug, without taking any thunder away from you, I'd like you to take a moment to introduce yourself. I know a lot of folks know who you are, but maybe you could shed a little light on what you've been doing lately.

Yeah. Thanks. Happy to be here. So I am the director of Internet analysis for Kentik, and I have the best job in the world. I have a ton of data to find cool stuff to analyze and talk about, and this is yet another example.

One of the areas of interest I've had in my career in Internet measurement, going back fifteen or sixteen years, is looking at the impacts of submarine cable activations and cable cuts. I've covered a number of cable activations, starting with Cuba, where we were the first to publish independent evidence of the activation of the cable, all the way up to Saint Helena a couple of years ago.

And then cable cuts can be really dramatic events for Internet traffic. They show up in a lot of places, and they show up in a lot of our data, so they're worthy of analysis. We just had another incident, which we're going to talk about today, in the Red Sea. It's another in a long line of submarine cable cut autopsies that I've been digging into for a number of years now.

Yeah. You and I have talked about it a few times, and I've read all of your coverage over the years as well. But there's something in particular about the Red Sea. You've written about the cables at the bottom of the Red Sea several times now. So if you don't mind, I'd like to start with that. What's going on with the Red Sea that lends itself to these kinds of issues?

Sure. If you zoom out for a moment and think about how maritime traffic goes from point A to point B, and just look at where ships go, they follow a lot of the same paths that the submarine cables traverse, because those are the most direct paths between important cities, and the Internet is wired in a similar way. So you end up with the same choke points that exist for international maritime traffic, including the Suez Canal and some points in Southeast Asia.

These are also the choke points of the submarine cable industry. You have a lot of cables concentrated in a single area, and they are prone to cable breaks, sometimes multi-cable breaks. So there's an analogy here with maritime traffic. What makes the Red Sea particularly problematic is that you have a lot of vessels, cargo vessels and tankers, queuing up to wait their turn to traverse the Suez Canal.

And while they're waiting, either in the Red Sea or in the Mediterranean, they're dropping anchor and camping out, and those anchors are the number one cause of cable cuts around the world. The Red Sea is relatively shallow water as far as open seas go. For most of the submarine cables in the world, the depth of the water actually affords a certain level of protection because they're just too far down. You can't drop an anchor five miles down.

There's some limit to how deep an anchor will go. But in the Red Sea, you very well may be putting the anchor right on the sea floor. And if the ship moves, it may drag the anchor and snag a cable. I think the first time I wrote about Red Sea cable cuts was an anchor drag that occurred in February 2012.

This knocked out a lot of connectivity in East Africa and the Middle East. This was back in my Renesys days. In my analysis at the time, I was taking a glass-half-full view of it, because at that point in African connectivity there hadn't been that many submarine cables, and we were heartened by the fact that a lot of providers were able to stay online despite the fact that cables were cut. There was a lot of Internet disruption at that time, but it was a bit of a mark of progress that you could survive, to some extent, an incident like this.

But there's been a steady beat of cable cuts in that area, and then we had another one last year that got a lot of headlines.

If you recall, the circumstances were that you have a war in Gaza, and a Houthi government in Yemen that is sympathetic to the Gazans and is firing missiles into Israel and also firing missiles, rockets, and drones at passing ships. As we mentioned earlier, this is a choke point of maritime traffic. There's a lot of traffic going through this body of water. Where Yemen sits on the Arabian Peninsula, they're at a good vantage point to take potshots at a lot of vessels going by, and they were doing so.

So they were able to strike a ship. The Rubymar was disabled. The crew was rescued, and then the ship was abandoned. It had dropped an anchor, and as the ship kept moving it dragged its anchor and ended up cutting a handful of cables. I published something last year in collaboration with Wired magazine, who had pieced together satellite imagery and AIS data.

This is the location-tracking data that's on most vessels of a certain size.

We were able to line that up with the timing that we saw of the cable cuts. I spent a lot of time figuring out what time each cable got cut, because you can infer this based on which customers were affected and the ownership of the cables. There are some inferences you can make there. And later, one of the cable operators did actually confirm the timing in the course of writing that. So we published that last year.

And then here we are again, just a couple of weeks ago. We have another anchor drag. There's a ship that's the suspect. I don't want to name it because I'm not sure if it's a rumor or not, but it does look like it's another ship, a tanker, that dragged an anchor across a couple of cables, and here we are again with connectivity problems in the Middle East and India based on the cables that were cut.

Right.

So the cause of these disruptions is primarily, or maybe entirely, the result of an accident, or maybe some sort of lack of paying attention to what you're doing.

Yeah. Maybe accident mixed with negligence. I don't know. I'm not there. I don't want to cast aspersions here.

Yeah. No. I understand what you're saying.

But as opposed to a malicious attack. Although there was something like that last year, but that's sort of been ruled out as the actual root cause.

That wasn't the intention.

It is something on the tips of a lot of people's tongues right now: how many cable cuts are the result of some sort of intentional attack or sabotage?

And there really are vanishingly few confirmed cases that we can attribute to something intentional. It's a possibility. I can't sit here and say that no one would ever do this on purpose. But there were cable cuts around Taiwan that were blamed on Chinese fishing vessels. I think they actually did arrest and prosecute a ship captain in recent months, and it's kind of been blamed as intentional. Fishing is a whole other topic here, but to some extent we're depleting a lot of the traditional fishing areas of the world.

And as the ships have to migrate to more places, they have to take more risks, or maybe not abide by what's marked out on nautical maps telling them to stay out of an area. And with some of the fishing techniques, they're not just dragging a net. They're trawling, dragging something along the seafloor, capturing every life form, then pulling it up, picking out the things they want, and throwing everything else out. That trawling is another practice that poses great risk to submarine cables, and not everybody on board a fishing vessel really cares that much about submarine cable health.

They want to get their haul and head home. So in that case, it sounds like it could very well be a fishing accident.

Could somebody dress up an intentional act as an accident? I don't know. It's just very hard to find cases where this is intentional. And that's a debate.

That's an ongoing debate. I was on a panel just a few weeks ago with some folks from the submarine cable space and also some national security figures, and I would say the opinion from the industry is that this is a somewhat overblown threat, that cables break all the time and they fix them all the time, and that there are probably better ways to go about sabotaging another country than going after the submarine cables.

So this kind of speaks to the idea that there's very little public information, especially when you have a cable break and it's on the news. Maybe more information starts to come out from folks like you, among other people who are able to do some sort of analysis. But that's what's going to lead to a lot of those rumors. And, of course, a lot of people like to jump to those conclusions because that does catch more eyeballs in the news. Right?

I mean, that's our nature as humans.

If someone's out doing something intentionally, then it's going to attract our attention, and rightfully so. We would want to know that.

Yeah. You're getting to something: I kind of got on a soapbox about this at the end of the last blog post, and it's something I've gotten on a couple of times.

And I have to say, as somebody who has covered this for a long time, I'm both a beneficiary and a critic of this phenomenon where very little information comes out about these cables. Contrast that with the Internet industry that we're in. Cloudflare is probably the most extreme example: if they have any kind of outage, you can guarantee there's going to be a blog post the next day to describe it, and I think that's considered professional practice because others can learn from your mistakes.

Every time there's some big outage, there's going to be some kind of postmortem published with some level of technical detail. A lot of us often feel it's just not enough. We want more detail, and they're going to be limited in what they can share, but you can read between the lines to understand what took place.

But I'd say there's a certain social expectation within our industry, and that's a good thing. It helps us heal and helps us learn from each other. In 2021, we had the big Facebook outage. There were a couple of Amazon outages.

There was a Fastly outage. There were a lot of big outages that year and then a lot of postmortems, and I'd like to think that we learned a lot as an industry from some of those, because we haven't had a year quite as bad as that one since. We've learned a lot about out-of-band communication and about how automation can go awry.

I think we've improved, and that helps the industry get better. There's no equivalent in the submarine cable space. It is the opposite. The way it works, you've got an organization that's the cable operator, and they sell service, generally, to the telecoms of the various countries where the cable lands. They will communicate to the operators that are buying service when there's stuff going on, such as maintenance.

In fact, there's quite a lot of information. On a cable, especially one that runs between Europe and Asia, there's a lot happening: a lot of maintenance, something down over here, some kind of issue over there. Every day, there's something.

There's actually a steady stream of news, and the cable operators communicate that to their customers. But they're pretty adamant that they don't have a responsibility to publicly acknowledge anything beyond that, and that it's really the responsibility of the telecom that's buying the service to make any kind of public pronouncement about a cable incident.

And so then you end up with this patchwork. In this one, I think we first saw a couple of the impacted cables named when PTCL out of Pakistan put out a press release saying that people in Pakistan should expect some kind of Internet slowdown for some period of time due to a couple of cables that were cut. Two cables were named, and everybody started talking about those two cables. I started digging into it with my industry contacts.

It turns out there were four cables, and there's more to it. We end up with this patchwork of coverage, and it's just not a healthy space. Then you had Microsoft put out a notice to their customers in the Middle East and South Asia that customers may experience some increased latency as they route traffic around these breaks. A totally understandable thing.

And then you look at the coverage, and everything's like, Microsoft got whacked, and we're sitting here at Kentik with all this telemetry data saying, they all got whacked. It's not just Microsoft. Microsoft is the only one who tried to be responsible and put out some communication about this. We got nothing from the other cloud providers. If you didn't understand how this works and you looked at the coverage, you'd think, oh, something must be screwed up with Microsoft.

From all the other guys, nothing came out. Well, they all got impacted. And I've gotten pushback on my reporting over the years.

There's a LinkedIn group. I'm not that active in it nowadays, but there was a time when I was really active in this submarine cable stuff. And I remember getting pushback at one point when I had published on another cable break somewhere in the world, detailing the impacts and how people rerouted around it.

I find this interesting, but it's also good for our collective understanding of how the Internet operates and how significant any one of these incidents is.

And someone from the industry responded and said, you are violating the NDA that was signed between the cable operator and the telecom customers by putting any of this out in public. And I'm sitting there like, that is a fascinating perspective.

Because I didn't sign any NDA, and this is a public event. I've got my own data, so I don't need to violate anyone's trust. But that sentiment exists, that this is not to be spoken about publicly, and that we should just let the cable operators do their thing.

And I think it's different when it gets to the point where you've got headlines and it's really impacting countries. Another bit of pushback I'd get in asking for greater transparency is, hey, listen, we have ten things a day on all these different cables. There's, like, a thousand.

You don't want a flood of information. That's true. I don't want a flood of information.

So there has to be some sort of common-sense threshold for when this crosses the line: everybody's talking about it, and you've got multiple cables down due to some sort of anchor drag. When do we start communicating to the public? And if the cable operators don't want to do it, then the next option, in my opinion, is the ICPC, the International Cable Protection Committee.

This is an overarching organization that represents all the submarine cables in the world, and they do a lot of phenomenal work advocating for cable safety. They do a lot of myth busting, and there's a lot of work to be done there.

And they had a new administration come in a few years ago that was supposed to be a little more open and public. I guess the cable operators don't want to be in a position like Microsoft was, where only one cable operator says something, and now their cable sucks and everybody else is fine. So to do it in a fair way, maybe the ICPC, which I think has a high level of trust in the community and already collects a lot of detailed data about cable breaks for its own statistics, could do it.

Could they be trusted to be a mouthpiece for the industry when there's an event that's bigger than a bread box, like this one, which is capturing a lot of attention? I think it would be really helpful for them to say, hey, there are actually four cables. This is when it happened. We think it was an anchor drag.

Here are the basics. That would be, I think, a good service, because in the meantime, I see a lot of weird stuff out there. And, like I said, it's good for me, because people call me to ask about these incidents.

I had a fair number of interviews with outlets in the Middle East on this. But I wish there were a similar level of postmortems about what's going on, because it would help the general understanding of submarine cable safety.

Sure. We'll see if that happens.

We'll see if that happens. But especially because it's not an isolated, small thing. We all rely on the Internet. I'd like to transition in a moment to the cloud latency map, but thinking about visibility, all the cloud providers rely on these same cables, for the most part.

Yeah. That's what I've foot-stomped in the last few years. We look at AWS and Azure, and these are hyperscalers with just phenomenal resources, lots of data centers, and amazing people who work at these places, but they have to use the same submarine cables as everybody else, as us little people.

And so when there's a cable break, it shows up in their stuff. These guys are staying online. I'm not saying that, in general, we see these guys go down.

They've done the work of setting up the backups and the backups to the backups. The level of resilience built into the big three is incredible. But it's also visible and measurable.

Last year, we launched this tool called the cloud latency map. I had already been digging into this. We have agents running in every region of every cloud in the world.

We have hundreds of these things.

And we set up a full mesh where every agent is pinging every other one. So you have, whatever it is, ten or twenty thousand individual measurement pairs for a full mesh of measurement, and this catches a lot of stuff. I had already started using it in covering submarine cable cuts. We'd say, alright.

Here are the telecoms in these countries that got impacted. But, hey, you can also see that between cloud regions in Western Europe and the Middle East or South Asia, the latencies get whacked as they're routing around the outage. We'd see this.

It gets caught in these measurements, and so we figured we could build a tool that ferrets this stuff out and surfaces what's interesting, because on any given day, there's stuff happening. And then on a day like this, where you have multiple cables cut, there's a lot happening, and a lot of latencies are going through the roof as traffic gets rerouted. So as soon as that happened, I was like, alright.

This is a great task for the cloud latency map, because we can pull up regions in Western Europe to regions in the Middle East or regions in South Asia, and ask: did any of them experience any kind of latency event in the past seven days? And this thing will come back and say, well, here they are.

Then you can zero in on what was impacted. And back to the earlier conversation, we could see impact for Microsoft, but every cloud provider that we track had some sort of adverse event, as you would expect, as they tried to stay online despite the loss of four submarine cables that traversed the Red Sea.
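
For readers who want a feel for the scale of a full mesh like the one described above, here is a minimal sketch of the arithmetic. The agent labels are illustrative placeholders, not Kentik's actual agent names, and the episode does not state the exact agent count; the sketch only shows how quickly the number of measurement pairs grows with the number of agents.

```python
from itertools import permutations


def full_mesh_pairs(agents):
    """Return every ordered (source, destination) pair of distinct agents."""
    return list(permutations(agents, 2))


def pair_count(n_agents, directed=True):
    """Number of measurement pairs in a full mesh of n_agents.

    directed=True counts A->B and B->A separately, as a ping from each side would.
    """
    total = n_agents * (n_agents - 1)
    return total if directed else total // 2


# Hypothetical agent labels (cloud:region), used only for illustration.
agents = ["aws:eu-west-1", "azure:uaenorth", "gcp:asia-south1"]
pairs = full_mesh_pairs(agents)

print(len(pairs))       # 3 agents give 6 directed pairs
print(pair_count(150))  # 150 agents give 22350 directed pairs
```

Under these assumptions, an agent count in the low hundreds lands in the same range as the ten to twenty thousand pairs mentioned in the conversation.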

And we're talking about hard metrics. We're not talking about qualitative things, like the interviews with your contacts that you mentioned. There's certainly room for interpretation of metrics, and then there's some inference. I know what you do for a living. There is definitely analysis applied to the metrics.

So there is a little bit of room for that qualitative aspect, but they are hard metrics. Listening to you, it sounds like service providers are very interested in this kind of data. Folks that pay attention to submarine cables are very interested in this kind of data. Folks like you that look at things on a macro, global scale are very interested in this kind of data. But shouldn't network operators at, say, a large enterprise that relies on AWS or Azure also be interested in this kind of data if they're hosting resources in multiple regions and moving data across, well, we're talking about the Red Sea, but across any large body of water, say from a US region to a Europe region? Wouldn't they be concerned about that too?

So with the cloud latency map, we're just measuring region to region. Someone could look at that and say, well, who gives a crap about traffic between Marseille and Mumbai? What workload depends on that? I think that's the wrong way to look at it. We're casting a very wide net here by looking at this full mesh.

Now, an organization like the one you just mentioned is going to have some level of dependency on cloud resources. I think we can agree that it's not going to be zero.

So the objective with the cloud latency map is not to tell you how connectivity to your cloud workloads got impacted. However, that is something you could do. We can't build this to know everybody's situation. So it's partly meant to spur the imagination: from your campus networks to the cloud regions that you depend on, you could have the same kind of visibility and ferret out what's happening. And that's important, because the cloud is lifeblood to a lot of infrastructure these days.

Right. Absolutely. And it's the same underlying technology: a mesh, or partial mesh, of synthetic testing agents, whether it's running internally on your behalf in your own public cloud instances or, in the case of the cloud latency map, at a macro scale, where we chose to do it in the big three and, I believe, also IBM and Oracle.

And Alibaba.

Alibaba. Oh, okay. Great. But it's to test this macro perspective of inter-cloud connectivity, right?

That's right. Like I said, we can't know everybody's situation.

It's still useful.

Yeah. And it's free. You don't have to log in or anything.

Just visit the site. If it's not interesting, don't worry about it. But there are a million permutations of queries you can run here to look around. We've published a couple of insights that have come out of this.

Like I mentioned a second ago, Alibaba is kind of interesting. Looking at the difference in routing between their regions inside of China and those outside of China, they clearly have a different scheme for how they achieve connectivity to the outside world. Regions will exhibit similar latency events within China, and similar ones outside, but not across the two groups. They just have a different networking policy. Now, when there's a latency event or some spike in latency, is that due to a submarine cable cut or some sort of routing issue? We're not going to know, because this is purely a latency measurement.

You'd have to have some sort of out-of-band information. If you've got a trained eye like I do, from looking at latencies around the world for a while, sometimes you recognize what the jump you saw looks like. This is in the Far East, and now it's getting bounced off the West Coast of the United States.

You can tell just based on the latency that's involved, or some kind of hairpinning that's getting introduced that wasn't there before.

I think I can make pretty good educated guesses about a lot of the things that take place there, just from having looked at this for a long time.
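
One piece of the reasoning behind educated guesses like these is simple physics: light in optical fiber covers roughly 200 km per millisecond, so a jump in round-trip time puts a ceiling on how much extra path could have been added. The sketch below is a back-of-the-envelope illustration under that assumption. The function names are hypothetical, and real paths add queuing and equipment delay on top of propagation.

```python
# Approximate propagation speed in optical fiber: about two thirds of the
# speed of light in a vacuum, or roughly 200 km per millisecond (assumption).
FIBER_KM_PER_MS = 200.0


def min_rtt_ms(path_km):
    """Lower bound on round-trip time for a one-way fiber path of path_km."""
    return 2 * path_km / FIBER_KM_PER_MS


def max_extra_path_km(rtt_before_ms, rtt_after_ms):
    """Upper bound on added one-way fiber distance implied by an RTT jump.

    Assumes the whole increase is propagation delay; any queuing or
    processing delay would make the true added distance smaller.
    """
    return (rtt_after_ms - rtt_before_ms) / 2 * FIBER_KM_PER_MS


print(min_rtt_ms(10_000))           # a 10,000 km path cannot beat 100.0 ms RTT
print(max_extra_path_km(120, 220))  # a 100 ms jump allows up to 10000.0 km extra
```

A jump of that size is consistent with traffic being sent across an ocean and back, which is the kind of pattern described above, though latency alone cannot confirm the actual route.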

There is domain knowledge, for sure. But the cloud latency map itself isn't just a static page that I go and visit. There is an ability to filter and manipulate it a little bit to drill down into certain areas.

That's right. Initially, we had this as an internal tool just for my own analysis, and I was like, hey, why don't we open this up to the public?

It's a fairly simple concept. For example, years ago we would get a question, and maybe people listening have heard something like this: oh, something happened in Singapore. Can you tell?

And you're like, Singapore has a lot of connectivity. I don't know. You'd have to start digging into a lot of things. And if it's not an outage and it's just a change of routing, teasing out the subtleties of what's normal would be hard to do.

But in this case, you could just pull up Singapore to the world, meaning from all the cloud regions that are there to all of the other ones. We can scan through hundreds, if not a thousand, time series of latency data, and it'll try to estimate what the biggest impacts were in the last seven days. To my knowledge, there's not another way to get at that, certainly not a public tool that would quickly answer the question.
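
The idea of scanning many latency time series and surfacing the biggest impacts can be sketched as follows. This is not how the cloud latency map is implemented, since the episode does not describe its scoring. It is a simple stand-in that scores each series by its peak rise over its own median, and the region names and numbers are invented for illustration.

```python
from statistics import median


def impact_score(samples_ms):
    """Peak rise above the series' own median, in milliseconds."""
    baseline = median(samples_ms)
    return max(samples_ms) - baseline


def rank_pairs(series_by_pair, top_n=3):
    """Return the top_n (pair, score) entries, biggest latency impact first."""
    scored = [
        (impact_score(samples), pair)
        for pair, samples in series_by_pair.items()
        if samples  # skip pairs with no measurements
    ]
    scored.sort(reverse=True)
    return [(pair, round(score, 1)) for score, pair in scored[:top_n]]


# Invented sample data: one window of RTT samples (ms) per region pair.
window = {
    ("region-a", "region-b"): [100.0, 101.0, 99.0, 100.0, 180.0, 185.0],
    ("region-c", "region-d"): [10.0, 11.0, 10.0, 10.0, 10.0, 11.0],
}

for pair, score in rank_pairs(window):
    print(pair, score)
```

One caveat with a median baseline: if the elevated latency persists for more than half of the window, the median itself shifts and the score shrinks, so a production tool would likely compare against a longer history.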

And if you had some idea of the geography where something took place and some knowledge of how the Internet is laid out, you could pull it up and ask: did something happen in connectivity to Mumbai, or maybe South Asia? You can make it a regional thing.

Right.

We've got, different different ways to group up stuff. Or or is it, you know, AWS to Azure or so you can kinda go between clouds, anyway, or stay within a cloud. You know, I think, one thing that has always been an observation from this type of analysis is that, generally, the connectivity between clouds, like, if you're staying AWS to AWS, then that usually does better. It it like, it's, you know, the the same networking team is on both ends of the of the connection, and, the latencies are generally more stable.

They can be, you know, generally lower, or the you know, there's some theoretical minimum that they're gonna approach, but, when you when you cross, you go between, clouds, then then there's more room for, you know, two different networking schemes or pairing steering approaches, and there's some, you know, suboptimal routing that that can occur there when you're crossing between, but you see that a lot less when you're staying within a cloud. Anyway, the the same kind of thing shows up, but I even in the blog post looking at the Red Sea incidents, I went out of my way to point to, like, AWS to AWS, Azure to Azure, like, stuff that's staying in the cloud that normally is the most reliable also is experiencing impacts, because, you know, they were they were dependent on a cape on one particular cable.

They've gotta route traffic a different way, and there's a latency penalty that's incurred, and it'll probably go away sooner rather than later as they come up with another route they can use.
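The "theoretical minimum" mentioned above can be approximated from geography alone: light in fiber travels at roughly the speed of light divided by the fiber's refractive index, so the great-circle distance between two sites sets a floor on round-trip time. The sketch below is an illustration under stated assumptions, not something from the episode. The refractive index of 1.468 is an assumed typical value for silica fiber, the coordinates for Singapore and Frankfurt are approximate, and real cable paths are always longer than the great circle.

```python
import math

SPEED_OF_LIGHT_KM_S = 299_792.458
FIBER_REFRACTIVE_INDEX = 1.468  # assumption: typical value for silica fiber

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def min_rtt_ms(distance_km):
    """Lower bound on round-trip time over fiber for a one-way path length."""
    speed_in_fiber = SPEED_OF_LIGHT_KM_S / FIBER_REFRACTIVE_INDEX
    return 2 * distance_km / speed_in_fiber * 1000

# Approximate city coordinates, used only for illustration.
singapore = (1.35, 103.82)
frankfurt = (50.11, 8.68)

distance = great_circle_km(*singapore, *frankfurt)
floor_ms = min_rtt_ms(distance)
print(f"{distance:.0f} km, RTT floor {floor_ms:.1f} ms")
```

Comparing a measured latency against this floor gives a rough sense of how much of the delay comes from path stretch, such as a reroute around a cable cut, rather than from distance itself.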

Yeah. Well, let's dig into that a little bit. I'd like to hear you explain how you used the cloud latency map specifically, and you can get into as much detail as you like, Doug, to identify and then further analyze what happened in the Red Sea a few weeks ago. I know you use it for other things as well, but this is really interesting to me because it comes alongside our combined interest in submarine cables. It also rolls right into our reliance on cloud and that entire discussion, considering that the two are kind of joined at the hip. So how did you use the cloud latency map for this analysis?

Yeah. So like I said earlier, there are a couple of ways you can slice up the measurements and just say, I wanna look for any latency events from Western Europe to South Asia. Everything between those two ought to be going through the Red Sea, like, ninety-nine point nine nine percent. There's very little that's not gonna go through the Red Sea.

If you do it now, we've already moved out of the time window, because it's a sliding seven-day window. When we use links out of the cloud latency map, if you generate a link, that link is persistent. The view is persistent. But if you were to go to the site de novo right now, it's gonna be just the last seven days.

So you're not gonna see any of the events anymore. But during that time, when something took place, I'm like, all right.

Generally, you know, what region pair ought to have experienced something? And so then you can see it. Or, just from working on this particular case, if you pulled up the Middle East to anywhere, then that also would yield, I think, OCI in Jerusalem. I'm trying to remember off the top of my head, but there were some other regions in the Middle East that kinda got whacked to a lot of places, and you didn't have to specify the other end. It would just bubble up and tell you, like, well, is it this stuff?

And it doesn't do any temporal clustering. You gotta pay more for that kind of service. The tool just knows something took place in this seven-day period, so you'll have to use your eyes. In this particular case, it wasn't very hard, because everything that bubbled up at the top was occurring at almost the exact same time. That's kinda interesting, because I think these cables were probably cut at different, distinct times, but it does seem like there was one particular time that had the effect. I haven't done the work to try to isolate which cable that was, but it does seem like almost all the impact that we saw in the cloud latency map was at a single time.
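The "use your eyes" step, checking whether the top-ranked changes line up in time, is the kind of thing a simple temporal clustering pass would automate. The following is a minimal sketch, assuming you already have one detected change timestamp per region pair. The timestamps, labels, and the 30-minute window are hypothetical and are not taken from the actual incident.

```python
from datetime import datetime, timedelta

def cluster_by_time(events, window=timedelta(minutes=30)):
    """Group (timestamp, label) events into clusters of nearby timestamps.

    An event joins the current cluster when it falls within `window` of the
    previous event in that cluster, so clusters can chain across a longer span.
    """
    clusters = []
    for ts, label in sorted(events):
        if clusters and ts - clusters[-1][-1][0] <= window:
            clusters[-1].append((ts, label))
        else:
            clusters.append([(ts, label)])
    return clusters

# Hypothetical change times for four region pairs.
events = [
    (datetime(2025, 1, 1, 5, 45), "pair A"),
    (datetime(2025, 1, 1, 5, 50), "pair B"),
    (datetime(2025, 1, 1, 6, 5), "pair C"),
    (datetime(2025, 1, 3, 14, 0), "pair D"),
]

clusters = cluster_by_time(events)
for cluster in clusters:
    print(cluster[0][0], [label for _, label in cluster])
```

A large cluster at one timestamp is the pattern described above: many unrelated region pairs shifting together suggests a single underlying event.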

And so maybe it's, I don't know if it's SEA-ME-WE 4 or which of these four cables was the most in use. The other thing to point out in this particular incident is that there are also a lot of cables that survived, and not all connectivity was down. Actually, the more modern cables are the ones that survived.

So a lot of connectivity stayed online. I didn't really hear so much about issues in East Africa. There were some issues in India, and definitely UAE got whacked. But SEA-ME-WE 5, AAE-1, and the PEACE cable, the last two partially or in total funded by the Chinese, those are the latest, biggest cables, and they were unaffected.

And so I think a lot of connectivity stayed up. Probably a lot more stayed up than went down.

But you saw intra- and inter-cloud impacts, which was kind of indicative of something bigger going on, because you did mention earlier that intra-cloud traffic is gonna look a little bit better, with a little bit better latency, and we get that. But if there is a cable cut, it's gonna impact both. And so it's indicative of, okay, there's some sort of underlying root cause here that's affecting all traffic, period.

And I remember reading your blog post. Yeah, it affected the Gulf region. I mean, that kinda goes without saying, but it really did ripple out beyond just that region.

Right?

Yeah. In this case, I gave some examples of providers in India that we could see losing transit. The way it manifests itself is that a lot of these operators, the telecoms and mobile operators in the Middle East and South Asia, are gonna buy international transit, via submarine cable, from a handful of international carriers. And so in BGP data, you can see, and we've got tools to help us find this, the moment that they lose Arelion, they lose NTT, they lose some kind of upstream in routing.

And when multiple providers lose the same upstream at the same time, then there's probably some kind of common infrastructure event that's taken place, that's taken that source of bandwidth out of play. And then it shifts over to the others.
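The inference here, that several networks dropping the same upstream at about the same time points to a shared cause, can be sketched in a few lines. This is an illustration only and not Kentik's tooling. The input format, the function name `shared_upstream_losses`, and the 15-minute window are assumptions, and the AS numbers come from the range reserved for documentation (RFC 5398).

```python
from collections import defaultdict
from datetime import datetime, timedelta

def shared_upstream_losses(losses, window=timedelta(minutes=15), min_customers=2):
    """Find upstreams that several customers lost within a short window.

    losses is an iterable of (timestamp, customer_asn, upstream_asn) tuples.
    Simplification: the window is measured from the first loss seen for each
    upstream, so later, separate events for the same upstream are ignored.
    """
    by_upstream = defaultdict(list)
    for ts, customer, upstream in losses:
        by_upstream[upstream].append((ts, customer))

    findings = {}
    for upstream, rows in by_upstream.items():
        rows.sort()
        first_seen = rows[0][0]
        customers = {customer for ts, customer in rows if ts - first_seen <= window}
        if len(customers) >= min_customers:
            findings[upstream] = sorted(customers)
    return findings

# Hypothetical loss events using documentation-range AS numbers.
losses = [
    (datetime(2025, 1, 1, 5, 45), 64496, 64500),
    (datetime(2025, 1, 1, 5, 46), 64497, 64500),
    (datetime(2025, 1, 1, 5, 52), 64498, 64500),
    (datetime(2025, 1, 2, 9, 0), 64499, 64501),
]

findings = shared_upstream_losses(losses)
print(findings)
```

A single customer losing an upstream could be a local problem or a commercial change. The same upstream disappearing for several unrelated customers within minutes is much harder to explain without a common piece of infrastructure failing.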

The Internet's routing around the break. The only question then is, what does capacity look like? That's not something we can see from BGP, and latency isn't really a measure of capacity. But you usually do lose some capacity, if you use the plumbing analogy.

One of the pipes carrying the water is gone. Okay. Now, the other pipes, are they big enough to carry this? Are they provisioned?
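The plumbing question can be put as simple arithmetic: after a failure, does total demand still fit within the capacity of the surviving paths? The sketch below uses made-up cable names, capacities, and demand, and it assumes traffic can be spread freely across the survivors, which is rarely true in practice because capacity has to be provisioned and paid for.

```python
def utilization_after_failure(capacity_gbps, demand_gbps, failed):
    """Return demand divided by surviving capacity (above 1.0 means congestion).

    capacity_gbps maps a path name to its capacity, and failed is a set of
    path names that are out of service.
    """
    surviving = sum(cap for name, cap in capacity_gbps.items() if name not in failed)
    if surviving == 0:
        return float("inf")  # nothing left to carry the traffic
    return demand_gbps / surviving

# Hypothetical numbers chosen only to show the effect.
capacity_gbps = {"cable-1": 100, "cable-2": 100, "cable-3": 200}
demand_gbps = 250

before = utilization_after_failure(capacity_gbps, demand_gbps, failed=set())
after = utilization_after_failure(capacity_gbps, demand_gbps, failed={"cable-3"})
print(before, after)
```

In this made-up case, routing still works after the failure, but the remaining paths are asked to carry more than they can hold, which is the "online but degraded" outcome.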

I think sometimes we assume that a lot of this is purely automatic, and it usually ends up not being the case. Some of this routing, you know, BGP will react via its protocol on its own.

But will it be sufficient, and not incur some sort of congestion due to lack of capacity, or higher latency due to more geographic distance traversed?

Yeah.

BGP doesn't know. It's just trying to keep the routes up. And so you end up with things that could stay online, but the service could degrade.

And so, I don't know if I had this in the blog post or not, but in UAE, you've got Etisalat and du as the two main providers there. And we could see du, this is lowercase du, that's their retail name, suffer an outage and lose providers, and then within a couple of hours, and this is visible in BGP, they activate GBI, Gulf Bridge International, which is another regional player that actually has a terrestrial route from the Gulf, going from Al Faw in Iraq up through Iraq to Turkey. This is actually one of the few commonly used routes that avoids the Red Sea and the Suez, and GBI got activated.

They were not a provider of du before. Now they are, and it took a few hours, so there were probably some frantic phone calls taking place in that time. People talking to people, like, let's work out a deal to get this up.

And money probably changed hands, and they were able to establish the connectivity they needed. And so, yeah, that side of it is people. That's engineers and the business folks in those companies working on a deal to keep the packets flowing.

And so some of it is done automatically through protocols and technology, and then another side is just the business, people calling each other and having to establish some kind of emergency circuit to keep the lights on. And so we can kinda see a little of that as well.

Yeah. So, ultimately, you're gonna see some of these things with the cloud latency map with regard to latency, hence the name. But you referred to KMI in the blog post. Maybe you could speak to that a little bit. And now you're talking about global routing tables, looking at BGP, and understanding where prefixes are being advertised and pulled. So there is a greater picture where all of these things fit together to identify those macro changes in transit and how providers are rerouting, whether it's manual or automatic.

Yeah. So KMI, this is Kentik Market Intelligence. This is a tool that's strictly based on BGP data. The objective is to be able to inform the industry on changes in transit in any market.

So you'd pull up a country, and you can figure out who the retail providers are, who they're buying service from, and how that's changing from day to day. And then we've got a little thing on the side that says insights, and the insights are basically those changes. In this case, if you were to pull up that region, or any of the individual countries, and look at what took place in UAE as far as transit gain or loss, you'd find Etisalat in UAE losing the carriers they lost, and then du signing up GBI.

And these things, as they appear in BGP, we've got an automated process that's just going through this and updating it every six hours. This isn't an instantaneous thing, but it gives you some idea. In general, if you pull up that region around this time, you're gonna see a whole lot of routing violence, as we used to call it at Renesys.

Just a lot of convergence. You've got lost circuits, new circuits, and so there's a lot of rejiggering happening to keep connectivity up. And this is, I think, a way to make sense of all of the stuff that's happening in BGP.
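The transit gain and loss view described here boils down to comparing who each network's upstreams were in one snapshot against the next. The sketch below shows that comparison in Python. It is not how KMI works internally. The snapshot format and the function name `transit_changes` are assumptions, and the AS numbers are from the documentation range (RFC 5398).

```python
def transit_changes(previous, current):
    """Compare two snapshots mapping customer ASN -> set of upstream ASNs.

    Returns a dict of customers whose upstream set changed, with sorted
    lists of the upstreams they gained and lost between the snapshots.
    """
    changes = {}
    for asn in sorted(previous.keys() | current.keys()):
        before = previous.get(asn, set())
        after = current.get(asn, set())
        gained, lost = after - before, before - after
        if gained or lost:
            changes[asn] = {"gained": sorted(gained), "lost": sorted(lost)}
    return changes

# Hypothetical snapshots taken a few hours apart.
previous = {64496: {64500, 64501}, 64497: {64500}}
current = {64496: {64501, 64502}, 64497: {64500}}

changes = transit_changes(previous, current)
print(changes)
```

Run against real snapshots during an event like this one, a diff of this shape would show the pattern described for du: an upstream lost, and a new one such as GBI appearing shortly after.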

And so we try to boil it down into something that a person who has interest in this area can consume and understand, like, oh, okay, these guys lost these guys, and they got this other one. And that's happening all the time.

This thing is running all the time. You pull up any country in the world, and it'll tell you the things that are changing. It's built as a tool for sales prospecting, for transit providers to look at their competitors, who their competitors' customers are, and how that's changing. So this is kind of like situational awareness for transit sales.

But I have had a tool, either KMI or something like it, for many years, and I've made a lot of hay out of being able to very easily learn some kind of esoteric observations out of BGP that you could only gain from having, like you said, a macro-level AS-to-AS view, and being able to tally up what the changes are that are affecting country x, y, or z.

Yeah. But it's still certainly part of a larger ecosystem of visibility tools and approaches.

And that's really what's gonna inform your analysis. So, really interesting.

Absolutely. So I do wanna end with some lessons learned. For network operators, folks that are turning the physical and virtual wrenches and running networks, whether it's a large enterprise that relies on multiple cloud providers in multiple regions, or service providers, or folks anywhere in between, how does a tool like the cloud latency map help, specifically? Certainly, we just talked about it as part of an ecosystem, but how does a tool like the CLM, is that what we call it, by the way, the CLM?

Internally, yeah. Or the CLAM. The CLAM. Yeah.

Yeah. So I guess, depending on the footprint of your organization, you know, if you only operate in Scranton, Pennsylvania, this isn't gonna help you. But if you have a global footprint and you have interest in understanding connectivity around the world, then this is a very cheap way to get a little bit of situational awareness of what's happening, by casting this very wide net with all these full-mesh measurements between all the cloud regions.

Now, would that help your day-to-day operations? Maybe, but you're probably gonna need to instrument your own measurements from your own campuses to the workloads that you care about. I don't wanna oversell this, because if you want answers about your own network, you've gotta instrument your own network. So I think that's how I would phrase how somebody would make use of this.

Like, something's happening in some particular geography. Hey, check this, see if there's a latency impact, and what cloud operators are affected. I think another takeaway is, you know, the Red Sea. There are these choke points around the world, and I'm not sure we've solved these.

And maybe these aren't gonna get completely solved, as far as eliminating the risk imposed by choke points. In the conversations I had with the outlets from the Middle East, a question that comes up sometimes, and you see this in social media, is like, hey, we've got Starlink now. Why don't we just route everything into space?

And the main pushback there is that, as amazing as Starlink is, and I'm definitely on the side of saying this is an incredibly amazing technology, the total capacity of the entire global constellation is gonna be on the order of, like, maybe less than one submarine cable. So we just can't recreate a submarine cable out of Starlink.

Right. It definitely serves certain roles better than anything, but it's just not gonna be able to replace this. So we're gonna be left with something that's either submarine or terrestrial.

We've been at this Internet business for a few decades now, and it's very hard to compete with the dollar-for-dollar value of a submarine cable.

Even with the cable cuts, it's still worth it. And the other thing that happens with these choke points is that, because everybody's going the same route and everybody's gotta use it, it actually brings the cost down. You have this economy of scale, and then it becomes this inescapable gravity, where, sure, you could use a terrestrial route that's gonna avoid the Red Sea.

But do you wanna pay ten x? How do you feel about that risk of an outage that could be temporary? What premium are you willing to pay to avoid the possibility?

Yeah. Because I don't know what the GBI comparison price is, but I can assure you it's gonna be more. It's just the way these things work. And so what ends up being the most cost-effective thing is to put it on a submarine cable.

Everybody makes the same decision. The price ends up going down, and then you can never get out of that route. So, I don't know. Is this gonna be with us like the weather?

I think, to some extent, we do our best to avoid these catastrophic incidents, but I don't see a scenario where this is gonna get eliminated. So this is just general knowledge and awareness of how the global Internet operates.

Yeah. And certainly, this particular incident in the Red Sea, and others like it, does speak to the need, and you spoke to it already, for transparency, for more communication from cable providers, and maybe from cloud providers as well, just to address the issue of misinformation, people jumping to conclusions, which maybe are innocent enough, but perhaps could also cause ripple effects of other adverse activity. And considering that the Internet is the basis of our modern economy and, for many, our social structure, it certainly requires better communication, better openness, and, to an extent, a certain level of transparency.

I mean, you mentioned that there are certain things that you can't really divulge, but at least be open enough to communicate with those that are relying on your service, to let them know, hey, briefly, this is what's happened, this is the time to resolution, and that sort of thing, and resolve the issue of misinformation. And that's better for everybody.

And certainly, I see how the cloud latency map fits into that as far as visibility into what's really going on, regardless of what people say, among other tools, whether that's at Kentik or anywhere else for that matter. And then, of course, layered upon all of that is the incredibly vast amount of domain knowledge that resides within your brain, Doug, to interpret all of that. So we appreciate it. So, Doug, thanks so much for joining again, as always, and I look forward to having you on again really soon.

Thanks, Phil.

So, with that, thanks so much for listening to today's show. If you have an idea for an episode of Telemetry Now or you'd like to be a guest on the show, we'd love to hear from you. You can reach out to us at telemetrynow@kentik.com. So for now, thanks again for listening. Bye bye.

About Telemetry Now

Tired of network issues and finger-pointing? Do you know deep down that, yes, it probably is DNS? Well, you're in the right place. Telemetry Now is the podcast that cuts through the noise. Join host Phil Gervasi and his expert guests as they demystify network intelligence, observability, and AIOps. We dive into emerging technologies, analyze the latest trends in IT operations, and talk shop about the engineering careers that make it all happen. Get ready to level up your understanding and let the packets wash over you.