Telemetry Now  |  Season 2 - Episode 53  |  August 1, 2025

Adding Context to Network Telemetry with Data Enrichment

Phil Gervasi sits down with Kentik Product Marketing Manager Eric Hian-Cheong to discuss why data enrichment is the "secret sauce" that turns raw flow logs, metrics, and cloud telemetry into true network intelligence. They explore how tagging telemetry with human-readable context—such as customer names, app IDs, Kubernetes labels, and more—shrinks mean-time-to-insight, empowers cross-team troubleshooting, and lays the groundwork for AI-driven operations.

Transcript

What's one of the most important things that we need when we're analyzing our network telemetry? And I'm gonna give you a hint. It's not a prettier dashboard, although I'm not one to deny the value of a nice and pretty dashboard. But considering the sheer volume and diversity of information that we collect from today's modern networks, probably the most important thing that we can do is put all of that data into a relevant context, relevant to the network engineer for sure, but also relevant to the business itself.

So what does that CPU utilization, that metric really mean in the context of your network with the specific flows that you're ingesting, the firewalls that you're running, the routers that you have in Rack two in your standby data center, all the cloud services, all the different circuit IDs, all the different customers and applications that you have running, everything that's going on that is your network.

So with me today is Eric Hian-Cheong, a senior product marketing manager at Kentik working on a variety of technologies, but one in particular that I wanna focus on: data enrichment.

It's a term that you might know from more traditional data science and data analytics, but it's something that we do in networking as well to get us to that next level of network intelligence, to get us to that next level of context for all that network data that we collect.

Now I love talking about data and that might sound weird coming from a network guy, but this is definitely the world that we live in now. So stay tuned for a good show today. My name is Philip Gervasi and this is Telemetry Now.

Eric, thanks so much for joining the podcast today. You know, you and I have spoken many times, whether in person or over our company Zooms, but it is really great to have this kind of conversation and get into the weeds about what we're doing at Kentik and some of the interesting things that I think set us apart. So great to have you on.

And, you know, before we get started, I'd like to, for the audience's sake, just get a quick background of what your role is at Kentik, what you do here.

Thanks, Phil.

Yeah. So I guess I'll start by saying I'm one of the senior product marketing managers here at Kentik.

In that role, I primarily work on our NMS product, so metrics collection, as well as AI. But really, as the company shifts toward this idea of network intelligence and how all the telemetry that we gather from networks gets unified to tell a story, those product-specific lines are blurring a little bit, and I'm focused a lot more on the whole value that Kentik provides to our customers.

You know, product marketing is often described as the voice of the customer. We understand the competitive landscape, you know, what Kentik is actually trying to deliver for people.

But really, what I like to joke to people is that one of the key roles I play is as a value-added translator, you know, understanding what it is that we are doing from a technical standpoint and how it maps to what our customers actually do in their daily lives.

Okay. Interesting. I've actually never heard it put that way, so that's new to me.

So, you know, today what I want to get into is this concept of data enrichment specifically. And yeah, we're going to talk about how that relates to network intelligence.

And, of course, the goal of data enrichment, which is to kind of give us that context of what we really care about when we're looking at particular metrics or flow records and that kind of stuff. So before we get into that, like really deep at least, I think it's important to sort of level set on what the definition of enrichment is. And I know that the term might be familiar to some from like traditional data science, data analytics, but really it's the same idea except, you know, using it in the context of of networking, of network data. So why don't we start with that, Eric?

Sure. So when we talk about enrichment, we're talking mainly about what we do for enrichment on our flow product. You know, we take raw flow log data and we basically tag it. I'm a photographer, so I've got a huge library of images.

I've been shooting photos since I was in my teens, over fifteen years. I've got a library of one hundred and ninety thousand images. And I have to say, I am terrible at enriching them with metadata and tags. And it's starting to hurt.

When you've got one hundred and ninety thousand images and you're trying to look back on them and you're like, I know that piece of information is out there. I know that photo is out there. I know I can picture where I took it, but I cannot for the life of me find it anymore. And that's kind of the same problem that you have traditionally with flow data without good enrichment for it.

Right? It gives you things like IP addresses and protocols, but it doesn't tell you what's going on. It doesn't help you answer who's talking to whom, what does that conversation actually look like, and why does it matter to the business. And so enrichment is the act of taking other sources of information and attaching them to that flow record and saying, you know, if this were a photograph, I'd say this was taken in Paris, France, not just the date, time, and the data of the photograph, but where it was taken, why I took it.

So it's attaching things like the application that may have actually sent the information. We pull in third-party information like host names, so that instead of an IP address, you have a human-readable thing that says this is a web server, or this is a public DNS server.
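For readers who want to see the idea concretely, here is a minimal sketch of what enrichment at ingest can look like, assuming made-up lookup tables for host names, applications, and customer prefixes. The field names and tables are illustrative only, not Kentik's actual schema.

```python
import ipaddress

# Hypothetical lookup tables used to add context to raw flow records.
HOSTNAMES = {"10.1.2.3": "web-server-01", "8.8.8.8": "public-dns"}
APPLICATIONS = {443: "https", 53: "dns"}
CUSTOMER_PREFIXES = {ipaddress.ip_network("10.1.2.0/24"): "customer-a"}

def enrich(flow: dict) -> dict:
    """Attach human-readable context to a raw five-tuple flow record."""
    enriched = dict(flow)  # keep the raw fields as-is
    enriched["src_host"] = HOSTNAMES.get(flow["src_ip"], "unknown")
    enriched["dst_host"] = HOSTNAMES.get(flow["dst_ip"], "unknown")
    enriched["application"] = APPLICATIONS.get(flow["dst_port"], "other")
    src = ipaddress.ip_address(flow["src_ip"])
    enriched["customer"] = next(
        (name for net, name in CUSTOMER_PREFIXES.items() if src in net),
        "unassigned")
    return enriched

raw_flow = {"src_ip": "10.1.2.3", "dst_ip": "8.8.8.8",
            "src_port": 51514, "dst_port": 53, "protocol": "udp", "bytes": 120}
print(enrich(raw_flow))
```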

So in thinking about network observability and network intelligence, I mean, just seems like we're always adding more data to what a network engineer is utilizing. So I know years and years ago, I'm in some sort of operational role, right? Whether I was a VAR engineer working in enterprise, I'm looking at whatever's on my screen, some SNMP information, or maybe it's flow, but it's usually just that. You know what I mean?

It's just flow and it's, you know, five tuple stuff. So it's useful or it's just SNMP and I'm looking at memory utilization, whatever it happens to be. And then like in my brain, in my mind as a human being, I'm piecing it together, you know, like the tribal knowledge of like, okay, so I know that that IP goes to that switch over in closet two, right? It's in my mind.

Or I'm checking documentation. But in any case, everything is like completely disparate data. It's not joined together. I really feel like that's what we've been doing with observability and then now with network intelligence even more so.

And enrichment, yeah, we can get into database architecture and how we ingest and do stuff on the fly with our ETL pipeline and all that cool stuff. But it's really just taking that raw telemetry. So you talked about flow, and that's kind of the heart of what Kentik is all about: flow. But of course, we do streaming telemetry and SNMP and everything else now.

And we add the context. That's actually from one of our lead solutions engineers here at Kentik. He always uses that term. He almost hits me with that term, because I'll say metadata and he's like, no, it's context.

But that's the idea: we're wrapping that raw telemetry in an overall umbrella of context, the application, its geographic IDs, regions, or something relevant to the business. You know what I mean?

Yeah. And I'm glad you actually mentioned metrics and, you know, SNMP and, of course, streaming telemetry as kind of the future evolution of that. Because in my mind, those are actually also kind of a form of enrichment in and of themselves.

Okay.

If you consider what you just said, these things used to be in disparate tools. Even when you had a solution that may have had the ability to show you traffic information and collect SNMP metrics, oftentimes those were separate tools, maybe accessed through a common web interface, but fundamentally different in how they looked and felt and operated and collected data, with different databases even, sometimes. And so one of the things that we set out to do is use those metrics as experiential flow enrichment, if you will.

Right? So it's maybe not necessarily adding a specific piece of metadata joined in a columnar database somewhere to a specific flow log, but instead it is putting those things side by side so that you can look and say, alright, my metrics are telling me what's happening. I see interface utilization is above whatever threshold I set, and it's causing performance issues downstream for end users.

But why is that? So I'm seeing kind of the what, but then the why is on the flow side. Or vice versa, I'm seeing tons of traffic going over this interface, and I wanna know, is that causing the problem of interface utilization going high?

A lot of times, the causal direction between what may be going on in the network isn't clear. So take, for instance, a situation where a device is struggling for some reason. Its interface utilization is through the roof and it's starting to drop packets. And... Mhmm.

You know this is causing some sort of performance issue downstream. Is that the result of a huge amount of legitimate traffic that is for some reason being routed over that interface, perhaps because there's an issue somewhere else in the network? Or is the device itself truly buckling? Is it, I don't know...

Is a cooling unit out in the data center, and it's thermally throttling itself? The direction of causality is not always clear unless you have both of those things at your fingertips, side by side, to pivot fast between them.

They've always been available. They've just not always been available as fast as we'd like it to be. Right? You have to go on over here and find a piece of information. Then you have to go log in to another system, find another piece of information over there.

Oh, darn, I forgot my password. Or that was, oh, Joe runs that tool. I've got to go ask Joe to get that piece of information, right?

Yeah.

Yeah. And the thing is that in the olden days it was based primarily on tribal knowledge and the sneaker net and really just your gut intuition and gut instinct as an engineer, and just knowing that the red wire is the one that goes to building two over in closet whatever. And so we actually did everything that you just said, putting everything together. It's just that we did it literally in our brains, manually.

And I think the idea of data enrichment, which we take from traditional data science and use here in network observability and in networking more broadly, gives us the ability to do all of that stuff programmatically and get away from that manual clue chaining of thinking to yourself, all right, here are some IP addresses, but I know which DNS servers they are because I've tagged those IPs. However they exist in the database, there are correlated elements in our larger dataset, or maybe in multiple data sets, that allow us to understand how these things relate to each other.

So I know that, like, this particular DNS server is primarily used by this particular customer, and we've kind of dedicated it that way. And that's not gonna be in a configuration somewhere. You know what I mean? It's not gonna be in a flow record or in some SNMP trap.

But we can add that information, sometimes qualitative, sometimes quantitative, to the database of flow records. And so it sits alongside it. So when we do our query programmatically with our system, there it is alongside that data, not just in Phil's head or in Eric's head.

I think that's the real key here. And that's the real power that data enrichment in network telemetry gives us, this ability to have that greater context. You know, I know that a lot of the time we have that base flow. Right?

So we have our source and destination. We have our ports and protocols and all that kind of stuff. We get some TCP flags. We we see that all there in our database.

But then, like you said earlier, well, who's the customer?

What building is this in? Or what department is this? Or what circuit ID? And then, if we have that other information, we could start to do something more sophisticated, like tying things to cost. Like, why is that traffic going over that transit gateway and not over this direct connect, or whatever it happens to be? I don't know.

But I like the photography analogy, because that makes a lot of sense. There are so many pictures in my house from the olden days when we used actual film cameras. Sorry if that's not old for you, Eric. But I look on the back of the picture, and there's nothing there. And I'm like, all right, I recognize me, but I don't know who these other people are.

So I lose the context. I lose the meaning of that picture at that point.

It's actually funny you mentioned that. I'm actually probably one of, if not the last generation of photographers to have actually learned and started on film.

You know? Not to age myself, but I did start shooting on a film camera. But yes, mostly it's my parents where I'm like, I've gotta go back through all the photos and ask them to tell me what all these things are, because I don't remember, you know, from my childhood. Yeah.

But I wanna dig into one thing that you said there. I think you kinda started talking about cross-team collaboration, basically, saying, like, these are different people who have different interests in the network. I think that's actually a really good point to dig into a little bit. You know, enrichment, like you said, it's all about context and being able to apply a human, or more importantly, a business understanding of what your network is telling you.

And especially as we've seen the increase of different types of application architectures. Right? You know, the network is way more complicated, infinitely more complicated, than it was when a lot of these concepts that we still rely on originated. You know?

Mhmm. I was talking to a customer a couple of weeks ago, and they were pointing out that the flow enrichment that we have, and specifically the ability of some of our AI tools to understand and help triage and troubleshoot problems, has been really invaluable to their SRE team. And you're like, SRE? They're not a typical network data consumer.

But, you know, they don't understand that. They don't understand IP networks beyond a superficial level. But they were trying to figure out, without going through the whole network team, why an application, a hybrid cloud application, was not performing the way that it should. And so they were piecing together the flow logs and stuff like that, using context to be able to understand.

And I don't remember the specifics on exactly where the issue was. But the point was, it was actually the SRE team who was digging into Kentik in order to understand whether the network was actually the problem with the application's performance.

Yeah. Yeah. Yeah.

I did not think of that when I said that earlier. So when you say I mentioned a thing, I didn't actually know I was mentioning it. So that's interesting that you bring that up. But you're right. I wasn't thinking cross-function or cross-team necessarily. But you can enrich your flow records with things like threat feeds or information about logins and stuff. So you actually have a security component, which is not a traditional, pure network engineering role.

But if there is a security breach not always, but very, very often it's going to manifest itself in some sort of a slowdown in an application or maybe a hard down.

If it's more sophisticated, maybe it's just traffic piggybacking off of some protocol. I get it. But even then, it's the network. Even then, you want to know what ports and protocols are being used over here with this customer on this port and protocol. And then you're able to identify the difference. So you're right.

There's a cross-function and cross-team opportunity that happens when you have this more sophisticated observability as a result of enriching your data. I mean, I'm also thinking about when I first started, I was configuring interfaces and OSPF. I was not configuring Kubernetes namespaces. I didn't know that existed. In fact, I don't think it did exist, now that I think about it.

But certainly, you kind of get what I'm saying here: you get these folks that are maybe more on the developer side, and they're just focused on that space, on that workload side. And so what we can do is grab those Kubernetes namespaces or labels that we're using for various pods, workloads, whatever it happens to be, and add that. Because it is relevant to the network. There are still network components running around, going from one cluster to another cluster and perhaps from one region to another region.

So we want to be able to monitor that and say, why is my latency so high? Latency on what? On this particular namespace. So it's really interesting that you picked up on that.

I wasn't thinking it. But now that you have, that does make a lot of sense. I always default to security a lot because I did a short stint as a network security person, and I hated it. I didn't like it at all.

I wanted to get back to routing and switching, and at the time, wireless. But in any case, yeah, it very much adds those additional dimensions to the data, not just to make it more complex, but also, I was thinking, to make it more queryable. So you were talking about a customer just now who was able to query the data in a better way, in a way that was more relevant and more useful for them, because it had, I'm assuming, that many more accurate, relevant tags in their data set. So not only were they querying IPs, they could start querying, I don't know about your customer, but I'm just using a hypothetical.

They could start querying by account number or querying by customer name or whatever it happens to be. Right?

Yeah. That's exactly right. It was all because they had that context. And this idea of human readable context too. Right?

Like, not just numbers that are relatively opaque or that, like you said (good point), may exist only in the tribal knowledge of the handful of network engineers who set it up. They're like, oh, I remember that IP address goes to such and such host because I was down there and I configured it four years ago.

And I've got a photographic memory. So the SRE team doesn't know that. They don't know that.

And then, you know, something else that popped into my mind as you were talking was, the inverse of it too is network engineers.

You know, I don't know to what extent this is more about our Kentik-specific ability to dig into overlay headers in, like, VXLAN and additional encapsulation and stuff like that. But one of the things that I've also heard from customers is, this is the network engineer saying, I built this network. You know? And by network, I'm talking about the underlay, the physical routers and switches and stuff like that.

And then the DevOps team came and stuck an overlay on it because, you know, my network wasn't architected in quite the right way for what they were trying to do. And now I'm stuck managing this thing. I've got this VXLAN that I didn't implement. I don't really fully understand how it was built and implemented.

But now people are coming to me to say, why is the thing not working? And I'm like, I don't know. So that's kind of the flip side: network engineers, core network people, also finding a lot of value in the enrichment to be able to understand.

It's almost like shadow networking. You know? It's like that talk about security, the talk about shadow IT back a decade ago, where people were just bringing their own devices and plugging flash drives in willy-nilly and stuff like that.

It's kind of funny to see that a similar kind of thing is starting to happen in networks.

Yeah. Yeah. And now we have shadow AI, right, where people are using AI tools when they shouldn't be.

I don't know what you're talking about.

No. No. We'll move on. Moving on very quickly.

So, you know, I'm gonna stick with Kentik because that's what we know the best. But just as a kind of recap, Kentik ingests the classic flow data, any kind of exported flow stuff, so NetFlow, sFlow, IPFIX, but also, we've been talking about cloud, so cloud flow logs from AWS and Azure and Google, whatever. But we're also going to ingest streaming telemetry and SNMP.

And so there's like your classic telemetry.

Maybe the foundation is flow, maybe not whatever. But that's like kind of that classic core of telemetry.

The enrichment part is when we start adding all that metadata, the context, to help us get to, you know, mean time to insight, mean time to context, whatever you want to call it. Right.

So what is that metadata? What is that context? You mentioned a few things we talked about like Kubernetes namespaces. We talked about like adding a circuit ID or a customer name or a device ID, whatever it happens to be.

Did we cover everything? Is there anything else that's specific that you wanted to add to that list? What else are we enriching that classic kind of telemetry? What else are we enriching it with?

Yeah.

Well, there's actually a huge amount of stuff that we can add. Things like routing, next hop, ultimate exit, geo IP, you know, literally, physically where things are coming from, application names. You know, like, is this traffic the traffic we're using to record this? We're on Riverside.

Right? Like, is this traffic going over Riverside? You know, things like that.

So there really is a huge amount of things that we've built. And, you know, that's one of the strengths, I will say, of Kentik: we've spent over a decade figuring out which of those pieces of telemetry that come in are meaningful to understanding network traffic. Right? And one of the big things that we also do is custom dimensions.

So being able to empower a user to just add their own tags that are meaningful to their own business context. Right? The names of your SD-WAN links, customer IDs, you know, literally, is this customer A or is this customer B? Yeah.

Is this traffic specific to a GPU cluster versus, you know, just an edge router? Right? Or, you know, all sorts of different things.

The sky's really the limit when it comes to what we can do. Mhmm.
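For illustration, here is a small, hypothetical sketch of what custom-dimension tagging rules might look like, assuming a simple rule list keyed on flow attributes. The rule format is an assumption for the example, not Kentik's actual custom dimensions feature.

```python
# Hypothetical custom-dimension rules: each rule matches one flow attribute
# and stamps a business-specific tag onto the record. Illustrative only.
CUSTOM_DIMENSION_RULES = [
    {"match_field": "device", "match_value": "edge-rtr-nyc-01",
     "dimension": "site", "tag": "NYC-edge"},
    {"match_field": "dst_port", "match_value": 6443,
     "dimension": "workload", "tag": "gpu-cluster"},
    {"match_field": "src_ip", "match_value": "203.0.113.10",
     "dimension": "customer", "tag": "customer-a"},
]

def apply_custom_dimensions(flow: dict) -> dict:
    """Stamp any matching custom dimensions onto an enriched flow record."""
    for rule in CUSTOM_DIMENSION_RULES:
        if flow.get(rule["match_field"]) == rule["match_value"]:
            flow[rule["dimension"]] = rule["tag"]
    return flow

flow = {"device": "edge-rtr-nyc-01", "src_ip": "203.0.113.10",
        "dst_port": 6443, "bytes": 9001}
print(apply_custom_dimensions(flow))
# -> tags added: site=NYC-edge, workload=gpu-cluster, customer=customer-a
```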

You know, like so many tools and so many platforms, what you get out of it is what you put in. Right? I mean, not everyone has taken advantage of all of this enrichment. It's really contextual, based on your business needs.

Yeah. No, I'm sure there are a lot of folks that are still very siloed. I use this tool for flow.

I use this tool for my SNMP. I use this other tool. And then I have these spreadsheets with all my circuit IDs that I just Control-F or Command-F or whatever. But yeah, imagine putting it all together. And you just went through the custom dimensions, dimensions being basically the filters, whether they're dimensions in the sense of actual attributes of the data, like metrics, or derived dimensions; an average would be a derived dimension.

You don't ingest an average. You get all your metrics, and then you mathematically figure out your average. So you have a lot of options there.
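As a quick illustration of that raw-versus-derived distinction, here is a tiny sketch that computes an average (a derived value) from ingested interface metrics. The metric names are made up for the example.

```python
# Raw metrics are what you ingest; derived dimensions are computed from them.
# Field names here are invented for illustration.
samples = [
    {"device": "rtr-1", "ifc": "ge-0/0/1", "util_pct": 62.0},
    {"device": "rtr-1", "ifc": "ge-0/0/1", "util_pct": 71.5},
    {"device": "rtr-1", "ifc": "ge-0/0/1", "util_pct": 88.0},
]

# You never ingest "average utilization"; you derive it from the raw samples.
avg_util = sum(s["util_pct"] for s in samples) / len(samples)
print(f"derived avg_util_pct = {avg_util:.1f}")  # -> 73.8
```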

So I do want to get into the technical piece a little bit more here, because I know that a lot of folks are thinking, I'm assuming: when you keep adding all of this data, and Kentik has a columnar database, you keep adding more columns and rows, you keep adding more fields to search.

I'm a network engineer, right? And I'm working on some problem in real time. Maybe something is down, so it's mission critical real time.

And I got to go search a database of all this stuff.

That sounds cool in theory that I can do that, especially if it's like some kind of historical analysis. But I can't imagine that that's gonna be efficient or fast to use in in real time. You know what I mean?

Well, so you'd be surprised.

Okay. Because that's probably the number one thing that Kentik is known for: not only the richness of our data, you know, it's unsampled flow coming in, streaming in and stuff, but also the speed and performance of the Kentik data engine, which is really the V12 under the hood that drives this thing.

And I am not gonna pretend to be an expert on our database architecture. I know that there's some slicing and sharding happening in there. I know we've got some very high performance clustering. But what I will say is, and I pulled a few stats recently.

We are currently ingesting, and this is a ballpark number, I basically went on to one of our data center clusters and looked at one of our edge routers and did a backward calculation, we're currently pulling in two hundred twenty-six terabytes of raw flow data from our customers per day.

Per day?

Yep.

That works out... Now, I don't know exactly how much of this we are storing long term. That all depends on the customers, the retentions, all that stuff. But in two thousand twenty-four, it was over five hundred forty trillion unsampled flows that we pulled in.

So, you know, that's not including all the other telemetry that we do ingest.

I don't mean enriched data. I mean the other telemetry that we ingest, you know?

Correct. The flows. You know, the two hundred twenty-six terabytes of data that comes in per day, that probably includes some of the other telemetry in there. It's probably... Yeah.

A fraction. That's a tremendous volume.
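As purely illustrative back-of-the-envelope arithmetic on the ballpark figures quoted above, and nothing more precise than that, those two numbers imply roughly the following:

```python
# Rough, illustrative arithmetic only, using the ballpark figures from the conversation.
flows_2024 = 540e12          # "over five hundred forty trillion" unsampled flows in 2024
raw_per_day_tb = 226         # "two hundred twenty-six terabytes" of raw flow data per day

flows_per_day = flows_2024 / 366          # 2024 was a leap year
flows_per_second = flows_per_day / 86_400
bytes_per_flow = raw_per_day_tb * 1e12 / flows_per_day  # very rough; the daily figure
                                                        # may also include other telemetry

print(f"~{flows_per_second / 1e6:.0f} million flows per second on average")
print(f"~{bytes_per_flow:.0f} bytes of raw data per flow record, order of magnitude")
```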

So, you know, customers still say that, wow, Kentik is one of the fastest, highest performing ways to slice and dice flow data and query against it. And then when you start to add on the fact that we are innovating rapidly in bringing artificial intelligence to this type of data, which has never been done before, the power is just so, so rich. And it works really, really fast, because, you know, time is money.

And, you know, maybe not so much for a cost or architecture optimization, do-it-quarterly, do-it-every-couple-of-weeks type of workflow. But certainly when you're troubleshooting some sort of performance issue. Right? And you're trying to figure out, like, why is this thing down?

Or why is my mobile banking app not working, you know, and I'm losing... Yeah.

I used to joke to people, you know, time is money: what happens when the Domino's mobile order app goes down for five minutes during the Super Bowl? Right? People don't wait around for the Domino's app to come back online. They go and they order Papa John's. Right?

Well, maybe not you.

Me neither.

Well, of course, because you're from New York and you're Italian. Yeah. Well, okay. You order.

But I understand the spirit of your example. Now, that's also presupposing that it's a network problem.

Sure, sure, sure.

It could be an application problem itself. But the thing is, how do folks actually get to applications in twenty twenty-five? How many applications live on your computer locally? Like four.

And all the rest are accessed over a network. So by and large, the network is more critical to application performance than it ever has been. You know what I mean? That's why traditional APM and traditional NPM are really joined at the hip today, whereas in yesteryear they were more standalone.

But that's not the case anymore. And I know that for what we do at Kentik, we don't do table joins of whatever metadata against flow records when you're doing your query, because that would take a long time. So I have to assume that a crux of why queries are so fast is that we enrich that base, classic telemetry with these additional dimensions, or whatever it happens to be, at the point of ingest.

Yes. Yeah. That's right. I failed to mention that. Talking about troubleshooting in real time and performance, that is a foundational reason why... Yeah.

The platform is so fast. That's where I was going with it. Because we're not doing those table joins and that kind of thing literally at runtime, at the time of query.

It's all already done. So the query, especially the way we do it in Data Explorer, is very SQL-like. You're going to put all your different filters together, and you can have a very complex query including all those dimensions, and hence get a really interesting question into the system and therefore a very relevant and interesting answer with all those aspects.
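To picture the difference between ingest-time enrichment and query-time joins, here is a minimal, hypothetical pandas sketch: the context is attached once as records land, so a later query is a simple filter over a wide, already-enriched table rather than a join. Column names and tag values are invented for the example.

```python
import pandas as pd

# Context attached once, at ingest, so the stored record is already "wide".
flows = pd.DataFrame([
    {"src_ip": "10.1.2.3", "dst_ip": "8.8.8.8", "bytes": 120,
     "customer": "customer-a", "site": "NYC-edge", "application": "dns"},
    {"src_ip": "10.9.9.9", "dst_ip": "151.101.1.1", "bytes": 90000,
     "customer": "customer-b", "site": "SJC-colo", "application": "https"},
])

# Query time: no join against a separate metadata table, just filter and
# aggregate over columns that were written at ingest.
by_customer = (
    flows[flows["application"] == "https"]
    .groupby("customer")["bytes"].sum()
)
print(by_customer)
```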

Yeah. You know, another reason that queries are so quick with Kentik is, and this gets a little bit into the architecture, but I'll leave it high level, we actually generate two tables, basically, for each customer, each device, each data point: a full one and a fast one.

And, you know, as the names probably suggest, the full one is everything. It's your whole historical table going back to when the customer started with Kentik. And then the fast one is more of an aggregate look. It does a little bit of deduplication.

It does some roll-ups, and it really only looks back about the last twenty-four hours, so you have that really fast query access when you're looking at real-time or short-term past data. But you can always fall back on that full data set when you're doing your cost optimization, historical, network-architectural type of capacity planning workflows. Right?
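Here is a toy sketch of the fast-versus-full idea, assuming a simple time-bucketed roll-up over roughly the last day. This is not Kentik's actual table layout, just a way to picture aggregated short-term data sitting next to the full history.

```python
import pandas as pd

# Toy illustration: the "full" table keeps every record; the "fast" table is a
# rolled-up view of roughly the last 24 hours. Not Kentik's actual schema.
full = pd.DataFrame({
    "ts": pd.to_datetime(["2025-07-31 09:00", "2025-07-31 09:02",
                          "2025-08-01 09:01", "2025-08-01 09:03"]),
    "customer": ["customer-a", "customer-a", "customer-a", "customer-b"],
    "bytes": [100, 250, 400, 900],
})

now = pd.Timestamp("2025-08-01 10:00")
recent = full[full["ts"] >= now - pd.Timedelta(hours=24)]

# Roll up to hourly buckets per customer: fewer rows, faster short-term queries.
fast = (recent.set_index("ts")
              .groupby("customer")
              .resample("1h")["bytes"].sum()
              .reset_index())
print(fast)
```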

Yeah. That makes sense. So what you're saying is that there are built-in mechanisms that allow folks to query data at full granularity and fidelity, which we know is inherently gonna be slower, and what we've done is develop another mechanism to make it faster with the fast table.

However that works structurally underneath is fine. That's interesting stuff for the data geeks. But from a network engineer's perspective, I can run a query and get answers back at lightning speed. And that's really what I care about.

I don't... you know?

And to be clear (Exactly), the user doesn't have to, you know, if I'm a Kentik customer, I don't pick between the fast table and the full table. Right? It's contextual, based on the query you're trying to do. Mhmm.

And the time frame you're looking at. If Kentik needs to go to that full table in order to answer your question, it will. Mhmm. But, you know, if it can be answered with the fast table, you're gonna get your answers using the fast table much faster.

Okay. So if I'm doing a query and I'm doing something in, let's say, Data Explorer. Right? So I'm doing a manual query and piecing it together.

Or I'm doing it through, like, Journeys, where it's conversational. That's the AI product. So it's a large language model that allows me to do that same query, but with natural language. Either way.

I can do a query, and in the same query talk about flow data and a transit gateway, and it'll build a query with those multiple dimensions in there?

Yeah, actually, that's a really good point.

And this goes back to unifying all these sorts of network telemetry, not just flow data, not just SNMP metrics. Your network is not just the stuff that you racked and stacked, your top-of-rack switches, your (Yep.) edge routers. Right? It's the full delivery chain of the digital ecosystem.

Right? It's the thing that allows my computer to convert electrical signals to light waves to send to my router, which converts them back to electrical signals, which get converted to light signals over fiber, and then basically the whole thing gets unpacked in real time on your end. Right? And at some point, like I said, we're using Riverside to record this.

At some point, this is transiting through the cloud. Right? I don't know where Riverside is, whether they're in AWS or Azure or whatever. But, right?

Like, having visibility into that, if I were Riverside, is critical to making sure that this whole application ecosystem works. And that's one of the big things that we've worked a lot on: being able to link cloud context and cloud networks. Because there are networks in clouds. You know, people don't always think of clouds as having networks.

It's like it's in the cloud. Mhmm. But, of course, like, there's data passing, you know, from VPC to VPC and, you know, VPC to on prem and back and forth. Right?

Like, that's that's all important to be able to query and understand side by side in in context with each other.

Yeah. Yeah. I mean, if I've got a front end in AWS, a back end in Azure, and a user on prem somewhere going over an SD-WAN, and then maybe throw in a colo, it's all for the same application. Right?

It's all to get to the same data somewhere, you know, all the QuickBooks in the sky. It relies on the networking to connect all that stuff. So, yeah, you better believe there's networking in the cloud. And not even the cool networking like the underlay stuff that we like to talk about, like how people are using SONiC and all these overlays.

It's really cool. But just for everyday use, it's completely reliant on the network. So when I get deeper into this stuff, it really starts to feel more and more like data analysis and data engineering. And it's just so interesting, because I've always looked at Kentik as kind of a data analysis platform that happens to do network telemetry.

You know what I mean? Like, yeah, we are focused on the network, of course. And so, therefore, that's what the data is all gonna be. But that's also how things are built.

It's built around how those types of data work. So that's fine. But really, at its core, it is very much a data analysis platform. And now I wanna warn the audience.

Eric actually said to me before we started recording, we probably shouldn't talk about AI. Let's stay focused on enrichment. And I'm like, Okay. But I got to bring up AI because I talk to a lot of folks about their own AI initiatives, whether they're network teams or IT more broadly.

Sometimes it's traditional machine learning, by the way, and not necessarily an agentic workflow that's very sophisticated and brand new. Although you can have very sophisticated ML workflows, of course. And once I start to talk to them, let's say it's a network team, it starts to come up that, like, well, you know, they wanna query data.

They wanna push config. They wanna have, like, automated all these things. Okay. But their databases are all disparate.

Their data has not been cleaned properly.

They have no mechanisms in their data pipelines to do anything in real time, or in real enough time that it's relevant to them. And so one of the things that I'm seeing lately, Eric, not just with Kentik in general, but with our enrichment and what we do with data and that whole concept of us being a data analysis company, is that it's like the perfect way to ingest, prepare, process, and clean data, enrich it with all the additional information, all those columns and rows, so you can then take it and serve it to an ML model or an AI workflow, because it needs that.

In fact, I've read several times that an AI workflow, building that out, any kind of sophisticated workflow, like eighty percent of it, seventy percent of it, is the data piece, is the ingest and the cleaning and the preparation. And how do I take care of missing rows and columns? What do I do when I have duplicates? Do I just average them?

Things like that. And then, like we've been talking about today, adding all of that metadata and that context. So now when I apply a logistic regression, I can classify data more easily, or more correctly, I should say. But you can't get there unless your data is ready to go: properly labeled if you want to get into supervised learning, properly cleaned and processed, and, beyond the scope of this conversation, also properly stored and secured.
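As a tiny, hypothetical illustration of the data-prep piece being described (handling duplicates and missing values before any model sees the data), here is a pandas sketch. The column names and fill strategy are assumptions for the example, not a prescribed pipeline.

```python
import pandas as pd

# Hypothetical enriched flow/metric records with the usual warts:
# an exact duplicate row and a missing value.
df = pd.DataFrame([
    {"customer": "customer-a", "application": "https", "bytes": 1200, "latency_ms": 35.0},
    {"customer": "customer-a", "application": "https", "bytes": 1200, "latency_ms": 35.0},  # duplicate
    {"customer": "customer-b", "application": "dns",   "bytes": 90,   "latency_ms": None},  # missing
])

clean = (
    df.drop_duplicates()                                  # decide how to treat duplicates
      .assign(latency_ms=lambda d: d["latency_ms"]
              .fillna(d["latency_ms"].median()))          # one possible fill strategy
)
print(clean)
# Only after this kind of preparation does it make sense to hand the data
# to a classifier such as logistic regression.
```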

So, really interesting stuff. And I'm really excited. Again, I know you didn't want to get into the AI piece particularly, but I'm really excited about how all of this conversation does lead into how Kentik can become a backbone for AI projects just from a data perspective. Does that make sense?

Yeah. And actually, you know, if you're a customer of ours and you've been using some of the Kentik AI tools, I just wanna say, just wait until you see what we've got coming next. I've been playing around with some of our dev environments, and it's really, really cool some of the ways that we're applying some of the new LLMs to network data. And being able to not just take a natural language question and turn it into a good network query, like, show me, I don't know, the top ten ASNs that my network is talking to or whatever.

But to be able to say, like, what's going on with my network? What do I want to care about today? And allow the LLM to intuitively understand what that means when a human is asking, what about my network do I want to care about?

What could that mean? And what types of things, what kind of data do I, as the AI, have access to that can begin to answer those questions?

Well, that's the thing.

It can't answer the question if it doesn't have access to the data, cleaned and properly... I was just gonna say.

Prepared. Yeah.

And the fact that we have this huge amount of very, very enriched, high-speed data, right, that performance thing is also critical to an AI being able to react quickly. Right? It doesn't do a whole lot of good if you're talking to an AI and the AI is like, oh, let me get back to you in fifteen minutes.

Right? You want those answers fast. You wanna have that kind of real time interaction with an AI tool in order to make it useful to you.

And I think that's where there's a lot of exciting stuff on the horizon for our... Oh, for sure.

Yeah, it's really fascinating stuff. But I mean, for me, it's clear. AI aside, this entire process of enriching that kind of base, classic telemetry, and of course the way we do it is the way we do it, but just the fact that we do it, that really brings traditional visibility, traditional, even, network observability (I can't believe I'm saying that) into, you know, network intelligence, this modern concept of context and relevance and understanding how all the parts relate, some of these parts, by the way, being ephemeral.

Right? IP addresses that live in your container land just disappear after however long, when you destroy containers. So it's very complex there. How do all these moving parts relate to each other and then relate to what we care about, which is usually delivering an application to somebody, right, or a machine talking to another machine for the purposes of delivering an application to somebody?

You know, one of the things I'll just leave us with as we wrap up is that network intelligence, in a sense, is the thing that humans have always been trying to do. Right? Because we've always been taking data and trying to apply our own intuition and gut feeling and spidey sense to it to come up with an answer to a question.

The difference is that now the technology, you know, and the data have gotten so enriched and so powerful that we can start to programmatically apply the same things that network engineers have always done to the data in order to get those answers even faster. And that speed is important because of the complexity and the volume. I mean, we've been saying this for, what, two decades. Right? Like, big data, as people used to talk about it fifteen years ago: what do we do with all this data? Well, now we're kind of actually getting to the point where we have an answer for that.

Yeah. Yeah. Absolutely. I would even add one minor thing. It's not minor, but one additional small, quick thing is that not only is network intelligence as a platform sort of doing what engineers have always done, kind of mimicking that.

It's gonna do it faster, as you said, and it's gonna do it across more data, because I know as a human being, and I often work by myself, I can't correlate as much data as we have in the system in my head. This whole podcast, you and I have been talking about how, oh, we do this stuff in our head and we clue chain. But I clue chained with a finite amount that I could handle in my human brain. Imagine a system that has many, many rows and columns of all sorts of data and metadata, where the computer is not constrained like we are. So it not only does it faster, but it also does it with a greater scope and the ability to access even more data.

So with that, Eric, thanks so much for being on the podcast, talking about data enrichment, what it is, why we do it, why Kentik does it, how we do it, and really how it is enabling this new idea of network intelligence. And honestly, I'd love to have you on again in the near future to really flesh out what network intelligence is all about. Hint, hint to the audience: it is all about, among other things, bringing context and unification and relevance to all of the diverse and very significant amount of network telemetry that we ingest from modern networks today. So again, Eric, thanks so much for joining today. Appreciate it.

And to our audience, if you have any comments or feedback for today's episode, I'd love to hear from you. You can always reach out to us at telemetry now at kentik dot com. That does go directly to me, so I see everything that comes into that account. So for now, thanks so much for listening. Bye bye.

About Telemetry Now

Do you dread forgetting to use the “add” command on a trunk port? Do you grit your teeth when the coffee maker isn't working, and everyone says, “It’s the network’s fault?” Do you like to blame DNS for everything because you know deep down, in the bottom of your heart, it probably is DNS? Well, you're in the right place! Telemetry Now is the podcast for you! Tune in and let the packets wash over you as host Phil Gervasi and his expert guests talk networking, network engineering and related careers, emerging technologies, and more.