Kentik NMS has the ability to collect multiple SNMP objects (OIDs). Whether they are multiple unrelated OIDs, or multiple elements within a related table, Leon Adato walks you through the steps to get the data out of your devices and into your Kentik portal.
In my last post, I waxed poetic (or at least long-winded) on how to add a custom SNMP object ID (OID) into Kentik NMS. Despite the fact that NMS collects a metric butt-tonne (that’s a highly technical form of measurement) of data, there are always custom elements needed by various folks for specific use cases.
However, in sharing my example – collecting a custom OID for CPU temperature – I omitted a critical piece of context: Most modern systems have more than one CPU and, therefore, more than one temperature value.
I left out that detail so I could explain the mechanics of adding custom OIDs clearly, without piling on additional complexity.
With the basic process out of the way now, I felt it was important to circle back and talk about how you can add multiple custom SNMP metrics at the same time. And when I say “multiple,” there are two different scenarios I’m going to cover:
Collecting a single OID that returns multiple values
Collecting several different, unrelated metrics in one set of configuration files
Don’t feel like reading? Watch this demo video to see a walkthrough of adding multiple custom metrics to Kentik NMS.
Everyone, it's Leon. I've been doing a few videos on Kentik NMS, showing you how to get into it and how to do stuff, and this is a continuation of those. In the last video, I showed you how to add a custom SNMP object, or OID. But it's really rare that you would add a single OID to the data set. You're usually going to add multiples. Either you'll want a bunch of different ones that go together or are important together, or you'll have one object that actually has different variations. That's what we're going to cover in this video.
I want to be clear that, once again, there's going to be a lot of typing, a lot of text, and screens with a lot of information on them. If you have the urge to pause and scribble things down, don't, because there's a blog post that goes along with this. It's linked down below in the description. You can read along while you're watching the video or use it as your notes later, but everything I'm showing you is already written down for you.
Before I get into today's topic, I want to review a little bit. There are three main areas you need to be aware of for custom files: the sources directory, the reports directory, and the profiles directory. Each of those has a different use. The sources tell you where the data is coming from. The reports tell you where to put that data within the Kentik platform. The profiles combine the two: they tell you how to take those sources and which devices to pull that data from.
With that said, I want to introduce today's idea: along with the temperature that I did last time, I want to collect a really interesting value, the number of pings received by the machine. So here I've got my little terminal window.
I've actually made it a little bigger than in the last video because it was hard to read. This is the object ID, or OID, that I want. You can see that it gives me a number, which is the number of pings that machine has received. If I ping that machine's address a few times and then run that command again, it's gone up. It has recorded not the pings that were sent, but the ones that were received by that device. I want to include that along with temperature. Not side by side; I just want to collect it all at once.
To do that, I have my three files. We'll pull up this file here and put this down to the side to get it out of the way. Starting off, I have the sources file. I know that because the kind is sources. Once again, even though I showed you those directories, you could put everything in a single directory and be fine, as long as the name (here, local-linux) and the kind within each file are correct. But don't do that, because it gets really messy really fast.
The temperature is the same thing we did last time. That's not new. All I did was add another source, ICMP in echos, the inbound pings, and there's that object ID again.
Then, along with the report I had before, which is exactly the same, I now have a new YAML file. The file is called ping-count, but that doesn't matter. More importantly, the name within the file is local-ping-count. I could call it anything, but that's what I'm calling it. It's a report. It's going to put its data under /device/linux/pings. That's where it will show up in Metrics Explorer. It is an SNMP value. It will show up as ping count, and there's that OID again. It is a metric, and we're going to collect it every 60 seconds.
And then finally, the profile. Again, I know it's a profile because the kind is profile. The name is local-net-snmp. The profile is going to take anything that matches the system object ID, or sysObjectID, 1.3.6.1.4.1.8072, basically any Linux-type machine, and it's going to run two reports: the local-temp report we did last time, and also the local-ping-count report. It also includes the device name and IP.
So what does that look like in the Kentik dashboard? Let's roll in here. We're in the Kentik portal. We'll go to Metrics Explorer, and from here I can see that I've got Linux pings and Linux temps. Let's do the temperature first. I'll grab my temperature metric, show the device name and the device IP address, run a query, and I can see that I've got some data coming in. Let me contract the time frame to 15 minutes so we can see more of those data points coming in. There they are. I've got temperature information coming in for the three Linux boxes I have running here in my home lab.
If I want to look at those pings, I can go there. The data I'm collecting is ping count. I'll group by device name, throw in the IP address just for fun, and hit query again. There are the pings coming in for those three devices: not much on two of them, and a lot for this one right here. So I'm collecting two different kinds of metrics, but out of the same files. I've combined them together.
That's one situation. But what happens when I have one object that has multiple values coming out of it? Let me explain what I mean.
So I'm going to move this window out of the way. If you'll recall, the temperature OID is this one. I'm not going to read it to you because it's a really long line. Let me extend this line out a little so you can see it all. There we go. It gets me one value. It turns out that the .1 at the end indicates a particular element. If I change that to a 2, that gives me a different item. I know it looks like the same one, but it's actually different. I can change it to 3 and to 4. We're getting similar temperatures today, just to make things all look the same. Oh, there we go, there's a thirty-three thousand. And then number 6. Oh, there is no 6. Okay.
What's really happening here is that I'm pulling from a table of values. I can see that by using not snmpget but snmpwalk, backing up out of the individual numbers, and there I can see the values for all five of the gauges I have. That's good, but I don't know what they go to. It turns out that if I back up just two more elements, I can see that there are actually three sets of elements within the same table. The first is like an interface ID or an object ID: one, two, three, four, five. The second set is the labels, which tell me what each one is called. Number one is called Package id 0, number two is Core 0, number three is Core 1, and so on. And then I have the actual temperature values down below.
So what I want to do is create a set of configuration files that will pull all of that data simultaneously. Let's go here. I want to start with my sources file, and it looks almost identical to the sources file we used for a single value.
The only difference is that I'm backing all the way up to that 16.2, the thing that pulls the entire table. Also notice it says table, not value. I'm pulling a table of values, so I'm indicating that the key in that key-value pair is a table key, not a value key. All right? So there's that.
Then the second piece, the reports file. Here things get a little more complicated. Once again, we're going to put the data in the /device/linux/temp area. The name here is the thing I want to appear in the Kentik portal, because it will give me the name of each object: Core 0, Core 1, Core 2, Core 3. I'm indicating once again the table to pull from and the specific sub-table where the actual data values are. And finally, notice that it's not a metric. I specifically had to tell Kentik that this is not a metric.
Next is the CPU temperature. The table it's pulling from is the same overall table. The values come from a different sub-table, this one is a metric, and once again we're polling every 60 seconds. And finally, the profile is the same. That doesn't change.
If I go into my Kentik portal now and go back to device Linux temp, I want to pull CPU temperature, the device name, and also the name field. If you were looking carefully, you might have noticed that it was up there before, because I'd already set up my lab. But there we go. Now when I run a query, I'm getting values not just for every device but for every sub-element: temperature value one, temperature value two, Core 0, Core 1, Core 2, Core 3, etcetera. I'm getting each one broken down.
So what we've gone over so far is the ability to pull two different objects and collect them in the same set of configuration files. And if you have a table-type data set, you can pull that entire table out, labels and everything, as long as you give it those two sections: the name or label section and the actual values.
For the bonus round, let's put it all together. We're going to create one set of configuration files that pulls both temperature and pings, and it will pull multiple temperatures along with those pings. Here I am in my sources file. This looks very much like the one in the first example. I've got ICMP echos in, with the OID that goes along with it, and I have CPU temp. The only difference is that we're pulling from a table, not a specific value. So we've merged those two things together.
Here is the ping count report, which hasn't changed from the first example. The temperature report is from the previous example, where we pull all of the values together along with their labels. And finally, in the profile, we're telling it, just as in the first example, that every device with this type of system object ID, which are Linux-type systems, should use both of these reports: the local-temp report and the ping count report.
If we go back into Metrics Explorer, it looks exactly the way we've seen it, because I've already done both things. Here we are collecting the CPU temperature broken down by sub-element. That's the multiple. And if I go back into pings, pull that ping count, ask for the name and the IP address, and run that query, I can see that we're still pulling pings.
So what we've gone over is the ability to pull two separate objects and collect them with the same set of configuration files, or take a table-type data source and collect all of the elements of the table at once, or do both together. That's what we've covered so far. There will be a lot more how-to videos for Kentik NMS coming in the future. If there's something you'd like to see, please let me know in the comments below. Don't forget to hit like and subscribe. I've got to say that, like all the cool kids. I appreciate you taking the time to come along on this journey with me. For Kentik, I'm Leon Adato. Thanks for watching.
A quick review of custom OID collection
Just to review the process for custom OIDs in Kentik NMS:
Move to (or create if it doesn’t exist) the dedicated folder on the system where the Kentik agent (kagent) is running:
/opt/kentik/components/ranger/local/config
In that directory, create directories for /sources, /reports, and /profiles
Create three specific files:
Under /sources, a file that lists the custom OID to be collected
Under /reports, a file that associates the custom OID with the data category it will appear under within the Kentik portal
Under /profiles, a file that describes a type of device (using the SNMP System Object ID) and the report(s) to be associated with that device type
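The directory setup in the steps above can be scripted. The following is a minimal Python sketch, assuming only the base path and the three folder names given in this post; the function name scaffold_config_dirs is hypothetical, and the script creates empty folders only, not the YAML files themselves.

```python
from pathlib import Path

# Base path quoted in this post; pass a different one to experiment safely.
DEFAULT_BASE = Path("/opt/kentik/components/ranger/local/config")


def scaffold_config_dirs(base: Path = DEFAULT_BASE) -> list[Path]:
    """Create sources/, reports/, and profiles/ under base if they are missing."""
    created = []
    for name in ("sources", "reports", "profiles"):
        directory = base / name
        # parents=True also creates the base folder; exist_ok=True makes reruns harmless.
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created
```

Calling scaffold_config_dirs() with no argument targets the real kagent path and will likely need elevated permissions; passing a scratch directory first is a low-risk way to try it.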
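For just one temperature setting, that means three small files: a sources file that names the temperature OID (1.3.6.1.4.1.2021.13.16.2.1.3.1), a reports file that places its data under /device/linux/temp in the Kentik portal, and a profile that matches net-snmp devices by System Object ID and lists that report.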
Collecting multiple unrelated OIDs
Once you understand the process of collecting a single OID, adding others is pretty simple.
For our example, we also want to collect icmpInEchos (the number of pings received) for Linux/net-snmp type devices. The OID for this is 1.3.6.1.2.1.5.8.0. Using the same files from above, I’d make the following modifications:
Let’s unpack those modifications and how they differ from the collection of a single OID:
sources/linux.yml has two different sources: one for temp (temperature) and one for icmpEchos.
An entirely new file under /reports, named ping-count.yml, describes the OID to be collected and specifies that its data will appear under /device/linux/pings within the Kentik portal.
Finally, profiles/local-net-snmp.yml (which was already present for the single temperature OID) has been modified to also associate the report named “local-ping-count”.
I want to emphasize two other points: First, the file name doesn’t matter at all. The key is to make sure the “name: ” element within the YAML files is correct. Second, the directory structure is just a housekeeping mechanism. As long as the “kind: ” element within the YAML file is correct, you can have everything in the same folder if you prefer.
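To make that point concrete, here is a small Python sketch of the idea that a file is identified by the “kind” and “name” inside it rather than by its folder or file name. It is only an illustration: the dictionaries stand in for already-parsed YAML, the “file” key and the exact kind strings are assumptions, and this is not how Kentik itself loads its configuration.

```python
from collections import defaultdict

# Hypothetical stand-ins for parsed YAML files, all sitting in one folder.
# Only the idea of "name" and "kind" comes from this post; everything else is illustrative.
configs = [
    {"file": "everything/linux.yml", "kind": "sources", "name": "local-linux"},
    {"file": "everything/temp.yml", "kind": "reports", "name": "local-temp"},
    {"file": "everything/ping-count.yml", "kind": "reports", "name": "local-ping-count"},
    {"file": "everything/local-net-snmp.yml", "kind": "profile", "name": "local-net-snmp"},
]


def index_by_kind(entries):
    """Map kind -> {name: file}, ignoring which folder each file lives in."""
    index = defaultdict(dict)
    for entry in entries:
        index[entry["kind"]][entry["name"]] = entry["file"]
    return dict(index)


index = index_by_kind(configs)
print(sorted(index["reports"]))
```

A profile that lists a report by name would be resolved through this kind of lookup, which is why a typo in “name: ” matters and the file name does not.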
The result is that you can now receive and display data within the Kentik portal for both temperature and ICMP echos received.
Collecting a table of OIDs
(or, “I Contain Multitudes”)
Some SNMP OIDs return a single value, like the icmpInEchos metric in our last example.
But others are effectively the tip of an iceberg of metrics. Examples of this type of OID include CPU, temperature, fans, disks, and even the SNMP OID that displays running processes.
Which brings us back to the original OID. You’ll recall that in my previous post, the OID I used was 1.3.6.1.4.1.2021.13.16.2.1.3.1, which gave me one temperature stat. But if I used “.2” instead of “.1” at the end, I would see another temperature reading.
In fact, I can do that for five different OIDs, ending in .1 through .5.
This is because the humble little Raspberry Pi I’m monitoring in this example has four cores (stick with me; I’ll explain why it’s not five in a minute). While I’d like to monitor all of them, I have to recognize that not every device has the same number of cores, and therefore, I need something that will flexibly collect whatever number I have. Luckily, SNMP handles that more or less automatically. At the command line, an snmpwalk of the parent OID, 1.3.6.1.4.1.2021.13.16.2.1.3 (instead of an snmpget for each instance), returns all five readings in one pass.
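The difference between a get and a walk can be modeled in a few lines of Python. This is a minimal sketch, not a real SNMP client: the “agent” is just a dictionary, the function names simulated_get and simulated_walk are hypothetical, and the readings are made up. Only the OIDs come from this post.

```python
# Illustrative agent contents: the OIDs follow this post, the readings are invented.
TEMP_COLUMN = "1.3.6.1.4.1.2021.13.16.2.1.3"
agent = {f"{TEMP_COLUMN}.{row}": 33000 + 500 * row for row in range(1, 6)}
agent["1.3.6.1.2.1.5.8.0"] = 42  # icmpInEchos, outside the temperature subtree


def oid_key(oid):
    """Turn a dotted OID into a tuple of ints so comparisons are numeric."""
    return tuple(int(part) for part in oid.split("."))


def simulated_get(mib, oid):
    """Like a get: return the value at exactly this OID, or None if absent."""
    return mib.get(oid)


def simulated_walk(mib, root):
    """Like a walk: return (oid, value) pairs under root, in numeric OID order."""
    prefix = oid_key(root)
    hits = [oid for oid in mib if oid_key(oid)[:len(prefix)] == prefix]
    return [(oid, mib[oid]) for oid in sorted(hits, key=oid_key)]


print(simulated_get(agent, TEMP_COLUMN + ".6"))  # there is no sixth sensor
for oid, value in simulated_walk(agent, TEMP_COLUMN):
    print(oid, value)
```

The walk never needs to know how many rows exist, which is the property that makes it suitable for devices with different core counts. Comparing OIDs as integer tuples rather than strings keeps a row like .10 from sorting ahead of .2.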
What’s missing is the names of each of these elements. While it may not be a big deal for CPUs, it’s far more important when I’m collecting the same data point for disks, interfaces, and the like.
https://oidref.com/ tells me the names of the sensors can be found at the OID 1.3.6.1.4.1.2021.13.16.2.1.2.
From this I can see that the first OID is the aggregate temperature, and the next four OIDs are temperatures for each of the four cores in my Pi.
With that information in hand, our goals are to:
Collect all five temperature readings without having to call out each one explicitly.
Label each reading with the sensor name that goes with it.
sources/linux.yml uses the OID 1.3.6.1.4.1.2021.13.16.2. Two things are notable:
This is a couple of levels “up” the OID chain from both the temperature (1.3.6.1.4.1.2021.13.16.2.1.3) and the labels (1.3.6.1.4.1.2021.13.16.2.1.2).
The original example used a metric type of “value”. This one is using the “table” type.
reports/temp.yml describes two fields instead of just one:
A “name” field which pulls two data sets:
the overall table from the OID ending at 16.2
the label text from the OID ending at 16.2.1.2
A “CPUtemp” field which pulls two data sets:
the overall table from the OID ending at 16.2
the actual values from the OID ending at 16.2.1.3
The result of this structure is that it will associate the labels from the 16.2.1.2 branch of the OID table with the temperature values in the 16.2.1.3 branch.
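That association is a join on the row index, the final number of each OID. The Python sketch below shows the idea under stated assumptions: the walk output is a hand-written dictionary, the labels and readings are illustrative rather than captured from a device, and the helper names column and join_rows are hypothetical. It models the concept only, not Kentik’s implementation.

```python
LABELS = "1.3.6.1.4.1.2021.13.16.2.1.2"  # sensor names
VALUES = "1.3.6.1.4.1.2021.13.16.2.1.3"  # temperature readings

# Illustrative walk of the table at ...16.2; labels and numbers are made up.
walk = {
    f"{LABELS}.1": "Package id 0",
    f"{LABELS}.2": "Core 0",
    f"{LABELS}.3": "Core 1",
    f"{LABELS}.4": "Core 2",
    f"{LABELS}.5": "Core 3",
    f"{VALUES}.1": 34000,
    f"{VALUES}.2": 33000,
    f"{VALUES}.3": 33000,
    f"{VALUES}.4": 32000,
    f"{VALUES}.5": 33000,
}


def column(entries, column_oid):
    """Return {row index: value} for the entries directly under column_oid."""
    prefix = column_oid + "."
    return {
        int(oid[len(prefix):]): value
        for oid, value in entries.items()
        if oid.startswith(prefix)
    }


def join_rows(entries):
    """Pair each reading with the label that shares its row index."""
    labels = column(entries, LABELS)
    values = column(entries, VALUES)
    return [(labels.get(row, f"row {row}"), values[row]) for row in sorted(values)]


for label, reading in join_rows(walk):
    print(f"{label}: {reading}")
```

Because the join is driven by whatever rows the walk returns, the same logic covers a device with two sensors or twenty, which is the flexibility the table type is meant to provide.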
Note that the profiles/local-net-snmp.yml is unchanged from our original example of collecting a single temperature value.
BONUS ROUND: Putting it all together
By this point, you should be getting pretty comfortable with the idea, if not the technique, of adding custom SNMP OIDs to Kentik NMS. But in this last example, we’re going to include custom OIDs for every CPU temperature, along with icmpInEchos data. Here’s what the files look like:
That’s right. If you’re looking closely, you’ll see that it’s mostly just the same files from our previous example, with the addition of reports/ping-count.yml and both report names combined in profiles/local-net-snmp.yml.
The mostly unnecessary conclusion
Two things should be clear at this point: Far from the shriveled, washed-up, has-been of a monitoring technique it’s often accused of being, SNMP continues to be a powerful, flexible, and valuable tool in your observability toolkit.
Moreover, Kentik NMS is an equally powerful, flexible, and useful solution for collecting and displaying those metrics alongside other data types, providing you with complete insight into the health and stability of your network.
As always, I hope you’ll stick with me as we learn more about this together. If you’d like to get started now, sign up for a free trial.