
Inside the Kentik Data Engine, Part 2

Jim Meehan

How-To, Product

Summary

In part 2 of our tour of Kentik Data Engine, the distributed backend that powers Kentik Detect, we continue our look at some of the key features that enable extraordinarily fast response to ad hoc queries even over huge volumes of data. Querying KDE directly in SQL, we use actual query results to quantify KDE’s speed while also showing the depth of the insights that Kentik Detect can provide.


In part 1 of this series we introduced Kentik Data Engine™, the backend to Kentik Detect™, which is a large-scale distributed datastore that is optimized for querying IP flow records (NetFlow v5/9, sFlow, IPFIX) and related network data (GeoIP, BGP, SNMP). We started our tour of KDE with a word about our database schema, and then used SQL queries to quantify how design features such as time-slice subqueries, results caching, and parallel dataseries help us achieve extraordinarily fast query performance even over huge volumes of data.

In this post we’ll continue our tour of KDE, starting with filtering and aggregation. SQL gives us easy methods to do both, either by adding terms to the WHERE clause or by adding a GROUP BY clause, and we can filter and aggregate by any combination of columns. Let’s look at traffic volume for the top source/dest country pairs where both the source and destination are outside the US:

SELECT src_geo,
       dst_geo,
       Sum(both_bytes) AS f_sum_both_bytes
FROM big_backbone_router
WHERE src_geo <> 'US'
  AND dst_geo <> 'US'
  AND i_start_time > Now() - interval '1 week'
GROUP BY src_geo,
         dst_geo
ORDER BY f_sum_both_bytes DESC
LIMIT 10

| src_geo | dst_geo | f_sum_both_bytes |
|---------|---------|------------------|
| HK      | BR      |   27568963549063 |
| GB      | BR      |    8594666838327 |
| NL      | DE      |    6044367035356 |
| HK      | GB      |    6004897386415 |
| HK      | SG      |    5305439621766 |
| HK      | CO      |    4893091337832 |
| NL      | BR      |    4330923877223 |
| HK      | JP      |    4086102823771 |
| HK      | PL      |    3833512917644 |
| HK      | TR      |    3501243783418 |
SELECT 10
Time: 0.675s

The first row, HK (Hong Kong) to BR (Brazil), seems like an interesting, unlikely pair. Let’s filter on that and see who was talking to whom from a network/ASN perspective:

SELECT src_as,
       i_src_as_name,
       dst_as,
       i_dst_as_name,
       Sum(both_bytes) AS f_sum_both_bytes
FROM big_backbone_router
WHERE src_geo = 'HK'
  AND dst_geo = 'BR'
  AND i_start_time > Now() - interval '1 week'
GROUP BY src_as,
         i_src_as_name,
         dst_as,
         i_dst_as_name
ORDER BY f_sum_both_bytes DESC
LIMIT 10

| src_as | i_src_as_name      | dst_as | i_dst_as_name      | f_sum_both_bytes |
|--------|--------------------|--------|--------------------|------------------|
|  65001 | Global Transit Net |  65101 | Eyeball Net A      |   11686340849432 |
|  65001 | Global Transit Net |  65201 | Eyeball Net B ASN1 |    5076468451751 |
|  65001 | Global Transit Net |  65301 | CDN Net A          |    3337948976347 |
|  65001 | Global Transit Net |  65102 | Eyeball Net C      |    1261908657743 |
|  65001 | Global Transit Net |  65103 | Eyeball Net D      |    1234101190857 |
|  65001 | Global Transit Net |  65104 | Eyeball Net E      |    1211922009485 |
|  65001 | Global Transit Net |  65105 | Eyeball Net F      |     334959552542 |
|  65001 | Global Transit Net |  65202 | Eyeball Net B ASN2 |     247936925394 |
|  65001 | Global Transit Net |  65106 | Eyeball Net G      |     229671528291 |
|  65001 | Global Transit Net |  65401 | Enterprise Net A   |     209961848484 |
SELECT 10
Time: 0.458s

Substring and regex matching

If we wanted to drill down on the first row, we could additionally filter on the specific source/dest ASNs (we’ll sketch that variant after the next query). But let’s filter on the ASN names instead, so we can see how KDE supports SQL substring and regular expression matching on text columns. Substring/regex matching also works on other strings such as interface names and descriptions, AS_PATHs, and user-defined flow tags.

SELECT ipv4_src_addr AS f_cidr24_ipv4_src_addr,
       ipv4_dst_addr AS f_cidr24_ipv4_dst_addr,
       Sum(both_bytes) AS f_sum_both_bytes
FROM big_backbone_router
WHERE i_src_as_name ~ 'Peer|Transit'
  AND i_dst_as_name LIKE '%Eyeball Net A%'
  AND i_start_time > Now() - interval '1 week'
GROUP BY f_cidr24_ipv4_src_addr,
         f_cidr24_ipv4_dst_addr
ORDER BY f_sum_both_bytes DESC
LIMIT 10

| f_cidr24_ipv4_src_addr | f_cidr24_ipv4_dst_addr | f_sum_both_bytes |
|------------------------|------------------------|------------------|
| 10.129.204.0           | 10.156.25.0            |  115419904954801 |
| 10.221.155.0           | 10.141.201.0           |   78651524382556 |
| 10.156.25.0            | 10.129.204.0           |   62329500664567 |
| 10.254.246.0           | 10.31.38.0             |   39162753399340 |
| 10.117.35.0            | 10.31.38.0             |   39073550458830 |
| 10.144.99.0            | 10.254.210.0           |   28582936121869 |
| 10.31.73.0             | 10.254.244.0           |   27632400104976 |
| 10.31.75.0             | 10.17.153.0            |   26265050083173 |
| 10.144.99.0            | 10.254.244.0           |   25763076333705 |
| 10.17.100.0            | 10.93.63.0             |   23713868194889 |
SELECT 10
Time: 0.980s
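As promised above, here’s the variant that drills into the first row of the ASN results by filtering on the specific source/dest ASN numbers rather than the names. This is a minimal sketch, reusing only columns shown above; the 65001/65101 values come from the anonymized results earlier:

SELECT ipv4_src_addr AS f_cidr24_ipv4_src_addr,
       ipv4_dst_addr AS f_cidr24_ipv4_dst_addr,
       Sum(both_bytes) AS f_sum_both_bytes
FROM big_backbone_router
WHERE src_as = 65001   -- Global Transit Net (from the first row above)
  AND dst_as = 65101   -- Eyeball Net A
  AND i_start_time > Now() - interval '1 week'
GROUP BY f_cidr24_ipv4_src_addr,
         f_cidr24_ipv4_dst_addr
ORDER BY f_sum_both_bytes DESC
LIMIT 10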

As the queries above show, KDE can aggregate IP address columns by arbitrary subnet masks; in this case we’ve grouped the data by source/dest /24 subnet pairs. KDE also natively understands IP addresses and CIDR notation in filters, so we can drill down on the top subnet pair and look at pairs of individual IPs:

SELECT ipv4_src_addr,
       ipv4_dst_addr,
       Sum(both_bytes) AS f_sum_both_bytes
FROM big_backbone_router
WHERE ipv4_src_addr LIKE '10.129.204.0/24'
  AND ipv4_dst_addr LIKE '10.156.25.0/24'
  AND i_start_time > Now() - interval '1 week'
GROUP BY ipv4_src_addr,
         ipv4_dst_addr
ORDER BY f_sum_both_bytes DESC
LIMIT 10

| ipv4_src_addr  | ipv4_dst_addr | f_sum_both_bytes |
|----------------|---------------|------------------|
| 10.129.204.41  | 10.156.25.5   |  101922511168965 |
| 10.129.204.34  | 10.156.25.4   |   16534277019052 |
| 10.129.204.69  | 10.156.25.79  |      12821801454 |
| 10.129.204.85  | 10.156.25.79  |      12408606234 |
| 10.129.204.116 | 10.156.25.79  |      11170668135 |
| 10.129.204.110 | 10.156.25.79  |      11078339112 |
| 10.129.204.76  | 10.156.25.79  |      10895308401 |
| 10.129.204.84  | 10.156.25.79  |      10497115055 |
| 10.129.204.115 | 10.156.25.79  |      10361345421 |
| 10.129.204.75  | 10.156.25.79  |       9923494659 |
SELECT 10
Time: 0.660s
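Before moving on, one more hedged sketch on the “arbitrary subnet masks” point: grouping by a different prefix length should only require changing the alias. The f_cidr16_ alias below is an assumption extrapolated from the f_cidr24_ convention shown earlier, not something demonstrated in this post:

SELECT ipv4_src_addr AS f_cidr16_ipv4_src_addr,   -- /16 grouping; alias name assumed by analogy with f_cidr24_
       ipv4_dst_addr AS f_cidr16_ipv4_dst_addr,
       Sum(both_bytes) AS f_sum_both_bytes
FROM big_backbone_router
WHERE i_src_as_name ~ 'Peer|Transit'
  AND i_dst_as_name LIKE '%Eyeball Net A%'
  AND i_start_time > Now() - interval '1 week'
GROUP BY f_cidr16_ipv4_src_addr,
         f_cidr16_ipv4_dst_addr
ORDER BY f_sum_both_bytes DESC
LIMIT 10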

Time-series data

Summary tables are great, but often we want time-series data to build visualizations. In KDE, this is as simple as adding a time column to the SELECT and GROUP BY clauses. Let’s take the top IP pair from the results above and get time-series data (bytes and packets) over the last week:

SELECT i_start_time,
       Max(i_duration),
       Sum(both_bytes) AS f_sum_both_bytes,
       Sum(both_pkts) AS f_sum_both_pkts
FROM big_backbone_router
WHERE ipv4_src_addr LIKE '10.129.204.41'
  AND ipv4_dst_addr LIKE '10.156.25.5'
  AND i_start_time > Now() - interval '1 week'
GROUP BY i_start_time
ORDER BY i_start_time ASC

| i_start_time              |  max | f_sum_both_bytes | f_sum_both_pkts |
|---------------------------|------|------------------|-----------------|
| 2016-03-10 23:00:00+00:00 | 3600 |     475231866603 |       558646000 |
| 2016-03-11 00:00:00+00:00 | 3600 |     141987820665 |       180911990 |
| 2016-03-11 01:00:00+00:00 | 3600 |      85119841990 |       130098569 |
| 2016-03-11 02:00:00+00:00 | 3600 |     102749092833 |       124245217 |
| 2016-03-11 03:00:00+00:00 | 3600 |      40349266424 |        74404852 |
| 2016-03-11 04:00:00+00:00 | 3600 |      47615668871 |        80084659 |
| 2016-03-11 05:00:00+00:00 | 3600 |      39601556357 |        71966274 |
| 2016-03-11 06:00:00+00:00 | 3600 |      44595721100 |        55644084 |
| 2016-03-11 07:00:00+00:00 | 3600 |      36984645947 |        73379683 |
| 2016-03-11 08:00:00+00:00 | 3600 |      57309415120 |        86561840 |
| 2016-03-11 09:00:00+00:00 | 3600 |     221576669330 |       219835996 |
| ...                       |  ... |              ... |             ... |
SELECT 168
Time: 0.430s

It doesn’t make sense to return 10,080 one-minute intervals for a week-long time-series query; you can’t display that many data points in a visualization that fits in your average web browser. So KDE auto-selects the interval width to return an appropriate number of points/rows, based on the overall time range covered by the query. In this case, for a query covering a week, we get 168 one-hour intervals. For more information, see Time Rounding. KDE can also return a column showing the width (in seconds) of each interval so we (or the Kentik Detect frontend) can easily calculate rates like bits/sec or packets/sec; we’ll sketch that calculation after the next example.

Maintaining full granularity

None of the per-interval data above was pre-calculated; it was generated on the fly, at query time, from the individual records that we saw in the response to the first query in part 1 of this blog series. That means that when we see an area of interest, it’s easy to narrow the time range or apply filters to drill down. We can get full-resolution, 1-minute granularity for any historical period within the data that KDE stores, such as this one-hour time range from 90 days ago:

SELECT i_start_time,
       Max(i_duration) AS i_duration,
       Sum(both_bytes) AS f_sum_both_bytes,
       Sum(both_pkts) AS f_sum_both_pkts
FROM big_backbone_router
WHERE i_start_time > Now() - interval '2159 hours'
  AND i_start_time < Now() - interval '2158 hours'
GROUP BY i_start_time
ORDER BY i_start_time ASC

| i_start_time              | i_duration | f_sum_both_bytes | f_sum_both_pkts |
|---------------------------|------------|------------------|-----------------|
| 2015-12-15 22:29:00+00:00 |         60 |     179245157376 |       189853696 |
| 2015-12-15 22:30:00+00:00 |         60 |     181873404928 |       192246784 |
| 2015-12-15 22:31:00+00:00 |         60 |     183132584960 |       193918976 |
| 2015-12-15 22:32:00+00:00 |         60 |     180520254464 |       191270912 |
| 2015-12-15 22:33:00+00:00 |         60 |     179917988864 |       190438400 |
| 2015-12-15 22:34:00+00:00 |         60 |     175917901824 |       185893888 |
| 2015-12-15 22:35:00+00:00 |         60 |     174799783936 |       184879104 |
| 2015-12-15 22:36:00+00:00 |         60 |     175613580288 |       185396224 |
| 2015-12-15 22:37:00+00:00 |         60 |     173256493056 |       182279168 |
| 2015-12-15 22:38:00+00:00 |         60 |     170268498944 |       179223552 |
| 2015-12-15 22:39:00+00:00 |         60 |     169344593920 |       178819072 |
| 2015-12-15 22:40:00+00:00 |         60 |     169141132288 |       178192384 |
| 2015-12-15 22:41:00+00:00 |         60 |     169238467584 |       178177024 |
| ...                       |        ... |              ... |             ... |
SELECT 60
Time: 1.293s
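As promised, here is a minimal sketch of the rate calculation that the interval-width column enables. The f_avg_bps and f_avg_pps aliases are our own labels for illustration, not KDE conventions; everything else reuses columns from the queries above:

SELECT i_start_time,
       Max(i_duration) AS i_duration,
       -- bytes * 8 / seconds = average bits per second over the interval
       Sum(both_bytes) * 8 / Max(i_duration) AS f_avg_bps,
       -- packets / seconds = average packets per second over the interval
       Sum(both_pkts) / Max(i_duration) AS f_avg_pps
FROM big_backbone_router
WHERE ipv4_src_addr LIKE '10.129.204.41'
  AND ipv4_dst_addr LIKE '10.156.25.5'
  AND i_start_time > Now() - interval '1 week'
GROUP BY i_start_time
ORDER BY i_start_time ASC

The same division works whether KDE returns one-hour or one-minute intervals, since i_duration reflects whatever interval width was auto-selected for the query.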

In summary, the Kentik Data Engine provides a highly performant, scalable, and flexible platform that enables Kentik Detect to analyze network traffic data without compromising on the granularity or specificity of the results. But that’s just the tip of the iceberg: we’re excited about the other types of datasets that KDE will consume, and the additional applications we’ll be able to build on top of it in the areas of network performance, security, and business intelligence.

Want to try KDE with your own network data? Start a free trial of Kentik Detect and experience KDE’s performance first hand. Or maybe you’d like to help us take KDE to the next level? We’re hiring.
