At Kentik, we’re big fans of Pandora. Not just because of its non-stop, personalized music, or because they’re a long-time Kentik customer, but also because of the modern, cloud-native approach they’ve taken to running the infrastructure behind their booming music streaming service.
As part of its always-innovating mentality, just last month Pandora announced its migration to Google Cloud Platform (GCP). In a blog post, the company said the move would support its scientists, developers, and analysts, who “traverse about 6 PB of data with tools such as Hive, Spark, and Presto every day to gain insights and improve the product.”
For Pandora’s NetOps and SecOps teams behind the migration, we know cloud visibility is now more important than ever. It helps ensure cost-effective, scalable infrastructure and quick investigation of potential issues, keeping services fast and highly available across regions.
That’s why we caught up with James Kelty, senior director of network operations and engineering at Pandora, to hear how the music service plans to maintain visibility across its infrastructure, GCP now included. (For background on why Pandora initially decided to leverage Kentik, you can watch James tell the story in this short video case study.)
KENTIK: Why is cloud visibility important for Pandora?
JAMES at PANDORA: You always want your developers to be autonomous. However, a big challenge with cloud is that those developers don’t think in terms of types and size of infrastructure. Rather, they think more about having the ability to use what’s available, as quickly as they need to move. When you introduce cloud — and the metered services that come along with it — that can get a company into trouble. Cloud visibility supports cost control by showing us what’s happening across infrastructure and where developers are putting workloads and applications. No more billing surprises.
Additionally, it’s important for alerting and security. When it comes to large, looming infrastructure that can exist anywhere in the world, keeping track of internal-to-external traffic flows and vice versa (e.g., seeing unexpected flows between specific areas and alerting on them) helps make sure people aren’t circumventing any security policies.
Simply put, cloud is an extension of infrastructure with new and differing rules, and cloud visibility allows us to tie both parts together.
KENTIK: What is the advantage of VPC Flow Logs for cloud visibility?
JAMES at PANDORA: Flow data extends our ability to see into and across GCP, and our use of VPC Flow Logs is focused on making sure our on-prem and GCP assets are speaking to each other in a way we’ve planned for.
With VPC Flow Logs, we can see what our services are doing and if they’re talking to each other across regions. We can also drill into why they may or may not be talking to each other and improve services by making them more autonomous. That helps us tie together our on-prem and cloud platforms and have visibility into traffic in between them.
For us, there is no demarcation line between on-prem and cloud. Knowing what is happening on and between both — especially as developers and engineers start using services based on availability rather than location — will save on cost and headaches.
KENTIK: How does Kentik provide cloud visibility to Pandora?
JAMES at PANDORA: In terms of the migration itself, GCP’s elastic infrastructure made it an easy decision for us. On top of that, we knew Google got VPC Flow Logs and their impressive granularity right from the beginning. So when we heard Kentik added support for Google VPC Flow Logs, we wanted to check it out.
Since we’re standardized on Kentik as our flow platform, having the built-in capability to consume VPC Flow Logs was so much better than having to try to tie them together manually. With Kentik, we were able to immediately and very simply start pulling the logs in from GCP.
As just one example of how we’re leveraging Kentik for cloud visibility, we have a nice dashboard in Kentik’s platform that shows us one of our projects using three zones in a single region. With the added visibility, we were able to see a lot of cross-talk (measured in kilobytes per second) happening between the zones. Each egress to another zone has a metered cost. With Kentik’s ability to drill into the issue, we were able to reduce the unintentional flows, saving us money both immediately and in the long term.
We also put together a dashboard within Kentik’s platform to show us both our GCP internal traffic and on-prem traffic to see what applications have moved across our infrastructure. This allows us to see what applications need to be more cloud-native and helps us to determine whether to push the application into the cloud or not based on the costs associated with it.
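To make the cross-zone example above a little more concrete, here is a minimal sketch of the underlying idea: summing bytes per zone pair and keeping only the flows that leave their own zone. It is not Kentik's implementation or Pandora's dashboard. The function name `cross_zone_bytes`, the flat record fields (`src_zone`, `dest_zone`, `bytes_sent`), and the sample values are all hypothetical, chosen for illustration only; real flow log records are richer and nested differently.

```python
from collections import defaultdict


def cross_zone_bytes(records):
    """Sum bytes per (src_zone, dest_zone) pair, skipping same-zone flows.

    `records` is an iterable of dicts with hypothetical keys
    `src_zone`, `dest_zone`, and `bytes_sent` (an assumption for this
    sketch, not a real flow log schema).
    """
    totals = defaultdict(int)
    for rec in records:
        src = rec.get("src_zone")
        dst = rec.get("dest_zone")
        # Ignore records without zone info and traffic that stays in one zone.
        if src is None or dst is None or src == dst:
            continue
        totals[(src, dst)] += int(rec["bytes_sent"])
    return dict(totals)


# Made-up sample data: one project spread over three zones in one region.
sample_records = [
    {"src_zone": "us-central1-a", "dest_zone": "us-central1-b", "bytes_sent": 1200},
    {"src_zone": "us-central1-a", "dest_zone": "us-central1-b", "bytes_sent": 800},
    {"src_zone": "us-central1-b", "dest_zone": "us-central1-a", "bytes_sent": 500},
    {"src_zone": "us-central1-a", "dest_zone": "us-central1-a", "bytes_sent": 10000},
    {"src_zone": "us-central1-c", "dest_zone": "us-central1-a", "bytes_sent": 300},
]

totals = cross_zone_bytes(sample_records)

# List zone pairs from heaviest to lightest cross-zone traffic.
for (src, dst), nbytes in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{src} -> {dst}: {nbytes} bytes")
```

Ranking zone pairs this way is one simple route to spotting which flows are worth investigating first, since each cross-zone egress is metered.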
KENTIK: Who at Pandora benefits from cloud visibility from Kentik?
JAMES at PANDORA: Our network operations and security operations teams use the Kentik platform for both on-prem and cloud visibility. While our developers don’t directly use Kentik (yet), when they have specific questions about cloud interconnects, we can easily spin up a report to give them a quick view of the information they’re after, including what their traffic looks like in the very moment they spot a potential issue.
KENTIK: What advice do you have for cloud providers or for those seeking cloud visibility?
JAMES at PANDORA: GCP VPC Flow Logs are more granular than other data sources, and therefore, they’re more helpful for network operators than other means of attempting to monitor network activity in cloud environments. If competing cloud platforms could come up with the level of granularity that Google Cloud has, we’d all be better off.