Kentik - Network Observability
Kentik Blog

Kubernetes and the Service Mesh Era

Cloud Solutions Architect
January 05, 2023

Kubernetes is a game-changer for enterprise organizations. Automating deployment, scaling, and management of containerized applications allows organizations to embrace a cloud-native paradigm at scale and more easily employ best practices, such as microservices and DevSecOps.

But as with all tech, Kubernetes has its limits. Kelsey Hightower famously tweeted that “Kubernetes is a platform for building platforms. It’s a better place to start; not the endgame.”

And networking is arguably the area where this quote is most applicable. Kubernetes provides a generic networking baseline—a flat address model, services for discovery, simple ingress/egress, and network policies—but anything beyond these basics must come from an extension or integration.

Service meshes were built to close this gap by providing advanced services around traffic management, security, and observability.

In this article, we’ll look at how Kubernetes approaches networking, the gaps in networking that come with Kubernetes out of the box, and how service meshes address those gaps.

Let’s start with a background on the Kubernetes networking model.

Kubernetes and service meshes

Understanding the Kubernetes networking model

Kubernetes runs workloads (services, applications, and jobs) in pods. Each pod contains one or more containers. Each pod also has a unique (within the cluster) private IP address.

All of the containers in a pod run on the same node and, because they share the pod's IP address, can communicate with one another over localhost. Within a cluster, pods are directly reachable by these private IP addresses. For stable addressing and load balancing inside the cluster, Kubernetes provides an internal service type, ClusterIP, that fronts a set of pods with a single virtual IP address.

However, communicating with a pod from outside the cluster is a little more complicated. There are several options, including NodePort and LoadBalancer services and Ingress resources.

In short, a service in Kubernetes is a way to group a set of pods and present them as a single entity via an IP address and/or DNS name. When the service receives a request, it delegates it to one of the backing pods.

Pods can discover services in two ways: environment variables and DNS. When a pod starts, Kubernetes injects environment variables for the services that are active at that time, such as REDIS_SERVICE_HOST=10.0.0.11 and REDIS_SERVICE_PORT=6379. Every service also has a DNS name based on its name and namespace, such as <service>.<namespace>.svc.cluster.local.
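As a rough illustration of both discovery paths, here is a minimal Python sketch. The helper names (service_env_prefix, service_dns_name, discover) are made up for this example and are not part of any Kubernetes client library. It assumes the usual convention that a service's name is upper-cased with dashes turned into underscores to form the <NAME>_SERVICE_HOST and <NAME>_SERVICE_PORT variables, and that the cluster domain is cluster.local.

```python
import os


def service_env_prefix(service_name: str) -> str:
    """Upper-case the service name and turn dashes into underscores."""
    return service_name.upper().replace("-", "_")


def service_dns_name(service_name: str, namespace: str = "default") -> str:
    """Build the cluster-internal DNS name for a service."""
    return f"{service_name}.{namespace}.svc.cluster.local"


def discover(service_name: str, namespace: str = "default", env=None):
    """Return (host, port) from the environment if both variables are set.

    Falls back to (dns_name, None) so the caller can resolve the name
    and supply a port it already knows.
    """
    env = os.environ if env is None else env
    prefix = service_env_prefix(service_name)
    host = env.get(f"{prefix}_SERVICE_HOST")
    port = env.get(f"{prefix}_SERVICE_PORT")
    if host and port:
        return host, int(port)
    return service_dns_name(service_name, namespace), None
```

In practice, DNS is usually the more convenient of the two, because the environment variables only reflect services that existed when the pod was created.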

Let’s look at several other important networking constructs that Kubernetes works with.

DNS

DNS is a staple of networking well beyond Kubernetes. Kubernetes comes with its own internal DNS server, CoreDNS, which creates DNS records for pods and services. CoreDNS is a CNCF project in its own right.

Kubernetes NetworkPolicies

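Kubernetes NetworkPolicies allow you to control traffic to and from pods at the IP address and port level (OSI layers 3 and 4). You can set policies for ingress, egress, or both. Policies are enforced by the cluster's network plugin, so the plugin you use must support them.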
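To show how such a policy is evaluated, here is a deliberately simplified Python model of ingress rules. The Policy class and the function names are hypothetical, and real NetworkPolicies support more than this sketch does (namespace selectors, IP blocks, protocols, egress). The sketch only captures two behaviors: a pod that no policy selects accepts all ingress, and a pod that at least one policy selects accepts only what some policy explicitly allows.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    pod_selector: dict  # labels of the pods this policy protects
    allow_from: dict    # labels a source pod must carry
    ports: set          # destination ports that are allowed


def matches(selector: dict, labels: dict) -> bool:
    """A selector matches when every key/value pair appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def ingress_allowed(policies, src_labels, dst_labels, port) -> bool:
    """Decide whether traffic from src to dst on the given port is allowed."""
    selecting = [p for p in policies if matches(p.pod_selector, dst_labels)]
    if not selecting:
        return True  # no policy selects the destination pod: not isolated
    # Policies are additive: one matching allow rule is enough.
    return any(
        matches(p.allow_from, src_labels) and port in p.ports
        for p in selecting
    )
```

For example, with a single policy that protects app=db pods and allows app=api on port 5432, a pod labeled app=web is refused on that port, while traffic to an unrelated pod is unaffected.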

Ingress

Ingress resources control HTTP and HTTPS traffic from the outside world to services inside the cluster. You can define ingress rules, but Kubernetes itself doesn't act on them. This is one of the extensibility points: you need to deploy a third-party ingress controller that watches the ingress resources you define and enforces the rules.
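The core job of an ingress controller, matching a request's host and path to a backend service, can be sketched in a few lines of Python. This is an illustrative model rather than how any particular controller is implemented. The Rule type and the example hosts and backends are invented for the example. It assumes prefix-style path matching on whole path segments, with the longest matching prefix winning.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    host: str
    path_prefix: str
    backend: str  # "service:port"


def prefix_matches(prefix: str, path: str) -> bool:
    """Match on whole path segments: /api matches /api/v1 but not /apiary."""
    trimmed = prefix.rstrip("/")
    return trimmed == "" or path == trimmed or path.startswith(trimmed + "/")


def route(rules, host: str, path: str) -> Optional[str]:
    """Return the backend of the most specific matching rule, if any."""
    candidates = [
        rule for rule in rules
        if rule.host == host and prefix_matches(rule.path_prefix, path)
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda rule: len(rule.path_prefix.rstrip("/")))
    return best.backend
```

With rules for "/" and "/api" on the same host, a request for /api/v1/items goes to the "/api" backend, while /apiary falls through to the catch-all.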

Gateway API

The Gateway API is the evolution of Ingress. It is more expressive and flexible, and it is broken into multiple resources so that application developers and cluster administrators can cooperate without stepping on each other's toes.

The network requirements of enterprise systems

As you can see, Kubernetes provides a robust and clean networking model. Many of the fundamental building blocks of networking are supported. However, as an enterprise organization, you probably need much more.

For example:

  • Sophisticated routing
  • Strong security
  • Observability
  • Inter-service authentication and authorization
  • Load balancing
  • Health checks
  • Timeouts and retries
  • Fault injection
  • Bulkheads
  • Rate limiting
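To make one item from this list concrete, here is a minimal Python sketch of retries with exponential backoff, the kind of logic that otherwise has to live in every service's client code. The function name and parameters are assumptions made for illustration. A production version would also enforce a per-attempt timeout and retry only errors known to be transient.

```python
import time


def call_with_retries(func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**n seconds and try again.

    The sleep function is injectable so the behavior can be tested
    without actually waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:  # simplification: every error is retried
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

Even this small helper has policy decisions baked in (how many attempts, how long to wait), which is why keeping such logic consistent across many teams and languages is hard.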

Before the cloud-native era, the landscape for enterprise organizations was proprietary. Organizations ran their systems in private data centers. Infrastructure was mostly static, with separate IT teams responsible for capacity planning. Software was typically a large monolith with long release cycles.

To meet the enterprise networking requirements listed above, the common practice was to have all software teams use standard client libraries that enforced policies and handled connectivity between subsystems. This led to a lack of flexibility, projects that ran over budget and past deadline, and slow decay, because complex software systems had no practical way to keep up with modern innovation.

Let’s fast forward to the cloud-native age!

Networking in the cloud-native age

In the modern age, software systems are deployed in the cloud, on multiple clouds, in private data centers, and even at edge locations. The infrastructure is dynamic. The software comprises hundreds or even thousands of microservices that may be implemented in multiple programming languages. Infrastructure and application development follow DevOps practices for continuous delivery. Security is integrated into the process following DevSecOps practices. Different components of the system are released constantly.

This was a boon for productivity and flexibility—but brought on new problems of management, control, and policy enforcement. All these microservices implemented in multiple languages somehow need to interact. Developers and administrators need to understand the flow of information, be able to detect and mitigate problems, and secure the data and the infrastructure.

Enter the service mesh.

Service mesh to the rescue

A service mesh is a networking software layer that uses a control plane to configure policies and a data plane made of proxies or node agents to intercept network traffic and apply those policies.
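The split between the two planes can be illustrated with a toy Python model. The class names, the policy format, and the status codes below are all invented for this sketch and do not correspond to any real mesh's API. The point is only the division of labor: the control plane holds and distributes policy, while each proxy intercepts traffic for one workload and applies whatever policy it was last given.

```python
class Proxy:
    """Data plane: sits in front of one workload and applies its policy."""

    def __init__(self, service):
        self.service = service
        self.policy = {}

    def handle(self, source, request, upstream):
        """Intercept a request from `source` before it reaches the workload."""
        allowed = self.policy.get("allowed_sources")
        if allowed is not None and source not in allowed:
            return 403, "denied by mesh policy"
        return 200, upstream(request)


class ControlPlane:
    """Control plane: stores policies and pushes them to registered proxies."""

    def __init__(self):
        self.proxies = {}

    def register(self, proxy):
        self.proxies[proxy.service] = proxy

    def apply(self, service, policy):
        # Push a copy so later edits by the caller do not leak into the proxy.
        self.proxies[service].policy = dict(policy)
```

Note that the workload itself (the upstream callable here) never changes when a policy is applied. Only the proxy's behavior does.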

A service mesh has many benefits in a modern, large, and dynamic networking environment, such as Kubernetes-based systems, where new workloads are deployed constantly, pods come and go, and instances scale up or down.

  • The service mesh externalizes all the networking concerns from the applications. Now they can be managed and updated centrally. By offloading all networking concerns to the service mesh, service developers can focus their efforts solely on their application and business logic.
  • With a service mesh, you upgrade the mesh once, and every service transparently benefits from the latest version. Traditionally, to introduce a change or upgrade to a client library, you would need to negotiate with each team individually and support multiple versions of the library across multiple programming languages.
  • You benefit from the efforts of experts that keep evolving, improving, and optimizing the service mesh. The service mesh is also used and battle-tested by many organizations. This means that problems that might impact you may have been discovered and reported by other users.
  • As a central component that touches all of your services, the service mesh can handle cross-cutting concerns—such as observability, health checks, and access policy enforcement—across all services in your Kubernetes-based system.
  • The service mesh can add a layer of security to an enterprise’s inter-service communication by employing a zero-trust approach to access and using mTLS to encrypt traffic for secure communication. Additionally, limiting access from application to application helps to ensure that a malicious attacker who exploits one service cannot move laterally through your network to exploit other services.
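To show what the "mutual" in mTLS means, here is a small Python sketch using the standard ssl module to build a server-side TLS context that also demands a certificate from the client. The function name and its parameters are assumptions for illustration. A service mesh does the equivalent inside its proxies, typically with certificates it issues and rotates itself, so applications never need code like this.

```python
import ssl


def mtls_server_context(cert_file=None, key_file=None, ca_file=None):
    """Build a server-side TLS context that requires a client certificate."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # The "mutual" part: the handshake fails unless the client presents
    # a certificate that chains to a trusted CA.
    context.verify_mode = ssl.CERT_REQUIRED
    if cert_file and key_file:
        context.load_cert_chain(cert_file, key_file)  # this server's identity
    if ca_file:
        context.load_verify_locations(cafile=ca_file)  # CA for client certs
    return context
```

The file paths are left optional here so the sketch runs without real certificates. A usable server would need all three.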

Service meshes on Kubernetes

A service mesh fits Kubernetes like a glove. Kubernetes makes it easy for service meshes to integrate with the platform due to its extensibility. The synergy between Kubernetes and service meshes is powerful because the service mesh builds on top of the basic Kubernetes networking model.

For large systems—in particular, systems composed of multiple Kubernetes clusters—the service mesh becomes a standard add-on. Once enterprises begin working with multiple clusters, which might spread across different clouds, the service mesh becomes an essential component for properly facilitating and securing inter-service communication.

A quick review of service meshes on Kubernetes

If you’re ready to implement a service mesh on top of Kubernetes, there are many choices. Let’s look at a few of them and their strengths and attributes.

  • Istio is arguably the most popular service mesh for Kubernetes. Google, IBM, and Lyft originally developed it. It uses the Envoy project from Lyft as its data plane.

  • Linkerd was the first service mesh. Its claim to fame is that it is more performant and less complicated than Istio. It implements its own data plane proxy in Rust.

  • Kuma is a service mesh originally developed by Kong, which also offers an enterprise service mesh called Kong Mesh built on top of Kuma. Like Istio, Kuma uses Envoy as its data plane. Its claim to fame is that it allows connecting Kubernetes clusters with non-Kubernetes workloads running on VMs (but Istio now has this capability, too).

Here are several other service meshes you may want to explore:

  • Traefik Mesh (node agents, rather than sidecars, as the data plane)
  • Open Service Mesh (heavily pushed by Microsoft, can be enabled on AKS as an add-on)
  • AWS App Mesh (AWS proprietary service mesh, strong integration with EKS, ECS, and EC2)
  • Cilium Mesh (up-and-coming service mesh using eBPF in the data plane)

Kubernetes, service meshes, and Kentik Cloud

As we’ve seen, Kubernetes with a service mesh is a powerful combination that lets you connect workloads across clouds, data centers, and the edge and enforce policies and best practices.

As an additional benefit, as the service mesh works, it collects a lot of valuable data from flow logs and metrics related to your network traffic. This data can help you create a more robust and reliable system. But being able to understand and use this data in a meaningful way, and make it actionable, can be difficult. A strong and robust observability solution (such as Kentik Cloud) can help you make sense of the data from your service mesh, ensuring your system is cost-effective, healthy, and performs well. It can also help to mitigate incidents and/or attacks.
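As a small illustration of turning raw mesh telemetry into something actionable, the following Python sketch aggregates per-request records into request counts, error rates, and worst-case latency for each pair of services. The record layout (source, destination, HTTP status, latency in milliseconds) is an assumption made for this example. Real meshes and observability platforms expose far richer data.

```python
from collections import defaultdict


def summarize(records):
    """Summarize (source, destination, status, latency_ms) records per pair."""
    grouped = defaultdict(list)
    for source, destination, status, latency_ms in records:
        grouped[(source, destination)].append((status, latency_ms))

    summary = {}
    for pair, samples in grouped.items():
        # Treat 5xx responses as errors for this sketch.
        errors = sum(1 for status, _ in samples if status >= 500)
        summary[pair] = {
            "requests": len(samples),
            "error_rate": errors / len(samples),
            "max_latency_ms": max(latency for _, latency in samples),
        }
    return summary
```

A summary like this already hints at where to look first, for instance a service pair with a high error rate or an unusually slow worst case.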

Conclusion

Kubernetes is a powerful tool for modern cloud infrastructure. Out of the box, it offers some networking capabilities, but by adding a service mesh on top, you gain a long list of benefits. Hopefully, you now understand how Kubernetes and service meshes can work together to create modern and robust enterprise systems.

To learn more about how multi-cluster service meshes solve hybrid and multi-cloud networking complexities, read our previous article, Kubernetes and Cross-cloud Service Meshes.
