At the recent AWS re:Invent conference, we heard many attendees talking about cloud-native architecture and container-first approaches to application development. The discussions focused not only on leveraging cloud-native architecture to foster innovation but also on speeding up development for the attendees’ growing businesses. With all of the cloud-native buzz, we wanted to provide a deeper dive on the topic.
First: What Does It Mean to be Cloud-Native?
Cloud-native is a term used to describe a modern software architecture, deployment, and operations model. It originated with early adopters of public cloud infrastructure, but it’s now gaining popularity across application development in general. It’s appealing because it speeds innovation by breaking large software projects into smaller components and creates portability by abstracting away infrastructure dependencies. The end goal is to deliver apps at the pace a business needs.
In theory, a cloud-native architecture promises the best user experience with the fewest resources, which we see in four specific areas:
Deployment agility - This is the idea that applications should be independent of the infrastructure they run on, so development teams can focus on creating value for the business rather than infrastructure dependencies. This provides the agility to quickly distribute or move workloads between in-house or cloud infrastructure in response to changing business conditions.
Rapid scalability - One of the characteristics of cloud-native architectures is “elasticity,” which unlocks autoscaling: the ability to scale down when the workload is small and to scale up during the business’s peak times without any reprovisioning.
Operational efficiency - With increased scale, manual provisioning, management, and troubleshooting processes become unworkable. Automation becomes essential for managing workflows and reducing human errors.
High performance - Cloud-native architecture makes it possible to leverage massive computing resources (e.g., CPUs and GPUs), which lowers the barrier to enhanced digital performance for anyone. This matters especially for AI and ML workloads, which eat up a ton of computing resources.
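To make the elasticity idea above more concrete, here is a minimal sketch of a proportional scaling rule. It uses the formula documented for the Kubernetes Horizontal Pod Autoscaler, desired = ceil(current replicas × current metric / target metric), and leaves out the tolerances and stabilization windows a real autoscaler applies. The function name, the replica bounds, and the sample numbers are our own illustrative assumptions.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Proportional scaling rule: pick a replica count that moves the
    per-replica metric (e.g. average CPU %) toward the target value."""
    if current_replicas < 1 or target_metric <= 0:
        raise ValueError("need at least one replica and a positive target")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the (hypothetical) bounds configured for this workload.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Peak time: 4 replicas averaging 90% CPU against a 60% target.
    print(desired_replicas(4, 90.0, 60.0))
    # Quiet period: the same 4 replicas averaging 15% CPU.
    print(desired_replicas(4, 15.0, 60.0))
```

With these sample numbers the rule asks for 6 replicas at peak and 1 replica when traffic is quiet, which is the scale-up and scale-down behavior described above.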
Some concepts and technologies that are usually associated with cloud-native architectures:
DevOps - To speed up deployments and reduce operational risk, development and operations teams started collaborating more frequently and brought in system automation. Eventually, these Dev and Ops teams merged into “DevOps” teams. DevOps teams now go hand in hand with cloud-native architecture, enabling streamlined application deployment and faster time to market.
Containers and Container Orchestration - GitLab has a good definition for containers: “A container is a method of operating system-based virtualization that allows you to securely run an application and its dependencies independently, without impacting other containers or the operating system. Each container contains only the code and dependencies needed to run that specific application, making them smaller and faster to run than traditional VMs.” Nowadays, many applications are run in “containerized” form in the cloud. And with larger containerized applications, orchestration becomes necessary to automate, scale, and manage them. Kubernetes is by far the most popular orchestrator.
Microservices (sometimes called microservice architecture) - This refers to structuring an application as a collection of loosely coupled, lightweight services, each implementing a specific, granular piece of the application. Development teams can iterate on or scale each microservice independently of the others, speeding development. Microservice architectures are also platform-agnostic, allowing services to be written in different languages or deployed across multiple types of infrastructure for maximum flexibility.
CI/CD (a.k.a. Continuous Integration/Continuous Delivery) - CI/CD isn’t totally new (a Wiki definition can be found here). It’s basically a software engineering practice of constantly deploying new, small code changes, with automated recovery mechanisms in place for when a change causes problems. Avoiding infrastructure monoliths is fundamental to achieving CI/CD.
Observability - Given the increasing complexity of microservices, there has been greater emphasis on the need for modern monitoring practices to gain better insight into how applications perform and operate. Observability incorporates concepts of pervasive instrumentation and retention of fine-grained metrics.
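As a rough illustration of the “pervasive instrumentation” mentioned under Observability, the sketch below records a labeled request counter and per-request durations using only the Python standard library. The Metrics class, the metric names, and the “checkout” service are hypothetical. A production setup would normally rely on an established metrics or tracing library rather than a hand-rolled store like this one.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Metrics:
    """Tiny in-memory metric store keyed by (metric name, sorted labels)."""

    def __init__(self):
        self.counters = defaultdict(int)
        self.durations = defaultdict(list)

    @staticmethod
    def _key(name, labels):
        # Sort labels so the same label set always maps to the same key.
        return (name, tuple(sorted(labels.items())))

    def inc(self, name, **labels):
        self.counters[self._key(name, labels)] += 1

    @contextmanager
    def timed(self, name, **labels):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Keep every sample (fine-grained retention) instead of an average.
            elapsed = time.perf_counter() - start
            self.durations[self._key(name, labels)].append(elapsed)


metrics = Metrics()


def handle_checkout(order_id: str) -> str:
    # "checkout" and "/orders" are made-up labels for illustration.
    with metrics.timed("request_seconds", service="checkout", route="/orders"):
        metrics.inc("requests_total", service="checkout", route="/orders")
        return f"order {order_id} accepted"
```

Labeling each sample by service and route is what later lets you ask questions per microservice instead of per host.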
Network Challenges Brought by Cloud-Native Architecture
While a cloud-native architecture has the potential to significantly improve ROI for organizations, the technologies described above also bring network challenges to the infrastructure. Why?
First, the infrastructure challenges: Networks still underpin everything in cloud-native architectures. In fact, networks become even more critical because workloads are no longer in the form of monolithic software. Services that are single-function modules with well-defined interfaces need to talk to other services and resources that are all over the place. These communications dramatically increase network complexity.
Second, dynamics! Containers and workloads can spin up or down based on demand anytime, anywhere. The ephemeral nature of containers makes it difficult to troubleshoot historical events, because the container may no longer be running. And because network identifiers like IP addresses may represent different services from moment to moment, they no longer reliably identify specific services or applications on their own.
Next, multi-cloud makes it even more complicated. Although cloud-native architectures are designed to be infrastructure agnostic, operations teams still need to understand how applications affect network infrastructure to manage cost, performance, and security. This becomes very difficult when each infrastructure has a separate console for visibility and workload management, which creates silos and operational complexity.
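The point about IP addresses changing meaning from moment to moment can be sketched as a time-aware lookup: instead of asking “which service is this IP?”, you ask “which service held this IP at that time?”. The class below is a minimal illustration under our own assumptions (assignment events arrive in time order, timestamps are plain numbers, and the service names are invented). It is not a description of how any particular tool stores this data.

```python
from bisect import bisect_right
from collections import defaultdict


class IpOwnershipHistory:
    """Records which service held an IP address from a given timestamp on."""

    def __init__(self):
        self._starts = defaultdict(list)    # ip -> sorted assignment start times
        self._services = defaultdict(list)  # ip -> service name per start time

    def assign(self, ip, start_ts, service):
        # Assumes assignment events are recorded in chronological order per IP.
        if self._starts[ip] and start_ts < self._starts[ip][-1]:
            raise ValueError("assignments must be recorded in time order")
        self._starts[ip].append(start_ts)
        self._services[ip].append(service)

    def owner_at(self, ip, ts):
        """Return the service that held `ip` at time `ts`, or None if unknown."""
        starts = self._starts.get(ip, [])
        # Index of the latest assignment that began at or before `ts`.
        idx = bisect_right(starts, ts) - 1
        return self._services[ip][idx] if idx >= 0 else None


if __name__ == "__main__":
    history = IpOwnershipHistory()
    history.assign("10.0.0.7", 100, "cart")
    history.assign("10.0.0.7", 250, "search")  # the IP was reused by another container
    print(history.owner_at("10.0.0.7", 120))
    print(history.owner_at("10.0.0.7", 300))
```

The same address resolves to “cart” at time 120 and to “search” at time 300, which is why a historical event can only be interpreted together with its timestamp. Recording an assignment with a service of None would be one simple way to mark the period after a container releases its address.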
When an issue occurs, you first need to see it, and then troubleshoot, understand, and fix it. Cloud-native architectures create challenges for today’s visibility tooling, too:
Traditional tools can’t deploy where you need them: Hardware appliances cannot be plugged into public cloud infrastructure, where many cloud-native workloads are deployed. Even VM-based appliances pose challenges here. These solutions may be useful for traditional infrastructure, but not for modern architectures.
Lack of context: Basic network identifiers like IP addresses, interfaces, and network devices lose their meaning in cloud-native environments. In order to provide useful insight, tooling must continuously map those identifiers to new labels like container and service names, customers, applications, and geolocations.
Compliance-centric: We see many tools that are focused on compliance requirements that are common in security operations center (SOC) workflows, but that aren’t useful for other operational concerns like troubleshooting performance issues or managing cost.
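The “lack of context” problem above comes down to enrichment: attaching service-level labels to raw flow records before analyzing them. Here is a small sketch of that step. The record fields, the IP_CONTEXT table, and the service names are all hypothetical. In practice the mapping would be refreshed continuously from an orchestrator or cloud API rather than hard-coded.

```python
from collections import Counter

# Hypothetical snapshot of IP -> context labels.
IP_CONTEXT = {
    "10.0.0.7": {"service": "cart", "application": "storefront"},
    "10.0.1.9": {"service": "payments", "application": "storefront"},
}

UNKNOWN = {"service": "unknown", "application": "unknown"}


def enrich(flow):
    """Return a copy of a raw flow record with service labels attached."""
    src = IP_CONTEXT.get(flow["src_ip"], UNKNOWN)
    dst = IP_CONTEXT.get(flow["dst_ip"], UNKNOWN)
    return {
        **flow,
        "src_service": src["service"],
        "dst_service": dst["service"],
    }


def bytes_by_service_pair(flows):
    """Total bytes per (source service, destination service) pair."""
    totals = Counter()
    for flow in map(enrich, flows):
        totals[(flow["src_service"], flow["dst_service"])] += flow["bytes"]
    return totals


if __name__ == "__main__":
    raw_flows = [
        {"src_ip": "10.0.0.7", "dst_ip": "10.0.1.9", "bytes": 1000},
        {"src_ip": "10.0.0.7", "dst_ip": "10.0.1.9", "bytes": 500},
        {"src_ip": "10.0.0.7", "dst_ip": "203.0.113.5", "bytes": 200},
    ]
    for pair, total in bytes_by_service_pair(raw_flows).items():
        print(pair, total)
```

Once the records carry labels, a question like “how much does cart send to payments?” no longer depends on knowing which IP addresses those services happen to hold.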
At the AWS re:Invent conference this year, we spoke with many attendees who visited Kentik’s booth, and many agreed with us on the challenges that they have been facing while adopting cloud infrastructure and cloud-native architecture. Three of the most challenging problems attendees mentioned stem from a tooling gap in their organizations, which impacts:
Performance management: This includes both network and application performance. It means not only keeping services running and stable and fixing issues quickly, but also building and evolving the cloud architecture to achieve the best performance that aligns with business goals.
Cost management: We’ve heard from many cloud adopters who were surprised by some of the charges on their cloud bills. Being on the receiving end of this surprise is never fun (for you or your financial controllers). However, in many cases, you can cut your bill with informed network architecture decisions — for example, by reducing expensive inter-region and egress traffic.
Security: The cloud model is a shared responsibility model. While cloud providers are responsible for protecting the lower infrastructure layers, cloud adopters are the ones who are responsible for the applications they run on top of it. Because of these layers of responsibility, it’s harder than ever to ensure your cloud environment is fully protected.
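For the cost management point above, a useful first step is simply knowing how much traffic falls into each billing-relevant class. The sketch below groups flow volumes into intra-region, inter-region, ingress, and egress buckets. The tuple layout and the convention that a region of None means “outside the cloud” are our assumptions, and no prices are modeled because rates vary by provider and change over time.

```python
from collections import Counter


def traffic_class(src_region, dst_region):
    """Classify a flow by the boundary it crosses.

    A region of None is our (assumed) marker for an endpoint outside the cloud.
    """
    if dst_region is None:
        return "egress"
    if src_region is None:
        return "ingress"
    return "intra-region" if src_region == dst_region else "inter-region"


def bytes_by_class(flows):
    """Sum bytes per traffic class for (src_region, dst_region, bytes) tuples."""
    totals = Counter()
    for src_region, dst_region, nbytes in flows:
        totals[traffic_class(src_region, dst_region)] += nbytes
    return totals


if __name__ == "__main__":
    sample = [
        ("us-east-1", "us-east-1", 100),
        ("us-east-1", "us-west-2", 200),
        ("us-east-1", None, 300),
    ]
    for name, total in bytes_by_class(sample).items():
        print(name, total)
```

A breakdown like this points at which inter-region or egress paths are worth redesigning before the next bill arrives.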
Kentik Cloud-Native Visibility
Cloud-native architecture has huge benefits, but it brings big challenges, too. Getting it right is crucial for the success of application development and migration. Kentik’s platform provides solutions that benefit every major cloud stakeholder, including NetOps, NetEng, SecOps, DevOps, and executives.
Kentik’s cloud analytics suite addresses all of the cloud management challenges we discussed earlier in the post: network and application performance management, cost management, and cloud protection. With analytics that span the big picture down to the fine details, plus proactive anomaly detection, we provide a comprehensive visibility solution for every piece of your cloud-native infrastructure.