Are You Spending Too Much on Kubernetes? | by Denilson N. | Jun, 2022

Probably, yes.

Photo by Vitaly Taranov on Unsplash

Anyone running a fleet of Kubernetes clusters knows it takes sustained investment in hardware and operations personnel to keep everything online. That level of investment keeps Kubernetes in a rarefied layer of architecture diagrams reserved for companies managing large environments, often in regulated industries that need to run them on-premises.

When the choice was Kubernetes versus Docker, the decision to embrace Kubernetes was straightforward, but several waves of new and sophisticated managed services invalidated many assumptions supporting those decisions, from cost to security to reliability.

Nowadays, many viable alternatives exist to schedule container-based workloads without running the entire cluster stack from the ground up.

Large box representing a Kubernetes cluster. The box is split in 3 horizontal slices: a top slice for pods scheduled by developers, a middle slice for all Kubernetes components (etcd, nodes, kubelet, etc.), and a bottom slice for all the hardware supporting the cluster.
It takes many different skills across different layers of abstractions to run a Kubernetes cluster.

I compiled some common themes on the most cost-effective and frictionless ways of running containers, inside a Kubernetes cluster or not, offering a landscape view of what is available to get there.

Before discussing whether it still makes sense for you to run entire Kubernetes clusters on your own, it is essential to break down the significant phases of developing a product.

Different people group those phases in different ways, but let’s use a generic arrangement as the backdrop for this story:

  1. Design. End-user experience, specification of system components, and their interactions.
  2. Development. Coding, functional testing, documentation for end-users, and documentation for system integrators. It also includes the continuous integration pipeline that supports development activities.
  3. Deployment. Rollout development releases to production. It includes the continuous delivery pipeline.
  4. Operations. Everything related to running the system, including hardware and personnel.
  5. Support. Deals with product failures at the customer’s hands.
Box with 5 layers, from top to bottom: Design, Development, Deployment, Operations, Support.
People participating in a product cycle often do not consider all parts of the business when making decisions. Product owners must continually assess the cost of an opportunity (or new feature) across all layers.

Note that different teams and organizations may co-own these stages. For instance, a customer purchasing software as a license owns most of the “deployment” and “operations” stages.

When discussing resources, we must acknowledge that money and time are intertwined.

Adding capital to any system takes good credit and a reasonable business plan. Adding more “time” resources to a system, on the other hand, can take many forms. For instance, you can hire a consultancy, hire more people, or procure tools and services that increase productivity. Paradoxically, adding more “time” capacity into a product cycle takes effort and time. In short, you cannot increase your “time” capacity as quickly as you can increase your money supply.

When making decisions that shift costs from one product stage to another, consider the nature of the shift and ask yourself:

“Will it increase capital or time expenditures?”

It is essential to realize that decisions in the early stages of the cycle disproportionately affect the later stages. Adding an off-the-shelf middleware component during the “Design” stage may lower the costs during the “Development” stage but also increase the expenses during the “Operations” stage. I covered that scenario in “Asking the wrong question: Should developers be on call?”.

Takeaway: Time is the scarcer resource, so protect it before capital. If you run out of capital, you can get a loan, maybe dilute ownership, and try again. On the other hand, when you run out of people, it is the end of the line.

People specialized in one stage of the product cycle generally lack awareness of what matters most in other areas and tend to focus almost exclusively on improvements in their field of expertise.

Improvements to the system architecture may sound great, such as shifting from fixed pod allocation in a Kubernetes cluster to dynamic container scheduling using a managed container service. Many of these improvements are indeed great, but we must always pause for a moment to consider the impact on subsequent stages of the product lifecycle.

For instance, if your product is heavy on R&D, such as a medical decision platform, and the IaaS bill represents only 5% of the overall budget, even a 50% reduction in those costs will not make a big difference, especially when you factor in the investment required to achieve those savings.
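The arithmetic above is worth making explicit. A minimal sketch, using the illustrative numbers from the paragraph (a 5% IaaS share of the budget, cut by 50%):

```python
# Back-of-the-envelope check of the scenario above: a 50% cut in an
# IaaS bill that represents only 5% of the total budget.
# All numbers are illustrative, not benchmarks.

def overall_savings(iaas_share: float, iaas_reduction: float) -> float:
    """Fraction of the *total* budget saved by reducing the IaaS bill."""
    return iaas_share * iaas_reduction

saving = overall_savings(iaas_share=0.05, iaas_reduction=0.50)
print(f"{saving:.1%} of the total budget")  # prints "2.5% of the total budget"
```

A heroic 50% infrastructure optimization moves the total budget by 2.5%, before subtracting the engineering cost of achieving it.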

In that scenario, optimization efforts should start with the “Development” stage, not with the “Operations” stage. While it may be difficult for development infrastructure-minded folk like us to accept it, there are cases where something like an improved spreadsheet template can do more good than revamping parts of the system architecture. The incentives that keep technical people from making that kind of trade-off deserve their own story.

Takeaway: Kubernetes adoption is a means to an end (running workloads), not the only way, even for heavy adopters. In any given business, many vital processes still run elsewhere, are more critical, and consume more resources than a fleet of Kubernetes clusters.

Kubernetes is great at further virtualizing a data center into a well-established set of portable APIs, which significantly simplifies the ownership of static resources, but you still own everything. When a storage disk fails, the corresponding persistence volumes in the cluster also fail, and you need people who understand both to diagnose and fix the problems in each layer.

To stress the point, I was writing a Terraform template just the other day to create all the IaaS resources required to install a Kubernetes cluster. The final count was nearly one hundred resources, covering DNS records, subnets, security groups, load balancers, network gateways, and many others. The template became so extensive that it now requires a dedicated CI process to validate changes before applying them to the production system.

With a managed Kubernetes service, you no longer own the underlying infrastructure resources, such as VPCs (virtual private clouds), virtual machines, disks, and networking. That transition shifts administrative costs (less time chasing infrastructure resources) to runtime costs (the service provider charges you a premium to operate all the infrastructure).

Copy of the first picture with the 3 layers of a box showing containers, Kubernetes components, and then infrastructure components. The middle and bottom layers have lock signs in them, indicating that they are managed by a service provider.
With a managed Kubernetes service, the underlying infrastructure and parts of the control plane are locked down and kept in a read-only state for anyone outside the service provider. Your company still has access to the cluster, including many parts of the control plane, such as the ability to add storage classes and daemon sets.

The deciding factor here is the scale of deployment. Setting up CI/CD pipelines to validate new versions of clusters and skilling up an entire shift of on-call engineers (12–18 people) make no financial sense when you only need a handful of clusters. Once you reach ten or more, it may be cost-effective to staff an entire Kubernetes operations team and develop that practice in-house.

Takeaway: Managed clusters offer a different balance between cost and ownership, making even well-established Kubernetes shops consider their options.

It is tempting to analyze the workloads in a Kubernetes cluster and stare longingly at shifting everything to a container service in the same IaaS.

Using a managed container service may be a perfect fit for stateless and short-lived workloads, such as a micro-service that translates a page of text from one language to another. Still, if you consulted with your operations team, they would probably tell you that stateful workloads are their most significant source of worry and where they spend most of their engineering efforts. These are your database and messaging clusters, with their intricate procedures for storage, backups, archiving, upgrades, and so on.

Copy of the first picture with the 3 layers of a box showing containers, Kubernetes components, and then infrastructure components. The middle and bottom layers have lock signs in them, indicating that they are managed by a service provider. Some of the services are no longer in the first layer, but on a separate box, managed by a separate company (named ACME.)
Running stateful dependencies inside a cluster is significantly more complex than running stateless workloads. Things like backup, archival, disaster recovery, and upgrades require specialized skills, detailed planning, and continuous upkeep.

Recalling lesson #2 (“90% of nothing is nothing”), dealing with stateless workloads first may not result in a meaningful reduction in the overall budget of the “Operations” stage of the product, so consider dealing with stateful workloads first.

Those workloads do not fit a serverless model well, as the compute aspect (CPUs) sits inside the pod, coupled with mounted storage volumes. In that case, the starting point is to replace those workloads with a service in the IaaS, such as using managed instances of PostgreSQL or Kafka.

Note that such shifts still leave residual operational costs behind, such as the need to monitor the health of the service and to have procedures in place to trace occasional problems back to failures in the service. Still, those activities require fewer people and do not require in-depth knowledge of the service runtimes.

Takeaway: Assess how much of your overall product cost is spent on each service in your data center, from design to support. Then assess whether it could cost less (again, in terms of total cost) to lease the services from a specialized company.

Although not as pervasive across all cloud providers (see examples here and here), this type of service is a clever addition to a managed Kubernetes service. Virtual kubelets extend the cluster scheduling service to run select containers elsewhere in the IaaS instead of inside a cluster node.

That is an excellent solution for scenarios where you are not ready to rearchitect the entire system into a combination of services and still need to keep Kubernetes clusters around for a while, with the added advantage of keeping the number of fixed worker nodes as low as possible.

Since the approach reuses Kubernetes development and deployment APIs, the costs of transitioning to a serverless model are minimal.

Copy of the first picture with the 3 layers of a box showing containers, Kubernetes components, and then infrastructure components. The middle and bottom layers have lock signs in them, indicating that they are managed by a service provider. Some of the services are no longer in the first layer, but on a separate box, managed by a separate company (named ACME.) Some of the containers and pods are on yet another box, also managed by ACME.
With a virtual kubelet, someone creates a new workload in a managed cluster. The service provider uses additional user configuration to decide whether to schedule the pod inside the cluster or elsewhere in the service provider cloud.

Takeaway: If you already leaped into using managed Kubernetes clusters, isolate workloads that don’t run frequently and don’t need to run inside the cluster. Let the service provider schedule those pods outside the cluster, then deallocate whatever fixed cluster capacity you had set aside for them.
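As a concrete sketch, steering a pod to a virtual node is usually a matter of a node selector plus a toleration for the taint that keeps regular pods off the virtual node. The exact label and taint keys vary by provider; the names below follow the common virtual-kubelet convention and may differ in your cloud, and the pod name and image are placeholders:

```yaml
# Hypothetical pod spec steering an infrequent workload to a virtual node.
# Label and taint names are assumptions based on the virtual-kubelet
# convention; check your provider's documentation for the exact values.
apiVersion: v1
kind: Pod
metadata:
  name: batch-report            # illustrative name
spec:
  nodeSelector:
    type: virtual-kubelet       # schedule on the provider-managed virtual node
  tolerations:
    - key: virtual-kubelet.io/provider
      operator: Exists          # tolerate the taint that keeps other pods off
  containers:
    - name: report
      image: example.com/report:latest   # placeholder image
```

Once such workloads stop landing on fixed worker nodes, the corresponding node-pool capacity can be deallocated.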

If you got this far, you have workloads that cannot be sensibly moved out of a Kubernetes cluster. The only thing left is eliminating wasteful capacity overprovisioning within the cluster.

For instance, if your cluster has ten worker nodes and they are all above 90% resource utilization (CPU, memory, or disk), the temporary loss of a worker node means the cluster can no longer schedule all pods.
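The ten-node example reduces to a simple N-1 headroom check: can the remaining nodes absorb the load of a lost node? A minimal sketch, with illustrative numbers:

```python
# N-1 headroom check for the ten-node example above: can the cluster
# still fit all pods if one node is lost? Numbers are illustrative,
# treating each node's capacity as 1.0 unit.

def survives_node_loss(nodes: int, utilization: float) -> bool:
    """True if the remaining nodes can absorb the load of one lost node."""
    demand = nodes * utilization        # total resources in use
    capacity_after_loss = nodes - 1     # each surviving node contributes 1.0
    return demand <= capacity_after_loss

print(survives_node_loss(10, 0.92))  # prints "False": no headroom left
print(survives_node_loss(10, 0.85))  # prints "True": one node can be lost
```

Anywhere above 90% average utilization, ten nodes cannot tolerate the loss of a single one.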

When you look closer at that utilization, the waste becomes more apparent: a sizeable slice is allocated to container resource requests, whether or not the containers requesting those resources are actually using them.
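The mechanism behind that waste is that the scheduler reserves the full `resources.requests` amount on a node at scheduling time, regardless of actual consumption. A hypothetical fragment (names, image, and sizes are illustrative):

```yaml
# Hypothetical deployment fragment: the scheduler reserves the full
# "requests" amount on a node even if the container idles at a small
# fraction of it, so oversized requests become invisible waste.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: translator              # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: translator
  template:
    metadata:
      labels:
        app: translator
    spec:
      containers:
        - name: translator
          image: example.com/translator:latest   # placeholder image
          resources:
            requests:
              cpu: "2"          # reserved on the node at schedule time...
              memory: 4Gi       # ...even if actual usage is far lower
            limits:
              cpu: "2"
              memory: 4Gi
```

Three replicas of this deployment pin six CPUs and 12Gi of memory whether the containers are busy or idle.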

And remember what I mentioned about the total cost of operations? Unschedulable pods will likely cause problems in other systems and prompt several people to deal with the situation, from customers contacting support to automated monitors paging people to look into the issue.

I covered auto-scaling options in this story, but in a nutshell, you want to consider:

  • HPA (Horizontal Pod Autoscaling): Adapts the number of replicas for a workload based on resource utilization in the workload containers.
  • VPA (Vertical Pod Autoscaling): Modifies container sizes in scenarios where one cannot modify the number of container replicas.
  • KPA (Knative Pod Autoscaling): Schedules containers based on requests-per-second targets, with the option to scale a deployment down to zero if there are no pending requests for a while.
  • Cluster autoscaling: Scales pools of worker nodes according to resource requests and utilization.
Illustration of how pod autoscaling technologies handle containers that need more resources. The overloaded containers are represented as small boxes stretched in the middle, with steam coming out of the corners. HPA creates another box of the same size, no longer overstretched. VPA makes the box larger. KPA creates more boxes, then deletes them after the work is done.
Pod autoscaling technologies automatically adjust capacity to match workloads. When paired with cluster autoscaling, they allow the operations team to run containers with fewer nodes. The effects are most noticeable on the IaaS bill, as a marginal change in the number of cluster nodes has little impact on the effort required to manage the cluster.
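As an illustration of the first option, here is a minimal HPA sketch using the `autoscaling/v2` API, targeting 70% average CPU utilization; the deployment name is a placeholder:

```yaml
# Minimal HorizontalPodAutoscaler sketch (autoscaling/v2): keep the
# average CPU utilization of a deployment around 70%, scaling between
# 2 and 10 replicas. The target name is illustrative.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: translator-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: translator            # placeholder deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Note that utilization-based HPA only works when containers declare CPU requests, which ties back to the request-sizing discussion above.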

Takeaway: Autoscaling is not for everyone; it takes work to get right. It is an option for large-scale deployments where the savings in runtime costs offset the investment in deploying and tuning the autoscalers.

Kubernetes is a noticeable portion of an IT infrastructure, and optimizing resource utilization may be a tempting starting point for cost reductions.

Please resist the temptation to start at the most visible target and take the time to understand the total cost of operations for a product and its distribution across the multiple stages of the product lifecycle.

Teams with design and development responsibilities must grasp how other stages work and make every decision with the overall cost to the project in mind.

Shifting entire system layers to IaaS vendors can be a sensible choice for small operations teams, where it may not be possible to staff 24×7 on-call shifts of people skilled across all layers of the stack. Adopting managed Kubernetes over self-hosted clusters is a good starting point, while stateful workloads make for an excellent second step.
