After 3 years of managing K8s clusters, I learned 10 lessons

Background

Over the past three years, I've navigated the sometimes choppy waters of managing Kubernetes clusters. This journey of challenge and discovery has given me a deep understanding of the technology and of the operational work that surrounds it. In this post, I'd like to share ten of the most valuable lessons I've learned as a Kubernetes cluster administrator.

The lessons cover a variety of topics, from managing the underlying infrastructure to optimizing the deployment process, including best practices for ensuring cluster scalability and security. Whether you're new to Kubernetes or a seasoned expert, these tips will give you a rich perspective on how to effectively manage your Kubernetes cluster.

Let's dive into these lessons, which are the culmination of three years of experience, successes, and challenges.

Lesson 1: Use Kubernetes in the cloud

Unless you face extreme constraints, don't manage the underlying Kubernetes infrastructure yourself. You will spend time debugging issues that bring no value to your business. It's great to be an expert in kube-apiserver, kubelet, etcd, kube-proxy, and so on, but maintaining them yourself day to day doesn't create any business value, and you don't need that expertise to manage clusters effectively. Delegate this low-level work to a cloud provider (AWS, Azure, GCP, OVH, etc.) that does it better than you. At HK-Tech, we chose AWS and EKS clusters (note that ECS is not Kubernetes!).

Lesson 2: Deploy all Kubernetes-related infrastructure as code

Nothing in the cluster should be done manually in the console, not even a simple tag. Above all, this avoids the "I'll fix it quickly in the console first and update the code later" mentality. That is a myth: you never will.

Lesson 3: Avoid overusing Helm charts that you don't have full control over

Yes, they are great, they get you running quickly, and you don't have to bother writing your own YAML, until one day an update makes everything crash. If you're really lazy or short on time, at least make the effort to understand the values.yaml file and avoid relying on default values. At HK-Tech, the rule is not to use Helm charts; at worst, we just extract the templates.
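One way to act on the "avoid default values" advice is to list which chart defaults you have never explicitly set. The following is a minimal Python sketch, assuming the chart's default values and your own overrides have already been loaded into nested dictionaries (for example from the two YAML files). The function name and the sample keys are hypothetical and not taken from any particular chart.

```python
def unreviewed_defaults(defaults, overrides, prefix=""):
    """Return dotted paths of default values that the overrides never set."""
    missing = []
    for key, default in defaults.items():
        path = f"{prefix}.{key}" if prefix else key
        if key not in overrides:
            # Nothing was set for this key, so the chart default applies as-is.
            missing.append(path)
        elif isinstance(default, dict) and isinstance(overrides[key], dict):
            # Both sides are nested sections: compare them key by key.
            missing.extend(unreviewed_defaults(default, overrides[key], path))
    return missing


# Hypothetical chart defaults and user overrides, for illustration only.
chart_defaults = {
    "replicaCount": 1,
    "resources": {"limits": {}, "requests": {}},
    "service": {"type": "ClusterIP", "port": 80},
}
my_values = {
    "replicaCount": 3,
    "service": {"port": 8080},
}

print(unreviewed_defaults(chart_defaults, my_values))
# ['resources', 'service.type']
```

Every path it prints is a setting you are accepting blindly and should read before deploying.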

Lesson 4: Kubernetes doesn't like "lift and shift"

So before moving to k8s, start by making your legacy applications cloud-native. It's not about adapting k8s to your app, it's about adapting your app to k8s. If you can't rewrite your application, it may be best to stick with the old virtual machine model.

Lesson 5: To mesh or not to mesh?

Don't install a service mesh if you don't need it. So how do you know whether you need it? Ask yourself two questions: Do the applications in my cluster communicate with each other? Do I need security policies for the traffic between applications in my cluster? If the answer to both is yes, then installing a service mesh may be useful. I don't have a specific recommendation; the various mesh technologies are usually similar to each other.

Lesson 6: Avoid using too many tools

The Kubernetes ecosystem offers a plethora of ancillary tools that promise miracles and wonders for managing your clusters: Argo CD, Lens, K9s, KEDA, Krew, kubectx, kubens, Kail, and many more. Avoid collecting them like a stamp collector, and let's be honest: 90% of your needs can be met with kubectl. Personally, I've limited myself to kubectx, kubens, and k9s, which are genuinely useful for cluster management.

Lesson 7: Resource limits (memory and CPU) must be defined for pods

This prevents poorly coded or misconfigured applications from gobbling up all the resources of your cluster and causing other applications to crash one after another because of a few voracious pods. It is also one of the reasons to be wary of Helm charts and to always inspect in detail the manifests behind the packaging.
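A check like this is easy to automate before anything reaches the cluster. Here is a minimal Python sketch, assuming the pod spec has already been parsed into a dictionary that follows the usual manifest layout (containers[].resources.limits). The function name, container names, and images are hypothetical.

```python
REQUIRED_LIMITS = ("cpu", "memory")


def containers_missing_limits(pod_spec):
    """Return (container name, missing resource) pairs for a pod spec dict."""
    problems = []
    for container in pod_spec.get("containers", []):
        limits = container.get("resources", {}).get("limits", {})
        for resource in REQUIRED_LIMITS:
            if resource not in limits:
                problems.append((container["name"], resource))
    return problems


# Hypothetical pod spec: the sidecar forgets its CPU limit.
pod_spec = {
    "containers": [
        {
            "name": "api",
            "image": "example/api:1.0",
            "resources": {
                "requests": {"cpu": "250m", "memory": "256Mi"},
                "limits": {"cpu": "500m", "memory": "512Mi"},
            },
        },
        {
            "name": "sidecar",
            "image": "example/sidecar:1.0",
            "resources": {"limits": {"memory": "128Mi"}},
        },
    ]
}

print(containers_missing_limits(pod_spec))
# [('sidecar', 'cpu')]
```

The same function could run over the rendered output of a third-party chart to see which of its containers ship without limits.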

Lesson 8: Think about statelessness

Ideally, avoid storing data in pods. If for some reason it can't be avoided, use network file storage (NAS) rather than mounting a disk directly. Otherwise, you might be surprised to find that some of the pods in your deployment can't access the persistent data. A disk can be mounted on only one node at a time, so if your pods are spread across several nodes, pods on the same node will see the same data, but pods on other nodes will not see it. A NAS mount such as EFS avoids this problem.
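In Kubernetes terms, this difference shows up in a volume's access modes: a block disk is typically ReadWriteOnce (mountable read-write by a single node), while a network file system can offer ReadWriteMany. The sketch below flags the risky combination of several replicas and a single-node volume. It is a simplified illustration, not a real admission check; the function name and claim names are hypothetical, and it treats any replica count above one as potentially spread across nodes.

```python
# Access modes that allow a volume to be mounted from several nodes at once.
MULTI_NODE_MODES = {"ReadWriteMany", "ReadOnlyMany"}


def single_node_claims(replicas, claims):
    """Return names of claims that pods on different nodes could not all mount.

    `claims` maps a claim name to its list of access modes.
    """
    if replicas <= 1:
        return []  # A single pod always runs on a single node.
    return [
        name
        for name, modes in claims.items()
        if not MULTI_NODE_MODES.intersection(modes)
    ]


# Hypothetical deployment with 3 replicas and two volume claims.
claims = {
    "uploads": ["ReadWriteOnce"],        # e.g. a block disk
    "shared-assets": ["ReadWriteMany"],  # e.g. a network file system
}

print(single_node_claims(3, claims))
# ['uploads']
```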

Lesson 9: Configure HPA (Horizontal Pod Autoscaling)

If you don't want to stay stuck in the old way of working and you want to benefit from the power of Kubernetes, you need resource usage to adjust automatically to demand, which means configuring HPA for every application. (This is another limitation of Helm charts, which unfortunately often lack it.)
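To get a feel for what HPA does with the numbers you give it, here is a small Python sketch of the scaling rule described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance (10% by default) and bounded by the configured minimum and maximum. This is a simplification under those assumptions: the real controller also accounts for pods that are not ready, missing metrics, and stabilization windows. The function name and sample values are illustrative.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the replica count an HPA would ask for."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count alone.
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Never go outside the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


# 4 replicas at 90% average CPU with a 60% target: scale out.
print(desired_replicas(4, 90, 60))   # 6
# 4 replicas at 62% with a 60% target: within tolerance, no change.
print(desired_replicas(4, 62, 60))   # 4
# 4 replicas at 300% with a 60% target: capped at max_replicas.
print(desired_replicas(4, 300, 60))  # 10
```

Note that utilization is measured against the pod's resource requests, which is one more reason to define resources on every pod.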

Lesson 10: Don't be afraid of change

On average, plan on three version upgrades per year for your cluster, roughly one every four months. Some upgrades are transparent, but many bring impactful changes. To prepare for them, I recommend reading, re-reading, and re-reading again the release notes, as well as the write-ups of people who upgraded before you. What I recommend, and what we have implemented at HK-Tech, is to always keep up with the latest version (security patches aside).
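Falling behind has a concrete cost: managed services such as EKS, and tools such as kubeadm, upgrade one minor version at a time, so every skipped release becomes an extra hop later. The sketch below lists the hops between two versions under that assumption. The function name and the version numbers are purely illustrative.

```python
def upgrade_path(current, target):
    """List the minor versions to step through, one upgrade at a time."""
    cur_major, cur_minor = (int(part) for part in current.split(".")[:2])
    tgt_major, tgt_minor = (int(part) for part in target.split(".")[:2])
    if cur_major != tgt_major:
        raise ValueError("major version changes are out of scope for this sketch")
    if tgt_minor < cur_minor:
        raise ValueError("downgrades are not supported")
    return [f"{cur_major}.{minor}" for minor in range(cur_minor + 1, tgt_minor + 1)]


# A cluster three releases behind needs three separate upgrades.
print(upgrade_path("1.27", "1.30"))
# ['1.28', '1.29', '1.30']
```

Each entry in the list is a full upgrade cycle, with its own release notes to read.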

Have a great Kubernetes journey!

Author: Herve Khg | Source: Docker Chinese Community (ID: DockerChina)
