In some corners of the tech world, Kubernetes is considered an unnecessarily complex waste of time that startups should avoid. Using Kubernetes with a small team is seen as a sign of over-engineering.
Translated from The Hater's Guide to Kubernetes by Paul Butler.

> I made the mistake of being a hater myself. I may still complain about Kubernetes at times, but it really is an amazing piece of technology. I highly recommend it to all my competitors. — Paul Butler (@paulgb) September 9, 2022

Despite the sarcasm, "amazing piece of technology" is a sincere compliment. Around the time of that tweet, I wrote about how the complexity of Kubernetes is necessary for what it does. We've been running Kubernetes in production at Jamsocket for a few years now, and it has served us well. Internally, we've reached a kind of tranquility with Kubernetes. One key is to adopt a small subset of Kubernetes' functionality and pretend the rest doesn't exist. This post began as an internal guide to how we use Kubernetes, so it isn't meant to be prescriptive for every startup; still, I think it's a good starting point for steering around the many shoals in the vast sea of Kubernetes.
In my opinion, Kubernetes is the best way to go if you want the following three things:
Run multiple processes, servers, and scheduled jobs.
Run them redundantly and load balance between them.
Configure them, and the relationships between them, as code.
At its most basic, Kubernetes is an abstraction layer that lets you treat a group of machines as one big (headless) machine. If that's your use case, and you can avoid the other parts of it, you can get surprisingly far.
Some people have told me that point 2 is overkill, and that startups shouldn't bother with zero-downtime deployments or high availability. But we often deploy multiple times a day, and when our product breaks, our customers' products break for their users. Even a minute of downtime gets noticed by someone. Rolling deployments give us the confidence to deploy casually and frequently.
For background, Jamsocket is a service for dynamically spinning up processes that a web app can talk to. It's kind of like AWS Lambda, but the process lifecycle is tied to a WebSocket connection rather than a single request/response. We use Kubernetes to run the long-lived processes that support this: API servers, container registries, a controller, log collectors, DNS services, metrics collection, and so on.
Some things we do not use Kubernetes for:
The ephemeral processes themselves. We tried this early on, but quickly ran into its limitations (more on that later).
Our static marketing site. We use Vercel for this. It's more expensive, but so is the opportunity cost of an hour of engineering time at a small startup, and Vercel saves us more time than it costs.
Storing any data that we can't afford to lose. We do use some persistent volumes for caches and derived data, but otherwise we use managed Postgres and blob storage outside the cluster.
It's worth noting that we don't manage Kubernetes ourselves; a main advantage of using Kubernetes is that its infrastructure-level operations can be outsourced. We're happy with Google Kubernetes Engine, and while the Google Domains fiasco has shaken my confidence in Google Cloud, I can at least rest easy knowing that a migration to Amazon EKS would be relatively straightforward.

There are some kinds of k8s resources we use without hesitation. I'll only list the resources we create explicitly; most of them implicitly create other resources (such as pods), which I won't list but which we certainly use indirectly.
Deployments: most of our pods are created through Deployments. Every Deployment that is critical to our service runs multiple replicas and uses rolling updates.
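As a rough illustration (the name and image are invented, and in practice a definition like this is fed to Pulumi rather than written as YAML), such a Deployment might look like:

```typescript
// Hypothetical sketch of one of our Deployments as a plain TypeScript object.
const apiServerDeployment = {
  apiVersion: "apps/v1",
  kind: "Deployment",
  metadata: { name: "api-server" },
  spec: {
    replicas: 3, // redundancy: any single pod can die without an outage
    strategy: {
      type: "RollingUpdate",
      // never remove an old pod before its replacement is ready, which is
      // what lets us deploy many times a day without visible downtime
      rollingUpdate: { maxUnavailable: 0, maxSurge: 1 },
    },
    selector: { matchLabels: { app: "api-server" } },
    template: {
      metadata: { labels: { app: "api-server" } },
      spec: {
        containers: [
          { name: "api-server", image: "registry.example.com/api-server:v42" },
        ],
      },
    },
  },
};
```

The `maxUnavailable: 0` setting is the piece that turns "multiple replicas" into zero-downtime deploys: Kubernetes brings up a new pod and waits for it to become ready before taking an old one away.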
Services: specifically, ClusterIP for internal services and LoadBalancer for external ones. We avoid NodePort and ExternalName services, preferring to keep our DNS configuration outside of Kubernetes.
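A hypothetical sketch of the two Service shapes this implies (names and ports are illustrative, not our real configuration):

```typescript
// ClusterIP: reachable only from inside the cluster.
const internalApi = {
  apiVersion: "v1",
  kind: "Service",
  metadata: { name: "api-internal" },
  spec: {
    type: "ClusterIP",
    selector: { app: "api-server" }, // routes to pods with this label
    ports: [{ port: 80, targetPort: 8080 }],
  },
};

// LoadBalancer: asks the cloud provider to provision an external IP.
const publicEdge = {
  apiVersion: "v1",
  kind: "Service",
  metadata: { name: "edge-public" },
  spec: {
    type: "LoadBalancer",
    selector: { app: "edge-proxy" },
    ports: [{ port: 443, targetPort: 8443 }],
  },
};
```

With only these two types in play, DNS records can simply point at the LoadBalancer IPs from outside the cluster, which is what keeps DNS configuration out of Kubernetes entirely.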
CronJobs: used for cleanup scripts and the like.
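For example, a cleanup CronJob might be shaped like this (the name, schedule, and image are made up; the structure follows the batch/v1 CronJob API):

```typescript
// Hypothetical nightly cleanup job.
const cleanupJob = {
  apiVersion: "batch/v1",
  kind: "CronJob",
  metadata: { name: "expired-session-cleanup" },
  spec: {
    schedule: "0 3 * * *", // standard five-field cron syntax: daily at 03:00
    jobTemplate: {
      spec: {
        template: {
          spec: {
            restartPolicy: "Never", // run once per scheduled firing
            containers: [
              { name: "cleanup", image: "registry.example.com/cleanup:v7" },
            ],
          },
        },
      },
    },
  },
};
```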
ConfigMaps and Secrets: used to pass data to the resources above.
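Concretely, that usually means something like the following hypothetical pair (all names and values invented): a ConfigMap holding non-sensitive settings, and the `envFrom` stanza a pod template uses to consume it. A Secret is consumed the same way via `secretRef`.

```typescript
// Non-sensitive configuration, mounted into pods as environment variables.
const apiConfig = {
  apiVersion: "v1",
  kind: "ConfigMap",
  metadata: { name: "api-config" },
  data: {
    LOG_LEVEL: "info",
    MAX_SESSIONS: "500", // ConfigMap values are always strings
  },
};

// Inside a container spec, each key of the ConfigMap becomes an env var:
const containerEnvFrom = [
  { configMapRef: { name: "api-config" } },
  // a secretRef entry would pull in sensitive values the same way
];
```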
StatefulSets and PersistentVolumeClaims: we use a few StatefulSets. They're slightly more complex to configure than Deployments, but they can persist a volume across restarts. Our preference is to keep important data in a managed service outside of k8s. We don't have a strict rule against volumes, since it's sometimes nice for things like caches to survive a service restart, but I avoid them where possible because they can interact badly with rolling deployments (deadlocks).
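The extra machinery relative to a Deployment is the volume claim template, which stamps out one PersistentVolumeClaim per replica. A hypothetical sketch, for rebuildable data like a cache (names and sizes are illustrative):

```typescript
// Hypothetical StatefulSet whose replicas each keep a persistent cache volume.
const cacheSet = {
  apiVersion: "apps/v1",
  kind: "StatefulSet",
  metadata: { name: "cache" },
  spec: {
    serviceName: "cache", // headless service that gives pods stable identities
    replicas: 2,
    selector: { matchLabels: { app: "cache" } },
    template: {
      metadata: { labels: { app: "cache" } },
      spec: {
        containers: [
          {
            name: "cache",
            image: "registry.example.com/cache:v3",
            volumeMounts: [{ name: "cache-data", mountPath: "/var/cache" }],
          },
        ],
      },
    },
    // Each replica gets its own PersistentVolumeClaim stamped from this:
    volumeClaimTemplates: [
      {
        metadata: { name: "cache-data" },
        spec: {
          accessModes: ["ReadWriteOnce"],
          resources: { requests: { storage: "10Gi" } },
        },
      },
    ],
  },
};
```

`ReadWriteOnce` volumes are one source of the rolling-deployment friction mentioned above: a replacement pod on a different node can end up waiting on a volume the old pod still holds.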
RBAC: we use this in a few places, such as granting a service permission to refresh a secret. It adds enough complexity that on our small cluster I mostly avoid it.
Then there are the parts of the Kubernetes ecosystem we deliberately avoid:

Writing YAML by hand. YAML has enough pitfalls that I avoid it wherever possible. Instead, our Kubernetes resource definitions are created with TypeScript and Pulumi.
Non-built-in resources and operators. I've written before about how the control-loop pattern is a double-edged sword: it's the core idea that makes k8s powerful, but it's also a source of indirection and complexity. The operator pattern and custom resources let third-party software run its own control loops on Kubernetes' sturdy infrastructure, an idea that's great in theory but that I find clumsy in practice. We don't use cert-manager; we use Caddy's certificate automation instead.
Helm. Helm is already out because of our no-operators and no-hand-written-YAML rules, but I also think that generating machine-parseable output from unstructured string templates invites fragility for no benefit. It's like nails on a chalkboard to me, sorry.
Anything with "mesh" in the name. I'm sure these work for some people, but they don't work for me, and they didn't work for this person either.
Ingress resources. I have no scars here, and I know some people use Ingress effectively, but a theme of our success with Kubernetes has been avoiding unnecessary layers of indirection. Configuring Caddy directly works for us, so that's all we do.
Trying to replicate the full k8s stack locally. We don't use tools like k3s or kind to mirror production exactly; instead we use Docker Compose, or our own scripts, to spin up just the subset of the system we actually care about at the moment.
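To make the YAML-versus-TypeScript point concrete, here is a hypothetical sketch (not our actual Pulumi code; the config type and helper are invented for illustration) of what defining resources in a typed language buys: a misspelled field, or a number passed where a string is expected, fails at compile time instead of at apply time, and repetition is factored out with a function rather than a string template.

```typescript
// Hypothetical helper: build a Deployment manifest from a small typed config.
interface ServiceConfig {
  name: string;
  image: string;
  replicas: number; // passing a string here is a compile error, not a bad apply
  port: number;
}

function makeDeployment(cfg: ServiceConfig) {
  const labels = { app: cfg.name }; // defined once, reused for selector + pods
  return {
    apiVersion: "apps/v1",
    kind: "Deployment",
    metadata: { name: cfg.name },
    spec: {
      replicas: cfg.replicas,
      selector: { matchLabels: labels },
      template: {
        metadata: { labels },
        spec: {
          containers: [
            {
              name: cfg.name,
              image: cfg.image,
              ports: [{ containerPort: cfg.port }],
            },
          ],
        },
      },
    },
  };
}

const dep = makeDeployment({
  name: "controller",
  image: "registry.example.com/controller:v12",
  replicas: 2,
  port: 9090,
});
```

Because the selector labels and pod labels come from the same variable, a whole class of "selector doesn't match template" mistakes, easy to make in hand-edited YAML or Helm templates, can't happen here.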
I mentioned above that we once ran our ephemeral, session-lived, interactive processes on Kubernetes. We quickly realized that Kubernetes was designed for robustness and modularity at the expense of container startup time. As a general rule, my view is that Kubernetes is good at running long-lived processes redundantly, but if a person is actively waiting for a pod to start, it's the wrong tool. I'll admit I'm talking my own book here, but at least it's an open-source book: we use an MIT-licensed Rust orchestrator called Plane, which we designed specifically to schedule and run processes quickly for interactive workloads (i.e., when someone is waiting on them).

For completeness, I should also mention that some very good Kubernetes alternatives have emerged, especially if you don't want or need requirement 3 from my original list (the ability to specify infrastructure as code). For one of our products we chose Railway over our k8s cluster, mainly for its preview environments. Friends I respect are full of praise for Render (I've dabbled in it, but personally find Railway's environment model cleaner). I also like Flight Control's bring-your-own-cloud approach.

For many SaaS-type applications, those tools will take you a long way. But if you meet the three requirements listed at the start of this article, and take a disciplined approach, don't let anyone tell you it's too early to use Kubernetes.