

Migrating my personal infrastructure to Kubernetes

Posted: 2021-07-30 12:02:38 |
Last update: 2021-07-31 15:32:00

Introduction

If you don't know what Kubernetes is, that makes sense.
It's a container orchestration tool used mostly by large enterprises.

If you still don't know what it is, that also makes sense.
The difficulty of explaining what Kubernetes actually is has become kind of a meme within the community.
Instead of hitting you over the head with diagrams that contain a whole lot of logos and no real information, I'm gonna link the comic strip that made me decide to look further into k8s.
While this comic is an advertisement for the Google Kubernetes Engine, it also does a really good job of describing what a Kubernetes cluster is.

Motivation

If you've been thinking that a Kubernetes cluster is (a bit) overkill for the handful of applications I'm hosting, you're absolutely right.
I don't make any money with this stuff; in fact, it's quite expensive to host, but that's normal for hobby projects.
However, I actually do have some reasons for doing this.

First and foremost, I enjoy learning new stuff.
In my day job I work as a frontend developer.
This is mostly because I enjoy the work and I've realized I'm quite good at it, especially compared to the full-stack devs I used to work with, who hated doing frontend tasks (but had to do them anyway due to understaffing).
I never lost interest in backends and infrastructure; I just decided that going into frontend was the best move for me.
Now I am in a position that is about as far away from DevOps as one can go in software development.
So in order to build my skill set in DevOps tasks, I have to do private side projects.

The second reason is the amazing Kubernetes community.
I have never met a tech community that was this welcoming to outsiders of any skill level.
I've been seeing their memes and tech takes for quite a while now because one of my girlfriends is part of this community and shares these posts on social media.
I really wanted to be a part of this space, and I saw this as a good opportunity to bond with this girlfriend over yet another shared interest.
The fact that, if I ever got stuck, there would almost certainly be someone able and willing to help me without judging my newb mistakes made this whole thing seem so much less distressing.

Yet another reason is that I've been toying with the idea of founding my own hosting company for quite a while now.
Don't get me wrong, I really like my job as a frontend dev and I plan on keeping it, but my wage isn't super high and I really want to have a whirlpool at some point in my life. (Having EDS and spending all day looking at computer screens really is a perfect storm for neck and shoulder tension, and being a trans woman means I can't go to public swimming pools without risking harassment and violence from cis people who can't mind their own damn business.)
Pretty much all hosting companies I've got experience with (either using their services, working for them, or reading about them) are huge companies that make their money from selling private data, and/or have absolutely terrible web interfaces, and/or have shitty working conditions for their employees.
The idea of having a worker-owned hosting firm that cares about privacy, usability, and accessibility feels really good.

Implementation

Meta

It all started roughly a month ago: after I'd told one of my girlfriends that I was considering moving my infrastructure into a cluster, she drew a chart on our whiteboard explaining the different parts needed to deploy an app to a Kubernetes cluster.
She was only visiting for a week and after she'd left I started writing deployments for all the apps I've got.
Since I needed Docker images for my own apps, this also led to me finally learning how to use Drone CI and building pipelines for it.
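
For anyone who hasn't seen one: a "deployment" here is just a YAML manifest describing how a container should run. A minimal sketch for a hypothetical app (the name and image are placeholders, not my actual setup) looks roughly like this:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: blog                # hypothetical app name
    spec:
      replicas: 2               # run two copies for basic redundancy
      selector:
        matchLabels:
          app: blog
      template:
        metadata:
          labels:
            app: blog
        spec:
          containers:
            - name: blog
              image: registry.example.com/blog:latest  # placeholder private registry image
              ports:
                - containerPort: 8080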

I started out testing on a local minikube cluster (which performed surprisingly well on my six-core laptop), and once I felt more or less ready to move into a more production-like environment, my other girlfriend set up some VMs to deploy a testing cluster on.
Since I didn't want to enter the same commands on all nodes, this also led to me finally learning the basics of Ansible.
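
To give you an idea, a playbook of the kind I ended up writing might look something like this; the inventory group and package names are assumptions, not my actual configuration:

    # Minimal Ansible sketch: run the same setup steps on every node.
    - hosts: cluster            # assumed inventory group containing all nodes
      become: true
      tasks:
        - name: Install the container runtime
          apt:
            name: containerd
            state: present
            update_cache: true
        - name: Disable swap, which the kubelet refuses to run with
          command: swapoff -a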

Some more weekends passed and last week I realized that I was done and had deployed all of my apps into the testing cluster.
This meant that yesterday I was finally ready to re-install my servers, book an additional one as a controller node, and set up this cluster.

Tech stack

There goes my "security by obscurity" I guess.
Not that that mattered anyways.

CNI

The container network interface (CNI) plugin allows pods and nodes inside a cluster to communicate with one another over the network.
Kubernetes does not ship with one by default, so you have to decide which one you want to use.

First, I set up Flannel, since it works with very little configuration (see the snippet at the end of this section).
However, while I was doing this, the girlfriend who is explaining Kubernetes to me developed her own CNI plugin replacement, which is better suited for my deployment and has fewer moving parts that could break.
She has also written a blog post that explains this better than I ever could, so if you're interested, check that out, too.
To my knowledge, it is also the first CNI to officially support IPv6, which is great. (Although, sadly, none of my services are available via IPv6 at the moment, because my home ISP only supports v4 and I can't test v6.)
Update: IPv6 support has apparently been merged into Flannel; it's just not released yet.
Update 2: My services are now reachable over IPv6 as well, since I was able to test it using one of my hosts and curl.
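
For reference, here is (roughly, from memory) the snippet promised above: pretty much the only thing Flannel wants from you is a ConfigMap entry with the pod network CIDR, which just has to match what the cluster was initialized with.

    # Trimmed-down excerpt of Flannel's stock manifest; details may differ between versions.
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: kube-flannel-cfg
      namespace: kube-system
    data:
      net-conf.json: |
        {
          "Network": "10.244.0.0/16",
          "Backend": { "Type": "vxlan" }
        }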

Ingress Controller

Very simplified, the Ingress Controller is what makes your apps reachable from the Internet.
Once again, Kubernetes does not ship with one by default, since different setups have widely different requirements.

I first chose Traefik, as this is what I was already using as a reverse proxy with my Docker-based deployments.
However, after some evenings spent cursing at my computer screen, I decided to go with the nginx Ingress Controller that is maintained by the Kubernetes community instead.
My decision was mostly motivated by the fact that Traefik breaks with a lot of standards, which makes it more difficult to replace, while having documentation that I was simply not able to parse well.
The nginx Ingress Controller is well maintained, follows standards and has decent documentation as well as community support.
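
With the controller in place, each app then just needs a small Ingress resource; here is a sketch with a made-up hostname and placeholder service name:

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: blog
    spec:
      ingressClassName: nginx       # handled by the ingress-nginx controller
      rules:
        - host: blog.example.com    # placeholder hostname
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: blog      # hypothetical Service in front of the Deployment
                    port:
                      number: 8080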

TLS Certificates

Since everything I host is only available over HTTPS, I needed a way to quickly and automatically generate and renew TLS certs.
Luckily, the open source cert-manager is really easy to deploy and integrates well with nginx.
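
In case you want to replicate this: the heart of the setup is a single ClusterIssuer resource. This sketch assumes Let's Encrypt with an HTTP-01 solver, and the email address is a placeholder:

    apiVersion: cert-manager.io/v1
    kind: ClusterIssuer
    metadata:
      name: letsencrypt
    spec:
      acme:
        server: https://acme-v02.api.letsencrypt.org/directory
        email: admin@example.com        # placeholder contact address
        privateKeySecretRef:
          name: letsencrypt-account-key # where cert-manager stores the ACME account key
        solvers:
          - http01:
              ingress:
                class: nginx            # answer challenges through the nginx Ingress Controller

After that, an Ingress only needs the cert-manager.io/cluster-issuer annotation and a tls: section, and certificates appear and renew on their own.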

All in all, this was the thing I had been procrastinating on for most of my time building the testing cluster, while it actually ended up being one of the easiest things to deploy.

Storage

Storage within clusters is no easy task.
You need a way to replicate all data so that every pod can be scheduled on every node, and you have to do that in a performant way.

Here, I actually ran into the most issues.
Rook simplifies deploying a Ceph cluster within a Kubernetes cluster a lot; however, it is still a beast with loads of moving parts.
During deployment, one of these moving parts increased my downtime from a couple of minutes to 4 hours because it just refused to create the OSDs (Object Storage Devices) needed to store data.

When setting up my nodes, I chose the "small root partition" option from my provider, which creates a root partition that is only 8GiB.
After increasing this to 40GiB, I forgot to grow the ext4 filesystem to match, which led to my Ceph pods constantly being evicted because the hosts were running out of disk space.

That wasn't the only issue.
Since my hosting provider does not allow me to have multiple block devices for a single host, I opted to install the Ceph OSDs in empty partitions.
When I created said partitions, I thought it would be a good idea to set the partition type to "Ceph OSD", since that's what they were for.
Sadly, Ceph will not initialize OSDs on partitions of that type, since it assumes another cluster has already claimed them.
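
For the curious, the part of the CephCluster resource that points Rook at those partitions looks roughly like this; the node and partition names here are made up:

    apiVersion: ceph.rook.io/v1
    kind: CephCluster
    metadata:
      name: rook-ceph
      namespace: rook-ceph
    spec:
      cephVersion:
        image: ceph/ceph:v15      # assumed version; use whatever is current
      dataDirHostPath: /var/lib/rook
      mon:
        count: 3
      storage:
        useAllNodes: false
        useAllDevices: false
        nodes:
          - name: node1           # made-up node name
            devices:
              - name: sda4        # the empty partition reserved for the OSD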

However, some sweat and surprisingly little tears later, Ceph was running and so was most of my cluster.

The only thing missing right now is a replacement for my old static file hosting mechanisms.
For images in my blog I had deployed Chevereto, which I ended up not liking too much; the fact that they are ending support for the free version later this year made it clear that I'll have to switch.
For slide uploads I used to have a basic nginx with directory indexing enabled.
It's a decent enough solution and it does work; however, I want to go full cloud native and replace both image and slide hosting with S3 on top of my Ceph cluster.
I've still got to learn more about S3 before I can go live with this, so at the moment I don't have a hosting solution for these files in place.
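
When I do get around to it, the likely starting point is Rook's CephObjectStore resource, which runs an S3-compatible gateway on top of the existing Ceph cluster. This is only a sketch, and the pool settings are guesses rather than a recommendation:

    apiVersion: ceph.rook.io/v1
    kind: CephObjectStore
    metadata:
      name: static-files        # hypothetical name
      namespace: rook-ceph
    spec:
      metadataPool:
        replicated:
          size: 3
      dataPool:
        replicated:
          size: 3
      gateway:
        port: 80
        instances: 1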

CI

Since I am deploying some apps that I wrote myself, I needed some CI solution.
Here, I chose to go with Drone CI to build images using Kaniko and push them to a private Docker registry.

I had been using Drone in production during the last month, while it was still running on my Docker infrastructure, and everything was working well.
The Kubernetes runner (that runs the pipelines) is sadly still in beta and has some quirks.
It took me quite a while to realize that my security config wasn't broken; instead, Drone kept trying to write secrets into the default namespace even though it was deployed into a different one.
Once I moved it into the correct namespace, Drone started building and pushing my images without issues.
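
For completeness, here is a rough sketch of one of these pipelines; it assumes the plugins/kaniko image and placeholder registry details, so don't copy it verbatim:

    kind: pipeline
    type: kubernetes              # the still-in-beta Kubernetes runner
    name: build

    steps:
      - name: build-and-push
        image: plugins/kaniko     # assumed Kaniko plugin image
        settings:
          registry: registry.example.com        # placeholder private registry
          repo: registry.example.com/blog
          tags: latest
          username:
            from_secret: registry_username
          password:
            from_secret: registry_password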

Conclusion

There you have it. In about a month of work (next to my day job), I managed to move all of my applications (except for my mail server, which will probably follow at some point) into a Kubernetes cluster set up from scratch, without relying on any of the large cloud hosting providers.

This was an incredibly fun project and I'm rather proud of what I managed to accomplish here.
This would not have been possible without the help from my wonderful girlfriends, who helped me with their knowledge and lots of patience.
I am thankful to everyone who helped me out along the way and I hope to become an active part of the Kubernetes community within the next few time units.

This post will likely be updated in the following days to accommodate any feedback regarding possible technical or major grammatical mistakes.