Home Kubernetes
As with the last post, a job I’m looking at requires Kubernetes experience. This page documents my efforts to set up Kubernetes in my home network for something practical: deploying a Telegram bot that monitors Garmin inReach satellite trackers during paragliding competitions and forwards their messages to a whole group.
It collects my learnings and notes from setting up my K8s cluster!
At Facebook, we had Twine (internally still called Tupperware) to manage our jobs. I’ve managed to avoid touching Kubernetes for the past decade, so a lot of catching up is required. I had previously tried running my jobs on a highly available platform using Docker Swarm, mostly across Raspberry Pis, with GlusterFS spread across the nodes’ SD cards for reliable persistent storage. At times the I/O wait was on the order of seconds, and it didn’t take me very long to rip that out.
I have probably over-corrected now. I have a few machines, but the only ones with real compute power are a Raspberry Pi 5 with 8GB of RAM and two gifted x86_64 machines. I run Ansible across these machines, and my playbooks deploy each job I want onto a single machine. Any persistent storage lives locally on that node, with hourly rsync cron jobs set up to back it up. In the event of a hardware failure, I’d just modify my Ansible to bring the containers up on a different node.
The new goal is to go back to a highly available setup, this time with Kubernetes: get used to more of the terms and configuration options, and slowly move my apps from being deployed with Docker to running in my Kubernetes cluster. I’ll only use the nodes with 8GB of RAM for this experiment.
Setting up Kubernetes
Requirements
- Dual stack; each container needs to be able to access the IPv6 internet
- Needs to host web services under home.scottyob.com
- Needs to be highly available. Any node can fail.
Container Network Interface
- I’m using DHCPv6 Prefix Delegation (DHCPv6-PD) to get my IPv6 prefixes from my ISP.
- There’s no guarantee of keeping the same subnets after a power outage, so I intend to use IPv6 unique local addresses within my cluster and NAT outgoing connections from the pods so they can reach the public internet.
As Flannel does not support IPv6 NAT, and to keep this simple so I won’t have to manage NAT externally, I’m going to use Calico. My Calico config uses a private RFC1918 IPv4 subnet and an IPv6 Unique Local Address (ULA) prefix, with outbound NAT enabled so connections leaving the pods are translated.
This also has the side effect of using VXLAN encapsulation when inter-pod traffic has to traverse different subnets between hosts.
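For reference, here is a sketch of the sort of thing the custom-resources.yaml applied further down could contain: a Tigera operator Installation with dual-stack IP pools and outbound NAT enabled. The CIDRs below are placeholders, not my actual prefixes.

# custom-resources.yaml (sketch) - dual-stack pools with outbound NAT; CIDRs are placeholders.
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    ipPools:
      - cidr: 10.244.0.0/16
        encapsulation: VXLANCrossSubnet   # only encapsulate when crossing host subnets
        natOutgoing: Enabled
      - cidr: fd00:10:244::/64            # ULA prefix for pods
        encapsulation: VXLANCrossSubnet
        natOutgoing: Enabled              # NAT pod traffic headed for the IPv6 internet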
Control Plane
I’m starting by setting up the virtual address, which should take priority on node05.
The VRRP address will float among all of the nodes.
Each of these three nodes will be a control-plane node, then “untainted” so compute jobs can also be scheduled on them.
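The post doesn’t pin down which daemon provides the VRRP address, so treat this as an assumption: a minimal keepalived sketch for node05, which also assumes the 10.11.1.200 control-plane endpoint seen in the join command below is the virtual address.

# /etc/keepalived/keepalived.conf (sketch for node05; the other nodes run state BACKUP
# with a lower priority). The interface name and VIP are assumptions.
vrrp_instance K8S_API {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    virtual_ipaddress {
        10.11.1.200/24
    }
}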
Ansible
I’ll move my Ansible secrets out and make my repo public one of these days. In the meantime, the playbook basically (a rough sketch of a few of its tasks follows this list):
- Creates the k8s apt repos
- Loads the overlay and netfilter (br_netfilter) kernel modules
- Ensures IP forwarding is enabled
- Sets up containerd to use the default config, ensuring SystemdCgroup is set to true
- Installs Kubernetes
- Sets up the kubelet config to use a different resolv.conf (I have a wildcard DNS record on my search domain)
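A rough sketch of what a few of those tasks could look like. The modules (community.general.modprobe, ansible.posix.sysctl, ansible.builtin.replace) are standard, but the task names, paths, and the containerd handler are illustrative rather than copied from my playbook.

# Sketch of a few playbook tasks; names, paths, and the handler are illustrative.
- name: Load the required kernel modules
  community.general.modprobe:
    name: "{{ item }}"
    state: present
  loop:
    - overlay
    - br_netfilter

- name: Enable IPv4 and IPv6 forwarding
  ansible.posix.sysctl:
    name: "{{ item }}"
    value: "1"
    state: present
  loop:
    - net.ipv4.ip_forward
    - net.ipv6.conf.all.forwarding

- name: Switch containerd to the systemd cgroup driver
  ansible.builtin.replace:
    path: /etc/containerd/config.toml
    regexp: 'SystemdCgroup = false'
    replace: 'SystemdCgroup = true'
  notify: Restart containerd   # assumes a handler with this name exists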
Creating the cluster
- Create my cluster config files (a sketch of what kubeadm.conf might contain is below)
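A hedged sketch of what a dual-stack kubeadm.conf could contain. The controlPlaneEndpoint matches the 10.11.1.200:6443 endpoint from the join command further down, and the IPv4 service range matches the defaults visible in the kubectl output; the pod CIDRs and IPv6 service range are placeholders that need to line up with the Calico pools above.

# kubeadm.conf (sketch) - placeholder CIDRs must stay in sync with the Calico IP pools.
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
kubernetesVersion: v1.34.0
controlPlaneEndpoint: "10.11.1.200:6443"         # the floating VRRP address
networking:
  podSubnet: "10.244.0.0/16,fd00:10:244::/64"    # dual-stack pod CIDRs (placeholders)
  serviceSubnet: "10.96.0.0/12,fd00:10:96::/112" # default IPv4 range plus a placeholder ULA range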
- Initialize k8s, and apply the Calico resources
$ sudo kubeadm init --config ./kubeadm.conf
$ kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.30.3/manifests/tigera-operator.yaml
$ kubectl apply -f ./custom-resources.yaml
- Generate a cert key to allow other nodes to join (secrets modified)
[scott@node05 k8s]$ sudo kubeadm init phase upload-certs --upload-certs
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
2384902384903242
[scott@node05 k8s]$ kubeadm token create --print-join-command --certificate-key 2384902384903242
kubeadm join 10.11.1.200:6443 --token jc3ltn.8xelqcilixymd0h2 --discovery-token-ca-cert-hash sha256:237893247823478293478923 --control-plane --certificate-key 2384902384903242
[scott@node05 k8s]$
- Run the given kubeadm join command on the new nodes
- Untaint the nodes so workloads can be scheduled on them
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
Starting again
If we want to blow everything away and start again:
sudo kubeadm reset -f
sudo systemctl stop kubelet
sudo systemctl stop containerd
sudo rm -rf /etc/cni/net.d /var/lib/cni/ /var/lib/kubelet/* /etc/kubernetes/ /var/lib/etcd
sudo systemctl start containerd
sudo ctr -n k8s.io containers list -q | xargs -r -n1 sudo ctr -n k8s.io containers delete
sudo ctr -n k8s.io images list -q | xargs -r -n1 sudo ctr -n k8s.io images rm
rm -Rf $HOME/.kube
Useful Commands
kubectl get all,nodes -A
[scott@node05 k8s]$ kubectl get all,nodes -A
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-apiserver pod/calico-apiserver-595d48b4b9-5gp6t 1/1 Running 0 3h39m
calico-apiserver pod/calico-apiserver-595d48b4b9-t8zn9 1/1 Running 0 3h39m
calico-system pod/calico-kube-controllers-7c85cdbf5-hs79p 1/1 Running 0 3h39m
calico-system pod/calico-node-dcs2z 1/1 Running 3 (53m ago) 94m
calico-system pod/calico-node-jktwp 1/1 Running 0 3h39m
calico-system pod/calico-node-zdznf 1/1 Running 0 3h27m
calico-system pod/calico-typha-869596b6df-684jl 1/1 Running 0 3h39m
calico-system pod/calico-typha-869596b6df-qrztz 1/1 Running 3 (53m ago) 94m
calico-system pod/csi-node-driver-7wtpl 2/2 Running 0 3h39m
calico-system pod/csi-node-driver-f42kr 2/2 Running 6 (53m ago) 94m
calico-system pod/csi-node-driver-zfcbm 2/2 Running 0 3h27m
calico-system pod/goldmane-68c899b75-ltwdn 1/1 Running 0 3h39m
calico-system pod/whisker-57bfbf8454-h9cgx 2/2 Running 0 3h39m
kube-system pod/coredns-66bc5c9577-ghxkr 1/1 Running 0 3h40m
kube-system pod/coredns-66bc5c9577-t78qq 1/1 Running 0 3h40m
kube-system pod/etcd-node02 1/1 Running 3 (53m ago) 93m
kube-system pod/etcd-node04 1/1 Running 1 3h27m
kube-system pod/etcd-node05 1/1 Running 0 3h40m
kube-system pod/kube-apiserver-node02 1/1 Running 4 (53m ago) 93m
kube-system pod/kube-apiserver-node04 1/1 Running 11 3h27m
kube-system pod/kube-apiserver-node05 1/1 Running 0 3h40m
kube-system pod/kube-controller-manager-node02 1/1 Running 3 (53m ago) 93m
kube-system pod/kube-controller-manager-node04 1/1 Running 2 3h27m
kube-system pod/kube-controller-manager-node05 1/1 Running 0 3h40m
kube-system pod/kube-proxy-77bsq 1/1 Running 0 3h40m
kube-system pod/kube-proxy-p8g58 1/1 Running 3 (53m ago) 94m
kube-system pod/kube-proxy-qphqt 1/1 Running 0 3h27m
kube-system pod/kube-scheduler-node02 1/1 Running 3 (53m ago) 93m
kube-system pod/kube-scheduler-node04 1/1 Running 2 3h27m
kube-system pod/kube-scheduler-node05 1/1 Running 0 3h40m
tigera-operator pod/tigera-operator-db78d5bd4-2pjg8 1/1 Running 0 3h40m
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
calico-apiserver service/calico-api ClusterIP 10.100.73.102 <none> 443/TCP 3h39m
calico-system service/calico-kube-controllers-metrics ClusterIP None <none> 9094/TCP 3h38m
calico-system service/calico-typha ClusterIP 10.110.48.196 <none> 5473/TCP 3h39m
calico-system service/goldmane ClusterIP 10.105.198.118 <none> 7443/TCP 3h39m
calico-system service/whisker ClusterIP 10.96.255.181 <none> 8081/TCP 3h39m
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 3h40m
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 3h40m
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
calico-system daemonset.apps/calico-node 3 3 3 3 3 kubernetes.io/os=linux 3h39m
calico-system daemonset.apps/csi-node-driver 3 3 3 3 3 kubernetes.io/os=linux 3h39m
kube-system daemonset.apps/kube-proxy 3 3 3 3 3 kubernetes.io/os=linux 3h40m
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
calico-apiserver deployment.apps/calico-apiserver 2/2 2 2 3h39m
calico-system deployment.apps/calico-kube-controllers 1/1 1 1 3h39m
calico-system deployment.apps/calico-typha 2/2 2 2 3h39m
calico-system deployment.apps/goldmane 1/1 1 1 3h39m
calico-system deployment.apps/whisker 1/1 1 1 3h39m
kube-system deployment.apps/coredns 2/2 2 2 3h40m
tigera-operator deployment.apps/tigera-operator 1/1 1 1 3h40m
NAMESPACE NAME DESIRED CURRENT READY AGE
calico-apiserver replicaset.apps/calico-apiserver-595d48b4b9 2 2 2 3h39m
calico-system replicaset.apps/calico-kube-controllers-7c85cdbf5 1 1 1 3h39m
calico-system replicaset.apps/calico-typha-869596b6df 2 2 2 3h39m
calico-system replicaset.apps/goldmane-68c899b75 1 1 1 3h39m
calico-system replicaset.apps/whisker-57bfbf8454 1 1 1 3h39m
calico-system replicaset.apps/whisker-66f55fc78c 0 0 0 3h39m
calico-system replicaset.apps/whisker-76944599f8 0 0 0 3h39m
kube-system replicaset.apps/coredns-66bc5c9577 2 2 2 3h40m
tigera-operator replicaset.apps/tigera-operator-db78d5bd4 1 1 1 3h40m
NAMESPACE NAME STATUS ROLES AGE VERSION
node/node02 Ready control-plane 94m v1.34.0
node/node04 Ready control-plane 3h27m v1.34.0
node/node05 Ready control-plane 3h40m v1.34.1
[scott@node05 k8s]$
Gets the pods, services, daemonsets, deployments, and replica sets across all namespaces, plus the nodes.
Key Terms
- CRI (Container Runtime Interface): the interface that allows the Kubernetes agent (the kubelet, which runs on each host) to talk to a container runtime. This can be Docker (via a shim) or containerd.
- ctr (or nerdctl, for Docker-like syntax): CLI tools to manage the runtime (containerd) directly.
- CNI (Container Network Interface): used to create networks and set up the environment for forwarding traffic between pods.
- kube-proxy: a container that runs on every node to enable communication with Services.
  - When Kubernetes uses a Service (an abstraction over a set of pods), kube-proxy ensures that a client talking to the Service IP has its request routed to one of the healthy pods backing that Service (see the sketch after this list).
  - Adds the routes/rules needed for that communication to occur.
  - Supported operating modes:
    - ipvs: an in-kernel load balancer
    - iptables: simple rules to route traffic
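To make the Service abstraction concrete, here is a small hypothetical example (the hello name and nginx image are illustrative): a Deployment with two replicas and a ClusterIP Service selecting them. kube-proxy programs each node so that traffic sent to the Service’s cluster IP on port 80 lands on one of the healthy pods.

# Hypothetical example: kube-proxy routes traffic for hello-svc's ClusterIP
# to one of the two pods labelled app=hello.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
        - name: web
          image: nginx:1.27
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: hello-svc
spec:
  selector:
    app: hello
  ports:
    - port: 80        # the Service (cluster IP) port
      targetPort: 80  # the container port on the backing pods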