Home Assistant, OpenKruise, and Container Loading Order
Avoiding recorder database failures by making sure the right container starts first.
As those of you who have been using my configuration (or, indeed, your own configuration) for Home Assistant on Kubernetes will know, there is one minor problem with it.
Namely, there is the inconvenience - when using a separate instance for the recorder database - of ensuring things start up in the required order. Kubernetes itself provides no way to control the startup order of containers within a single pod (which is how the configuration given over there is arranged), nor - obviously enough - the startup order of two unrelated deployments. This is a problem: if your MySQL instance (or other recorder database container) happens to start up after, or more slowly than, your Home Assistant container, the recorder integration - rather fundamental to the whole thing - won't initialize properly, and you'll have to restart Home Assistant manually to get it working.
So much for redundancy and self-repair, alas.
Fortunately, there is a solution, and that solution lies with a rather nifty extension for Kubernetes, OpenKruise. More specifically, it lies with the Container Launch Priority feature of OpenKruise, which fills in that particular Kubernetes feature gap, and lets you ensure that within your unified Home Assistant pod, the recorder database server always starts up before Home Assistant does.
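(If you don't already have OpenKruise on your cluster, installing it is itself a simple task with Helm; a sketch, assuming the standard chart repository from the OpenKruise documentation and the default release name:)

```shell
# Add the OpenKruise chart repository and install the kruise manager
# into the cluster under the release name "kruise".
helm repo add openkruise https://openkruise.github.io/charts/
helm repo update
helm install kruise openkruise/kruise
```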
For example, here is the current look of my Home Assistant deployment:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: homeassistant
  name: homeassistant
  namespace: homeassistant
spec:
  replicas: 1
  selector:
    matchLabels:
      app: homeassistant
  template:
    metadata:
      annotations:
        apps.kruise.io/container-launch-priority: Ordered
      labels:
        app: homeassistant
    spec:
      nodeName: princess-celestia
      volumes:
        - name: ha-mysql-storage
          hostPath:
            path: /opt/ha-mysql
            type: DirectoryOrCreate
        - name: ha-storage
          nfs:
            server: mnemosyne.arkane-systems.lan
            path: "/swarm/harmony/homeassistant/ha"
        - name: ha-media
          nfs:
            server: mnemosyne.arkane-systems.lan
            path: "/Media"
      containers:
        - image: mysql:latest
          name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-recorder-pass
                  key: password
          ports:
            - name: mysql
              containerPort: 3306
              protocol: TCP
          volumeMounts:
            - name: ha-mysql-storage
              mountPath: /var/lib/mysql
        - image: cerebrate/home-assistant:2022.7.7
          name: home-assistant
          volumeMounts:
            - mountPath: "/config"
              name: ha-storage
            - mountPath: "/media"
              name: ha-media
The only addition I had to make here (with OpenKruise installed on my cluster, a simple task with Helm) was the line:
apps.kruise.io/container-launch-priority: Ordered
in the pod specification. With that there, the second-defined container, home-assistant, is prevented from starting until the first-defined, mysql, is already up and running, so the recorder integration will always find the database it is looking for.
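The OpenKruise documentation also describes an explicit form of the same feature, in case you want an ordering other than definition order: instead of the Ordered annotation, each container is given a KRUISE_CONTAINER_PRIORITY environment variable, with higher integer values starting earlier. A sketch of the same two containers in that style (priority values here are just illustrative):

```yaml
      # Explicit-priority sketch: higher KRUISE_CONTAINER_PRIORITY starts first,
      # so mysql (1) launches before home-assistant (0).
      containers:
        - name: mysql
          image: mysql:latest
          env:
            - name: KRUISE_CONTAINER_PRIORITY
              value: "1"
        - name: home-assistant
          image: cerebrate/home-assistant:2022.7.7
          env:
            - name: KRUISE_CONTAINER_PRIORITY
              value: "0"
```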
If you’re considering using this solution, you should take a look through the other features OpenKruise offers as well. There are a lot of useful items there that have greatly simplified the management of my home cluster: I commend AdvancedCronJob, BroadcastJob, ImagePullJob (for ensuring basic utility images are available on every node, for example), and ResourceDistribution in particular to your attention.
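As a taste of the latter, a minimal ImagePullJob to pre-pull a utility image onto every node might look something like this (a sketch based on the OpenKruise documentation; the image name is merely an example):

```yaml
apiVersion: apps.kruise.io/v1alpha1
kind: ImagePullJob
metadata:
  name: pull-busybox
spec:
  image: busybox:latest   # example utility image to pre-pull on all nodes
  parallelism: 10         # pull on up to 10 nodes at a time
  completionPolicy:
    type: Always          # run once to completion, rather than continuously
```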