The Operator Pattern in Nomad

The Operator Pattern from Kubernetes is an excellent way of handling tasks in a cluster in an automated way, for example, provisioning applications, running backups, requesting certificates, and injecting chaos testing. As a Nomad user, I wanted to do something similar for my clusters, so I set about seeing how it would be possible. It turns out; it is much easier than I expected! While Nomad doesn’t support the idea of Custom Resource Definitions, we can achieve an operator by utilising a regular Nomad job and the nomad HTTP API....

November 22, 2021 · 9 min

The Problem with CPUs and Kubernetes

Key Takeaway: os .cpus() returns the number of cores on a Kubernetes host, not the number of cores assigned to a pod. Investigating excessive memory usage Recently, when I was looking through a cluster health dashboard for a Kubernetes cluster, I noticed that one of the applications deployed was using a considerable amount of RAM - way more than I thought could be reasonable. Each instance (pod) of the application used approximately 8 GB of RAM, which was definitely excessive for a reasonably simple NodeJS webserver....

June 2, 2021 · 2 min

Adding Observability to Vault

One of the things I like to do when setting up a Vault cluster is to visualise all the operations Vault is performing, which helps see usage patterns changing, whether there are lots of failed requests coming in, and what endpoints are receiving the most traffic. While Vault has a lot of data available in Prometheus telemetry, the kind of information I am after is best taken from the Audit backend....

May 27, 2021 · 7 min

Observability with Infrastructure as Code

This article was originally published on the Pulumi blog. When using the Pulumi Automation API to create applications which can provision infrastructure, it is very handy to be able to use observability techniques to ensure the application functions correctly and to help see where performance bottlenecks are. One of the applications I work on creates a VPC and Bastion host and then stores the credentials into a Vault instance. The problem is that the “create infrastructure” part is an opaque blob, in that I can see it takes 129 seconds to create, but I can’t see what it’s doing, or why it takes this amount of time....

March 1, 2021 · 4 min

Nomad Isolated Exec

One of the many features of Nomad that I like is the ability to run things other than Docker containers. It has built-in support for Java, QEMU, and Rkt, although the latter is deprecated. Besides these inbuilt “Task Drivers” there are community maintained ones too, covering Podman, LXC, Firecraker and BSD Jails, amongst others. The one I want to talk about today, however, is called exec. This Task Driver runs any given executable, so if you have an application which you don’t want (or can’t) put into a container, you can still schedule it with Nomad....

February 29, 2020 · 4 min

Consul DNS Fowarding in Alpine, revisited

I noticed when running an Alpine based virtual machine with Consul DNS forwarding set up, that sometimes the machine couldn’t resolve *.consul domains, but not in a consistent manner. Inspecting the logs looked like the request was being made and responded to successfully, but the result was being ignored. After a lot of googling and frustration, I was able to track down that it’s down to a difference (or optimisation) in musl libc, which glibc doesn’t do....

December 30, 2019 · 4 min

Nomad Good, Kubernetes Bad

I will update this post as I learn more (both positive and negative), and is here to be linked to when people ask me why I don’t like Kubernetes, and why I would pick Nomad in most situations if I chose to use an orchestrator at all. TLDR: I don’t like complexity, and Kubernetes has more complexity than benefits. Operational Complexity Operating Nomad is very straight forward. There are very few moving parts, so the number of things which can go wrong is significantly reduced....

November 21, 2019 · 6 min

Creating a Vault instance with a TLS Consul Cluster

So we want to set up a Vault instance, and have it’s storage be a TLS based Consul cluster. The problem is that the Consul cluster needs Vault to create the certificates for TLS, which is quite the catch-22. Luckily for us, quite easy to solve: Start a temporary Vault instance as an intermediate ca Launch Consul cluster, using Vault to generate certificates Destroy temporary Vault instance Start a permanent Vault instance, with Consul as the store Reprovision the Consul cluster with certificates from the new Vault instance There is a repository on Github with all the scripts used, and a few more details on some options....

October 6, 2019 · 3 min

Consul DNS Fowarding in Ubuntu, revisited

I was recently using my Hashibox for a test, and I noticed the DNS resolution didn’t seem to work. This was a bit worrying, as I have written about how to do DNS resolution with Consul forwarding in Ubuntu, and apparently something is wrong with how I do it. Interestingly, the Alpine version works fine, so it appears there is something not quite working with how I am configuring Systemd-resolved....

September 24, 2019 · 7 min

Canary Routing with Traefik in Nomad

I wanted to implement canary routing for some HTTP services deployed via Nomad the other day, but rather than having the traffic split by weighting to the containers, I wanted to direct the traffic based on a header. My first choice of tech was to use Fabio, but it only supports routing by URL prefix, and additionally with a route weight. While I was at JustDevOps in Poland, I heard about another router/loadbalancer which worked in a similar way to Fabio: Traefik....

June 23, 2019 · 8 min