Embedding ain't easy, but its alright

I write a lot of blog posts which contain code snippets. Like most people, I have done this using a fenced codeblock in markdown which is fine for short blocks of code. The problem occours when I am embedding code from another repository; often there will be tweaks or bug fixes, and keeping the code in the blog post in sync with the code in the other repo is annoying and manual....

September 14, 2022 · 3 min

The reports of UML's death are greatly exaggerated

This is in response to the recent posts about the death of UML; while I think some parts of UML have fallen ill, the remaining parts are still alive, and useful to this day. TLDR Out of 14 types of diagram there are 3 that I use on a regular basis: Activity Diagram, State Machine Diagram, and Sequence Diagram. I think the Timing Diagram is borderline, but I can only think of a couple of occasions when it has been useful....

September 11, 2022 · 4 min

Pulumi Conditional Infrastructure for Speed

One of the reasons I prefer Pulumi over Terraform is the additional control I have over my processes due to the fact that it’s a programming language. For example, I have a CLI, that creates a cluster of machines for a user; the machines use IAM Authentication with Vault so that they can request certificates on boot. The trouble with this application is that it is slow; it takes 175 seconds on average to provision the machines, write the IAM information to Vault, and then re-run the cloud-init script on all the machines in the cluster (as when they first booted, the configuration hadn’t been written to Vault yet....

July 17, 2022 · 3 min

An NGINX and DNS based outage

I recently encountered a behaviour in Nginx that I didn’t expect and caused a production outage in the process. While I would love to blame DNS for this, as it’s usually the cause of most network-related issues, in this case, the fault lies with Nginx. I was running a very simple Nginx proxy, relaying an internal service to the outside world. The internal service is behind an AWS ALB, and the Nginx configuration was proxying to the ALB’s FQDN:...

April 23, 2022 · 3 min

The Operator Pattern in Nomad

The Operator Pattern from Kubernetes is an excellent way of handling tasks in a cluster in an automated way, for example, provisioning applications, running backups, requesting certificates, and injecting chaos testing. As a Nomad user, I wanted to do something similar for my clusters, so I set about seeing how it would be possible. It turns out; it is much easier than I expected! While Nomad doesn’t support the idea of Custom Resource Definitions, we can achieve an operator by utilising a regular Nomad job and the nomad HTTP API....

November 22, 2021 · 9 min

How do you tag docker images?

An interesting question came up at work today: how do you tag your Docker images? In previous projects, I’ve always used a short git sha, or sometimes a semver, but with no great consistency. As luck would have it, I had pushed for a change in tagging format at a client not so long ago as the method we were using didn’t make a lot of sense and, worst of all, it was a manual process....

November 10, 2021 · 6 min

The Problem with CPUs and Kubernetes

Key Takeaway: os .cpus() returns the number of cores on a Kubernetes host, not the number of cores assigned to a pod. Investigating excessive memory usage Recently, when I was looking through a cluster health dashboard for a Kubernetes cluster, I noticed that one of the applications deployed was using a considerable amount of RAM - way more than I thought could be reasonable. Each instance (pod) of the application used approximately 8 GB of RAM, which was definitely excessive for a reasonably simple NodeJS webserver....

June 2, 2021 · 2 min

Adding Observability to Vault

One of the things I like to do when setting up a Vault cluster is to visualise all the operations Vault is performing, which helps see usage patterns changing, whether there are lots of failed requests coming in, and what endpoints are receiving the most traffic. While Vault has a lot of data available in Prometheus telemetry, the kind of information I am after is best taken from the Audit backend....

May 27, 2021 · 7 min

Getting NodeJS OpenTelemetry data into NewRelic

I had the need to get some OpenTelemetry data out of a NodeJS application, and into NewRelic’s distributed tracing service, but found that there is no way to do it directly, and in this use case, adding a separate collector is more hassle than it’s worth. Luckily, there is an NodeJS OpenTelemetry library which can report to Zipkin, and NewRelic can also ingest Zipkin format data. To use it was relatively straight forward:...

March 12, 2021 · 2 min

Observability with Infrastructure as Code

This article was originally published on the Pulumi blog. When using the Pulumi Automation API to create applications which can provision infrastructure, it is very handy to be able to use observability techniques to ensure the application functions correctly and to help see where performance bottlenecks are. One of the applications I work on creates a VPC and Bastion host and then stores the credentials into a Vault instance. The problem is that the “create infrastructure” part is an opaque blob, in that I can see it takes 129 seconds to create, but I can’t see what it’s doing, or why it takes this amount of time....

March 1, 2021 · 4 min