The Problem with CPUs and Kubernetes

Key Takeaway: os .cpus() returns the number of cores on a Kubernetes host, not the number of cores assigned to a pod. Investigating excessive memory usage Recently, when I was looking through a cluster health dashboard for a Kubernetes cluster, I noticed that one of the applications deployed was using a considerable amount of RAM - way more than I thought could be reasonable. Each instance (pod) of the application used approximately 8 GB of RAM, which was definitely excessive for a reasonably simple NodeJS webserver....

June 2, 2021 · 2 min

Adding Observability to Vault

One of the things I like to do when setting up a Vault cluster is to visualise all the operations Vault is performing, which helps see usage patterns changing, whether there are lots of failed requests coming in, and what endpoints are receiving the most traffic. While Vault has a lot of data available in Prometheus telemetry, the kind of information I am after is best taken from the Audit backend....

May 27, 2021 · 7 min

Getting NodeJS OpenTelemetry data into NewRelic

I had the need to get some OpenTelemetry data out of a NodeJS application, and into NewRelic’s distributed tracing service, but found that there is no way to do it directly, and in this use case, adding a separate collector is more hassle than it’s worth. Luckily, there is an NodeJS OpenTelemetry library which can report to Zipkin, and NewRelic can also ingest Zipkin format data. To use it was relatively straight forward:...

March 12, 2021 · 2 min

Observability with Infrastructure as Code

This article was originally published on the Pulumi blog. When using the Pulumi Automation API to create applications which can provision infrastructure, it is very handy to be able to use observability techniques to ensure the application functions correctly and to help see where performance bottlenecks are. One of the applications I work on creates a VPC and Bastion host and then stores the credentials into a Vault instance. The problem is that the “create infrastructure” part is an opaque blob, in that I can see it takes 129 seconds to create, but I can’t see what it’s doing, or why it takes this amount of time....

March 1, 2021 · 4 min

Forking Multi Container Docker Builds

Following on from my last post on Isolated Multistage Docker Builds, I thought it would be useful to cover another advantage to splitting your dockerfiles: building different output containers from a common base. The Problem When I have an application which when built, needs to have all assets in one container, and a subset of assets in a second container. For example, writing a node webapp, where you want the compiled/bundled static assets available in the container as a fallback, and also stored in an nginx container for serving....

November 3, 2020 · 3 min

Isolated Docker Multistage Images

Often when building applications, I will use a multistage docker build for output container size and efficiency, but will run the build in two halves, to make use of the extra assets in the builder container, something like this: docker build \ --target builder \ -t builder:$GIT_COMMIT \ . docker run --rm \ -v "$PWD/artefacts/tests:/artefacts/tests" \ builder:$GIT_COMMIT \ yarn ci:test docker run --rm \ -v "$PWD/artefacts/lint:/artefacts/lint" \ builder:$GIT_COMMIT \ yarn ci:lint docker build \ --cache-from builder:$GIT_COMMIT \ --target output \ -t app:$GIT_COMMIT \ ....

November 1, 2020 · 3 min

Better BASHing Through Technology

I write a lot of bash scripts for both my day job and my personal projects, and while they are functional, bash scripts always seem to lack that structure that I want, especially when compared to writing something in Go or C#. The main problem I have with bash scripts is that when I use functions, I lose the ability to log things. For example the get_config_path function will print the path to the configuration file, which will get consumed by the do_work function:...

August 28, 2020 · 5 min

Sharing Docker Layers Between Build Agents

Recently, I noticed that when we pull a new version of our application’s docker container, it fetches all layers, not just the ones that change. The problem is that we use ephemeral build agents, which means that each version of the application is built using a different agent, so Docker doesn’t know how to share the layers used. While we can pull the published container before we run the build, this only helps with the final stage of the build....

May 14, 2020 · 4 min

Service Mesh with Consul Connect (and Nomad)

When it comes to implementing a new feature in an application’s ecosystem, I don’t like spending my innovation tokens unless I have to, so I try not to add new tools to my infrastructure unless I really need them. This same approach comes when I either want, need, or have been told, to implement a Service Mesh. This means I don’t instantly setup Istio. Not because it’s bad - far from it - but because it’s extra complexity I would rather avoid, unless I need it....

May 4, 2020 · 6 min

Observability Without Honeycomb

Before I start on this, I want to make it clear that if you can buy Honeycomb, you should. Outlined below is how I started to add observability to an existing codebase which already had the ELK stack available, and was unable to use Honeycomb. My hope, in this case, is that I can demonstrate how much value observability gives, and also show how much more value you would get with an excellent tool, such as Honeycomb....

March 15, 2020 · 7 min